Calc Guide

Chapter 8
Using the DataPilot





Copyright

This document is Copyright © 2009–2011 by its contributors as listed below. You may distribute it and/or modify it under the terms of the Creative Commons Attribution-Share Alike License (http://creativecommons.org/licenses/by-sa/3.0/), version 3.0 or later.

All trademarks within this guide belong to their legitimate owners.

Authors

Barbara Duprey
Hal Parker

Feedback

Please direct any comments or suggestions about this document to: documentation@libreoffice.org

Acknowledgments

This chapter is based on Chapter 8 of the OpenOffice.org 3.3 Calc Guide, which was adapted from a German original written by Stefan Weigel and translated into English by Sigrid Kronenberger. Other contributors to that chapter are:

Jean Hollis Weber Andy Brown Sharon Whiston
Claire Wood Martin Fox

Publication date and software version

Published 23 January 2011. Based on LibreOffice 3.3.

Note for Mac users

Some keystrokes and menu items are different on a Mac from those used in Windows and Linux. The table below gives some common substitutions for the instructions in this chapter. For a more detailed list, see the application Help.

Windows/Linux

Mac equivalent

Effect

Tools Options menu selection

LibreOffice Preferences

Access setup options

Right-click

Control+click

Open context menu

Ctrl (Control)

z (Command)

Used with other keys

F5

Shift+z+F5

Open the Navigator

F11

z+T

Open Styles & Formatting window

Contents

Copyright 2

Note for Mac users 2

Introduction 5

Examples with step by step descriptions 5

Example 1: Sales volume overview 5

Practical problems and questions 6

Solution 6

Advantages 8

Example 2: Timekeeping 12

Practical problems and questions 13

Solution 13

Differences and advantages 17

Example 3: Frequency distribution 17

Solution with a matrix formula 20

Solution with the DataPilot 20

DataPilot functions in detail 24

The database (preconditions) 24

Start 25

Data source 25

Calc spreadsheet 25

Registered data source 26

External data (OLAP) 26

The DataPilot dialog 26

Basic layout 26

More options 27

More settings for the fields – Field Options 29

Data fields 29

Row and column fields 30

Page fields 34

Working with the results of the DataPilot 34

Start the dialog 34

Change layout by using drag and drop 34

Grouping rows or columns 34

Grouping of categories with scalar values 34

Grouping of categories with date or time values 35

Grouping without the automatic creation of intervals 36

Sorting the result 38

Drilling (showing details) 39

Filtering 41

Updating (refreshing) changed values 41

Cell formatting 42

Multiple data fields 42

Shortcuts 45

Function GETPIVOTDATA 45

Difficulty 45

Syntax 46

First syntax variation 47

Second syntax variation 47



Introduction

Many requests for software support are the result of using complicated formulas and solutions to solve simple day to day problems. More efficient and effective solutions use the DataPilot, a tool for combining, comparing, and analyzing large amounts of data easily. Using the DataPilot, you can view different summaries of the source data, display the details of areas of interest, and create reports, whether you are a beginner, an intermediate or advanced user.

This chapter is divided into two sections:

Examples with step by step descriptions

This section demonstrates some of the possibilities of the DataPilot in typical use cases. By following the step by step description, you can recreate the examples and learn about the power of the DataPilot.

Example 1: Sales volume overview

A typical introductory example in courses and books for beginners with spreadsheets is a simple sales volume overview.

Rahmen1

This example demonstrates the user interface and how to insert text and numbers into cells. Useful aids like AutoFill and drag and drop have been demonstrated in other chapters. The most important part is the connection between cells through formulas, for example addition with the plus operator or the SUM function.

This small exercise might be useful for a first contact with the program, but it shows only a very small fraction of the tasks in an office. To create such a sales overview, you also need the original base data. That is, before you can use a spreadsheet to create a sales overview, you need to add the (possibly many) single purchases from different lists and then enter the sums into the relevant cells C5 to F7.

Practical problems and questions

Solution

The most important part of your task in the example is the addition of the Total sales per month cells, which had to be done manually. To do this automatically with the program, just get the data into Calc. You can enter the single numbers by hand or you can import a file from your bookkeeping software. In any case we assume a continuous table that keeps track of all sales in a somewhat primitive form.

Rahmen2

You can create the sales volume overview by following these instructions:

  1. Select the cell A1 (or any other single cell within the list).

  2. Select Data DataPilot Start. On the Select Source dialog, choose Current selection and click OK.

  3. The DataPilot dialog (Figure 3) has four white layout areas and several fields that look like buttons. These small fields are the titles of the different columns of your list.

  1. Click More, to see more options in the lower part of the dialog.

  2. In the Results to drop-down list, select – new sheet –.

  3. Click OK.

Rahmen3

  1. The result appears on a new sheet. It has the desired structure, but the columns are not yet grouped into months.

Rahmen4

  1. To group the columns, select cell B4 or any other cell that contains a date. Then select Data Group and Outline Group. On the Grouping dialog (Figure 5), make sure Intervals and Months are selected in the Group by section, and click OK. The result is now grouped for months (Figure 6).

Frame2

Rahmen5

In this result you will recognize the beginners’ example. It is very easy to produce, without any further knowledge about the spreadsheet. You do not have to enter any formulas.

Advantages

  1. No manual entering or adding of any values is necessary. There is less work and fewer errors.

  2. The layout is very flexible: months horizontal and categories vertical or vice versa, in two mouse clicks.

  3. Additional differentiating factors are immediately available.

  4. Many types of evaluation are possible; for example, number or average instead of sum, accumulated values, comparisons, and so on.

We will now demonstrate some of these advantages.

Starting with the result of Figure 6, use the mouse to drag the Date field under the Category field, as shown in Figure 7.

Frame7

Now the summary is as shown in Figure 8.

Rahmen6

To transpose the table completely, just drag the Category field above the area of the displayed values, to cell C3 (see Figure 9). The result of this action is shown in Figure 10.

Frame8

Rahmen7

In contrast to the beginners’ example in Figure 1, it is now very simple to view or add different aspects of the underlying data. For example, to see the values for different regions, do the following:

  1. Select the cell A3 (or any other single cell that is part of the DataPilot result).

  2. Select Data DataPilot Start to start the DataPilot again.

Drag the Region field into the Row Fields area. Depending on the order you choose for the row fields, the result is either regions with date subdivisions or vice versa.

  1. Click OK. The result is shown in Figure 11.

In another variation you may want to add the employees.

  1. Select the cell A3 (or any other single cell that is part of the DataPilot result).

  2. Select Data DataPilot Start, to start the DataPilot again.

  1. Click OK. The result is shown in Figure 12.

Fields that you use as page fields are placed in the result above the summary with the name Filter. You then have a drop-down list that you can use to show only the sums of a given employee.

Rahmen8

Rahmen9

Up to now we have not seen the most powerful features of the DataPilot. The following examples will show you more.

Example 2: Timekeeping

This example is often used by consultants and in several variations in user support. The task is to provide a means for one or more users to keep track of working hours.

Note

In the real world timekeeping requires tracking working time on individual projects. Specialized database software is often used but for simple applications in a small company a Calc spreadsheet and the Data Pilot might be all that is needed to track work done.

A typical way of doing this is to create a spreadsheet per month and a sum sheet with all the results of one year. For each employee, there is one file (see Figures 13 and 14 for examples of two pages from the file for one employee).

Rahmen10

Rahmen11

Practical problems and questions

Solution

To use the DataPilot for this task, collect all the data into one single table. This can be done either manually or by importing a file from an (electronic) timekeeping machine.

In very simple cases, each employee takes care of their own working hours. If you need calculations that cover several employees, departments, or the whole company, just copy everything into one huge table (Figure 15).

Rahmen12

Using the DataPilot requires only 12 mouse clicks and gives you a nice overview within seconds:

  1. Select the cell A1 (or any other single cell within the list).

  2. Choose Data DataPilot Start and click OK.

  3. On the DataPilot dialog (Figure 16):

Frame1

  1. Click More to show more options in the lower part of the dialog.

  2. Choose – new sheet – for Results to.

  3. Click OK.

The result appears on a new sheet (Figure 17).

Rahmen14

The result is much more powerful than is possible with the classic formula-based calculation. For example, you can summarize the daily results to a monthly result very easily:

  1. To group together the rows, select the cell A5 (or any other cell that contains a date).

  2. Choose Data Group and Outline Group . On the Grouping dialog, leave Start and End as Automatically; in the Group by section, choose Intervals and Months. Click OK. The result is now grouped into months.

Rahmen15

If you need a result with a percentage, start the DataPilot from this page.

  1. Select the cell A3 (or any other single cell that contains a result of the DataPilot).

  2. Choose Data DataPilot Start.

  3. Double-click on Sum – hours to open the Data Field dialog (Figure 19).

  4. Click on More, to see more options.

  5. Switch Displayed value Type to % of column.

  6. Click OK to return to the DataPilot dialog, then click OK again.

The result shows percentages as a decimal number. If you prefer to have a percent format, select the cells and click on the % icon in the menu bar (Figure 20).

Rahmen16

Rahmen17

To get a comparison between employees, start the DataPilot again from the output sheet:

  1. Select the cell A3 (or any other cell that contains a result of the DataPilot).

  2. Choose Data DataPilot Start.

  3. Double-click on Sum – hours to open the Data Field dialog.

  4. Click More, to see more options.

  1. Click OK to return to the DataPilot dialog, then click OK again.

Rahmen19

As a final example we switch to an accumulated view; that is, continuing sums of all values:

  1. Select the cell A3 (or any other cell that contains a result of the DataPilot).

  2. Choose Data DataPilot Start.

  3. Double-click on Sum – hours to open the Data Field dialog.

  4. Click More, to see more options.

  1. Click OK to return to the DataPilot dialog, then click OK again.

Rahmen21

Differences and advantages

These examples show an important aspect of the DataPilot. Normally you have to collect your data according to the way you want the result to be represented. This means you have to use a specific structure and you are stuck with it.

The DataPilot works more like a real database. The source data are collected in a simple spreadsheet that contains all data. Only when you want to look at it do you select which part of the data you want to use.

Example 3: Frequency distribution

For showing the frequency of incidents, Calc uses the function FREQUENCY. This formula has to be used in a so-called matrix formula. This is an advanced feature. Alternatively you can use the DataPilot, which requires almost no knowledge of spreadsheets.

In our example we want to investigate the number of emails that go to the Germanophone support mailing list. We want to know how the activity on the list is distributed during the day.

The starting point for this example is the mailbox file of the Thunderbird mail client. The command (in Linux)

grep 'Delivery-Date:' Inbox > users.txt

creates a text file that contains, for the timespan of more than 2 years, one line for each message, with date and time. (See Figure 23.)

Rahmen22

To import the data into a table in Calc, choose Insert Sheet From File and select the text file users.txt containing the raw data. The Text Import dialog (Figure 24) appears. Here you can choose your import options. Use the following properties:

Separator options: Fixed width

Divide into columns at position 20, 31 and 40

Column types:

Column 1: Hide
Column 2: Date (DMY)
Column 3: Standard
Column 4: Hide

Rahmen24

Figure 25 shows the imported raw data with a row added at the top containing titles for each column.

Rahmen23

Solution with a matrix formula

To calculate the frequency you have to create 24 classes, one for each hour. In the next column you enter the number of emails with the function FREQUENCY.

Rahmen25

The first argument is the cell area with the times of all approximately 38,000 emails. The second argument is the cell area E2:E25 that describes the frequency classes. To enter the formula, first select the cell area F2:F25, then enter the formula. Then finish the formula by using the key combination Shift+Ctrl+Enter. This indicates to the program that you want to use a matrix formula. To indicate the matrix formula, the program uses curly brackets.

This technique is most often known only to (and used by) advanced users.

Solution with the DataPilot

With the DataPilot you can achieve the same result much easier and faster. The solution is also possible for less advanced users. Starting with the raw data (Figure 25), you need only a few mouse clicks.

  1. Select the cell A1 (or any other cell within the list).

  2. Choose Data DataPilot Start and click OK.

  3. In the DataPilot:

    Rahmen26

  1. Click More to show more options in the lower part of the dialog.

  2. Choose – new sheet – for Results to.

  3. In this case we need to count the number of values, not their sum. Double-click on Sum – Date to open the Data Field dialog and select the function Count (see Figure 28).

  4. Click OK. As an intermediate result, you get a DataPilot table that has for every time within the raw data a separate line.

Rahmen27

Note

This may be a very time-consuming process because of the large number of items. The time does not depend that much on the number of lines but rather on the number of rows needed for the table that contains the results.

  1. For grouping the rows, select the cell A4 or any other cell that contains a time.

  2. Choose Data Group and Outline Group, select for the interval Hours and click OK. The result is now grouped according to hours as shown in Figure 31.

    Rahmen28

  1. Restart the DataPilot and double-click on the Sum-Date data field. Figure 30 shows the Data Field dialog for the data field Number – Date. Click More and select as type % of column. The result is shown in Figure 32. Figure 31 shows the absolute occurrence.

Whether the relative values are shown as a decimal (0.1) or as a percentage (10%) depends only on the cell formatting itself and has nothing to do with the DataPilot.

Rahmen31

Rahmen29 Rahmen30

DataPilot functions in detail

This part describes the use and options of the DataPilot in detail.

The database (preconditions)

The first thing needed to work with the DataPilot is a list of raw data, similar to a database table, consisting of rows (data sets) and columns (data fields). The field names are in the first row above the list.

The data source could be an external file or database. For the simplest case, where data is contained in a Calc spreadsheet, Calc offers sorting functions that do not require the DataPilot.

For processing data in lists, the program needs to know where in the spreadsheet the table is. The table can be anywhere in the sheet, in any position. A spreadsheet can contains several unrelated tables.

Calc recognizes your lists automatically. It uses the following logic:

Starting from the cell you’ve selected (which must be within your list), Calc checks the surrounding cells in all 4 directions (left, right, above, below). The border is recognized if the program discovers an empty row or column, or if it hits the left or upper border of the spreadsheet.

This means that the described functions can only work correctly if there are no empty rows or columns in your list. Avoid empty lines (for example for formatting). You can format your list by using cell formats.

Rule

No empty rows or empty columns are allowed within lists.

If you select more than one single cell before you start sorting, filtering or calling the DataPilot, then the automatic list recognition is switched off. Calc assumes that the list matches exactly the cells you have selected.

Rule

For sorting, filtering, or using the DataPilot, always select only one cell.

A relatively common source of errors is to inadvertently declare a list by mistake and then sort the list. If you select multiple cells (for example, a whole column) then the sorting mixes up the data that should be together in one row.

In addition to these formal aspects, the logical structure of your table is very important when using the DataPilot.

Rule

Calc lists must have the normal form; that is, they must have a simple linear structure.

When entering the data, do not add outlines, groups, or summaries. This becomes clear when we think about what we could have done wrong in our Sales list example in the section “Examples with step by step descriptionsstarting at page 5. Here are some mistakes commonly made by inexperienced spreadsheet users:

  1. You made several sheets, for example, a sheet for each group of articles. Analyses are then possible only within each group. Analyses for several groups would be a lot of work.

  2. In the Sales list, instead of only one column for the amount, you made a column for the amounts for each employee. The amounts then had to be entered into the appropriate column. An analysis with the DataPilot would not be possible any more. In contrast, one result of the DataPilot is that you can get results for each employee if you have entered everything in one column.

  3. You entered the amounts in chronological order. At the end of each month you made a sum total. In this case, sorting the list for different criteria is not possible because the DataPilot will treat the sum totals the same as any other figure. Getting monthly results is one of the very fast and easy features of the DataPilot.

Start

Start the DataPilot with Data DataPilot Start. If the list to be analyzed is in a spreadsheet table, select only one cell within this list. Calc recognizes and selects the list automatically for use with the DataPilot.

Rahmen32

Data source

There are three possible data sources for the DataPilot: a Calc spreadsheet, an external data source that has to be registered in LibreOffice, or access to an OLAP system (not available at this time).

Calc spreadsheet

The simplest and most often used case is analyzing a list in a Calc spreadsheet. The list might be updated regularly or the data might be imported from a different application.

For example, a huge list can be copied from a different application and pasted into Calc. The behavior of Calc while inserting the data depends on the format of the data. If the data is in a common spreadsheet format, it is copied directly into Calc. However, if the data is in plain text, the Text Import dialog appears; see Figure 24 in this chapter and Chapter 1, Introducing Calc, for more more information.

Calc can import data from a huge number of foreign data formats, for example from other spreadsheets (Excel, Lotus 1, 2, 3), from databases (like dBase), and from simple text files including CSV formats.

The drawback of copying or importing foreign data is that it will not update automatically if there are changes in the source file. With a Calc file you were previously limited to 65,535 rows but this has been expanded to one million rows.

Registered data source

A registered data source in LibreOffice is a connection to data held in a database outside Calc. This means that the data to be analyzed will not be saved in Calc; Calc always uses the data from the original source. Calc is able to use many different data sources and also databases that are created and maintained with LibreOffice Base. See Chapter 10, Linking Calc Data, for more information.

External data (OLAP)

No implementation exists at the moment so this option is not available.

The DataPilot dialog

The function of the DataPilot is managed in two places: first in the DataPilot dialog and second through manipulations of the result in the spreadsheet. This section describes the dialog in detail.

Basic layout

In the DataPilot (Figure 34) are four white areas that show the layout of the result. Beside these white areas are buttons with the names of the fields in your data source. To choose a layout, drag and drop the field buttons into the white areas.

The Data Fields area in the middle must contain at least one field. Advanced users can use more than one field here. For the data field an aggregate function is used. For example, if you move the sales field into the Data Fields area, it appears there as Sum – sales.

Rahmen33

Row Fields and Column Fields indicate in which groups the result will be sorted in the rows and columns. If there are no entries in one of these area, then partial sums will not be provided for the corresponding rows or columns. Often more than one field is used at the same time get partial sums for rows or columns. The order of the fields gives the order of the sums from overall to specific.

For example, if you drag region and employee into the Row Fields area, the sum will be divided into the employees. Within the employees will be the listing for the different regions.

Fields that are placed into the Page Fields area appear in the result above as a drop down list. The summary in your result takes only that part of your base data into account that you have selected. For example, if you use employee as a page field, you can filter the result shown for each employee.

To remove a field from the white layout area just drag it to the border and drop it (the cursor will change to a crossed symbol), or click the Remove button.

More options

To expand the DataPilot and show more options, click More.

Rahmen34

Selection from

Shows the range of cells used in the DataPilot.

Results to

Results to defines where your result will be shown. If you do not enter anything, the DataPilot will put your result below the list that contains your data. This could result in overwriting any data that is already in that location. To avoid overwriting data, you can leave Results to as – undefined – and enter a cell reference to tell the DataPilot where to show the results.1 However, a generally better approach is to use - new sheet – to add a new sheet to the spreadsheet file and place the results there.

Ignore empty rows

If the source data is not in the recommended form, this option tells the DataPilot to ignore empty rows.

Identify categories

If the source data has missing entries in a list and does not meet the recommended data structure (see Figure 36), the DataPilot adds it to the listed category above it. If this option is not chosen, then the DataPilot inserts (empty) (see Figure 38).

Frame3

The option Identify categories ensures that in this case rows 3 and 4 are added to the product Apples and row 6 is added to Pears (see Figure 37).

Frame9

Without category recognition, the DataPilot shows an (empty) category (Figure 38).

Frame4

Logically, the behavior without category recognition is better. A list with the shown missing entries is also less useful, because you cannot use other functions such as sorting or filtering.

Total columns / total rows

With this option you decide if the DataPilot will show an extra row with the sums of each column or if it adds on the very right a column with the sums of each row. In some cases, an added total sum is meaningless, for example if your entries are accumulated or the result of comparisons.

Add filter

Use this option to add or hide the cell labelled Filter above the DataPilot results. This cell is a convenient button for additional filtering options within the DataPilot.

Enable drill to details

If you double-click on a single cell in the DataPilot result, this function gives a more detailed listing of an individual entry. If this function is disabled, the double-click will keep its usual edit function within a spreadsheet.

More settings for the fields – Field Options

The options discussed in the previous section are valid for the DataPilot in general. Additionally, you can change settings for every field that you have added to the DataPilot layout. Do this either by clicking on the Options button in the DataPilot dialog or by double-clicking on the appropriate field.

There are differences between data fields, row or column fields, and page fields of the DataPilot.

Data fields

In the Options dialog of a data field you can select the Sum function for accumulating the values from your data source. In many cases you will need the sum function, but other functions (like standard distribution or a counting function) are also available. For example, the counting function can be useful for non-numerical data fields.

On the Data Field dialog, click More to see more options.

Rahmen35

In the Displayed value section, you can choose other possibilities for analysis by using the aggregate function. Depending on the setting for Type, you may have to choose definitions for Base field and Base element. The table below lists the possible types of displayed value and associated base field and element, together with a note on usage.

Type

Base field

Base element

Analysis

Normal

Simple use of the chosen aggregate function (for example, sum)

Difference from

Selection of a field from the data source of the DataPilot (for example, employee)

Selection of an element from the selected base field (for example, Brigitte)

Result as difference to the result of the base element (for example, Sales volume of the employees as difference of the sales volume of Brigitte)

% of

Selection of a field from the data source of the DataPilot (for example, employee)

Selection of an element from the selected base field (for example, Brigitte)

Result as a ratio based on the result of the base element (for example, Sales result of the employee relative to the sales result of Brigitte)

% difference from

Selection of a field from the data source of the DataPilot (for example, employee)

Selection of an element from the selected base field (for example. Brigitte)

Result as relative difference to the result of the base element (for example, Sales volume of the employees as relative difference of the sales volume of Brigitte)

Running total in

Selection of a field from the data source of the DataPilot (for example, date)

Result as a continuing sum (for example, Continuing sum of the sales volume for days or months)

% of row

Result as relative part of the result in the whole row (for example the row sum)

% of column

Result as relative part of the total column (for example, the column sum)

% of total

Result as relative part of the overall result (for example the total sum)

Index

Default result x total result / (row result x column result)

Row and column fields

In the Options dialog for the row or column fields, you can choose to show partial sums for each category. Partial sums are deactivated by default. They are only useful if the values in one row or column field can be divided into partial sums for another (sub)field.

Some examples are shown in the next three figures.

Rahmen37

Rahmen38

Rahmen39

Choose the option Automatically to use the aggregate function for the partial results that can also be used for the data fields (see above). To set up the aggregate function for the partial results independently from the overall settings of the DataPilot, choose User-defined.

Rahmen36

Normally, the DataPilot does not show a row or column for categories that have no entries in the underlying database. By choosing the Show items with no data option, you can force this.

For illustration purposes, the data was manipulated in such a way, that the employee Brigitte has no sales values for the category golfing.

Rahmen40

Rahmen41

Page fields

The Options dialog for page fields is the same as for row and column fields, even though it appears to be useless to have the same settings as described for the row and column fields. With the flexibility of the DataPilot you can switch the different fields between pages, columns or rows. The fields keep the settings that you made for them. The page field has the same properties as a row or column field. These settings only take effect when you use the field not as page field but as row or column field.

Working with the results of the DataPilot

As mentioned above, the DataPilot is very flexible. An analysis can be totally restructured with only a few mouse clicks. Some functions of the DataPilot can be used only with the results of an analysis.

Start the dialog

Right-click in the area of the resulting table of the DataPilot. The command Start opens the DataPilot dialog with all current settings.

Change layout by using drag and drop

The easiest and fastest method to change the layout of the DataPilot is drag and drop. Within the result table of the DataPilot, move one of the page, column, or row fields to a different position (see “Examples with step by step descriptions” starting on page 5).

You can remove a column, row, or page field from the DataPilot by clicking on and dragging it out of the DataPilot.

Grouping rows or columns

For many analyses or summaries, the categories have to be grouped. You can merge the results in classes or periods. In the DataPilot you do a grouping only after you have first made an ungrouped DataPilot table.

You can access the grouping with the menu entry Data Group and Outline Group or by pressing F12. It is important that you select the correct cell area. How the grouping function works is determined mainly by the type of values that have to be grouped. You need to distinguish between scalar values, date or time values, or other values, such as text, that you want grouped.

Note

Before you can group, you have to produce a DataPilot with ungrouped data. The time needed for creating a DataPilot depends mostly on the number of columns and rows and not on the size of the basic data. Through grouping you can produce the DataPilot with a small number of rows and columns. The DataPilot can contain a lot of categories, depending on your data source.

Grouping of categories with scalar values

For grouping scalar values, select a single cell in the row or column of the category to be grouped.

Rahmen42 Rahmen44

Choose Data Group and Outline Group from the menu bar or press F12; you get the following dialog.

Rahmen43

You can define in which value range (start/end) the grouping should take place. The default setting is the whole range from smallest to biggest value. In the field Grouping after you can enter the class size, this means the interval size (in this example groups of 10 km/h each).

Grouping of categories with date or time values

For grouping date or time values select a single cell in the column or row of the category to be grouped. This was demonstrated in all three examples in the section “Examples with step by step descriptions” starting on page 5.

With the menu entry Data Group and Outline Group or by pressing F12, you get the dialog shown in Figure 49.

You can decide the range of dates or times (start/finish) over which grouping should take place. The default setting is the entire period from the earliest to the latest value. In the field Group by you can enter the class size (the interval), that should be used for grouping.

Rahmen45

Possible intervals are: seconds, minutes, hours, days, months, quarters and years. These can be combined, for example grouping first by years and then within each year according to month.

Alternatively you can enter any number of days as grouping interval.

Tip

For grouping the output of the DataPilot in calendar weeks, choose the beginning date on a Sunday or Monday and enter the grouping interval (Number of Days) as 7.

Grouping without the automatic creation of intervals

If the categories contain text fields, then the automatic creation of intervals is not possible. You can define for each field (for example Department) which values you want to put together in one group.

Every time you use the menu entry Data Group and Outline Group or you press F12 and you have more than one cell selected, then all the cells will be selected as one group.

Rahmen46 Rahmen47

For grouping of non scalar categories, select in the result of the DataPilot all the individual field values that you want to put in the one group.

Tip

You can select several non-contiguous cells in one step by pressing and holding the Control key while left-clicking with the mouse.

Given the input data shown in Figure 50, execute the Data Pilot with Department in the Row Field and Sum (Sick Days) in the Data Field. The output should look like that in Figure 51. Select with the mouse the Departments Accounting, Purchasing and Sales.

Choose the Data Group and Outline Group from the Menu bar or press F12. The output should now look like that in Figure 52. Repeat this for all groups that you want to create from the different categories (Select Assembly, Production and Warehouse and Group again. The output should look like Figure 53.

Rahmen48 Rahmen49

You can change the automatically given names for the groups and the newly created group field by editing the name in the input field (for example changing 'Group2' to 'Technical'). The DataPilot will remember these settings, even if you change the layout later on. For the following pictures, the dialog was called again (with a right-click, Start) and by selecting the icon “Department 2”, then Options, and finally from the preferences menu Automatic was selected. This generated the partial sum results shown in Figure 54. Double clicking Group 1 and Technical collapses the entries, as shown in Figure 55.

Rahmen50 Rahmen51

Note

A well-structured database makes manual sorting within the DataPilot obsolete. In the example shown, you could add another column with the name Department, that has the correct entry for each person based on whether the employee’s department belongs to the group Office or Technical. The mapping for this (1:n relationship) can be done easily with the VLOOKUP function from Calc.

Sorting the result

The result of any DataPilot is sorted (categories) in columns and rows in an ascending order. You can change the sorting in three ways:

Select sort order from drop-down menus on each column heading

The simplest way to sort entries is to click the arrow on the right side of the heading and check the box(es) for the desired sort order. The custom sorting dialog is shown in Figure 56. Additional options exist to show all, show only the current item, or hide only the current item.

Frame10

Sort manually by using drag and drop

You can change the order within the categories by moving the cells with the category values in the result table of the DataPilot.

Be aware that in Calc a cell must be selected. It is not enough that this cell contains the cell cursor. The background of a selected cell is marked with a different color. To achieve this, click in one cell with no extra key pressed and redo this by pressing also the Shift or Ctrl key. Another possibility is to keep the mouse button pressed on the cell you want to select, move the mouse to a neighbor cell and move back to your original cell before you release the mouse button.

Sort automatically

To sort automatically, start the options of the preferences of the row or column field: right-click on the table area with the DataPilot result and choose Start to open the DataPilot (Figure 3). Within the Layout area of the DataPilot, double-click the field you want to sort. In the Data Field dialog (Figure 43), click Options to display the Data Field Options dialog (Figure 57).

Rahmen52

For Sort by choose either Ascending or Descending. On the left side is a drop-down list where you can choose the field this setting should apply to. With this method you can specify that sorting does not happen according to the categories but according to the results of the data field.

Drilling (showing details)

Drill allows you to show the related detailed data for a single, compressed value in the DataPilot result. To activate a drill, double-click on the cell or choose Data Group and Outline Show Details. You have to distinguish two cases:

  1. The active cell is the category of a row or column field.

In this case drill means an additional breakdown into the categories of another field.

For example, double-click on the cell with the value golfing in the row field region. In this case the values that are aggregated in the category golfing are subdivided according to another field.

Rahmen53

    Since there are more possibilities for subdivision, a dialog appears so you can choose your setting.

Rahmen54

Rahmen55

To hide the details again, double-click on the cell golfing or choose Data Group and Outline Hide Details.

The DataPilot remembers your selection (in our example the field employee), so that for the next drill down for a category in the field region the dialog does not appear. To remove the selection employee, open the DataPilot dialog by right-clicking and choosing Start, then delete the unwanted selection in the row or column field.

  1. The active cell is a value of the data field.

In this case drill down means a listing of all data entries of the data source that aggregates to this value.

Double-click on the cell with the value $18,741 from Figure 58. You now have a new list of all data sets that are included in this value. This list is shown in a new sheet.

Rahmen56

Filtering

To limit the DataPilot analysis to a subset of the information that is contained in the data basis, you can filter with the DataPilot.

Note

An Autofilter or default filter used on the sheet has no effect on the DataPilot analysis process. The DataPilot always uses the complete list that was selected when it was started.

To do this, click Filter on the top left side above the results.

Rahmen57

In the Filter dialog, you can define up to 3 filter options that are used in the same way as Calc’s default filter.

Rahmen58

Note

Even if they are not called a filter, page fields are a practical way to filter the results. The advantage is that the filtering criteria used are clearly visible.

Updating (refreshing) changed values

After you have created the DataPilot, changes in the source data do not cause an automatic update in the resulting table. You have to always update (refresh) the DataPilot manually after changing any of the underlying data values.

Changes in the source data could appear in two ways:

  1. The content of existing data sets has been changed.

For example, you might have changed a sales value afterward. To update the DataPilot, right-click in the result area and choose Refresh (or choose Data DataPilot Refresh from the menu bar).

  1. You have added or deleted data sets in the original list.

In this case the change means that the DataPilot has to use a different area of the spreadsheet for its analysis. A change of the data reference for an existing DataPilot is not possible. You must redo the DataPilot from the beginning.

Cell formatting

The cells in the results area of the DataPilot are automatically formatted in a simple format by Calc. You can change this formatting using all the tools in Calc, but note that if you make any change in the design of the DataPilot or any updates, the formatting will return to the format applied automatically by Calc.

For the number format in the data field, Calc uses the number format that is used in the corresponding cell in the source list. In most cases, this is useful (for example, if the values are in the currency format, then the corresponding cell in the result area is also formatted as currency). However, if the result is a fraction or a percentage, the DataPilot does not recognize that this might be a problem; such results must either be without a unit or be displayed as a percentage. Although you can correct the number format manually, the correction stays in effect only until the next update.

Multiple data fields

Until now we have assumed that the layout of the DataPilot contains only one data field. However, it is possible to have several data fields in the middle of the layout. This makes summaries and analyses of multiple aspects possible.

You could, for example, list all the sales values per day and additionally give the number of entries per day. To do this, put both the sales field and the date field into the Data Fields area. For the date field, choose the Count option for the aggregate function (see Figure 64).

Rahmen59

Since every entry has a specific date, this field will give you the number of entries for each date. If you group the values per month, you get an overview with the sales value and the number of closed sales for each category and month (see Figure 65).

Rahmen60

When using multiple data fields, the DataPilot result area contains a field called Data to allow for manipulating the existing data fields. You can move this field just like any other row or column field by using drag and drop. This is an easy way to achieve different structures for the results (Figures 66 and 67; drag and drop the Data field onto the date field label, or onto the category field label).

Rahmen61

Rahmen62

If you want to put the different data fields in different columns and your DataPilot does not contain another column field or you sort the different data fields in different rows and don’t have another row field, then it is useful to disable the view of the row or column sums. Just drag the category field label up to the Filter area.

Rahmen63

Rahmen64

A frequent use case for multiple data fields is the aggregation of one value according to different aggregate functions at the same time. You can create a DataPilot that shows you the monthly sales values and shows you additionally the smallest and the largest amounts.

Rahmen65

Unfortunately, Calc does not allow you to use the same data field multiple times. For such multiple analyses you have to duplicate the respective column in the source. This column is then available within the DataPilot; it just happens to have identical values (but the three columns are referenced in the example as sales, sales2 and sales3).

Rahmen66

Shortcuts

If you use the DataPilot very often, you might find the frequent use of the menu paths (Data DataPilot Start and Data Group and Outline Group) inconvenient.

For grouping, a shortcut is already defined: F12. For starting the DataPilot, you can define your own keyboard shortcut. If you prefer to have toolbar icons instead of keyboard shortcuts, you can create a user-defined symbol and add it to either your own custom made toolbar or the Standard toolbar.

For an explanation how to create keyboard shortcuts or add icons to toolbars, see Chapter 14, Setting Up and Customizing Calc.

Function GETPIVOTDATA

The function GETPIVOTDATA can be used with formulas in Calc if you want to reuse the results from the DataPilot elsewhere in your spreadsheet.

Difficulty

Normally you create a reference to a value by entering the address of the cell that contains the value. For example, the formula =C6*2 creates a reference to cell C6 and returns the doubled value.

If this cell is located in the results area of the DataPilot, it contains the result that was calculated by referencing specific categories of the row and column fields. In Figure 72, the cell C6 contains the sum of the sales values of the employee Hans in the category Sailing. The formula in the cell C12 uses this value.

Rahmen67

If the underlying data or the layout of the DataPilot changes, then you must take into account that the sales value for Hans might appear in a different cell. Your formula still references the cell C6 and therefore uses a wrong value. The correct value is in a different location. For example, in Figure 73, the location is now C7.

Rahmen68

The function GETPIVOTDATA allows you to have a reference to a value inside the DataPilot by using the specific identifying categories for this value.

Syntax

The syntax has two variations:

GETPIVOTDATA(target field, DataPilot; [ Field name / Element; ... ])

GETPIVOTDATA(DataPilot; specification)

First syntax variation

The target field to specify which data field of the DataPilot is used within the function. If your DataPilot has only one data field, this entry is ignored, but you must enter it anyway.

If your DataPilot has more than one data field, then you have to enter the field name from the underlying data source (for example “sales value”) or the field name of the data field itself (for example “sum – sales value”).

The argument DataPilot specifies the DataPilot that you want to use. It is possible that your document contains more than one DataPilot. Enter here a cell reference that is inside the results area of your DataPilot. It might be a good idea to always use the upper left corner cell of your DataPilot, so you can be sure that the cell will always be within your DataPilot even if the layout changes.

Example: GETPIVOTDATA("sales value";A1)

If you enter only the first two arguments, then the function returns the total result of the DataPilot (“Sum – sales value” as the field will return a value of 408,765).

You can add more arguments as pairs with field name and element to retrieve specific partial sums. In the example in Figure 72, where we want to get the partial sum of Hans for sailing, the formula in cell C12 would look like this:

=GETPIVOTDATA("sales value";A1;"employee";"Hans";"category";"sailing")

Frame5

Second syntax variation

The argument DataPilot has to be given in the same way as for the first syntax variation.

For the specifications, enter a list separated by spaces to specify the value you want from the DataPilot. This list must contain the name of the data field, if there is more than one data field, otherwise it is not required. To select a specific partial result, add more entries in the form of Field name[element].

In the example in Figure 72, where we want to get the partial sum of Hans for Sailing, the formula in cell C12 would look like this:

=GETPIVOTDATA(A1;"sales value employee[Hans] category[sailing]")

Frame6



1In this case the word - undefined – is misleading because the output position is in fact defined.