Get started with data calculations

Get started with data calculations
  • As a data analyst, familiar tools and processes are frequently used, but learning new ones is essential for growth.
  • SQL can significantly improve efficiency when pulling data and conducting analysis.
  • Spreadsheets, conditional formulas, the SUMPRODUCT function, and pivot tables are useful tools for performing calculations in data analysis.
  • Pivot tables are particularly useful for organizing calculations.
  • SQL queries and calculations are closely related, with temporary tables in SQL providing a useful tool for temporarily storing data during analysis.
  • It's important to take time to understand and practice new concepts, and reviewing material is encouraged.

Common calculation formulas

  • Spreadsheet formulas are valuable tools for performing calculations in data analysis.
  • Functions like SUM, MIN, MAX, and AVERAGE can be used to perform basic calculations.
  • The fill handle in spreadsheets can be used to apply formulas across multiple cells.
  • Growth rates can be calculated by dividing the difference in sales by the sales of the previous year.
  • Conditional formatting is a useful tool for visualizing data in spreadsheets.
  • Always double-check your analysis to ensure it meets the objectives.

Functions and conditions

  • Functions like COUNTIF and SUMIF are powerful tools for data analysts, allowing them to perform calculations more easily and accurately.
  • COUNTIF counts the number of cells that match a specified value, helping to find and count errors or specific values in a dataset.
  • SUMIF adds numerical data based on one condition, useful for summing up values which meet a particular criteria.
=SUMIF (range, criteria/condition, [sum_range])
  • These functions are particularly useful when dealing with large datasets, significantly speeding up the analysis process.
  • In the example provided, COUNTIF and SUMIF were used to analyze the number of transactions and revenue for an online kitchen supplies retailer. This allowed for a quick and clear comparison of transactions with one item versus transactions with more than one item.
  • Summary tables provide an easy-to-understand snapshot of the analysis for stakeholders and team members.

SUMIF to SUMIFS

But, you could also build in multiple conditions by using the SUMIFS function. SUMIF and SUMIFS are very similar: They add up values in a range. But SUMIFS can include multiple conditions. This gives you more control over your summing criteria, which, in turn, allows you to perform more complex data analysis easily.

The basic syntax is: =SUMIFS(sum_range, criteria_range1, criterion1, [criteria_range2, criterion2, ...])

The square brackets let you know that this is optional. The ellipsis at the end of the statement enables as many repetitions of these parameters as needed. For example, if you wanted to calculate the sum of sales for ProductA in the East district in the first quarter, you could create a SUMIFS statement with multiple conditions, like this:

=SUMIFS(D2:D8, A2:A8, "ProductA", B2:B8, "East", C2:C8, "Q1")

In this example, B2:B8 is the second criterion_range and East is the second condition. The third criterion_range is C2:C8 and the third condition is Q1. As long as you follow the basic syntax, you can add up to 127 conditions to a SUMIFS statement!

COUNTIF to COUNTIFS

Just like the SUMIFS function, COUNTIFS allows you to create a COUNTIF function with multiple conditions. The definition for COUNTIF is a function that counts the number of cells in a range that meet a single condition. For example, using COUNTIF to track the number of days an temporary employee was absent in an attendance record.

The basic syntax is: =COUNTIF(range, criterion)

Just like SUMIF, you set the range and then the condition that needs to be met. For example, in Table 1, if you wanted to count the number of transactions for ProductA, you could use a COUNTIF function like this:

=COUNTIF(A2:A8, "ProductA")

COUNTIFS has the same basic syntax as SUMIFS: =COUNTIFS(criteria_range1, criterion1, [criteria_range2, criterion2, ...])

The criteria_range and criterion are in the same order, and you can add more conditions to the end of the function. So, if you wanted to find the number of sales transactions for ProductA in the East region in the first quarter, you could use COUNTIFS to apply those conditions, like this:

=COUNTIFS(A2:A8, "ProductA", B2:B8, "East", C2:C8, "Q2")

This enables you to find every instance where both of conditions (East and Q1) are true.

For more information

SUMIFS and COUNTIFS are just two examples of functions with multiple conditions. They help demonstrate how multiple conditions can be built into the basic syntax of a function. There are other functions with multiple conditions that you can use in your data analysis and many resources available online to help you get started:

  • How to use the Excel IFS function: This includes an explanation and example of the IFS function in Excel. It’s a great reference if you’re interested in learning more about IFS. The example is a useful way to understand this function and how it can be used.
  • VLOOKUP in Excel with multiple criteria: Similar to the previous resource, this resource goes into more detail about how to use VLOOKUP with multiple criteria. Being able to apply VLOOKUP with multiple criteria will be a useful skill, so check out this resource for more guidance on how you can start using it on your own spreadsheet data.
  • INDEX and MATCH in Excel with multiple criteria: This resource explains how to use the INDEX and MATCH functions with multiple criteria. It also includes an example, which demonstrates how these functions work with multiple criteria and actual data.
  • Using IF with AND, OR, and NOT functions in Excel: This resource combines IF with AND, OR, and NOT functions to create more complex functions. By combining these functions, you can perform your tasks more efficiently and cover more criteria at once.

Composite functions

  • SUMPRODUCT is a useful function in data analysis that multiplies arrays and returns the sum of those products. It can simplify complex calculations and save time.
=sumproduct(array1, [array2]....)
  • When using SUMPRODUCT, remember that arrays, which are collections of values in cells, should always be inside the parentheses.
  • SUMPRODUCT can be an efficient way to calculate total revenue and profit margin in business scenarios, as it can handle both multiplication and addition calculations simultaneously.
  • Always remember to cite your sources when using a process or method you learned from elsewhere - this is referred to as "stealing with pride".
  • Profit margin A percentage that indicates how many cents of profit has been generated for each dollar of sale

Start working with pivot tables

  1. Pivot tables are valuable analytical tools that provide various ways to view and interpret data, thus aiding in the discovery of trends and insights.
  2. They can perform a range of calculations, such as sums and averages, which is beneficial when dealing with extensive datasets.
  3. Pivot tables facilitate the grouping of data, for instance, annual revenue, making trend identification more straightforward.
  4. They can be used to test hypotheses and delve deeper into the data when unexpected patterns emerge.
  5. The use of filters and calculated fields alongside pivot tables can further enhance data analysis.Calculated field A new field within a pivot table that carries out certain calculations based on the values of other fields
  6. Pivot tables allow for the filtering and calculation of data, providing more precise results.
  7. The accuracy of data within a pivot table can be verified, ensuring reliable analysis.
  8. Formulas can be leveraged to compare data across years, providing a broader context for understanding trends.
  9. Regular practice and hands-on experience with functions, formulas, and pivot tables can significantly improve data analysis proficiency.

Read more