In this activity, we will work with two programs that can help us manipulate and visualize data.
Activity One: Excel
Excel is a powerful tool for storing data and performing calculations, as well as creating graphs to visualize large quantities of data. Today, we you will watch a video where I explain how to do several Excel functions, and you will work along, using the following dataset. Please download it now:
If you do not have Excel on your computer, you’ll find instructions on how to get it free of charge directly from Microsoft: Microsoft Office Free for Students. Note: If you are not using Microsoft 365, but have an older version, some of the features may not be as described in this tutorial. There may also be differences between Mac and Windows versions of the software. This is a good resource if you are having trouble with an older version: Microsoft Office Cheat Sheets.
PLEASE NOTE: At this end of this activity, you will submit an Excel spreadsheet with all the graphs and tables we are working through today. So, as you go, do not delete any of your work. Instead, you can just move to a new location in the spreadsheet.
You will also notice that I am asking you to take notes on the questions below. You will NOT submit these questions to me as part of the submission for this assignment, but DO keep a copy of those notes, as this information is likely to appear on a future exam.
1. Open 202-raw-sensor-data-FOP in Excel. With your breakout group, scroll through the data, and answer the following questions:
- How many data points are contained in this dataset?
- What sort of data is this? How do you think it was collected?
- Over what period of time was this data collected?
- Generate two or three questions that could conceivably be investigated with this dataset.
2. Now, create a line chart using the data in the “Temperature” column.
- First, add a “Reconcile” column.
- (Feel free to reformat the header, and adjust the size of the columns in whatever way makes it easiest for you to explore the data).
- Describe any pattern you see in this graph. How might you be able to explain this pattern?
- Do you see anything unusual in this graph? Investigate the dataset and see if you can locate the source of any anomalies.
- What is the best explanation for any outlying data points? How would you recommend dealing with them as part of the process of cleaning the data?
- Now, consider the X-axis of your graph. How is this data organized, based on the X-axis?
- What, ideally, would be the variable shown on the X-axis?
3. Now, create line charts for “Wind Speed,” “Wind Direction,” “Solar Radiation,” and “Rain.”
- Note any observations you have about these data sets.
- Can you identify any patterns and interactions between these variables (you can include Temperature in this discussion).
- Do you have any new/different ideas about what research questions could be investigated with these data?
4. Now, we will perform some manipulation of these data.
- Concatenate the “Month” and “Day columns into a single “Date” column, formatted with a backslash separating the numbers (e.g. 5/22).
- If “Month” is in column B, and “Day” is in column C, the formula is:
- =CONCATENATE(B2,”/”,C2)
- Concatenate the “Hour” and “Minute” columns into a single “Time” column, formatted with a colon separating the numbers (e.g. 15:10).
- If “Hour” is in column D, and “Minute” is in column E, the formula is:
- =CONCATENATE(D2,”:”,E2)
- Split the “Recorded By” column into two columns: “First Name,” and “Last Name.”
5. Let’s do a few of calculations, and then graph our results.
- Use “AVERAGEIF” to calculate the average temperature for each day.
- If “Date” is in column B, “Temperature” is in column E, and you’ve created the table with each data in column L, the formula is:
- =AVERAGEIF(B:B,L2,E:E)
- Use “AVERAGEIF” to calculate the average solar radiation for each day.
6. Graph the results of the previous calculations. You may choose line, column, or area graphs; feel free to experiment with different chart types and see which ones seem to visualize the data most effectively.
- Create a graph that visualizes both average temperatures and solar radiation for each day. Each variable should be graphed on its own Y-axis.
7. Now we will retrieve some data (without doing any calculations), and graph these results. Again, feel free to experiment with different chart types.
- Use “MINIFS” to retrieve the low temperature for each day.
- If “Temperature” is in column E, “Date” is in column B, and you’ve created the table for each day in column L, the formula is:
- =MINIFS(E:E,B:B,L2)
- Use “MAXIFS” to retrieve the high temperature for each day.
- If “Temperature” is in column E, “Date” is in column B, and you’ve created the table for each day in column L, the formula is:
- =MAXIFS(E:E,B:B,L2)
- Create a graph that visualizes both low and high daily temperatures.
8. On Your Own:
- Create one additional graph that compares any two variables (your choice) in the dataset.
- Do whatever research is needed to gather information, then use formulas to:
- Create a table that contains the number of incidences of rain for each day
- Calculate the average low temperature and average high temperature for the dataset
- Calculate the average windspeed for each day
- Graph average windspeed for each day and average wind direction for each day on a single graph
When you have completed all activities, please upload your Excel spreadsheet, containing all the graphs described above, to the appropriate Canvas assignment.
Activity Two: Python COMING SOON