In my previous post, which you can read here, I discussed the third step in the data science process, process the data, as I apply it to L&D. The next step in the data science process is to explore the data. The goal of this step is to understand the data. The step after this one is to analyse the data, and you cannot analyse data you do not understand, which is why this step is so important. It is typically called exploratory data analysis (EDA). Some of the questions you will want to answer with EDA are:
- What are the characteristics of the data?
- How do the data variables relate to each other?
- Are there any correlations in the data?
- Are there any patterns or trends in the data?
In EDA you will use analytic techniques such as descriptive statistics and visualizations. Descriptive statistics you might use include:
- Measures of central tendency such as mean and median
- Measures of variability such as interquartile range and standard deviation
- Frequency distributions
- Counts and sums
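To make these measures concrete, here is a minimal sketch that uses only Python's standard library. Python is my choice for illustration; the same figures can be produced in Excel or R. The scores are invented for the example and the `describe` helper is hypothetical, not something the team at XL Support used.

```python
from collections import Counter
from statistics import mean, median, quantiles, stdev

# Hypothetical self-rated IT skill scores (1 = low, 5 = high); not real survey data.
scores = [2, 3, 3, 4, 5, 1, 3, 4, 2, 3]


def describe(values):
    """Return the descriptive statistics listed above for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points (default 'exclusive' method).
    q1, _, q3 = quantiles(values, n=4)
    return {
        "count": len(values),
        "sum": sum(values),
        "mean": mean(values),
        "median": median(values),
        "iqr": q3 - q1,              # interquartile range
        "stdev": stdev(values),      # sample standard deviation
        "frequencies": dict(sorted(Counter(values).items())),
    }


summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Other tools may calculate quartiles slightly differently, so the interquartile range can vary a little from one tool to another on small data sets.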
Visualizations you might use are:
- Bar plots
- Scatter plots
- Box plots
- Line plots
- Density plots
These are simple analytic techniques that can be applied to data with tools such as Microsoft Excel and the open-source statistical programming language R.
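As a rough illustration of the simplest of these visualizations, the sketch below draws a bar plot as plain text, so it runs without a charting tool. It assumes Python rather than Excel or R, and the survey answers and the `text_bar_plot` helper are invented for the example. In practice you would produce a proper chart in whichever tool you prefer.

```python
from collections import Counter

# Hypothetical survey answers: the IT skill each respondent most wants help with.
responses = ["Excel", "Email", "Excel", "Teams", "Excel", "Email", "Teams", "Excel"]


def text_bar_plot(values):
    """Return one line per category, most frequent first, using '#' as the bar."""
    counts = Counter(values).most_common()
    width = max(len(label) for label, _ in counts)  # align the bars
    return [f"{label.ljust(width)} | {'#' * n} {n}" for label, n in counts]


bars = text_bar_plot(responses)
print("\n".join(bars))
```

Even a plot this crude answers one of the EDA questions at a glance: which category comes up most often.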
How would this work with the data the L&D team at XL Support collected for the question they need to answer? Here is a reminder of the data sources the L&D team identified:
- Details of face-to-face compliance training from the organisation's Compliance Training Tracker dashboard.
- Support needs of all the people XL Support cares for. These support needs will be used to identify what staff need to know and be able to do. As it stands, the support needs are in 120 separate spreadsheets with messy data entered in different formats.
- Return data from an IT skills needs survey administered through SurveyMonkey.
- A list of key leadership and management behaviours and expected outcomes that the leadership team believes should form the basis of the organisation's leadership development programme for first-line managers.
This is simple data, so the EDA won't be that complex. Some of the things the L&D team can expect to learn from the data are:
- The number of people who need each compliance course.
- Which support needs relate to which team.
- The most needed IT skills and how skill levels vary between teams.
- The leadership team's impression of what their managers need to learn.
These discoveries can be presented in different formats, but summary tables and visualizations are particularly helpful.
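Here is a minimal sketch of how two of those summary tables might be built: the number of people needing each compliance course, and the average IT skill score with its spread for each team. It assumes Python with the standard library only. All names, courses, teams and scores are made up for illustration and are not XL Support's data.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev

# Hypothetical records; every name, course, team and score here is invented.
training_needs = [
    ("Ana", "Fire Safety"), ("Ben", "Safeguarding"), ("Cara", "Fire Safety"),
    ("Dev", "First Aid"), ("Eli", "Safeguarding"), ("Fay", "Fire Safety"),
]
it_scores = [
    ("North", 2), ("North", 4), ("North", 3),
    ("South", 5), ("South", 1), ("South", 3),
]

# Table 1: number of people who need each compliance course.
course_counts = Counter(course for _, course in training_needs)

# Table 2: average IT skill score and its spread (sample standard deviation) per team.
by_team = defaultdict(list)
for team, score in it_scores:
    by_team[team].append(score)
team_summary = {
    team: {"mean": mean(scores), "stdev": round(stdev(scores), 2)}
    for team, scores in by_team.items()
}

print(f"{'Course':<14}{'People':>6}")
for course, n in course_counts.most_common():
    print(f"{course:<14}{n:>6}")
print()
print(f"{'Team':<8}{'Mean':>6}{'SD':>6}")
for team, stats in team_summary.items():
    print(f"{team:<8}{stats['mean']:>6}{stats['stdev']:>6}")
```

In this invented data both teams have the same average score, but the South team's scores are more spread out. That is the kind of between-team variability a single average would hide.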
The next step in the data science process is to perform in-depth analysis. This is where analytical techniques such as machine learning come in. As I mentioned in an earlier post, the need for machine learning in learning and development is currently not high, nor is it necessary. I believe that will change: the more data-savvy we become, the more areas and uses we will identify for more complex analytic techniques. For this particular case the next stage will not be necessary, but I will still write a generic post showing how the step may apply to L&D.