

analytics-as-a-service platforms, that makes it easier to apply analytics for both novice and expert users. At the same time, the need for experts in analytics is increasing and the number of analytics applications is growing. Generating insights and value from data has become an important asset for organizations.

In higher education, predictive analytics can provide actionable insights to diverse stakeholders such as administrators, instructors, and students. Separate feature sets are typically used for different prediction tasks, e.g., student activity logs for predicting in-course performance and registrar data for predicting long-term college success. However, little is known about the overall utility of different data sources across prediction tasks and the fairness of their predictions with respect to different subpopulations. Using data from over 2,000 college students at a large public university, we examined the utility of institutional data, learning management system (LMS) data, and survey data for accurately and fairly predicting short-term and long-term student success. We found that institutional data and LMS data both have decent predictive power, but survey data show very little predictive utility. Combining institutional data with LMS data leads to even higher accuracy than using either alone. In terms of fairness, using institutional data consistently underestimates historically disadvantaged student subpopulations more than their peers, whereas LMS data tend to overestimate some of these groups more often. Combining the two data sources does not fully neutralize the biases and still leads to high rates of underestimation among disadvantaged groups. Moreover, algorithmic biases affect not only demographic minorities but also students with acquired disadvantages. These analyses serve to inform more cost-effective and equitable use of student data for predictive analytics applications in higher education.
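
The subgroup comparison of underestimation and overestimation can be illustrated with a minimal sketch. It assumes, as one plausible reading rather than the study's stated definition, that underestimation is the false negative rate (a student who succeeded was predicted not to) and overestimation is the false positive rate, each computed per group. The function name, the group labels, and the toy records are hypothetical.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Compute per-group underestimation and overestimation rates.

    records: iterable of (group, actual, predicted) with binary 0/1 outcomes.
    Assumption: underestimation = false negative rate,
                overestimation  = false positive rate.
    """
    counts = defaultdict(lambda: {"fn": 0, "pos": 0, "fp": 0, "neg": 0})
    for group, actual, predicted in records:
        c = counts[group]
        if actual == 1:
            c["pos"] += 1
            c["fn"] += int(predicted == 0)  # succeeded, predicted not to
        else:
            c["neg"] += 1
            c["fp"] += int(predicted == 1)  # did not succeed, predicted to
    return {
        group: {
            "underestimation": c["fn"] / c["pos"] if c["pos"] else None,
            "overestimation": c["fp"] / c["neg"] if c["neg"] else None,
        }
        for group, c in counts.items()
    }


# Hypothetical toy data: (group, actual success, predicted success)
records = [
    ("first_gen", 1, 0), ("first_gen", 1, 1),
    ("first_gen", 0, 0), ("first_gen", 0, 0),
    ("continuing_gen", 1, 1), ("continuing_gen", 1, 1),
    ("continuing_gen", 0, 1), ("continuing_gen", 0, 0),
]
rates = error_rates_by_group(records)
```

In this toy data the `first_gen` group is underestimated half the time and never overestimated, while the reverse holds for `continuing_gen`; comparing such rates across groups is one way to surface the kind of disparity described here.
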
Gaining insight into course choices holds significant value for universities, especially those that aim for flexibility in their programs and wish to adapt quickly to changing demands of the job market. However, little emphasis has been put on utilizing the large amount of educational data to understand these course choices. Here, we use network analysis of the course selection of all students who enrolled in an undergraduate program in engineering, psychology, business or computer science at a Nordic university over a five-year period. With these methods, we have explored student choices to identify their distinct fields of interest. This was done by applying community detection to a network of courses, where two courses were connected if a student had taken both. We compared our community detection results to actual major specializations within the computer science department and found strong similarities. To complement this analysis, we also used directed networks to identify the "typical" student, by looking at students' general course choices by semester. We found that course choices diversify as programs progress, meaning that attempting to understand course choices by identifying a "typical" student gives less insight than understanding what characterizes course choice diversity. Analysis with our proposed methodology can be used to offer more tailored education, which in turn allows students to follow their interests and adapt to the ever-changing career market.
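
The course network described above (two courses linked when a student has taken both) can be sketched in a few lines. The text does not say which community detection algorithm was used, so the sketch substitutes a simple deterministic label propagation as a stand-in; weighting edges by the number of shared students is likewise an assumption. Student IDs, course codes, and function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_course_network(enrollments):
    """Return {course: {neighbor: weight}}.

    Two courses are connected if at least one student took both;
    the weight (an assumption here) counts how many students did.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for courses in enrollments.values():
        for a, b in combinations(sorted(set(courses)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def label_propagation(graph, max_rounds=100):
    """Weighted label propagation; ties go to the smallest label.

    A stand-in technique, not necessarily the one used in the study.
    """
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            votes = Counter()
            for neighbor, weight in graph[node].items():
                votes[labels[neighbor]] += weight
            if not votes:
                continue
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def communities(labels):
    """Group courses by final label, as sorted lists of course codes."""
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical enrollments: student -> courses taken
enrollments = {
    "s1": ["ALG", "DB", "OS"],
    "s2": ["ALG", "DB", "OS"],
    "s3": ["STAT", "PSY1", "PSY2"],
    "s4": ["STAT", "PSY1", "PSY2"],
    "s5": ["OS", "STAT"],  # a single student bridging the two clusters
}
result = communities(label_propagation(build_course_network(enrollments)))
```

On this toy input the two course clusters are recovered despite the bridging student, because the within-cluster edges carry more weight than the single cross-cluster link. Comparing such clusters to known specializations mirrors the validation step described above.
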
