College Student Performance Analysis

Developed for Stats 101A at UCLA

By analyzing how several student and campus factors relate to student performance (GPA), we determined which educational and socioeconomic factors affect students’ performance the most.

View GitHub

I. Introduction

In this project, we use regression analysis to investigate the campus climate factors that could affect undergraduate students’ academic performance, measured by GPA (grade point average on a 4-point scale). Our research question is: How is a student’s GPA affected by various educational and socioeconomic factors?
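As a rough illustration of the kind of regression analysis described here, the following is a minimal sketch of a one-predictor least-squares fit in plain Python. It is not the project's actual model: the function names, the single "weekly study hours" predictor, and the toy numbers are all hypothetical and chosen only to make the example runnable.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line (y must not be constant)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical toy data, NOT from the project: weekly study hours vs. GPA.
    study_hours = [0, 2, 4, 6, 8]
    gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
    b0, b1 = fit_simple_ols(study_hours, gpa)
    print(f"GPA = {b0:.2f} + {b1:.2f} * hours")
    print(f"R^2 = {r_squared(study_hours, gpa, b0, b1):.3f}")
```

A study with several educational and socioeconomic predictors would use multiple regression instead; the single-predictor case is shown only because it fits in a few lines without external libraries.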

Predicting Car Accident Severity

Developed for Stats 101C at UCLA

Using exploratory data analysis, imputation, and machine learning classification techniques, we predicted the severity of car accidents (mild or severe) in the United States using a countrywide car accident dataset.

View GitHub | View Kaggle

Abstract

The goal of this project was to predict the severity of car accidents using a provided countrywide traffic accident dataset. We submitted our results to a Kaggle competition in which our scores were ranked against those of our classmates across two lectures. The dataset was large, with 50,000 observations across the training and testing datasets. We were tasked with creating multiple predictive models based on provided and independently developed predictors to achieve a high score on Kaggle, which essentially measures how well our predictions match the real ‘test’ classifications. Our final model was a Random Forest model that produced a final score of 0.9355, placing us 14th in our lecture.
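To make the scoring idea concrete, here is a minimal sketch of how such a score could be computed, assuming the Kaggle metric is plain classification accuracy (the share of predictions that match the true labels). The abstract only implies this, so treat it as an assumption; the function name and the "MILD"/"SEVERE" label strings are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that exactly match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


if __name__ == "__main__":
    # Hypothetical labels for illustration only, not project data.
    predicted = ["MILD", "SEVERE", "MILD", "MILD"]
    actual = ["MILD", "SEVERE", "SEVERE", "MILD"]
    print(accuracy(predicted, actual))  # 3 of 4 predictions match
```

Under this assumption, a score of 0.9355 would mean that roughly 93.6% of the submitted test predictions matched the held-out classifications.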

Opioid Crisis and Possible Resolutions

Developed for Math 42 at UCLA

Given drug use data of several US states, we created a mathematical model that predicts the regions where opioid use originated and spread.

View GitHub

Abstract

We were tasked with offering insight to the Chief Administrator, Drug Enforcement Administration and the National Forensic Laboratory Information System by creating a mathematical model that predicts the regions where opioid use originated and spread.