Master's in Statistics Student at Columbia University
Welcome to my website! I’m Yonathan Amare, a Master’s student in Statistics at Columbia University with research interests in machine learning, statistical computing, and their applications in personalization and natural language processing. I graduated from Loyola Marymount University, where I was an NCAA Division I cross country and track student-athlete. In my free time, I enjoy reading biographical science books, cooking, and listening to music...while cooking.
Developed a Python scraper with Selenium WebDriver to automate data extraction for over 1,000 missing entries from pharmacy websites, improving database accuracy and efficiency. Wrote the results to MySQL databases from the same Python scripts.
Spearheaded the storage and organization of large datasets in a cloud-hosted SQL database, enabling dynamic data analysis and strategic decision-making for pharmacy cost management. Managed workflows and project deadlines through Confluence and Jira.
Collaborated on a project providing small pharmacies with tools to analyze discounts and profits. Delivered, under tight deadlines, a solution combining web scraping, data analysis, and database management that strengthened their bargaining power.
Provided faculty with assistance in the use of audio/visual technology in the classroom.
Handled incoming calls and provided over-the-phone troubleshooting.
Documented and followed up with issues through issue tracking software (ServiceNow).
Committed to an intensive training plan of 15+ hours a week.
Developed teamwork, communication, and leadership through collaborative group efforts on and off the course/track with a team of 20+.
Applied time management skills while balancing practice schedule and schoolwork.
Supervised Machine Learning: Regression and Classification
Joining Data with Pandas, Data Manipulation with Pandas, Introduction to Data Science in Python, Exploratory Data Analysis in Python, Machine Learning with Tree-Based Models in Python, Linear Classifiers in Python, Introduction to Linear Modeling in Python, Cluster Analysis in Python
Microsoft Office Specialist: Excel Associate (Office 2019)
Bloomberg Market Concepts
Used Python libraries (NumPy, Pandas, Matplotlib, Scikit-learn) to conduct EDA, create visualizations, and build machine learning models. Analyzed feature importance using Random Forest and Decision Tree algorithms, identifying key variables impacting EPA contract awards, and provided actionable insights to improve access for minority- and women-owned businesses.
Handled data ingestion, cleaning, and preparation, followed by exploration and visualization to identify patterns. Developed machine learning models (decision trees, random forests) and applied regularized regression (ridge, lasso) to predict profitability. Used statistical methods to optimize model performance and draw actionable business insights.
Hosted a static resume website on AWS, using S3, CloudFront, DynamoDB, and Lambda for security, optimization, and performance. Implemented a live visitor counter with a Python Lambda function backed by DynamoDB, a NoSQL database. Developed the frontend with HTML, CSS, and JavaScript, showcasing both backend and frontend integration.
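For readers curious how a visitor counter like this can work, here is a minimal sketch, not the site's actual code. It assumes a single DynamoDB item holding a numeric `visits` attribute that is incremented atomically on each request. The names `make_handler`, `InMemoryTable`, the key `resume-site`, and the attribute `visits` are all hypothetical, and `InMemoryTable` is only a stand-in so the sketch runs without AWS credentials.

```python
import json


class InMemoryTable:
    """Hypothetical stand-in for a DynamoDB table so the sketch runs locally."""

    def __init__(self):
        self._items = {}

    def update_item(self, Key, UpdateExpression, ExpressionAttributeValues, ReturnValues):
        # Mimics only the "ADD visits :inc" update used by the handler below.
        item = self._items.setdefault(Key["id"], {"id": Key["id"], "visits": 0})
        item["visits"] += ExpressionAttributeValues[":inc"]
        return {"Attributes": {"visits": item["visits"]}}


def make_handler(table):
    """Build a Lambda-style handler that increments and returns the visit count."""

    def handler(event, context):
        # ADD creates the attribute if it is missing and increments it atomically.
        response = table.update_item(
            Key={"id": "resume-site"},  # assumed partition key and value
            UpdateExpression="ADD visits :inc",
            ExpressionAttributeValues={":inc": 1},
            ReturnValues="UPDATED_NEW",
        )
        # int() also covers the Decimal values that boto3 returns for numbers.
        count = int(response["Attributes"]["visits"])
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"visits": count}),
        }

    return handler


# Local usage with the stand-in table.
handler = make_handler(InMemoryTable())
print(handler({}, None)["body"])
```

In a real deployment, the stand-in would presumably be replaced by a boto3 table resource, for example `boto3.resource("dynamodb").Table(...)` with whatever table name the site uses, and the frontend JavaScript would fetch the JSON body to display the count.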
Python, Pandas, NumPy, Matplotlib, Seaborn, Scikit-Learn, MySQL
Data Analysis, Machine Learning, Data Visualization, Data Modeling
Jupyter Notebook, Visual Studio Code, Google Colab, GitHub, Confluence, Jira, Microsoft Office (Word, Excel, PowerPoint, Access, Outlook)
AWS (S3, CloudFront, Lambda, DynamoDB)
Version Control, Git, Troubleshooting, Critical Thinking, Teamwork, Communication, Adaptability, Time Management