AI VS WALL STREET: LOAN PORTFOLIO OPTIMIZATION
A machine learning research project demonstrating how predictive modeling and quantitative optimization can be used to construct higher-performing loan investment portfolios.
The project analyzes large-scale lending data from Prosper Marketplace and applies statistical learning techniques to identify patterns that signal loan default risk and expected investor return.
Using multiple machine learning algorithms (logistic regression, naïve Bayes, random forests, gradient boosting, and LightGBM), the system predicts loan outcomes and estimates expected returns across several investment strategies.
The models achieved roughly 85% accuracy in predicting loan outcomes, enabling the construction of optimized portfolios that prioritize loans with high expected return and low default risk.
The project demonstrates how machine learning and financial optimization methods can be combined to build data-driven investment strategies for peer-to-peer lending markets.
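The portfolio construction step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: the `Loan` fields, the one-period expected-return formula, the `loss_given_default` parameter, the `max_p_default` risk cap, and the sample loans are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Loan:
    loan_id: str
    interest_rate: float  # rate promised to the investor, e.g. 0.12
    p_default: float      # model-predicted probability of default


def expected_return(loan, loss_given_default=1.0):
    """Simplified one-period expected return per dollar invested.

    Assumption: a repaid loan earns its interest rate, and a defaulted
    loan loses `loss_given_default` of the principal.
    """
    gain = (1 - loan.p_default) * loan.interest_rate
    loss = loan.p_default * loss_given_default
    return gain - loss


def build_portfolio(loans, size, max_p_default=0.2):
    """Drop loans above the risk cap, then keep the `size` best by expected return."""
    eligible = [loan for loan in loans if loan.p_default <= max_p_default]
    ranked = sorted(eligible, key=expected_return, reverse=True)
    return ranked[:size]


# Hypothetical loans; in practice p_default would come from a trained classifier.
loans = [
    Loan("A", 0.08, 0.02),
    Loan("B", 0.15, 0.10),
    Loan("C", 0.25, 0.30),
    Loan("D", 0.12, 0.03),
]
portfolio = build_portfolio(loans, size=2)
print([loan.loan_id for loan in portfolio])  # ['D', 'A']
```

Loan C offers the highest interest rate but is excluded by the risk cap, and its expected return under these assumptions is negative anyway. This is the kind of trade-off between promised return and predicted default risk that the ranking is meant to capture.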
ABOUT
New York University Stern School of Business & Columbia Business School:
The methodology and analytical framework explored in this project have been referenced in connection with the study “Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science,” a widely cited academic case used in analytics and data science programs at NYU Stern and Columbia Business School.
Towards Data Science:
“Optimizing a Loan Portfolio Using a Data-Driven Strategy” was published in Towards Data Science, a widely followed data science publication reaching a global audience of machine learning practitioners, quantitative analysts, and AI researchers.
Big Data Journal:
The academic study “Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science” was published in the peer-reviewed journal Big Data, which focuses on the application of advanced analytics and machine learning to large-scale real-world problems.
PRESS
INFLUENTIAL TECHNOLOGIES & RESEARCH
Random Forest - Leo Breiman
An ensemble learning method that aggregates multiple decision trees to improve predictive accuracy and robustness in classification and regression tasks.
Gradient Boosting & LightGBM - Microsoft Research
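Gradient boosting builds an ensemble by adding decision trees one at a time, each correcting the errors of the trees before it. LightGBM is a high-performance gradient boosting framework developed at Microsoft and widely used in large-scale machine learning systems and financial modeling.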
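The tree-aggregation idea behind random forests mentioned above can be illustrated with a small standard-library sketch. It is a deliberately simplified stand-in: it bags depth-1 trees (decision stumps) trained on bootstrap samples and combines them by majority vote, and it omits the per-split feature subsampling that a full random forest adds. The feature names (credit score, debt-to-income ratio) and the toy data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Return the (feature, threshold, left_label, right_label) with fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Aggregate the individual stumps by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (credit score, debt-to-income); label 1 = default.
rows = [(720, 0.10), (700, 0.15), (690, 0.20), (680, 0.18),
        (600, 0.45), (590, 0.50), (580, 0.40), (570, 0.55)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
print(predict_forest(forest, (710, 0.12)), predict_forest(forest, (560, 0.60)))
```

Because each stump sees a different bootstrap sample, individual stumps may choose slightly different thresholds. The vote averages out that variation, which is the source of the robustness attributed to ensemble methods.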