Fervent | Finance Courses, Investing Courses

Rigorous Courses, Backed by Research, Taught with Simplicity.

12 Practical, Applied Finance Data Science Projects – Beginner, Intermediate, and Advanced Levels

August 17, 2022 By Vash

Do you want to build your financial data science skills? Whether you’re doing this to become a Financial Data Scientist, or to leverage the power of Financial Data Science for your own investments – take your pick from the finance data science projects we’ve created below and build your knowledge and skills as you work on them.

Projects get progressively more challenging as you go further down. Do you have what it takes to complete all the projects?

All of these finance data science projects can be worked on using any/all of the following tools:

  • Microsoft Excel® / Google Sheets
  • Python
  • R
  • MATLAB

For the Advanced Level Projects, in particular, we strongly recommend using a programming language instead of Excel® / Google Sheets. It’s possible to work on them using Excel® / Google Sheets, but it’s far from efficient to do so.

NOTE: we’ve intentionally avoided projects on fraud detection, customer segmentation, transaction forecasting, etc., since data for these types of projects is tricky to get in the real world. It’s possible to obtain “toy” or dummy datasets for these types of projects, but we prefer to focus on real-world projects that require real-world data.

Table of Contents
1 Beginner Level Finance Data Science Projects
1.1 Project B-1: Calculate Stock Returns
1.2 Project B-2: Calculate The Total Risk of a Stock
1.3 Project B-3: Calculate The Market Risk of a Stock
2 Intermediate Level Finance Data Science Projects
2.1 Project I-1: Extract Financial Data Programmatically
2.2 Project I-2: Evaluate the Historic Performance Of Your Investment Portfolio
2.3 Project I-3: Formally Test The Validity of the Capital Asset Pricing Model (CAPM)
2.4 Project I-4: Test and Validate The Weak Form Of The Efficient Market Hypothesis
3 Advanced Level Finance Data Science Projects (❗️Not For The Faint Hearted)
3.1 Project A-1: Optimize Portfolios
3.2 Project A-2: Formally Test The Validity of the Fama French 3 Factor Model
3.3 Project A-3: Identify Themes Within Annual Reports
3.4 Project A-4: Conduct an Event Study To Evaluate The Impact of A ‘Major Event’ On Financial Markets
3.5 Project A-5: Test and Validate An Investment Hypothesis / Thesis
4 Summary and Next Steps

Beginner Level Finance Data Science Projects

Project B-1: Calculate Stock Returns

Discover powerful concepts like the Random Walk with this simple project on calculating stock returns.

💼 Project Brief

Calculate historical stock returns for a stock of your choice.

Plot out a line chart of the stock’s historic returns. This will also help you later on, when you’re working on the Intermediate Level Financial Data Science Projects (particularly, Project I-4).

☑️ What You’ll Need

To successfully complete this project, you’ll need access to historical price data for at least 1 stock of your choice.

Historical data for stock prices are available from Yahoo! Finance or from Google Sheets using the GOOGLEFINANCE() function.

Since this is a Beginner Level Project, we’re not suggesting the use of APIs to extract data programmatically. This is covered in an Intermediate Level Project, I-1. If you’re already comfortable using an API to extract data programmatically, feel free to do so for this project, too.

If you’re not sure how to calculate stock returns, check out the article linked.
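
As a rough sketch of the calculation, a simple (arithmetic) return is the current price divided by the previous price, minus one. The prices below are made up for illustration, and the function name is our own; swap in the closing prices you downloaded.

```python
def simple_returns(prices):
    """Return period-over-period simple returns: r_t = P_t / P_{t-1} - 1."""
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


# Hypothetical closing prices, oldest first (not real market data).
prices = [100.0, 102.0, 99.96, 104.958]
returns = simple_returns(prices)
print([round(r, 4) for r in returns])
```

Note that a series of N prices yields N - 1 returns, which is the series you would then plot as a line chart.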

⏳ Expected Time To Complete
Approximately 30-60 minutes, depending on your experience of working with stock price data.

Project B-2: Calculate The Total Risk of a Stock

💼 Project Brief

Calculate the total risk of your chosen stock using historical data for stock prices.

☑️ What You’ll Need

To successfully complete this project, you’ll need access to historical stock price data for at least 1 stock of your choice. Historical stock price data is available from Yahoo! Finance. Ideally, you should use the same stock you used in Project B-1. This will allow you to compare the stock’s returns (or expected returns) with its total risk.
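
Total risk is commonly measured as the standard deviation of returns. The sketch below assumes daily returns and annualizes with 252 trading days per year, which is a convention rather than a rule. The return values and the function name are hypothetical.

```python
import math
import statistics


def total_risk(returns, periods_per_year=252):
    """Sample standard deviation of returns, plus an annualized figure.

    periods_per_year=252 assumes daily returns; use 12 for monthly data.
    """
    per_period = statistics.stdev(returns)  # sample std (n - 1 denominator)
    annualized = per_period * math.sqrt(periods_per_year)
    return per_period, annualized


# Hypothetical daily returns (not real market data).
daily_returns = [0.01, -0.02, 0.015, 0.005, -0.01]
daily_vol, annual_vol = total_risk(daily_returns)
print(f"daily: {daily_vol:.4f}, annualized: {annual_vol:.4f}")
```

Scaling by the square root of time assumes returns are uncorrelated from one period to the next, so treat the annualized figure as an approximation.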

⏳ Expected Time To Complete
Approximately 30-60 minutes, depending on your experience of working with stock price data.

Project B-3: Calculate The Market Risk of a Stock

Market Risk – aka Systematic Risk – refers to the stock’s exposure to the overall market portfolio, typically measured by the stock’s beta.

💼 Project Brief

Calculate systematic risk for your stock.

☑️ What You’ll Need

To successfully complete this project, you’ll need access to historical price data for at least 1 stock of your choice. Historical stock price data is available from Yahoo! Finance. Ideally, you should use the same stock you used in Project B-1 (and/or Project B-2). This will allow you to compare the stock’s returns (or expected returns) and its total risk with its exposure to the stock market portfolio.

If you don’t know how to calculate the market risk (systematic risk) of a stock, take a look at the article linked above.
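
One common way to quantify this exposure is beta: the covariance of the stock’s returns with the market’s returns, divided by the variance of the market’s returns. Here is a minimal sketch with made-up data, where the stock is constructed to move exactly twice as much as the market so the result is easy to check.

```python
import statistics


def beta(stock_returns, market_returns):
    """Beta = Cov(stock, market) / Var(market), using sample estimates."""
    if len(stock_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    mean_s = statistics.fmean(stock_returns)
    mean_m = statistics.fmean(market_returns)
    n = len(market_returns)
    cov = sum((s - mean_s) * (m - mean_m)
              for s, m in zip(stock_returns, market_returns)) / (n - 1)
    return cov / statistics.variance(market_returns)


# Hypothetical returns: the stock moves exactly twice as much as the market.
market = [0.01, -0.02, 0.03, 0.00, -0.01]
stock = [2 * m for m in market]
print(round(beta(stock, market), 4))
```

With real data you would use returns on a broad index as the proxy for the market portfolio.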

⏳ Expected Time To Complete

Approximately 30-60 minutes, depending on your experience of working with stock price data.


Related Course: Data-Driven Investing (with Python)

Get ahead of the game and learn the secrets to successful data-driven investing. You’ll gain insights into investment strategies and techniques used by quant hedge funds and the like. Data-Driven Investing Course.

Don’t let your ambition go to waste – enroll now and start building your data-driven investing system today!


Intermediate Level Finance Data Science Projects

These intermediate-level finance data science projects require at least a basic command of financial securities. They build on the projects above, but dive deeper into the data analytics side of financial data science.

You should be able to complete the beginner-level finance data science projects comfortably before attempting these.

Project I-1: Extract Financial Data Programmatically

💼 Project Brief

Connect to a financial data API of your choice and ‘pull’ financial data programmatically.

☑️ What You’ll Need

To successfully complete this project, you’ll need access to a Financial Data API / provider.

Take your pick from:

  • Nasdaq Data Link (formerly Quandl)
  • Pandas Datareader

Or search “finance data API” to explore current paid and free data providers.

If you’re using Microsoft Excel®, you can switch to Google Sheets and extract the data there using the “GOOGLEFINANCE()” function.

Aim to extract data for a minimum of 50 stocks, preferably 100+ stocks.
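
The API call itself depends on the provider you pick, so it’s left out here. What most providers have in common is that they return text (often CSV or JSON) that you then need to parse. The sketch below assumes a CSV payload with columns named Date and Close; the payload, the ticker "HYPO", and the function name are all hypothetical stand-ins.

```python
import csv
import io


def parse_price_csv(csv_text, price_column="Close"):
    """Parse CSV text into a list of (date, price) tuples, oldest first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(row["Date"], float(row[price_column])) for row in reader]
    return sorted(rows)  # ISO dates (YYYY-MM-DD) sort chronologically as text


# Stand-in for the text a data provider might return (made-up values).
sample_payload = """Date,Close
2022-01-04,101.5
2022-01-03,100.0
2022-01-05,99.75
"""

# One entry per ticker; in the real project you would loop over 50+ tickers.
prices_by_ticker = {"HYPO": parse_price_csv(sample_payload)}
print(prices_by_ticker["HYPO"][0])
```

Keeping the parsing step separate from the download step makes it easier to switch providers later.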

⏳ Expected Time To Complete

Approximately 1 – 2 hours, depending on your experience of working with large stock price data and programming knowledge.

Project I-2: Evaluate the Historic Performance Of Your Investment Portfolio

💼 Project Brief

Calculate portfolio return and portfolio risk of your investment portfolio.

☑️ What You’ll Need

You’ll need some sort of investment portfolio to successfully complete this project. This can be an actual investment portfolio you own, or one you intend to build from scratch.

Put differently, you’ll need a selection of stocks (and/or other securities from different asset classes if you’re feeling adventurous) and the historical price data for those stocks.

It’s best if you’ve worked on Project I-1 prior to starting this one. If you’ve been unable to work on Project I-1, but still want to work on this project, explore the stock market datasets available on Kaggle.
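
A minimal sketch of the portfolio calculation, assuming fixed weights that are rebalanced every period: the portfolio return in each period is the weighted sum of the asset returns, and portfolio risk is the standard deviation of that series. The two return series and the 60/40 weights are made up.

```python
import statistics


def portfolio_returns(weights, asset_returns):
    """Weighted sum of asset returns for each period.

    asset_returns is a list of per-asset return series of equal length;
    weights are held constant (i.e. rebalanced every period).
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [sum(w * r for w, r in zip(weights, period))
            for period in zip(*asset_returns)]


# Hypothetical return series for two assets (not real market data).
asset_a = [0.02, -0.01, 0.03, 0.00]
asset_b = [-0.01, 0.02, 0.01, 0.01]
weights = [0.6, 0.4]

port = portfolio_returns(weights, [asset_a, asset_b])
port_mean = statistics.fmean(port)
port_risk = statistics.stdev(port)
print(f"mean return: {port_mean:.4f}, risk: {port_risk:.4f}")
```

In this made-up example the portfolio’s risk comes out lower than the risk of either asset on its own, which is the diversification effect you should look for in your own data.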

⏳ Expected Time To Complete

Approximately 1 – 2 hours, depending on your experience of working with large stock price data and financial data analysis.

Project I-3: Formally Test The Validity of the Capital Asset Pricing Model (CAPM)

If you’re looking at working on intermediate-level finance data science projects, it’s almost guaranteed you’ve at least heard of the Capital Asset Pricing Model.

It’s probably one of the most controversial asset pricing models out there.

Its supporters passionately put the model on a pedestal, and its critics condemn it as something that’s utterly useless.

Who’s right? Find out for yourself!

💼 Project Brief

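Statistically test and validate the CAPM by running an OLS regression (a supervised machine learning model). If the t-stat on the slope coefficient (beta) is statistically significant, the market’s excess return helps explain your portfolio’s excess return, which supports the CAPM. A stricter test also checks that the intercept (alpha) is not statistically different from zero, since the CAPM predicts no return beyond compensation for market risk. If the model fails, join the critics and pave the way for a better model.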

☑️ What You’ll Need

One way to test this would be to set:

  • The historical returns of your portfolio as your dependent variable, and
  • The historical returns on the market portfolio as your independent variable

Use an appropriate risk-free rate to calculate excess returns.

Data sources for the set of historical returns have already been highlighted earlier in Project I-1 and I-2. You can obtain information on the appropriate risk-free rate on Bloomberg or the FT, for example.

Finally, run your OLS regression on your statistical tool/package of choice.

Be sure to include an intercept term if the tool/package doesn’t already do this by default.
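
The regression described above can be sketched in plain Python as follows. It regresses the portfolio’s excess returns on the market’s excess returns and reports the intercept (alpha), the slope (beta), and their t-stats. The six observations and the risk-free rate are made up purely so the code runs; a real test needs far more data, and a statistics package will give you p-values as well.

```python
import math
import statistics


def ols_with_intercept(x, y):
    """Simple OLS of y on x with an intercept; returns estimates and t-stats."""
    n = len(x)
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(e ** 2 for e in residuals) / (n - 2)  # residual variance
    se_beta = math.sqrt(sigma2 / sxx)
    se_alpha = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    return {"alpha": alpha, "beta": beta,
            "t_alpha": alpha / se_alpha, "t_beta": beta / se_beta}


# Hypothetical per-period data (not real market data).
rf = 0.001
market = [0.020, -0.010, 0.030, 0.005, -0.020, 0.015]
portfolio = [0.026, -0.013, 0.034, 0.009, -0.025, 0.017]

x = [m - rf for m in market]      # market excess returns
y = [p - rf for p in portfolio]   # portfolio excess returns
result = ols_with_intercept(x, y)
print({k: round(v, 4) for k, v in result.items()})
```

A large t-stat on beta alongside a small t-stat on alpha is the pattern that is consistent with the CAPM.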

⏳ Expected Time To Complete

Approximately 1.5 – 3 hours, depending on your experience of working with large stock price data, asset pricing models, and financial data analysis.

Project I-4: Test and Validate The Weak Form Of The Efficient Market Hypothesis

You’ve probably heard of the Efficient Market Hypothesis and how it comprises three different forms:

  • Weak Form Efficiency
  • Semi-Strong Form Efficiency, and
  • Strong Form Efficiency

Now, there are broadly 2 groups of people in Finance:

  • those that believe in the Efficient Market Hypothesis (EMH), and
  • those that don’t believe in it.

Put beliefs and opinions aside, and test the weak form of the EMH on your own.

💼 Project Brief

Formally test the validity of the weak form of the Efficient Market Hypothesis (EMH).

☑️ What You’ll Need

You’ll need historical returns of stocks and the market portfolio to formally test the weak form of the EMH.

Remember, the weak form of the EMH is true as long as abnormal returns cannot be earned consistently by using historic price information.
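
There are several ways to test this. One simple starting point, sketched below, is to check whether returns are serially correlated: if yesterday’s return reliably predicts today’s, past prices carry exploitable information. The z-score uses the large-sample approximation that the standard error of the autocorrelation is 1 / sqrt(n). The alternating return series is made up to produce an obvious result.

```python
import math
import statistics


def lag1_autocorrelation(returns):
    """Sample lag-1 autocorrelation of a return series."""
    mean_r = statistics.fmean(returns)
    dev = [r - mean_r for r in returns]
    num = sum(a * b for a, b in zip(dev[:-1], dev[1:]))
    den = sum(d * d for d in dev)
    return num / den


def autocorr_z_score(returns):
    """Approximate z-score under the null of no serial correlation.

    Uses the large-sample standard error 1 / sqrt(n).
    """
    return lag1_autocorrelation(returns) * math.sqrt(len(returns))


# Hypothetical returns that alternate in sign (strong negative autocorrelation).
alternating = [0.01, -0.01] * 20
rho = lag1_autocorrelation(alternating)
z = autocorr_z_score(alternating)
print(f"rho: {rho:.3f}, z: {z:.2f}")
```

A significant autocorrelation is evidence against weak form efficiency, but on its own it doesn’t show that abnormal returns can be earned consistently once trading costs are accounted for.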

⏳ Expected Time To Complete

Approximately 1.5 – 3 hours, depending on your experience of working with large stock price data, asset pricing models, and financial data analysis.



Advanced Level Finance Data Science Projects (❗️Not For The Faint Hearted)

Project A-1: Optimize Portfolios

💼 Project Brief

Optimize your financial investment portfolios to:

  • achieve a target expected return
  • minimize risk
  • maximize risk-adjusted returns

☑️ What You’ll Need

To optimize your investment portfolio, you’ll need data on the historical returns of the individual securities that make up your portfolio.

Data sources for the set of historical returns have already been highlighted earlier in Project I-1 and I-2.
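
To give a flavor of the “minimize risk” objective, here is a sketch for the special case of two assets, where the minimum-variance weights have a closed-form solution. Larger portfolios, target returns, and risk-adjusted objectives typically need a numerical optimizer instead. The return series and function names are hypothetical.

```python
import statistics


def sample_covariance(x, y):
    """Sample covariance with an n - 1 denominator."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def min_variance_weights(returns_a, returns_b):
    """Closed-form minimum-variance weights for a two-asset portfolio.

    w_a = (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab), w_b = 1 - w_a.
    Weights may fall outside [0, 1], which implies short selling.
    """
    var_a = statistics.variance(returns_a)
    var_b = statistics.variance(returns_b)
    cov_ab = sample_covariance(returns_a, returns_b)
    w_a = (var_b - cov_ab) / (var_a + var_b - 2.0 * cov_ab)
    return w_a, 1.0 - w_a


def portfolio_variance(w_a, returns_a, returns_b):
    """Variance of the blended return series for a given weight on asset A."""
    blended = [w_a * a + (1.0 - w_a) * b for a, b in zip(returns_a, returns_b)]
    return statistics.variance(blended)


# Hypothetical return series for two assets (not real market data).
asset_a = [0.02, -0.01, 0.03, 0.00]
asset_b = [-0.01, 0.02, 0.01, 0.01]
w_a, w_b = min_variance_weights(asset_a, asset_b)
print(f"w_a: {w_a:.4f}, w_b: {w_b:.4f}")
```

A quick sanity check is to nudge the weight slightly in either direction and confirm that the portfolio variance goes up.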

Feeling lost? Don’t worry – we actually teach these investment analysis / financial data science techniques in our course on Investment Analysis & Portfolio Management (with Excel®) as well as in our Investment Analysis & Portfolio Management (with Python) course.

You can enroll in either course to learn how to optimize investment portfolios this way. Both courses are identical in all aspects other than the tools used to conduct investment analysis (i.e., Excel® vs Python).

⏳ Expected Time To Complete

Approximately 3 – 5 hours, depending on your experience of working with large stock price data, asset pricing models, financial data analytics, and programming knowledge.

Project A-2: Formally Test The Validity of the Fama French 3 Factor Model

It paved the way for factor models and factor investing and continues to be applied in academia and in industry. But does the Fama French 3 Factor Model actually work?

Find out for yourself!

💼 Project Brief

Test the validity of the Fama French 3 Factor Model using an appropriate multivariate OLS regression (a supervised machine learning model).

☑️ What You’ll Need

Data on individual factors is available directly from the Kenneth French Data Library. Alternatively, for bonus points, you can replicate the factors from scratch.
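
A multivariate OLS regression can be sketched without any external libraries by solving the normal equations. The example below regresses excess returns on three factors (Mkt-RF, SMB, and HML, as they are labeled in the data library). All numbers are made up, and the returns are built from known loadings purely so you can confirm the regression recovers them. In practice a statistics package would also report standard errors and t-stats, which you need for a formal test.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def ols(factors, y):
    """OLS via the normal equations (X'X) b = X'y, with an intercept added.

    factors: list of rows, one row of factor values per period.
    Returns [alpha, loading_1, loading_2, ...].
    """
    X = [[1.0] + list(row) for row in factors]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical factor returns per period: (Mkt-RF, SMB, HML).
factors = [
    (0.020, 0.005, -0.010),
    (-0.010, 0.010, 0.005),
    (0.030, -0.005, 0.000),
    (0.005, 0.000, 0.010),
    (-0.020, -0.010, -0.005),
    (0.015, 0.008, 0.002),
]
# Excess returns built from known loadings so the fit can be checked.
true_alpha, true_loadings = 0.001, (1.1, 0.4, -0.3)
excess_returns = [true_alpha + sum(b * f for b, f in zip(true_loadings, row))
                  for row in factors]

alpha, b_mkt, b_smb, b_hml = ols(factors, excess_returns)
print(round(alpha, 6), round(b_mkt, 4), round(b_smb, 4), round(b_hml, 4))
```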

⏳ Expected Time To Complete

Approximately 1 – 5 hours, depending on your experience of asset pricing models, multiple linear regression, financial data analytics, and programming knowledge. It’s also influenced by whether you work with the ‘ready-made’ data from the Kenneth French Data Library, or if you opt to compute the factor returns from scratch.

Project A-3: Identify Themes Within Annual Reports

Take a break from structured data and leverage the power of unstructured data, specifically text data.

💼 Project Brief

Identify topics/themes within annual reports by using an appropriate artificial intelligence / unsupervised machine learning model.

☑️ What You’ll Need

To successfully complete this project, you’ll need access to a reasonably large dataset of firm-level annual reports.

You can use 10-K reports sourced from the SEC’s Edgar Database if you’re working with US firms (or those listed on US stock exchanges).

For firms listed in other countries, you’ll likely need to collect the data yourself from the companies’ websites. Databases do exist, but they tend to be quite expensive.

You’ll also need reasonable experience with a topic modeling technique like Latent Dirichlet Allocation (LDA) to identify themes in an unsupervised machine learning setting.
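
Topic models like LDA work on word counts, so the raw report text first needs to be turned into a document-term matrix. The sketch below covers only that preparation step; fitting the topic model itself is best left to an established library. The stop-word list is deliberately tiny and the report snippets are made up.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real projects use a much longer one.
STOP_WORDS = {"the", "and", "of", "to", "in", "a", "our", "we", "is", "for"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words and short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 2]


def document_term_matrix(documents):
    """Return (vocabulary, matrix) where matrix[d][v] counts term v in doc d."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Made-up snippets standing in for annual report text.
reports = [
    "Climate risk and regulation increased our compliance costs.",
    "Revenue growth was driven by cloud services and cloud demand.",
]
vocabulary, dtm = document_term_matrix(reports)
print(vocabulary)
```

Each row of the matrix represents one report and each column one term, which is the input shape most topic modeling libraries expect.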

⏳ Expected Time To Complete

Approximately 1 – 5 days, depending on your availability of data, programming knowledge, and desired level of rigor (e.g., whether you’re working with a small dataset, or whether you’re working with Big Data that requires time to collect and process).

Project A-4: Conduct an Event Study To Evaluate The Impact of A ‘Major Event’ On Financial Markets

💼 Project Brief

If you read the financial news, you’ll notice a plethora of ‘gurus’ describing how “this ONE major event caused markets to panic more than ever before”.

Take your pick of an event that might have had an impact on financial markets. Be that a presidential election in the US, a generational event like Brexit in the UK, a regulatory paradigm shift in India – whatever you fancy.

Next, run a formal event study to statistically test and validate the precise impact that the major event may or may not have had on a financial market of your choice.

☑️ What You’ll Need

For this project, you’ll need:

  • 1 or more major events that plausibly had an impact on a financial market of your choice
  • Historical price data of a large cross-section of securities for a period of time that includes, precedes, and succeeds the major event(s)
  • Historical price data for an appropriate market portfolio for a period of time that includes, precedes, and succeeds the major event(s) (so that you can calculate abnormal returns)
  • An appropriate risk-free rate of return (so that you can calculate excess returns)

You’ll also want to think about a variety of control variables if you’re serious about establishing causality vs mere correlation as part of this financial data analysis.
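
The core mechanics of an event study can be sketched as follows, assuming the market model as the benchmark (one common choice among several). You estimate alpha and beta in a window before the event, use them to predict “normal” returns in the event window, and treat the difference as the abnormal return. The sketch uses raw rather than excess returns for brevity, covers a single security, and skips significance testing. All data is made up, with a 3% shock planted on the event day.

```python
import statistics


def market_model(stock, market):
    """Estimate alpha and beta from estimation-window returns (simple OLS)."""
    mean_s, mean_m = statistics.fmean(stock), statistics.fmean(market)
    sxx = sum((m - mean_m) ** 2 for m in market)
    sxy = sum((m - mean_m) * (s - mean_s) for s, m in zip(stock, market))
    beta = sxy / sxx
    return mean_s - beta * mean_m, beta


def abnormal_returns(stock, market, alpha, beta):
    """Actual return minus the return the market model predicts."""
    return [s - (alpha + beta * m) for s, m in zip(stock, market)]


# Hypothetical estimation window (before the event): stock = 1.2 x market.
est_market = [0.010, -0.005, 0.007, -0.012, 0.004, 0.009]
est_stock = [1.2 * m for m in est_market]

# Hypothetical event window: the stock drops an extra 3% on the event day.
evt_market = [0.002, -0.001, 0.003]
evt_stock = [1.2 * 0.002, 1.2 * -0.001 - 0.03, 1.2 * 0.003]

alpha, beta = market_model(est_stock, est_market)
ar = abnormal_returns(evt_stock, evt_market, alpha, beta)
car = sum(ar)  # cumulative abnormal return over the event window
print(f"CAR: {car:.4f}")
```

For the full project you would repeat this across your cross-section of securities and test whether the average cumulative abnormal return is statistically different from zero.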

⏳ Expected Time To Complete

Approximately 1 – 2 weeks, depending on your availability of data, programming knowledge, desired level of rigor, and whether you’re looking to establish causality vs just correlation.

Project A-5: Test and Validate An Investment Hypothesis / Thesis

💼 Project Brief

Start by coming up with an investment idea. Next, transform your investment idea into a testable hypothesis. Lastly, statistically test and validate the hypothesis to see if your investment idea generates alpha.

If it doesn’t generate alpha, repeat the steps above with a new investment idea.

Run out of investment ideas? Stick to investing in the stock market as a whole (via a low-cost index fund).

☑️ What You’ll Need

To successfully complete this project, you’ll need:

  • 1 or more investment ideas that you can test in the data
  • Historical price data of a large cross-section of securities for a reasonably long period of time
  • Historical price data for an appropriate market portfolio (so that you can quantify alpha)
  • An appropriate risk-free rate of return (so that you can calculate excess returns)
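
As a first-pass check before running a full regression, you could test whether a strategy’s average return in excess of its benchmark is statistically different from zero. This is a simplification: it implicitly assumes the strategy has a beta of 1 against the benchmark, so it measures active return rather than a properly risk-adjusted alpha. The return series below are made up.

```python
import math
import statistics


def active_return_t_stat(strategy, benchmark):
    """t-statistic for the mean of (strategy - benchmark) returns being zero."""
    active = [s - b for s, b in zip(strategy, benchmark)]
    mean_active = statistics.fmean(active)
    std_error = statistics.stdev(active) / math.sqrt(len(active))
    return mean_active, mean_active / std_error


# Hypothetical per-period returns (not real market data).
benchmark = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020, 0.000, 0.010]
strategy = [0.013, -0.018, 0.016, 0.009, -0.008, 0.021, 0.004, 0.012]
mean_active, t_stat = active_return_t_stat(strategy, benchmark)
print(f"mean active return: {mean_active:.6f}, t-stat: {t_stat:.2f}")
```

If the idea survives this check, regressing the strategy’s excess returns on the market’s excess returns gives you an alpha that controls for market exposure.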

Not sure how to go about this? Take a look at our Data Driven Investing with Python | Financial Data Science Course. An Excel® version is also available in our course on Data Driven Investing with Excel | Financial Data Science. Both courses will teach you how to formally/statistically test and validate an investment hypothesis / thesis from scratch.

BONUS: Found an alpha-generating investment idea? Go ahead and create an algorithmic trading / investment strategy and backtest your idea, including rebalancing portfolio weights when and where appropriate.

⏳ Expected Time To Complete

Approximately 1 – 8 weeks, depending on your availability of data, programming knowledge, desired level of rigor, and whether you’re able to find a strategy that generates alpha.

Summary and Next Steps

Did you complete all of the finance data science projects above? Hats off to you, well done! We’d love to hear about your findings, honestly. Feel free to reach out to us if you want to share what you found and learned and whether you went big with Big Data in your analysis.

If you’re looking to go even further, it’s worth exploring recent academic and practitioner articles and replicating their results on your own. Not only will this build your skills and knowledge further, it’ll also:

  • Give you an immediate and authentic insight into the current research in Finance (both in academia and in the finance industry)
  • Allow you to verify the findings reported in the articles you’ve read
  • Likely give you food for thought for more areas for you to research and explore in applying financial data science

If you completed some (but not all) of the finance data science projects above, take the time to block out your calendar and work on the ones you haven’t worked on yet. They’re challenging for sure. But they’re also incredibly rewarding once you’ve conquered them.

Our projects may not be as easy or seamless as the projects available elsewhere on “the internet”, but that’s because we’ve focused on projects that are not only applicable in the real world, but that also genuinely build your skills in using financial data science technologies.

By working on these projects and exploring our related investment courses, you’ll genuinely gain a solid command of crucial concepts in financial data science that’ll stand you in good stead for the rest of your life.

Alright, that’s a wrap from us for now though.

Keep learning and loving Finance!


Filed Under: Finance, Financial Data Science, Investment Analysis

Copyright © 2025, Fervent · Privacy Policy · Terms and Conditions


Logos of institutions used are owned by those respective institutions. Neither Fervent nor the institutions endorse each other's products / services.
