Poor man’s Serverless Quant Fund Infrastructure

Saeed Rahman
7 min read · May 9, 2021

With services like Quantopian shutting their doors to the public, I thought it would be useful to share a serverless architecture that I built for quantitative trading and research. Early last year, I open-sourced a Docker-based quantitative trading architecture called Microservices Based Algorithmic Trading System (MBATS) (LinkedIn article and GitHub Repo). With MBATS you can run analytical workloads locally using Docker; this article is all about scaling MBATS with different Google Cloud services, provisioned and managed using an open-source framework called Terraform.

Wondering why this title? Well, I have designed the architecture in the leanest possible way, using cheap or free serverless solutions wherever possible, and at the end of the article I will also share some tips for getting free cloud credits. This project is not meant for the well-funded firm but for bootstrapped individuals who want to get started as quickly as possible with a lean, scalable analytical platform at the lowest possible price. The project has also been structured to prioritize simplicity and maintainability: all the cloud services chosen are as close as possible to the ones used in MBATS, so that it's possible to run a hybrid ecosystem (local and public cloud).

If you are someone like me who's impatient and wants to see the code, here's the GitHub repo for the project.

Time to get technical…..

To begin, I have made quite a few changes to MBATS and released MBATS V2.0, in which I replaced Superset with a multi-page Dash app, attached the Celery executor to Airflow, and finally wrapped all the containers behind an Nginx reverse proxy for a single access point.

In the image below you can see how the different services in MBATS V2.0 have been replaced with cloud services that achieve the same functionality, but at scale.

Architecture

The Terraform code not only creates the different cloud services but also manages loading the application logic into them via Cloud Build. This way, we can develop the code locally in the MBATS V2.0 environment, push the changes to a branch in an online repository (GitHub, for example), and Cloud Build will automatically update the cloud services.

I won’t go into the details of the implementation in this article, but I will give you a brief idea of how the different cloud services were used.

  • Cloud Run is used to deploy containerized applications on a fully managed serverless platform. There are multiple ways this service can be used in this project; we mainly use it for short-running data-processing jobs that require multiple technologies working together and that can be containerized. The Dash app is also hosted on Cloud Run and made public via a domain, with the DNS routing and firewall rules all set up through Cloudflare via Terraform.
  • Cloud SQL is a fully managed PostgreSQL server. All the security-master, position-analytics and risk-analytics data can be stored here. Since it's Postgres, making the switch from MBATS to Cloud SQL is seamless.
  • Cloud Storage is scalable object storage; it replaces Minio. And since we use Boto3 throughout the MBATS codebase, making the switch from local to cloud is also a piece of cake (see the first sketch after this list).
  • Cloud Functions is a scalable, pay-as-you-go functions-as-a-service (FaaS) offering. Cloud Functions can be utilized in multiple ways in this project; the primary way we use them is to fetch data from the data vendors'/brokers' REST APIs and store it in Cloud SQL. Flask is the underlying framework used to handle the incoming HTTP requests, so you can trigger these either with Cloud Scheduler or via Airflow (see the second sketch after this list).
  • Compute Engine is used to create and run virtual machines. Apache Airflow, the Celery workers and Flower are hosted via Docker Compose on Compute Engine. The endpoints are controlled by a reverse-proxy (Nginx) container and exposed to a domain via Cloudflare. The live strategy needs to run on a machine that doesn't have a timeout, which is why we haven't used Cloud Run (max timeout: 15 minutes) or Cloud Functions (max timeout: 9 minutes) for this particular case.
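To make the Minio-to-Cloud Storage switch concrete, here is a minimal sketch (not the actual MBATS code) of a Boto3 client factory that can point either at the local Minio container or at Cloud Storage's S3-compatible interoperability endpoint, which accepts HMAC keys. The environment variable and bucket names are hypothetical.

```python
import os

import boto3


def get_object_store_client():
    """Return a Boto3 S3 client for either local Minio or Cloud Storage.

    Assumption: the codebase only ever talks S3-style calls through Boto3,
    so switching backends is just a configuration change.
    """
    if os.getenv("USE_GCS", "false").lower() == "true":
        # Cloud Storage via its S3/XML interoperability endpoint (HMAC keys)
        return boto3.client(
            "s3",
            endpoint_url="https://storage.googleapis.com",
            aws_access_key_id=os.environ["GCS_HMAC_ACCESS_KEY"],
            aws_secret_access_key=os.environ["GCS_HMAC_SECRET"],
        )
    # Local Minio container used by MBATS
    return boto3.client(
        "s3",
        endpoint_url="http://minio:9000",
        aws_access_key_id=os.environ["MINIO_ACCESS_KEY"],
        aws_secret_access_key=os.environ["MINIO_SECRET_KEY"],
    )


# Usage: the rest of the code keeps calling the same S3-style API.
# client = get_object_store_client()
# client.upload_file("backtest_results.pkl", "strategy-artifacts", "backtest_results.pkl")
```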
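And here is a rough sketch of the kind of ingestion Cloud Function described above: an HTTP-triggered function (the `request` argument is a Flask request) that pulls bars from a data vendor's REST API and inserts them into Cloud SQL over the instance's Unix socket. The vendor URL, table name and environment variables are placeholders, not taken from the repo.

```python
import os

import psycopg2
import requests

# Hypothetical vendor endpoint; the real project would use its own
# data vendor/broker API and the MBATS security-master schema.
VENDOR_URL = "https://example-datavendor.com/v1/daily"


def ingest_daily_bars(request):
    """HTTP-triggered Cloud Function, invoked by Cloud Scheduler or an Airflow HTTP task."""
    payload = request.get_json(silent=True) or {}
    symbol = payload.get("symbol", "AAPL")

    # 1. Fetch bars from the vendor's REST API
    resp = requests.get(VENDOR_URL, params={"symbol": symbol}, timeout=30)
    resp.raise_for_status()
    bars = resp.json()  # assumed: list of {"date", "open", "high", "low", "close", "volume"}

    # 2. Write them to Cloud SQL (Postgres) over the Cloud SQL Unix socket
    conn = psycopg2.connect(
        host=f"/cloudsql/{os.environ['CLOUD_SQL_CONNECTION_NAME']}",
        dbname=os.environ["DB_NAME"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
    )
    with conn, conn.cursor() as cur:
        for bar in bars:
            cur.execute(
                """
                INSERT INTO daily_price (symbol, price_date, open, high, low, close, volume)
                VALUES (%s, %s, %s, %s, %s, %s, %s)
                ON CONFLICT (symbol, price_date) DO NOTHING
                """,
                (symbol, bar["date"], bar["open"], bar["high"],
                 bar["low"], bar["close"], bar["volume"]),
            )
    conn.close()
    return {"symbol": symbol, "rows": len(bars)}, 200
```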

Terraform

The best part of Docker is that it is infrastructure as code, which gives us immense flexibility when developing and deploying. One of the challenges with a cloud architecture is that it involves multiple services and multiple chained, dependent configurations, and if you have multiple clouds in a hybrid or multi-cloud architecture, you are dealing with a whole different level of complexity. Using Terraform, we can express these complex rules and the whole architecture as code. It is a slight upfront investment, but having the architecture as code opens up a lot of opportunity, especially the flexibility to screw up big time and still be able to fix it with one command. For example, for this project I was able to set up all the above services in about 5–10 minutes with a single terraform command.

Cost and Simplicity

If you look at all the different services, the only one that's not truly serverless is Compute Engine. The reason is that the hosted Airflow service in Google Cloud is Cloud Composer, which uses Kubernetes as its backend, and for my current workload and use case I felt that was overkill. On top of that, serverless technologies can sometimes be cost-prohibitive as well, so it's a combination of simplicity and cost that made me go with Compute Engine. To that point, there are several other ETL tools in the Google Cloud ecosystem that could be used here for data ingestion and processing, but the point of this project was not to be fancy but to be practical, so I just followed the universal KISS (keep it simple, stupid) principle.

Most cloud providers have a free tier, and with Google Cloud you get a $300 credit when you sign up as a new user. With that you could easily run the above set of services for a few weeks. And since Terraform lets you set up the infrastructure and destroy it in a few minutes, you can be a bit creative and save some money if you wish.

There are a lot of resources out there for getting free cloud credits from all the big cloud providers as well; here's one that has worked for me.

But what about Big Data?

I can hear a lot of you saying that we won't be able to process big data with the above architecture. Market data, especially price, reference and fundamental data, comes with a very well-defined schema, and with the above infrastructure I was able to process intraday (minute-level) data for around ~5,000 active equities across NASDAQ and NYSE with ease. Still, I agree that having a powerhouse like a Spark cluster is always handy, but it really depends on what you want to do and how complex you want to get.
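As an illustration of how well-behaved this kind of data is, here is a hedged sketch (the connection string, table and column names are my own, not the repo's) of bulk-loading vendor minute bars into Postgres with pandas and SQLAlchemy; at this scale nothing more exotic is needed.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical Cloud SQL / Postgres connection string
engine = create_engine(
    "postgresql+psycopg2://user:password@localhost:5432/securities_master"
)


def load_minute_bars(csv_path: str, symbol: str) -> int:
    """Append one symbol's minute bars (a fixed, well-defined schema) to Postgres."""
    bars = pd.read_csv(csv_path, parse_dates=["timestamp"])
    bars["symbol"] = symbol
    # Columns: timestamp, open, high, low, close, volume, symbol
    bars.to_sql(
        "minute_price", engine,
        if_exists="append", index=False, method="multi", chunksize=10_000,
    )
    return len(bars)
```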

At work, I deal with alternative data with millions to billions of records every day, and I use the unified analytics platform Databricks in an AWS environment to do all of that, so for that reason I am quite biased towards it. Spark, from my perspective, is not just an ETL framework but also a good tool for analytical work and for implementing ML models. I won't go into the details of how to implement the same in Databricks in this article, as that can be a project in itself, but you can expect a future article focused on using Databricks and Delta Lake to implement a quant-trading infrastructure able to process tick-level data using Structured Streaming and Kafka.

Value your time

The GitHub code and the multiple repositories I have mentioned here might look a bit intimidating to anyone who's beginning in this space. I am a data scientist, and my expertise is around models and analytics rather than cloud configuration technologies like Terraform. The best strategy when you face a technology you are not familiar with but need to use to get things done is to seek help.

Using portals like Freelancer, Upwork and many others, you can easily find thousands of experts who can help you out. This is what I did with Terraform: I learned how to implement one or two simple cloud services in Terraform, learned what it can do and figured out how it would fit into what I was trying to achieve. Then I went out and got help from an amazing freelance cloud architect from Brazil, who gave me the basic code for all the different cloud services. After that it was only a matter of putting the pieces together the way I wanted. What took him 5 hours would probably have taken me at least 50–100 hours if I had decided to do everything myself.

Being an engineer at heart, it's very tempting to do everything yourself, especially when you are playing around with a new technology and are excited. In the early phases of my career this served me well, but soon you realize there's only so much you can do in 24 hours. So optimize for goals and deadlines; I always stick to the philosophy below.

Never Mistake the Finger for the Moon
The pointing finger is what guides you to the moon. Without the finger, you might not notice the moon. But the pointing finger isn’t what matters most. It only matters because it helps you see the moon for yourself.
