To develop an automated stock trading system, we first need historical stock price data to work with. In our case, daily price data is required. It can be acquired from various providers, including Yahoo Finance, Google Finance, and others.
Stock data preparation
We received our test and development stock data as zipped CSV files. This format is easy to parse but unsuitable for direct high-speed analysis.
Each data entry consists of the following fields:
- Name of the stock exchange
- Stock ticker symbol
- Open price
- High price
- Low price
- Close price
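As a rough illustration of the parsing step, the sketch below reads zipped CSV files and groups the rows by ticker symbol. It is written in Python for brevity (the system itself is C#). The `Bar` type, the `load_bars` function, and the column names (`exchange`, `symbol`, `open`, `high`, `low`, `close`) are assumptions made for this example, not the actual file layout. A real file would presumably also carry a date per row, which is left out here because it is not part of the field list above.

```python
import csv
import io
import zipfile
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Bar:
    """One daily price entry (hypothetical type mirroring the listed fields)."""
    exchange: str
    symbol: str
    open: float
    high: float
    low: float
    close: float


def load_bars(zip_bytes: bytes) -> dict[str, list[Bar]]:
    """Read every CSV in the archive and group its rows by ticker symbol."""
    by_symbol: dict[str, list[Bar]] = defaultdict(list)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                # Assumed header row: exchange,symbol,open,high,low,close
                for row in csv.DictReader(text):
                    by_symbol[row["symbol"]].append(Bar(
                        exchange=row["exchange"],
                        symbol=row["symbol"],
                        open=float(row["open"]),
                        high=float(row["high"]),
                        low=float(row["low"]),
                        close=float(row["close"]),
                    ))
    return dict(by_symbol)
```

Grouping by symbol at load time matches the per-stock access pattern: each symbol's list can later be processed independently of the others.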
The expected access pattern is to process each stock's data separately, which also means we should look for opportunities to parallelize the processing. One option would be to load all of the data into a general-purpose SQL database, but that is not well suited to stock data analysis: we would end up pushing massive amounts of data back and forth. There is also a good chance that all of the data fits into memory, which speeds up processing.
Our development environment is .NET with C#. As a first approach, we decided to store the data for each required ticker symbol in its own in-memory B+Tree, which is similar to what most databases already do internally. Alternatives would have been a database optimized for time series data, or an adapted NoSQL store such as RocksDB or LevelDB.
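The idea of one ordered, in-memory structure per symbol can be sketched as follows. This is Python rather than the C# of the actual system, and it deliberately swaps the B+Tree for a simpler stand-in: a pair of sorted arrays searched with binary search. That gives the same ordered range queries, though inserts are O(n) instead of O(log n). The `SymbolSeries` class and its methods are hypothetical names for this example.

```python
import bisect
from datetime import date


class SymbolSeries:
    """Date-ordered closing prices for one ticker symbol.

    A sorted array with binary search stands in for the B+Tree here.
    """

    def __init__(self) -> None:
        self._dates: list[date] = []
        self._closes: list[float] = []

    def add(self, day: date, close: float) -> None:
        """Insert an entry, keeping the arrays sorted by date."""
        i = bisect.bisect_left(self._dates, day)
        if i < len(self._dates) and self._dates[i] == day:
            self._closes[i] = close  # same day seen again: overwrite
        else:
            self._dates.insert(i, day)
            self._closes.insert(i, close)

    def range(self, start: date, end: date) -> list[tuple[date, float]]:
        """Return all entries with start <= day <= end, in date order."""
        lo = bisect.bisect_left(self._dates, start)
        hi = bisect.bisect_right(self._dates, end)
        return list(zip(self._dates[lo:hi], self._closes[lo:hi]))


# One independent series per ticker symbol.
store: dict[str, SymbolSeries] = {}
```

Because each symbol owns its own structure, nothing is shared between symbols, which keeps per-symbol processing easy to run in parallel.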
We prepared a 450 MB compressed data file covering a period of 4 years and containing stock price data for more than 64k ticker symbols. Fully loaded into memory, it takes around 4 GB of RAM.
Nothing is perfect. Bad input data means bad output data. We noted multiple issues in the stock price data provided:
- Incomplete data - gaps and missing data
- Holidays - days where there were no trades (volume was zero)
- Errors - unnatural price spikes (gains or losses over 50%) in the day-to-day data.
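A minimal sketch of how the three kinds of issues listed above could be flagged for one symbol, in Python for brevity (the system itself is C#). The `find_issues` function, the `(day, close, volume)` row layout, and the 4-day gap threshold are assumptions made for this example. Only the 50% spike limit comes from the text.

```python
from datetime import date


def find_issues(rows, max_gap_days=4, spike=0.5):
    """Flag suspicious rows in a date-ordered list of (day, close, volume).

    max_gap_days is an assumed threshold: a jump of more calendar days than
    this between consecutive rows is reported as a gap.
    """
    issues = []
    prev_day = None
    prev_close = None
    for day, close, volume in rows:
        if volume == 0:
            issues.append((day, "zero volume"))  # holiday or no trades
        if prev_day is not None:
            if (day - prev_day).days > max_gap_days:
                issues.append((day, "gap"))  # missing data before this row
            if prev_close and abs(close / prev_close - 1.0) > spike:
                issues.append((day, "spike"))  # unnatural day-to-day move
        prev_day, prev_close = day, close
    return issues
```

What to do with a flagged row (drop it, interpolate, or exclude the symbol) is a separate decision that this sketch does not make.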
Implementing Simple Technical Indicators
Implementing a stock trading strategy requires calculating various indicators. Our first simple strategy required the ATR (Average True Range) and the N-day high and low points of the stock price.
Average True Range (ATR)
ATR is a measure of average price volatility over the last N days. The calculation is straightforward, with a complexity close to O(n) in the number of data points. It is calculated in parallel for each loaded ticker symbol.
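The calculation can be sketched as below, in Python for brevity (the system itself is C#). The true range of a day is the largest of: high minus low, the distance from the high to the previous close, and the distance from the low to the previous close. This sketch averages the true range with a simple N-day moving average kept as a running sum. Whether the original uses a simple average or Wilder's smoothing is not stated, so the simple average is an assumption, as are the function names.

```python
def true_ranges(highs, lows, closes):
    """Per-day true range; the first day has no previous close."""
    trs = []
    for i in range(len(highs)):
        tr = highs[i] - lows[i]
        if i > 0:
            prev_close = closes[i - 1]
            tr = max(tr, abs(highs[i] - prev_close), abs(lows[i] - prev_close))
        trs.append(tr)
    return trs


def atr(highs, lows, closes, n):
    """Simple N-day average of the true range, computed in one O(n) pass.

    Entries before a full window is available are None.
    """
    trs = true_ranges(highs, lows, closes)
    out = [None] * len(trs)
    running = 0.0
    for i, tr in enumerate(trs):
        running += tr
        if i >= n:
            running -= trs[i - n]  # drop the value that left the window
        if i >= n - 1:
            out[i] = running / n
    return out
```

The running sum is what keeps the cost linear: each day adds one value and removes at most one, regardless of the window size.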
N-day high and low points
To find the high and low price points efficiently, the data is evaluated over a sliding window while maintaining sorted sets (backed by a red-black tree) of high and low prices. This gives a complexity of O(n log m), where m is the size of the window and n is the overall size of the dataset. All of this is calculated in parallel for each loaded symbol, so the work scales across all available cores.
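For illustration, here is a sliding-window sketch in Python (the system itself is C#). Note that it is not the sorted-set approach described above: Python's standard library has no red-black tree, so this example uses a monotonic deque instead, a different technique that yields the same window maximum or minimum in O(n) overall. The `rolling_extreme` function is a hypothetical name for this example.

```python
from collections import deque


def rolling_extreme(values, window, pick_max=True):
    """Max (or min) of the last `window` values at each position.

    Early positions use however many values are available so far.
    """
    if pick_max:
        dominates = lambda new, old: new >= old
    else:
        dominates = lambda new, old: new <= old
    candidates = deque()  # indices whose values could still be the extreme
    out = []
    for i, v in enumerate(values):
        # Older candidates that the new value dominates can never win again.
        while candidates and dominates(v, values[candidates[-1]]):
            candidates.pop()
        candidates.append(i)
        # Drop the front candidate once it falls out of the window.
        if candidates[0] <= i - window:
            candidates.popleft()
        out.append(values[candidates[0]])
    return out
```

Calling it once with daily highs and `pick_max=True`, and once with daily lows and `pick_max=False`, gives the two series the strategy needs. Each symbol is independent, so the calls can be distributed across cores.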
Implementing a Simple Strategy
To test whether it is possible to build a system that produces a consistent profit, we developed and tested a very simple system based on the rules described below.
- Entry: a position is opened when the N-day high (LONG position) or low (SHORT position) stock price is reached.
- Exit (stop loss): the position is closed when the price drops below (LONG) or rises above (SHORT) a threshold, to stop losing money. The threshold is determined by the ATR. If we are making money on a stock, the threshold is adjusted to ensure that we keep the profits.
- Exit (volatility): the position is closed when the stock price becomes too volatile (as measured by ATR) and the price moves against our position.
- Position sizing: the position size (how many shares to buy or sell) is calculated from price volatility. With larger volatility (more risk), the position size is smaller, in order to keep the risk level more or less constant (1-2% of the portfolio value).
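The volatility-based sizing and the ATR-based stop from the rules above could look roughly like this, in Python for brevity (the system itself is C#). The exact formulas and coefficients used in the real system are not given, so everything here is an assumption: the stop is placed a fixed multiple of ATR away from the price, and the position is sized so that hitting the stop costs a fixed fraction of the portfolio. The function names and the default multiple of 2.0 are hypothetical.

```python
def position_size(equity, atr, risk_fraction=0.01, stop_atr_multiple=2.0):
    """Number of shares such that a stop-out loses about risk_fraction of equity."""
    risk_per_share = atr * stop_atr_multiple  # assumed distance to the stop
    if risk_per_share <= 0:
        return 0
    return int(equity * risk_fraction / risk_per_share)


def trailing_stop_long(prev_stop, close, atr, stop_atr_multiple=2.0):
    """Stop level for a LONG position; it only ever moves up, locking in profit."""
    candidate = close - atr * stop_atr_multiple
    if prev_stop is None:
        return candidate
    return max(prev_stop, candidate)
```

With $1,000,000 of equity, 1% risk, and an ATR of 2.5, the assumed stop distance is 5.0 per share, so the position is 2,000 shares. Doubling the ATR halves the position, which is what keeps the risk per trade roughly constant.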
We performed hundreds of thousands of separate simulations covering July 2016 to June 2017. The strategy was tuned by adjusting coefficients, without trying to over-optimize. With reasonably conservative settings and $1 million of starting capital, we achieved an average of 8.5% APR. The maximum drawdown was $876,889.97, the highest portfolio value was $1,292,051.85, and the lowest was $906,058.49 on exit.
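For reference, maximum drawdown is commonly defined as the largest peak-to-trough decline in portfolio value over the simulation. The Python sketch below follows that common definition, which may differ from how the figures above were computed. The `max_drawdown` function is a hypothetical name, and the input is assumed to be a list of portfolio values in time order.

```python
def max_drawdown(equity_curve):
    """Largest drop from a running peak to a later value, in currency units."""
    if not equity_curve:
        raise ValueError("equity_curve must not be empty")
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)            # highest value seen so far
        worst = max(worst, peak - value)   # deepest decline from that peak
    return worst
```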
Our development effort continues toward a real-world trading system for our client, with more advanced entry, exit, and position sizing strategies. However, it can already be seen that even with a simple strategy it is possible to make a consistent profit on the market.