My Journey to Fully Automate My Trading: You See a Hammer, I See a Doji


I conduct my research on trading strategies on multiple platforms, from Excel to almost anything I can get my hands on. Going from concept to actual strategy can take anywhere from a few lines of code to thousands of lines. To say the least, it is a long and tedious process that takes patience. As with any other scientific research, even if your method of analysis is rigorous, your analysis is only as good as your data. If the data you get has consistency issues, your analysis will be questionable. And here I am, dealing with exactly that: data inconsistency across the various data sources made available to retail traders.

Forex Traders Live With Data Inconsistencies All the Time

This data inconsistency issue is very common in forex trading. In general, the quotes you get are consolidated quotes from the several major banks that your brokerage clears through. Hence, if you switch to a different brokerage that uses other banks for its forex clearing, your quotes can be noticeably different. You may wonder why that matters, since prices move in the same direction within seconds. After all, discrepancies of this type create arbitrage opportunities that are quickly taken advantage of, so quotes from different sources fall back in sync quickly. However, even a minor difference can alter the decisions of hundreds if not thousands of autotrading robots when it flips a binary condition from one state to the other.

For example, an hourly closing price can be very close to the previous day's closing price. Many simple trading algorithms and human traders take very seriously the cue of whether that hourly close is above or below the reference level. Drastically different actions are taken based on that simple logic alone. Hence, many professional traders never switch to a different data source for trading, because they know very well that such a difference can have a disastrous effect on their trading.
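To make the flipped condition concrete, here is a minimal sketch. The rule, the function name `signal` and all the prices are made up for illustration; the only point is that two feeds a fraction of a pip apart can land on opposite sides of the same reference level.

```python
from decimal import Decimal

def signal(hourly_close: Decimal, prev_day_close: Decimal) -> str:
    """Toy rule (an assumption, not a real strategy): long above the reference, short otherwise."""
    return "long" if hourly_close > prev_day_close else "short"

# Hypothetical quotes: the same hourly close as reported by two brokerages.
prev_day_close = Decimal("1.10000")
feed_a_close = Decimal("1.10002")
feed_b_close = Decimal("1.09999")

print(signal(feed_a_close, prev_day_close))  # long
print(signal(feed_b_close, prev_day_close))  # short
```

The two closes differ by 0.3 pip, yet the robot on feed A buys while the robot on feed B sells.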

Centralized Trading at Exchanges is Supposed to Avoid this Problem

Index futures like the Emini S&P, and stocks, are traded on exchanges with centralized bid and ask queues. The quotes are unified and trades are recorded in sequence. Ignoring the dark pools, which are not part of the normal markets, centralized trading is supposed to avoid the data inconsistency issue completely. In fact, we do get very clean historical data for all these markets going back at least 20 years.

So the data inconsistency problem should not exist in these markets, should it?

Well, that’s not what I found when backtesting the same trading strategies across various platforms on the Emini S&P.

After careful analysis of the different versions of the algorithms based on the same logic, I discovered that the root cause of the inconsistency in historical trading performance has nothing to do with my code. Luckily, I have experience in dealing with such issues from my forex trading, so it does not affect me much. I am just surprised that this kind of inconsistency still exists today.

Welcome to the World of Sub-Second Timestamps

At tick resolution, meaning that every single trade is compared, the differences among the various sources of historical data are negligible. This means centralized trading at the exchanges has in fact unified the data. The problem, interestingly, comes from the different ways data collection software aggregates trades into time-based records.

The issue really boils down to a simple question – what do you consider as part of a one minute interval?

What I am talking about is that how a programmer chooses to collect data into 1-minute records can make a big difference in the charts you see every day.

Conceptually, a minute starts the moment the zero second mark is hit and ends right before the next zero second mark is hit. This is pure science. There is really nothing to dispute about this definition. However, the programmers tasked with writing the code that records the historical data may have applied their own interpretation of what a minute is.

From what I see, there are programmers who put all the trades that carry the zero second timestamp (e.g. 1:00:00) into the current 1-minute record, i.e. 0:59:00 to 1:00:00.

However, that is technically wrong because all these trades actually belong to the next minute, i.e. 1:00:00 to 1:01:00.

These programmers do not realize that nothing happens precisely at the zero second mark. Anything recorded with that time must have happened after it. They would see this if they paid attention to the sub-second timestamps, which show these trades happening between the :00 and :01 second marks, e.g. 0:59:00.30.
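The two interpretations can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names and the sample date are hypothetical, and no real platform's code is being quoted.

```python
from datetime import datetime, timedelta

ONE_MINUTE = timedelta(minutes=1)

def bar_start_half_open(ts: datetime) -> datetime:
    """Minute defined as [hh:mm:00, next hh:mm:00): a 1:00:00.30 trade opens the 1:00 bar."""
    return ts.replace(second=0, microsecond=0)

def bar_start_zero_second_late(ts: datetime) -> datetime:
    """The questionable convention: trades stamped in second :00 stay in the previous bar."""
    start = ts.replace(second=0, microsecond=0)
    return start - ONE_MINUTE if ts.second == 0 else start

# Hypothetical trade at 1:00:00.30 (the date is arbitrary).
trade = datetime(2024, 1, 2, 1, 0, 0, 300_000)

print(bar_start_half_open(trade).time())         # 01:00:00
print(bar_start_zero_second_late(trade).time())  # 00:59:00
```

Any trade stamped 1:00:01 or later lands in the 1:00 bar under both conventions, so the disagreement is confined to that single zero second.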

I can tell you that the most expensive institutional feeds get this right. So do the more expensive retail data feeds that you have to pay for. But that is not true of many other platforms out there.

Nightmare of Data Inconsistencies

What does this inconsistency imply?

Well, if you use price patterns on intraday data (and on daily data occasionally too) with your trading strategies, you may get totally different results when you switch from one brokerage to another.

Your moving averages are not as precise as you may think; their values depend on which data source you are looking at.

On some platforms, your trading strategies do not execute until 1 second later, which gives institutional traders a 1-second head start on you.

All sorts of technical indicators including oscillators will suffer from this data inconsistency.

It is not just the close of a minute record that is affected. If the high or low of the minute happens to fall in the last second, and your feed includes the extra one second of data, the high and low could be affected too.
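Here is a rough sketch of how the same tick stream turns into different 1-minute bars. The helper `build_bars`, its `late_zero_second` flag and the prices are all hypothetical; the trades must be supplied in time order.

```python
from datetime import datetime, timedelta

def build_bars(trades, late_zero_second):
    """Aggregate time-ordered (timestamp, price) trades into 1-minute OHLC bars."""
    bars = {}
    for ts, price in trades:
        start = ts.replace(second=0, microsecond=0)
        if late_zero_second and ts.second == 0:
            start -= timedelta(minutes=1)  # keep zero-second trades in the old bar
        if start not in bars:
            bars[start] = {"open": price, "high": price, "low": price, "close": price}
        else:
            bar = bars[start]
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price
    return bars

# Made-up Emini-style trades around the 9:31 mark.
trades = [
    (datetime(2024, 1, 2, 9, 30, 5), 4500.00),
    (datetime(2024, 1, 2, 9, 30, 40), 4501.00),
    (datetime(2024, 1, 2, 9, 30, 59, 900_000), 4500.50),
    (datetime(2024, 1, 2, 9, 31, 0, 300_000), 4502.25),  # the disputed trade
    (datetime(2024, 1, 2, 9, 31, 20), 4501.75),
]

bar_930 = datetime(2024, 1, 2, 9, 30)
print(build_bars(trades, late_zero_second=False)[bar_930])
# {'open': 4500.0, 'high': 4501.0, 'low': 4500.0, 'close': 4500.5}
print(build_bars(trades, late_zero_second=True)[bar_930])
# {'open': 4500.0, 'high': 4502.25, 'low': 4500.0, 'close': 4502.25}
```

One disputed trade changes both the high and the close of the 9:30 bar, and it also changes the open and high of the 9:31 bar that follows.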

A Lesson to Remember for All Traders Not Just the Mechanical Traders

I am not saying that data inconsistency renders intraday price data analysis completely useless. What I am saying is that you have to be aware of this widespread issue in order to build robust trading algorithms. And it does not stop there, because this matters a great deal to discretionary traders too.

Many people who have never used multiple brokerages and different data sources have a simplistic view of the financial data they are working with. They have no idea that the candlestick hammer they see on their trusted 5-minute bars, signalling a bottom in the making, is in fact just a doji for thousands of other traders out there.
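As a rough illustration of the hammer versus doji case, take a toy classifier. The thresholds (a doji body of at most 5% of the bar's range, a hammer's lower shadow of at least twice its body) are my own simplifying assumptions and not standard definitions, and both bars are invented.

```python
def classify(open_, high, low, close, doji_ratio=0.05):
    """Toy candlestick classifier; thresholds are illustrative assumptions only."""
    body = abs(close - open_)
    full_range = high - low
    lower_shadow = min(open_, close) - low
    upper_shadow = high - max(open_, close)
    if full_range == 0 or body <= doji_ratio * full_range:
        return "doji"
    if lower_shadow >= 2 * body and upper_shadow <= body:
        return "hammer"
    return "other"

# The same hypothetical 5-minute bar as built by two feeds that disagree
# only on whether the final trade at 4500.75 belongs to this bar.
feed_a_bar = (4500.00, 4500.75, 4496.00, 4500.75)  # open, high, low, close
feed_b_bar = (4500.00, 4500.25, 4496.00, 4500.00)

print(classify(*feed_a_bar))  # hammer
print(classify(*feed_b_bar))  # doji
```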

Or that the turn or crossover they see in their favourite moving averages is just a kiss between those lines for many other traders out there.
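The crossover versus kiss case can be sketched the same way. The 2-bar and 4-bar lengths are toy values picked to keep the arithmetic readable, and the closes are invented; the two feeds differ by a single 0.25 tick on the last bar only.

```python
def sma(values, length):
    """Simple moving average of the last `length` values."""
    window = values[-length:]
    return sum(window) / len(window)

def crossed_up(closes, fast=2, slow=4):
    """True if the fast average was at or below the slow one and is now strictly above it."""
    previous = closes[:-1]
    was_below = sma(previous, fast) <= sma(previous, slow)
    is_above = sma(closes, fast) > sma(closes, slow)
    return was_below and is_above

# Hypothetical closes shared by both feeds, followed by a disputed last bar.
shared = [4510.50, 4510.25, 4510.00, 4510.00]
feed_a = shared + [4510.50]
feed_b = shared + [4510.25]

print(crossed_up(feed_a))  # True: a crossover
print(crossed_up(feed_b))  # False: the lines only touch
```

On feed B both averages come out at exactly 4510.125, so the lines meet without crossing.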

Learning to stop seeing the charts in black and white like that is the first step towards a much more robust approach to chart reading and technical trading.