In artificial intelligence, genetic programming is a technique for evolving programs over successive generations: starting from a population of unfit (usually random) programs, it works towards programs that are fit for a specific task, using operations analogous to natural genetic processes.
The Machines Are Coming
Artificial intelligence means different things to different people. For some, it is the dawn of a new horizon in which man and machine live in harmony – with mankind reaping the benefits of giant computer brains.
For others, it really is a case of the machines are coming!
Broadly speaking, artificial intelligence is a kind of intelligence demonstrated by machines, which, unlike human intelligence, is devoid of emotion and feelings.
There are three broad types or categories of artificial intelligence:
- Artificial narrow intelligence (ANI), having a narrow range of abilities
- Artificial general intelligence (AGI), on par with human capabilities
- Artificial superintelligence (ASI), which is more capable than a human
A detailed look across the spectrum of AI is beyond the scope of this article.
Instead, we will focus on what is known as genetic programming, a form of ANI which has a clear application in the finance world and, more specifically, in trading systems.
Application in Trading
Some trading platforms offer the ability to evolve parameters for the purposes of strategy optimization; other more specialist tools allow for the development of strategies actually built using genetic programming.
Instead of parameters, the actual trading logic is evolved to meet a pre-determined definition of fitness.
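To make the idea of evolving logic rather than parameters concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration and not taken from any particular platform: rules are nested tuples built from "greater than" comparisons of the price and simple moving averages, combined with and/or; the definition of fitness is simply profit from holding one unit over the next bar while the rule is true; and the names random_rule, holds, fitness, graft and evolve are hypothetical.

```python
import random

OPS = ("and", "or")
LOOKBACKS = (2, 3, 5, 8, 13)   # hypothetical moving-average lengths
MAX_NODES = 15                 # cap on rule size to limit bloat

def random_value(rng):
    # Numeric terminal: the current price or a simple moving average.
    return ("price",) if rng.random() < 0.3 else ("sma", rng.choice(LOOKBACKS))

def random_rule(rng, depth=2):
    # Leaf rules compare two values; inner nodes combine rules with and/or.
    if depth == 0 or rng.random() < 0.3:
        return ("gt", random_value(rng), random_value(rng))
    return (rng.choice(OPS), random_rule(rng, depth - 1), random_rule(rng, depth - 1))

def value(node, prices, t):
    if node[0] == "price":
        return prices[t]
    window = prices[max(0, t - node[1] + 1): t + 1]
    return sum(window) / len(window)

def holds(rule, prices, t):
    if rule[0] == "gt":
        return value(rule[1], prices, t) > value(rule[2], prices, t)
    left, right = holds(rule[1], prices, t), holds(rule[2], prices, t)
    return (left and right) if rule[0] == "and" else (left or right)

def fitness(rule, prices):
    # Assumed definition of fitness: profit from holding one unit over the
    # next bar whenever the rule holds on the current bar.
    return sum(prices[t + 1] - prices[t]
               for t in range(len(prices) - 1) if holds(rule, prices, t))

def size(rule):
    return 1 if rule[0] == "gt" else 1 + size(rule[1]) + size(rule[2])

def subtrees(rule):
    yield rule
    if rule[0] in OPS:
        yield from subtrees(rule[1])
        yield from subtrees(rule[2])

def graft(rule, donor, rng):
    # Replace one randomly chosen subtree of `rule` with `donor`.
    if rule[0] == "gt" or rng.random() < 0.3:
        return donor
    op, left, right = rule
    if rng.random() < 0.5:
        return (op, graft(left, donor, rng), right)
    return (op, left, graft(right, donor, rng))

def evolve(prices, pop_size=20, generations=8, seed=0):
    rng = random.Random(seed)
    population = [random_rule(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda r: fitness(r, prices), reverse=True)
        parents = ranked[: pop_size // 2]          # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.choice(parents), rng.choice(parents)
            if rng.random() < 0.2:                 # mutation: graft a fresh random rule
                donor = random_rule(rng, depth=1)
            else:                                  # crossover: graft part of another parent
                donor = rng.choice(list(subtrees(father)))
            child = graft(mother, donor, rng)
            if size(child) <= MAX_NODES:
                children.append(child)
        population = parents + children
    return max(population, key=lambda r: fitness(r, prices))

if __name__ == "__main__":
    walk = random.Random(42)
    prices = [100.0]
    for _ in range(119):
        prices.append(prices[-1] + walk.gauss(0.0, 1.0))   # synthetic random walk
    best = evolve(prices)
    print(best, round(fitness(best, prices), 2))
```

Note that the fitness here is measured on the same data the rules were evolved on, which is exactly the situation that raises the over-fitting question discussed next.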
One question that frequently comes up with respect to Genetic Programming is its effectiveness in developing trading algorithms that avoid over-fitting.
Broadly speaking, over-fitting, or curve-fitting as it’s also known, is the fitting of a model too closely to the particular data used to build it. An overfitted model is one that contains more parameters than can be justified by the data.
In this scenario, one has unknowingly extracted parts of the residual variation or noise as if that variation represented the underlying model structure.
What is different about the Genetic Programming model approach is that we’ve delegated what were otherwise manual steps in building our trading system.
A typical Genetic Programming build cycle randomly evolves literally thousands of candidate algorithms. Given the dangers of over-fitting and the possibility of selection bias when determining which algorithms to choose, it’s understandable to ask whether you’re embarking on some fool’s errand!
Indeed, will the algorithm continue to perform as it did historically, or is it just the product of chance and a flawed process?
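The selection-bias concern can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not a model of any real build cycle: the "returns" are zero-mean Gaussian noise, so there is no edge to find, and each candidate "strategy" is just a coin flip between long and short on every bar. The function name best_of_random_strategies and its parameters are hypothetical.

```python
import random

def best_of_random_strategies(n_strategies=2000, n_bars=250, seed=1):
    """Score coin-flip strategies on pure-noise returns and return every profit."""
    rng = random.Random(seed)
    # Zero-mean Gaussian "returns": by construction there is nothing to learn.
    returns = [rng.gauss(0.0, 1.0) for _ in range(n_bars)]
    profits = []
    for _ in range(n_strategies):
        # Each candidate is long (+1) or short (-1) at random on every bar.
        positions = [rng.choice((-1, 1)) for _ in range(n_bars)]
        profits.append(sum(p * r for p, r in zip(positions, returns)))
    return profits

profits = best_of_random_strategies()
print("best back-tested profit:", round(max(profits), 2))
print("average profit:", round(sum(profits) / len(profits), 2))
```

The average candidate earns roughly nothing, yet the best of the thousands looks like a winner purely because it was picked after the fact. A back-test alone cannot distinguish that lucky candidate from a genuine edge.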
In order to start addressing the concerns inherent in the Genetic Programming model, you have to start by determining if you’re asking the right questions.
Fundamentally, any trading system can fail and most do not survive into perpetuity.
When building a system by hand, we would typically observe price patterns and identify trends and opportunities; we’d evaluate indicators and come to a set of entry and exit rules that we think give us an edge in the markets.
We’d evolve our system based on feedback and apply filters until we arrived at a ‘finished’ system. In practice that system would likely never be finished, as we’d continue to make changes and tweak it.
Genetic programming really isn’t that different!
All we’ve done is delegate that manual process to the computer. We’ve defined the goal, the definition of fitness, and delegated to the genetic algorithm the process of getting there!
A computer can construct and test literally millions of combinations of indicators and identify patterns in the data series that are not visible to the human eye.
What Are the Right Questions?
We’ve determined that Genetic Programming per se is not so different from how we’d build a regular trading system.
Sure, the computer takes a consistent and diligent approach and can exploit and infer things in the data that we cannot.
In that sense, it is something of a black box. It may take time to understand and assimilate the logic in the resulting source code. Overall though, the back-tested performance results look fantastic.
But are we confident given that we’ve delegated so much to the computer?
The question is: how did it get there?
The point here is that it is what the computer doesn’t know that is important. Fundamentally the process must include data in the series which is not known to the genetic algorithm.
This is called out-of-sample data and is fundamental if we are to put any trust into the Genetic Programming approach.
It’s only through the comparison of out-of-sample performance against in-sample performance that we are really able to gauge the effectiveness of the built system.
By evaluating the out-of-sample performance we can determine if we got there by chance or not: at a minimum, we can have greater confidence that we are not simply curve-fitting.
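A minimal sketch of that comparison follows, under stated assumptions: a rule is any function rule(prices, t) returning True when we want to be long over the next bar, performance is average profit per bar, and the 50% threshold used for held_up is an arbitrary illustrative choice, not an established standard. The names split_series, profit_per_bar and out_of_sample_report are hypothetical.

```python
def split_series(prices, in_sample_fraction=0.7):
    # The first part is visible to the build process; the rest is held back.
    cut = int(len(prices) * in_sample_fraction)
    return prices[:cut], prices[cut:]

def profit_per_bar(rule, prices):
    """Average profit per bar from holding one unit while rule(prices, t) is true."""
    bars = len(prices) - 1
    total = sum(prices[t + 1] - prices[t] for t in range(bars) if rule(prices, t))
    return total / bars

def out_of_sample_report(rule, prices, in_sample_fraction=0.7):
    in_sample, out_of_sample = split_series(prices, in_sample_fraction)
    is_perf = profit_per_bar(rule, in_sample)
    oos_perf = profit_per_bar(rule, out_of_sample)
    return {
        "in_sample": is_perf,
        "out_of_sample": oos_perf,
        # Assumed rule of thumb: out-of-sample keeps at least half the in-sample edge.
        "held_up": is_perf > 0 and oos_perf >= 0.5 * is_perf,
    }

def momentum(prices, t):
    # Example rule: be long after an up bar.
    return t > 0 and prices[t] > prices[t - 1]

rising_then_falling = list(range(100, 150)) + list(range(150, 100, -1))
print(out_of_sample_report(momentum, rising_then_falling, in_sample_fraction=0.5))
```

The essential design point is that the rule must be fixed before the held-back segment is examined; if the out-of-sample results are used to choose between candidates, that data has effectively become in-sample.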
Of course there is never any guarantee that the strategies we generate will continue to be profitable.
Financial markets change, and because price series are non-stationary, the possibility that their behaviour will shift over time is ever present.
In future blog posts, we’ll explore in more detail techniques that can be applied in order to minimize the risk of over-fitting and mitigate the inherent weaknesses in the model.
These include the so-called Walk-Forward testing approach that can be applied to achieve greater confidence in your trading system.
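As a brief preview, the core mechanic of walk-forward testing is a series of rolling build and test windows. The sketch below only generates the window boundaries; it assumes fixed-size windows that step forward by one test period at a time, which is one common variant among several, and the name walk_forward_windows is hypothetical.

```python
def walk_forward_windows(n_bars, train_size, test_size):
    """Yield (train_range, test_range) index pairs that roll forward through the data."""
    start = 0
    while start + train_size + test_size <= n_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size   # step forward by one test period

for train, test in walk_forward_windows(n_bars=10, train_size=4, test_size=2):
    print("build on bars", list(train), "then test on bars", list(test))
```

Each test window begins exactly where its build window ends, so every test segment is out-of-sample relative to the strategy that was built just before it.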