Author: rgearyiii

Subject: GTR1 and Market Timing

Date: 11/1/2012

Recommendations: 112

Introduction

I've completed another major step toward transforming the GTR1 backtester into a more powerful tool supporting the kinds of research that Zeelotes has demonstrated over the last decade. In case anyone missed it, the first and most laborious step toward this goal was moving from an annually-updated database to a daily-updated database that allows backtesting (and generating current screen selections) through the most recent market close. The step I've taken here, abstractly speaking, adds the ability for backtests to import field values generated within other backtests. The main purpose of this capability is to generate market timing signals and use them in backtests of screens, but it has a host of other traditional applications, such as backtesting overlaps, SOSs, etc. In this post I focus on market timing, using a WER screen to demonstrate both the mechanics of importing signals and the significant improvements in performance that can be obtained from a simple market timing signal.

importf and imports

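I've added two new field functions, imports and importf, with the following syntax:

imports:<screen_number>,[<field_number>|<field_label>],<lag_days>,<default_value>
importf:<screen_number>,[<field_number>|<field_label>],<lag_days>,<default_value>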

Broadly speaking, calling one of these functions spawns a separate and hidden backtest that exports the values of one of its fields to a temporary field file, which will be read into a field when the visible backtest is run. While the terminology focuses on importing, these functions operate in two steps: an invisible exporting step and a more visible importing step.

For these functions to work, a list of screens must be provided to the backtest engine in a text file using the argument cmdfile=, where each line of the file (numbered from 0) contains a backtest command. The user interface at backtest.org/gtr1/2011 creates this file behind the scenes automatically from the list of screen URLs provided in the new "Screen References" section of the form.

The argument screen_number in the above functions indicates which screen to import field values from (the "source screen").

The argument field_number or field_label, if provided, indicates which field to import from the source screen. Since field numbering is mostly hidden (and out of the user's control) behind the user interface at backtest.org/gtr1/2011, I recommend not referring to fields by number, but instead by label, i.e., the label applied to the field to export from the source screen. Referencing a field by label, of course, requires that the field you want to import actually has a label within the source screen.

The argument lag_days specifies how many market days the imported field will lag relative to the market dates in the backtest that is doing the importing. Note that if the field being imported is a field file that is already lagged within the source screen, then any lag specified here will add to that lag.

The argument default_value specifies a value to apply to any stocks for which the source screen did not produce a value in the field you are importing; when exactly this happens is described below. (Since signals, by definition, apply to all stocks, this argument should have no effect for the field function imports.)

The functions operate as follows: importf causes a new and separate backtest of the screen referenced by screen_number (i.e., the source screen) to be run in which the values of the field referenced by field_number or field_label are exported to a new temporary field file with default value equal to the value specified with default_value. When the temporary field file has been created, the backtest of the screen calling for the import is run. The field that makes the import call is populated like a regular field file using the temporary field file for values, with the argument lag_days specifying how many market days the field file is lagged. It is critical to note that when the temporary field file is created during the backtest of the source screen, only stocks eligible for the step in which the field is first invoked have their values exported, meaning all other stocks will be assigned the default value in the course of the import.

imports functions similarly, except that only one value is exported on each market date to the temporary field file, and that value is applied to all stocks in the course of the import. In order to be imported with imports, a field must be logically guaranteed to contain the same value for all stocks. If one attempts to import a field that is not a signal (i.e., is not logically guaranteed to have the same value for all stocks), then the backtester reports an error.

SQL Analogy

While the GTR1 backtester does not use an SQL engine, those familiar with SQL may find that picturing an SQL query helps in understanding how importf and imports work. importf performs a left join, where the left table consists of all stocks in the GTR1 universe on a given market date. The right table consists of the subset of all stocks eligible for the step where the field to export was first invoked within the source screen, either on the same market date (if lag_days is zero) or the market date lag_days prior. The join is made on GTR1 investment ID. The exported field is selected from the right table. Any stocks that do not exist in the right table (because they had already been filtered out by steps in the source screen when the field was exported) get the value specified by default_value in the imported field.

imports also performs a left join in a trivial sense. The left table is the same. However, the right table consists of a single record with a single field, the field being exported from the source screen. The value of that field is the unique value common to all stocks eligible for the step where the exported field is first invoked (if the field one is trying to import with imports does not have this uniqueness property, then it is ineligible for import with imports, and the backtest will report an error).
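
For those who think better in code than in SQL, here is a minimal Python sketch of the two joins for a single market date. It is not how the GTR1 engine is implemented. The function names mirror the field functions, but the dict-based data layout, the ticker symbols, and the choice to raise ValueError (including when nothing was exported) are assumptions made purely for illustration, and lag_days is left out.

```python
from typing import Dict, Iterable, TypeVar

V = TypeVar("V")


def importf(universe: Iterable[str], exported: Dict[str, V], default_value: V) -> Dict[str, V]:
    """Left join: every stock in the universe gets the exported value if the
    source screen produced one, otherwise default_value."""
    return {stock: exported.get(stock, default_value) for stock in universe}


def imports(universe: Iterable[str], exported: Dict[str, V]) -> Dict[str, V]:
    """Broadcast a signal: the exported field must hold one value common to
    all stocks that were eligible in the source screen."""
    distinct = set(exported.values())
    if len(distinct) != 1:
        # Assumption: this sketch treats "no exported values" as an error too.
        raise ValueError("field is not a signal: expected exactly one distinct value")
    (signal,) = distinct
    return {stock: signal for stock in universe}


if __name__ == "__main__":
    universe = ["AAA", "BBB", "CCC"]  # hypothetical tickers
    # BBB was filtered out in the source screen, so it receives the default.
    print(importf(universe, {"AAA": 1.5, "CCC": 2.0}, default_value=0.0))
    # A signal is the same for every stock, including BBB.
    print(imports(universe, {"AAA": 7, "CCC": 7}))
```

The point of the contrast is that importf can leave stocks with the default value, while imports always populates every stock with the same number.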

Counting Nasdaq New Highs and New Lows

The above description of the imports and importf functions isn't likely to be understood on first reading by anyone, so here I will walk through a detailed example of the construction of a market timing signal and its use in a backtest.

First, here is a screen that picks Nasdaq stocks that are at new highs:

http://backtest.org/gtr1/2011/?dspo:al252:excd.a:et3:styp.a:...
Translation:
step0: [mkt Days Since security Opened for purchase] >= 252
step1: [Exchange Code; lag=1 days] == 3
step2: [Security Type; lag=1 days] == 10,11
step3: [[g-price; quote_lag=0 days]/[High g-price over 251 days; lag=1 days]] > 1; Long, Cash When None

Step 0 keeps only stocks with at least 252 market days of pricing history within the GTR1 universe.
Step 1 keeps only Nasdaq stocks.
Step 2 keeps only US corporations (i.e., it excludes all ADRs, companies incorporated outside the US, unit trusts, closed-end funds, REITs and ETFs).
Step 3 keeps only stocks that are at a new 252-day high. I.e., it ensures that current g-price (gprc(0)) is greater than the highest g-price over the previous 251 market dates (hgprc(1,251)).

The screen that picks Nasdaq stocks at new lows is very similar:

http://backtest.org/gtr1/2011/?dspo:al252:excd.a:et3:styp.a:...
Translation:
step0: [mkt Days Since security Opened for purchase] >= 252
step1: [Exchange Code; lag=1 days] == 3
step2: [Security Type; lag=1 days] == 10,11
step3: [[g-price; quote_lag=0 days]/[Low g-price over 251 days; lag=1 days]] < 1; Long, Cash When None

These two backtests are nothing new with this update, and anyone could have constructed Nasdaq NH-NL signals by running these two backtests in counting mode and downloading the spreadsheets with daily statistics. What is new is that the backtester now allows you to build the signal within a backtest and use it with any screen.

The next step is to generate counts of these new highs and new lows as fields. This is done as follows:

(0) http://backtest.org/gtr1/2011/?dspo:al252:excd.a:et3:styp.a:...
Translation:
Create [CountOfHighs]: [# Eligible at step4]
step0: [mkt Days Since security Opened for purchase] >= 252
step1: [Exchange Code; lag=1 days] == 3
step2: [Security Type; lag=1 days] == 10,11
step3: [[g-price; quote_lag=0 days]/[High g-price over 251 days; lag=1 days]] > 1
step4: [CountOfHighs] == -1; Long, Cash When None

Here I have created a field labeled "CountOfHighs" using the field function sum. This field sums the number 1 over all stocks eligible for step 4 (i.e., all stocks passing step 3), which is simply a count of the stocks at new highs. Since every field defined must be used in some step, I have also added step 4, which uses this new field, [CountOfHighs]. Since it doesn't matter what this step is for our purposes (we don't actually care about this backtest--we are simply using it to create a field we are going to import into another backtest), I have made it a step that is guaranteed not to select any stocks (a count can never equal -1) in order to improve execution speed and save Jamie some electricity.

Likewise, the screen that counts the number of stocks at new lows is as follows:

(1) http://backtest.org/gtr1/2011/?dspo:al252:excd.a:et3:styp.a:...
Translation:
Create [CountOfLows]: [# Eligible at step4]
step0: [mkt Days Since security Opened for purchase] >= 252
step1: [Exchange Code; lag=1 days] == 3
step2: [Security Type; lag=1 days] == 10,11
step3: [[g-price; quote_lag=0 days]/[Low g-price over 251 days; lag=1 days]] < 1
step4: [CountOfLows] == -1; Long, Cash When None

Note that for both of these screens, the holding period is irrelevant for the purpose here, so I have left both at the default value of 20.
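
To make the counting logic concrete outside the backtester, here is a rough Python sketch of the new-high and new-low tests and the two counts for one market date. It assumes each stock's prices arrive as a chronological list ending with the current g-price. The function names and the data layout are my own inventions for illustration, and the exchange and security-type filters of steps 1 and 2 are omitted.

```python
from typing import Dict, List, Tuple


def is_new_high(prices: List[float], lookback: int = 251) -> bool:
    """True if the latest price exceeds the highest of the previous `lookback` prices."""
    if len(prices) < lookback + 1:
        return False  # not enough history (the role of step 0 in the screens)
    return prices[-1] > max(prices[-1 - lookback:-1])


def is_new_low(prices: List[float], lookback: int = 251) -> bool:
    """True if the latest price is below the lowest of the previous `lookback` prices."""
    if len(prices) < lookback + 1:
        return False
    return prices[-1] < min(prices[-1 - lookback:-1])


def count_highs_and_lows(histories: Dict[str, List[float]], lookback: int = 251) -> Tuple[int, int]:
    """Return (count of new highs, count of new lows) across all stocks."""
    highs = sum(1 for p in histories.values() if is_new_high(p, lookback))
    lows = sum(1 for p in histories.values() if is_new_low(p, lookback))
    return highs, lows


if __name__ == "__main__":
    # Tiny hypothetical example with a 3-day lookback instead of 251.
    histories = {
        "AAA": [10, 11, 12, 13],  # new high
        "BBB": [10, 9, 8, 7],     # new low
        "CCC": [10, 12, 9, 11],   # neither
        "DDD": [5, 6],            # too little history, ignored
    }
    print(count_highs_and_lows(histories, lookback=3))
```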

Creating a Nasdaq NH-NL Signal

The next step is to bring the count of new highs and the count of new lows together in one backtest so that a Nasdaq NH-NL signal can be created. An example of such a backtest is the following:

http://backtest.org/gtr1/2011/?NHNLDiff:gt0:aprc:tn10:CountO...

To construct this backtest, I have started with a blank form at backtest.org/gtr1/2011. I have copied the "count of highs" URL from above (labeled (0)) and pasted it in the box for screen reference 0. I have copied the "count of lows" URL from above (labeled (1)) and pasted it in the box for screen reference 1. I have then defined two labeled fields, "CountOfHighs" and "CountOfLows", where the first imports the field labeled "CountOfHighs" from screen 0 and the second imports the field labeled "CountOfLows" from screen 1.

I should point out that while I have labeled my imported fields with the same labels used in the screens I've imported them from, this is completely unnecessary. I could just as easily have labeled the expression "imports(0,CountOfHighs,0,0)" as "CountHighs" or anything else, or not labeled it at all and used it directly within a step.

It's critical to note that I have used the field function imports, not importf, to import the count of highs and count of lows, thereby ensuring that the counts are imported for every single stock in the GTR1 universe. If I had used the field function importf instead, only the same stocks that passed the step in the source screen where the counts were made would have counts imported; all other stocks would be assigned default values of 0 by the import. For market timing signals, we want the field populated with the same value for all stocks.

The first step of this screen (step 0) allows stocks to pass only if [CountOfHighs] is greater than [CountOfLows]. Since these fields have the same values for all stocks, either all stocks will pass step 0, or none will, which is what we want from a market timing signal.

The second (and final) step of this screen (step 1) takes the top 10 stocks by aprc (actual price). The holding period is 20 market days. (The holding periods of the source screens are usually irrelevant, and can therefore differ from the holding period of the importing screen. There are exceptions, such as if the field function dsp is used in the source screen.)

In other words, the backtest simulates a trading strategy that involves portfolio updates every 20 market days. If the count of Nasdaq stocks at new highs is greater than the count of Nasdaq stocks at new lows, then the portfolio is filled with the 10 highest-priced stocks in equal dollar value and held for 20 market days; otherwise, any stocks already held are sold and the portfolio consists of cash for the next 20 market days.
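
The decision made at each portfolio update can be sketched in a few lines of Python. This is only an illustration of the rule just described, not of the backtester itself; the function name, the dict of prices, and the tickers are assumptions.

```python
from typing import Dict, List


def select_portfolio(count_of_highs: int, count_of_lows: int,
                     prices: Dict[str, float], top_n: int = 10) -> List[str]:
    """Return the stocks to hold for the next holding period.

    An empty list means the portfolio sits in cash.
    """
    # Step 0: the signal passes all stocks or none.
    if count_of_highs <= count_of_lows:
        return []
    # Step 1: keep the top_n stocks by actual price, highest first.
    ranked = sorted(prices, key=prices.get, reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    prices = {"AAA": 10.0, "BBB": 30.0, "CCC": 20.0}  # hypothetical prices
    print(select_portfolio(5, 3, prices, top_n=2))  # signal on: two highest-priced
    print(select_portfolio(3, 3, prices, top_n=2))  # signal off: cash
```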

The version of the Nasdaq NH-NL signal I have constructed is far from optimal, and using it every 20 market days is even less optimal. However, using the signal in this fashion does improve results, depending on one's assumptions about trading costs. Without the signal, the backtest results for this screen (which I call "Buffet and Friends"),

http://backtest.org/gtr1/2011/?aprc:tn10

are an average CAGR of 12.9, an average GSD(20) of 18.3 and an average maximum drawdown of -58%. With the signal, the results are improved to an average CAGR of 14.2, an average GSD(20) of 12.5 and an average maximum drawdown of -23%. The only metric not improved is annual turnover, which rises from about 1.4 to 2.3.

A Smoother NHNL Signal

I've barely begun using these new functions myself, so I have no idea what the best signals based on new highs and new lows might look like, but here I present a signal along the same lines that improves results and, more importantly for this post, demonstrates the capabilities of the GTR1 backtester for exploring new signals.

Anyone developing a market timing signal will be looking for ways to reduce whipsaws, i.e., to smooth out the signal. One way to do that is to use weighted moving averages of a base signal. Once a base signal has been constructed, it can be imported with varying values specified by lag_days to get the base signal's values over an interval, which can then be averaged or summed. I use this technique in the construction of this signal:

(3) http://backtest.org/gtr1/2011/?NHNLRatio:et-1:WeightedCountO...

In this screen, I make repeated use of the same [CountOfHighs] and [CountOfLows] fields constructed previously, importing them with different lags. The URLs for their source screens are in the Screen References form next to "0" and "1", respectively.

I have used the field function linear to create a weighted sum, labeled [WeightedCountOfHighs], of the number of new highs over the nine market dates through the present. The current market date is given a weight of 9, the previous market date is given a weight of 8, and so on, down to a weight of 1 for the CountOfHighs with a lag of eight market days. I have likewise created a weighted sum, labeled [WeightedCountOfLows], of the number of new lows over the same nine market days.

Instead of calculating a difference, I have calculated a ratio in the field labeled [NHNLRatio], which consists of [WeightedCountOfHighs] divided by [WeightedCountOfLows]. Division by zero is not a concern, because while [CountOfLows] is sometimes zero, it has never been zero nine market days in a row, so [WeightedCountOfLows] is never zero.
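
For anyone who wants to check the arithmetic by hand, here is a small Python sketch of the weighted sums and the ratio. It assumes the daily counts are available as chronological lists ending with the current market date; the function names are mine, not GTR1's, and the sketch simply lets Python raise ZeroDivisionError if the weighted count of lows is zero.

```python
from typing import Sequence


def weighted_count(daily_counts: Sequence[float], window: int = 9) -> float:
    """Linearly weighted sum of the last `window` daily counts.

    The most recent count gets weight `window`, the one before it
    `window - 1`, and so on down to weight 1 for a lag of `window - 1` days.
    """
    if len(daily_counts) < window:
        raise ValueError("need at least `window` daily counts")
    recent = daily_counts[-window:]
    # enumerate(..., start=1) gives weight 1 to the oldest count in the window.
    return sum(weight * count for weight, count in enumerate(recent, start=1))


def nhnl_ratio(highs: Sequence[float], lows: Sequence[float], window: int = 9) -> float:
    """Weighted count of new highs divided by weighted count of new lows."""
    return weighted_count(highs, window) / weighted_count(lows, window)


if __name__ == "__main__":
    highs = [10] * 9  # hypothetical: 10 new highs on each of nine days
    lows = [5] * 9    # hypothetical: 5 new lows on each of nine days
    print(weighted_count(highs))    # 10 * (1 + 2 + ... + 9) = 450
    print(nhnl_ratio(highs, lows))  # 450 / 225 = 2.0
```

Because the newest day carries nine times the weight of the oldest, a burst of new highs today moves the ratio much more than the same burst eight market days ago.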

Note that this screen has only one step, which requires that [NHNLRatio] equal -1. This never happens, so the screen never picks any stocks (the results you see when you click "Run Backtest" are for a cash-only portfolio). The screen itself is irrelevant--we are only interested in importing [NHNLRatio] from it to be used in other backtests. As before, I have therefore used a step that ensures that no stocks are selected in order to speed up execution time.

If you wish to see the daily signals that this screen produces, simply change the first step to something like NHNLRatio > 1, run the backtest in counting mode and download the spreadsheet, which will show when stocks pass this step (either all stocks will pass it, or none will).

Improving a Simple WER Screen

I've chosen a WER screen to demonstrate the benefits of this signal because so far I'm finding that technically-based screens with high turnover (such as the weekly WER list) have the most to gain from Nasdaq NHNL market timing. The WER screen I'm testing is as follows:

http://backtest.org/gtr1/2011/?pe.w:gt0:pe.w:bn10
Translation:
step0: [WER P/E; lag=1 days] > 0
step1: [WER P/E; lag=1 days] Bottom param0; Long, Cash When Fewer
Holding period = 20 mkt days; Fully rebalance every 1 periods

19920103-20121031
                  Avg       Min       Max        SD
CAGR:           29.78     25.33     35.23      2.73
TR:          24341.04  10707.84  52260.16  11000.98
GSD(20):        30.51     29.16     33.21      0.96
DD(20):         17.89     16.88     19.54      0.77
MDD:           -54.92    -63.97    -40.89      5.80
UI(20):         15.84     11.16     22.23      3.21
Sharpe(20):      1.01      0.88      1.16      0.08
Beta(20):        0.98      0.93      1.05      0.03
TI(20):         27.38     23.75     31.68      2.34
AT:              7.41      7.21      7.59      0.12

To add my smoothed NHNL market timing signal to this screen, I start at the above link, switch to "Free-Form" and insert the step "NHNLRatio > 1" at the top (for step 0). I then copy and paste the URL for my smoothed NHNL signal (labeled (3) above) into the box for Screen Reference 0. I must also add a labeled field expression, "NHNLRatio: imports(0,NHNLRatio,0,0)". To get maximum benefit from the signal (i.e., enable the screen to respond to a signal change on any market day), I set the holding period to 1 market day. Finally, to cut down on turnover (both from frequent signal changes and from weekly changes in screen picks), I set the rebalancing frequency to every 20 market days and add an HTD condition that causes a stock to be held if the NHNLRatio is still greater than 0.9 and the stock has been held for less than 20 market days. The result of these changes is this:

http://backtest.org/gtr1/2011/?h1r20::NHNLRatio:gt1:pe.w:gt0...
Translation:
step0: [NHNLRatio] > 1
step1: [WER P/E; lag=1 days] > 0
step2: [WER P/E; lag=1 days] Bottom 10; Long, Cash When Fewer
Hold long while [[NHNLRatio] > 0.9 ? [[mkt Days Since Purchase] < 20 ? 1 : 0] : 0] > 0
Holding period = 1 mkt days; Fully rebalance every 20 periods

19920103-20121031
                  Avg       Min       Max        SD
CAGR:           33.47     33.47     33.47      0.00
TR:          39793.17  39793.17  39793.17      0.00
GSD(20):        22.90     22.90     22.90      0.00
DD(20):          9.77      9.77      9.77      0.00
MDD:           -25.26    -25.26    -25.26      0.00
UI(20):          5.19      5.19      5.19      0.00
Sharpe(20):      1.31      1.31      1.31      0.00
Beta(20):        0.51      0.51      0.51      0.00
TI(20):         56.28     56.28     56.28      0.00
AT:              7.94      7.94      7.94      0.00

The results speak for themselves. With only slightly greater annual turnover (7.94 versus 7.41), Sharpe ratio is raised from 1.01 to 1.31, and every other metric is improved, some quite dramatically (such as maximum drawdown, reduced from -55% to -25%). (Note that the Min and Max are identical, and the Standard Deviation is 0 for each metric because there is only one trading cycle when the holding period is set to 1.)
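
The entry step and the HTD condition in the translation above form a simple band: a stock can only be bought when [NHNLRatio] is above 1, but once bought it is kept while the ratio stays above 0.9 and it has been held for fewer than 20 market days. A minimal Python sketch of just those two tests follows. The function names are hypothetical, and the sketch deliberately ignores the WER P/E steps and how the backtester schedules rebalancing.

```python
def may_buy(nhnl_ratio: float, entry_threshold: float = 1.0) -> bool:
    """Step 0 of the screen: stocks pass only when the ratio exceeds 1."""
    return nhnl_ratio > entry_threshold


def should_hold(nhnl_ratio: float, days_since_purchase: int,
                hold_threshold: float = 0.9, max_days: int = 20) -> bool:
    """HTD condition: [NHNLRatio] > 0.9 and [mkt Days Since Purchase] < 20."""
    return nhnl_ratio > hold_threshold and days_since_purchase < max_days


if __name__ == "__main__":
    # A ratio of 0.95 is too weak to open a position...
    print(may_buy(0.95))
    # ...but strong enough to keep one that is 5 market days old.
    print(should_hold(0.95, days_since_purchase=5))
    # After 20 market days the hold condition lapses regardless of the ratio.
    print(should_hold(1.50, days_since_purchase=20))
```

The gap between 0.9 and 1 is what keeps a ratio hovering near 1 from forcing a sale every time it dips slightly.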

A Caution about Over-Fitting

One of the GTR1 backtester's main advantages (and in fact its original reason for existing) is the ability to combat excessive curve-fitting by forcing all screens to make selections on every market day in the backtest (as opposed to once per month or once per week) and averaging results over all trading cycles. If market timing is to be used optimally, the holding period must be set to 1 and HTD conditions must be used to extend the holding period for stocks to reduce turnover. With a holding period of 1 market day, the advantage just mentioned is completely lost, and we are back to the same level of statistical noise that we find in monthly data backtests and the same amount of freedom for curve-fitting. This is unavoidable. Market timing calls for trading on certain days (when there are signal changes), which means any backtest of market timing must throw out a lot of market days where a screen's selection criteria could otherwise be put to the test.

Thus, be extremely cautious about tuning a market timing signal to specific screens.

Copying and Pasting URLs with Field/Signal Imports

The user interface encodes referenced screens for imports within braces ({ and }). While braces are perfectly legitimate characters in URLs, unfortunately fool.com's message boards break URLs at the first brace. To get around this problem (which I regard as a bug), I have manually replaced braces in URLs with their ASCII codes (%7b for '{' and %7d for '}').

Another issue to keep in mind is that URLs with screen references can get very long. That last link I posted above is 1,182 characters long. Internet Explorer cannot handle URLs over 2,000 characters. Firefox has a much higher limit, but Jamie's server (running Apache) limits URLs to 8,000 characters, though Jamie has told me he can increase that limit if need be.
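
If you would rather not do the brace replacement by hand, a couple of lines of Python will do it and check the length at the same time. This is just a convenience sketch: the function names are mine, the limits are the round figures quoted above passed in as parameters, and the example query string is made up.

```python
def encode_braces(url: str) -> str:
    """Replace literal braces with their percent-encoded forms."""
    return url.replace("{", "%7b").replace("}", "%7d")


def url_fits(url: str, limit: int) -> bool:
    """True if the URL is no longer than `limit` characters."""
    return len(url) <= limit


if __name__ == "__main__":
    # Hypothetical URL; a real one would carry a full screen reference in braces.
    raw = "http://backtest.org/gtr1/2011/?a{b}c"
    safe = encode_braces(raw)
    print(safe)
    print(url_fits(safe, 2000))  # browser limit quoted in the post
    print(url_fits(safe, 8000))  # server limit quoted in the post
```

Note that each brace grows from one character to three, so the encoded URL is the one whose length matters.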

More Features to Come

My next step will be to add the ability to create new investment universes consisting of screens or specific Yahoo! ticker symbols (with mostly indexes in mind). In the case of screen-based investment universes, we will be able to run backtests where screens are treated as if they are stocks. In the case of symbol-based investment universes, we will be able to run backtests where indexes are treated as if they are stocks. All of these backtests will be able to export signals and fields, which will then be able to be joined back into the set of fields used in regular backtests. All of this vastly expands the possibilities for building and backtesting market timing signals and WWL systems.

Robbie Geary