Pairs Trading
The Complete Statistical Arbitrage Guide
In the mid-1980s, a Morgan Stanley quant named Nunzio Tartaglia gathered a group of physics PhDs in the basement to write code, and they exploited the observation that two companies whose stock prices had historically moved in lockstep tend to converge again after a short-term divergence. They turned that insight into a systematic trading strategy that reportedly earned the firm about $50 million in 1987, and the technique came to be known as "pairs trading".
This article covers the math behind pairs trading, how Morgan Stanley got it off the ground, how the modern crypto version works, and why 95% of retail attempts fail. Python and Pine implementations included.
1. Pairs Trading in One Sentence
Go long one asset and short another highly correlated asset at the same time, betting that the spread between them will mean-revert after a short-term divergence.
Classic example: Coca-Cola (KO) and Pepsi (PEP) trade in a similar long-term pattern. One day KO rallies but PEP lags — you short KO and long PEP, then close both legs for a profit when the spread converges.
2. The Math: Cointegration, Spread, Z-Score
2.1 Cointegration
Two time series that are individually non-stationary (they trend up or down persistently) but whose linear combination is stationary (it reverts to a mean) are said to be cointegrated.
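The selection test for pairs trading is a cointegration test, typically Engle-Granger (for a single pair) or Johansen (which also handles more than two series). Rejecting the "no cointegration" null is evidence that a pair is a candidate, not a guarantee that it stays tradeable.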
# Python: Engle-Granger cointegration test
from statsmodels.tsa.stattools import coint
import pandas as pd
# A, B are two assets' price series (same period)
score, pvalue, _ = coint(A, B)
# pvalue < 0.05 → reject "no cointegration"; consider pairing
# Smaller pvalue = stronger evidence of cointegration in this sample
2.2 Spread
Take the log prices of A and B, run an OLS regression of log(A) on log(B) to get the hedge ratio β, and compute Spread = log(A) − β × log(B).
Pairs trading then becomes a bet that the spread reverts to its historical mean.
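The hedge-ratio step can be sketched with NumPy alone. This is a minimal illustration, not a production estimator: the function name hedge_ratio_and_spread, the intercept term alpha, and the synthetic prices are all assumptions made for the example.

```python
import numpy as np

def hedge_ratio_and_spread(a_prices, b_prices):
    """Regress log(A) on log(B) by OLS; return (beta, alpha, spread)."""
    log_a = np.log(np.asarray(a_prices, dtype=float))
    log_b = np.log(np.asarray(b_prices, dtype=float))
    # Design matrix with an intercept column: log_a ~ alpha + beta * log_b
    X = np.column_stack([np.ones_like(log_b), log_b])
    (alpha, beta), *_ = np.linalg.lstsq(X, log_a, rcond=None)
    spread = log_a - beta * log_b  # same definition as in the text
    return beta, alpha, spread

# Synthetic demo data (hypothetical, not real prices)
rng = np.random.default_rng(0)
log_b = np.log(100.0) + np.cumsum(rng.normal(0, 0.01, 1000))  # random walk
noise = rng.normal(0, 0.005, 1000)                            # stationary residual
b_prices = np.exp(log_b)
a_prices = np.exp(0.5 + 0.8 * log_b + noise)                  # true beta is 0.8

beta, alpha, spread = hedge_ratio_and_spread(a_prices, b_prices)
print(f"beta={beta:.3f}, alpha={alpha:.3f}, spread mean={spread.mean():.3f}")
```

Because the regression includes an intercept, the spread's sample mean equals the fitted alpha, which is the level the z-score later measures deviations from.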
2.3 Z-Score Entry Signal
Standardize the spread: z = (spread − mean) / std. Common rules:
- z > +2: spread too high → short A, long B
- z < −2: spread too low → long A, short B
- |z| < 0.5: spread reverts → close positions
- |z| > 4: possible structural break → force stop-out
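# Simplified pairs trading logic
# A, B: aligned pandas price Series; beta: hedge ratio from section 2.2
# short(), long(), close_all(): placeholders for your own order functions
import numpy as np
window = 60  # 60-period mean/stdev
spread = np.log(A) - beta * np.log(B)
mean = spread.rolling(window).mean()
std = spread.rolling(window).std()
z = (spread - mean) / std
position = 0  # +1 = long spread, -1 = short spread, 0 = flat
for _, zt in z.dropna().items():
    if position != 0 and abs(zt) > 4:
        close_all(); position = 0  # Model broken: emergency exit
    elif position != 0 and abs(zt) < 0.5:
        close_all(); position = 0  # Spread reverted: close both legs
    elif position == 0 and 2 < zt <= 4:  # no new entries beyond the stop level
        short(A); long(B); position = -1
    elif position == 0 and -4 <= zt < -2:
        long(A); short(B); position = +1
3. The Morgan Stanley Origin Story (Mid-to-Late 1980s)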
Around 1986, Nunzio Tartaglia led a group at Morgan Stanley running experiments on stock "pairs". The people associated with it included Gerry Bamberger (a programmer who had pioneered the approach at the firm) and David Shaw (who would later launch D. E. Shaw, whose alumni went on to found Two Sigma).
At the time it was called "Black Box Trading" because nobody outside the group knew what was inside. It was later confirmed that the core logic was pairs trading plus some statistical arbitrage. Reportedly, the group earned about $50 million in profit for the firm in 1987.
The 1998 LTCM Blow-Up
The biggest lesson from pairs trading: cointegration is not permanent.
In 1998, LTCM (Long-Term Capital Management) was running a similar logic on sovereign bonds at 25x leverage. When Russia defaulted, correlations suddenly broke, every spread that was "supposed to revert" kept diverging, and they lost $4.6B in three months — nearly bringing down the global financial system. The Fed had to convene the banks for an emergency rescue.
For the full breakdown, see Mean Reversion Strategy — The Full Breakdown of LTCM's $4.6B Collapse.
4. The Crypto Version: BTC vs ETH?
The most common pair example in crypto is BTC vs ETH. Their long-term correlation is around 0.85 (rolling one year), but they diverge over short windows.
That said, this pair isn't really "cointegrated" — the underlying drivers are different (BTC is a store-of-value narrative, ETH is a DeFi / smart contract narrative), and the ETH/BTC ratio has structural drift over the long term.
Pairs that are more amenable to a pairs strategy in crypto:
- Peer L1s: SOL vs AVAX, APT vs SUI (launched around the same time, similar narratives)
- Peer LSTs: stETH vs cbETH (both staked ETH, the difference is the issuer)
- Peer stablecoins: USDC vs USDT (extremely tight spread, only meaningful at high frequency)
- L1 vs derivative L2: ETH vs ARB / OP (ARB is strongly correlated to ETH but has its own premium cycles)
- BTC spot vs BTC perpetual: technically called "spot-futures arbitrage", but it's really just a z-score applied to the basis
5. A Simplified Pine Example
Here's a z-score pairs strategy in Pine for BTC vs ETH:
//@version=5
strategy("BTC-ETH Pairs", overlay=false, initial_capital=10000)
// Get prices of two assets
btc = request.security("BINANCE:BTCUSDT", timeframe.period, close)
eth = request.security("BINANCE:ETHUSDT", timeframe.period, close)
// Compute spread (simplified ratio here; real version uses OLS hedge ratio first)
ratio = btc / eth
window = input.int(100, "Window")
mean = ta.sma(ratio, window)
sd = ta.stdev(ratio, window)
z = (ratio - mean) / sd
// Signal rules
long_signal = z < -2
short_signal = z > 2
exit_signal = math.abs(z) < 0.5
// Note: a Pine strategy only places orders on the chart symbol;
// the opposite leg has to be executed elsewhere (see section 8)
if long_signal
    strategy.entry("Long BTC / Short ETH", strategy.long)
if short_signal
    strategy.entry("Short BTC / Long ETH", strategy.short)
if exit_signal
    strategy.close_all()
plot(z, title="Z-Score")
hline(2, color=color.red)
hline(-2, color=color.green)
hline(0, color=color.gray)
6. Five Traps Retail Traders Fall Into
- Confusing "correlation" with "cointegration". A 0.9 correlation doesn't mean the spread will revert — it just means they move in the same direction. Cointegration is about whether the spread itself has mean-reverting properties.
- Not re-estimating the hedge ratio dynamically. β shifts with market regime, so a fixed β can go stale within months.
- Slippage on simultaneous fills. When the two legs don't fill at the same time, you already have one-sided exposure — that's a naked position, not a pair.
- No stop-loss. When cointegration breaks, the spread can keep diverging, and adding to the position while waiting for reversion, as LTCM did, is self-destruction.
- Running too few pairs. Institutions run 100–500 pairs to diversify away pair-specific risk. Retail runs 1–2 pairs, and a single failure ends the game. Run at least 5 to diversify.
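The first trap can be made concrete with synthetic data. The sketch below builds two hypothetical pairs whose returns are both highly correlated, but only one of which has a mean-reverting spread. The helper lag1_coefficient and every parameter value are assumptions chosen for illustration; the lag-1 coefficient is a rough diagnostic, not a substitute for a formal cointegration test.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

def lag1_coefficient(x):
    """OLS slope of x[t] on x[t-1]; a value near 1.0 means no mean reversion."""
    x = np.asarray(x, dtype=float)
    prev = x[:-1] - x[:-1].mean()
    curr = x[1:] - x[1:].mean()
    return float(prev @ curr / (prev @ prev))

# Pair 1: returns share a common factor, but idiosyncratic shocks accumulate,
# so the log-price spread is itself a random walk (correlated, NOT cointegrated).
common = rng.normal(0, 0.01, n)
log_a1 = np.cumsum(common + rng.normal(0, 0.003, n))
log_b1 = np.cumsum(common + rng.normal(0, 0.003, n))
spread_1 = log_a1 - log_b1

# Pair 2: log A equals log B plus a stationary AR(1) deviation (cointegrated).
log_b2 = np.cumsum(rng.normal(0, 0.01, n))
innov = rng.normal(0, 0.005, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.9 * u[t - 1] + innov[t]
log_a2 = log_b2 + u
spread_2 = log_a2 - log_b2

corr_1 = np.corrcoef(np.diff(log_a1), np.diff(log_b1))[0, 1]
corr_2 = np.corrcoef(np.diff(log_a2), np.diff(log_b2))[0, 1]
phi_1 = lag1_coefficient(spread_1)
phi_2 = lag1_coefficient(spread_2)

print(f"pair 1: return corr={corr_1:.2f}, spread lag-1 coef={phi_1:.3f}")
print(f"pair 2: return corr={corr_2:.2f}, spread lag-1 coef={phi_2:.3f}")
```

Both pairs look similar on a correlation screen, yet only the second spread is pulled back toward its mean, which is the property the z-score rules rely on.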
7. Why 95% of Retail Attempts Fail
Pairs trading looks simple (it's just a z-score), but it's actually a strategy with serious statistical demands.
- Select pairs with a cointegration test and estimate the hedge ratio
- Re-estimate β dynamically (rolling regression)
- Run multiple pairs in parallel to diversify risk
- Monitor for structural breaks (for example, a rolling Augmented Dickey-Fuller test on the spread that stops rejecting a unit root)
- Execute across exchanges to avoid dependence on a single venue's funding rate or slippage
Retail usually skips all of this — they see a simple BTC/ETH ratio signal and go live, then bleed through a structural divergence (an ETH upgrade event, say).
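The rolling re-estimation of β mentioned above could look roughly like this. It is a NumPy-only sketch under stated assumptions: rolling_beta is a hypothetical helper, the 200-period window is arbitrary, and the data is synthetic, with the true β deliberately shifted from 0.8 to 1.2 halfway through to mimic a regime change.

```python
import numpy as np

def rolling_beta(log_a, log_b, window):
    """Rolling OLS slope of log_a on log_b; the first window-1 entries are NaN."""
    log_a = np.asarray(log_a, dtype=float)
    log_b = np.asarray(log_b, dtype=float)
    betas = np.full(log_a.shape, np.nan)
    for end in range(window, len(log_a) + 1):
        a = log_a[end - window:end]
        b = log_b[end - window:end]
        b_centered = b - b.mean()
        # Slope of an OLS fit with intercept: cov(a, b) / var(b)
        betas[end - 1] = b_centered @ (a - a.mean()) / (b_centered @ b_centered)
    return betas

# Synthetic log prices (relative to the start) with a regime shift in beta
rng = np.random.default_rng(7)
n, brk, window = 1000, 500, 200
ret_b = rng.normal(0, 0.01, n)
true_beta = np.where(np.arange(n) < brk, 0.8, 1.2)
log_b = np.cumsum(ret_b)
log_a = np.cumsum(true_beta * ret_b) + rng.normal(0, 0.002, n)

betas = rolling_beta(log_a, log_b, window)
print(f"beta just before the shift: {betas[brk - 1]:.2f}")
print(f"beta at the end of the sample: {betas[-1]:.2f}")
```

A fixed β estimated on the first half would mis-hedge the second half; the rolling estimate adapts, at the cost of lagging the shift by up to one window.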
8. Implementing It on TVSBot
The pieces you need:
- Two Alerts (one for long, one for short), or a single Alert that routes to multiple strategies (TVSBot supports payload.strategy to fire two strategies at once; see the multi-strategy routing docs)
- Two TVSBot strategies, one for long BTC and one for short ETH, tied to the same API key so the hedge stays within one account
- Calibrated position sizing: set the long / short qty based on the hedge ratio β (β = 0.8 means $100 long BTC corresponds to $80 short ETH)
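The sizing rule in the last bullet is simple arithmetic. The sketch below is illustrative only: leg_sizes is a hypothetical helper, the prices are made-up numbers, and it does not reflect any TVSBot API.

```python
def leg_sizes(long_notional_a, beta, price_a, price_b):
    """Return (qty_a, qty_b, short_notional_b) for a long-A / short-B position."""
    short_notional_b = beta * long_notional_a   # beta dollars of B per dollar of A
    qty_a = long_notional_a / price_a           # units of A to buy
    qty_b = short_notional_b / price_b          # units of B to sell short
    return qty_a, qty_b, short_notional_b

# Hypothetical prices, used only to turn notionals into quantities
qty_btc, qty_eth, eth_notional = leg_sizes(
    long_notional_a=100.0, beta=0.8, price_a=60_000.0, price_b=3_000.0
)
print(f"long {qty_btc:.6f} BTC, short {qty_eth:.6f} ETH (${eth_notional:.2f} notional)")
```

With β = 0.8 the short ETH leg carries $80 of notional against $100 of long BTC, matching the example in the list; whenever β is re-estimated, the quantities need to be recomputed.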
Get started
Want to run pairs trading or statistical arbitrage? TVSBot supports multi-strategy routing and cross-exchange execution, splitting signals across different API keys to fire orders in parallel.
9. Three Key Takeaways
- Pairs trading is a market-neutral strategy — one leg long, one leg short, theoretically independent of broad market direction.
- Cointegration ≠ correlation. Pair selection requires statistical testing (Engle-Granger / ADF), not just eyeballing two charts that move together.
- LTCM 1998 is the biggest lesson: cointegration can break suddenly, and no stop-loss + high leverage = disaster. You need a circuit breaker like "force-close at |z| > 4".