遞迴最小平方法

遞迴最小平方法是普通最小平方法的一個擴展視窗版本。除了可以遞迴計算迴歸係數外,遞迴計算的殘差還可被用來建構統計量,以研究參數不穩定性。

RecursiveLS 類別允許計算遞迴殘差,並計算 CUSUM 和 CUSUM 平方統計量。將這些統計量與表示統計上顯著偏離參數穩定零假設的參考線一起繪製,可以輕鬆地視覺指示參數穩定性。

最後,RecursiveLS 模型允許對參數向量施加線性限制,並且可以使用公式介面構建。

[1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import statsmodels.api as sm
from pandas_datareader.data import DataReader

np.set_printoptions(suppress=True)

範例 1:銅

首先,我們考慮銅數據集中參數的穩定性(描述如下)。

[2]:
print(sm.datasets.copper.DESCRLONG)

dta = sm.datasets.copper.load_pandas().data
dta.index = pd.date_range("1951-01-01", "1975-01-01", freq="YS")
endog = dta["WORLDCONSUMPTION"]

# To the regressors in the dataset, we add a column of ones for an intercept
exog = sm.add_constant(
    dta[["COPPERPRICE", "INCOMEINDEX", "ALUMPRICE", "INVENTORYINDEX"]]
)
This data describes the world copper market from 1951 through 1975.  In an
example, in Gill, the outcome variable (of a 2 stage estimation) is the world
consumption of copper for the 25 years.  The explanatory variables are the
world consumption of copper in 1000 metric tons, the constant dollar adjusted
price of copper, the price of a substitute, aluminum, an index of real per
capita income base 1970, an annual measure of manufacturer inventory change,
and a time trend.

首先,構建並擬合模型,並列印摘要。儘管 RLS 模型會遞迴計算迴歸參數,因此估計值數量與資料點數量一樣多,但摘要表僅呈現根據整個樣本估計的迴歸參數;除了遞迴初始化的微小影響外,這些估計值與 OLS 估計值等效。

[3]:
mod = sm.RecursiveLS(endog, exog)
res = mod.fit()

print(res.summary())
                           Statespace Model Results
==============================================================================
Dep. Variable:       WORLDCONSUMPTION   No. Observations:                   25
Model:                    RecursiveLS   Log Likelihood                -154.720
Date:                Thu, 03 Oct 2024   R-squared:                       0.965
Time:                        15:46:06   AIC                            319.441
Sample:                    01-01-1951   BIC                            325.535
                         - 01-01-1975   HQIC                           321.131
Covariance Type:            nonrobust   Scale                       117717.127
==================================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
----------------------------------------------------------------------------------
const          -6562.3719   2378.939     -2.759      0.006   -1.12e+04   -1899.737
COPPERPRICE      -13.8132     15.041     -0.918      0.358     -43.292      15.666
INCOMEINDEX      1.21e+04    763.401     15.853      0.000    1.06e+04    1.36e+04
ALUMPRICE         70.4146     32.678      2.155      0.031       6.367     134.462
INVENTORYINDEX   311.7330   2130.084      0.146      0.884   -3863.155    4486.621
===================================================================================
Ljung-Box (L1) (Q):                   2.17   Jarque-Bera (JB):                 1.70
Prob(Q):                              0.14   Prob(JB):                         0.43
Heteroskedasticity (H):               3.38   Skew:                            -0.67
Prob(H) (two-sided):                  0.13   Kurtosis:                         2.53
===================================================================================

Warnings:
[1] Parameters and covariance matrix estimates are RLS estimates conditional on the entire sample.

遞迴係數可在 recursive_coefficients 屬性中取得。或者,可以使用 plot_recursive_coefficient 方法生成繪圖。

[4]:
print(res.recursive_coefficients.filtered[0])
res.plot_recursive_coefficient(range(mod.k_exog), alpha=None, figsize=(10, 6))
[     2.88890087      4.94795049   1558.41803044   1958.43326658
 -51474.9578655   -4168.94974192  -2252.61351128   -446.55908507
  -5288.39794736  -6942.31935786  -7846.0890355   -6643.15121393
  -6274.11015558  -7272.01696292  -6319.02648554  -5822.23929148
  -6256.30902754  -6737.4044603   -6477.42841448  -5995.90746904
  -6450.80677813  -6022.92166487  -5258.35152753  -5320.89136363
  -6562.37193573]
[4]:
../../../_images/examples_notebooks_generated_recursive_ls_7_1.png
../../../_images/examples_notebooks_generated_recursive_ls_7_2.png

CUSUM 統計量可在 cusum 屬性中取得,但通常使用 plot_cusum 方法視覺檢查參數穩定性更方便。在下面的繪圖中,CUSUM 統計量沒有超出 5% 的顯著性範圍,因此我們無法拒絕參數在 5% 水平上穩定的零假設。

[5]:
print(res.cusum)
fig = res.plot_cusum()
[ 0.69971508  0.65841244  1.24629674  2.05476032  2.39888918  3.1786198
  2.67244672  2.01783215  2.46131747  2.05268638  0.95054336 -1.04505546
 -2.55465286 -2.29908152 -1.45289492 -1.95353993 -1.3504662   0.15789829
  0.6328653  -1.48184586]
../../../_images/examples_notebooks_generated_recursive_ls_9_1.png

另一個相關的統計量是 CUSUM 平方。它可在 cusum_squares 屬性中取得,但同樣地,使用 plot_cusum_squares 方法視覺檢查更方便。在下面的繪圖中,CUSUM 平方統計量沒有超出 5% 的顯著性範圍,因此我們無法拒絕參數在 5% 水平上穩定的零假設。

[6]:
res.plot_cusum_squares()
[6]:
../../../_images/examples_notebooks_generated_recursive_ls_11_0.png
../../../_images/examples_notebooks_generated_recursive_ls_11_1.png

範例 2:貨幣數量理論

貨幣數量理論認為「貨幣數量變化率的既定變化會導致...價格通貨膨脹率的相等變化」(Lucas, 1980)。 繼 Lucas 之後,我們檢視貨幣成長和 CPI 通貨膨脹雙邊指數加權移動平均值之間的關係。 儘管 Lucas 發現這些變數之間的關係是穩定的,但最近這種關係似乎變得不穩定;請參閱例如 Sargent 和 Surico (2010)。

[7]:
start = "1959-12-01"
end = "2015-01-01"
m2 = DataReader("M2SL", "fred", start=start, end=end)
cpi = DataReader("CPIAUCSL", "fred", start=start, end=end)
[8]:
def ewma(series, beta, n_window):
    nobs = len(series)
    scalar = (1 - beta) / (1 + beta)
    ma = []
    k = np.arange(n_window, 0, -1)
    weights = np.r_[beta ** k, 1, beta ** k[::-1]]
    for t in range(n_window, nobs - n_window):
        window = series.iloc[t - n_window : t + n_window + 1].values
        ma.append(scalar * np.sum(weights * window))
    return pd.Series(ma, name=series.name, index=series.iloc[n_window:-n_window].index)


m2_ewma = ewma(np.log(m2["M2SL"].resample("QS").mean()).diff().iloc[1:], 0.95, 10 * 4)
cpi_ewma = ewma(
    np.log(cpi["CPIAUCSL"].resample("QS").mean()).diff().iloc[1:], 0.95, 10 * 4
)

在使用 Lucas 的 \(\beta = 0.95\) 濾波器(兩側視窗為 10 年)構建移動平均值後,我們在下方繪製每個數列。 儘管它們在樣本的一部分之前似乎一起移動,但在 1990 年之後,它們似乎出現分歧。

[9]:
fig, ax = plt.subplots(figsize=(13, 3))

ax.plot(m2_ewma, label="M2 Growth (EWMA)")
ax.plot(cpi_ewma, label="CPI Inflation (EWMA)")
ax.legend()
[9]:
<matplotlib.legend.Legend at 0x7f6e8ca133a0>
../../../_images/examples_notebooks_generated_recursive_ls_16_1.png
[10]:
endog = cpi_ewma
exog = sm.add_constant(m2_ewma)
exog.columns = ["const", "M2"]

mod = sm.RecursiveLS(endog, exog)
res = mod.fit()

print(res.summary())
                           Statespace Model Results
==============================================================================
Dep. Variable:               CPIAUCSL   No. Observations:                  141
Model:                    RecursiveLS   Log Likelihood                 692.878
Date:                Thu, 03 Oct 2024   R-squared:                       0.813
Time:                        15:46:11   AIC                          -1381.755
Sample:                    01-01-1970   BIC                          -1375.858
                         - 01-01-2005   HQIC                         -1379.358
Covariance Type:            nonrobust   Scale                            0.000
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.0034      0.001     -6.013      0.000      -0.004      -0.002
M2             0.9128      0.037     24.601      0.000       0.840       0.986
===================================================================================
Ljung-Box (L1) (Q):                 138.23   Jarque-Bera (JB):                18.20
Prob(Q):                              0.00   Prob(JB):                         0.00
Heteroskedasticity (H):               5.30   Skew:                            -0.81
Prob(H) (two-sided):                  0.00   Kurtosis:                         2.27
===================================================================================

Warnings:
[1] Parameters and covariance matrix estimates are RLS estimates conditional on the entire sample.
[11]:
res.plot_recursive_coefficient(1, alpha=None)
[11]:
../../../_images/examples_notebooks_generated_recursive_ls_18_0.png
../../../_images/examples_notebooks_generated_recursive_ls_18_1.png

CUSUM 繪圖現在顯示 5% 水平的顯著偏差,表明拒絕參數穩定性的零假設。

[12]:
res.plot_cusum()
[12]:
../../../_images/examples_notebooks_generated_recursive_ls_20_0.png
../../../_images/examples_notebooks_generated_recursive_ls_20_1.png

同樣地,CUSUM 平方顯示 5% 水平的顯著偏差,也表明拒絕參數穩定性的零假設。

[13]:
res.plot_cusum_squares()
[13]:
../../../_images/examples_notebooks_generated_recursive_ls_22_0.png
../../../_images/examples_notebooks_generated_recursive_ls_22_1.png

範例 3:線性限制和公式

線性限制

使用構建模型中的 constraints 參數,不難實現線性限制。

[14]:
endog = dta["WORLDCONSUMPTION"]
exog = sm.add_constant(
    dta[["COPPERPRICE", "INCOMEINDEX", "ALUMPRICE", "INVENTORYINDEX"]]
)

mod = sm.RecursiveLS(endog, exog, constraints="COPPERPRICE = ALUMPRICE")
res = mod.fit()
print(res.summary())
                           Statespace Model Results
==============================================================================
Dep. Variable:       WORLDCONSUMPTION   No. Observations:                   25
Model:                    RecursiveLS   Log Likelihood                -134.231
Date:                Thu, 03 Oct 2024   R-squared:                       0.989
Time:                        15:46:14   AIC                            276.462
Sample:                    01-01-1951   BIC                            281.338
                         - 01-01-1975   HQIC                           277.814
Covariance Type:            nonrobust   Scale                       137155.014
==================================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
----------------------------------------------------------------------------------
const          -4839.4836   2412.410     -2.006      0.045   -9567.721    -111.246
COPPERPRICE        5.9797     12.704      0.471      0.638     -18.921      30.880
INCOMEINDEX     1.115e+04    666.308     16.738      0.000    9847.000    1.25e+04
ALUMPRICE          5.9797     12.704      0.471      0.638     -18.921      30.880
INVENTORYINDEX   241.3452   2298.951      0.105      0.916   -4264.515    4747.206
===================================================================================
Ljung-Box (L1) (Q):                   6.27   Jarque-Bera (JB):                 1.78
Prob(Q):                              0.01   Prob(JB):                         0.41
Heteroskedasticity (H):               1.75   Skew:                            -0.63
Prob(H) (two-sided):                  0.48   Kurtosis:                         2.32
===================================================================================

Warnings:
[1] Parameters and covariance matrix estimates are RLS estimates conditional on the entire sample.

公式

可以使用類別方法 from_formula 擬合相同的模型。

[15]:
mod = sm.RecursiveLS.from_formula(
    "WORLDCONSUMPTION ~ COPPERPRICE + INCOMEINDEX + ALUMPRICE + INVENTORYINDEX",
    dta,
    constraints="COPPERPRICE = ALUMPRICE",
)
res = mod.fit()
print(res.summary())
                           Statespace Model Results
==============================================================================
Dep. Variable:       WORLDCONSUMPTION   No. Observations:                   25
Model:                    RecursiveLS   Log Likelihood                -134.231
Date:                Thu, 03 Oct 2024   R-squared:                       0.989
Time:                        15:46:14   AIC                            276.462
Sample:                    01-01-1951   BIC                            281.338
                         - 01-01-1975   HQIC                           277.814
Covariance Type:            nonrobust   Scale                       137155.014
==================================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
----------------------------------------------------------------------------------
Intercept      -4839.4836   2412.410     -2.006      0.045   -9567.721    -111.246
COPPERPRICE        5.9797     12.704      0.471      0.638     -18.921      30.880
INCOMEINDEX     1.115e+04    666.308     16.738      0.000    9847.000    1.25e+04
ALUMPRICE          5.9797     12.704      0.471      0.638     -18.921      30.880
INVENTORYINDEX   241.3452   2298.951      0.105      0.916   -4264.515    4747.206
===================================================================================
Ljung-Box (L1) (Q):                   6.27   Jarque-Bera (JB):                 1.78
Prob(Q):                              0.01   Prob(JB):                         0.41
Heteroskedasticity (H):               1.75   Skew:                            -0.63
Prob(H) (two-sided):                  0.48   Kurtosis:                         2.32
===================================================================================

Warnings:
[1] Parameters and covariance matrix estimates are RLS estimates conditional on the entire sample.
/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/statsmodels/tsa/base/tsa_model.py:473: ValueWarning: No frequency information was provided, so inferred frequency YS-JAN will be used.
  self._init_dates(dates, freq)

上次更新:2024 年 10 月 03 日