Dear all,

I have a question that hopefully is an R question and does not simply arise 
from my lack of understanding of the LASSO.

The code below generates y from X under one of two sets of coefficients: one in 
which both variables matter (coefficients .5 each, line 14) and one in which 
one of the two will be shrunk to zero (coefficients .9 and .01, line 13). 
Lines 15-18 normalize by demeaning y and creating an orthonormal X matrix. I 
assign a "budget" of 0.5 to the coefficients (line 21). Line 31 translates this 
budget into the implied lambda in the Lagrangian form of the LASSO (one 
potential source of error, but I hope I read equation (6) in Tibshirani, JRSSB 
1996, correctly here).

In the scenario in which both variables matter, everything works fine: the sum 
of the coefficients is 0.5 as intended, and the expressions from Tibshirani, eq 
6 (lines 24,25), and those from glmnet agree. If, however, line 13 is switched 
on, so that the second coefficient gets shrunk to 0, the sum of the 
coefficients no longer equals the budget of 0.5 for either expression (which 
still agree). Any thoughts on why I only seem to do it right in the case in 
which there is no shrinkage to 0?

Best,
Christoph

rm(list=ls())
library(glmnet)
library(mvtnorm)
set.seed(38)

N = 50
K = 2
SigCorr=.0
Sigma = matrix(c(1,SigCorr,SigCorr,1),ncol=2)
X = rmvnorm(N, mean=rnorm(K), sigma=Sigma, method="chol")

u = rnorm(N)
y = .9*X[,1]+.01*X[,2]+u
#y = .5*X[,1]+.5*X[,2]+u
y = y-mean(y)
X = scale(X)
vX = var(X)
X = sqrt(N/(N-1))*X%*%solve(chol(vX)) # orthogonal columns, crossprod(X)/N = identity

reg = lm(y~X-1)
budget = .5

# where LASSO estimates should be according to Tibshirani (JRSSB 1996), eq (6)
beta_1_LASSO = max(c(0,budget/2+(reg$coefficients[1]-reg$coefficients[2])/2))
beta_2_LASSO = max(c(0,budget/2-(reg$coefficients[1]-reg$coefficients[2])/2))
beta_1_LASSO
beta_2_LASSO
beta_1_LASSO+beta_2_LASSO

# using glmnet. note that it minimizes RSS/(2N)+penalty, unlike in JRSSB, with implications for the relationship between betahat and lambda
lambda = (reg$coefficients[1]+reg$coefficients[2]-budget)/2
lasso.mod=glmnet(X,y,alpha=1,lambda=lambda)
coef(lasso.mod)
sum(coef(lasso.mod))
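
# A sketch of a cross-check, under the assumption that crossprod(X)/N is the
# identity, so that the glmnet solution is the soft-thresholded OLS estimate.
# soft() and lambda_for_budget() are hypothetical helper names, and b_demo
# holds illustrative values, not the estimates from the simulation above
# (coef(reg) could be passed instead).
soft = function(b, lam) sign(b)*pmax(abs(b)-lam, 0)
lambda_for_budget = function(b, budget) {
  if (sum(abs(b)) <= budget) return(0) # constraint not binding, no shrinkage
  f = function(lam) sum(abs(soft(b, lam)))-budget
  uniroot(f, c(0, max(abs(b))), tol=1e-12)$root
}
b_demo = c(.9, .01)
lam_two = (sum(b_demo)-.5)/2 # closed form, assumes both coefficients stay nonzero
lam_num = lambda_for_budget(b_demo, .5) # lambda solved numerically for the budget
sum(abs(soft(b_demo, lam_two))) # exceeds the budget of .5 for these values
sum(abs(soft(b_demo, lam_num))) # equals the budget up to numerical tolerance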


--
Prof. Dr. Christoph Hanck
Lehrstuhl für Ökonometrie
Universität Duisburg-Essen

+49 201 183 2263
christoph.ha...@vwl.uni-due.de
http://www.oek.wiwi.uni-due.de/

