[R] why does lm() not allow for negative weights?

2006-08-04 Thread Jens Hainmueller
Dear List,

Why do commonly used estimator functions (such as lm(), glm(), etc.) not
allow negative case weights? I suspect that there is a good reason for this.
Yet, I can see reasonable cases when one wants to use negative case weights.

Take lm() for example:

###

n <- 20
Y <- rnorm(n)
X <- cbind(rep(1,n),runif(n),rnorm(n))
Weights <- rnorm(n)
# Includes Pos and Neg Weights
Weights

# Now do Weighted LS and get beta coeffs:
b <- solve(t(X)%*%diag(Weights)%*%X) %*% t(X) %*% diag(Weights)%*%Y
b

# This seems like a valid model, but when I try
lm(Y ~ X[,2:3],weights=Weights)

# I get: "missing or negative weights not allowed"

###

What is the rationale for not allowing negative weights? I ask this, because
I am currently trying to implement a (two stage) estimator into R that
involves negative case weights. Weights are generated in the first stage, so
it would be nice if I could use canned functions such as
lm(,weights=Weights) in the second stage.

Thank you for your help.

Best,
Jens

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why does lm() not allow for negative weights?

2006-08-04 Thread Jens Hainmueller
Thanks Duncan Murdoch,

> > Why do commonly used estimator functions (such as lm(), 
> > glm(), etc.) 
> > not allow negative case weights?
 
> Residual sums of squares (or deviances) could be negative 
> with negative case weights.  This doesn't seem like a good 
> thing:  would you really want the fit to be far from those points?

Yes, this is actually what I want for this particular estimator. But I can
see now why this generally doesn't seem like a good idea.

Best,
Jens



> -----Original Message-----
> From: Duncan Murdoch [mailto:[EMAIL PROTECTED] 
> Sent: Friday, August 04, 2006 7:36 PM
> To: Jens Hainmueller
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] why does lm() not allow for negative weights?
> 
> On 8/4/2006 1:26 PM, Jens Hainmueller wrote:
> > Dear List,
> > 
> > Why do commonly used estimator functions (such as lm(), glm(), etc.)
> > not allow negative case weights? I suspect that there is a good
> > reason for this. Yet, I can see reasonable cases when one wants to
> > use negative case weights.
> > 
> > Take lm() for example:
> > 
> > ###
> > 
> > n <- 20
> > Y <- rnorm(n)
> > X <- cbind(rep(1,n),runif(n),rnorm(n))
> > Weights <- rnorm(n)
> > # Includes Pos and Neg Weights
> > Weights
> > 
> > # Now do Weighted LS and get beta coeffs:
> > b <- solve(t(X)%*%diag(Weights)%*%X) %*% t(X) %*% diag(Weights)%*%Y
> 
> That formula does not necessarily give least squares 
> estimates in the case where weights might be negative.  For 
> example, with a single observation y, a single parameter mu, 
> design matrix X = 1, and weight -1, that formula becomes
> 
> b <- y,
> 
> but that is the worst possible estimator in a least squares 
> sense.  The residual sum of squares can be made arbitrarily 
> large and negative by setting b to a large value.
> 
> Duncan Murdoch
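
The single-observation example in the reply can be checked numerically. The following is only a sketch under the assumptions stated there (one observation, one parameter, design matrix X = 1, weight -1); the function name `wrss` and the value chosen for `y` are illustrative and not from the thread.

```r
# Weighted residual sum of squares for one observation y, one parameter b,
# design matrix X = 1 and a single case weight w (all names illustrative).
wrss <- function(b, y, w) w * (y - b)^2

y <- 2
w <- -1

# The matrix formula solve(t(X) W X) t(X) W y reduces to b = y here.
X <- matrix(1)
W <- matrix(w)
b_formula <- drop(solve(t(X) %*% W %*% X) %*% t(X) %*% W %*% y)

# With w < 0 the objective is concave in b, so b = y is its maximum:
# moving b away from y only lowers the "residual sum of squares".
wrss(b_formula, y, w)   # 0
wrss(10, y, w)          # -64
wrss(1000, y, w)        # -996004
```

The formula still returns a stationary point of the weighted criterion, but with a negative weight that point is a maximum rather than a minimum, which is the behaviour described above.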


[R] Optim with two constraints

2005-10-12 Thread Jens Hainmueller
Hi R-list,

I am new to optimization in R and would appreciate help on the following
question. I would like to minimize the following function using two
constraints:

##
fn <- function(par,H,F){
  
 fval <- 0.5 * t(par) %*% H %*% par + F%*% par
 fval  
  
  }

# matrix H is (n by k)
# matrix F is (n by 1) 
# par is a (n by 1) set of weights 

# I need two constraints:
# 1. each element in par needs to be between 0 and 1
# 2. sum(par)=1 i.e. the elements in par need to sum to 1

## I try to use optim
res <- optim(c(runif(16)),fn, method="L-BFGS-B", H=H, F=f
,control=list(fnscale=-1), lower=0, upper=1)
##

If I understand this correctly, using L-BFGS-B with lower=0 and upper=1
should take care of constraint 1 (box constraints). What I am lacking is the
skill to include constraint no 2.

I guess I could solve this by reparametrization but I am not sure how
exactly. I could not find (i.e. wasn't able to infer) the answer to this in
the archives despite the many comments on optim and constrained optimization
(sorry if I missed it there). I am using version 2.1.1 under windows XP.

Thank you very much.

Jens
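
One way the reparametrization mentioned above could look is a softmax transform: optimise over an unconstrained vector and map it to weights that lie in (0, 1) and sum to 1. This is a sketch, not a definitive answer; `to_weights`, the simulated `H` and `F`, and the choice of BFGS are assumptions made for illustration. Note also that `fnscale=-1` in the call above asks `optim` to maximise, so it is left out here because the stated goal is minimisation.

```r
# Objective from the post: 0.5 * par' H par + F' par (returned as a scalar).
fn <- function(par, H, F) drop(0.5 * t(par) %*% H %*% par + crossprod(F, par))

# Map an unconstrained vector theta to weights in (0, 1) that sum to 1.
to_weights <- function(theta) {
  e <- exp(theta - max(theta))  # subtract the max for numerical stability
  e / sum(e)
}

# Hypothetical inputs: a symmetric positive definite H and a linear term F.
set.seed(1)
n <- 4
A <- matrix(rnorm(n * n), n, n)
H <- crossprod(A) + diag(n)
F <- rnorm(n)

# Minimise over theta; both constraints hold by construction.
res <- optim(rep(0, n), function(theta) fn(to_weights(theta), H, F),
             method = "BFGS")
w <- to_weights(res$par)
w
sum(w)
```

Because the softmax keeps every weight strictly inside (0, 1), a solution on the boundary (some weights exactly 0) is only approached in the limit. Since the objective is quadratic with linear constraints, a quadratic programming solver would be a natural alternative.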



[R] sensitivity tests for causal inference

2005-11-28 Thread Jens Hainmueller
Hi all,

Following up on Holger's email last week: 

Does anyone know if there exists a library that implements the sensitivity
tests for hidden bias for matched pairs and unmatched groups as proposed in
Rosenbaum's Observational Studies (2002: ch.4)?

Thanks.

Best,
jens



[R] control of font size & colour for title, subtitles, axis, and tick marks in LATTICE graph

2004-09-15 Thread Jens Hainmueller
Hi,

I very much appreciate any help on this "fine tuning" problem in a lattice
graph (I am new to LATTICE and could not find an example in the help files
that worked for me. My apologies if I missed it there).

I am running the following box plots to compare conditional distributions of
x at different levels of y under two treatment conditions ID=1 (upper
panel ) & ID=0 (lower panel of the plot).

bwplot(HF.ELECYEAR ~ stparvotech | ID ,
data=data, aspect=1,
layout=c(1,2),
xlab="Changes in Party Vote Shares",
xlim=(-20:20),
ylab="Periods Following Last Federal Election",
main="Divided Government",
panel = function(x,y)
{
panel.bwplot(x,y)
panel.abline(v=0, col="red")
}
)

How can I:
1. Control the font size of the main title, the panel titles, the axis, and
the tick marks? The usual cex.main=1/3, cex.xlab etc. do not work.
2. Change the color of the boxes where the panel titles (ID=1 & ID=0) are
located?

Thank you very much.

Best,
Jens
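
In lattice, text sizes are usually set per component rather than through base-graphics arguments such as cex.main. The sketch below shows one possible way to do this with `main`/`xlab`/`ylab` given as lists, `scales`, `par.strip.text`, and `par.settings`; the data frame `dat` is simulated as a stand-in for the poster's data, and the particular cex values and strip colour are arbitrary.

```r
library(lattice)

# Hypothetical stand-in for the poster's data.
set.seed(1)
dat <- data.frame(
  stparvotech = rnorm(200, sd = 5),
  HF.ELECYEAR = factor(sample(1:4, 200, replace = TRUE)),
  ID          = factor(sample(0:1, 200, replace = TRUE))
)

p <- bwplot(HF.ELECYEAR ~ stparvotech | ID, data = dat,
            layout = c(1, 2),
            xlim = c(-20, 20),
            # text size is set per component via cex inside a list
            main = list(label = "Divided Government", cex = 1.2),
            xlab = list(label = "Changes in Party Vote Shares", cex = 0.9),
            ylab = list(label = "Periods Following Last Federal Election",
                        cex = 0.9),
            scales = list(cex = 0.7),           # tick labels
            par.strip.text = list(cex = 0.8),   # panel (strip) titles
            # background colour of the strips holding the panel titles
            par.settings = list(strip.background = list(col = "lightblue")),
            panel = function(x, y, ...) {
              panel.bwplot(x, y, ...)
              panel.abline(v = 0, col = "red")
            })
print(p)
```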



[R] help with multiple imputation using imp.mix

2004-12-15 Thread Jens Hainmueller
I am desperately trying to impute missing data using 'imp.mix' but always
run into this yucky error message to which I cannot find the solution. It's
the first time I am using mix and I'm trying really hard to understand, but
there's just this one step I don't get...perhaps someone knows the answer?

Thanks!
Jens

My code runs:

data<-read.table('http://www.courses.fas.harvard.edu/~gov2001/Data/immigration.dat',header=TRUE)
library(mix)
rngseed(12345678)
# Prepare data for imputation
gender1<-c()
 gender1<-as.integer(data$gender)
 gender1[gender1==1]<-2
 gender1[gender1==0]<-1
 data$gender<-gender1
x<-cbind(data$gender,data$ipip,data$ideol,data$prtyid, data$wage1992)
colnames(x)<-c("gender","ipip", "ideol", "prtyid","wage")
# start imputation
s <- prelim.mix(x,4)
thetahat <- em.mix(s)

And here comes the error message:

> newtheta <- da.mix(s,thetahat, steps=100,showits=TRUE)
Steps of Data Augmentation:
1...Error in da.mix(s, thetahat, steps = 100, showits = TRUE) :
Improper posterior--empty cells
> imp.mix(s, newtheta, x)
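
The message mentions empty cells, which suggests that some combinations of the four categorical columns have no observations. A quick way to check this, sketched here in base R on simulated data (the data frame `x_cat` is hypothetical and only mimics the column names above), is to cross-classify the complete rows and count the cells with zero cases.

```r
# Hypothetical stand-in for the four categorical columns passed to
# prelim.mix(); NA marks a missing value.
set.seed(1)
x_cat <- data.frame(
  gender = sample(1:2, 60, replace = TRUE),
  ipip   = sample(c(1:3, NA), 60, replace = TRUE),
  ideol  = sample(1:5, 60, replace = TRUE),
  prtyid = sample(1:3, 60, replace = TRUE)
)

# Cross-classify the fully observed rows and count cells with no cases.
cells   <- table(x_cat[complete.cases(x_cat), ])
n_cells <- length(cells)
n_empty <- sum(cells == 0)
c(cells = n_cells, empty = n_empty)
```

If many cells turn out to be empty, options that may be worth considering include collapsing sparse categories or treating fewer variables as categorical; whether either is appropriate depends on the data.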



[R] local average

2005-04-20 Thread Jens Hainmueller
Hello,

probably this isn't hard, but I can't get R to do this. Thanks for your
help!

Assume I have a matrix of two covariates:

n<- 1000
Y<- runif(n)
X<- runif(n,min=0,max=100)
data <- cbind(Y,X)

Now, I would like to compute the local average of Y for each X interval 0-1,
1-2, 2-3, ... 99-100. In other words, I would like to obtain 100 (local)
Ybars, one for each X interval with width 1.

Also, I would like to do the same but instead of local means of Y obtain
local medians of Y for each X interval.

Best,
Jens
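
One possible approach in base R is to bin X with `cut()` and then apply a summary function per bin with `tapply()`. The sketch below reuses the simulated data from the post; the names `bins`, `local_means`, and `local_medians` are illustrative.

```r
set.seed(1)
n <- 1000
Y <- runif(n)
X <- runif(n, min = 0, max = 100)

# Assign each X to one of the 100 unit-width intervals
# [0,1], (1,2], ..., (99,100].
bins <- cut(X, breaks = 0:100, include.lowest = TRUE)

# One local mean and one local median of Y per interval
# (an interval with no observations would come back as NA).
local_means   <- tapply(Y, bins, mean)
local_medians <- tapply(Y, bins, median)

head(local_means)
```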



[R] drawing filled countries according to data using map('world')?

2004-03-07 Thread Jens Hainmueller
Hello,

I am looking for somebody who has experience with the map library (Becker
and Wilks 1993) and might be able to help me with the following problem:

Using the 'world' database I would like to draw filled countries in a world
map so that the fill color of each country corresponds to the value of a
policy variable X at time t (the goal is to visualize a policy diffusion
pattern over time using different maps for t=1985, 1990, etc.).

In their explanatory note, Becker and Wilks show how to accomplish this with
the 'states' database, for filling US states with color according to the
republican vote in 1900.

> state.names <- unix('tr "[A-Z]" "[a-z]"', state.name)
> map.states <- unix('sed "s/:.*//"', map(names=T, plot=F))
> state.to.map <- match(map.states, state.names)
> color <- votes.repub[state.to.map, votes.year == 1900] / 100
> map('state', fill=T, col=color)
> map('state', add=T)

"The first expression changes uppercase to lowercase in the standard S
dataset giving state names, so that these can be compared with the names
returned by map. Next the complete set of state polygon names is requested
(using map(names=T,plot=F); the default database is 'state') and the
trailing portions (from the ":" onwards) are removed so that we have a list
of the state for which each polygon is a part or the whole. Then we create
state.to.map that gives the translation from the ordering of the states
known to S (alphabetical) to the ordering known to the mapping mechanism. By
using this vector, as in the next expression, all the pieces of a state will
be colored the same color. The state.to.map vector is a useful one to keep
around, for it will work in any context where the ordering of the state data
is as here. Notice that unless such a vector is being reused, it will
usually be the case that there will be a step like this one, finding the
translation between the ordering for the regions in your data and the
ordering according to map. In general, the translation will have to be
computed each time the set of selected polygons changes."

My question then is, how to compute a similar procedure using the 'world'
database. Specifically, how can I access the country names in the 'world'
database to accomplish the translation to the country names in my dataset?
Is there any way to unpack the 'world' database to do the matching in an
external program? And does anybody know of other (more recent) world maps
that I could use?

Thanks very much!

Best,
Jens
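
The translation step from the quoted note can be done without the external `tr` and `sed` calls by using `sub()` and `match()`. The sketch below is self-contained and therefore uses a small hypothetical vector of polygon names in the "region:subregion" form the note describes; with the maps package loaded, the names would instead come from a call such as `map("world", namesonly = TRUE, plot = FALSE)`. The data frame `policy` and the colour palette are likewise made up for illustration.

```r
# Hypothetical polygon names, standing in for the output of
#   map("world", namesonly = TRUE, plot = FALSE)
map.names <- c("Canada", "USA:Alaska", "USA:Hawaii", "Germany",
               "Norway:Svalbard")

# Drop everything from the first ":" onwards to get one country per polygon.
map.countries <- sub(":.*", "", map.names)

# Hypothetical policy data: one value per country for a given year.
policy <- data.frame(country = c("Germany", "USA", "Canada"),
                     X = c(1, 3, 2))

# For each polygon, look up the row of its country in the data.
idx <- match(map.countries, policy$country)

# Translate the policy value into a fill colour; NA where there is no data.
palette_x <- c("grey80", "steelblue", "firebrick")
fill.col <- palette_x[policy$X[idx]]
fill.col
# The vector could then be passed on, e.g.
#   map("world", fill = TRUE, col = fill.col)
```

All polygons that belong to the same country receive the same colour because they share the same index into the data.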



[R] drawing filled countries according to data using map('world')? - follow up

2004-03-07 Thread Jens Hainmueller
Hello,

this is a follow up on my previous inquiry regarding the use of the map
library (Becker and Wilks 1993).

Using the 'world' database I would like to draw filled countries in a world
map so that the fill color of each country corresponds to the value of a
policy variable "fix.float" in a specific "year" (the goal is to visualize a
policy diffusion pattern over time using different maps for year=1985, 1990,
etc.).

In my dataset [Test] I have created a vector 'map.name' that contains
country names that I have made identical to the country names in file
world.N in .../library/maps/mapdata/.

> Test[1:10,]
   region fix.float wbcode  name year dv dv.lag map.name polygon
1     lac        NA    ABW Aruba 1973 NA     NA    Aruba    1936
2     lac        NA    ABW Aruba 1974 NA     NA    Aruba    1936
3     lac        NA    ABW Aruba 1975 NA     NA    Aruba    1936
4     lac        NA    ABW Aruba 1976 NA     NA    Aruba    1936
5     lac        NA    ABW Aruba 1977 NA     NA    Aruba    1936
6     lac        NA    ABW Aruba 1978 NA     NA    Aruba    1936
7     lac        NA    ABW Aruba 1979 NA     NA    Aruba    1936
8     lac        NA    ABW Aruba 1980 NA     NA    Aruba    1936
9     lac        NA    ABW Aruba 1981 NA     NA    Aruba    1936
10    lac        NA    ABW Aruba 1982 NA     NA    Aruba    1936

Now I would like to translate the country names in the 'world' database to
the country names in my dataset (following Becker and Wilks 1993). For some
reason, the translation does not work.

> map.country<-  map(database = "world", names=T,plot=F)
> state.to.map <- match(map.name,map.country)
> color <- dv[state.to.map, year == 1980]
Error in dv[state.to.map, year == 1980] : incorrect number of dimensions

> color <- dv[state.to.map, year == 1980]/100
Error in dv[state.to.map, year == 1980] : incorrect number of dimensions

What am I doing wrong? (there are a few values missing in "fix.float")

Thanks for your help!

Best
Jens Hainmueller
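
A likely cause of the "incorrect number of dimensions" error is that dv is a plain vector (one entry per country-year), whereas votes.repub in the Becker and Wilks example is a matrix indexed by [state, year]. Note also that the match() call above has its arguments in the opposite order from the quoted example, which uses match(map.states, state.names). The following sketch, on a small hypothetical data frame and a hypothetical vector of polygon names, first subsets the rows for one year and then matches polygons to countries.

```r
# Hypothetical long-format data in the spirit of Test: one row per
# country-year.
Test <- data.frame(
  map.name = rep(c("Aruba", "Belgium", "Chile"), each = 2),
  year     = rep(c(1980, 1985), times = 3),
  dv       = c(NA, 4, 2, 4, 1, 3)
)

# Hypothetical polygon names, standing in for map("world", ...) output.
map.country <- c("Belgium", "Chile", "Aruba", "Chile:Easter Island")

# dv is a plain vector, so dv[i, j] fails; subset the rows for one year first.
d80 <- Test[Test$year == 1980, ]

# match(x, table): for each polygon's country, its row position in d80.
state.to.map <- match(sub(":.*", "", map.country), d80$map.name)
color <- d80$dv[state.to.map]
color
```

Polygons whose country has a missing dv in that year, or no row at all, end up with NA and would need a default colour before plotting.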



[R] vector extraction

2004-03-08 Thread Jens Hainmueller
Hello,

I could use some help on this one:

From the data.frame "Test.dataset2" below (TSCS data for 151
"countries.to.map" for "year" 1973-95; each "country.to.map" is identified
by a unique code), I would like to extract a vector "color" that, for each
"country.to.map", takes on the value of "dv" (a categorical variable with
values 1,2,..4) in a specified "year". Thus, for a specified "year",
"color" should have 151 obs, one for each "country.to.map", each
represented by its respective value of "dv".

I tried this:

> color <-  dv[country.to.map][(year == 1980)[country.to.map]]

but it does not give me what I need. I can't figure out where my error is,
however.

Thanks,
Jens

> Test.dataset2[1:40,]
 country.to.map dv year
1  1936 NA 1973
2  1936 NA 1974
3  1936 NA 1975
4  1936 NA 1976
5  1936 NA 1977
6  1936 NA 1978
7  1936 NA 1979
8  1936 NA 1980
9  1936 NA 1981
10 1936 NA 1982
11 1936 NA 1983
12 1936 NA 1984
13 1936 NA 1985
14 1936 NA 1986
15 1936 NA 1987
16 1936 NA 1988
17 1936  4 1989
18 1936  4 1990
19 1936  4 1991
20 1936  4 1992
21 1936  4 1993
22 1936  4 1994
23 1936  4 1995
24   56  4 1973
25   56  4 1974
26   56  2 1975
27   56  2 1976
28   56  2 1977
29   56  2 1978
30   56  2 1979
31   56  2 1980
32   56  2 1981
33   56  2 1982
34   56  4 1983
35   56  4 1984
36   56  4 1985
37   56  4 1986
38   56  4 1987
39   56  4 1988
40   56  4 1989
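
One reading of why the attempt above does not work is that dv[country.to.map] uses the country codes (1936, 56, ...) as positions in dv instead of selecting rows. A simpler route, sketched below on a hypothetical miniature of Test.dataset2, is to keep only the rows for the chosen year and then label each dv value with its country code; the name `one_year` is illustrative.

```r
# Hypothetical miniature of Test.dataset2 (two countries, three years).
Test.dataset2 <- data.frame(
  country.to.map = rep(c(1936, 56), each = 3),
  dv             = c(NA, NA, 4, 2, 2, 4),
  year           = rep(c(1979, 1980, 1989), times = 2)
)

# Keep the rows for one year, then name each dv value by its country code.
one_year <- Test.dataset2[Test.dataset2$year == 1980, ]
color <- setNames(one_year$dv, one_year$country.to.map)
color
```

With the full data this gives one element per country for the chosen year, with NA wherever dv is missing.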
