Hi,

I have a model that I am fitting to a set of data with 10-level ordered 
responses. In my data, the majority of the observations fall in levels 6-10, 
leaving only about 1-5% of the total observations in levels 1-5. As a result, 
my model tends to perform badly on points with a level below 6.

I would like to ask if there is any way to circumvent this problem. I was 
thinking of the following ideas (rough R sketches of each are included after 
the list), but I am open to any other suggestions you might have.

1. Bootstrapping with a small sample size each time. In each resample, I would 
intentionally sample so that there is a good mix of observations from each 
level, and then repeat this many times. However, I don't know how to obtain 
the true standard errors of the estimated parameters once all the bootstrapping 
is done. Would it simply be the average of the standard errors estimated in 
each fit?

2. Weighting points in levels 1-5 more heavily. But it is unclear to me how to 
put such weights back into the maximum likelihood estimation. It is unlike OLS, 
where the objective is to minimize an error or, if you like, a penalty 
function; the likelihood is obviously not a penalty function.

3. Do a two-step regression. I would segment the data into two regions: points 
with a response below 6, and the rest with a response of 6 and above. The first 
step is a binary regression to determine which of the two groups a point 
belongs to. In the second step, I would estimate an ordered probit model for 
each group separately. The question then is: why am I choosing 6 as the cut 
point instead of some other value?
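
To make idea 1 concrete, here is a rough sketch of the stratified resampling I 
have in mind, using polr() from MASS as the ordered probit. The data frame 
dat, the response y (an ordered factor with 10 levels), the predictors x1 and 
x2, and the resample sizes are all placeholders for my actual data:

library(MASS)

set.seed(1)
B <- 200                    # number of resamples
n_per_level <- 30           # rows drawn from each response level (with replacement)
coef_mat <- NULL

for (b in seq_len(B)) {
  # draw the same number of rows from every level so each resample is balanced
  idx <- unlist(lapply(split(seq_len(nrow(dat)), dat$y),
                       function(i) i[sample.int(length(i), n_per_level,
                                                replace = TRUE)]))
  fit <- polr(y ~ x1 + x2, data = dat[idx, ], method = "probit")
  coef_mat <- rbind(coef_mat, c(coef(fit), fit$zeta))
}

# spread of the estimates across resamples -- this is the quantity I am
# unsure how to turn into a proper standard error
apply(coef_mat, 2, sd)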
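
For idea 2, if I understand correctly, polr() accepts case weights that scale 
each observation's contribution to the log-likelihood, so the inverse-frequency 
weighting below is one guess at what I mean; I am not sure this is the 
statistically right way to do it (dat, y, x1, x2 are placeholders again):

library(MASS)

# share of observations in each response level
lev_freq <- table(dat$y) / nrow(dat)

# inverse-frequency case weights, rescaled to average 1
w <- as.numeric(1 / lev_freq[as.character(dat$y)])
w <- w / mean(w)

fit_w <- polr(y ~ x1 + x2, data = dat, weights = w, method = "probit")
summary(fit_w)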
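
And for idea 3, here is a rough sketch of the two-step split, assuming the 10 
levels are coded 1 through 10 and using 6 as the provisional cut point that I 
am unsure about:

library(MASS)

cut_pt <- 6                               # provisional cut point -- the open question
dat$low <- as.integer(dat$y) < cut_pt     # TRUE if the response is below the cut

# step 1: binary probit for "below the cut" vs. "at or above the cut"
step1 <- glm(low ~ x1 + x2, data = dat, family = binomial(link = "probit"))

# step 2: a separate ordered probit within each group
# (droplevels() removes the response levels that do not occur in the subset)
fit_low  <- polr(droplevels(y) ~ x1 + x2, data = subset(dat, low),
                 method = "probit")
fit_high <- polr(droplevels(y) ~ x1 + x2, data = subset(dat, !low),
                 method = "probit")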

Any suggestions would be really appreciated. Thank you.

- adschai

