Further to Simon's points,

Here is what is confusing to me and I highlight the section of the claims below:

The key assumption concerns "symmetrical error constraints". These "symmetrical 
error constraints" force a solution where the probabilities of positive and 
negative error are symmetrical across all cross product sums that are the basis 
of maximum likelihood logistic regression. As the number of independent 
variables increases, it becomes more and more likely that this symmetrical 
assumption is accurate. Because this error component can be reliably estimated 
and subtracted out with a large enough number of variables, the resulting model 
parameters are strikingly error-free and do not overfit the data. 

For me, maybe this is a bit old school here, but isn't the point of model 
development generating the most parsimonious model with the greatest 
explanatory power from the fewest variables.  I myself could just imagine going 
to a client and standing in a 'bored' (grin) room for a presentation, and 
saying hay client, here are the 200 variables that are driving choice 
behaviour.  I use latent class and bayes based approaches because they recover 
heterogeneity in utility allocation across the sample, that to me is a big 
battle in choice based analytics.  

I believe that after a certain point, a heap of predictors become meaningless.  
I can see some of my colleagues adopting this because it is in SAS and makes up 
for poor design.  

Anyway, from a technical point of view, I would have to read a little about the 
error they are referring to.  Good on them for developing a new technology, 
like any algorithm, it will have its strengths and weaknesses and depending on 
factors such as usability etc, will gain some level of acceptance.     

Paul


> Simon Blomberg <[EMAIL PROTECTED]> wrote:
> 
> >From what I've read (which isn't much), the idea is to estimate a
> utility (preference) function for discrete categories, using logistic
> regression, under the assumption that the residuals of the linear
> predictor of the utilities are ~ Type I Gumbel. This implies the
> "independence of irrelevant alternatives" in economic jargon. ie the
> utility of choice a versus choice b is independent of the introduction
> of a third choice c. It also implies homoscedasticity of the errors. The
> model can be generalized in various ways. If you are willing to
> introduce extra parameters into the model, such as the parameters of the
> Gumbel distribution, you may get more precision in the estimates of the
> utility function. An alternative (without the independence of irrelevant
> alternatives assumption) is to model the errors as multivariate normal
> (ie use probit regression), which is computationally much more
> difficult.
> 
> Whether it makes substantive sense to use these models outside of
> "discrete choice" experiments is another question.
> 
>  Patenting these methods is worrying. There have been a lot of people
> working on discrete choice experiments over the years. It's hard to
> believe that a single company could have ownership over an idea that is
> the result of a collaborative effort such as this.
> 
> Cheers,
> 
> Simon.
> 
>  On Thu, 2007-04-26 at 12:29 +1000, Tim Churches wrote:
> > This news item in a data mining newsletter makes various claims for a 
> technique called "Reduced Error Logistic Regression": 
> http://www.kdnuggets.com/news/2007/n08/12i.html
> > 
> > In brief, are these (ambitious) claims justified and if so, has this 
> technique been implemented in R (or does anyone have any plans to do 
> so)? 
> > 
> > Tim C
> > 
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> -- 
> Simon Blomberg, BSc (Hons), PhD, MAppStat. 
> Lecturer and Consultant Statistician 
> Faculty of Biological and Chemical Sciences 
> The University of Queensland 
> St. Lucia Queensland 4072 
> Australia
> 
> Room 320, Goddard Building (8)
> T: +61 7 3365 2506 
> email: S.Blomberg1_at_uq.edu.au 
> 
> The combination of some data and an aching desire for 
> an answer does not ensure that a reasonable answer can 
> be extracted from a given body of data. - John Tukey.
> 
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to