I vote for simplicity. Current practice in the social sciences is to fit multiple models, each with a different number of components, and use fit statistics to choose the best model.
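To make the selection step concrete, here is a minimal sketch that uses BIC as the fit statistic. It is not existing Commons Math code: the class `BicModelSelection` and its methods are hypothetical names, and the parameter count assumes a univariate normal mixture (k means, k variances, k - 1 free weights). The log-likelihoods are assumed to come from separate fits with 1, 2, 3, ... components.

```java
/** Hypothetical helper for illustration; not part of Commons Math. */
final class BicModelSelection {

    /**
     * Free parameters of a k-component univariate normal mixture
     * (assumption: k means, k variances, k - 1 free weights).
     */
    static int freeParameters(int k) {
        return 3 * k - 1;
    }

    /** BIC = -2 ln L + p ln n; smaller values indicate a better trade-off. */
    static double bic(double logLikelihood, int numParameters, int sampleSize) {
        return -2.0 * logLikelihood + numParameters * Math.log(sampleSize);
    }

    /**
     * Picks the number of components with the smallest BIC.
     *
     * @param logLikelihoods maximized log-likelihoods; entry i belongs to
     *                       the model fitted with i + 1 components
     * @param sampleSize     number of observations used in every fit
     * @return the selected number of components
     */
    static int selectComponents(double[] logLikelihoods, int sampleSize) {
        if (logLikelihoods.length == 0) {
            throw new IllegalArgumentException("need at least one fitted model");
        }
        int bestK = 1;
        double bestBic = Double.POSITIVE_INFINITY;
        for (int i = 0; i < logLikelihoods.length; i++) {
            int k = i + 1;
            double value = bic(logLikelihoods[i], freeParameters(k), sampleSize);
            if (value < bestBic) {
                bestBic = value;
                bestK = k;
            }
        }
        return bestK;
    }

    public static void main(String[] args) {
        // Made-up log-likelihoods for k = 1..4, for illustration only.
        double[] logLik = {-1250.0, -1100.0, -1095.0, -1093.0};
        System.out.println("Selected components: " + selectComponents(logLik, 500));
    }
}
```

Other fit statistics (AIC, sample-size adjusted BIC, likelihood ratio tests) could be swapped in by replacing the `bic` method; the loop over candidate models stays the same.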
There are some additional features I would like to see added, and I have code to contribute if they are not already there. To be consistent with Mplus, we need to have the algorithm use multiple random starts and run a few of the best starts to completion. Mplus uses this strategy to effectively overcome local minima.

-----Original Message-----
From: Becksfort, Jared [mailto:jared.becksf...@stjude.org]
Sent: Wednesday, October 17, 2012 11:37 PM
To: Commons Developers List
Subject: RE: [Math] MATH-816 (mixture model distribution)

I see. I am planning to submit the EM fit for multivariate normal mixture models in the next couple of weeks (Math-817). A Gibbs sampling DP fit may be a bit further out. I am not opposed to allowing the number of components to change, but I also like the simplicity of this class. Whatever you guys decide is probably fine.

Jared
________________________________________
From: Ted Dunning [ted.dunn...@gmail.com]
Sent: Wednesday, October 17, 2012 9:41 PM
To: Commons Developers List
Subject: Re: [Math] MATH-816 (mixture model distribution)

The issue is that with a fixed number of components, you need to do multiple runs to find a best fit number of components. Gibbs sampling against a Dirichlet process can get you to the same answer in about the same cost as a single run of EM with a fixed number of models.

On Wed, Oct 17, 2012 at 7:31 PM, Becksfort, Jared <jared.becksf...@stjude.org> wrote:

> Ted,
>
> I am not sure I understand the problem with the fixed number of
> components. My understanding is that CM prefers immutable objects.
> Adding a component to an object would require reweighting in addition
> to modifying the component list. A new mixture model could be
> instantiated using the getComponents function and then adding or
> removing more components if necessary.
>
> Jared
> ________________________________________
> From: Ted Dunning [ted.dunn...@gmail.com]
> Sent: Wednesday, October 17, 2012 5:21 PM
> To: Commons Developers List
> Subject: Re: [Math] MATH-816 (mixture model distribution)
>
> Seems fine.
>
> I think that the limitation to a fixed number of mixture components is
> a bit limiting. So is the limitation to a uniform set of components.
> Both limitations can be eased without huge difficulty.
>
> Avoiding the fixed number of components can be done by using some
> variant of Dirichlet processes. Simply picking k_max relatively large
> and then using an approximate DP over that finite set works well.
>
> That said, mixture models are pretty nice to have.
>
> On Wed, Oct 17, 2012 at 2:13 PM, Gilles Sadowski <gil...@harfang.homelinux.org> wrote:
>
> > Hello.
> >
> > Any objection to commit the code as proposed on the report page?
> > https://issues.apache.org/jira/browse/MATH-816
> >
> >
> > Regards,
> > Gilles
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
>
> Email Disclaimer: www.stjude.org/emaildisclaimer
> Consultation Disclaimer: www.stjude.org/consultationdisclaimer