[jira] [Commented] (SPARK-5400) Rename GaussianMixtureEM to GaussianMixture

2015-01-30 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298762#comment-14298762
 ] 

Apache Spark commented on SPARK-5400:
-

User 'tgaloppo' has created a pull request for this issue:
https://github.com/apache/spark/pull/4290

> Rename GaussianMixtureEM to GaussianMixture
> ---
>
> Key: SPARK-5400
> URL: https://issues.apache.org/jira/browse/SPARK-5400
> Project: Spark
>  Issue Type: Improvement
>  Components: MLlib
>Affects Versions: 1.3.0
>Reporter: Joseph K. Bradley
>Assignee: Travis Galoppo
>Priority: Minor
>
> GaussianMixtureEM is following the old naming convention of including the 
> optimization algorithm name in the class title.  We should probably rename it 
> to GaussianMixture so that it can use other optimization algorithms in the 
> future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5400) Rename GaussianMixtureEM to GaussianMixture

2015-01-29 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297648#comment-14297648
 ] 

Joseph K. Bradley commented on SPARK-5400:
--

Thanks!  Could you also please change the name of the test suite to match?




[jira] [Commented] (SPARK-5400) Rename GaussianMixtureEM to GaussianMixture

2015-01-29 Thread Travis Galoppo (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297613#comment-14297613
 ] 

Travis Galoppo commented on SPARK-5400:
---

Please assign this to me and I will make the name change.








[jira] [Commented] (SPARK-5400) Rename GaussianMixtureEM to GaussianMixture

2015-01-26 Thread Xiangrui Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292695#comment-14292695
 ] 

Xiangrui Meng commented on SPARK-5400:
--

I like `GaussianMixture` better. I don't think it is worth making a generic EM 
algorithm. It is too general and we won't benefit much from code reuse.




[jira] [Commented] (SPARK-5400) Rename GaussianMixtureEM to GaussianMixture

2015-01-26 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292176#comment-14292176
 ] 

Joseph K. Bradley commented on SPARK-5400:
--

I agree this could be done either way: Algorithm[Model] or Model[Algorithm].  
For users, exposing the model type may be easiest; a person who is new to ML 
and wants to do some clustering will know the name of a clustering model 
(KMeans, GMM) but may not want to worry about picking an optimization 
algorithm.  So I'd vote for Model[Algorithm].

That said, internally, I agree that Algorithm[Model] would be handy for 
generalizing.  We could do the combination by having an internal LearningState 
class:

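{code}
// User-facing class, named after the model rather than the algorithm
class GaussianMixture {
  def setOptimizer // once we have more than 1 optimization method

  def run = {
    // The optimizer only sees the internal abstraction, not GaussianMixture itself
    val opt = new EM(new GMMLearningState(this))
    ...
  }
}

// Internal learning state that adapts GaussianMixture to the optimizer's view
private[mllib] class GMMLearningState extends OurModelAbstraction {
  def this(gm: GaussianMixture) = this(...)
}

class EM(model: OurModelAbstraction)
{code}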
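
For illustration only, the layering above could be fleshed out roughly as follows. This is a minimal, self-contained sketch on 1-D data without Spark, not the MLlib implementation. The names `LearningState`, `Optimizer`, `FittedGMM`, the quantile-based initialization, and the choice to pass the state to `optimize` rather than to the `EM` constructor are all assumptions made for the example.

```scala
// Hypothetical sketch: user-facing Model[Algorithm] API over an internal abstraction.

// What an optimizer needs from a model: one improvement step and a score to monitor.
trait LearningState {
  def step(): Unit
  def logLikelihood: Double
}

// An optimizer that only sees the abstraction, never the concrete model.
trait Optimizer {
  def optimize(state: LearningState): Unit
}

// EM as an optimizer: repeat E+M steps until the log-likelihood stops changing.
class EM(maxIterations: Int = 100, tol: Double = 1e-6) extends Optimizer {
  def optimize(state: LearningState): Unit = {
    var prev = Double.NegativeInfinity
    var iter = 0
    var converged = false
    while (iter < maxIterations && !converged) {
      state.step()
      val ll = state.logLikelihood
      converged = math.abs(ll - prev) < tol
      prev = ll
      iter += 1
    }
  }
}

// Internal, mutable state for a 1-D Gaussian mixture with k components.
class GMMLearningState(data: Seq[Double], k: Int) extends LearningState {
  private val sorted = data.sorted
  val weights: Array[Double] = Array.fill(k)(1.0 / k)
  // Crude deterministic initialization (an assumption): means at evenly spaced quantiles.
  val means: Array[Double] =
    Array.tabulate(k)(j => sorted(((2 * j + 1) * sorted.length) / (2 * k)))
  val variances: Array[Double] = Array.fill(k)(1.0)

  private def pdf(x: Double, mu: Double, v: Double): Double =
    math.exp(-(x - mu) * (x - mu) / (2 * v)) / math.sqrt(2 * math.Pi * v)

  def logLikelihood: Double =
    data.map { x =>
      math.log((0 until k).map(j => weights(j) * pdf(x, means(j), variances(j))).sum)
    }.sum

  def step(): Unit = {
    // E-step: responsibility of each component for each point.
    val resp = data.map { x =>
      val p = Array.tabulate(k)(j => weights(j) * pdf(x, means(j), variances(j)))
      val s = p.sum
      p.map(_ / s)
    }
    // M-step: re-estimate weight, mean and variance of each component.
    for (j <- 0 until k) {
      val nj = resp.map(_(j)).sum
      weights(j) = nj / data.length
      means(j) = data.zip(resp).map { case (x, r) => r(j) * x }.sum / nj
      val v = data.zip(resp).map { case (x, r) => r(j) * (x - means(j)) * (x - means(j)) }.sum / nj
      variances(j) = math.max(v, 1e-6) // floor keeps the density finite
    }
  }
}

// Immutable result handed back to the user.
final case class FittedGMM(weights: Vector[Double], means: Vector[Double], variances: Vector[Double])

// User-facing class: named after the model; the optimizer is a pluggable detail.
class GaussianMixture(k: Int) {
  private var optimizer: Optimizer = new EM()

  def setOptimizer(opt: Optimizer): this.type = { optimizer = opt; this }

  def run(data: Seq[Double]): FittedGMM = {
    val state = new GMMLearningState(data, k)
    optimizer.optimize(state)
    FittedGMM(state.weights.toVector, state.means.toVector, state.variances.toVector)
  }
}
```

With this shape, a new user would only write something like `new GaussianMixture(2).run(data)`, while `setOptimizer` stays available for anyone who cares about the algorithm.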





[jira] [Commented] (SPARK-5400) Rename GaussianMixtureEM to GaussianMixture

2015-01-24 Thread Travis Galoppo (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290971#comment-14290971
 ] 

Travis Galoppo commented on SPARK-5400:
---

Hmm.  This has me thinking in a different direction.  We could generalize the 
expectation-maximization algorithm to work with any mixture model that supports 
the necessary likelihood compute/update methods... then we could ask for, 
e.g., "new ExpectationMaximization[GaussianMixtureModel]".  This would 
decouple the model from the algorithm, and could open the door for the 
implementation to be applied to (for instance) tomographic image reconstruction 
(which seems like a great fit for Spark given the volume of data involved).
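
As a rough illustration of that idea, a generic EM driver could be typed over a small mixture-model interface. The sketch below is self-contained, uses 1-D data and no Spark, and is only one possible reading of the proposal. The trait `MixtureModel`, its methods `numComponents`, `componentLikelihood` and `update`, and the constructor parameters are assumptions made for the example, not an existing MLlib API.

```scala
// Hypothetical interface: the likelihood compute/update methods generic EM would need.
trait MixtureModel[M] {
  def numComponents: Int
  // Weighted likelihood of x under component j (mixing weight times density).
  def componentLikelihood(j: Int, x: Double): Double
  // Return a re-estimated model given the data and per-point responsibilities.
  def update(data: Seq[Double], resp: Seq[Vector[Double]]): M
}

// Generic EM driver: knows nothing about Gaussians, only about the interface.
class ExpectationMaximization[M <: MixtureModel[M]](maxIterations: Int = 100, tol: Double = 1e-6) {

  def logLikelihood(model: M, data: Seq[Double]): Double =
    data.map { x =>
      math.log((0 until model.numComponents).map(j => model.componentLikelihood(j, x)).sum)
    }.sum

  def run(initial: M, data: Seq[Double]): M = {
    var model = initial
    var prev = logLikelihood(model, data)
    var iter = 0
    var converged = false
    while (iter < maxIterations && !converged) {
      // E-step: normalize component likelihoods into responsibilities.
      val resp = data.map { x =>
        val p = Vector.tabulate(model.numComponents)(j => model.componentLikelihood(j, x))
        val s = p.sum
        p.map(_ / s)
      }
      // M-step: delegated to the model.
      model = model.update(data, resp)
      val ll = logLikelihood(model, data)
      converged = math.abs(ll - prev) < tol
      prev = ll
      iter += 1
    }
    model
  }
}

// One concrete model: a 1-D Gaussian mixture (immutable).
final case class GaussianMixtureModel(
    weights: Vector[Double],
    means: Vector[Double],
    variances: Vector[Double]) extends MixtureModel[GaussianMixtureModel] {

  def numComponents: Int = weights.length

  def componentLikelihood(j: Int, x: Double): Double = {
    val d = x - means(j)
    weights(j) * math.exp(-d * d / (2 * variances(j))) / math.sqrt(2 * math.Pi * variances(j))
  }

  def update(data: Seq[Double], resp: Seq[Vector[Double]]): GaussianMixtureModel = {
    val k = numComponents
    // Effective number of points assigned to each component.
    val n = Vector.tabulate(k)(j => resp.map(_(j)).sum)
    val newMeans = Vector.tabulate(k) { j =>
      data.zip(resp).map { case (x, r) => r(j) * x }.sum / n(j)
    }
    val newVars = Vector.tabulate(k) { j =>
      val v = data.zip(resp).map { case (x, r) =>
        r(j) * (x - newMeans(j)) * (x - newMeans(j))
      }.sum / n(j)
      math.max(v, 1e-6) // floor keeps the density finite
    }
    GaussianMixtureModel(n.map(_ / data.length), newMeans, newVars)
  }
}
```

Usage would then look like `new ExpectationMaximization[GaussianMixtureModel]().run(initialModel, data)`. Any other mixture model implementing the same three methods could reuse the driver unchanged, which is the decoupling being proposed.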





[jira] [Commented] (SPARK-5400) Rename GaussianMixtureEM to GaussianMixture

2015-01-24 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290924#comment-14290924
 ] 

Joseph K. Bradley commented on SPARK-5400:
--

[~mengxr] [~tgaloppo]  What do you think?
