[jira] [Updated] (SPARK-7090) Introduce LDAOptimizer to LDA to improve extensibility

2015-04-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-7090: -- Summary: Introduce LDAOptimizer to LDA to improve extensibility (was: Introduce LDAOptimizer to LDA

[jira] [Reopened] (SPARK-7090) Introduce LDAOptimizer to LDA to further improve extensibility

2015-04-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang reopened SPARK-7090: --- Reopen this since 7089 was already closed. Introduce LDAOptimizer to LDA to further improve

[jira] [Comment Edited] (SPARK-7090) Introduce LDAOptimizer to LDA to further improve extensibility

2015-04-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508907#comment-14508907 ] yuhao yang edited comment on SPARK-7090 at 4/23/15 12:00 PM: -

[jira] [Closed] (SPARK-6374) Add getter for GeneralizedLinearAlgorithm

2015-04-14 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang closed SPARK-6374. - fix merged. Thanks. Add getter for GeneralizedLinearAlgorithm -

[jira] [Closed] (SPARK-6693) add toString with max lines and width for matrix

2015-04-14 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang closed SPARK-6693. - Fix merged. Thanks. add toString with max lines and width for matrix

[jira] [Created] (SPARK-6693) add to string with max lines and width for matrix

2015-04-03 Thread yuhao yang (JIRA)
yuhao yang created SPARK-6693: - Summary: add to string with max lines and width for matrix Key: SPARK-6693 URL: https://issues.apache.org/jira/browse/SPARK-6693 Project: Spark Issue Type:

[jira] [Updated] (SPARK-6693) add toString with max lines and width for matrix

2015-04-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-6693: -- Summary: add toString with max lines and width for matrix (was: add to string with max lines and width

[jira] [Commented] (SPARK-5563) LDA with online variational inference

2015-03-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364350#comment-14364350 ] yuhao yang commented on SPARK-5563: --- Matthew Willson. Thanks for the attention and idea.

[jira] [Comment Edited] (SPARK-5563) LDA with online variational inference

2015-03-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364350#comment-14364350 ] yuhao yang edited comment on SPARK-5563 at 3/17/15 1:13 AM:

[jira] [Created] (SPARK-6374) Add getter for GeneralizedLinearAlgorithm

2015-03-16 Thread yuhao yang (JIRA)
yuhao yang created SPARK-6374: - Summary: Add getter for GeneralizedLinearAlgorithm Key: SPARK-6374 URL: https://issues.apache.org/jira/browse/SPARK-6374 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-6268) KMeans parameter getter methods

2015-03-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356125#comment-14356125 ] yuhao yang commented on SPARK-6268: --- Sure, I'll propose a PR very soon. Thanks! KMeans

[jira] [Comment Edited] (SPARK-6268) KMeans parameter getter methods

2015-03-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356106#comment-14356106 ] yuhao yang edited comment on SPARK-6268 at 3/11/15 2:14 AM: Hi

[jira] [Closed] (SPARK-6177) Add note in LDA example to remind possible coalesce

2015-03-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang closed SPARK-6177. - Fix and merged, thanks. Add note in LDA example to remind possible coalesce

[jira] [Updated] (SPARK-6177) LDA should check partitions size of the input

2015-03-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-6177: -- Description: Add comment to introduce coalesce to LDA example to avoid the possible massive partitions

[jira] [Updated] (SPARK-6177) Add note for

2015-03-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-6177: -- Summary: Add note for (was: LDA should check partitions size of the input) Add note for

[jira] [Updated] (SPARK-6177) Add note in LDA example to remind possible coalesce

2015-03-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-6177: -- Summary: Add note in LDA example to remind possible coalesce (was: Add note for ) Add note in LDA

[jira] [Created] (SPARK-6177) LDA should check partitions size of the input

2015-03-04 Thread yuhao yang (JIRA)
yuhao yang created SPARK-6177: - Summary: LDA should check partitions size of the input Key: SPARK-6177 URL: https://issues.apache.org/jira/browse/SPARK-6177 Project: Spark Issue Type:

[jira] [Updated] (SPARK-6177) LDA should check partitions size of the input

2015-03-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-6177: -- Description: sc.textFile will create RDD with one partition for each file, and the possible massive

[jira] [Closed] (SPARK-5717) add sc.stop to LDA examples

2015-02-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang closed SPARK-5717. - merged. Thanks add sc.stop to LDA examples --- Key: SPARK-5717

[jira] [Created] (SPARK-5717) add sc.stop to LDA examples

2015-02-10 Thread yuhao yang (JIRA)
yuhao yang created SPARK-5717: - Summary: add sc.stop to LDA examples Key: SPARK-5717 URL: https://issues.apache.org/jira/browse/SPARK-5717 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-5243) Spark will hang if (driver memory + executor memory) exceeds limit on a 1-worker cluster

2015-02-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-5243: -- Description: Spark will hang if calling spark-submit under the conditions: 1. the cluster has only one

[jira] [Updated] (SPARK-5243) Spark will hang if (driver memory + executor memory) exceeds limit on a 1-worker cluster

2015-02-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-5243: -- Description: Spark will hang if calling spark-submit under the conditions: 1. the cluster has only one

[jira] [Commented] (SPARK-5566) Tokenizer for mllib package

2015-02-05 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308733#comment-14308733 ] yuhao yang commented on SPARK-5566: --- I mean only the underlying implementation.

[jira] [Comment Edited] (SPARK-5563) LDA with online variational inference

2015-02-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305115#comment-14305115 ] yuhao yang edited comment on SPARK-5563 at 2/4/15 2:22 PM: ---

[jira] [Commented] (SPARK-5563) LDA with online variational inference

2015-02-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305115#comment-14305115 ] yuhao yang commented on SPARK-5563: --- Thanks Joseph for helping create the jira. Paste

[jira] [Comment Edited] (SPARK-5563) LDA with online variational inference

2015-02-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305115#comment-14305115 ] yuhao yang edited comment on SPARK-5563 at 2/4/15 2:23 PM: ---

[jira] [Commented] (SPARK-5563) LDA with online variational inference

2015-02-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305199#comment-14305199 ] yuhao yang commented on SPARK-5563: --- BTW, batch versions of online variational inference

[jira] [Commented] (SPARK-5566) Tokenizer for mllib package

2015-02-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305172#comment-14305172 ] yuhao yang commented on SPARK-5566: --- Actually I believe many current code like Word2Vec

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2015-02-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302952#comment-14302952 ] yuhao yang commented on SPARK-1405: --- Hi everyone, I'm sharing an implementation of

[jira] [Comment Edited] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2015-02-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302952#comment-14302952 ] yuhao yang edited comment on SPARK-1405 at 2/3/15 8:35 AM: --- Hi

[jira] [Closed] (SPARK-5406) LocalLAPACK mode in RowMatrix.computeSVD should have much smaller upper bound

2015-02-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang closed SPARK-5406. - fix and merged. Thanks LocalLAPACK mode in RowMatrix.computeSVD should have much smaller upper bound

[jira] [Commented] (SPARK-5510) How can I fix the spark-submit script and then running the program on cluster ?

2015-02-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300939#comment-14300939 ] yuhao yang commented on SPARK-5510: --- https://spark.apache.org/community.html check the

[jira] [Updated] (SPARK-5406) LocalLAPACK mode in RowMatrix.computeSVD should have much smaller upper bound

2015-01-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-5406: -- Description: In RowMatrix.computeSVD, under LocalLAPACK mode, the code would invoke brzSvd. Yet breeze

[jira] [Updated] (SPARK-5406) LocalLAPACK mode in RowMatrix.computeSVD should have much smaller upper bound

2015-01-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-5406: -- Description: In RowMatrix.computeSVD, under LocalLAPACK mode, the code would invoke brzSvd. Yet breeze

[jira] [Closed] (SPARK-5384) Vectors.sqdist return inconsistent result for sparse/dense vectors when the vectors have different lengths

2015-01-25 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang closed SPARK-5384. - fixed Vectors.sqdist return inconsistent result for sparse/dense vectors when the vectors have different

[jira] [Created] (SPARK-5406) LocalLAPACK mode in RowMatrix.computeSVD should have much smaller upper bound

2015-01-25 Thread yuhao yang (JIRA)
yuhao yang created SPARK-5406: - Summary: LocalLAPACK mode in RowMatrix.computeSVD should have much smaller upper bound Key: SPARK-5406 URL: https://issues.apache.org/jira/browse/SPARK-5406 Project: Spark

[jira] [Created] (SPARK-5384) Vectors.sqdist return inconsistent result for sparse/dense vectors when the vectors have different lengths

2015-01-23 Thread yuhao yang (JIRA)
yuhao yang created SPARK-5384: - Summary: Vectors.sqdist return inconsistent result for sparse/dense vectors when the vectors have different lengths Key: SPARK-5384 URL:

[jira] [Closed] (SPARK-5282) RowMatrix easily gets int overflow in the memory size warning

2015-01-19 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang closed SPARK-5282. - fixed RowMatrix easily gets int overflow in the memory size warning

[jira] [Commented] (SPARK-5186) Vector.equals and Vector.hashCode are very inefficient and fail on SparseVectors with large size

2015-01-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280025#comment-14280025 ] yuhao yang commented on SPARK-5186: --- I just updated the PR with a hashCode fix. Please

[jira] [Closed] (SPARK-5234) examples for ml don't have sparkContext.stop

2015-01-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang closed SPARK-5234. - fixed examples for ml don't have sparkContext.stop

[jira] [Created] (SPARK-5282) RowMatrix easily gets int overflow in the memory size warning

2015-01-16 Thread yuhao yang (JIRA)
yuhao yang created SPARK-5282: - Summary: RowMatrix easily gets int overflow in the memory size warning Key: SPARK-5282 URL: https://issues.apache.org/jira/browse/SPARK-5282 Project: Spark Issue

[jira] [Commented] (SPARK-5282) RowMatrix easily gets int overflow in the memory size warning

2015-01-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280159#comment-14280159 ] yuhao yang commented on SPARK-5282: --- typical wrong message: Row matrix: 17000 columns

[jira] [Created] (SPARK-5234) examples for ml don't have sparkContext.stop

2015-01-13 Thread yuhao yang (JIRA)
yuhao yang created SPARK-5234: - Summary: examples for ml don't have sparkContext.stop Key: SPARK-5234 URL: https://issues.apache.org/jira/browse/SPARK-5234 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-5243) Spark will hang if (driver memory + executor memory) exceeds limit on a 1-worker cluster

2015-01-13 Thread yuhao yang (JIRA)
yuhao yang created SPARK-5243: - Summary: Spark will hang if (driver memory + executor memory) exceeds limit on a 1-worker cluster Key: SPARK-5243 URL: https://issues.apache.org/jira/browse/SPARK-5243

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2015-01-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270869#comment-14270869 ] yuhao yang commented on SPARK-1405: --- Great design doc and solid proposal. I noticed
