[jira] [Updated] (SPARK-5782) Python Worker / Pyspark Daemon Memory Issue

2015-03-16 Thread Mark Khaitman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Khaitman updated SPARK-5782: - Priority: Blocker (was: Critical) Python Worker / Pyspark Daemon Memory Issue

[jira] [Created] (SPARK-6366) In Python API, the default save mode for save and saveAsTable should be error instead of append.

2015-03-16 Thread Yin Huai (JIRA)
Yin Huai created SPARK-6366: --- Summary: In Python API, the default save mode for save and saveAsTable should be error instead of append. Key: SPARK-6366 URL: https://issues.apache.org/jira/browse/SPARK-6366

[jira] [Commented] (SPARK-4808) Spark fails to spill with small number of large objects

2015-03-16 Thread Mingyu Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364016#comment-14364016 ] Mingyu Kim commented on SPARK-4808: --- [~andrewor14], should this now be closed with the

[jira] [Updated] (SPARK-6228) Move SASL support into network/common module

2015-03-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6228: --- Summary: Move SASL support into network/common module (was: Provide SASL support in network/common

[jira] [Updated] (SPARK-6319) DISTINCT doesn't work for binary type

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6319: Target Version/s: 1.4.0 (was: 1.3.1) DISTINCT doesn't work for binary type

[jira] [Commented] (SPARK-5310) Update SQL programming guide for 1.3

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364206#comment-14364206 ] Michael Armbrust commented on SPARK-5310: - I think I'd rather just publish more

[jira] [Created] (SPARK-6367) Use the proper data type for those expressions that are hijacking existing data types.

2015-03-16 Thread Yin Huai (JIRA)
Yin Huai created SPARK-6367: --- Summary: Use the proper data type for those expressions that are hijacking existing data types. Key: SPARK-6367 URL: https://issues.apache.org/jira/browse/SPARK-6367 Project:

[jira] [Commented] (SPARK-6340) mllib.IDF for LabelPoints

2015-03-16 Thread Kian Ho (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364246#comment-14364246 ] Kian Ho commented on SPARK-6340: Hi Joseph, I initially considered that as a solution,

[jira] [Commented] (SPARK-6304) Checkpointing doesn't retain driver port

2015-03-16 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364254#comment-14364254 ] Saisai Shao commented on SPARK-6304: Hi [~msoutier], the reason to remove these two

[jira] [Updated] (SPARK-6146) Support more datatype in SqlParser

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6146: Target Version/s: 1.4.0 (was: 1.3.0) Support more datatype in SqlParser

[jira] [Updated] (SPARK-5463) Fix Parquet filter push-down

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5463: Priority: Blocker (was: Critical) Fix Parquet filter push-down

[jira] [Updated] (SPARK-5821) JSONRelation should check if delete is successful for the overwrite operation.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5821: Target Version/s: 1.3.1 (was: 1.3.0) JSONRelation should check if delete is successful

[jira] [Updated] (SPARK-5463) Fix Parquet filter push-down

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5463: Target Version/s: 1.4.0 (was: 1.3.0, 1.2.2) Fix Parquet filter push-down

[jira] [Resolved] (SPARK-5183) Document data source API

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5183. - Resolution: Fixed Fix Version/s: 1.3.0 Document data source API

[jira] [Reopened] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reopened SPARK-6250: - Assignee: Yin Huai (was: Michael Armbrust) Types are now reserved words in DDL

[jira] [Commented] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364427#comment-14364427 ] Michael Armbrust commented on SPARK-6250: - Okay, thanks for explaining the

[jira] [Created] (SPARK-6372) spark-submit --conf is not being propagated to child processes

2015-03-16 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-6372: - Summary: spark-submit --conf is not being propagated to child processes Key: SPARK-6372 URL: https://issues.apache.org/jira/browse/SPARK-6372 Project: Spark

[jira] [Created] (SPARK-6373) Add SSL/TLS for the Netty based BlockTransferService

2015-03-16 Thread Jeffrey Turpin (JIRA)
Jeffrey Turpin created SPARK-6373: - Summary: Add SSL/TLS for the Netty based BlockTransferService Key: SPARK-6373 URL: https://issues.apache.org/jira/browse/SPARK-6373 Project: Spark Issue

[jira] [Commented] (SPARK-5782) Python Worker / Pyspark Daemon Memory Issue

2015-03-16 Thread Mark Khaitman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363894#comment-14363894 ] Mark Khaitman commented on SPARK-5782: -- I've upped this JIRA ticket to blocker since

[jira] [Resolved] (SPARK-6327) Run PySpark with python directly is broken

2015-03-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-6327. --- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5019

[jira] [Resolved] (SPARK-5310) Update SQL programming guide for 1.3

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5310. - Resolution: Fixed Fix Version/s: 1.3.0 Update SQL programming guide for 1.3

[jira] [Closed] (SPARK-6340) mllib.IDF for LabelPoints

2015-03-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-6340. Resolution: Not a Problem mllib.IDF for LabelPoints -

[jira] [Updated] (SPARK-6247) Certain self joins cannot be analyzed

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6247: Priority: Critical (was: Major) Certain self joins cannot be analyzed

[jira] [Updated] (SPARK-6231) Join on two tables (generated from same one) is broken

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6231: Target Version/s: 1.3.1 Join on two tables (generated from same one) is broken

[jira] [Commented] (SPARK-6146) Support more datatype in SqlParser

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364272#comment-14364272 ] Michael Armbrust commented on SPARK-6146: - Now that we have our own DDL parser

[jira] [Updated] (SPARK-5881) RDD remains cached after the table gets overridden by CACHE TABLE

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5881: Target Version/s: 1.4.0 (was: 1.3.0) RDD remains cached after the table gets overridden

[jira] [Updated] (SPARK-5881) RDD remains cached after the table gets overridden by CACHE TABLE

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5881: Priority: Critical (was: Major) RDD remains cached after the table gets overridden by

[jira] [Commented] (SPARK-5523) TaskMetrics and TaskInfo have innumerable copies of the hostname string

2015-03-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364279#comment-14364279 ] Tathagata Das commented on SPARK-5523: -- As long as the hostname object is short-lived

[jira] [Commented] (SPARK-6348) Enable useFeatureScaling in SVMWithSGD

2015-03-16 Thread tanyinyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364310#comment-14364310 ] tanyinyan commented on SPARK-6348: -- Yes,I use a one-hot encoding before SVM , which is

[jira] [Commented] (SPARK-6320) Adding new query plan strategy to SQLContext

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364345#comment-14364345 ] Michael Armbrust commented on SPARK-6320: - Hmm, interesting. So far I had only

[jira] [Commented] (SPARK-6349) Add probability estimates in SVMModel predict result

2015-03-16 Thread tanyinyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364361#comment-14364361 ] tanyinyan commented on SPARK-6349: -- Yes, this doesn't solve the problem of picking which

[jira] [Commented] (SPARK-6371) Update version to 1.4.0-SNAPSHOT

2015-03-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364410#comment-14364410 ] Apache Spark commented on SPARK-6371: - User 'vanzin' has created a pull request for

[jira] [Commented] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Nitay Joffe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364429#comment-14364429 ] Nitay Joffe commented on SPARK-6250: Thanks [~marmbrus] and [~yhuai]. Types are now

[jira] [Commented] (SPARK-6207) YARN secure cluster mode doesn't obtain a hive-metastore token

2015-03-16 Thread Doug Balog (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364441#comment-14364441 ] Doug Balog commented on SPARK-6207: --- Need to catch

[jira] [Created] (SPARK-6376) Relation are thrown away too early in dataframes

2015-03-16 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-6376: --- Summary: Relation are thrown away too early in dataframes Key: SPARK-6376 URL: https://issues.apache.org/jira/browse/SPARK-6376 Project: Spark Issue

[jira] [Updated] (SPARK-6247) Certain self joins cannot be analyzed

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6247: Target Version/s: 1.3.1 (was: 1.3.0) Certain self joins cannot be analyzed

[jira] [Commented] (SPARK-6340) mllib.IDF for LabelPoints

2015-03-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364266#comment-14364266 ] Joseph K. Bradley commented on SPARK-6340: -- You should be able to reliably zip

[jira] [Updated] (SPARK-6247) Certain self joins cannot be analyzed

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6247: Priority: Blocker (was: Critical) Certain self joins cannot be analyzed

[jira] [Updated] (SPARK-6368) Build a specialized serializer for Exchange operator.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6368: Priority: Critical (was: Major) Build a specialized serializer for Exchange operator.

[jira] [Updated] (SPARK-6367) Use the proper data type for those expressions that are hijacking existing data types.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6367: Assignee: Yin Huai Use the proper data type for those expressions that are hijacking

[jira] [Commented] (SPARK-6200) Support dialect in SQL

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364297#comment-14364297 ] Michael Armbrust commented on SPARK-6200: - I'll add this seems to be mostly

[jira] [Commented] (SPARK-5563) LDA with online variational inference

2015-03-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364350#comment-14364350 ] yuhao yang commented on SPARK-5563: --- Matthew Willson. Thanks for the attention and idea.

Re: [jira] [Created] (SPARK-6370) RDD sampling with replacement intermittently yields incorrect number of samples

2015-03-16 Thread Sean Owen
What's the bug? Each element is sampled with probability 0.5. I think the expected size is 14 but not all samples would be that size. On Mar 17, 2015 12:12 AM, Marko Bonaci (JIRA) j...@apache.org wrote: Marko Bonaci created SPARK-6370: --- Summary:

[jira] [Resolved] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-6250. - Resolution: Won't Fix Types are now reserved words in DDL parser.

[jira] [Commented] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364353#comment-14364353 ] Michael Armbrust commented on SPARK-6250: - We have confirmed that this does work

[jira] [Commented] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Nitay Joffe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364417#comment-14364417 ] Nitay Joffe commented on SPARK-6250: The error is always the same:

[jira] [Commented] (SPARK-6200) Support dialect in SQL

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364264#comment-14364264 ] Michael Armbrust commented on SPARK-6200: - Thank you for working on this. I would

[jira] [Updated] (SPARK-6109) Unit tests fail when compiled against Hive 0.12.0

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6109: Target Version/s: 1.4.0 (was: 1.3.0) Unit tests fail when compiled against Hive 0.12.0

[jira] [Updated] (SPARK-6199) Support CTE

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6199: Target Version/s: 1.4.0 (was: 1.3.0) Support CTE --- Key:

[jira] [Updated] (SPARK-5911) Make Column.cast(to: String) support fixed precision and scale decimal type

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5911: Target Version/s: 1.4.0 (was: 1.3.0) Make Column.cast(to: String) support fixed precision

[jira] [Updated] (SPARK-6366) In Python API, the default save mode for save and saveAsTable should be error instead of append.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6366: Priority: Blocker (was: Major) In Python API, the default save mode for save and

[jira] [Assigned] (SPARK-6247) Certain self joins cannot be analyzed

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reassigned SPARK-6247: --- Assignee: Michael Armbrust Certain self joins cannot be analyzed

[jira] [Updated] (SPARK-6231) Join on two tables (generated from same one) is broken

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6231: Priority: Blocker (was: Critical) Join on two tables (generated from same one) is broken

[jira] [Updated] (SPARK-6366) In Python API, the default save mode for save and saveAsTable should be error instead of append.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6366: Assignee: Yin Huai In Python API, the default save mode for save and saveAsTable should be

[jira] [Commented] (SPARK-6348) Enable useFeatureScaling in SVMWithSGD

2015-03-16 Thread tanyinyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364309#comment-14364309 ] tanyinyan commented on SPARK-6348: -- Yes,I use a one-hot encoding before SVM , which is

[jira] [Commented] (SPARK-6348) Enable useFeatureScaling in SVMWithSGD

2015-03-16 Thread tanyinyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364311#comment-14364311 ] tanyinyan commented on SPARK-6348: -- Yes,I use a one-hot encoding before SVM , which is

[jira] [Commented] (SPARK-6348) Enable useFeatureScaling in SVMWithSGD

2015-03-16 Thread tanyinyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364313#comment-14364313 ] tanyinyan commented on SPARK-6348: -- Yes,I use a one-hot encoding before SVM , which is

[jira] [Commented] (SPARK-6348) Enable useFeatureScaling in SVMWithSGD

2015-03-16 Thread tanyinyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364312#comment-14364312 ] tanyinyan commented on SPARK-6348: -- Yes,I use a one-hot encoding before SVM , which is

[jira] [Issue Comment Deleted] (SPARK-6348) Enable useFeatureScaling in SVMWithSGD

2015-03-16 Thread tanyinyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tanyinyan updated SPARK-6348: - Comment: was deleted (was: Yes,I use a one-hot encoding before SVM , which is the 'sparsed before SVM '

[jira] [Issue Comment Deleted] (SPARK-6348) Enable useFeatureScaling in SVMWithSGD

2015-03-16 Thread tanyinyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tanyinyan updated SPARK-6348: - Comment: was deleted (was: Yes,I use a one-hot encoding before SVM , which is the 'sparsed before SVM '

[jira] [Issue Comment Deleted] (SPARK-6348) Enable useFeatureScaling in SVMWithSGD

2015-03-16 Thread tanyinyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tanyinyan updated SPARK-6348: - Comment: was deleted (was: Yes,I use a one-hot encoding before SVM , which is the 'sparsed before SVM '

[jira] [Issue Comment Deleted] (SPARK-6348) Enable useFeatureScaling in SVMWithSGD

2015-03-16 Thread tanyinyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tanyinyan updated SPARK-6348: - Comment: was deleted (was: Yes,I use a one-hot encoding before SVM , which is the 'sparsed before SVM '

[jira] [Comment Edited] (SPARK-5563) LDA with online variational inference

2015-03-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364350#comment-14364350 ] yuhao yang edited comment on SPARK-5563 at 3/17/15 1:13 AM:

[jira] [Commented] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Nitay Joffe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364414#comment-14364414 ] Nitay Joffe commented on SPARK-6250: Backticks doesn't work for me on existing data.

[jira] [Created] (SPARK-6375) Bad formatting in analysis errors

2015-03-16 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-6375: --- Summary: Bad formatting in analysis errors Key: SPARK-6375 URL: https://issues.apache.org/jira/browse/SPARK-6375 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-6377) Set the number of shuffle partitions automatically based on the size of input tables and the reduce-side operation.

2015-03-16 Thread Yin Huai (JIRA)
Yin Huai created SPARK-6377: --- Summary: Set the number of shuffle partitions automatically based on the size of input tables and the reduce-side operation. Key: SPARK-6377 URL:

[jira] [Created] (SPARK-6369) InsertIntoHiveTable should use logic from SparkHadoopWriter

2015-03-16 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-6369: --- Summary: InsertIntoHiveTable should use logic from SparkHadoopWriter Key: SPARK-6369 URL: https://issues.apache.org/jira/browse/SPARK-6369 Project: Spark

[jira] [Updated] (SPARK-5451) And predicates are not properly pushed down

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5451: Target Version/s: 1.4.0 (was: 1.3.0, 1.2.2) And predicates are not properly pushed down

[jira] [Updated] (SPARK-6200) Support dialect in SQL

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6200: Target Version/s: 1.4.0 (was: 1.3.0) Support dialect in SQL --

[jira] [Assigned] (SPARK-5183) Document data source API

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reassigned SPARK-5183: --- Assignee: Michael Armbrust Document data source API

[jira] [Commented] (SPARK-6370) RDD sampling with replacement intermittently yields incorrect number of samples

2015-03-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364355#comment-14364355 ] Sean Owen commented on SPARK-6370: -- What's the bug? Each element is sampled with

[jira] [Commented] (SPARK-6370) RDD sampling with replacement intermittently yields incorrect number of samples

2015-03-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364354#comment-14364354 ] Sean Owen commented on SPARK-6370: -- What's the bug? Each element is sampled with

[jira] [Commented] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Nitay Joffe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364367#comment-14364367 ] Nitay Joffe commented on SPARK-6250: Is it hard to fix this? Seems to me it would be a

[jira] [Commented] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364373#comment-14364373 ] Michael Armbrust commented on SPARK-6250: - I'm not suggesting you change your

[jira] [Commented] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Nitay Joffe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364375#comment-14364375 ] Nitay Joffe commented on SPARK-6250: How would I do a select * from an existing hive

[jira] [Commented] (SPARK-6348) Enable useFeatureScaling in SVMWithSGD

2015-03-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364408#comment-14364408 ] Apache Spark commented on SPARK-6348: - User 'tanyinyan' has created a pull request for

[jira] [Created] (SPARK-6371) Update version to 1.4.0-SNAPSHOT

2015-03-16 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-6371: - Summary: Update version to 1.4.0-SNAPSHOT Key: SPARK-6371 URL: https://issues.apache.org/jira/browse/SPARK-6371 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-6372) spark-submit --conf is not being propagated to child processes

2015-03-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1436#comment-1436 ] Apache Spark commented on SPARK-6372: - User 'vanzin' has created a pull request for

[jira] [Commented] (SPARK-6374) Add getter for GeneralizedLinearAlgorithm

2015-03-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364471#comment-14364471 ] Apache Spark commented on SPARK-6374: - User 'hhbyyh' has created a pull request for

[jira] [Created] (SPARK-6374) Add getter for GeneralizedLinearAlgorithm

2015-03-16 Thread yuhao yang (JIRA)
yuhao yang created SPARK-6374: - Summary: Add getter for GeneralizedLinearAlgorithm Key: SPARK-6374 URL: https://issues.apache.org/jira/browse/SPARK-6374 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-6330) newParquetRelation gets incorrect FileSystem

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6330: Target Version/s: 1.3.1 newParquetRelation gets incorrect FileSystem

[jira] [Comment Edited] (SPARK-6192) Enhance MLlib's Python API (GSoC 2015)

2015-03-16 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363767#comment-14363767 ] Manoj Kumar edited comment on SPARK-6192 at 3/17/15 3:16 AM: -

[jira] [Commented] (SPARK-5068) When the path not found in the hdfs,we can't get the result

2015-03-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364490#comment-14364490 ] Apache Spark commented on SPARK-5068: - User 'lazyman500' has created a pull request

[jira] [Comment Edited] (SPARK-6340) mllib.IDF for LabelPoints

2015-03-16 Thread Kian Ho (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364246#comment-14364246 ] Kian Ho edited comment on SPARK-6340 at 3/17/15 12:01 AM: -- Hi

[jira] [Commented] (SPARK-6340) mllib.IDF for LabelPoints

2015-03-16 Thread Kian Ho (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364321#comment-14364321 ] Kian Ho commented on SPARK-6340: I appreciate the swift response! happy to keep this issue

[jira] [Updated] (SPARK-6368) Build a specialized serializer for Exchange operator.

2015-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6368: Assignee: Yin Huai Build a specialized serializer for Exchange operator.

[jira] [Commented] (SPARK-6293) SQLContext.implicits should provide automatic conversion for RDD[Row]

2015-03-16 Thread Chen Song (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364340#comment-14364340 ] Chen Song commented on SPARK-6293: -- OK, I have created a pull request

[jira] [Commented] (SPARK-6250) Types are now reserved words in DDL parser.

2015-03-16 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364382#comment-14364382 ] Yin Huai commented on SPARK-6250: - [~nitay] Have you tried backticks? Does it work?

[jira] [Commented] (SPARK-6355) Spark standalone cluster does not support local:/ url for jar file

2015-03-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363232#comment-14363232 ] Sean Owen commented on SPARK-6355: -- Oh, I learned something then. Yeah that looks like

[jira] [Comment Edited] (SPARK-6355) Spark standalone cluster does not support local:/ url for jar file

2015-03-16 Thread Jesper Lundgren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363227#comment-14363227 ] Jesper Lundgren edited comment on SPARK-6355 at 3/16/15 2:14 PM:

[jira] [Comment Edited] (SPARK-6355) Spark standalone cluster does not support local:/ url for jar file

2015-03-16 Thread Jesper Lundgren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363242#comment-14363242 ] Jesper Lundgren edited comment on SPARK-6355 at 3/16/15 2:23 PM:

[jira] [Commented] (SPARK-6355) Spark standalone cluster does not support local:/ url for jar file

2015-03-16 Thread Jesper Lundgren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363227#comment-14363227 ] Jesper Lundgren commented on SPARK-6355: [~srowen] spark-submit --class class.Main

[jira] [Commented] (SPARK-3278) Isotonic regression

2015-03-16 Thread Martin Zapletal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362986#comment-14362986 ] Martin Zapletal commented on SPARK-3278: Vladimir, just to update you on the

[jira] [Commented] (SPARK-6299) ClassNotFoundException in standalone mode when running groupByKey with class defined in REPL.

2015-03-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363239#comment-14363239 ] Apache Spark commented on SPARK-6299: - User 'swkimme' has created a pull request for

[jira] [Comment Edited] (SPARK-3278) Isotonic regression

2015-03-16 Thread Martin Zapletal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362986#comment-14362986 ] Martin Zapletal edited comment on SPARK-3278 at 3/16/15 9:37 AM:

[jira] [Resolved] (SPARK-6300) sc.addFile(path) does not support the relative path.

2015-03-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6300. -- Resolution: Fixed Fix Version/s: 1.3.1 1.4.0 Issue resolved by pull request

[jira] [Commented] (SPARK-6316) add a parameter for SparkContext(conf).textFile() method , support for multi-language hdfs file , e.g. gbk

2015-03-16 Thread yunzhi.lyz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363161#comment-14363161 ] yunzhi.lyz commented on SPARK-6316: --- i have a try read file non UTF-8 encodings

[jira] [Created] (SPARK-6356) Support the ROLLUP/CUBE/GROUPING SETS/grouping() in SQLContext

2015-03-16 Thread Yadong Qi (JIRA)
Yadong Qi created SPARK-6356: Summary: Support the ROLLUP/CUBE/GROUPING SETS/grouping() in SQLContext Key: SPARK-6356 URL: https://issues.apache.org/jira/browse/SPARK-6356 Project: Spark Issue

[jira] [Updated] (SPARK-6356) Support the ROLLUP/CUBE/GROUPING SETS/grouping() in SQLContext

2015-03-16 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-6356: - Description: Support for the expression below: ``` GROUP BY expression list WITH ROLLUP GROUP BY

[jira] [Commented] (SPARK-6305) Add support for log4j 2.x to Spark

2015-03-16 Thread Lior Chaga (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362996#comment-14362996 ] Lior Chaga commented on SPARK-6305: --- Works by adding log4j 2.x jars with log4j1.2-api
