[jira] [Updated] (SPARK-7028) Add filterNot to RDD

2015-04-20 Thread Marius Soutier (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marius Soutier updated SPARK-7028: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > Add filterNot to RDD >

[jira] [Created] (SPARK-7028) Add filterNot to RDD

2015-04-20 Thread Marius Soutier (JIRA)
Marius Soutier created SPARK-7028: - Summary: Add filterNot to RDD Key: SPARK-7028 URL: https://issues.apache.org/jira/browse/SPARK-7028 Project: Spark Issue Type: Bug Reporter: Ma

[jira] [Commented] (SPARK-7001) Partitions for a long single line file

2015-04-20 Thread Victor Bashurov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504452#comment-14504452 ] Victor Bashurov commented on SPARK-7001: Need to try InputFormat then or just chan

[jira] [Created] (SPARK-7027) Spark 1.2.2 Hadoop 2.4 download doesn't work

2015-04-20 Thread Marius Soutier (JIRA)
Marius Soutier created SPARK-7027: - Summary: Spark 1.2.2 Hadoop 2.4 download doesn't work Key: SPARK-7027 URL: https://issues.apache.org/jira/browse/SPARK-7027 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-7026) LeftSemiJoin can not work when it has equal condition and not equal condition.

2015-04-20 Thread Zhongshuai Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhongshuai Pei updated SPARK-7026: -- Description: Run sql like that {panel} select * from web_sales ws1 left semi join web_sales ws2

[jira] [Resolved] (SPARK-5081) Shuffle write increases

2015-04-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5081. Resolution: Duplicate I'm pretty sure this is fixed via SPARK-6905. Closing this and we can

[jira] [Comment Edited] (SPARK-5081) Shuffle write increases

2015-04-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504442#comment-14504442 ] Patrick Wendell edited comment on SPARK-5081 at 4/21/15 6:36 AM: ---

[jira] [Updated] (SPARK-7026) LeftSemiJoin can not work when it has equal condition and not equal condition.

2015-04-20 Thread Zhongshuai Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhongshuai Pei updated SPARK-7026: -- Summary: LeftSemiJoin can not work when it has equal condition and not equal condition. (was:

[jira] [Updated] (SPARK-7026) LeftSemiJoin can not work when it has not equal condition and equal condition.

2015-04-20 Thread Zhongshuai Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhongshuai Pei updated SPARK-7026: -- Summary: LeftSemiJoin can not work when it has not equal condition and equal condition. (was:

[jira] [Updated] (SPARK-2044) Pluggable interface for shuffles

2015-04-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2044: --- Fix Version/s: (was: 1.1.2) 1.1.0 > Pluggable interface for shuffles >

[jira] [Resolved] (SPARK-2044) Pluggable interface for shuffles

2015-04-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2044. Resolution: Fixed Fix Version/s: 1.1.2 > Pluggable interface for shuffles > -

[jira] [Updated] (SPARK-7008) An Implement of Factorization Machine (LibFM)

2015-04-20 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-7008: --- Attachment: FM_convergence_rate.xlsx QQ20150421-1.png QQ20150421-2.png

[jira] [Updated] (SPARK-7026) LeftSemiJoin can not work when it has not equal condition.

2015-04-20 Thread Zhongshuai Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhongshuai Pei updated SPARK-7026: -- Description: Run sql like that {panel} select * from web_sales ws1 left semi join web_sales ws2

[jira] [Resolved] (SPARK-6719) Update spark.apache.org/mllib page to 1.3

2015-04-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6719. -- Resolution: Done > Update spark.apache.org/mllib page to 1.3 > -

[jira] [Updated] (SPARK-7026) LeftSemiJoin can not work when it has not equal condition.

2015-04-20 Thread Zhongshuai Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhongshuai Pei updated SPARK-7026: -- Description: Run sql like that {panel} select * from web_sales ws1 left semi join web_sales ws2

[jira] [Updated] (SPARK-7026) LeftSemiJoin can not work when it has not equal condition.

2015-04-20 Thread Zhongshuai Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhongshuai Pei updated SPARK-7026: -- Description: Run sql like that {quote} select * from web_sales ws1 left semi join web_sales ws2

[jira] [Resolved] (SPARK-6490) Deprecate configurations for "askWithReply" and use new configuration names

2015-04-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-6490. Resolution: Fixed Fix Version/s: 1.4.0 Assignee: Shixiong Zhu > Deprecate configurat

[jira] [Updated] (SPARK-6490) Deprecate configurations for "askWithReply" and use new configuration names

2015-04-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6490: --- Issue Type: Sub-task (was: Improvement) Parent: SPARK-5293 > Deprecate configurations for "as

[jira] [Created] (SPARK-7026) LeftSemiJoin can not work when it has not equal condition.

2015-04-20 Thread Zhongshuai Pei (JIRA)
Zhongshuai Pei created SPARK-7026: - Summary: LeftSemiJoin can not work when it has not equal condition. Key: SPARK-7026 URL: https://issues.apache.org/jira/browse/SPARK-7026 Project: Spark

[jira] [Resolved] (SPARK-6867) Dropout regularization

2015-04-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6867. -- Resolution: Later Rakesh, thanks for sharing the papers! Per discussion on the PR page, we need

[jira] [Commented] (SPARK-6923) Get invalid hive table columns after save DataFrame to hive table

2015-04-20 Thread pin_zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504409#comment-14504409 ] pin_zhang commented on SPARK-6923: -- Hi, Michael We run spark app in Spark1.3, and use

[jira] [Assigned] (SPARK-1442) Add Window function support

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-1442: --- Assignee: (was: Apache Spark) > Add Window function support > ---

[jira] [Assigned] (SPARK-1442) Add Window function support

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-1442: --- Assignee: Apache Spark > Add Window function support > --- > >

[jira] [Commented] (SPARK-1442) Add Window function support

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504407#comment-14504407 ] Apache Spark commented on SPARK-1442: - User 'guowei2' has created a pull request for t

[jira] [Commented] (SPARK-6932) A Prototype of Parameter Server

2015-04-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504403#comment-14504403 ] Xiangrui Meng commented on SPARK-6932: -- [~chouqin] Could you list the changes to core

[jira] [Updated] (SPARK-7025) Create a Java-friendly input source API

2015-04-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7025: --- Description: The goal of this ticket is to create a simple input source API that we can maintain and

[jira] [Commented] (SPARK-7008) An Implement of Factorization Machine (LibFM)

2015-04-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504369#comment-14504369 ] Xiangrui Meng commented on SPARK-7008: -- [~podongfeng] Your implementation assumes that

[jira] [Assigned] (SPARK-7025) Create a Java-friendly input source API

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7025: --- Assignee: Reynold Xin (was: Apache Spark) > Create a Java-friendly input source API > --

[jira] [Commented] (SPARK-7025) Create a Java-friendly input source API

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504368#comment-14504368 ] Apache Spark commented on SPARK-7025: - User 'rxin' has created a pull request for this

[jira] [Assigned] (SPARK-7025) Create a Java-friendly input source API

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7025: --- Assignee: Apache Spark (was: Reynold Xin) > Create a Java-friendly input source API > --

[jira] [Updated] (SPARK-7025) Create a Java-friendly input source API

2015-04-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7025: --- Description: The goal of this ticket is to create a simple input source API that we can maintain and

[jira] [Created] (SPARK-7025) Create a Java-friendly input source API

2015-04-20 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7025: -- Summary: Create a Java-friendly input source API Key: SPARK-7025 URL: https://issues.apache.org/jira/browse/SPARK-7025 Project: Spark Issue Type: Improvement

[jira] [Comment Edited] (SPARK-7015) Multiclass to Binary Reduction

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504335#comment-14504335 ] Joseph K. Bradley edited comment on SPARK-7015 at 4/21/15 5:21 AM: -

[jira] [Commented] (SPARK-7015) Multiclass to Binary Reduction

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504335#comment-14504335 ] Joseph K. Bradley commented on SPARK-7015: -- Your reference looks newer than ones

[jira] [Updated] (SPARK-7022) PySpark is missing ParamGridBuilder

2015-04-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7022: - Target Version/s: 1.4.0 > PySpark is missing ParamGridBuilder > --

[jira] [Updated] (SPARK-7022) PySpark is missing ParamGridBuilder

2015-04-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7022: - Assignee: Omede Firouz > PySpark is missing ParamGridBuilder > ---

[jira] [Updated] (SPARK-6954) Dynamic allocation: numExecutorsPending in ExecutorAllocationManager should never become negative

2015-04-20 Thread Cheolsoo Park (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated SPARK-6954: - Attachment: without_fix.png with_fix.png I am uploading two diagrams that show ho

[jira] [Assigned] (SPARK-4131) Support "Writing data into the filesystem from queries"

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-4131: --- Assignee: Fei Wang (was: Apache Spark) > Support "Writing data into the filesystem from quer

[jira] [Assigned] (SPARK-4131) Support "Writing data into the filesystem from queries"

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-4131: --- Assignee: Apache Spark (was: Fei Wang) > Support "Writing data into the filesystem from quer

[jira] [Assigned] (SPARK-7024) Improve performance of function containsStar

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7024: --- Assignee: (was: Apache Spark) > Improve performance of function containsStar > --

[jira] [Commented] (SPARK-7024) Improve performance of function containsStar

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504255#comment-14504255 ] Apache Spark commented on SPARK-7024: - User 'watermen' has created a pull request for

[jira] [Commented] (SPARK-6900) spark ec2 script enters infinite loop when run-instance fails

2015-04-20 Thread Guodong Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504253#comment-14504253 ] Guodong Wang commented on SPARK-6900: - In my opinion, it does not cost us much to fix

[jira] [Assigned] (SPARK-7024) Improve performance of function containsStar

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7024: --- Assignee: Apache Spark > Improve performance of function containsStar > -

[jira] [Commented] (SPARK-6900) spark ec2 script enters infinite loop when run-instance fails

2015-04-20 Thread Guodong Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504236#comment-14504236 ] Guodong Wang commented on SPARK-6900: - Hi Nick, sorry for my late reply. I mark this

[jira] [Created] (SPARK-7024) Improve performance of function containsStar

2015-04-20 Thread Yadong Qi (JIRA)
Yadong Qi created SPARK-7024: Summary: Improve performance of function containsStar Key: SPARK-7024 URL: https://issues.apache.org/jira/browse/SPARK-7024 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-6738) EstimateSize is difference with spill file size

2015-04-20 Thread Hong Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Shen updated SPARK-6738: - Description: ExternalAppendOnlyMap spill 2.2 GB data to disk: {code} 15/04/07 20:27:37 INFO collection.E

[jira] [Updated] (SPARK-6738) EstimateSize is difference with spill file size

2015-04-20 Thread Hong Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Shen updated SPARK-6738: - Description: ExternalAppendOnlyMap spill 2.2 GB data to disk: {code} 15/04/07 20:27:37 INFO collection.E

[jira] [Comment Edited] (SPARK-6738) EstimateSize is difference with spill file size

2015-04-20 Thread Hong Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504202#comment-14504202 ] Hong Shen edited comment on SPARK-6738 at 4/21/15 2:54 AM: --- Ther

[jira] [Reopened] (SPARK-6738) EstimateSize is difference with spill file size

2015-04-20 Thread Hong Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Shen reopened SPARK-6738: -- There is a in SizeEstimator > EstimateSize is difference with spill file size > --

[jira] [Created] (SPARK-7023) [Spark SQL] Can't populate table size information into Hive metastore when create table or insert into table

2015-04-20 Thread Yi Zhou (JIRA)
Yi Zhou created SPARK-7023: -- Summary: [Spark SQL] Can't populate table size information into Hive metastore when create table or insert into table Key: SPARK-7023 URL: https://issues.apache.org/jira/browse/SPARK-7023

[jira] [Commented] (SPARK-7015) Multiclass to Binary Reduction

2015-04-20 Thread Ram Sriharsha (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504197#comment-14504197 ] Ram Sriharsha commented on SPARK-7015: -- sounds good. Let me know what reference you h

[jira] [Updated] (SPARK-7015) Multiclass to Binary Reduction

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7015: - Component/s: (was: MLlib) ML > Multiclass to Binary Reduction > -

[jira] [Commented] (SPARK-7015) Multiclass to Binary Reduction

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504189#comment-14504189 ] Joseph K. Bradley commented on SPARK-7015: -- +1 I'd strongly vote for supporting

[jira] [Comment Edited] (SPARK-7015) Multiclass to Binary Reduction

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504189#comment-14504189 ] Joseph K. Bradley edited comment on SPARK-7015 at 4/21/15 2:43 AM: -

[jira] [Updated] (SPARK-6635) DataFrame.withColumn can create columns with identical names

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6635: - Assignee: Liang-Chi Hsieh > DataFrame.withColumn can create columns with identical names >

[jira] [Updated] (SPARK-4766) ML Estimator Params should be distinct from Transformer Params

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4766: - Description: Currently, in spark.ml, both Transformers and Estimators extend the same Para

[jira] [Updated] (SPARK-4766) ML Estimator Params should be distinct from Transformer Params

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4766: - Summary: ML Estimator Params should be distinct from Transformer Params (was: ML Estimato

[jira] [Commented] (SPARK-4766) ML Estimator Params should subclass Transformer Params

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504181#comment-14504181 ] Joseph K. Bradley commented on SPARK-4766: -- *Update*: A new issue was brought up

[jira] [Commented] (SPARK-6529) Word2Vec transformer

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504180#comment-14504180 ] Joseph K. Bradley commented on SPARK-6529: -- [~yinxusen] brings up a good point (i

[jira] [Resolved] (SPARK-6635) DataFrame.withColumn can create columns with identical names

2015-04-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-6635. - Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5541 [https:/

[jira] [Resolved] (SPARK-6368) Build a specialized serializer for Exchange operator.

2015-04-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-6368. - Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5497 [https:/

[jira] [Commented] (SPARK-5100) Spark Thrift server monitor page

2015-04-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504132#comment-14504132 ] Cheng Lian commented on SPARK-5100: --- Had offline discussion with [~tianyi], he's rebasin

[jira] [Resolved] (SPARK-4521) Parquet fails to read columns with spaces in the name

2015-04-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-4521. --- Resolution: Done This ticket is covered by SPARK-6607. > Parquet fails to read columns with spaces in

[jira] [Commented] (SPARK-4521) Parquet fails to read columns with spaces in the name

2015-04-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504127#comment-14504127 ] Cheng Lian commented on SPARK-4521: --- Yes, I'm resolving this one. > Parquet fails to re

[jira] [Commented] (SPARK-6932) A Prototype of Parameter Server

2015-04-20 Thread He Yunlong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504125#comment-14504125 ] He Yunlong commented on SPARK-6932: --- I think this is a very good clue to discuss why and

[jira] [Commented] (SPARK-7008) An Implement of Factorization Machine (LibFM)

2015-04-20 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504114#comment-14504114 ] zhengruifeng commented on SPARK-7008: - thanks for this information! > An Implement of

[jira] [Updated] (SPARK-5995) Make ML Prediction Developer APIs public

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5995: - Description: Previously, some Developer APIs were added to spark.ml for classification and

[jira] [Commented] (SPARK-6635) DataFrame.withColumn can create columns with identical names

2015-04-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504092#comment-14504092 ] Michael Armbrust commented on SPARK-6635: - Sorry, updated. I meant {{withColumn}}

[jira] [Comment Edited] (SPARK-6635) DataFrame.withColumn can create columns with identical names

2015-04-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504030#comment-14504030 ] Michael Armbrust edited comment on SPARK-6635 at 4/21/15 1:07 AM: --

[jira] [Commented] (SPARK-5995) Make ML Prediction Developer APIs public

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504090#comment-14504090 ] Joseph K. Bradley commented on SPARK-5995: -- I just updated the design doc linked

[jira] [Commented] (SPARK-6635) DataFrame.withColumn can create columns with identical names

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504084#comment-14504084 ] Joseph K. Bradley commented on SPARK-6635: -- Just to clarify, does that mean {{wit

[jira] [Commented] (SPARK-6635) DataFrame.withColumn can create columns with identical names

2015-04-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504030#comment-14504030 ] Michael Armbrust commented on SPARK-6635: - +1 to {{withName}} overwriting existing

[jira] [Assigned] (SPARK-7022) PySpark is missing ParamGridBuilder

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7022: --- Assignee: Apache Spark > PySpark is missing ParamGridBuilder > --

[jira] [Assigned] (SPARK-7022) PySpark is missing ParamGridBuilder

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7022: --- Assignee: (was: Apache Spark) > PySpark is missing ParamGridBuilder > ---

[jira] [Commented] (SPARK-7022) PySpark is missing ParamGridBuilder

2015-04-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503957#comment-14503957 ] Apache Spark commented on SPARK-7022: - User 'oefirouz' has created a pull request for

[jira] [Updated] (SPARK-7022) PySpark is missing ParamGridBuilder

2015-04-20 Thread Omede Firouz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omede Firouz updated SPARK-7022: Description: PySpark is missing the entirety of ML.Tuning (see: https://issues.apache.org/jira/brow

[jira] [Created] (SPARK-7022) PySpark is missing ParamGridBuilder

2015-04-20 Thread Omede Firouz (JIRA)
Omede Firouz created SPARK-7022: --- Summary: PySpark is missing ParamGridBuilder Key: SPARK-7022 URL: https://issues.apache.org/jira/browse/SPARK-7022 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-6921) Spark SQL API "saveAsParquetFile" will output tachyon file with different block size

2015-04-20 Thread Sebastian YEPES FERNANDEZ (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503921#comment-14503921 ] Sebastian YEPES FERNANDEZ commented on SPARK-6921: -- I can also validate t

[jira] [Commented] (SPARK-6917) Broken data returned to PySpark dataframe if any large numbers used in Scala land

2015-04-20 Thread Harry Brundage (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503915#comment-14503915 ] Harry Brundage commented on SPARK-6917: --- [~davies] or [~joshrosen] any idea why this

[jira] [Created] (SPARK-7021) JUnit output for Python tests

2015-04-20 Thread Brennon York (JIRA)
Brennon York created SPARK-7021: --- Summary: JUnit output for Python tests Key: SPARK-7021 URL: https://issues.apache.org/jira/browse/SPARK-7021 Project: Spark Issue Type: Improvement C

[jira] [Updated] (SPARK-7020) Restrict module testing based on commit contents

2015-04-20 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brennon York updated SPARK-7020: Description: Currently all builds trigger all tests. This does not need to happen and, to minimize t

[jira] [Created] (SPARK-7020) Restrict module testing based on commit contents

2015-04-20 Thread Brennon York (JIRA)
Brennon York created SPARK-7020: --- Summary: Restrict module testing based on commit contents Key: SPARK-7020 URL: https://issues.apache.org/jira/browse/SPARK-7020 Project: Spark Issue Type: Impr

[jira] [Created] (SPARK-7019) Build docs on doc changes

2015-04-20 Thread Brennon York (JIRA)
Brennon York created SPARK-7019: --- Summary: Build docs on doc changes Key: SPARK-7019 URL: https://issues.apache.org/jira/browse/SPARK-7019 Project: Spark Issue Type: New Feature Compo

[jira] [Created] (SPARK-7018) Refactor dev/run-tests-jenkins into Python

2015-04-20 Thread Brennon York (JIRA)
Brennon York created SPARK-7018: --- Summary: Refactor dev/run-tests-jenkins into Python Key: SPARK-7018 URL: https://issues.apache.org/jira/browse/SPARK-7018 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-7017) Refactor dev/run-tests into Python

2015-04-20 Thread Brennon York (JIRA)
Brennon York created SPARK-7017: --- Summary: Refactor dev/run-tests into Python Key: SPARK-7017 URL: https://issues.apache.org/jira/browse/SPARK-7017 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-7016) Refactor dev/run-tests(-jenkins) from Bash to Python

2015-04-20 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brennon York updated SPARK-7016: Description: Currently the {{dev/run-tests}} and {{dev/run-tests-jenkins}} scripts are written in B

[jira] [Created] (SPARK-7016) Refactor {dev/run-tests(-jenkins)} from Bash to Python

2015-04-20 Thread Brennon York (JIRA)
Brennon York created SPARK-7016: --- Summary: Refactor {dev/run-tests(-jenkins)} from Bash to Python Key: SPARK-7016 URL: https://issues.apache.org/jira/browse/SPARK-7016 Project: Spark Issue Type

[jira] [Updated] (SPARK-7016) Refactor dev/run-tests(-jenkins) from Bash to Python

2015-04-20 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brennon York updated SPARK-7016: Summary: Refactor dev/run-tests(-jenkins) from Bash to Python (was: Refactor {{dev/run-tests(-jenki

[jira] [Updated] (SPARK-7016) Refactor {{dev/run-tests(-jenkins)}} from Bash to Python

2015-04-20 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brennon York updated SPARK-7016: Summary: Refactor {{dev/run-tests(-jenkins)}} from Bash to Python (was: Refactor {dev/run-tests(-je

[jira] [Updated] (SPARK-6954) Dynamic allocation: numExecutorsPending in ExecutorAllocationManager should never become negative

2015-04-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-6954: - Priority: Major (was: Minor) > Dynamic allocation: numExecutorsPending in ExecutorAllocationManager shoul

[jira] [Commented] (SPARK-7002) Persist on RDD fails the second time if the action is called on a child RDD without showing a FAILED message

2015-04-20 Thread Tom Hubregtsen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503783#comment-14503783 ] Tom Hubregtsen commented on SPARK-7002: --- Great, thanks for your help :) I will be h

[jira] [Commented] (SPARK-7002) Persist on RDD fails the second time if the action is called on a child RDD without showing a FAILED message

2015-04-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503761#comment-14503761 ] Sean Owen commented on SPARK-7002: -- The shuffle data is a sort of hidden, second type of

[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems

2015-04-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503754#comment-14503754 ] Sean Owen commented on SPARK-7009: -- Or warnings, yes. These add to the case that updating

[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems

2015-04-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503752#comment-14503752 ] Steve Loughran commented on SPARK-7009: --- most of the others seemed fix by documentat

[jira] [Comment Edited] (SPARK-7002) Persist on RDD fails the second time if the action is called on a child RDD without showing a FAILED message

2015-04-20 Thread Tom Hubregtsen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503730#comment-14503730 ] Tom Hubregtsen edited comment on SPARK-7002 at 4/20/15 9:46 PM:

[jira] [Commented] (SPARK-7002) Persist on RDD fails the second time if the action is called on a child RDD without showing a FAILED message

2015-04-20 Thread Tom Hubregtsen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503730#comment-14503730 ] Tom Hubregtsen commented on SPARK-7002: --- Your speculation was correct: After the ab

[jira] [Updated] (SPARK-6787) Model export/import for spark.ml: StandardScaler

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6787: - Target Version/s: (was: 1.4.0) > Model export/import for spark.ml: StandardScaler >

[jira] [Updated] (SPARK-6786) Model export/import for spark.ml: Normalizer

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6786: - Target Version/s: (was: 1.4.0) > Model export/import for spark.ml: Normalizer >

[jira] [Updated] (SPARK-6726) Model export/import for spark.ml: LogisticRegression

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6726: - Target Version/s: (was: 1.4.0) > Model export/import for spark.ml: LogisticRegression >

[jira] [Updated] (SPARK-6789) Model export/import for spark.ml: ALS

2015-04-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6789: - Target Version/s: (was: 1.4.0) > Model export/import for spark.ml: ALS > ---
