[jira] [Updated] (SPARK-11685) Find duplicate content under examples/

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-11685: -- Fix Version/s: 1.6.0 > Find duplicate content under examples/ >

[jira] [Commented] (SPARK-12386) Setting "spark.executor.port" leads to NPE in SparkEnv

2015-12-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061072#comment-15061072 ] Shixiong Zhu commented on SPARK-12386: -- Yeah, I will submit a PR to fix it. However, I don't think

[jira] [Commented] (SPARK-12379) Copy GBT implementation to spark.ml

2015-12-16 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061101#comment-15061101 ] Seth Hendrickson commented on SPARK-12379: -- I wouldn't mind taking a crack at this if that seems

[jira] [Created] (SPARK-12378) CREATE EXTERNAL TABLE AS SELECT EXPORT AWS S3 ERROR

2015-12-16 Thread CESAR MICHELETTI (JIRA)
CESAR MICHELETTI created SPARK-12378: Summary: CREATE EXTERNAL TABLE AS SELECT EXPORT AWS S3 ERROR Key: SPARK-12378 URL: https://issues.apache.org/jira/browse/SPARK-12378 Project: Spark

[jira] [Comment Edited] (SPARK-8855) Python API for Association Rules

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15047432#comment-15047432 ] Joseph K. Bradley edited comment on SPARK-8855 at 12/16/15 10:44 PM: -

[jira] [Commented] (SPARK-8855) Python API for Association Rules

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061044#comment-15061044 ] Joseph K. Bradley commented on SPARK-8855: -- Oops, apologies for not realizing AssociationRules

[jira] [Commented] (SPARK-12386) Setting "spark.executor.port" leads to NPE in SparkEnv

2015-12-16 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061084#comment-15061084 ] Marcelo Vanzin commented on SPARK-12386: bq. The user can just remove that unused config The

[jira] [Resolved] (SPARK-11677) ORC filter tests all pass if filters are actually not pushed down.

2015-12-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11677. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 9687

[jira] [Resolved] (SPARK-9690) Add random seed Param to PySpark CrossValidator

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-9690. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 10268

[jira] [Created] (SPARK-12386) Setting "spark.executor.port" leads to NPE in SparkEnv

2015-12-16 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-12386: -- Summary: Setting "spark.executor.port" leads to NPE in SparkEnv Key: SPARK-12386 URL: https://issues.apache.org/jira/browse/SPARK-12386 Project: Spark

[jira] [Commented] (SPARK-11685) Find duplicate content under examples/

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061046#comment-15061046 ] Joseph K. Bradley commented on SPARK-11685: --- OK appreciate it! > Find duplicate content under

[jira] [Updated] (SPARK-11677) ORC filter tests all pass if filters are actually not pushed down.

2015-12-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11677: - Assignee: Hyukjin Kwon > ORC filter tests all pass if filters are actually not pushed

[jira] [Assigned] (SPARK-12376) Spark Streaming Java8APISuite fails in assertOrderInvariantEquals method

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12376: Assignee: (was: Apache Spark) > Spark Streaming Java8APISuite fails in

[jira] [Created] (SPARK-12379) Copy GBT implementation to spark.ml

2015-12-16 Thread Seth Hendrickson (JIRA)
Seth Hendrickson created SPARK-12379: Summary: Copy GBT implementation to spark.ml Key: SPARK-12379 URL: https://issues.apache.org/jira/browse/SPARK-12379 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-12376) Spark Streaming Java8APISuite fails in assertOrderInvariantEquals method

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12376: Assignee: Apache Spark > Spark Streaming Java8APISuite fails in

[jira] [Created] (SPARK-12384) Allow -Xms to be set differently then -Xmx

2015-12-16 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-12384: - Summary: Allow -Xms to be set differently then -Xmx Key: SPARK-12384 URL: https://issues.apache.org/jira/browse/SPARK-12384 Project: Spark Issue Type:

[jira] [Commented] (SPARK-12386) Setting "spark.executor.port" leads to NPE in SparkEnv

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061097#comment-15061097 ] Apache Spark commented on SPARK-12386: -- User 'vanzin' has created a pull request for this issue:

[jira] [Updated] (SPARK-12380) MLLib should use existing SQLContext instead create new one

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12380: -- Target Version/s: 1.6.1, 2.0.0 Component/s: PySpark

[jira] [Created] (SPARK-12385) Push projection into Join

2015-12-16 Thread Davies Liu (JIRA)
Davies Liu created SPARK-12385: -- Summary: Push projection into Join Key: SPARK-12385 URL: https://issues.apache.org/jira/browse/SPARK-12385 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-12386) Setting "spark.executor.port" leads to NPE in SparkEnv

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12386: Assignee: Apache Spark > Setting "spark.executor.port" leads to NPE in SparkEnv >

[jira] [Commented] (SPARK-12386) Setting "spark.executor.port" leads to NPE in SparkEnv

2015-12-16 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061096#comment-15061096 ] Marcelo Vanzin commented on SPARK-12386: BTW the bot seems to be lagging, but:

[jira] [Commented] (SPARK-12386) Setting "spark.executor.port" leads to NPE in SparkEnv

2015-12-16 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061119#comment-15061119 ] Marcelo Vanzin commented on SPARK-12386: The config option is not about assumptions; it's about

[jira] [Commented] (SPARK-12386) Setting "spark.executor.port" leads to NPE in SparkEnv

2015-12-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061130#comment-15061130 ] Shixiong Zhu commented on SPARK-12386: -- So the consequence is, if the user removes

[jira] [Created] (SPARK-12387) JDBC IN operator push down

2015-12-16 Thread Huaxin Gao (JIRA)
Huaxin Gao created SPARK-12387: -- Summary: JDBC IN operator push down Key: SPARK-12387 URL: https://issues.apache.org/jira/browse/SPARK-12387 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-12387) JDBC IN operator push down

2015-12-16 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061189#comment-15061189 ] Huaxin Gao commented on SPARK-12387: I will submit a PR soon. > JDBC IN operator push down >

[jira] [Assigned] (SPARK-12388) Change default compressor to LZ4

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12388: Assignee: Apache Spark > Change default compressor to LZ4 >

[jira] [Assigned] (SPARK-12388) Change default compressor to LZ4

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12388: Assignee: (was: Apache Spark) > Change default compressor to LZ4 >

[jira] [Updated] (SPARK-12388) Change default compressor to LZ4

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12388: Labels: releasenotes (was: ) > Change default compressor to LZ4 >

[jira] [Commented] (SPARK-10931) PySpark ML Models should contain Param values

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061324#comment-15061324 ] Joseph K. Bradley commented on SPARK-10931: --- I was thinking of fetching it from the Java object

[jira] [Resolved] (SPARK-12365) Use ShutdownHookManager where Runtime.getRuntime.addShutdownHook() is called

2015-12-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12365. --- Resolution: Fixed Assignee: Ted Yu (was: Apache Spark) Fix Version/s: 2.0.0

[jira] [Resolved] (SPARK-12186) stage web URI will redirect to the wrong location if it is the first URI from the application to be requested from the history server

2015-12-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12186. --- Resolution: Fixed Fix Version/s: 1.6.0 Target Version/s: 1.6.0 > stage web URI will

[jira] [Updated] (SPARK-12186) stage web URI will redirect to the wrong location if it is the first URI from the application to be requested from the history server

2015-12-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12186: -- Fix Version/s: (was: 1.6.0) 2.0.0 1.6.1 > stage web URI will

[jira] [Updated] (SPARK-12186) stage web URI will redirect to the wrong location if it is the first URI from the application to be requested from the history server

2015-12-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12186: -- Target Version/s: 1.6.1, 2.0.0 (was: 1.6.0) > stage web URI will redirect to the wrong location if it

[jira] [Updated] (SPARK-12186) stage web URI will redirect to the wrong location if it is the first URI from the application to be requested from the history server

2015-12-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12186: -- Assignee: Rohit Agarwal > stage web URI will redirect to the wrong location if it is the first URI

[jira] [Commented] (SPARK-12391) JDBC OR operator push down

2015-12-16 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061498#comment-15061498 ] Huaxin Gao commented on SPARK-12391: Will submit a PR soon > JDBC OR operator push down >

[jira] [Created] (SPARK-12391) JDBC OR operator push down

2015-12-16 Thread Huaxin Gao (JIRA)
Huaxin Gao created SPARK-12391: -- Summary: JDBC OR operator push down Key: SPARK-12391 URL: https://issues.apache.org/jira/browse/SPARK-12391 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-12392) Optimize a location order of broadcast blocks by considering preferred local hosts

2015-12-16 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-12392: Summary: Optimize a location order of broadcast blocks by considering preferred local hosts Key: SPARK-12392 URL: https://issues.apache.org/jira/browse/SPARK-12392

[jira] [Closed] (SPARK-3863) Cache broadcasted tables and reuse them across queries

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-3863. -- Resolution: Fixed Closing this as later. > Cache broadcasted tables and reuse them across queries >

[jira] [Reopened] (SPARK-3863) Cache broadcasted tables and reuse them across queries

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reopened SPARK-3863: > Cache broadcasted tables and reuse them across queries >

[jira] [Closed] (SPARK-3864) Specialize join for tables with unique integer keys

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-3864. -- Resolution: Later Closing this as later. > Specialize join for tables with unique integer keys >

[jira] [Closed] (SPARK-3863) Cache broadcasted tables and reuse them across queries

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-3863. -- Resolution: Later Closing this as later. > Cache broadcasted tables and reuse them across queries >

[jira] [Closed] (SPARK-3860) Improve dimension joins

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-3860. -- Resolution: Later Closing this as later. > Improve dimension joins > --- > >

[jira] [Created] (SPARK-12395) Result of DataFrame.join(usingColumns) could be wrong for outer join

2015-12-16 Thread Davies Liu (JIRA)
Davies Liu created SPARK-12395: -- Summary: Result of DataFrame.join(usingColumns) could be wrong for outer join Key: SPARK-12395 URL: https://issues.apache.org/jira/browse/SPARK-12395 Project: Spark

[jira] [Commented] (SPARK-12387) JDBC IN operator push down

2015-12-16 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061532#comment-15061532 ] Huaxin Gao commented on SPARK-12387: Submitted PR: https://github.com/apache/spark/pull/10345 > JDBC

[jira] [Assigned] (SPARK-12392) Optimize a location order of broadcast blocks by considering preferred local hosts

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12392: Assignee: (was: Apache Spark) > Optimize a location order of broadcast blocks by

[jira] [Assigned] (SPARK-12392) Optimize a location order of broadcast blocks by considering preferred local hosts

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12392: Assignee: Apache Spark > Optimize a location order of broadcast blocks by considering

[jira] [Commented] (SPARK-12279) Requesting a HBase table with kerberos is not working

2015-12-16 Thread Y Bodnar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061569#comment-15061569 ] Y Bodnar commented on SPARK-12279: -- Got it working on Spark 1.5.2 (built for hadoop2.4) with Spark

[jira] [Updated] (SPARK-11904) pyspark reduceByKeyAndWindow with invFunc=None requires checkpointing

2015-12-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-11904: - Issue Type: Improvement (was: Bug) > pyspark reduceByKeyAndWindow with invFunc=None requires

[jira] [Resolved] (SPARK-11904) pyspark reduceByKeyAndWindow with invFunc=None requires checkpointing

2015-12-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-11904. -- Resolution: Fixed Assignee: David Tolpin Fix Version/s: 2.0.0 > pyspark

[jira] [Updated] (SPARK-12393) Add read.text and write.text for SparkR

2015-12-16 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-12393: Issue Type: Sub-task (was: New Feature) Parent: SPARK-12144 > Add read.text and

[jira] [Created] (SPARK-12393) Add read.text and write.text for SparkR

2015-12-16 Thread Yanbo Liang (JIRA)
Yanbo Liang created SPARK-12393: --- Summary: Add read.text and write.text for SparkR Key: SPARK-12393 URL: https://issues.apache.org/jira/browse/SPARK-12393 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-6006) Optimize count distinct in case of high cardinality columns

2015-12-16 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061594#comment-15061594 ] Yin Huai commented on SPARK-6006: - SPARK-12077 fixed it. > Optimize count distinct in case of high

[jira] [Closed] (SPARK-6006) Optimize count distinct in case of high cardinality columns

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-6006. -- Resolution: Fixed Assignee: Davies Liu Fix Version/s: 1.6.0 This is fixed as of Spark

[jira] [Closed] (SPARK-2365) Add IndexedRDD, an efficient updatable key-value store

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-2365. -- Resolution: Later I'm closing this as later for now to cut down the number of unresolved JIRA tickets.

[jira] [Commented] (SPARK-12353) wrong output for countByValue and countByValueAndWindow

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061607#comment-15061607 ] Apache Spark commented on SPARK-12353: -- User 'jerryshao' has created a pull request for this issue:

[jira] [Commented] (SPARK-12393) Add read.text and write.text for SparkR

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061606#comment-15061606 ] Apache Spark commented on SPARK-12393: -- User 'yanboliang' has created a pull request for this issue:

[jira] [Commented] (SPARK-12384) Allow -Xms to be set differently then -Xmx

2015-12-16 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061639#comment-15061639 ] Saisai Shao commented on SPARK-12384: - IIUC, there's also another limitation in container level. For

[jira] [Resolved] (SPARK-12057) Prevent failure on corrupt JSON records

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12057. - Resolution: Fixed Fix Version/s: 2.0.0 1.6.1 > Prevent failure on

[jira] [Commented] (SPARK-12387) JDBC IN operator push down

2015-12-16 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061535#comment-15061535 ] Huaxin Gao commented on SPARK-12387: Somehow my PR doesn't automatically link to the jira. I

[jira] [Commented] (SPARK-6918) Secure HBase with Kerberos does not work over YARN

2015-12-16 Thread Y Bodnar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061572#comment-15061572 ] Y Bodnar commented on SPARK-6918: - This actually works, albeit setup is a bit tricky. Added a comment

[jira] [Assigned] (SPARK-12391) JDBC OR operator push down

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12391: Assignee: Apache Spark > JDBC OR operator push down > -- > >

[jira] [Commented] (SPARK-12391) JDBC OR operator push down

2015-12-16 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061574#comment-15061574 ] Huaxin Gao commented on SPARK-12391: submitted PR: https://github.com/apache/spark/pull/10347 > JDBC

[jira] [Closed] (SPARK-11512) Bucket Join

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-11512. --- Resolution: Duplicate > Bucket Join > --- > > Key: SPARK-11512 >

[jira] [Created] (SPARK-12394) Support writing out pre-hash-partitioned data and exploit that in join optimizations to avoid shuffle (i.e. bucketing in Hive)

2015-12-16 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-12394: --- Summary: Support writing out pre-hash-partitioned data and exploit that in join optimizations to avoid shuffle (i.e. bucketing in Hive) Key: SPARK-12394 URL:

[jira] [Closed] (SPARK-5292) optimize join for table that are already sharded/support for hive bucket

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-5292. -- Resolution: Duplicate > optimize join for table that are already sharded/support for hive bucket >

[jira] [Assigned] (SPARK-12393) Add read.text and write.text for SparkR

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12393: Assignee: Apache Spark > Add read.text and write.text for SparkR >

[jira] [Assigned] (SPARK-12393) Add read.text and write.text for SparkR

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12393: Assignee: (was: Apache Spark) > Add read.text and write.text for SparkR >

[jira] [Closed] (SPARK-8836) Sorted join

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-8836. -- Resolution: Done Marking this as done. Users can get it from both Dataset and DataFrame and SQL now.

[jira] [Closed] (SPARK-3785) Support off-loading computations to a GPU

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-3785. -- Resolution: Later I'm going to close this ticket as "later". But feel free to use this as a venue for

[jira] [Commented] (SPARK-8287) Filters not pushed with substitution through aggregation

2015-12-16 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061628#comment-15061628 ] Yin Huai commented on SPARK-8287: - It is fixed in 1.6. > Filters not pushed with substitution through

[jira] [Closed] (SPARK-8287) Filters not pushed with substitution through aggregation

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-8287. -- Resolution: Duplicate Assignee: Davies Liu (was: Li Sheng) Fix Version/s: 1.6.0 >

[jira] [Comment Edited] (SPARK-8287) Filters not pushed with substitution through aggregation

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14580403#comment-14580403 ] Reynold Xin edited comment on SPARK-8287 at 12/17/15 6:58 AM: -- Sorry [~lian

[jira] [Updated] (SPARK-11904) pyspark reduceByKeyAndWindow with invFunc=None requires checkpointing

2015-12-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-11904: - Affects Version/s: 1.6.0 > pyspark reduceByKeyAndWindow with invFunc=None requires checkpointing

[jira] [Commented] (SPARK-12391) JDBC OR operator push down

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061576#comment-15061576 ] Apache Spark commented on SPARK-12391: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Closed] (SPARK-2211) Join Optimization

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-2211. -- Resolution: Later Closing as later to cut down the number of unresolved jiras. We will create these

[jira] [Closed] (SPARK-2215) Multi-way join

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-2215. -- Resolution: Later Closing this as later to cut down the number of unresolved jiras. We will open

[jira] [Closed] (SPARK-2216) Cost-based join reordering

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-2216. -- Resolution: Later Closing this as later to cut down the number of unresolved jiras. We will open

[jira] [Closed] (SPARK-4644) Implement skewed join

2015-12-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-4644. -- Resolution: Won't Fix Closing as won't fix for now. If we were going to implement skew join, we should

[jira] [Commented] (SPARK-8287) Filters not pushed with substitution through aggregation

2015-12-16 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061623#comment-15061623 ] Yin Huai commented on SPARK-8287: - SPARK-11179 and SPARK-11973 fixed it. > Filters not pushed with

[jira] [Updated] (SPARK-12395) Result of DataFrame.join(usingColumns) could be wrong for outer join

2015-12-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-12395: --- Priority: Critical (was: Major) > Result of DataFrame.join(usingColumns) could be wrong for outer

[jira] [Updated] (SPARK-12395) Result of DataFrame.join(usingColumns) could be wrong for outer join

2015-12-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-12395: --- Priority: Blocker (was: Critical) > Result of DataFrame.join(usingColumns) could be wrong for outer

[jira] [Updated] (SPARK-12372) Document limitations of MLlib linear algebra

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12372: -- Description: This JIRA is now for documenting limitations of MLlib's local linear

[jira] [Updated] (SPARK-12389) In Cluster RDD Action results are not consistent

2015-12-16 Thread vinoth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoth updated SPARK-12389: --- Attachment: local_spark.txt cluster_wide.txt > In Cluster RDD Action results are not

[jira] [Commented] (SPARK-12180) DataFrame.join() in PySpark gives misleading exception when column name exists on both side

2015-12-16 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061320#comment-15061320 ] Jeff Zhang commented on SPARK-12180: Simulate your sample code, but it works for me. But I am on

[jira] [Commented] (SPARK-12218) Boolean logic in sql does not work "not (A and B)" is not the same as "(not A) or (not B)"

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061415#comment-15061415 ] Apache Spark commented on SPARK-12218: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Created] (SPARK-12388) Change default compressor to LZ4

2015-12-16 Thread Davies Liu (JIRA)
Davies Liu created SPARK-12388: -- Summary: Change default compressor to LZ4 Key: SPARK-12388 URL: https://issues.apache.org/jira/browse/SPARK-12388 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-12375) VectorIndexer: allow unknown categories

2015-12-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061263#comment-15061263 ] yuhao yang commented on SPARK-12375: Anyone working on this? If not, I'll start to. > VectorIndexer:

[jira] [Assigned] (SPARK-12390) Clean up unused serializer parameter in BlockManager

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12390: Assignee: Apache Spark (was: Andrew Or) > Clean up unused serializer parameter in

[jira] [Resolved] (SPARK-10248) DAGSchedulerSuite should check there were no errors in EventProcessLoop

2015-12-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-10248. --- Resolution: Fixed Assignee: Imran Rashid Fix Version/s: 2.0.0

[jira] [Commented] (SPARK-6817) DataFrame UDFs in R

2015-12-16 Thread Antonio Piccolboni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061420#comment-15061420 ] Antonio Piccolboni commented on SPARK-6817: --- Will this form of partition-UDF available only in R

[jira] [Updated] (SPARK-12345) Mesos cluster mode is broken

2015-12-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12345: -- Assignee: Timothy Chen (was: Luc Bourlier) > Mesos cluster mode is broken >

[jira] [Commented] (SPARK-11677) ORC filter tests all pass if filters are actually not pushed down.

2015-12-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061257#comment-15061257 ] Apache Spark commented on SPARK-11677: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Updated] (SPARK-12372) Document limitations of MLlib local linear algebra

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12372: -- Summary: Document limitations of MLlib local linear algebra (was: Document

[jira] [Resolved] (SPARK-12386) Setting "spark.executor.port" leads to NPE in SparkEnv

2015-12-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12386. --- Resolution: Fixed Assignee: Marcelo Vanzin (was: Apache Spark) Fix

[jira] [Updated] (SPARK-12390) Clean up unused serializer parameter in BlockManager

2015-12-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12390: -- Fix Version/s: 2.0.0 > Clean up unused serializer parameter in BlockManager >

[jira] [Updated] (SPARK-12388) Change default compressor to LZ4

2015-12-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-12388: --- Description: According the benchmark [1], LZ4-java could be 80% (or 30%) faster than Snappy. After

[jira] [Created] (SPARK-12389) In Cluster RDD Action results are not consistent

2015-12-16 Thread vinoth (JIRA)
vinoth created SPARK-12389: -- Summary: In Cluster RDD Action results are not consistent Key: SPARK-12389 URL: https://issues.apache.org/jira/browse/SPARK-12389 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-12218) Boolean logic in sql does not work "not (A and B)" is not the same as "(not A) or (not B)"

2015-12-16 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-12218: - Target Version/s: 1.5.3, 1.6.1, 2.0.0 > Boolean logic in sql does not work "not (A and B)" is not the

[jira] [Commented] (SPARK-12372) Unary operator "-" fails for MLlib vectors

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061310#comment-15061310 ] Joseph K. Bradley commented on SPARK-12372: --- That's a good point. I'll reopen this and edit it

[jira] [Updated] (SPARK-12372) Document limitations of MLlib linear algebra

2015-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12372: -- Summary: Document limitations of MLlib linear algebra (was: Unary operator "-" fails
