[jira] [Commented] (SPARK-10697) Lift Calculation in Association Rule mining

2016-09-06 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469778#comment-15469778 ] Daniel Müller commented on SPARK-10697: --- It really would make sense to add the lift

[jira] [Commented] (SPARK-17094) provide simplified API for ML pipeline

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469765#comment-15469765 ] Sean Owen commented on SPARK-17094: --- This is already pretty much possible as: {code} va

[jira] [Updated] (SPARK-16785) dapply doesn't return array or raw columns

2016-09-06 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaram Venkataraman updated SPARK-16785: -- Assignee: Clark Fitzgerald > dapply doesn't return array or raw columns > -

[jira] [Resolved] (SPARK-16785) dapply doesn't return array or raw columns

2016-09-06 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaram Venkataraman resolved SPARK-16785. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue

[jira] [Comment Edited] (SPARK-17428) SparkR executors/workers support virtualenv

2016-09-06 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469736#comment-15469736 ] Yanbo Liang edited comment on SPARK-17428 at 9/7/16 6:40 AM: -

[jira] [Commented] (SPARK-17428) SparkR executors/workers support virtualenv

2016-09-06 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469736#comment-15469736 ] Yanbo Liang commented on SPARK-17428: - cc [~shivaram] [~felixcheung] > SparkR execut

[jira] [Updated] (SPARK-17428) SparkR executors/workers support virtualenv

2016-09-06 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-17428: Description: Many users have requirements to use third party R packages in executors/workers, but

[jira] [Created] (SPARK-17428) SparkR executors/workers support virtualenv

2016-09-06 Thread Yanbo Liang (JIRA)
Yanbo Liang created SPARK-17428: --- Summary: SparkR executors/workers support virtualenv Key: SPARK-17428 URL: https://issues.apache.org/jira/browse/SPARK-17428 Project: Spark Issue Type: New Fea

[jira] [Commented] (SPARK-17094) provide simplified API for ML pipeline

2016-09-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469609#comment-15469609 ] yuhao yang commented on SPARK-17094: Something like Stanford CoreNLP pipeline: prop

[jira] [Assigned] (SPARK-17427) function SIZE should return -1 when parameter is null

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17427: Assignee: (was: Apache Spark) > function SIZE should return -1 when parameter is null

[jira] [Commented] (SPARK-17427) function SIZE should return -1 when parameter is null

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469569#comment-15469569 ] Apache Spark commented on SPARK-17427: -- User 'adrian-wang' has created a pull reques

[jira] [Assigned] (SPARK-17427) function SIZE should return -1 when parameter is null

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17427: Assignee: Apache Spark > function SIZE should return -1 when parameter is null > -

[jira] [Created] (SPARK-17427) function SIZE should return -1 when parameter is null

2016-09-06 Thread Adrian Wang (JIRA)
Adrian Wang created SPARK-17427: --- Summary: function SIZE should return -1 when parameter is null Key: SPARK-17427 URL: https://issues.apache.org/jira/browse/SPARK-17427 Project: Spark Issue Typ

[jira] [Commented] (SPARK-17368) Scala value classes create encoder problems and break at runtime

2016-09-06 Thread Aris Vlasakakis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469564#comment-15469564 ] Aris Vlasakakis commented on SPARK-17368: - It goes from inconvenient to actually

[jira] [Assigned] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17426: Assignee: Apache Spark > Current TreeNode.toJSON may trigger OOM under some corner cases >

[jira] [Assigned] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17426: Assignee: (was: Apache Spark) > Current TreeNode.toJSON may trigger OOM under some cor

[jira] [Commented] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469557#comment-15469557 ] Apache Spark commented on SPARK-17426: -- User 'clockfly' has created a pull request f

[jira] [Updated] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17426: --- Target Version/s: 2.1.0 Description: In SPARK-17356, we fix the OOM issue when Metadata is s

[jira] [Updated] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17426: --- Description: In SPARK-17356, we fix the OOM issue when Metadata is super big. There are other cases

[jira] [Updated] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17426: --- Component/s: SQL > Current TreeNode.toJSON may trigger OOM under some corner cases >

[jira] [Created] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17426: -- Summary: Current TreeNode.toJSON may trigger OOM under some corner cases Key: SPARK-17426 URL: https://issues.apache.org/jira/browse/SPARK-17426 Project: Spark

[jira] [Commented] (SPARK-17405) Simple aggregation query OOMing after SPARK-16525

2016-09-06 Thread Jacek Laskowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469487#comment-15469487 ] Jacek Laskowski commented on SPARK-17405: - It definitely got better with the buil

[jira] [Updated] (SPARK-6235) Address various 2G limits

2016-09-06 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-6235: --- Attachment: (was: SPARK-6235_Design_V0.01.pdf) > Address various 2G limits > -

[jira] [Updated] (SPARK-6235) Address various 2G limits

2016-09-06 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-6235: --- Attachment: SPARK-6235_Design_V0.02.pdf > Address various 2G limits > - > >

[jira] [Commented] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-09-06 Thread Tomer Kaftan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469295#comment-15469295 ] Tomer Kaftan commented on SPARK-17110: -- Thanks all who helped out with this! > Pysp

[jira] [Resolved] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-17372. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Updated] (SPARK-17279) better error message for exceptions during ScalaUDF execution

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17279: Fix Version/s: 2.0.1 > better error message for exceptions during ScalaUDF execution >

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReuseExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Summary: Override sameResult in HiveTableScanExec to make ReuseExchange work in text format table (was

[jira] [Assigned] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17425: Assignee: (was: Apache Spark) > Override sameResult in HiveTableScanExec to make Reuse

[jira] [Commented] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469249#comment-15469249 ] Apache Spark commented on SPARK-17425: -- User 'watermen' has created a pull request f

[jira] [Assigned] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17425: Assignee: Apache Spark > Override sameResult in HiveTableScanExec to make ReusedExchange w

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} SELECT * FROM src t1 JOIN s

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} SELECT 1 FROM src t1 JOIN s

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} SELECT 1 FROM src t1 JOIN s

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} SELECT 1 FROM src t1 JOIN s

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} SELECT 1 FROM src t1 JOIN s

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Affects Version/s: 2.0.0 Component/s: SQL > Override sameResult in HiveTableScanExec to make

[jira] [Created] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
Yadong Qi created SPARK-17425: - Summary: Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table Key: SPARK-17425 URL: https://issues.apache.org/jira/browse/SPARK-17425 P

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} select 1 from src t1 join s

[jira] [Resolved] (SPARK-17238) simplify the logic for converting data source table into hive compatible format

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17238. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14809 [https://githu

[jira] [Updated] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-17372: -- Description: When we create a filestream on a directory that has partitioned subdirs (i.e. dir

[jira] [Updated] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-17372: -- Description: When we create a filestream on a directory that has partitioned subdirs (i.e. dir

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-09-06 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469092#comment-15469092 ] Sital Kedia commented on SPARK-16922: - There is no noticeable performance gain I obser

[jira] [Commented] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469072#comment-15469072 ] Apache Spark commented on SPARK-17372: -- User 'tdas' has created a pull request for t

[jira] [Assigned] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17372: Assignee: Tathagata Das (was: Apache Spark) > Running a file stream on a directory with p

[jira] [Assigned] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17372: Assignee: Apache Spark (was: Tathagata Das) > Running a file stream on a directory with p

[jira] [Updated] (SPARK-17408) Flaky test: org.apache.spark.sql.hive.StatisticsSuite

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17408: Assignee: Xiao Li > Flaky test: org.apache.spark.sql.hive.StatisticsSuite > ---

[jira] [Resolved] (SPARK-17408) Flaky test: org.apache.spark.sql.hive.StatisticsSuite

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17408. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14978 [https://githu

[jira] [Updated] (SPARK-17371) Resubmitted stage outputs deleted by zombie map tasks on stop()

2016-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-17371: --- Assignee: Eric Liang > Resubmitted stage outputs deleted by zombie map tasks on stop() >

[jira] [Resolved] (SPARK-17371) Resubmitted stage outputs deleted by zombie map tasks on stop()

2016-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-17371. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14932 [https://github.

[jira] [Commented] (SPARK-17423) Support IGNORE NULLS option in Window functions

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468973#comment-15468973 ] Herman van Hovell commented on SPARK-17423: --- IMO LEAD and LAG with ignore nulls

[jira] [Assigned] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17421: Assignee: Apache Spark > Warnings about "MaxPermSize" parameter when building with Maven a

[jira] [Commented] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468943#comment-15468943 ] Apache Spark commented on SPARK-17421: -- User 'frreiss' has created a pull request fo

[jira] [Assigned] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17421: Assignee: (was: Apache Spark) > Warnings about "MaxPermSize" parameter when building w

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468932#comment-15468932 ] Ryan Blue commented on SPARK-17396: --- I opened a PR with a fix. It still uses a ForkJoin

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468929#comment-15468929 ] Apache Spark commented on SPARK-17396: -- User 'rdblue' has created a pull request for

[jira] [Commented] (SPARK-17296) Spark SQL: cross join + two joins = BUG

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468928#comment-15468928 ] Apache Spark commented on SPARK-17296: -- User 'hvanhovell' has created a pull request

[jira] [Assigned] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17396: Assignee: Apache Spark > Threads number keep increasing when query on external CSV partiti

[jira] [Assigned] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17396: Assignee: (was: Apache Spark) > Threads number keep increasing when query on external

[jira] [Updated] (SPARK-17424) Dataset job fails from unsound substitution in ScalaReflect

2016-09-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-17424: -- Description: I have a job that uses datasets in 1.6.1 and is failing with this error: {code} 16/09/02

[jira] [Created] (SPARK-17424) Dataset job fails from unsound substitution in ScalaReflect

2016-09-06 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-17424: - Summary: Dataset job fails from unsound substitution in ScalaReflect Key: SPARK-17424 URL: https://issues.apache.org/jira/browse/SPARK-17424 Project: Spark Issue

[jira] [Commented] (SPARK-16026) Cost-based Optimizer framework

2016-09-06 Thread Srinath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468891#comment-15468891 ] Srinath commented on SPARK-16026: - I have a couple of comments/questions on the proposal.

[jira] [Resolved] (SPARK-17253) Left join where ON clause does not reference the right table produces analysis error

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17253. --- Resolution: Duplicate Assignee: Herman van Hovell Fix Version/s: 2.1.0

[jira] [Resolved] (SPARK-17296) Spark SQL: cross join + two joins = BUG

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17296. --- Resolution: Fixed Fix Version/s: 2.1.0 > Spark SQL: cross join + two joins = B

[jira] [Assigned] (SPARK-17296) Spark SQL: cross join + two joins = BUG

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell reassigned SPARK-17296: - Assignee: Herman van Hovell > Spark SQL: cross join + two joins = BUG >

[jira] [Created] (SPARK-17423) Support IGNORE NULLS option in Window functions

2016-09-06 Thread Tim Chan (JIRA)
Tim Chan created SPARK-17423: Summary: Support IGNORE NULLS option in Window functions Key: SPARK-17423 URL: https://issues.apache.org/jira/browse/SPARK-17423 Project: Spark Issue Type: Improveme

[jira] [Comment Edited] (SPARK-17368) Scala value classes create encoder problems and break at runtime

2016-09-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468833#comment-15468833 ] Jakob Odersky edited comment on SPARK-17368 at 9/6/16 10:57 PM: ---

[jira] [Resolved] (SPARK-15891) Make YARN logs less noisy

2016-09-06 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-15891. Resolution: Fixed Assignee: Marcelo Vanzin Fix Version/s: 2.1.0 > Make YARN

[jira] [Updated] (SPARK-17422) Update Ganglia project with new license

2016-09-06 Thread Luciano Resende (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luciano Resende updated SPARK-17422: Description: It seems that Ganglia is now BSD licensed http://ganglia.info/ and https://sou

[jira] [Commented] (SPARK-17368) Scala value classes create encoder problems and break at runtime

2016-09-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468833#comment-15468833 ] Jakob Odersky commented on SPARK-17368: --- So I thought about this a bit more and alt

[jira] [Created] (SPARK-17422) Update Ganglia project with new license

2016-09-06 Thread Luciano Resende (JIRA)
Luciano Resende created SPARK-17422: --- Summary: Update Ganglia project with new license Key: SPARK-17422 URL: https://issues.apache.org/jira/browse/SPARK-17422 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Frederick Reiss (JIRA)
Frederick Reiss created SPARK-17421: --- Summary: Warnings about "MaxPermSize" parameter when building with Maven and Java 8 Key: SPARK-17421 URL: https://issues.apache.org/jira/browse/SPARK-17421 Proj

[jira] [Resolved] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-17110. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull reque

[jira] [Commented] (SPARK-17405) Simple aggregation query OOMing after SPARK-16525

2016-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468737#comment-15468737 ] Josh Rosen commented on SPARK-17405: On the Spark Dev list, [~jlaskowski] found a sim

[jira] [Commented] (SPARK-17381) Memory leak org.apache.spark.sql.execution.ui.SQLTaskMetrics

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468738#comment-15468738 ] Davies Liu commented on SPARK-17381: cc [~cloud_fan] > Memory leak org.apache.spark

[jira] [Closed] (SPARK-17384) SQL - Running query with outer join from 1.6 fails

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-17384. -- Resolution: Duplicate Assignee: Herman van Hovell > SQL - Running query with outer join from 1.6

[jira] [Commented] (SPARK-17384) SQL - Running query with outer join from 1.6 fails

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468636#comment-15468636 ] Davies Liu commented on SPARK-17384: This is caused by the SQL parser change, the par

[jira] [Commented] (SPARK-17417) Fix # of partitions for RDD while checkpointing - Currently limited by 10000(%05d)

2016-09-06 Thread Dhruve Ashar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468622#comment-15468622 ] Dhruve Ashar commented on SPARK-17417: -- Thanks for the suggestion. I'll work on the

[jira] [Updated] (SPARK-17299) TRIM/LTRIM/RTRIM strips characters other than spaces

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17299: -- Assignee: Sandeep Singh > TRIM/LTRIM/RTRIM strips characters other than spaces > --

[jira] [Resolved] (SPARK-17299) TRIM/LTRIM/RTRIM strips characters other than spaces

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17299. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull request

[jira] [Resolved] (SPARK-17378) Upgrade snappy-java to 1.1.2.6

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17378. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 1.6.3 Issue

[jira] [Commented] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468583#comment-15468583 ] Davies Liu commented on SPARK-17377: Tested this with latest master and 2.0 on databr

[jira] [Assigned] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-17377: -- Assignee: Davies Liu > Joining Datasets read and aggregated from a partitioned Parquet file gi

[jira] [Commented] (SPARK-17316) Don't block StandaloneSchedulerBackend.executorRemoved

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468566#comment-15468566 ] Apache Spark commented on SPARK-17316: -- User 'zsxwing' has created a pull request fo

[jira] [Updated] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17377: --- Description: Reproduction: 1) Read two Datasets from a partitioned Parquet file with different filt

[jira] [Commented] (SPARK-17420) Install rmarkdown R package on Jenkins machines

2016-09-06 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468539#comment-15468539 ] Shivaram Venkataraman commented on SPARK-17420: --- This came up in https://gi

[jira] [Created] (SPARK-17420) Install rmarkdown R package on Jenkins machines

2016-09-06 Thread Shivaram Venkataraman (JIRA)
Shivaram Venkataraman created SPARK-17420: - Summary: Install rmarkdown R package on Jenkins machines Key: SPARK-17420 URL: https://issues.apache.org/jira/browse/SPARK-17420 Project: Spark

[jira] [Commented] (SPARK-17403) Fatal Error: Scan cached strings

2016-09-06 Thread Ruben Hernando (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468528#comment-15468528 ] Ruben Hernando commented on SPARK-17403: I'm sorry I can't share the data. This

[jira] [Commented] (SPARK-11478) ML StringIndexer return inconsistent schema

2016-09-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468525#comment-15468525 ] Joseph K. Bradley commented on SPARK-11478: --- I'm not sure if it was on purpose.

[jira] [Updated] (SPARK-11478) ML StringIndexer return inconsistent schema

2016-09-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-11478: -- Priority: Minor (was: Major) > ML StringIndexer return inconsistent schema > -

[jira] [Commented] (SPARK-17417) Fix # of partitions for RDD while checkpointing - Currently limited by 10000(%05d)

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468519#comment-15468519 ] Sean Owen commented on SPARK-17417: --- I'd bump the padding to allow 10 digits, because t

[jira] [Created] (SPARK-17419) Mesos virtual network support

2016-09-06 Thread Michael Gummelt (JIRA)
Michael Gummelt created SPARK-17419: --- Summary: Mesos virtual network support Key: SPARK-17419 URL: https://issues.apache.org/jira/browse/SPARK-17419 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-17201) Investigate numerical instability for MLOR without regularization

2016-09-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468475#comment-15468475 ] Joseph K. Bradley commented on SPARK-17201: --- Does this actually resolve the iss

[jira] [Commented] (SPARK-17403) Fatal Error: Scan cached strings

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468449#comment-15468449 ] Davies Liu commented on SPARK-17403: [~rhernando] Could you pull out the string colum

[jira] [Commented] (SPARK-17418) Spark release must NOT distribute Kinesis related artifacts

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468442#comment-15468442 ] Sean Owen commented on SPARK-17418: --- Aha. The problem is this: https://mvnrepository.c

[jira] [Updated] (SPARK-17403) Fatal Error: Scan cached strings

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17403: --- Summary: Fatal Error: Scan cached strings (was: Fatal Error: SIGSEGV on Jdbc joins) > Fatal Error:

[jira] [Commented] (SPARK-17418) Spark release must NOT distribute Kinesis related artifacts

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468410#comment-15468410 ] Sean Owen commented on SPARK-17418: --- I don't see any Kinesis artifacts in the Spark dis

[jira] [Commented] (SPARK-17407) Unable to update structured stream from CSV

2016-09-06 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468372#comment-15468372 ] Seth Hendrickson commented on SPARK-17407: -- [~chriddyp] The file stream source c

[jira] [Commented] (SPARK-17418) Spark release must NOT distribute Kinesis related artifacts

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468362#comment-15468362 ] Apache Spark commented on SPARK-17418: -- User 'lresende' has created a pull request f

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468364#comment-15468364 ] Davies Liu commented on SPARK-16922: Is there any performance difference comparing to
