[jira] [Commented] (SPARK-17094) provide simplified API for ML pipeline

2016-09-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469609#comment-15469609 ] yuhao yang commented on SPARK-17094: Something like Stanford CoreNLP pipeline:

[jira] [Assigned] (SPARK-17427) function SIZE should return -1 when parameter is null

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17427: Assignee: (was: Apache Spark) > function SIZE should return -1 when parameter is null

[jira] [Commented] (SPARK-17427) function SIZE should return -1 when parameter is null

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469569#comment-15469569 ] Apache Spark commented on SPARK-17427: -- User 'adrian-wang' has created a pull request for this

[jira] [Assigned] (SPARK-17427) function SIZE should return -1 when parameter is null

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17427: Assignee: Apache Spark > function SIZE should return -1 when parameter is null >

[jira] [Created] (SPARK-17427) function SIZE should return -1 when parameter is null

2016-09-06 Thread Adrian Wang (JIRA)
Adrian Wang created SPARK-17427: --- Summary: function SIZE should return -1 when parameter is null Key: SPARK-17427 URL: https://issues.apache.org/jira/browse/SPARK-17427 Project: Spark Issue

[jira] [Commented] (SPARK-17368) Scala value classes create encoder problems and break at runtime

2016-09-06 Thread Aris Vlasakakis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469564#comment-15469564 ] Aris Vlasakakis commented on SPARK-17368: - It goes from inconvenient to actually prohibitive in a

[jira] [Assigned] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17426: Assignee: (was: Apache Spark) > Current TreeNode.toJSON may trigger OOM under some

[jira] [Assigned] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17426: Assignee: Apache Spark > Current TreeNode.toJSON may trigger OOM under some corner cases

[jira] [Commented] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469557#comment-15469557 ] Apache Spark commented on SPARK-17426: -- User 'clockfly' has created a pull request for this issue:

[jira] [Updated] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17426: --- Target Version/s: 2.1.0 Description: In SPARK-17356, we fix the OOM issue when Metadata is

[jira] [Updated] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17426: --- Description: In SPARK-17356, we fix the OOM issue when Metadata is super big. There are other cases

[jira] [Updated] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17426: --- Component/s: SQL > Current TreeNode.toJSON may trigger OOM under some corner cases >

[jira] [Created] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17426: -- Summary: Current TreeNode.toJSON may trigger OOM under some corner cases Key: SPARK-17426 URL: https://issues.apache.org/jira/browse/SPARK-17426 Project: Spark

[jira] [Commented] (SPARK-17405) Simple aggregation query OOMing after SPARK-16525

2016-09-06 Thread Jacek Laskowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469487#comment-15469487 ] Jacek Laskowski commented on SPARK-17405: - It definitely got better with the build today Sept,

[jira] [Updated] (SPARK-6235) Address various 2G limits

2016-09-06 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-6235: --- Attachment: (was: SPARK-6235_Design_V0.01.pdf) > Address various 2G limits >

[jira] [Updated] (SPARK-6235) Address various 2G limits

2016-09-06 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-6235: --- Attachment: SPARK-6235_Design_V0.02.pdf > Address various 2G limits > - > >

[jira] [Commented] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-09-06 Thread Tomer Kaftan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469295#comment-15469295 ] Tomer Kaftan commented on SPARK-17110: -- Thanks all who helped out with this! > Pyspark with

[jira] [Resolved] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-17372. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by

[jira] [Updated] (SPARK-17279) better error message for exceptions during ScalaUDF execution

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17279: Fix Version/s: 2.0.1 > better error message for exceptions during ScalaUDF execution >

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReuseExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Summary: Override sameResult in HiveTableScanExec to make ReuseExchange work in text format table

[jira] [Assigned] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17425: Assignee: (was: Apache Spark) > Override sameResult in HiveTableScanExec to make

[jira] [Commented] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469249#comment-15469249 ] Apache Spark commented on SPARK-17425: -- User 'watermen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17425: Assignee: Apache Spark > Override sameResult in HiveTableScanExec to make ReusedExchange

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} SELECT * FROM src t1 JOIN

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} SELECT 1 FROM src t1 JOIN

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} SELECT 1 FROM src t1 JOIN

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} SELECT 1 FROM src t1 JOIN

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} SELECT 1 FROM src t1 JOIN

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Affects Version/s: 2.0.0 Component/s: SQL > Override sameResult in HiveTableScanExec to make

[jira] [Created] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
Yadong Qi created SPARK-17425: - Summary: Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table Key: SPARK-17425 URL: https://issues.apache.org/jira/browse/SPARK-17425

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Description: When I run the below SQL(table src is text format): {code:sql} select 1 from src t1 join

[jira] [Resolved] (SPARK-17238) simplify the logic for converting data source table into hive compatible format

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17238. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14809

[jira] [Updated] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-17372: -- Description: When we create a filestream on a directory that has partitioned subdirs (i.e.

[jira] [Updated] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-17372: -- Description: When we create a filestream on a directory that has partitioned subdirs (i.e.

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-09-06 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469092#comment-15469092 ] Sital Kedia commented on SPARK-16922: - There is no noticeable performance gain I observed comparing to

[jira] [Commented] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469072#comment-15469072 ] Apache Spark commented on SPARK-17372: -- User 'tdas' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17372: Assignee: Tathagata Das (was: Apache Spark) > Running a file stream on a directory with

[jira] [Assigned] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17372: Assignee: Apache Spark (was: Tathagata Das) > Running a file stream on a directory with

[jira] [Updated] (SPARK-17408) Flaky test: org.apache.spark.sql.hive.StatisticsSuite

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17408: Assignee: Xiao Li > Flaky test: org.apache.spark.sql.hive.StatisticsSuite >

[jira] [Resolved] (SPARK-17408) Flaky test: org.apache.spark.sql.hive.StatisticsSuite

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17408. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14978

[jira] [Updated] (SPARK-17371) Resubmitted stage outputs deleted by zombie map tasks on stop()

2016-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-17371: --- Assignee: Eric Liang > Resubmitted stage outputs deleted by zombie map tasks on stop() >

[jira] [Resolved] (SPARK-17371) Resubmitted stage outputs deleted by zombie map tasks on stop()

2016-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-17371. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14932

[jira] [Commented] (SPARK-17423) Support IGNORE NULLS option in Window functions

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468973#comment-15468973 ] Herman van Hovell commented on SPARK-17423: --- IMO LEAD and LAG with ignore nulls seem a bit

[jira] [Assigned] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17421: Assignee: Apache Spark > Warnings about "MaxPermSize" parameter when building with Maven

[jira] [Commented] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468943#comment-15468943 ] Apache Spark commented on SPARK-17421: -- User 'frreiss' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17421: Assignee: (was: Apache Spark) > Warnings about "MaxPermSize" parameter when building

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468932#comment-15468932 ] Ryan Blue commented on SPARK-17396: --- I opened a PR with a fix. It still uses a ForkJoinPool because the

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468929#comment-15468929 ] Apache Spark commented on SPARK-17396: -- User 'rdblue' has created a pull request for this issue:

[jira] [Commented] (SPARK-17296) Spark SQL: cross join + two joins = BUG

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468928#comment-15468928 ] Apache Spark commented on SPARK-17296: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17396: Assignee: Apache Spark > Threads number keep increasing when query on external CSV

[jira] [Assigned] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17396: Assignee: (was: Apache Spark) > Threads number keep increasing when query on external

[jira] [Updated] (SPARK-17424) Dataset job fails from unsound substitution in ScalaReflect

2016-09-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-17424: -- Description: I have a job that uses datasets in 1.6.1 and is failing with this error: {code} 16/09/02

[jira] [Created] (SPARK-17424) Dataset job fails from unsound substitution in ScalaReflect

2016-09-06 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-17424: - Summary: Dataset job fails from unsound substitution in ScalaReflect Key: SPARK-17424 URL: https://issues.apache.org/jira/browse/SPARK-17424 Project: Spark Issue

[jira] [Commented] (SPARK-16026) Cost-based Optimizer framework

2016-09-06 Thread Srinath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468891#comment-15468891 ] Srinath commented on SPARK-16026: - I have a couple of comments/questions on the proposal. Regarding the

[jira] [Resolved] (SPARK-17253) Left join where ON clause does not reference the right table produces analysis error

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17253. --- Resolution: Duplicate Assignee: Herman van Hovell Fix Version/s:

[jira] [Resolved] (SPARK-17296) Spark SQL: cross join + two joins = BUG

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17296. --- Resolution: Fixed Fix Version/s: 2.1.0 > Spark SQL: cross join + two joins =

[jira] [Assigned] (SPARK-17296) Spark SQL: cross join + two joins = BUG

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell reassigned SPARK-17296: - Assignee: Herman van Hovell > Spark SQL: cross join + two joins = BUG >

[jira] [Created] (SPARK-17423) Support IGNORE NULLS option in Window functions

2016-09-06 Thread Tim Chan (JIRA)
Tim Chan created SPARK-17423: Summary: Support IGNORE NULLS option in Window functions Key: SPARK-17423 URL: https://issues.apache.org/jira/browse/SPARK-17423 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-17368) Scala value classes create encoder problems and break at runtime

2016-09-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468833#comment-15468833 ] Jakob Odersky edited comment on SPARK-17368 at 9/6/16 10:57 PM: So I

[jira] [Resolved] (SPARK-15891) Make YARN logs less noisy

2016-09-06 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-15891. Resolution: Fixed Assignee: Marcelo Vanzin Fix Version/s: 2.1.0 > Make

[jira] [Updated] (SPARK-17422) Update Ganglia project with new license

2016-09-06 Thread Luciano Resende (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luciano Resende updated SPARK-17422: Description: It seems that Ganglia is now BSD licensed http://ganglia.info/ and

[jira] [Commented] (SPARK-17368) Scala value classes create encoder problems and break at runtime

2016-09-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468833#comment-15468833 ] Jakob Odersky commented on SPARK-17368: --- So I thought about this a bit more and although it is

[jira] [Created] (SPARK-17422) Update Ganglia project with new license

2016-09-06 Thread Luciano Resende (JIRA)
Luciano Resende created SPARK-17422: --- Summary: Update Ganglia project with new license Key: SPARK-17422 URL: https://issues.apache.org/jira/browse/SPARK-17422 Project: Spark Issue Type:

[jira] [Created] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Frederick Reiss (JIRA)
Frederick Reiss created SPARK-17421: --- Summary: Warnings about "MaxPermSize" parameter when building with Maven and Java 8 Key: SPARK-17421 URL: https://issues.apache.org/jira/browse/SPARK-17421

[jira] [Resolved] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-17110. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Commented] (SPARK-17405) Simple aggregation query OOMing after SPARK-16525

2016-09-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468737#comment-15468737 ] Josh Rosen commented on SPARK-17405: On the Spark Dev list, [~jlaskowski] found a simpler example

[jira] [Commented] (SPARK-17381) Memory leak org.apache.spark.sql.execution.ui.SQLTaskMetrics

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468738#comment-15468738 ] Davies Liu commented on SPARK-17381: cc [~cloud_fan] > Memory leak

[jira] [Closed] (SPARK-17384) SQL - Running query with outer join from 1.6 fails

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-17384. -- Resolution: Duplicate Assignee: Herman van Hovell > SQL - Running query with outer join from 1.6

[jira] [Commented] (SPARK-17384) SQL - Running query with outer join from 1.6 fails

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468636#comment-15468636 ] Davies Liu commented on SPARK-17384: This is caused by the SQL parser change, the parsed plan in 1.6:

[jira] [Commented] (SPARK-17417) Fix # of partitions for RDD while checkpointing - Currently limited by 10000(%05d)

2016-09-06 Thread Dhruve Ashar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468622#comment-15468622 ] Dhruve Ashar commented on SPARK-17417: -- Thanks for the suggestion. I'll work on the changes and

[jira] [Updated] (SPARK-17299) TRIM/LTRIM/RTRIM strips characters other than spaces

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17299: -- Assignee: Sandeep Singh > TRIM/LTRIM/RTRIM strips characters other than spaces >

[jira] [Resolved] (SPARK-17299) TRIM/LTRIM/RTRIM strips characters other than spaces

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17299. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Resolved] (SPARK-17378) Upgrade snappy-java to 1.1.2.6

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17378. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 1.6.3

[jira] [Commented] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468583#comment-15468583 ] Davies Liu commented on SPARK-17377: Tested this with latest master and 2.0 on databricks[1], they

[jira] [Assigned] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-17377: -- Assignee: Davies Liu > Joining Datasets read and aggregated from a partitioned Parquet file

[jira] [Commented] (SPARK-17316) Don't block StandaloneSchedulerBackend.executorRemoved

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468566#comment-15468566 ] Apache Spark commented on SPARK-17316: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Updated] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17377: --- Description: Reproduction: 1) Read two Datasets from a partitioned Parquet file with different

[jira] [Commented] (SPARK-17420) Install rmarkdown R package on Jenkins machines

2016-09-06 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468539#comment-15468539 ] Shivaram Venkataraman commented on SPARK-17420: --- This came up in

[jira] [Created] (SPARK-17420) Install rmarkdown R package on Jenkins machines

2016-09-06 Thread Shivaram Venkataraman (JIRA)
Shivaram Venkataraman created SPARK-17420: - Summary: Install rmarkdown R package on Jenkins machines Key: SPARK-17420 URL: https://issues.apache.org/jira/browse/SPARK-17420 Project: Spark

[jira] [Commented] (SPARK-17403) Fatal Error: Scan cached strings

2016-09-06 Thread Ruben Hernando (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468528#comment-15468528 ] Ruben Hernando commented on SPARK-17403: I'm sorry I can't share the data. This is a 2 tables

[jira] [Commented] (SPARK-11478) ML StringIndexer return inconsistent schema

2016-09-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468525#comment-15468525 ] Joseph K. Bradley commented on SPARK-11478: --- I'm not sure if it was on purpose. I could see

[jira] [Updated] (SPARK-11478) ML StringIndexer return inconsistent schema

2016-09-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-11478: -- Priority: Minor (was: Major) > ML StringIndexer return inconsistent schema >

[jira] [Commented] (SPARK-17417) Fix # of partitions for RDD while checkpointing - Currently limited by 10000(%05d)

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468519#comment-15468519 ] Sean Owen commented on SPARK-17417: --- I'd bump the padding to allow 10 digits, because that would

[jira] [Created] (SPARK-17419) Mesos virtual network support

2016-09-06 Thread Michael Gummelt (JIRA)
Michael Gummelt created SPARK-17419: --- Summary: Mesos virtual network support Key: SPARK-17419 URL: https://issues.apache.org/jira/browse/SPARK-17419 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-17201) Investigate numerical instability for MLOR without regularization

2016-09-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468475#comment-15468475 ] Joseph K. Bradley commented on SPARK-17201: --- Does this actually resolve the issue? The quotes

[jira] [Commented] (SPARK-17403) Fatal Error: Scan cached strings

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468449#comment-15468449 ] Davies Liu commented on SPARK-17403: [~rhernando] Could you pull out the string column (SL_RD_ColR_N)

[jira] [Commented] (SPARK-17418) Spark release must NOT distribute Kinesis related artifacts

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468442#comment-15468442 ] Sean Owen commented on SPARK-17418: --- Aha. The problem is this:

[jira] [Updated] (SPARK-17403) Fatal Error: Scan cached strings

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17403: --- Summary: Fatal Error: Scan cached strings (was: Fatal Error: SIGSEGV on Jdbc joins) > Fatal Error:

[jira] [Commented] (SPARK-17418) Spark release must NOT distribute Kinesis related artifacts

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468410#comment-15468410 ] Sean Owen commented on SPARK-17418: --- I don't see any Kinesis artifacts in the Spark distribution. Where

[jira] [Commented] (SPARK-17407) Unable to update structured stream from CSV

2016-09-06 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468372#comment-15468372 ] Seth Hendrickson commented on SPARK-17407: -- [~chriddyp] The file stream source checks for new

[jira] [Commented] (SPARK-17418) Spark release must NOT distribute Kinesis related artifacts

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468362#comment-15468362 ] Apache Spark commented on SPARK-17418: -- User 'lresende' has created a pull request for this issue:

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468364#comment-15468364 ] Davies Liu commented on SPARK-16922: Is there any performance difference comparing to

[jira] [Assigned] (SPARK-17418) Spark release must NOT distribute Kinesis related artifacts

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17418: Assignee: Apache Spark > Spark release must NOT distribute Kinesis related artifacts >

[jira] [Assigned] (SPARK-17418) Spark release must NOT distribute Kinesis related artifacts

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17418: Assignee: (was: Apache Spark) > Spark release must NOT distribute Kinesis related

[jira] [Commented] (SPARK-17418) Spark release must NOT distribute Kinesis related artifacts

2016-09-06 Thread Luciano Resende (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468348#comment-15468348 ] Luciano Resende commented on SPARK-17418: - I am going to create a PR for this, basically removing

[jira] [Created] (SPARK-17418) Spark release must NOT distribute Kinesis related artifacts

2016-09-06 Thread Luciano Resende (JIRA)
Luciano Resende created SPARK-17418: --- Summary: Spark release must NOT distribute Kinesis related artifacts Key: SPARK-17418 URL: https://issues.apache.org/jira/browse/SPARK-17418 Project: Spark

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-09-06 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468339#comment-15468339 ] Sital Kedia commented on SPARK-16922: - [~davies] - Thanks for looking into this. I tested the

[jira] [Created] (SPARK-17417) Fix # of partitions for RDD while checkpointing - Currently limited by 10000(%05d)

2016-09-06 Thread Dhruve Ashar (JIRA)
Dhruve Ashar created SPARK-17417: Summary: Fix # of partitions for RDD while checkpointing - Currently limited by 10000(%05d) Key: SPARK-17417 URL: https://issues.apache.org/jira/browse/SPARK-17417

[jira] [Assigned] (SPARK-17317) Add package vignette to SparkR

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17317: Assignee: Apache Spark > Add package vignette to SparkR > --

[jira] [Commented] (SPARK-17317) Add package vignette to SparkR

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468250#comment-15468250 ] Apache Spark commented on SPARK-17317: -- User 'junyangq' has created a pull request for this issue:
