[jira] [Created] (SPARK-18751) Deadlock when SparkContext.stop is called in Utils.tryOrStopSparkContext

2016-12-06 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-18751: Summary: Deadlock when SparkContext.stop is called in Utils.tryOrStopSparkContext Key: SPARK-18751 URL: https://issues.apache.org/jira/browse/SPARK-18751 Project:

[jira] [Updated] (SPARK-18697) Upgrade sbt plugins

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18697: -- Assignee: Weiqing Yang > Upgrade sbt plugins > --- > > Key:

[jira] [Resolved] (SPARK-18697) Upgrade sbt plugins

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18697. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16159

[jira] [Updated] (SPARK-18652) Include the example data and third-party licenses in pyspark package

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18652: -- Assignee: Shuai Lin > Include the example data and third-party licenses in pyspark package >

[jira] [Resolved] (SPARK-18652) Include the example data and third-party licenses in pyspark package

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18652. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 16082

[jira] [Commented] (SPARK-18731) Task size in K-means is so large

2016-12-06 Thread Xiaoye Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726855#comment-15726855 ] Xiaoye Sun commented on SPARK-18731: The size of the broadcast variable (the model) in my case is

[jira] [Commented] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726837#comment-15726837 ] Sean Owen commented on SPARK-18750: --- Hm, I am not immediately sure how these are related. Where does

[jira] [Commented] (SPARK-18731) Task size in K-means is so large

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726824#comment-15726824 ] Sean Owen commented on SPARK-18731: --- Yes, that's the kind of thing worth looking at. Nothing here is

[jira] [Updated] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2016-12-06 Thread Neerja Khattar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neerja Khattar updated SPARK-18750: --- Description: When running SQL queries on large datasets, the job fails with stack overflow

[jira] [Created] (SPARK-18750) spark should be able to control the number of executor and should not throw stack overslow

2016-12-06 Thread Neerja Khattar (JIRA)
Neerja Khattar created SPARK-18750: -- Summary: spark should be able to control the number of executor and should not throw stack overslow Key: SPARK-18750 URL: https://issues.apache.org/jira/browse/SPARK-18750

[jira] [Updated] (SPARK-18737) Serialization setting "spark.serializer" ignored in Spark 2.x

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18737: -- Flags: (was: Important) Priority: Major (was: Blocker) Don't set Blocker, etc. Please read

[jira] [Updated] (SPARK-18744) Remove workaround for Netty memory leak

2016-12-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18744: - Affects Version/s: 2.1.0 > Remove workaround for Netty memory leak >

[jira] [Resolved] (SPARK-18744) Remove workaround for Netty memory leak

2016-12-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18744. -- Resolution: Fixed Fix Version/s: 2.2.0 > Remove workaround for Netty memory leak >

[jira] [Commented] (SPARK-8617) Handle history files better

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726761#comment-15726761 ] Apache Spark commented on SPARK-8617: - User 'seyfe' has created a pull request for this issue:

[jira] [Assigned] (SPARK-8617) Handle history files better

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-8617: --- Assignee: (was: Apache Spark) > Handle history files better >

[jira] [Assigned] (SPARK-8617) Handle history files better

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-8617: --- Assignee: Apache Spark > Handle history files better > --- > >

[jira] [Updated] (SPARK-18374) Incorrect words in StopWords/english.txt

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18374: -- Assignee: yuhao yang Labels: releasenotes (was: ) > Incorrect words in StopWords/english.txt >

[jira] [Updated] (SPARK-18374) Incorrect words in StopWords/english.txt

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18374: -- Priority: Minor (was: Major) > Incorrect words in StopWords/english.txt >

[jira] [Resolved] (SPARK-18671) Add tests to ensure stability of that all Structured Streaming log formats

2016-12-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18671. -- Resolution: Fixed Assignee: Tathagata Das Fix Version/s: 2.1.0 > Add tests to

[jira] [Resolved] (SPARK-18374) Incorrect words in StopWords/english.txt

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18374. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16103

[jira] [Assigned] (SPARK-17760) DataFrame's pivot doesn't see column created in groupBy

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17760: Assignee: Apache Spark > DataFrame's pivot doesn't see column created in groupBy >

[jira] [Assigned] (SPARK-17760) DataFrame's pivot doesn't see column created in groupBy

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17760: Assignee: (was: Apache Spark) > DataFrame's pivot doesn't see column created in

[jira] [Commented] (SPARK-17760) DataFrame's pivot doesn't see column created in groupBy

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726715#comment-15726715 ] Apache Spark commented on SPARK-17760: -- User 'aray' has created a pull request for this issue:

[jira] [Updated] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception

2016-12-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18681: Target Version/s: 2.1.0 > Throw Filtering is supported only on partition keys of type string

[jira] [Commented] (SPARK-18733) Spark history server file cleaner excludes in-progress files

2016-12-06 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726701#comment-15726701 ] Thomas Graves commented on SPARK-18733: --- Oh, never mind, it's looking at the last-updated time. > Spark

[jira] [Commented] (SPARK-18209) More robust view canonicalization without full SQL expansion

2016-12-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726694#comment-15726694 ] Reynold Xin commented on SPARK-18209: - I took a look at the change quickly and here are my high level

[jira] [Comment Edited] (SPARK-18209) More robust view canonicalization without full SQL expansion

2016-12-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726694#comment-15726694 ] Reynold Xin edited comment on SPARK-18209 at 12/6/16 9:01 PM: -- I took a look

[jira] [Commented] (SPARK-18733) Spark history server file cleaner excludes in-progress files

2016-12-06 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726671#comment-15726671 ] Thomas Graves commented on SPARK-18733: --- yes looks like a dup but I'm not sure on current solution.

[jira] [Commented] (SPARK-18728) Consider using Algebird's Aggregator instead of org.apache.spark.sql.expressions.Aggregator

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726647#comment-15726647 ] Sean Owen commented on SPARK-18728: --- It's pretty much what I mentioned there: another dependency in the

[jira] [Resolved] (SPARK-18733) Spark history server file cleaner excludes in-progress files

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18733. --- Resolution: Duplicate > Spark history server file cleaner excludes in-progress files >

[jira] [Closed] (SPARK-18749) CLONE - checkpointLocation being set in memory streams fail after restart. Should fail fast

2016-12-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust closed SPARK-18749. Resolution: Invalid > CLONE - checkpointLocation being set in memory streams fail after

[jira] [Created] (SPARK-18749) CLONE - checkpointLocation being set in memory streams fail after restart. Should fail fast

2016-12-06 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-18749: Summary: CLONE - checkpointLocation being set in memory streams fail after restart. Should fail fast Key: SPARK-18749 URL:

[jira] [Closed] (SPARK-17921) checkpointLocation being set in memory streams fail after restart. Should fail fast

2016-12-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust closed SPARK-17921. Resolution: Won't Fix > checkpointLocation being set in memory streams fail after restart.

[jira] [Updated] (SPARK-18744) Remove workaround for Netty memory leak

2016-12-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18744: -- Priority: Minor (was: Major) > Remove workaround for Netty memory leak >

[jira] [Closed] (SPARK-18747) UDF multiple evaluations causes very poor performance

2016-12-06 Thread Ohad Raviv (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ohad Raviv closed SPARK-18747. -- Resolution: Duplicate > UDF multiple evaluations causes very poor performance >

[jira] [Updated] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-06 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-18745: - Affects Version/s: 2.2.0 > java.lang.IndexOutOfBoundsException running query 68 Spark

[jira] [Assigned] (SPARK-18746) Add newBigDecimalEncoder

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18746: Assignee: (was: Apache Spark) > Add newBigDecimalEncoder > >

[jira] [Commented] (SPARK-18746) Add newBigDecimalEncoder

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726504#comment-15726504 ] Apache Spark commented on SPARK-18746: -- User 'weiqingy' has created a pull request for this issue:

[jira] [Resolved] (SPARK-18714) SparkSession.time - a simple timer function

2016-12-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18714. --- Resolution: Fixed Fix Version/s: 2.1.0 > SparkSession.time - a simple timer

[jira] [Assigned] (SPARK-18746) Add newBigDecimalEncoder

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18746: Assignee: Apache Spark > Add newBigDecimalEncoder > > >

[jira] [Commented] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-06 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726486#comment-15726486 ] Kazuaki Ishizaki commented on SPARK-18745: -- I work with [~jfc...@us.ibm.com] >

[jira] [Created] (SPARK-18748) UDF multiple evaluations causes very poor performance

2016-12-06 Thread Ohad Raviv (JIRA)
Ohad Raviv created SPARK-18748: -- Summary: UDF multiple evaluations causes very poor performance Key: SPARK-18748 URL: https://issues.apache.org/jira/browse/SPARK-18748 Project: Spark Issue

[jira] [Created] (SPARK-18747) UDF multiple evaluations causes very poor performance

2016-12-06 Thread Ohad Raviv (JIRA)
Ohad Raviv created SPARK-18747: -- Summary: UDF multiple evaluations causes very poor performance Key: SPARK-18747 URL: https://issues.apache.org/jira/browse/SPARK-18747 Project: Spark Issue

[jira] [Created] (SPARK-18746) Add newBigDecimalEncoder

2016-12-06 Thread Weiqing Yang (JIRA)
Weiqing Yang created SPARK-18746: Summary: Add newBigDecimalEncoder Key: SPARK-18746 URL: https://issues.apache.org/jira/browse/SPARK-18746 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception

2016-12-06 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726464#comment-15726464 ] Michael Allman commented on SPARK-18681: [~rxin] I think this should be a blocker for 2.1. This

[jira] [Updated] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-06 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JESSE CHEN updated SPARK-18745: --- Labels: (was: core dump) > java.lang.IndexOutOfBoundsException running query 68 Spark SQL on

[jira] [Updated] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-06 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JESSE CHEN updated SPARK-18745: --- Description: Running query 68 with decreased executor memory (using 12GB executors instead of 24GB)

[jira] [Comment Edited] (SPARK-18731) Task size in K-means is so large

2016-12-06 Thread Xiaoye Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726277#comment-15726277 ] Xiaoye Sun edited comment on SPARK-18731 at 12/6/16 7:28 PM: - Hi Sean, Here

[jira] [Created] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-06 Thread JESSE CHEN (JIRA)
JESSE CHEN created SPARK-18745: -- Summary: java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB) Key: SPARK-18745 URL: https://issues.apache.org/jira/browse/SPARK-18745 Project: Spark

[jira] [Commented] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-06 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726437#comment-15726437 ] Michael Allman commented on SPARK-18676: > maybe we could switch to ShuffleJoin when it realize

[jira] [Commented] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-06 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726431#comment-15726431 ] Michael Allman commented on SPARK-18676: I'm spending some more time this week to understand

[jira] [Comment Edited] (SPARK-18731) Task size in K-means is so large

2016-12-06 Thread Xiaoye Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726277#comment-15726277 ] Xiaoye Sun edited comment on SPARK-18731 at 12/6/16 7:19 PM: - Hi Sean, Here

[jira] [Commented] (SPARK-18744) Remove workaround for Netty memory leak

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726411#comment-15726411 ] Apache Spark commented on SPARK-18744: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18744) Remove workaround for Netty memory leak

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18744: Assignee: Apache Spark (was: Shixiong Zhu) > Remove workaround for Netty memory leak >

[jira] [Assigned] (SPARK-18744) Remove workaround for Netty memory leak

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18744: Assignee: Shixiong Zhu (was: Apache Spark) > Remove workaround for Netty memory leak >

[jira] [Created] (SPARK-18744) Remove workaround for Netty memory leak

2016-12-06 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-18744: Summary: Remove workaround for Netty memory leak Key: SPARK-18744 URL: https://issues.apache.org/jira/browse/SPARK-18744 Project: Spark Issue Type:

[jira] [Updated] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-06 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-18676: --- Description: Commit [c481bdf|https://github.com/apache/spark/commit/c481bdf] significantly

[jira] [Assigned] (SPARK-17460) Dataset.joinWith broadcasts gigabyte sized table, causes OOM Exception

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17460: Assignee: Apache Spark > Dataset.joinWith broadcasts gigabyte sized table, causes OOM

[jira] [Assigned] (SPARK-17460) Dataset.joinWith broadcasts gigabyte sized table, causes OOM Exception

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17460: Assignee: (was: Apache Spark) > Dataset.joinWith broadcasts gigabyte sized table,

[jira] [Commented] (SPARK-17460) Dataset.joinWith broadcasts gigabyte sized table, causes OOM Exception

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726314#comment-15726314 ] Apache Spark commented on SPARK-17460: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Commented] (SPARK-18512) FileNotFoundException on _temporary directory with Spark Streaming 2.0.1 and S3A

2016-12-06 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726310#comment-15726310 ] Steve Loughran commented on SPARK-18512: It'd be good to get some more details from people who

[jira] [Commented] (SPARK-18733) Spark history server file cleaner excludes in-progress files

2016-12-06 Thread Ergin Seyfe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726292#comment-15726292 ] Ergin Seyfe commented on SPARK-18733: - Hi [~vanzin]. I searched the Jira before creating a new one

[jira] [Commented] (SPARK-18731) Task size in K-means is so large

2016-12-06 Thread Xiaoye Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726277#comment-15726277 ] Xiaoye Sun commented on SPARK-18731: Hi Sean, Here is a part of the output I collected at the

[jira] [Resolved] (SPARK-18740) Log spark.app.name in driver log

2016-12-06 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-18740. Resolution: Fixed Assignee: Peter Ableda Fix Version/s: 2.2.0 > Log

[jira] [Commented] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726247#comment-15726247 ] Davies Liu commented on SPARK-18676: What do the schema and plan of the child look like? It's

[jira] [Commented] (SPARK-18736) CreateMap allows non-unique keys

2016-12-06 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726232#comment-15726232 ] Eyal Farago commented on SPARK-18736: - @shuai Lin, I already have a pr in progress that addresses the

[jira] [Commented] (SPARK-18512) FileNotFoundException on _temporary directory with Spark Streaming 2.0.1 and S3A

2016-12-06 Thread Adrian Bridgett (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726148#comment-15726148 ] Adrian Bridgett commented on SPARK-18512: - Thanks Steve - just seen a "next few weeks" mentioned

[jira] [Updated] (SPARK-18743) StreamingContext.textFileStream(directory) has no events shown in Web UI

2016-12-06 Thread Viktor Vojnovski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Vojnovski updated SPARK-18743: - Attachment: screenshot-1.png > StreamingContext.textFileStream(directory) has no events

[jira] [Commented] (SPARK-18728) Consider using Algebird's Aggregator instead of org.apache.spark.sql.expressions.Aggregator

2016-12-06 Thread Mansur Ashraf (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726142#comment-15726142 ] Mansur Ashraf commented on SPARK-18728: --- Hi Sean, Dataset API has removed 'aggregateByKey` and

[jira] [Updated] (SPARK-18743) StreamingContext.textFileStream(directory) has no events shown in Web UI

2016-12-06 Thread Viktor Vojnovski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Vojnovski updated SPARK-18743: - Description: StreamingContext.textFileStream input is not reflected in the Web UI, ie.

[jira] [Commented] (SPARK-18743) StreamingContext.textFileStream(directory) has no events shown in Web UI

2016-12-06 Thread Viktor Vojnovski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726138#comment-15726138 ] Viktor Vojnovski commented on SPARK-18743: -- A similar issue:

[jira] [Commented] (SPARK-18512) FileNotFoundException on _temporary directory with Spark Streaming 2.0.1 and S3A

2016-12-06 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726135#comment-15726135 ] Steve Loughran commented on SPARK-18512: ah, the "when will 2.8 ship" question. Really close, Jun

[jira] [Created] (SPARK-18743) StreamingContext.textFileStream(directory) has no events shown in Web UI

2016-12-06 Thread Viktor Vojnovski (JIRA)
Viktor Vojnovski created SPARK-18743: Summary: StreamingContext.textFileStream(directory) has no events shown in Web UI Key: SPARK-18743 URL: https://issues.apache.org/jira/browse/SPARK-18743

[jira] [Commented] (SPARK-18733) Spark history server file cleaner excludes in-progress files

2016-12-06 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726084#comment-15726084 ] Marcelo Vanzin commented on SPARK-18733: This is basically a dupe of SPARK-8617; there have been

[jira] [Commented] (SPARK-18736) CreateMap allows non-unique keys

2016-12-06 Thread Shuai Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726074#comment-15726074 ] Shuai Lin commented on SPARK-18736: --- If the keys are all literals, then we can detect and remove the

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2016-12-06 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726053#comment-15726053 ] Marcelo Vanzin commented on SPARK-18085: bq. Since SHS keeps this data in memory I don't see how

[jira] [Assigned] (SPARK-18741) Reuse/Explicitly clean-up SparkContext in Streaming tests

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18741: Assignee: Apache Spark > Reuse/Explicitly clean-up SparkContext in Streaming tests >

[jira] [Commented] (SPARK-18741) Reuse/Explicitly clean-up SparkContext in Streaming tests

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726041#comment-15726041 ] Apache Spark commented on SPARK-18741: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18741) Reuse/Explicitly clean-up SparkContext in Streaming tests

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18741: Assignee: (was: Apache Spark) > Reuse/Explicitly clean-up SparkContext in Streaming

[jira] [Assigned] (SPARK-18742) readd spark.broadcast.factory for user-defined BroadcastFactory

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18742: Assignee: (was: Apache Spark) > readd spark.broadcast.factory for user-defined

[jira] [Assigned] (SPARK-18742) readd spark.broadcast.factory for user-defined BroadcastFactory

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18742: Assignee: Apache Spark > readd spark.broadcast.factory for user-defined BroadcastFactory

[jira] [Commented] (SPARK-18742) readd spark.broadcast.factory for user-defined BroadcastFactory

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725942#comment-15725942 ] Apache Spark commented on SPARK-18742: -- User 'windpiger' has created a pull request for this issue:

[jira] [Created] (SPARK-18742) readd spark.broadcast.factory for user-defined BroadcastFactory

2016-12-06 Thread Song Jun (JIRA)
Song Jun created SPARK-18742: Summary: readd spark.broadcast.factory for user-defined BroadcastFactory Key: SPARK-18742 URL: https://issues.apache.org/jira/browse/SPARK-18742 Project: Spark

[jira] [Comment Edited] (SPARK-18085) Better History Server scalability for many / large applications

2016-12-06 Thread Dmitry Buzolin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725880#comment-15725880 ] Dmitry Buzolin edited comment on SPARK-18085 at 12/6/16 3:58 PM: - Spark

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2016-12-06 Thread Dmitry Buzolin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725880#comment-15725880 ] Dmitry Buzolin commented on SPARK-18085: Spark log size is directly depending on few things: -

[jira] [Created] (SPARK-18741) Reuse/Explicitly clean-up SparkContext in Streaming tests

2016-12-06 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-18741: - Summary: Reuse/Explicitly clean-up SparkContext in Streaming tests Key: SPARK-18741 URL: https://issues.apache.org/jira/browse/SPARK-18741 Project: Spark

[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2016-12-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725790#comment-15725790 ] Herman van Hovell commented on SPARK-650: - A creatively applied broadcast variable might also do

[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2016-12-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725784#comment-15725784 ] Herman van Hovell commented on SPARK-650: - If you only try to propagate information, then you can

[jira] [Assigned] (SPARK-18740) Log spark.app.name in driver log

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18740: Assignee: (was: Apache Spark) > Log spark.app.name in driver log >

[jira] [Assigned] (SPARK-18740) Log spark.app.name in driver log

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18740: Assignee: Apache Spark > Log spark.app.name in driver log >

[jira] [Commented] (SPARK-18740) Log spark.app.name in driver log

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725633#comment-15725633 ] Apache Spark commented on SPARK-18740: -- User 'peterableda' has created a pull request for this

[jira] [Commented] (SPARK-18740) Log spark.app.name in driver log

2016-12-06 Thread Peter Ableda (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725584#comment-15725584 ] Peter Ableda commented on SPARK-18740: -- I will create a pull request for this change. > Log

[jira] [Updated] (SPARK-18740) Log spark.app.name in driver log

2016-12-06 Thread Peter Ableda (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Ableda updated SPARK-18740: - Priority: Minor (was: Major) > Log spark.app.name in driver log >

[jira] [Created] (SPARK-18740) Log spark.app.name in driver log

2016-12-06 Thread Peter Ableda (JIRA)
Peter Ableda created SPARK-18740: Summary: Log spark.app.name in driver log Key: SPARK-18740 URL: https://issues.apache.org/jira/browse/SPARK-18740 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-18736) CreateMap allows non-unique keys

2016-12-06 Thread Shuai Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725579#comment-15725579 ] Shuai Lin commented on SPARK-18736: --- I can work on this. > CreateMap allows non-unique keys >

[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2016-12-06 Thread Michael Schmeißer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725542#comment-15725542 ] Michael Schmeißer commented on SPARK-650: - Sure it can be included in the closure and this was also

[jira] [Commented] (SPARK-18230) MatrixFactorizationModel.recommendProducts throws NoSuchElement exception when the user does not exist

2016-12-06 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725420#comment-15725420 ] chris snow commented on SPARK-18230: If you are trying to indicate a non existing product or user

[jira] [Comment Edited] (SPARK-18230) MatrixFactorizationModel.recommendProducts throws NoSuchElement exception when the user does not exist

2016-12-06 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725420#comment-15725420 ] chris snow edited comment on SPARK-18230 at 12/6/16 12:56 PM: -- If you are

[jira] [Assigned] (SPARK-18739) Models in pyspark.classification support setXXXCol methods

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18739: Assignee: Apache Spark > Models in pyspark.classification support setXXXCol methods >

[jira] [Assigned] (SPARK-18739) Models in pyspark.classification support setXXXCol methods

2016-12-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18739: Assignee: (was: Apache Spark) > Models in pyspark.classification support setXXXCol
