[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963815#comment-15963815 ] Wenchen Fan commented on SPARK-12837: - [~Tagar] due to some implementation limitations, in Spark SQL

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963812#comment-15963812 ] Ruslan Dautkhanov commented on SPARK-12837: --- [~cloud_fan] I didn't realize torrent broadcast

[jira] [Resolved] (SPARK-17564) Flaky RequestTimeoutIntegrationSuite, furtherRequestsDelay

2017-04-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17564. - Resolution: Fixed Assignee: Shixiong Zhu Fix Version/s: 2.2.0

[jira] [Commented] (SPARK-3383) DecisionTree aggregate size could be smaller

2017-04-10 Thread 颜发才
[ https://issues.apache.org/jira/browse/SPARK-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963741#comment-15963741 ] Yan Facai (颜发才) commented on SPARK-3383: I think the task contains two subtasks: 1. separate

[jira] [Comment Edited] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963705#comment-15963705 ] Wenchen Fan edited comment on SPARK-12837 at 4/11/17 1:06 AM: -- [~Tagar] I

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963705#comment-15963705 ] Wenchen Fan commented on SPARK-12837: - [~Tagar] I think this may be because the size estimation was

[jira] [Created] (SPARK-20287) Kafka Consumer should be able to subscribe to more than one topic partition

2017-04-10 Thread Stephane Maarek (JIRA)
Stephane Maarek created SPARK-20287: --- Summary: Kafka Consumer should be able to subscribe to more than one topic partition Key: SPARK-20287 URL: https://issues.apache.org/jira/browse/SPARK-20287

[jira] [Commented] (SPARK-18555) na.fill miss up original values in long integers

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963685#comment-15963685 ] Apache Spark commented on SPARK-18555: -- User 'dbtsai' has created a pull request for this issue:

[jira] [Commented] (SPARK-18555) na.fill miss up original values in long integers

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963684#comment-15963684 ] Apache Spark commented on SPARK-18555: -- User 'dbtsai' has created a pull request for this issue:

[jira] [Updated] (SPARK-20270) na.fill will change the values in long or integer when the default value is in double

2017-04-10 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-20270: Fix Version/s: 2.1.1 2.0.3 > na.fill will change the values in long or integer when the

[jira] [Updated] (SPARK-18555) na.fill miss up original values in long integers

2017-04-10 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-18555: Fix Version/s: 2.1.1 2.0.3 Component/s: SQL > na.fill miss up original values in

[jira] [Assigned] (SPARK-17564) Flaky RequestTimeoutIntegrationSuite, furtherRequestsDelay

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17564: Assignee: Apache Spark > Flaky RequestTimeoutIntegrationSuite, furtherRequestsDelay >

[jira] [Assigned] (SPARK-17564) Flaky RequestTimeoutIntegrationSuite, furtherRequestsDelay

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17564: Assignee: (was: Apache Spark) > Flaky RequestTimeoutIntegrationSuite,

[jira] [Commented] (SPARK-17564) Flaky RequestTimeoutIntegrationSuite, furtherRequestsDelay

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963663#comment-15963663 ] Apache Spark commented on SPARK-17564: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2017-04-10 Thread Rupesh Mane (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963621#comment-15963621 ] Rupesh Mane commented on SPARK-18084: - Any update on when this will be fixed? > write.partitionBy()

[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963601#comment-15963601 ] Michael Gummelt commented on SPARK-16742: - bq. So, assuming that Mesos is configured properly,

[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963592#comment-15963592 ] Marcelo Vanzin commented on SPARK-16742: bq. It authenticates the Mesos principal, and this

[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963583#comment-15963583 ] Michael Gummelt commented on SPARK-16742: - bq. That sounds problematic. The way YARN works is

[jira] [Updated] (SPARK-20284) Make SerializationStream and DeserializationStream extend Closeable

2017-04-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20284: -- Priority: Trivial (was: Minor) It might be fine to add, but, does it help anything?

[jira] [Created] (SPARK-20286) dynamicAllocation.executorIdleTimeout is ignored after unpersist

2017-04-10 Thread Miguel Pérez (JIRA)
Miguel Pérez created SPARK-20286: Summary: dynamicAllocation.executorIdleTimeout is ignored after unpersist Key: SPARK-20286 URL: https://issues.apache.org/jira/browse/SPARK-20286 Project: Spark

[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963559#comment-15963559 ] Marcelo Vanzin commented on SPARK-16742: bq. But in Spark, this isn't currently derived from the

[jira] [Comment Edited] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963510#comment-15963510 ] Ruslan Dautkhanov edited comment on SPARK-12837 at 4/10/17 9:29 PM:

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963510#comment-15963510 ] Ruslan Dautkhanov commented on SPARK-12837: --- It might be a bug in broadcast join. Following

[jira] [Resolved] (SPARK-20283) Add preOptimizationBatches

2017-04-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-20283. - Resolution: Fixed Fix Version/s: 2.2.0 > Add preOptimizationBatches >

[jira] [Resolved] (SPARK-20282) Flaky test: org.apache.spark.sql.streaming/StreamingQuerySuite/OneTime_trigger__commit_log__and_exception

2017-04-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-20282. -- Resolution: Fixed Assignee: Shixiong Zhu Fix Version/s: 2.2.0 > Flaky test: >

[jira] [Updated] (SPARK-20285) Flaky test: pyspark.streaming.tests.BasicOperationTests.test_cogroup

2017-04-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-20285: - Affects Version/s: 2.1.1 2.0.3 > Flaky test:

[jira] [Updated] (SPARK-20285) Flaky test: pyspark.streaming.tests.BasicOperationTests.test_cogroup

2017-04-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-20285: - Affects Version/s: (was: 2.1.1) (was: 2.0.3)

[jira] [Resolved] (SPARK-20285) Flaky test: pyspark.streaming.tests.BasicOperationTests.test_cogroup

2017-04-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-20285. -- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 2.0.3

[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963469#comment-15963469 ] Michael Gummelt commented on SPARK-16742: - [~jerryshao] Great! The current RPC used in Mesos is

[jira] [Comment Edited] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963446#comment-15963446 ] Michael Gummelt edited comment on SPARK-16742 at 4/10/17 8:35 PM: --

[jira] [Comment Edited] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963446#comment-15963446 ] Michael Gummelt edited comment on SPARK-16742 at 4/10/17 8:20 PM: --

[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963446#comment-15963446 ] Michael Gummelt commented on SPARK-16742: - bq. The most basic feature needed for any

[jira] [Comment Edited] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963446#comment-15963446 ] Michael Gummelt edited comment on SPARK-16742 at 4/10/17 8:18 PM: --

[jira] [Resolved] (SPARK-20280) SharedInMemoryCache Weigher integer overflow

2017-04-10 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-20280. --- Resolution: Fixed Assignee: Bogdan Raducanu Fix Version/s: 2.2.0

[jira] [Commented] (SPARK-20285) Flaky test: pyspark.streaming.tests.BasicOperationTests.test_cogroup

2017-04-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963422#comment-15963422 ] Shixiong Zhu commented on SPARK-20285: -- https://github.com/apache/spark/pull/17597 > Flaky test:

[jira] [Issue Comment Deleted] (SPARK-20285) Flaky test: pyspark.streaming.tests.BasicOperationTests.test_cogroup

2017-04-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-20285: - Comment: was deleted (was: https://github.com/apache/spark/pull/17597) > Flaky test:

[jira] [Assigned] (SPARK-20284) Make SerializationStream and DeserializationStream extend Closeable

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20284: Assignee: (was: Apache Spark) > Make SerializationStream and DeserializationStream

[jira] [Commented] (SPARK-20284) Make SerializationStream and DeserializationStream extend Closeable

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963419#comment-15963419 ] Apache Spark commented on SPARK-20284: -- User 'superbobry' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20284) Make SerializationStream and DeserializationStream extend Closeable

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20284: Assignee: Apache Spark > Make SerializationStream and DeserializationStream extend

[jira] [Assigned] (SPARK-20285) Flaky test: pyspark.streaming.tests.BasicOperationTests.test_cogroup

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20285: Assignee: Shixiong Zhu (was: Apache Spark) > Flaky test:

[jira] [Commented] (SPARK-20285) Flaky test: pyspark.streaming.tests.BasicOperationTests.test_cogroup

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963412#comment-15963412 ] Apache Spark commented on SPARK-20285: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20285) Flaky test: pyspark.streaming.tests.BasicOperationTests.test_cogroup

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20285: Assignee: Apache Spark (was: Shixiong Zhu) > Flaky test:

[jira] [Created] (SPARK-20285) Flaky test:

2017-04-10 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-20285: Summary: Flaky test: Key: SPARK-20285 URL: https://issues.apache.org/jira/browse/SPARK-20285 Project: Spark Issue Type: Bug Components: Tests

[jira] [Updated] (SPARK-20285) Flaky test: pyspark.streaming.tests.BasicOperationTests.test_cogroup

2017-04-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-20285: - Summary: Flaky test: pyspark.streaming.tests.BasicOperationTests.test_cogroup (was: Flaky test:

[jira] [Commented] (SPARK-20156) Java String toLowerCase "Turkish locale bug" causes Spark problems

2017-04-10 Thread Serkan Taş (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963406#comment-15963406 ] Serkan Taş commented on SPARK-20156: Thank you Sean, it was more than i expected i think. > Java

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963402#comment-15963402 ] Apache Spark commented on SPARK-12837: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12837: Assignee: Apache Spark (was: Wenchen Fan) > Spark driver requires large memory space for

[jira] [Assigned] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12837: Assignee: Wenchen Fan (was: Apache Spark) > Spark driver requires large memory space for

[jira] [Created] (SPARK-20284) Make SerializationStream and DeserializationStream extend Closeable

2017-04-10 Thread Sergei Lebedev (JIRA)
Sergei Lebedev created SPARK-20284: -- Summary: Make SerializationStream and DeserializationStream extend Closeable Key: SPARK-20284 URL: https://issues.apache.org/jira/browse/SPARK-20284 Project:

[jira] [Resolved] (SPARK-20156) Java String toLowerCase "Turkish locale bug" causes Spark problems

2017-04-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20156. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17527

[jira] [Commented] (SPARK-19352) Sorting issues on relatively big datasets

2017-04-10 Thread Charles Pritchard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963367#comment-15963367 ] Charles Pritchard commented on SPARK-19352: --- Does this fix the issue in SPARK-18934 ? >

[jira] [Commented] (SPARK-19352) Sorting issues on relatively big datasets

2017-04-10 Thread Charles Pritchard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963365#comment-15963365 ] Charles Pritchard commented on SPARK-19352: --- [~cloud_fan] Yes, Hive relies on sorting

[jira] [Commented] (SPARK-19067) mapGroupsWithState - arbitrary stateful operations with Structured Streaming (similar to DStream.mapWithState)

2017-04-10 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963353#comment-15963353 ] Michael Armbrust commented on SPARK-19067: -- No, this will be available in Spark 2.2.0 >

[jira] [Commented] (SPARK-20283) Add preOptimizationBatches

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963351#comment-15963351 ] Apache Spark commented on SPARK-20283: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20283) Add preOptimizationBatches

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20283: Assignee: Reynold Xin (was: Apache Spark) > Add preOptimizationBatches >

[jira] [Assigned] (SPARK-20283) Add preOptimizationBatches

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20283: Assignee: Apache Spark (was: Reynold Xin) > Add preOptimizationBatches >

[jira] [Updated] (SPARK-20283) Add preOptimizationBatches

2017-04-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-20283: Description: We currently have postHocOptimizationBatches, but not preOptimizationBatches. This

[jira] [Created] (SPARK-20283) Add preOptimizationBatches

2017-04-10 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-20283: --- Summary: Add preOptimizationBatches Key: SPARK-20283 URL: https://issues.apache.org/jira/browse/SPARK-20283 Project: Spark Issue Type: New Feature

[jira] [Assigned] (SPARK-20282) Flaky test: org.apache.spark.sql.streaming/StreamingQuerySuite/OneTime_trigger__commit_log__and_exception

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20282: Assignee: (was: Apache Spark) > Flaky test: >

[jira] [Assigned] (SPARK-20282) Flaky test: org.apache.spark.sql.streaming/StreamingQuerySuite/OneTime_trigger__commit_log__and_exception

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20282: Assignee: Apache Spark > Flaky test: >

[jira] [Commented] (SPARK-20282) Flaky test: org.apache.spark.sql.streaming/StreamingQuerySuite/OneTime_trigger__commit_log__and_exception

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963341#comment-15963341 ] Apache Spark commented on SPARK-20282: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Updated] (SPARK-20282) Flaky test: org.apache.spark.sql.streaming/StreamingQuerySuite/OneTime_trigger__commit_log__and_exception

2017-04-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-20282: - Issue Type: Test (was: Bug) > Flaky test: >

[jira] [Created] (SPARK-20282) Flaky test: org.apache.spark.sql.streaming/StreamingQuerySuite/OneTime_trigger__commit_log__and_exception

2017-04-10 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-20282: Summary: Flaky test: org.apache.spark.sql.streaming/StreamingQuerySuite/OneTime_trigger__commit_log__and_exception Key: SPARK-20282 URL:

[jira] [Comment Edited] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-04-10 Thread Ryan Williams (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963304#comment-15963304 ] Ryan Williams edited comment on SPARK-650 at 4/10/17 6:42 PM: -- Both suggested

[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-04-10 Thread Ryan Williams (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963304#comment-15963304 ] Ryan Williams commented on SPARK-650: - Both suggested workarounds here are lacking or broken / actively

[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-04-10 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963136#comment-15963136 ] Marcelo Vanzin commented on SPARK-16742: bq. The problem is then that a kerberos-authenticated

[jira] [Resolved] (SPARK-20273) Disallow Non-deterministic Filter push-down into Join Conditions

2017-04-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-20273. - Resolution: Fixed Fix Version/s: 2.2.0 > Disallow Non-deterministic Filter push-down into Join

[jira] [Resolved] (SPARK-19518) IGNORE NULLS in first_value / last_value should be supported in SQL statements

2017-04-10 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-19518. --- Resolution: Fixed Assignee: Hyukjin Kwon Fix Version/s: 2.2.0 >

[jira] [Resolved] (SPARK-20243) DebugFilesystem.assertNoOpenStreams thread race

2017-04-10 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-20243. --- Resolution: Fixed Assignee: Bogdan Raducanu Fix Version/s: 2.2.0 >

[jira] [Updated] (SPARK-20000) Spark Hive tests aborted due to lz4-java on ppc64le

2017-04-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20000: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) OK so this is about updating

[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive

2017-04-10 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962903#comment-15962903 ] Steve Loughran commented on SPARK-20202: One thing I do recall as trouble here was that ivy

[jira] [Commented] (SPARK-20000) Spark Hive tests aborted due to lz4-java on ppc64le

2017-04-10 Thread Ayappan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962896#comment-15962896 ] Ayappan commented on SPARK-20000: - https://github.com/lz4/lz4-java/pull/84 > Spark Hive tests aborted

[jira] [Commented] (SPARK-20277) Allow Spark on YARN to be launched with Docker

2017-04-10 Thread Zhankun Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962787#comment-15962787 ] Zhankun Tang commented on SPARK-20277: -- It requires at least hadoop 2.8 Alpha1 from the roadmap. I

[jira] [Updated] (SPARK-20279) In web ui,'Only showing 200' should be changed to 'only showing last 200'.

2017-04-10 Thread guoxiaolongzte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guoxiaolongzte updated SPARK-20279: --- Description: In web ui,'Only showing 200' should be changed to 'only showing last 200' in

[jira] [Updated] (SPARK-20279) In web ui,'Only showing 200' should be changed to 'only showing last 200'.

2017-04-10 Thread guoxiaolongzte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guoxiaolongzte updated SPARK-20279: --- Attachment: stages.png jobs.png > In web ui,'Only showing 200' should be

[jira] [Assigned] (SPARK-20279) In web ui,'Only showing 200' should be changed to 'only showing last 200'.

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20279: Assignee: (was: Apache Spark) > In web ui,'Only showing 200' should be changed to

[jira] [Assigned] (SPARK-20279) In web ui,'Only showing 200' should be changed to 'only showing last 200'.

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20279: Assignee: Apache Spark > In web ui,'Only showing 200' should be changed to 'only showing

[jira] [Commented] (SPARK-20279) In web ui,'Only showing 200' should be changed to 'only showing last 200'.

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962763#comment-15962763 ] Apache Spark commented on SPARK-20279: -- User 'guoxiaolongzte' has created a pull request for this

[jira] [Assigned] (SPARK-20243) DebugFilesystem.assertNoOpenStreams thread race

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20243: Assignee: (was: Apache Spark) > DebugFilesystem.assertNoOpenStreams thread race >

[jira] [Assigned] (SPARK-20243) DebugFilesystem.assertNoOpenStreams thread race

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20243: Assignee: Apache Spark > DebugFilesystem.assertNoOpenStreams thread race >

[jira] [Commented] (SPARK-20243) DebugFilesystem.assertNoOpenStreams thread race

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962761#comment-15962761 ] Apache Spark commented on SPARK-20243: -- User 'bogdanrdc' has created a pull request for this issue:

[jira] [Created] (SPARK-20281) Table-valued function range in SQL should use the same number of partitions as spark.range

2017-04-10 Thread Jacek Laskowski (JIRA)
Jacek Laskowski created SPARK-20281: --- Summary: Table-valued function range in SQL should use the same number of partitions as spark.range Key: SPARK-20281 URL: https://issues.apache.org/jira/browse/SPARK-20281

[jira] [Updated] (SPARK-20279) In web ui,'Only showing 200' should be changed to 'only showing last 200'.

2017-04-10 Thread guoxiaolongzte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guoxiaolongzte updated SPARK-20279: --- Summary: In web ui,'Only showing 200' should be changed to 'only showing last 200'. (was:

[jira] [Updated] (SPARK-20279) In web ui,'Only showing 200' should be changed to 'only showing last 200' in the page of 'jobs' or'stages'.

2017-04-10 Thread guoxiaolongzte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guoxiaolongzte updated SPARK-20279: --- Summary: In web ui,'Only showing 200' should be changed to 'only showing last 200' in the

[jira] [Assigned] (SPARK-20280) SharedInMemoryCache Weigher integer overflow

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20280: Assignee: (was: Apache Spark) > SharedInMemoryCache Weigher integer overflow >

[jira] [Commented] (SPARK-20280) SharedInMemoryCache Weigher integer overflow

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962751#comment-15962751 ] Apache Spark commented on SPARK-20280: -- User 'bogdanrdc' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20280) SharedInMemoryCache Weigher integer overflow

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20280: Assignee: Apache Spark > SharedInMemoryCache Weigher integer overflow >

[jira] [Commented] (SPARK-20280) SharedInMemoryCache Weigher integer overflow

2017-04-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962750#comment-15962750 ] Sean Owen commented on SPARK-20280: --- I guess cap it at {{Int.MaxValue}}? > SharedInMemoryCache Weigher

[jira] [Comment Edited] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962731#comment-15962731 ] Wenchen Fan edited comment on SPARK-12837 at 4/10/17 11:54 AM: --- [~jbherman]

[jira] [Updated] (SPARK-20280) SharedInMemoryCache Weigher integer overflow

2017-04-10 Thread Bogdan Raducanu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bogdan Raducanu updated SPARK-20280: Description: in FileStatusCache.scala: {code} .weigher(new Weigher[(ClientId, Path),

[jira] [Created] (SPARK-20280) SharedInMemoryCache Weigher integer overflow

2017-04-10 Thread Bogdan Raducanu (JIRA)
Bogdan Raducanu created SPARK-20280: --- Summary: SharedInMemoryCache Weigher integer overflow Key: SPARK-20280 URL: https://issues.apache.org/jira/browse/SPARK-20280 Project: Spark Issue

[jira] [Created] (SPARK-20279) In web ui,'Only showing 200' should be changed to 'only showing last 200'.

2017-04-10 Thread guoxiaolongzte (JIRA)
guoxiaolongzte created SPARK-20279: -- Summary: In web ui,'Only showing 200' should be changed to 'only showing last 200'. Key: SPARK-20279 URL: https://issues.apache.org/jira/browse/SPARK-20279

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962731#comment-15962731 ] Wenchen Fan commented on SPARK-12837: - [~jbherman] I tried and debugged your example, the actual task

[jira] [Assigned] (SPARK-20278) Disable 'multiple_dots_linter' lint rule that is against project's code style

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20278: Assignee: Apache Spark > Disable 'multiple_dots_linter' lint rule that is against

[jira] [Assigned] (SPARK-20278) Disable 'multiple_dots_linter' lint rule that is against project's code style

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20278: Assignee: (was: Apache Spark) > Disable 'multiple_dots_linter' lint rule that is

[jira] [Commented] (SPARK-20278) Disable 'multiple_dots_linter' lint rule that is against project's code style

2017-04-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962656#comment-15962656 ] Apache Spark commented on SPARK-20278: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Updated] (SPARK-20278) Disable 'multiple_dots_linter; lint rule that is against project's code style

2017-04-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-20278: - Description: Currently, multi-dot separated variables in R is not allowed. For example, {code}

[jira] [Updated] (SPARK-20278) Disable 'multiple_dots_linter' lint rule that is against project's code style

2017-04-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-20278: - Summary: Disable 'multiple_dots_linter' lint rule that is against project's code style (was:

[jira] [Created] (SPARK-20278) Disable 'multiple_dots_linter; lint rule that is against project's code style

2017-04-10 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-20278: Summary: Disable 'multiple_dots_linter; lint rule that is against project's code style Key: SPARK-20278 URL: https://issues.apache.org/jira/browse/SPARK-20278

[jira] [Commented] (SPARK-20252) java.lang.ClassNotFoundException: $line22.$read$$iwC$$iwC$movie_row

2017-04-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962636#comment-15962636 ] Sean Owen commented on SPARK-20252: --- It's likely related to other spark-shell + case class issues,
