date:20151201

[jira] [Assigned] (SPARK-12075) Speed up HiveComparisionTest suites by speeding up / avoiding reset()

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12075: Assignee: Josh Rosen (was: Apache Spark) > Speed up HiveComparisionTest suites by

[jira] [Created] (SPARK-12077) Use more robust plan for single distinct aggregation

2015-12-01 Thread Davies Liu (JIRA)

Davies Liu created SPARK-12077: -- Summary: Use more robust plan for single distinct aggregation Key: SPARK-12077 URL: https://issues.apache.org/jira/browse/SPARK-12077 Project: Spark Issue Type:

[jira] [Commented] (SPARK-11801) Notify driver when OOM is thrown before executor JVM is killed

2015-12-01 Thread Marcelo Vanzin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034659#comment-15034659 ] Marcelo Vanzin commented on SPARK-11801: So, how much does this really help? A quick read of the

[jira] [Assigned] (SPARK-12079) Run Catalyst subproject's tests in parallel

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12079: Assignee: Josh Rosen (was: Apache Spark) > Run Catalyst subproject's tests in parallel >

[jira] [Commented] (SPARK-12079) Run Catalyst subproject's tests in parallel

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034678#comment-15034678 ] Apache Spark commented on SPARK-12079: -- User 'JoshRosen' has created a pull request for this issue:

[jira] [Commented] (SPARK-12002) offsetRanges attribute missing in Kafka RDD when resuming from checkpoint

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034549#comment-15034549 ] Apache Spark commented on SPARK-12002: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-11701) YARN - dynamic allocation and speculation active task accounting wrong

2015-12-01 Thread Thomas Graves (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned SPARK-11701: - Assignee: Thomas Graves > YARN - dynamic allocation and speculation active task

[jira] [Created] (SPARK-12080) Kryo - Support multiple user registrators

2015-12-01 Thread Rotem (JIRA)

Rotem created SPARK-12080: - Summary: Kryo - Support multiple user registrators Key: SPARK-12080 URL: https://issues.apache.org/jira/browse/SPARK-12080 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-12080) Kryo - Support multiple user registrators

2015-12-01 Thread Rotem (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rotem updated SPARK-12080: -- Target Version/s: (was: 1.6.1) > Kryo - Support multiple user registrators >

[jira] [Updated] (SPARK-12080) Kryo - Support multiple user registrators

2015-12-01 Thread Rotem (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rotem updated SPARK-12080: -- Affects Version/s: 1.6.1 > Kryo - Support multiple user registrators >

[jira] [Assigned] (SPARK-11352) codegen.GeneratePredicate fails due to unquoted comment

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11352: Assignee: (was: Apache Spark) > codegen.GeneratePredicate fails due to unquoted

[jira] [Assigned] (SPARK-11352) codegen.GeneratePredicate fails due to unquoted comment

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11352: Assignee: Apache Spark > codegen.GeneratePredicate fails due to unquoted comment >

[jira] [Resolved] (SPARK-12030) Incorrect results when aggregate joined data

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-12030. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 10068

[jira] [Commented] (SPARK-12071) Programming guide should explain NULL in JVM translate to NA in R

2015-12-01 Thread holdenk (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034566#comment-15034566 ] holdenk commented on SPARK-12071: - Seems like this could be a great starter issue if anyone is

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-12-01 Thread Joseph K. Bradley (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034569#comment-15034569 ] Joseph K. Bradley commented on SPARK-11605: --- I just checked, and you're right about it being a

[jira] [Commented] (SPARK-12072) python dataframe ._jdf.schema().json() breaks on large metadata dataframes

2015-12-01 Thread holdenk (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034564#comment-15034564 ] holdenk commented on SPARK-12072: - Around what # of attrs are you seeing failures at? Do you think we

[jira] [Commented] (SPARK-11886) R function name conflicts with base or stats package ones

2015-12-01 Thread Felix Cheung (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034700#comment-15034700 ] Felix Cheung commented on SPARK-11886: -- {code} > library(dplyr) > library(SparkR,

[jira] [Assigned] (SPARK-12079) Run Catalyst subproject's tests in parallel

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12079: Assignee: Apache Spark (was: Josh Rosen) > Run Catalyst subproject's tests in parallel >

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-12-01 Thread Joseph K. Bradley (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034559#comment-15034559 ] Joseph K. Bradley commented on SPARK-11605: --- Oops, yes, those should have been private. I

[jira] [Commented] (SPARK-12073) Backpressure causes individual Kafka partitions to lag

2015-12-01 Thread holdenk (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034561#comment-15034561 ] holdenk commented on SPARK-12073: - Could you add a link to the PR in your fork? > Backpressure causes

[jira] [Created] (SPARK-12078) Fix ByteBuffer.limit misuse

2015-12-01 Thread Shixiong Zhu (JIRA)

Shixiong Zhu created SPARK-12078: Summary: Fix ByteBuffer.limit misuse Key: SPARK-12078 URL: https://issues.apache.org/jira/browse/SPARK-12078 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-12075) Speed up HiveComparisionTest suites by speeding up / avoiding reset()

2015-12-01 Thread Josh Rosen (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-12075: --- Issue Type: Sub-task (was: Improvement) Parent: SPARK-9288 > Speed up HiveComparisionTest

[jira] [Commented] (SPARK-11949) Query on DataFrame from cube gives wrong results

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034752#comment-15034752 ] Apache Spark commented on SPARK-11949: -- User 'viirya' has created a pull request for this issue:

[jira] [Commented] (SPARK-11604) ML 1.6 QA: API: Python API coverage

2015-12-01 Thread Joseph K. Bradley (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034544#comment-15034544 ] Joseph K. Bradley commented on SPARK-11604: --- OK, I will resolve it. Thank you for the thorough

[jira] [Resolved] (SPARK-11604) ML 1.6 QA: API: Python API coverage

2015-12-01 Thread Joseph K. Bradley (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-11604. --- Resolution: Fixed Fix Version/s: 1.6.0 > ML 1.6 QA: API: Python API coverage

[jira] [Assigned] (SPARK-12032) Filter can't be pushed down to correct Join because of bad order of Join

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12032: Assignee: Apache Spark (was: Davies Liu) > Filter can't be pushed down to correct Join

[jira] [Commented] (SPARK-11939) PySpark support model export/import for Pipeline API

2015-12-01 Thread Joseph K. Bradley (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034582#comment-15034582 ] Joseph K. Bradley commented on SPARK-11939: --- Also, I'd split this into its own task and make it

[jira] [Commented] (SPARK-12030) Incorrect results when aggregate joined data

2015-12-01 Thread Maciej Bryński (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034641#comment-15034641 ] Maciej Bryński commented on SPARK-12030: Will the fix be included in 1.6.0? > Incorrect results

[jira] [Commented] (SPARK-12055) TimSort failing with error when writing a partitioned data set

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034640#comment-15034640 ] Yin Huai commented on SPARK-12055: -- It is very possible that

[jira] [Commented] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034479#comment-15034479 ] Apache Spark commented on SPARK-12000: -- User 'mengxr' has created a pull request for this issue:

[jira] [Commented] (SPARK-11352) codegen.GeneratePredicate fails due to unquoted comment

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034493#comment-15034493 ] Apache Spark commented on SPARK-11352: -- User 'yhuai' has created a pull request for this issue:

[jira] [Commented] (SPARK-12015) Auto convert int to Double when required in pyspark.ml

2015-12-01 Thread holdenk (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034570#comment-15034570 ] holdenk commented on SPARK-12015: - This is a duplicate of

[jira] [Updated] (SPARK-12030) Incorrect results when aggregate joined data

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-12030: - Assignee: Nong Li > Incorrect results when aggregate joined data >

[jira] [Commented] (SPARK-11922) Python API for ml.feature.QuantileDiscretizer

2015-12-01 Thread holdenk (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034602#comment-15034602 ] holdenk commented on SPARK-11922: - I'll start working on this. > Python API for

[jira] [Commented] (SPARK-12078) Fix ByteBuffer.limit misuse

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034644#comment-15034644 ] Apache Spark commented on SPARK-12078: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-12078) Fix ByteBuffer.limit misuse

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12078: Assignee: (was: Apache Spark) > Fix ByteBuffer.limit misuse >

[jira] [Assigned] (SPARK-12078) Fix ByteBuffer.limit misuse

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12078: Assignee: Apache Spark > Fix ByteBuffer.limit misuse > --- > >

[jira] [Commented] (SPARK-11701) YARN - dynamic allocation and speculation active task accounting wrong

2015-12-01 Thread Thomas Graves (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034749#comment-15034749 ] Thomas Graves commented on SPARK-11701: --- The same issue exists with dynamic allocation in 1.6.

[jira] [Assigned] (SPARK-12080) Kryo - Support multiple user registrators

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12080: Assignee: (was: Apache Spark) > Kryo - Support multiple user registrators >

[jira] [Created] (SPARK-12081) Make unified memory management work with small heaps

2015-12-01 Thread Andrew Or (JIRA)

Andrew Or created SPARK-12081: - Summary: Make unified memory management work with small heaps Key: SPARK-12081 URL: https://issues.apache.org/jira/browse/SPARK-12081 Project: Spark Issue Type:

[jira] [Commented] (SPARK-11964) Create user guide section explaining export/import

2015-12-01 Thread Joseph K. Bradley (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034588#comment-15034588 ] Joseph K. Bradley commented on SPARK-11964: --- I agree; I think it'd be good to put at the bottom

[jira] [Commented] (SPARK-12077) Use more robust plan for single distinct aggregation

2015-12-01 Thread Davies Liu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034620#comment-15034620 ] Davies Liu commented on SPARK-12077: https://github.com/apache/spark/pull/10075 > Use more robust

[jira] [Commented] (SPARK-10277) Add @since annotation to pyspark.mllib.regression

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034622#comment-15034622 ] Apache Spark commented on SPARK-10277: -- User 'davies' has created a pull request for this issue:

[jira] [Created] (SPARK-12079) Run Catalyst subproject's tests in parallel

2015-12-01 Thread Josh Rosen (JIRA)

Josh Rosen created SPARK-12079: -- Summary: Run Catalyst subproject's tests in parallel Key: SPARK-12079 URL: https://issues.apache.org/jira/browse/SPARK-12079 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-12076) countDistinct behaves inconsistently

2015-12-01 Thread Paul Zaczkieiwcz (JIRA)

Paul Zaczkieiwcz created SPARK-12076: Summary: countDistinct behaves inconsistently Key: SPARK-12076 URL: https://issues.apache.org/jira/browse/SPARK-12076 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-11939) PySpark support model export/import for Pipeline API

2015-12-01 Thread Joseph K. Bradley (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034580#comment-15034580 ] Joseph K. Bradley commented on SPARK-11939: --- * save for Estimator/Transformer: I agree. * load

[jira] [Commented] (SPARK-12030) Incorrect results when aggregate joined data

2015-12-01 Thread Davies Liu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034615#comment-15034615 ] Davies Liu commented on SPARK-12030: I also figured out the root cause last night, that's an

[jira] [Commented] (SPARK-5968) Parquet warning in spark-shell

2015-12-01 Thread swetha k (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034624#comment-15034624 ] swetha k commented on SPARK-5968: - [~lian cheng] Following are the dependencies and the versions that I

[jira] [Commented] (SPARK-11701) YARN - dynamic allocation and speculation active task accounting wrong

2015-12-01 Thread Thomas Graves (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034625#comment-15034625 ] Thomas Graves commented on SPARK-11701: --- So it looks like this is a race condition. If the task end

[jira] [Commented] (SPARK-11620) parquet.hadoop.ParquetOutputCommitter.commitJob() throws parquet.io.ParquetEncodingException

2015-12-01 Thread swetha k (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034650#comment-15034650 ] swetha k commented on SPARK-11620: -- [~hyukjin.kwon] I have the following code that saves the parquet

[jira] [Commented] (SPARK-12030) Incorrect results when aggregate joined data

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034652#comment-15034652 ] Yin Huai commented on SPARK-12030: -- Yes, it will be in 1.6.0. > Incorrect results when aggregate joined

[jira] [Comment Edited] (SPARK-11620) parquet.hadoop.ParquetOutputCommitter.commitJob() throws parquet.io.ParquetEncodingException

2015-12-01 Thread swetha k (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034650#comment-15034650 ] swetha k edited comment on SPARK-11620 at 12/1/15 9:50 PM: --- [~hyukjin.kwon] I

[jira] [Comment Edited] (SPARK-11964) Create user guide section explaining export/import

2015-12-01 Thread Bill Chambers (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034178#comment-15034178 ] Bill Chambers edited comment on SPARK-11964 at 12/1/15 8:27 PM: Happy to

[jira] [Created] (SPARK-12075) Speed up HiveComparisionTest suites by speeding up / avoiding reset()

2015-12-01 Thread Josh Rosen (JIRA)

Josh Rosen created SPARK-12075: -- Summary: Speed up HiveComparisionTest suites by speeding up / avoiding reset() Key: SPARK-12075 URL: https://issues.apache.org/jira/browse/SPARK-12075 Project: Spark

[jira] [Assigned] (SPARK-12075) Speed up HiveComparisionTest suites by speeding up / avoiding reset()

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12075: Assignee: Apache Spark (was: Josh Rosen) > Speed up HiveComparisionTest suites by

[jira] [Commented] (SPARK-12075) Speed up HiveComparisionTest suites by speeding up / avoiding reset()

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034527#comment-15034527 ] Apache Spark commented on SPARK-12075: -- User 'JoshRosen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-12032) Filter can't be pushed down to correct Join because of bad order of Join

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12032: Assignee: Davies Liu (was: Apache Spark) > Filter can't be pushed down to correct Join

[jira] [Commented] (SPARK-12032) Filter can't be pushed down to correct Join because of bad order of Join

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034545#comment-15034545 ] Apache Spark commented on SPARK-12032: -- User 'davies' has created a pull request for this issue:

[jira] [Commented] (SPARK-6830) Memoize frequently queried vals in RDD, such as numPartitions, count etc.

2015-12-01 Thread Davies Liu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034635#comment-15034635 ] Davies Liu commented on SPARK-6830: --- +1 > Memoize frequently queried vals in RDD, such as

[jira] [Commented] (SPARK-12080) Kryo - Support multiple user registrators

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034757#comment-15034757 ] Apache Spark commented on SPARK-12080: -- User 'Botnaim' has created a pull request for this issue:

[jira] [Updated] (SPARK-11596) SQL execution very slow for nested query plans because of DataFrame.withNewExecutionId

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-11596: - Target Version/s: 1.6.0 > SQL execution very slow for nested query plans because of >

[jira] [Commented] (SPARK-11596) SQL execution very slow for nested query plans because of DataFrame.withNewExecutionId

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034787#comment-15034787 ] Apache Spark commented on SPARK-11596: -- User 'yhuai' has created a pull request for this issue:

[jira] [Assigned] (SPARK-11596) SQL execution very slow for nested query plans because of DataFrame.withNewExecutionId

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11596: Assignee: Apache Spark > SQL execution very slow for nested query plans because of >

[jira] [Assigned] (SPARK-11596) SQL execution very slow for nested query plans because of DataFrame.withNewExecutionId

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11596: Assignee: (was: Apache Spark) > SQL execution very slow for nested query plans

[jira] [Resolved] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-11788. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9872

[jira] [Updated] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-11788: - Assignee: Huaxin Gao > Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC > dataframes

[jira] [Resolved] (SPARK-12055) TimSort failing with error when writing a partitioned data set

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-12055. -- Resolution: Duplicate Assignee: Nong Li > TimSort failing with error when writing a partitioned

[jira] [Commented] (SPARK-12055) TimSort failing with error when writing a partitioned data set

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034870#comment-15034870 ] Yin Huai commented on SPARK-12055: -- We have done some tests. Looks like

[jira] [Updated] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-12-01 Thread Xiangrui Meng (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-12000: -- Fix Version/s: (was: 1.6.0) > `sbt publishLocal` hits a Scala compiler bug caused by

[jira] [Commented] (SPARK-7131) Move tree,forest implementation from spark.mllib to spark.ml

2015-12-01 Thread Seth Hendrickson (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034859#comment-15034859 ] Seth Hendrickson commented on SPARK-7131: - [~josephkb] Numbers two and three on the list above are

[jira] [Reopened] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-12-01 Thread Xiangrui Meng (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reopened SPARK-12000: --- > `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation >

[jira] [Resolved] (SPARK-11328) Correctly propagate error message in the case of failures when writing parquet

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-11328. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 10080

[jira] [Resolved] (SPARK-12075) Speed up HiveComparisionTest suites by speeding up / avoiding reset()

2015-12-01 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12075. - Resolution: Fixed Fix Version/s: 1.6.0 > Speed up HiveComparisionTest suites by speeding

[jira] [Commented] (SPARK-11873) Regression for TPC-DS query 63 when used with decimal datatype and windows function

2015-12-01 Thread Michael Armbrust (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034908#comment-15034908 ] Michael Armbrust commented on SPARK-11873: -- We did a lot of performance work in Spark 1.6 (e.g.,

[jira] [Updated] (SPARK-11328) Provide more informative error message when direct parquet output committer is used and there is a file already exists error.

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-11328: - Summary: Provide more informative error message when direct parquet output committer is used and there

[jira] [Commented] (SPARK-11801) Notify driver when OOM is thrown before executor JVM is killed

2015-12-01 Thread Mridul Muralidharan (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034914#comment-15034914 ] Mridul Muralidharan commented on SPARK-11801: - There are a few aspects here: a) A race

[jira] [Comment Edited] (SPARK-11801) Notify driver when OOM is thrown before executor JVM is killed

2015-12-01 Thread Mridul Muralidharan (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034914#comment-15034914 ] Mridul Muralidharan edited comment on SPARK-11801 at 12/2/15 12:01 AM:

[jira] [Updated] (SPARK-12061) Persist for Map/filter with Lambda Functions don't always read from Cache

2015-12-01 Thread Michael Armbrust (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12061: - Target Version/s: 1.7.0 > Persist for Map/filter with Lambda Functions don't always read

[jira] [Updated] (SPARK-12061) [SQL] Dataset API: Adding Persist for Map/filter with Lambda Functions

2015-12-01 Thread Michael Armbrust (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12061: - Issue Type: Bug (was: Improvement) > [SQL] Dataset API: Adding Persist for Map/filter

[jira] [Created] (SPARK-12084) Fix codes that uses ByteBuffer.array incorrectly

2015-12-01 Thread Shixiong Zhu (JIRA)

Shixiong Zhu created SPARK-12084: Summary: Fix codes that uses ByteBuffer.array incorrectly Key: SPARK-12084 URL: https://issues.apache.org/jira/browse/SPARK-12084 Project: Spark Issue Type:

[jira] [Commented] (SPARK-10911) Executors should System.exit on clean shutdown

2015-12-01 Thread Marcelo Vanzin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034776#comment-15034776 ] Marcelo Vanzin commented on SPARK-10911: So, one thing that I still haven't understood is: why

[jira] [Commented] (SPARK-11801) Notify driver when OOM is thrown before executor JVM is killed

2015-12-01 Thread Imran Rashid (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034798#comment-15034798 ] Imran Rashid commented on SPARK-11801: -- This surprised me too, but [~vsr] reported (offline) that

[jira] [Commented] (SPARK-11928) Master retry deadlock

2015-12-01 Thread Bryan Cutler (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034797#comment-15034797 ] Bryan Cutler commented on SPARK-11928: -- I was able to reproduce the {{RejectedExecutionException}},

[jira] [Commented] (SPARK-11801) Notify driver when OOM is thrown before executor JVM is killed

2015-12-01 Thread Marcelo Vanzin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034827#comment-15034827 ] Marcelo Vanzin commented on SPARK-11801: bq. But like I said earlier, I'm OK with generic

[jira] [Created] (SPARK-12083) java.lang.IllegalArgumentException: requirement failed: Overflowed precision (q98)

2015-12-01 Thread Dileep Kumar (JIRA)

Dileep Kumar created SPARK-12083: Summary: java.lang.IllegalArgumentException: requirement failed: Overflowed precision (q98) Key: SPARK-12083 URL: https://issues.apache.org/jira/browse/SPARK-12083

[jira] [Updated] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-12-01 Thread Michael Armbrust (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12000: - Priority: Blocker (was: Major) > `sbt publishLocal` hits a Scala compiler bug caused by

[jira] [Updated] (SPARK-11932) trackStateByKey throws java.lang.IllegalArgumentException: requirement failed on restarting from checkpoint

2015-12-01 Thread Michael Armbrust (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11932: - Priority: Critical (was: Blocker) > trackStateByKey throws

[jira] [Resolved] (SPARK-11503) SQL API audit for Spark 1.6

2015-12-01 Thread Michael Armbrust (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11503. -- Resolution: Fixed Fix Version/s: 1.6.0 > SQL API audit for Spark 1.6 >

[jira] [Commented] (SPARK-12030) Incorrect results when aggregate joined data

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034936#comment-15034936 ] Yin Huai commented on SPARK-12030: -- I also merged the patch to branch 1.5. Please note that, in 1.5, we

[jira] [Assigned] (SPARK-11713) Initial RDD for updateStateByKey for pyspark

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11713: Assignee: Apache Spark > Initial RDD for updateStateByKey for pyspark >

[jira] [Assigned] (SPARK-11713) Initial RDD for updateStateByKey for pyspark

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11713: Assignee: (was: Apache Spark) > Initial RDD for updateStateByKey for pyspark >

[jira] [Commented] (SPARK-11713) Initial RDD for updateStateByKey for pyspark

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034940#comment-15034940 ] Apache Spark commented on SPARK-11713: -- User 'BryanCutler' has created a pull request for this

[jira] [Comment Edited] (SPARK-12059) Standalone Master assertion error

2015-12-01 Thread Saisai Shao (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034997#comment-15034997 ] Saisai Shao edited comment on SPARK-12059 at 12/2/15 12:47 AM: --- A simple

[jira] [Commented] (SPARK-12059) Standalone Master assertion error

2015-12-01 Thread Saisai Shao (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034997#comment-15034997 ] Saisai Shao commented on SPARK-12059: - A simple solution is to loosen the condition or remove the

[jira] [Assigned] (SPARK-12080) Kryo - Support multiple user registrators

2015-12-01 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12080: Assignee: Apache Spark > Kryo - Support multiple user registrators >

[jira] [Commented] (SPARK-10969) Spark Streaming Kinesis: Allow specifying separate credentials for Kinesis and DynamoDB

2015-12-01 Thread Brian London (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-10969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034820#comment-15034820 ] Brian London commented on SPARK-10969: -- Was this fixed with

[jira] [Resolved] (SPARK-11961) User guide section for ChiSqSelector transformer

2015-12-01 Thread Joseph K. Bradley (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-11961. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9965

[jira] [Updated] (SPARK-11780) Provide type aliases in org.apache.spark.sql.types for backwards compatibility

2015-12-01 Thread Michael Armbrust (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11780: - Assignee: Santiago M. Mola > Provide type aliases in org.apache.spark.sql.types for

[jira] [Updated] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException

2015-12-01 Thread Yin Huai (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-11788: - Fix Version/s: 1.5.3 > Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC > dataframes

[jira] [Updated] (SPARK-11596) SQL execution very slow for nested query plans because of DataFrame.withNewExecutionId

2015-12-01 Thread Michael Armbrust (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11596: - Assignee: Yin Huai > SQL execution very slow for nested query plans because of >
