[jira] [Assigned] (SPARK-22521) VectorIndexerModel support handle unseen categories via handleInvalid: Python API

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22521: Assignee: Apache Spark > VectorIndexerModel support handle unseen categories via

[jira] [Assigned] (SPARK-22521) VectorIndexerModel support handle unseen categories via handleInvalid: Python API

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22521: Assignee: (was: Apache Spark) > VectorIndexerModel support handle unseen categories

[jira] [Commented] (SPARK-22521) VectorIndexerModel support handle unseen categories via handleInvalid: Python API

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252999#comment-16252999 ] Apache Spark commented on SPARK-22521: -- User 'WeichenXu123' has created a pull request for this

[jira] [Comment Edited] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252953#comment-16252953 ] Thunder Stumpges edited comment on SPARK-19371 at 11/15/17 5:50 AM:

[jira] [Comment Edited] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252969#comment-16252969 ] Thunder Stumpges edited comment on SPARK-19371 at 11/15/17 5:52 AM:

[jira] [Commented] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252969#comment-16252969 ] Thunder Stumpges commented on SPARK-19371: -- Here's a view of something that happened this time

[jira] [Updated] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thunder Stumpges updated SPARK-19371: - Attachment: RDD Block Distribution on two executors.png > Cannot spread cached

[jira] [Commented] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252953#comment-16252953 ] Thunder Stumpges commented on SPARK-19371: -- In my case, it is important because that cached RDD

[jira] [Updated] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thunder Stumpges updated SPARK-19371: - Attachment: execution timeline.png > Cannot spread cached partitions evenly across

[jira] [Commented] (SPARK-20937) Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide

2017-11-14 Thread sw (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252942#comment-16252942 ] sw commented on SPARK-20937: ++100. The data is already written, so how can I fix it? > Describe

[jira] [Commented] (SPARK-22522) Convert to apache-release to publish Maven artifacts to Nexus/repository.apache.org

2017-11-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252937#comment-16252937 ] Felix Cheung commented on SPARK-22522: -- and repository.apache.org has been the same place we are

[jira] [Updated] (SPARK-22522) Convert to apache-release to publish Maven artifacts to Nexus/repository.apache.org

2017-11-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22522: - Description: see http://www.apache.org/dev/publishing-maven-artifacts.html ...at the very least

[jira] [Updated] (SPARK-22522) Convert to apache-release to publish Maven artifacts to Nexus/repository.apache.org

2017-11-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22522: - Description: see http://www.apache.org/dev/publishing-maven-artifacts.html ...at the very least

[jira] [Updated] (SPARK-22522) Convert to apache-release to publish Maven artifacts to Nexus/repository.apache.org

2017-11-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22522: - Description: see http://www.apache.org/dev/publishing-maven-artifacts.html ...at the very least

[jira] [Commented] (SPARK-22522) Convert to apache-release to publish Maven artifacts to Nexus/repository.apache.org

2017-11-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252934#comment-16252934 ] Felix Cheung commented on SPARK-22522: -- the profile and associated config is inherited from the

[jira] [Updated] (SPARK-22523) Janino throws StackOverflowError on nested structs with many fields

2017-11-14 Thread Utku Demir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Utku Demir updated SPARK-22523: --- Description: When running the below application, Janino throws StackOverflow: {code} Exception in

[jira] [Created] (SPARK-22523) Janino throws StackOverflowError on nested structs with many fields

2017-11-14 Thread Utku Demir (JIRA)
Utku Demir created SPARK-22523: -- Summary: Janino throws StackOverflowError on nested structs with many fields Key: SPARK-22523 URL: https://issues.apache.org/jira/browse/SPARK-22523 Project: Spark

[jira] [Commented] (SPARK-22522) Convert to apache-release to publish Maven artifacts to Nexus/repository.apache.org

2017-11-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252864#comment-16252864 ] Sean Owen commented on SPARK-22522: --- Is that repo active in maven and SBT by default? Just wondering if

[jira] [Created] (SPARK-22522) Convert to apache-release to publish Maven artifacts to Nexus/repository.apache.org

2017-11-14 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-22522: Summary: Convert to apache-release to publish Maven artifacts to Nexus/repository.apache.org Key: SPARK-22522 URL: https://issues.apache.org/jira/browse/SPARK-22522

[jira] [Created] (SPARK-22521) VectorIndexerModel support handle unseen categories via handleInvalid: Python API

2017-11-14 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-22521: -- Summary: VectorIndexerModel support handle unseen categories via handleInvalid: Python API Key: SPARK-22521 URL: https://issues.apache.org/jira/browse/SPARK-22521

[jira] [Resolved] (SPARK-13846) VectorIndexer output on unknown feature should be more descriptive

2017-11-14 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-13846. --- Resolution: Fixed Assignee: Weichen Xu Fix Version/s: 2.3.0 >

[jira] [Commented] (SPARK-11373) Add metrics to the History Server and providers

2017-11-14 Thread Nick Dimiduk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252776#comment-16252776 ] Nick Dimiduk commented on SPARK-11373: -- I'm chasing a goose through the wild and have found my way

[jira] [Commented] (SPARK-13846) VectorIndexer output on unknown feature should be more descriptive

2017-11-14 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252775#comment-16252775 ] Joseph K. Bradley commented on SPARK-13846: --- Linking JIRA for task which solves this issue.

[jira] [Resolved] (SPARK-12375) VectorIndexer: allow unknown categories

2017-11-14 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-12375. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 19588

[jira] [Resolved] (SPARK-21087) CrossValidator, TrainValidationSplit should collect all models when fitting: Scala API

2017-11-14 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-21087. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 19208

[jira] [Assigned] (SPARK-22511) Update maven central repo address

2017-11-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-22511: - Assignee: Sean Owen > Update maven central repo address > - > >

[jira] [Resolved] (SPARK-22511) Update maven central repo address

2017-11-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-22511. --- Resolution: Fixed Fix Version/s: 2.3.0 2.2.2 Issue resolved by pull

[jira] [Assigned] (SPARK-22519) Remove unnecessary stagingDirPath null check in ApplicationMaster.cleanupStagingDir()

2017-11-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-22519: -- Assignee: Devaraj K > Remove unnecessary stagingDirPath null check in >

[jira] [Resolved] (SPARK-22519) Remove unnecessary stagingDirPath null check in ApplicationMaster.cleanupStagingDir()

2017-11-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-22519. Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 19749

[jira] [Comment Edited] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-11-14 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252611#comment-16252611 ] DB Tsai edited comment on SPARK-22231 at 11/14/17 10:45 PM: Thanks

[jira] [Commented] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-11-14 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252611#comment-16252611 ] DB Tsai commented on SPARK-22231: - Thanks [~jeremyrsmith] for adding more details. There are a couple

[jira] [Commented] (SPARK-20653) Add auto-cleanup of old elements to the new app state store

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252560#comment-16252560 ] Apache Spark commented on SPARK-20653: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20653) Add auto-cleanup of old elements to the new app state store

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20653: Assignee: (was: Apache Spark) > Add auto-cleanup of old elements to the new app state

[jira] [Assigned] (SPARK-20653) Add auto-cleanup of old elements to the new app state store

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20653: Assignee: Apache Spark > Add auto-cleanup of old elements to the new app state store >

[jira] [Assigned] (SPARK-22520) Support code generation also for complex CASE WHEN

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22520: Assignee: Apache Spark > Support code generation also for complex CASE WHEN >

[jira] [Assigned] (SPARK-22520) Support code generation also for complex CASE WHEN

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22520: Assignee: (was: Apache Spark) > Support code generation also for complex CASE WHEN >

[jira] [Commented] (SPARK-22520) Support code generation also for complex CASE WHEN

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252561#comment-16252561 ] Apache Spark commented on SPARK-22520: -- User 'mgaido91' has created a pull request for this issue:

[jira] [Created] (SPARK-22520) Support code generation also for complex CASE WHEN

2017-11-14 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-22520: --- Summary: Support code generation also for complex CASE WHEN Key: SPARK-22520 URL: https://issues.apache.org/jira/browse/SPARK-22520 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-20650) Remove JobProgressListener (and other unneeded classes)

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20650: Assignee: Apache Spark > Remove JobProgressListener (and other unneeded classes) >

[jira] [Assigned] (SPARK-20650) Remove JobProgressListener (and other unneeded classes)

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20650: Assignee: (was: Apache Spark) > Remove JobProgressListener (and other unneeded

[jira] [Commented] (SPARK-20650) Remove JobProgressListener (and other unneeded classes)

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252441#comment-16252441 ] Apache Spark commented on SPARK-20650: -- User 'vanzin' has created a pull request for this issue:

[jira] [Resolved] (SPARK-20652) Make SQL UI use new app state store

2017-11-14 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-20652. -- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 19681

[jira] [Assigned] (SPARK-20652) Make SQL UI use new app state store

2017-11-14 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid reassigned SPARK-20652: Assignee: Marcelo Vanzin > Make SQL UI use new app state store >

[jira] [Commented] (SPARK-22519) Remove unnecessary stagingDirPath null check in ApplicationMaster.cleanupStagingDir()

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252110#comment-16252110 ] Apache Spark commented on SPARK-22519: -- User 'devaraj-kavali' has created a pull request for this

[jira] [Assigned] (SPARK-22519) Remove unnecessary stagingDirPath null check in ApplicationMaster.cleanupStagingDir()

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22519: Assignee: Apache Spark > Remove unnecessary stagingDirPath null check in >

[jira] [Assigned] (SPARK-22519) Remove unnecessary stagingDirPath null check in ApplicationMaster.cleanupStagingDir()

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22519: Assignee: (was: Apache Spark) > Remove unnecessary stagingDirPath null check in >

[jira] [Updated] (SPARK-22519) Remove unnecessary stagingDirPath null check in ApplicationMaster.cleanupStagingDir()

2017-11-14 Thread Devaraj K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated SPARK-22519: -- Summary: Remove unnecessary stagingDirPath null check in ApplicationMaster.cleanupStagingDir() (was:

[jira] [Updated] (SPARK-22519) Remove unnecessary stagingDirPath null check in ApplicationMaster.cleanupStagingDir()

2017-11-14 Thread Devaraj K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated SPARK-22519: -- Priority: Trivial (was: Minor) > Remove unnecessary stagingDirPath null check in >

[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version

2017-11-14 Thread Aihua Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16252007#comment-16252007 ] Aihua Xu commented on SPARK-18673: -- HIVE-15016 has been committed, so now Hive supports hadoop-3. Can

[jira] [Commented] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251990#comment-16251990 ] Sean Owen commented on SPARK-19371: --- The question is whether this is something that needs a change in

[jira] [Updated] (SPARK-22505) toDF() / createDataFrame() type inference doesn't work as expected

2017-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-22505: -- Issue Type: Improvement (was: Bug) > toDF() / createDataFrame() type inference

[jira] [Updated] (SPARK-22505) toDF() / createDataFrame() type inference doesn't work as expected

2017-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-22505: -- Affects Version/s: 2.3.0 > toDF() / createDataFrame() type inference doesn't work as

[jira] [Comment Edited] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251971#comment-16251971 ] Thunder Stumpges edited comment on SPARK-19371 at 11/14/17 7:04 PM:

[jira] [Commented] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251971#comment-16251971 ] Thunder Stumpges commented on SPARK-19371: -- Hi Sean, From my perspective, you can attach

[jira] [Updated] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thunder Stumpges updated SPARK-19371: - Attachment: Unbalanced RDD Blocks, and resulting task imbalance.png > Cannot spread

[jira] [Updated] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-11-14 Thread Thunder Stumpges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thunder Stumpges updated SPARK-19371: - Attachment: Unbalanced RDD Blocks, and resulting task imbalance.png > Cannot spread

[jira] [Commented] (SPARK-22519) ApplicationMaster.cleanupStagingDir() throws NPE when SPARK_YARN_STAGING_DIR env var is not available

2017-11-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251929#comment-16251929 ] Marcelo Vanzin commented on SPARK-22519: Yeah, that check seems bogus, but

[jira] [Commented] (SPARK-22519) ApplicationMaster.cleanupStagingDir() throws NPE when SPARK_YARN_STAGING_DIR env var is not available

2017-11-14 Thread Devaraj K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251925#comment-16251925 ] Devaraj K commented on SPARK-22519: --- It is not a usual case; I have seen this NPE while working

[jira] [Commented] (SPARK-22519) ApplicationMaster.cleanupStagingDir() throws NPE when SPARK_YARN_STAGING_DIR env var is not available

2017-11-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251866#comment-16251866 ] Marcelo Vanzin commented on SPARK-22519: Question is, when would it be null since it's set by

[jira] [Created] (SPARK-22519) ApplicationMaster.cleanupStagingDir() throws NPE when SPARK_YARN_STAGING_DIR env var is not available

2017-11-14 Thread Devaraj K (JIRA)
Devaraj K created SPARK-22519: - Summary: ApplicationMaster.cleanupStagingDir() throws NPE when SPARK_YARN_STAGING_DIR env var is not available Key: SPARK-22519 URL: https://issues.apache.org/jira/browse/SPARK-22519

[jira] [Commented] (SPARK-15428) Disable support for multiple streaming aggregations

2017-11-14 Thread Shashidhar Reddy Sudi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251836#comment-16251836 ] Shashidhar Reddy Sudi commented on SPARK-15428: --- Any plan for this availability in the

[jira] [Assigned] (SPARK-20649) Simplify REST API class hierarchy

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20649: Assignee: (was: Apache Spark) > Simplify REST API class hierarchy >

[jira] [Commented] (SPARK-20649) Simplify REST API class hierarchy

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251825#comment-16251825 ] Apache Spark commented on SPARK-20649: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20649) Simplify REST API class hierarchy

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20649: Assignee: Apache Spark > Simplify REST API class hierarchy >

[jira] [Comment Edited] (SPARK-22267) Spark SQL incorrectly reads ORC file when column order is different

2017-11-14 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251788#comment-16251788 ] Dongjoon Hyun edited comment on SPARK-22267 at 11/14/17 5:36 PM: -

[jira] [Commented] (SPARK-22267) Spark SQL incorrectly reads ORC file when column order is different

2017-11-14 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251788#comment-16251788 ] Dongjoon Hyun commented on SPARK-22267: --- [~cloud_fan]. This issue comes from old Hive reader path.

[jira] [Assigned] (SPARK-22431) Creating Permanent view with illegal type

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22431: Assignee: Apache Spark > Creating Permanent view with illegal type >

[jira] [Commented] (SPARK-22431) Creating Permanent view with illegal type

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251739#comment-16251739 ] Apache Spark commented on SPARK-22431: -- User 'skambha' has created a pull request for this issue:

[jira] [Assigned] (SPARK-22431) Creating Permanent view with illegal type

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22431: Assignee: (was: Apache Spark) > Creating Permanent view with illegal type >

[jira] [Assigned] (SPARK-20648) Make Jobs and Stages pages use the new app state store

2017-11-14 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid reassigned SPARK-20648: Assignee: Marcelo Vanzin > Make Jobs and Stages pages use the new app state store >

[jira] [Resolved] (SPARK-20648) Make Jobs and Stages pages use the new app state store

2017-11-14 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-20648. -- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 19698

[jira] [Assigned] (SPARK-2489) Unsupported parquet datatype optional fixed_len_byte_array

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-2489: --- Assignee: Apache Spark > Unsupported parquet datatype optional fixed_len_byte_array >

[jira] [Assigned] (SPARK-2489) Unsupported parquet datatype optional fixed_len_byte_array

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-2489: --- Assignee: (was: Apache Spark) > Unsupported parquet datatype optional

[jira] [Assigned] (SPARK-17074) generate equi-height histogram for column

2017-11-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-17074: --- Assignee: Zhenhua Wang > generate equi-height histogram for column >

[jira] [Resolved] (SPARK-17074) generate equi-height histogram for column

2017-11-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17074. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 19479

[jira] [Commented] (SPARK-22346) Update VectorAssembler to work with Structured Streaming

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251518#comment-16251518 ] Apache Spark commented on SPARK-22346: -- User 'MrBago' has created a pull request for this issue:

[jira] [Assigned] (SPARK-22346) Update VectorAssembler to work with Structured Streaming

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22346: Assignee: (was: Apache Spark) > Update VectorAssembler to work with Structured

[jira] [Assigned] (SPARK-22346) Update VectorAssembler to work with Structured Streaming

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22346: Assignee: Apache Spark > Update VectorAssembler to work with Structured Streaming >

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251457#comment-16251457 ] Li Yuanjian commented on SPARK-2926: I'm just giving a preview PR above; I'll collect more suggestions

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251452#comment-16251452 ] Apache Spark commented on SPARK-2926: - User 'xuanyuanking' has created a pull request for this issue:

[jira] [Commented] (SPARK-15428) Disable support for multiple streaming aggregations

2017-11-14 Thread Hristo Angelov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251407#comment-16251407 ] Hristo Angelov commented on SPARK-15428: [~tdas] Is enabling of multiple aggregations in

[jira] [Issue Comment Deleted] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-2926: --- Comment: was deleted (was: The follow up work for SortShuffleReader in current master branch, detail

[jira] [Updated] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-2926: --- Attachment: SortBasedShuffleReader on Spark 2.x.pdf The follow up work for SortShuffleReader in

[jira] [Comment Edited] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251398#comment-16251398 ] Li Yuanjian edited comment on SPARK-2926 at 11/14/17 1:54 PM: -- During our

[jira] [Comment Edited] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251398#comment-16251398 ] Li Yuanjian edited comment on SPARK-2926 at 11/14/17 1:53 PM: -- During our

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251398#comment-16251398 ] Li Yuanjian commented on SPARK-2926: During our work of migrating some old Hadoop job to Spark, I

[jira] [Updated] (SPARK-21337) SQL which has large ‘case when’ expressions may cause code generation beyond 64KB

2017-11-14 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-21337: - Issue Type: Sub-task (was: Bug) Parent: SPARK-22510 > SQL which has large ‘case

[jira] [Commented] (SPARK-4502) Spark SQL reads unneccesary nested fields from Parquet

2017-11-14 Thread Damian Momot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251354#comment-16251354 ] Damian Momot commented on SPARK-4502: - Well this PR is ready:

[jira] [Updated] (SPARK-22516) CSV Read breaks: When "multiLine" = "true", if "comment" option is set as last line's first character

2017-11-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-22516: -- Priority: Minor (was: Major) > CSV Read breaks: When "multiLine" = "true", if "comment" option is set

[jira] [Updated] (SPARK-22518) Make default cache storage level configurable

2017-11-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-22518: -- Issue Type: Improvement (was: Bug) You can choose the storage level with persist(). You can of course

[jira] [Commented] (SPARK-22267) Spark SQL incorrectly reads ORC file when column order is different

2017-11-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251305#comment-16251305 ] Wenchen Fan commented on SPARK-22267: - [~dongjoon] will this be fixed by the new orc reader? > Spark

[jira] [Assigned] (SPARK-17310) Disable Parquet's record-by-record filter in normal parquet reader and do it in Spark-side

2017-11-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-17310: --- Assignee: Hyukjin Kwon > Disable Parquet's record-by-record filter in normal parquet reader

[jira] [Resolved] (SPARK-17310) Disable Parquet's record-by-record filter in normal parquet reader and do it in Spark-side

2017-11-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17310. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 15049

[jira] [Assigned] (SPARK-22267) Spark SQL incorrectly reads ORC file when column order is different

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22267: Assignee: (was: Apache Spark) > Spark SQL incorrectly reads ORC file when column

[jira] [Assigned] (SPARK-22267) Spark SQL incorrectly reads ORC file when column order is different

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22267: Assignee: Apache Spark > Spark SQL incorrectly reads ORC file when column order is

[jira] [Commented] (SPARK-22267) Spark SQL incorrectly reads ORC file when column order is different

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251250#comment-16251250 ] Apache Spark commented on SPARK-22267: -- User 'mpetruska' has created a pull request for this issue:

[jira] [Commented] (SPARK-22491) union all can't execute parallel with group by

2017-11-14 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251194#comment-16251194 ] Liang-Chi Hsieh commented on SPARK-22491: - If the aggregation is removed, there is no shuffle

[jira] [Created] (SPARK-22518) Make default cache storage level configurable

2017-11-14 Thread Rares Mirica (JIRA)
Rares Mirica created SPARK-22518: Summary: Make default cache storage level configurable Key: SPARK-22518 URL: https://issues.apache.org/jira/browse/SPARK-22518 Project: Spark Issue Type:

[jira] [Created] (SPARK-22517) NullPointerException in ShuffleExternalSorter.spill()

2017-11-14 Thread Andreas Maier (JIRA)
Andreas Maier created SPARK-22517: - Summary: NullPointerException in ShuffleExternalSorter.spill() Key: SPARK-22517 URL: https://issues.apache.org/jira/browse/SPARK-22517 Project: Spark

[jira] [Assigned] (SPARK-22515) Estimation relation size based on numRows * rowSize

2017-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22515: Assignee: Apache Spark > Estimation relation size based on numRows * rowSize >
