[jira] [Commented] (SPARK-17071) Fetch Parquet schema within driver-side when there is single file to touch without another Spark job

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422241#comment-15422241 ] Apache Spark commented on SPARK-17071: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-17071) Fetch Parquet schema within driver-side when there is single file to touch without another Spark job

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17071: Assignee: (was: Apache Spark) > Fetch Parquet schema within driver-side when there is

[jira] [Assigned] (SPARK-17071) Fetch Parquet schema within driver-side when there is single file to touch without another Spark job

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17071: Assignee: Apache Spark > Fetch Parquet schema within driver-side when there is single

[jira] [Created] (SPARK-17071) Fetch Parquet schema within driver-side when there is single file to touch without another Spark job

2016-08-15 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-17071: Summary: Fetch Parquet schema within driver-side when there is single file to touch without another Spark job Key: SPARK-17071 URL:

[jira] [Closed] (SPARK-16760) Pass 'jobId' to Task

2016-08-15 Thread Weiqing Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang closed SPARK-16760. Resolution: Duplicate The code for this jira has been put into the PR of SPARK-16757 > Pass

[jira] [Resolved] (SPARK-16916) serde/storage properties should not have limitations

2016-08-15 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-16916. -- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14506

[jira] [Assigned] (SPARK-16757) Set up caller context to HDFS

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16757: Assignee: (was: Apache Spark) > Set up caller context to HDFS >

[jira] [Commented] (SPARK-16757) Set up caller context to HDFS

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422160#comment-15422160 ] Apache Spark commented on SPARK-16757: -- User 'Sherry302' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16757) Set up caller context to HDFS

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16757: Assignee: Apache Spark > Set up caller context to HDFS > - >

[jira] [Commented] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422141#comment-15422141 ] Apache Spark commented on SPARK-5928: - User 'witgo' has created a pull request for this issue:

[jira] [Assigned] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-5928: --- Assignee: (was: Apache Spark) > Remote Shuffle Blocks cannot be more than 2 GB >

[jira] [Assigned] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-5928: --- Assignee: Apache Spark > Remote Shuffle Blocks cannot be more than 2 GB >

[jira] [Updated] (SPARK-17070) Zookeeper server refused to accept the client (mesos-master)

2016-08-15 Thread Anh Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anh Nguyen updated SPARK-17070: --- Description: I started ZooKeeper server: ./bin/zkServer.sh start-foreground conf/cnf_zoo.cfg and

[jira] [Created] (SPARK-17070) Zookeeper server refused to accept the client (mesos-master)

2016-08-15 Thread Anh Nguyen (JIRA)
Anh Nguyen created SPARK-17070: -- Summary: Zookeeper server refused to accept the client (mesos-master) Key: SPARK-17070 URL: https://issues.apache.org/jira/browse/SPARK-17070 Project: Spark

[jira] [Commented] (SPARK-16990) Define the data structure to hold the statistics for CBO

2016-08-15 Thread Ron Hu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422105#comment-15422105 ] Ron Hu commented on SPARK-16990: We will follow the agreed design spec and add the following information

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422058#comment-15422058 ] Jeff Zhang commented on SPARK-17054: I have a single-node Hadoop cluster on my laptop, and I run R

[jira] [Commented] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422051#comment-15422051 ] Liwei Lin commented on SPARK-17061: --- This can be reproduced against the master branch; let me look into

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422020#comment-15422020 ] Shivaram Venkataraman commented on SPARK-17054: --- And is this connecting to a remote YARN

[jira] [Assigned] (SPARK-17068) Retain view visibility information through out Analysis

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17068: Assignee: Herman van Hovell (was: Apache Spark) > Retain view visibility information

[jira] [Commented] (SPARK-17068) Retain view visibility information through out Analysis

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421917#comment-15421917 ] Apache Spark commented on SPARK-17068: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17068) Retain view visibility information through out Analysis

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17068: Assignee: Apache Spark (was: Herman van Hovell) > Retain view visibility information

[jira] [Commented] (SPARK-16578) Configurable hostname for RBackend

2016-08-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421878#comment-15421878 ] Jeff Zhang commented on SPARK-16578: I think this feature can also be applied in pyspark. >

[jira] [Assigned] (SPARK-17069) Expose spark.range() as table-valued function in SQL

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17069: Assignee: (was: Apache Spark) > Expose spark.range() as table-valued function in SQL

[jira] [Commented] (SPARK-17069) Expose spark.range() as table-valued function in SQL

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421853#comment-15421853 ] Apache Spark commented on SPARK-17069: -- User 'ericl' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17069) Expose spark.range() as table-valued function in SQL

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17069: Assignee: Apache Spark > Expose spark.range() as table-valued function in SQL >

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421851#comment-15421851 ] Jeff Zhang commented on SPARK-17054: Here's the command I run. {code} bin/spark-submit --master

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421846#comment-15421846 ] Jeff Zhang commented on SPARK-17054: Do you run it in yarn-cluster mode? > SparkR can not run in

[jira] [Created] (SPARK-17069) Expose spark.range() as table-valued function in SQL

2016-08-15 Thread Eric Liang (JIRA)
Eric Liang created SPARK-17069: -- Summary: Expose spark.range() as table-valued function in SQL Key: SPARK-17069 URL: https://issues.apache.org/jira/browse/SPARK-17069 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-17065) Improve the error message when encountering an incompatible DataSourceRegister

2016-08-15 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-17065. -- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull request

[jira] [Commented] (SPARK-16158) Support pluggable dynamic allocation heuristics

2016-08-15 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421803#comment-15421803 ] Thomas Graves commented on SPARK-16158: --- seems like an ok idea to me, but you have to make sure to

[jira] [Commented] (SPARK-17038) StreamingSource reports metrics for lastCompletedBatch instead of lastReceivedBatch

2016-08-15 Thread Xin Ren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421789#comment-15421789 ] Xin Ren commented on SPARK-17038: - Hi [~ozzieba], if you don't have time, I can just submit a quick patch

[jira] [Commented] (SPARK-16964) Remove private[sql] and private[spark] from sql.execution package

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421783#comment-15421783 ] Apache Spark commented on SPARK-16964: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Commented] (SPARK-16669) Partition pruning for metastore relation size estimates for better join selection.

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421782#comment-15421782 ] Apache Spark commented on SPARK-16669: -- User 'Parth-Brahmbhatt' has created a pull request for this

[jira] [Created] (SPARK-17068) Retain view visibility information through out Analysis

2016-08-15 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-17068: - Summary: Retain view visibility information through out Analysis Key: SPARK-17068 URL: https://issues.apache.org/jira/browse/SPARK-17068 Project: Spark

[jira] [Commented] (SPARK-17066) dateFormat should be used when writing dataframes as csv files

2016-08-15 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421758#comment-15421758 ] Barry Becker commented on SPARK-17066: -- Yes, I think it's fair to say this is a subset of

[jira] [Commented] (SPARK-10931) PySpark ML Models should contain Param values

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421753#comment-15421753 ] Apache Spark commented on SPARK-10931: -- User 'evanyc15' has created a pull request for this issue:

[jira] [Commented] (SPARK-16964) Remove private[sql] and private[spark] from sql.execution package

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421746#comment-15421746 ] Apache Spark commented on SPARK-16964: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Commented] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421731#comment-15421731 ] Sean Owen commented on SPARK-16320: --- I think that was the problem being solved there though, right?

[jira] [Comment Edited] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-15 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421728#comment-15421728 ] Maciej Bryński edited comment on SPARK-16320 at 8/15/16 9:42 PM: - Maybe

[jira] [Commented] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-15 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421728#comment-15421728 ] Maciej Bryński commented on SPARK-16320: Maybe we can change SPARK-12384 a little bit and set

[jira] [Commented] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421719#comment-15421719 ] Sean Owen commented on SPARK-16320: --- I see, I wonder if this deserves a bit of documentation in the

[jira] [Commented] (SPARK-17066) dateFormat should be used when writing dataframes as csv files

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421723#comment-15421723 ] Sean Owen commented on SPARK-17066: --- I think this is a subset of

[jira] [Updated] (SPARK-17066) dateFormat should be used when writing dataframes as csv files

2016-08-15 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Barry Becker updated SPARK-17066: - Description: I noticed this when running tests after pulling and building @lw-lin 's PR

[jira] [Commented] (SPARK-16158) Support pluggable dynamic allocation heuristics

2016-08-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421707#comment-15421707 ] Reynold Xin commented on SPARK-16158: - Actually seems like a great idea, if just for modularity of

[jira] [Commented] (SPARK-17067) Revocable resource support

2016-08-15 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421698#comment-15421698 ] Michael Gummelt commented on SPARK-17067: - Add revocable resource support:

[jira] [Commented] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-15 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421699#comment-15421699 ] Maciej Bryński commented on SPARK-16320: [~srowen], [~michael], I found the reason why G1GC with

[jira] [Created] (SPARK-17067) Revocable resource support

2016-08-15 Thread Michael Gummelt (JIRA)
Michael Gummelt created SPARK-17067: --- Summary: Revocable resource support Key: SPARK-17067 URL: https://issues.apache.org/jira/browse/SPARK-17067 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-17066) dateFormat should be used when writing dataframes as csv files

2016-08-15 Thread Barry Becker (JIRA)
Barry Becker created SPARK-17066: Summary: dateFormat should be used when writing dataframes as csv files Key: SPARK-17066 URL: https://issues.apache.org/jira/browse/SPARK-17066 Project: Spark

[jira] [Commented] (SPARK-17039) cannot read null dates from csv file

2016-08-15 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421688#comment-15421688 ] Barry Becker commented on SPARK-17039: -- I did notice that

[jira] [Assigned] (SPARK-17065) Improve the error message when encountering an incompatible DataSourceRegister

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17065: Assignee: Apache Spark (was: Shixiong Zhu) > Improve the error message when encountering

[jira] [Assigned] (SPARK-17065) Improve the error message when encountering an incompatible DataSourceRegister

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17065: Assignee: Shixiong Zhu (was: Apache Spark) > Improve the error message when encountering

[jira] [Commented] (SPARK-17065) Improve the error message when encountering an incompatible DataSourceRegister

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421671#comment-15421671 ] Apache Spark commented on SPARK-17065: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Created] (SPARK-17065) Improve the error message when encountering an incompatible DataSourceRegister

2016-08-15 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-17065: Summary: Improve the error message when encountering an incompatible DataSourceRegister Key: SPARK-17065 URL: https://issues.apache.org/jira/browse/SPARK-17065

[jira] [Commented] (SPARK-17039) cannot read null dates from csv file

2016-08-15 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421611#comment-15421611 ] Barry Becker commented on SPARK-17039: -- I was able to pull the patch

[jira] [Commented] (SPARK-17064) Reconsider spark.job.interruptOnCancel

2016-08-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421610#comment-15421610 ] Reynold Xin commented on SPARK-17064: - Yea so my worry is that even if Hadoop has addressed this

[jira] [Resolved] (SPARK-16700) StructType doesn't accept Python dicts anymore

2016-08-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-16700. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14469

[jira] [Comment Edited] (SPARK-16321) [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader

2016-08-15 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421468#comment-15421468 ] Maciej Bryński edited comment on SPARK-16321 at 8/15/16 7:34 PM: -

[jira] [Resolved] (SPARK-16321) [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16321. --- Resolution: Fixed Assignee: Liang-Chi Hsieh Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-08-15 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421524#comment-15421524 ] Sital Kedia commented on SPARK-16922: - Yes, I have the above mentioned PR as well. > Query with

[jira] [Updated] (SPARK-17064) Reconsider spark.job.interruptOnCancel

2016-08-15 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hamstra updated SPARK-17064: - Description: There is a frequent need or desire in Spark to cancel already running Tasks. This

[jira] [Commented] (SPARK-17064) Reconsider spark.job.interruptOnCancel

2016-08-15 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421503#comment-15421503 ] Mark Hamstra commented on SPARK-17064: -- [~kayousterhout] [~r...@databricks.com] [~imranr] >

[jira] [Created] (SPARK-17064) Reconsider spark.job.interruptOnCancel

2016-08-15 Thread Mark Hamstra (JIRA)
Mark Hamstra created SPARK-17064: Summary: Reconsider spark.job.interruptOnCancel Key: SPARK-17064 URL: https://issues.apache.org/jira/browse/SPARK-17064 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-08-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421484#comment-15421484 ] Davies Liu commented on SPARK-16922: Do you also have this one?

[jira] [Commented] (SPARK-11714) Make Spark on Mesos honor port restrictions

2016-08-15 Thread Charles Allen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421478#comment-15421478 ] Charles Allen commented on SPARK-11714: --- Awesome! Thanks guys! > Make Spark on Mesos honor port

[jira] [Commented] (SPARK-16321) [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader

2016-08-15 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421468#comment-15421468 ] Maciej Bryński commented on SPARK-16321: [~davies] I think you mark this one as resolved. >

[jira] [Commented] (SPARK-16087) Spark Hangs When Using Union With Persisted Hadoop RDD

2016-08-15 Thread Nick Sakovich (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421457#comment-15421457 ] Nick Sakovich commented on SPARK-16087: --- [~kevinconaway], [~srowen] today i met the same issue ..

[jira] [Resolved] (SPARK-16671) Merge variable substitution code in core and SQL

2016-08-15 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-16671. Resolution: Fixed Assignee: Marcelo Vanzin Fix Version/s: 2.1.0 > Merge

[jira] [Commented] (SPARK-16508) Fix documentation warnings found by R CMD check

2016-08-15 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421393#comment-15421393 ] Shivaram Venkataraman commented on SPARK-16508: --- We merged

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421351#comment-15421351 ] Shivaram Venkataraman commented on SPARK-17054: --- [~zjffdu] We added this new feature as a

[jira] [Commented] (SPARK-16578) Configurable hostname for RBackend

2016-08-15 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421336#comment-15421336 ] Shivaram Venkataraman commented on SPARK-16578: --- [~zjffdu] The main goal I had for this

[jira] [Assigned] (SPARK-17063) MSCK REPAIR TABLE is super slow with Hive metastore

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17063: Assignee: Apache Spark (was: Davies Liu) > MSCK REPAIR TABLE is super slow with Hive

[jira] [Assigned] (SPARK-17063) MSCK REPAIR TABLE is super slow with Hive metastore

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17063: Assignee: Davies Liu (was: Apache Spark) > MSCK REPAIR TABLE is super slow with Hive

[jira] [Commented] (SPARK-17063) MSCK REPAIR TABLE is super slow with Hive metastore

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421335#comment-15421335 ] Apache Spark commented on SPARK-17063: -- User 'davies' has created a pull request for this issue:

[jira] [Created] (SPARK-17063) MSCK REPAIR TABLE is super slow with Hive metastore

2016-08-15 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17063: -- Summary: MSCK REPAIR TABLE is super slow with Hive metastore Key: SPARK-17063 URL: https://issues.apache.org/jira/browse/SPARK-17063 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16158) Support pluggable dynamic allocation heuristics

2016-08-15 Thread Nezih Yigitbasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421327#comment-15421327 ] Nezih Yigitbasi commented on SPARK-16158: - [~andrewor14] [~rxin] how do you guys feel about this?

[jira] [Assigned] (SPARK-17062) Add --conf to mesos dispatcher process

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17062: Assignee: Apache Spark > Add --conf to mesos dispatcher process >

[jira] [Assigned] (SPARK-17062) Add --conf to mesos dispatcher process

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17062: Assignee: (was: Apache Spark) > Add --conf to mesos dispatcher process >

[jira] [Commented] (SPARK-17062) Add --conf to mesos dispatcher process

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421315#comment-15421315 ] Apache Spark commented on SPARK-17062: -- User 'skonto' has created a pull request for this issue:

[jira] [Created] (SPARK-17062) Add --conf to mesos dispatcher process

2016-08-15 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-17062: --- Summary: Add --conf to mesos dispatcher process Key: SPARK-17062 URL: https://issues.apache.org/jira/browse/SPARK-17062 Project: Spark Issue

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421306#comment-15421306 ] Jamie Hutton commented on SPARK-15002: -- Hi Sean, Looking at the stack trace on the executors, quite

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421288#comment-15421288 ] Miao Wang commented on SPARK-17054: --- I use Mac and build from source. sparkR works fine. How to

[jira] [Updated] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jamie Hutton updated SPARK-17061: - Affects Version/s: 2.0.1 > Incorrect results returned following a join of two datasets and a map

[jira] [Reopened] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jamie Hutton reopened SPARK-17061: -- Tested in 2.0.1 nightly snapshot and still not resolved so this appears not to be a dupe >

[jira] [Commented] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421277#comment-15421277 ] Jamie Hutton commented on SPARK-17061: -- I have just downloaded 2.0.1 nightly build from here:

[jira] [Updated] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jamie Hutton updated SPARK-17061: - Priority: Critical (was: Blocker) > Incorrect results returned following a join of two datasets

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421234#comment-15421234 ] Sean Owen commented on SPARK-15002: --- Yeah I just mean it ought to be fine and there's no obvious reason

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421192#comment-15421192 ] Jamie Hutton commented on SPARK-15002: -- I took a look at the executors and there is nothing in a

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421186#comment-15421186 ] Sean Owen commented on SPARK-15002: --- It would be the workers. Oh, and I meant RUNNING rather than

[jira] [Commented] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421178#comment-15421178 ] Sean Owen commented on SPARK-17061: --- It is likely to be -- see also SPARK-17043. At least, I'd try a

[jira] [Assigned] (SPARK-17059) Allow FileFormat to specify partition pruning strategy

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17059: Assignee: Apache Spark > Allow FileFormat to specify partition pruning strategy >

[jira] [Commented] (SPARK-17059) Allow FileFormat to specify partition pruning strategy

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421183#comment-15421183 ] Apache Spark commented on SPARK-17059: -- User 'andreweduffy' has created a pull request for this

[jira] [Assigned] (SPARK-17059) Allow FileFormat to specify partition pruning strategy

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17059: Assignee: (was: Apache Spark) > Allow FileFormat to specify partition pruning

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421182#comment-15421182 ] Jamie Hutton commented on SPARK-15002: -- GC time is 1.5 seconds and does not increase whilst in the

[jira] [Commented] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421169#comment-15421169 ] Jamie Hutton commented on SPARK-17061: -- Apologies for setting blocker. I won't use that again. Is

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421164#comment-15421164 ] Sean Owen commented on SPARK-15002: --- In the UI, go look at a heap dump of the pegged executor. It

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421160#comment-15421160 ] Jamie Hutton commented on SPARK-15002: -- Hi Sean. There are no errors. When I run the code above the

[jira] [Resolved] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17061. --- Resolution: Duplicate Search JIRA please, and don't set blocker > Incorrect results returned

[jira] [Assigned] (SPARK-16995) TreeNodeException when flat mapping RelationalGroupedDataset created from DataFrame containing a column created with lit/expr

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16995: Assignee: (was: Apache Spark) > TreeNodeException when flat mapping

[jira] [Assigned] (SPARK-16995) TreeNodeException when flat mapping RelationalGroupedDataset created from DataFrame containing a column created with lit/expr

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16995: Assignee: Apache Spark > TreeNodeException when flat mapping RelationalGroupedDataset

[jira] [Commented] (SPARK-16995) TreeNodeException when flat mapping RelationalGroupedDataset created from DataFrame containing a column created with lit/expr

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421146#comment-15421146 ] Apache Spark commented on SPARK-16995: -- User 'viirya' has created a pull request for this issue:
