[jira] [Updated] (SPARK-16934) Update LogisticCostAggregator serialization code to make it consistent with LinearRegression

2016-08-15 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16934: --- Summary: Update LogisticCostAggregator serialization code to make it consistent with

[jira] [Updated] (SPARK-16934) Update LogisticCostAggregator serialization code to make it consistent with LinearRegression

2016-08-15 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16934: --- Description: Update LogisticCostAggregator serialization code to make it consistent with

[jira] [Resolved] (SPARK-11714) Make Spark on Mesos honor port restrictions

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-11714. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 11157

[jira] [Commented] (SPARK-16781) java launched by PySpark as gateway may not be the same java used in the spark environment

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420747#comment-15420747 ] Sean Owen commented on SPARK-16781: --- Yeah, I think this is something that's up to the execution

[jira] [Commented] (SPARK-6235) Address various 2G limits

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420746#comment-15420746 ] Sean Owen commented on SPARK-6235: -- How does this relate to the existing subtasks and their

[jira] [Commented] (SPARK-12261) pyspark crash for large dataset

2016-08-15 Thread Guangyang Nie (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420702#comment-15420702 ] Guangyang Nie commented on SPARK-12261: --- This solves my problem exactly. Thank you so much! >

[jira] [Updated] (SPARK-11714) Make Spark on Mesos honor port restrictions

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-11714: -- Assignee: Stavros Kontopoulos > Make Spark on Mesos honor port restrictions >

[jira] [Commented] (SPARK-17055) add labelKFold to CrossValidator

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420748#comment-15420748 ] Sean Owen commented on SPARK-17055: --- Hm, when would you want a label to not be present in training but

[jira] [Commented] (SPARK-16578) Configurable hostname for RBackend

2016-08-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420749#comment-15420749 ] Jeff Zhang commented on SPARK-16578: I think one purpose of this ticket is to share the same

[jira] [Updated] (SPARK-16216) CSV data source does not write date and timestamp correctly

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16216: -- Priority: Major (was: Minor) > CSV data source does not write date and timestamp correctly >

[jira] [Updated] (SPARK-16216) CSV data source does not write date and timestamp correctly

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16216: -- Labels: releasenotes (was: ) > CSV data source does not write date and timestamp correctly >

[jira] [Resolved] (SPARK-16821) GraphX MCL algorithm

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16821. --- Resolution: Won't Fix > GraphX MCL algorithm > > > Key:

[jira] [Resolved] (SPARK-16852) RejectedExecutionException when exit at some times

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16852. --- Resolution: Not A Problem > RejectedExecutionException when exit at some times >

[jira] [Resolved] (SPARK-16978) exception (java.lang.VerifyError) is thrown while submiting jar in standalone cluster

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16978. --- Resolution: Not A Problem > exception (java.lang.VerifyError) is thrown while submiting jar in

[jira] [Created] (SPARK-17058) Add maven snapshots-and-staging profile to build/test against staging artifacts

2016-08-15 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-17058: -- Summary: Add maven snapshots-and-staging profile to build/test against staging artifacts Key: SPARK-17058 URL: https://issues.apache.org/jira/browse/SPARK-17058

[jira] [Assigned] (SPARK-17058) Add maven snapshots-and-staging profile to build/test against staging artifacts

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17058: Assignee: Apache Spark > Add maven snapshots-and-staging profile to build/test against

[jira] [Commented] (SPARK-17058) Add maven snapshots-and-staging profile to build/test against staging artifacts

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420856#comment-15420856 ] Apache Spark commented on SPARK-17058: -- User 'steveloughran' has created a pull request for this

[jira] [Assigned] (SPARK-17058) Add maven snapshots-and-staging profile to build/test against staging artifacts

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17058: Assignee: (was: Apache Spark) > Add maven snapshots-and-staging profile to build/test

[jira] [Commented] (SPARK-17055) add labelKFold to CrossValidator

2016-08-15 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420867#comment-15420867 ] Vincent commented on SPARK-17055: - one of the most common tasks is to fit a "model" to a set of training

[jira] [Updated] (SPARK-14387) Enable Hive-1.x ORC compatibility with spark.sql.hive.convertMetastoreOrc

2016-08-15 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-14387: --- Affects Version/s: 2.0.0 Target Version/s: 2.0.1 Component/s: SQL > Enable

[jira] [Resolved] (SPARK-17033) GaussianMixture should use treeAggregate to improve performance

2016-08-15 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-17033. - Resolution: Fixed Assignee: Yanbo Liang Fix Version/s: 2.1.0 > GaussianMixture

[jira] [Resolved] (SPARK-17041) Columns in schema are no longer case sensitive when reading csv file

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17041. --- Resolution: Not A Problem > Columns in schema are no longer case sensitive when reading csv file >

[jira] [Created] (SPARK-17060) Call inner join after outer join will miss rows with null values

2016-08-15 Thread Linbo (JIRA)
Linbo created SPARK-17060: - Summary: Call inner join after outer join will miss rows with null values Key: SPARK-17060 URL: https://issues.apache.org/jira/browse/SPARK-17060 Project: Spark Issue

[jira] [Comment Edited] (SPARK-16781) java launched by PySpark as gateway may not be the same java used in the spark environment

2016-08-15 Thread Michael Berman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420995#comment-15420995 ] Michael Berman edited comment on SPARK-16781 at 8/15/16 2:10 PM: - In

[jira] [Commented] (SPARK-6235) Address various 2G limits

2016-08-15 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421048#comment-15421048 ] Guoqiang Li commented on SPARK-6235: Yes, it contains a lot of minor changes, eg: Replace ByteBuffer

[jira] [Commented] (SPARK-17041) Columns in schema are no longer case sensitive when reading csv file

2016-08-15 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420952#comment-15420952 ] Barry Becker commented on SPARK-17041: -- Yes, that suggestion worked. I added {code}

[jira] [Commented] (SPARK-17041) Columns in schema are no longer case sensitive when reading csv file

2016-08-15 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420994#comment-15420994 ] Dongjoon Hyun commented on SPARK-17041: --- Great! Thank you for confirming. > Columns in schema are

[jira] [Commented] (SPARK-16781) java launched by PySpark as gateway may not be the same java used in the spark environment

2016-08-15 Thread Michael Berman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420995#comment-15420995 ] Michael Berman commented on SPARK-16781: In 0.10.3, py4j introduced an option to use the java

[jira] [Commented] (SPARK-16533) Spark application not handling preemption messages

2016-08-15 Thread Tan N. Le (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421006#comment-15421006 ] Tan N. Le commented on SPARK-16533: --- This issue happens very often with large jobs. This one must be

[jira] [Commented] (SPARK-6235) Address various 2G limits

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421018#comment-15421018 ] Apache Spark commented on SPARK-6235: - User 'witgo' has created a pull request for this issue:

[jira] [Assigned] (SPARK-6235) Address various 2G limits

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6235: --- Assignee: Apache Spark > Address various 2G limits > - > >

[jira] [Assigned] (SPARK-6235) Address various 2G limits

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6235: --- Assignee: (was: Apache Spark) > Address various 2G limits > - >

[jira] [Created] (SPARK-17059) Allow FileFormat to specify partition pruning strategy

2016-08-15 Thread Andrew Duffy (JIRA)
Andrew Duffy created SPARK-17059: Summary: Allow FileFormat to specify partition pruning strategy Key: SPARK-17059 URL: https://issues.apache.org/jira/browse/SPARK-17059 Project: Spark Issue

[jira] [Resolved] (SPARK-16934) Update LogisticCostAggregator serialization code to make it consistent with LinearRegression

2016-08-15 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-16934. - Resolution: Fixed Assignee: Weichen Xu Fix Version/s: 2.1.0 Target

[jira] [Resolved] (SPARK-17060) Call inner join after outer join will miss rows with null values

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17060. --- Resolution: Duplicate SPARK-16991 > Call inner join after outer join will miss rows with null

[jira] [Commented] (SPARK-17036) Hadoop config caching could lead to memory pressure and high CPU usage in thrift server

2016-08-15 Thread Rajesh Balamohan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420984#comment-15420984 ] Rajesh Balamohan commented on SPARK-17036: -- When large number of jobs are run in concurrent

[jira] [Updated] (SPARK-17060) Call inner join after outer join will miss rows with null values

2016-08-15 Thread Linbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Linbo updated SPARK-17060: -- Description: {code:title=test.scala|borderStyle=solid} scala> val df1 = sc.parallelize(Seq((1, 2, 3), (3, 3,

[jira] [Updated] (SPARK-17070) Zookeeper server refused to accept the client (mesos-master)

2016-08-15 Thread Anh Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anh Nguyen updated SPARK-17070: --- Description: I started the ZooKeeper server: ./bin/zkServer.sh start-foreground conf/cnf_zoo.cfg and

[jira] [Resolved] (SPARK-16916) serde/storage properties should not have limitations

2016-08-15 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-16916. -- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14506

[jira] [Created] (SPARK-17071) Fetch Parquet schema within driver-side when there is single file to touch without another Spark job

2016-08-15 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-17071: Summary: Fetch Parquet schema within driver-side when there is single file to touch without another Spark job Key: SPARK-17071 URL:

[jira] [Commented] (SPARK-17071) Fetch Parquet schema within driver-side when there is single file to touch without another Spark job

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422241#comment-15422241 ] Apache Spark commented on SPARK-17071: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-17071) Fetch Parquet schema within driver-side when there is single file to touch without another Spark job

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17071: Assignee: (was: Apache Spark) > Fetch Parquet schema within driver-side when there is

[jira] [Created] (SPARK-17070) Zookeeper server refused to accept the client (mesos-master)

2016-08-15 Thread Anh Nguyen (JIRA)
Anh Nguyen created SPARK-17070: -- Summary: Zookeeper server refused to accept the client (mesos-master) Key: SPARK-17070 URL: https://issues.apache.org/jira/browse/SPARK-17070 Project: Spark

[jira] [Assigned] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-5928: --- Assignee: Apache Spark > Remote Shuffle Blocks cannot be more than 2 GB >

[jira] [Updated] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jamie Hutton updated SPARK-15002: - Environment: AWS and Linux VM, both in spark-shell and spark-submit. tested in 1.5.2 and 1.6.

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421141#comment-15421141 ] Jamie Hutton commented on SPARK-15002: -- Also tested in 2.0 > Calling unpersist can cause spark to

[jira] [Resolved] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17061. --- Resolution: Duplicate Search JIRA please, and don't set blocker > Incorrect results returned

[jira] [Commented] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421169#comment-15421169 ] Jamie Hutton commented on SPARK-17061: -- Apologies for setting blocker. I won't use that again. Is

[jira] [Commented] (SPARK-17062) Add --conf to mesos dispatcher process

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421315#comment-15421315 ] Apache Spark commented on SPARK-17062: -- User 'skonto' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17062) Add --conf to mesos dispatcher process

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17062: Assignee: (was: Apache Spark) > Add --conf to mesos dispatcher process >

[jira] [Updated] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jamie Hutton updated SPARK-15002: - Affects Version/s: 2.0.0 > Calling unpersist can cause spark to hang indefinitely when writing

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421182#comment-15421182 ] Jamie Hutton commented on SPARK-15002: -- GC time is 1.5 seconds and does not increase whilst in the

[jira] [Updated] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jamie Hutton updated SPARK-17061: - Priority: Critical (was: Blocker) > Incorrect results returned following a join of two datasets

[jira] [Assigned] (SPARK-17062) Add --conf to mesos dispatcher process

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17062: Assignee: Apache Spark > Add --conf to mesos dispatcher process >

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421147#comment-15421147 ] Sean Owen commented on SPARK-15002: --- Do you see any errors? what more can you say about what's hung --

[jira] [Assigned] (SPARK-16995) TreeNodeException when flat mapping RelationalGroupedDataset created from DataFrame containing a column created with lit/expr

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16995: Assignee: (was: Apache Spark) > TreeNodeException when flat mapping

[jira] [Assigned] (SPARK-16995) TreeNodeException when flat mapping RelationalGroupedDataset created from DataFrame containing a column created with lit/expr

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16995: Assignee: Apache Spark > TreeNodeException when flat mapping RelationalGroupedDataset

[jira] [Commented] (SPARK-16995) TreeNodeException when flat mapping RelationalGroupedDataset created from DataFrame containing a column created with lit/expr

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421146#comment-15421146 ] Apache Spark commented on SPARK-16995: -- User 'viirya' has created a pull request for this issue:

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421186#comment-15421186 ] Sean Owen commented on SPARK-15002: --- It would be the workers. Oh, and I meant RUNNING rather than

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421160#comment-15421160 ] Jamie Hutton commented on SPARK-15002: -- Hi Sean. There are no errors. When i run the code above the

[jira] [Commented] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421178#comment-15421178 ] Sean Owen commented on SPARK-17061: --- It is likely to be -- see also SPARK-17043. At least, I'd try a

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421234#comment-15421234 ] Sean Owen commented on SPARK-15002: --- Yeah I just mean it ought to be fine and there's no obvious reason

[jira] [Updated] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jamie Hutton updated SPARK-17061: - Affects Version/s: 2.0.1 > Incorrect results returned following a join of two datasets and a map

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421164#comment-15421164 ] Sean Owen commented on SPARK-15002: --- In the UI, go look at a heap dump of the pegged executor. It

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421192#comment-15421192 ] Jamie Hutton commented on SPARK-15002: -- I took a look at the executors and there is nothing in a

[jira] [Commented] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421277#comment-15421277 ] Jamie Hutton commented on SPARK-17061: -- I have just downloaded 2.0.1 nightly build from here:

[jira] [Commented] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421306#comment-15421306 ] Jamie Hutton commented on SPARK-15002: -- Hi Sean, Looking at the stack trace on the executors, quite

[jira] [Created] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
Jamie Hutton created SPARK-17061: Summary: Incorrect results returned following a join of two datasets and a map step where total number of columns >100 Key: SPARK-17061 URL:

[jira] [Commented] (SPARK-17059) Allow FileFormat to specify partition pruning strategy

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421183#comment-15421183 ] Apache Spark commented on SPARK-17059: -- User 'andreweduffy' has created a pull request for this

[jira] [Assigned] (SPARK-17059) Allow FileFormat to specify partition pruning strategy

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17059: Assignee: (was: Apache Spark) > Allow FileFormat to specify partition pruning

[jira] [Assigned] (SPARK-17059) Allow FileFormat to specify partition pruning strategy

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17059: Assignee: Apache Spark > Allow FileFormat to specify partition pruning strategy >

[jira] [Reopened] (SPARK-17061) Incorrect results returned following a join of two datasets and a map step where total number of columns >100

2016-08-15 Thread Jamie Hutton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jamie Hutton reopened SPARK-17061: -- Tested in 2.0.1 nightly snapshot and still not resolved so this appears not to be a dupe >

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421288#comment-15421288 ] Miao Wang commented on SPARK-17054: --- I use Mac and build from source. sparkR works fine. How to

[jira] [Created] (SPARK-17062) Add --conf to mesos dispatcher process

2016-08-15 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-17062: --- Summary: Add --conf to mesos dispatcher process Key: SPARK-17062 URL: https://issues.apache.org/jira/browse/SPARK-17062 Project: Spark Issue

[jira] [Created] (SPARK-17064) Reconsider spark.job.interruptOnCancel

2016-08-15 Thread Mark Hamstra (JIRA)
Mark Hamstra created SPARK-17064: Summary: Reconsider spark.job.interruptOnCancel Key: SPARK-17064 URL: https://issues.apache.org/jira/browse/SPARK-17064 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-16321) [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader

2016-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16321. --- Resolution: Fixed Assignee: Liang-Chi Hsieh Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-17063) MSCK REPAIR TABLE is super slow with Hive metastore

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421335#comment-15421335 ] Apache Spark commented on SPARK-17063: -- User 'davies' has created a pull request for this issue:

[jira] [Commented] (SPARK-17064) Reconsider spark.job.interruptOnCancel

2016-08-15 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421503#comment-15421503 ] Mark Hamstra commented on SPARK-17064: -- [~kayousterhout] [~r...@databricks.com] [~imranr] >

[jira] [Updated] (SPARK-17064) Reconsider spark.job.interruptOnCancel

2016-08-15 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hamstra updated SPARK-17064: - Description: There is a frequent need or desire in Spark to cancel already running Tasks. This

[jira] [Comment Edited] (SPARK-16321) [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader

2016-08-15 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421468#comment-15421468 ] Maciej Bryński edited comment on SPARK-16321 at 8/15/16 7:34 PM: -

[jira] [Commented] (SPARK-16578) Configurable hostname for RBackend

2016-08-15 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421336#comment-15421336 ] Shivaram Venkataraman commented on SPARK-16578: --- [~zjffdu] The main goal I had for this

[jira] [Commented] (SPARK-16508) Fix documentation warnings found by R CMD check

2016-08-15 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421393#comment-15421393 ] Shivaram Venkataraman commented on SPARK-16508: --- We merged

[jira] [Created] (SPARK-17063) MSCK REPAIR TABLE is super slow with Hive metastore

2016-08-15 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17063: -- Summary: MSCK REPAIR TABLE is super slow with Hive metastore Key: SPARK-17063 URL: https://issues.apache.org/jira/browse/SPARK-17063 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-08-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421484#comment-15421484 ] Davies Liu commented on SPARK-16922: Do you also have this one?

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-08-15 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421524#comment-15421524 ] Sital Kedia commented on SPARK-16922: - Yes, I have the above mentioned PR as well. > Query with

[jira] [Commented] (SPARK-16158) Support pluggable dynamic allocation heuristics

2016-08-15 Thread Nezih Yigitbasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421327#comment-15421327 ] Nezih Yigitbasi commented on SPARK-16158: - [~andrewor14] [~rxin] how do you guys feel about this?

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421351#comment-15421351 ] Shivaram Venkataraman commented on SPARK-17054: --- [~zjffdu] We added this new feature as a

[jira] [Resolved] (SPARK-16671) Merge variable substitution code in core and SQL

2016-08-15 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-16671. Resolution: Fixed Assignee: Marcelo Vanzin Fix Version/s: 2.1.0 > Merge

[jira] [Commented] (SPARK-16321) [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader

2016-08-15 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421468#comment-15421468 ] Maciej Bryński commented on SPARK-16321: [~davies] I think you mark this one as resolved. >

[jira] [Commented] (SPARK-11714) Make Spark on Mesos honor port restrictions

2016-08-15 Thread Charles Allen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421478#comment-15421478 ] Charles Allen commented on SPARK-11714: --- Awesome! Thanks guys! > Make Spark on Mesos honor port

[jira] [Assigned] (SPARK-17063) MSCK REPAIR TABLE is super slow with Hive metastore

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17063: Assignee: Apache Spark (was: Davies Liu) > MSCK REPAIR TABLE is super slow with Hive

[jira] [Assigned] (SPARK-17063) MSCK REPAIR TABLE is super slow with Hive metastore

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17063: Assignee: Davies Liu (was: Apache Spark) > MSCK REPAIR TABLE is super slow with Hive

[jira] [Commented] (SPARK-16087) Spark Hangs When Using Union With Persisted Hadoop RDD

2016-08-15 Thread Nick Sakovich (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421457#comment-15421457 ] Nick Sakovich commented on SPARK-16087: --- [~kevinconaway], [~srowen] today i met the same issue ..

[jira] [Commented] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422141#comment-15422141 ] Apache Spark commented on SPARK-5928: - User 'witgo' has created a pull request for this issue:

[jira] [Assigned] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-5928: --- Assignee: (was: Apache Spark) > Remote Shuffle Blocks cannot be more than 2 GB >

[jira] [Assigned] (SPARK-16757) Set up caller context to HDFS

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16757: Assignee: (was: Apache Spark) > Set up caller context to HDFS >

[jira] [Commented] (SPARK-16757) Set up caller context to HDFS

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422160#comment-15422160 ] Apache Spark commented on SPARK-16757: -- User 'Sherry302' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16757) Set up caller context to HDFS

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16757: Assignee: Apache Spark > Set up caller context to HDFS > - >

[jira] [Assigned] (SPARK-17071) Fetch Parquet schema within driver-side when there is single file to touch without another Spark job

2016-08-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17071: Assignee: Apache Spark > Fetch Parquet schema within driver-side when there is single

[jira] [Commented] (SPARK-17039) cannot read null dates from csv file

2016-08-15 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421611#comment-15421611 ] Barry Becker commented on SPARK-17039: -- I was able to pull the patch
