[jira] [Resolved] (SPARK-13749) Faster pivot implementation for many distinct values with two phase aggregation

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-13749. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12861

[jira] [Updated] (SPARK-10216) Avoid creating empty files during overwrite into Hive table with group by query

2016-05-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-10216: - Affects Version/s: 2.0.0 > Avoid creating empty files during overwrite into Hive table with

[jira] [Created] (SPARK-15086) Update Java API once the Scala one is finalized

2016-05-02 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-15086: --- Summary: Update Java API once the Scala one is finalized Key: SPARK-15086 URL: https://issues.apache.org/jira/browse/SPARK-15086 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-11293) Spillable collections leak shuffle memory

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249679#comment-15249679 ] Reynold Xin edited comment on SPARK-11293 at 5/3/16 4:31 AM: - I was using

[jira] [Updated] (SPARK-13566) Deadlock between MemoryStore and BlockManager

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-13566: Target Version/s: 1.6.2 > Deadlock between MemoryStore and BlockManager >

[jira] [Updated] (SPARK-12469) Consistent Accumulators for Spark

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12469: Target Version/s: 2.1.0 (was: 2.0.0) > Consistent Accumulators for Spark >

[jira] [Commented] (SPARK-15044) spark-sql will throw "input path does not exist" exception if it handles a partition which exists in hive table, but the path is removed manually

2016-05-02 Thread Xin Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268071#comment-15268071 ] Xin Wu commented on SPARK-15044: Sorry. What I meant was that after I removed the path manually, then did

[jira] [Created] (SPARK-15085) Rename current streaming-kafka artifact to include kafka version

2016-05-02 Thread Cody Koeninger (JIRA)
Cody Koeninger created SPARK-15085: -- Summary: Rename current streaming-kafka artifact to include kafka version Key: SPARK-15085 URL: https://issues.apache.org/jira/browse/SPARK-15085 Project: Spark

[jira] [Commented] (SPARK-15074) Spark shuffle service bottlenecked while fetching large amount of intermediate data

2016-05-02 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268063#comment-15268063 ] Sital Kedia commented on SPARK-15074: - Sure, I will do that. Thanks for the quick response. > Spark

[jira] [Commented] (SPARK-15074) Spark shuffle service bottlenecked while fetching large amount of intermediate data

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268060#comment-15268060 ] Reynold Xin commented on SPARK-15074: - If you want to prototype that and do some testing on your job

[jira] [Commented] (SPARK-13749) Faster pivot implementation for many distinct values with two phase aggregation

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268061#comment-15268061 ] Apache Spark commented on SPARK-13749: -- User 'aray' has created a pull request for this issue:

[jira] [Resolved] (SPARK-15079) Support average/count/sum in LongAccumulator/DoubleAccumulator

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-15079. - Resolution: Fixed > Support average/count/sum in LongAccumulator/DoubleAccumulator >

[jira] [Commented] (SPARK-15074) Spark shuffle service bottlenecked while fetching large amount of intermediate data

2016-05-02 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268039#comment-15268039 ] Sital Kedia commented on SPARK-15074: - Since the index files contain 8 bytes (Long) per reduce task,

[jira] [Assigned] (SPARK-15084) Use builder pattern to create SparkSession in PySpark

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15084: Assignee: Apache Spark > Use builder pattern to create SparkSession in PySpark >

[jira] [Assigned] (SPARK-15084) Use builder pattern to create SparkSession in PySpark

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15084: Assignee: (was: Apache Spark) > Use builder pattern to create SparkSession in PySpark

[jira] [Commented] (SPARK-15084) Use builder pattern to create SparkSession in PySpark

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268037#comment-15268037 ] Apache Spark commented on SPARK-15084: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Created] (SPARK-15084) Use builder pattern to create SparkSession in PySpark

2016-05-02 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-15084: - Summary: Use builder pattern to create SparkSession in PySpark Key: SPARK-15084 URL: https://issues.apache.org/jira/browse/SPARK-15084 Project: Spark

[jira] [Updated] (SPARK-15084) Use builder pattern to create SparkSession in PySpark

2016-05-02 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-15084: -- Issue Type: Sub-task (was: New Feature) Parent: SPARK-13485 > Use builder pattern to

[jira] [Assigned] (SPARK-6339) Support creating temporary tables with DDL

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6339: --- Assignee: (was: Apache Spark) > Support creating temporary tables with DDL >

[jira] [Commented] (SPARK-6339) Support creating temporary tables with DDL

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267999#comment-15267999 ] Apache Spark commented on SPARK-6339: - User 'clockfly' has created a pull request for this issue:

[jira] [Assigned] (SPARK-6339) Support creating temporary tables with DDL

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6339: --- Assignee: Apache Spark > Support creating temporary tables with DDL >

[jira] [Comment Edited] (SPARK-15044) spark-sql will throw "input path does not exist" exception if it handles a partition which exists in hive table, but the path is removed manually

2016-05-02 Thread huangyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267974#comment-15267974 ] huangyu edited comment on SPARK-15044 at 5/3/16 2:51 AM: - I removed the path by

[jira] [Commented] (SPARK-15044) spark-sql will throw "input path does not exist" exception if it handles a partition which exists in hive table, but the path is removed manually

2016-05-02 Thread huangyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267978#comment-15267978 ] huangyu commented on SPARK-15044: - Maybe you can try in Spark 1.6.1, because when I tried in Spark 1.6.1

[jira] [Commented] (SPARK-14476) Show table name or path in string of DataSourceScan

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267975#comment-15267975 ] Yin Huai commented on SPARK-14476: -- btw, for the path, we should hide the authority. Also, if the path

[jira] [Commented] (SPARK-15044) spark-sql will throw "input path does not exist" exception if it handles a partition which exists in hive table, but the path is removed manually

2016-05-02 Thread huangyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267974#comment-15267974 ] huangyu commented on SPARK-15044: - I removed the path by hadoop command, "hadoop fs -rmr

[jira] [Resolved] (SPARK-14685) Properly document heritability of localProperties

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-14685. - Resolution: Fixed Assignee: Marcin Tustin Fix Version/s: 2.0.0 > Properly

[jira] [Updated] (SPARK-15083) JobProgressListener should limit memory usage of tasks in stages.

2016-05-02 Thread Zheng Tan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Tan updated SPARK-15083: -- Attachment: Screen Shot 2016-05-01 at 3.50.02 PM.png Screen Shot 2016-05-01 at 3.51.01

[jira] [Created] (SPARK-15083) JobProgressListener should limit memory usage of tasks in stages.

2016-05-02 Thread Zheng Tan (JIRA)
Zheng Tan created SPARK-15083: - Summary: JobProgressListener should limit memory usage of tasks in stages. Key: SPARK-15083 URL: https://issues.apache.org/jira/browse/SPARK-15083 Project: Spark

[jira] [Commented] (SPARK-15074) Spark shuffle service bottlenecked while fetching large amount of intermediate data

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267932#comment-15267932 ] Reynold Xin commented on SPARK-15074: - Probably a good idea to explore. How big is the index file?

[jira] [Created] (SPARK-15082) Improve unit test coverage for AccumulatorV2

2016-05-02 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-15082: --- Summary: Improve unit test coverage for AccumulatorV2 Key: SPARK-15082 URL: https://issues.apache.org/jira/browse/SPARK-15082 Project: Spark Issue Type:

[jira] [Created] (SPARK-15081) Move AccumulatorV2 and subclasses into util package

2016-05-02 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-15081: --- Summary: Move AccumulatorV2 and subclasses into util package Key: SPARK-15081 URL: https://issues.apache.org/jira/browse/SPARK-15081 Project: Spark Issue

[jira] [Created] (SPARK-15080) Break copyAndReset into copy and reset

2016-05-02 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-15080: --- Summary: Break copyAndReset into copy and reset Key: SPARK-15080 URL: https://issues.apache.org/jira/browse/SPARK-15080 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-15079) Support average/count/sum in LongAccumulator/DoubleAccumulator

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15079: Assignee: Reynold Xin (was: Apache Spark) > Support average/count/sum in

[jira] [Commented] (SPARK-15079) Support average/count/sum in LongAccumulator/DoubleAccumulator

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267917#comment-15267917 ] Apache Spark commented on SPARK-15079: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-15079) Support average/count/sum in LongAccumulator/DoubleAccumulator

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15079: Assignee: Apache Spark (was: Reynold Xin) > Support average/count/sum in

[jira] [Created] (SPARK-15079) Support average/count/sum in LongAccumulator/DoubleAccumulator

2016-05-02 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-15079: --- Summary: Support average/count/sum in LongAccumulator/DoubleAccumulator Key: SPARK-15079 URL: https://issues.apache.org/jira/browse/SPARK-15079 Project: Spark

[jira] [Commented] (SPARK-15064) Locale support in StopWordsRemover

2016-05-02 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267910#comment-15267910 ] yuhao yang commented on SPARK-15064: Yes, something like that, as long as there's different behavior

[jira] [Comment Edited] (SPARK-15064) Locale support in StopWordsRemover

2016-05-02 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15266965#comment-15266965 ] yuhao yang edited comment on SPARK-15064 at 5/3/16 1:48 AM: Good point. I can

[jira] [Comment Edited] (SPARK-15064) Locale support in StopWordsRemover

2016-05-02 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15266965#comment-15266965 ] yuhao yang edited comment on SPARK-15064 at 5/3/16 1:47 AM: Good point. -I

[jira] [Assigned] (SPARK-15059) Fine-grained class loader lock in ChildFirstURLClassLoader caused dead locks

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15059: Assignee: Apache Spark > Fine-grained class loader lock in ChildFirstURLClassLoader

[jira] [Assigned] (SPARK-15059) Fine-grained class loader lock in ChildFirstURLClassLoader caused dead locks

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15059: Assignee: (was: Apache Spark) > Fine-grained class loader lock in

[jira] [Commented] (SPARK-15059) Fine-grained class loader lock in ChildFirstURLClassLoader caused dead locks

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267894#comment-15267894 ] Apache Spark commented on SPARK-15059: -- User 'tankkyo' has created a pull request for this issue:

[jira] [Commented] (SPARK-14997) Files in subdirectories are incorrectly considered in sqlContext.read.json()

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267891#comment-15267891 ] Apache Spark commented on SPARK-14997: -- User 'tdas' has created a pull request for this issue:

[jira] [Commented] (SPARK-10216) Avoid creating empty files during overwrite into Hive table with group by query

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267886#comment-15267886 ] Apache Spark commented on SPARK-10216: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Resolved] (SPARK-15077) StreamExecution.awaitOffset may take too long because of thread starvation

2016-05-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-15077. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12852

[jira] [Updated] (SPARK-15062) Show on DataFrame causes OutOfMemoryError, NegativeArraySizeException or segfault

2016-05-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15062: - Assignee: Bo Meng > Show on DataFrame causes OutOfMemoryError,

[jira] [Resolved] (SPARK-15062) Show on DataFrame causes OutOfMemoryError, NegativeArraySizeException or segfault

2016-05-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-15062. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12849

[jira] [Updated] (SPARK-15037) Use SparkSession instead of SQLContext in testsuites

2016-05-02 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-15037: -- Summary: Use SparkSession instead of SQLContext in testsuites (was: Use SparkSession instread of

[jira] [Resolved] (SPARK-15047) Cleanup SQLParser

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-15047. - Resolution: Fixed Fix Version/s: 2.0.0 > Cleanup SQLParser > - > >

[jira] [Commented] (SPARK-15039) Kinesis reciever does not work in Yarn

2016-05-02 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267857#comment-15267857 ] Saisai Shao commented on SPARK-15039: - [~ltsai], do you have an exception stack about the failure you

[jira] [Commented] (SPARK-15078) Add all TPCDS 1.4 benchmark queries for SparkSQL

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267852#comment-15267852 ] Apache Spark commented on SPARK-15078: -- User 'sameeragarwal' has created a pull request for this

[jira] [Assigned] (SPARK-15078) Add all TPCDS 1.4 benchmark queries for SparkSQL

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15078: Assignee: (was: Apache Spark) > Add all TPCDS 1.4 benchmark queries for SparkSQL >

[jira] [Assigned] (SPARK-15078) Add all TPCDS 1.4 benchmark queries for SparkSQL

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15078: Assignee: Apache Spark > Add all TPCDS 1.4 benchmark queries for SparkSQL >

[jira] [Created] (SPARK-15078) Add all TPCDS 1.4 benchmark queries for SparkSQL

2016-05-02 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-15078: -- Summary: Add all TPCDS 1.4 benchmark queries for SparkSQL Key: SPARK-15078 URL: https://issues.apache.org/jira/browse/SPARK-15078 Project: Spark Issue

[jira] [Resolved] (SPARK-15050) Put CSV options as Python csv function parameters

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-15050. - Resolution: Fixed Fix Version/s: 2.0.0 > Put CSV options as Python csv function

[jira] [Commented] (SPARK-14414) Make error messages consistent across DDLs

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267824#comment-15267824 ] Apache Spark commented on SPARK-14414: -- User 'andrewor14' has created a pull request for this issue:

[jira] [Commented] (SPARK-14521) StackOverflowError in Kryo when executing TPC-DS

2016-05-02 Thread Yan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267811#comment-15267811 ] Yan commented on SPARK-14521: - Yes, we are serializing the fields that should not be serialized. Please check

[jira] [Assigned] (SPARK-15077) StreamExecution.awaitOffset may take too long because of thread starvation

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15077: Assignee: Shixiong Zhu (was: Apache Spark) > StreamExecution.awaitOffset may take too

[jira] [Commented] (SPARK-15077) StreamExecution.awaitOffset may take too long because of thread starvation

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267786#comment-15267786 ] Apache Spark commented on SPARK-15077: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-15077) StreamExecution.awaitOffset may take too long because of thread starvation

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15077: Assignee: Apache Spark (was: Shixiong Zhu) > StreamExecution.awaitOffset may take too

[jira] [Created] (SPARK-15077) StreamExecution.awaitOffset may take too long because of thread starvation

2016-05-02 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-15077: Summary: StreamExecution.awaitOffset may take too long because of thread starvation Key: SPARK-15077 URL: https://issues.apache.org/jira/browse/SPARK-15077 Project:

[jira] [Commented] (SPARK-15059) Fine-grained class loader lock in ChildFirstURLClassLoader caused dead locks

2016-05-02 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267763#comment-15267763 ] Marcelo Vanzin commented on SPARK-15059: Sure. > Fine-grained class loader lock in

[jira] [Commented] (SPARK-3210) Flume Polling Receiver must be more tolerant to connection failures.

2016-05-02 Thread Neelesh Srinivas Salian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267761#comment-15267761 ] Neelesh Srinivas Salian commented on SPARK-3210: Is this still valid [~tdas]? > Flume

[jira] [Commented] (SPARK-15059) Fine-grained class loader lock in ChildFirstURLClassLoader caused dead locks

2016-05-02 Thread Zheng Tan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267752#comment-15267752 ] Zheng Tan commented on SPARK-15059: --- Yes, we have removed it in our product environment, and it works

[jira] [Resolved] (SPARK-14747) Add assertStreaming/assertNoneStreaming checks in DataFrameWriter

2016-05-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-14747. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12521

[jira] [Commented] (SPARK-14521) StackOverflowError in Kryo when executing TPC-DS

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267746#comment-15267746 ] Yin Huai commented on SPARK-14521: -- Do we know the root cause? > StackOverflowError in Kryo when

[jira] [Commented] (SPARK-14146) Imported implicits can't be found in Spark REPL in some cases

2016-05-02 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267742#comment-15267742 ] Wenchen Fan commented on SPARK-14146: - this is actually a scala issue:

[jira] [Resolved] (SPARK-12540) Support all TPCDS queries

2016-05-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-12540. Resolution: Fixed Fix Version/s: 2.0.0 > Support all TPCDS queries >

[jira] [Commented] (SPARK-12540) Support all TPCDS queries

2016-05-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267739#comment-15267739 ] Davies Liu commented on SPARK-12540: We made it into Spark 2.0 finally, bingo! > Support all TPCDS

[jira] [Updated] (SPARK-14896) Deprecate HiveContext in Python

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-14896: - Priority: Critical (was: Major) > Deprecate HiveContext in Python > --- > >

[jira] [Updated] (SPARK-15032) When we create a new JDBC session, we may need to create a new session of executionHive

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-15032: - Priority: Critical (was: Major) > When we create a new JDBC session, we may need to create a new

[jira] [Updated] (SPARK-6817) DataFrame UDFs in R

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-6817: Component/s: (was: SQL) > DataFrame UDFs in R > --- > > Key: SPARK-6817

[jira] [Updated] (SPARK-14785) Support correlated scalar subquery

2016-05-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14785: --- Assignee: Herman van Hovell > Support correlated scalar subquery >

[jira] [Resolved] (SPARK-14785) Support correlated scalar subquery

2016-05-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14785. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12822

[jira] [Commented] (SPARK-13753) Column nullable is derived incorrectly

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267730#comment-15267730 ] Yin Huai commented on SPARK-13753: -- [~davies] Thank you for looking at it. Yea, we do have the

[jira] [Updated] (SPARK-15058) Enable Java DecisionTree Save/Load tests

2016-05-02 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-15058: -- Summary: Enable Java DecisionTree Save/Load tests (was: Enable Java DecisionTree

[jira] [Updated] (SPARK-12928) Oracle FLOAT datatype is not properly handled when reading via JDBC

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12928: Assignee: Greg Michalopoulos > Oracle FLOAT datatype is not properly handled when reading via JDBC

[jira] [Resolved] (SPARK-12928) Oracle FLOAT datatype is not properly handled when reading via JDBC

2016-05-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12928. - Resolution: Fixed Fix Version/s: 2.0.0 > Oracle FLOAT datatype is not properly handled

[jira] [Commented] (SPARK-14906) Move VectorUDT and MatrixUDT in PySpark to new ML package

2016-05-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267704#comment-15267704 ] Liang-Chi Hsieh commented on SPARK-14906: - ok. I will do it soon. > Move VectorUDT and MatrixUDT

[jira] [Comment Edited] (SPARK-13649) Move CalendarInterval out of unsafe package

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267686#comment-15267686 ] Yin Huai edited comment on SPARK-13649 at 5/2/16 10:59 PM: --- I am setting the

[jira] [Assigned] (SPARK-15072) Remove SparkSession.withHiveSupport

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15072: Assignee: (was: Apache Spark) > Remove SparkSession.withHiveSupport >

[jira] [Commented] (SPARK-13753) Column nullable is derived incorrectly

2016-05-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267689#comment-15267689 ] Davies Liu commented on SPARK-13753: After looking at the query, the bug is caused by the fact that we thought the

[jira] [Commented] (SPARK-8108) Build Hive module by default (i.e. remove -Phive profile)

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267691#comment-15267691 ] Yin Huai commented on SPARK-8108: - Are we going to do it for 2.0? > Build Hive module by default (i.e.

[jira] [Assigned] (SPARK-15072) Remove SparkSession.withHiveSupport

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15072: Assignee: Apache Spark > Remove SparkSession.withHiveSupport >

[jira] [Commented] (SPARK-15072) Remove SparkSession.withHiveSupport

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267690#comment-15267690 ] Apache Spark commented on SPARK-15072: -- User 'techaddict' has created a pull request for this issue:

[jira] [Updated] (SPARK-13649) Move CalendarInterval out of unsafe package

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-13649: - Target Version/s: 2.1.0 > Move CalendarInterval out of unsafe package >

[jira] [Updated] (SPARK-13649) Move CalendarInterval out of unsafe package

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-13649: - Target Version/s: (was: 2.0.0) > Move CalendarInterval out of unsafe package >

[jira] [Commented] (SPARK-13649) Move CalendarInterval out of unsafe package

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267686#comment-15267686 ] Yin Huai commented on SPARK-13649: -- I am dropping the target version for now. > Move CalendarInterval

[jira] [Commented] (SPARK-13649) Move CalendarInterval out of unsafe package

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267684#comment-15267684 ] Yin Huai commented on SPARK-13649: -- Its doc says that {{The internal representation of interval type.}}.

[jira] [Commented] (SPARK-13649) Move CalendarInterval out of unsafe package

2016-05-02 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267682#comment-15267682 ] Yin Huai commented on SPARK-13649: -- Can you remind me when we will return it? When we use something like

[jira] [Commented] (SPARK-15037) Use SparkSession instead of SQLContext in testsuites

2016-05-02 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267661#comment-15267661 ] Dongjoon Hyun commented on SPARK-15037: --- Thanks. I'll proceed this after example one as soon as

[jira] [Commented] (SPARK-14900) spark.ml classification metrics should include accuracy

2016-05-02 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267660#comment-15267660 ] Miao Wang commented on SPARK-14900: --- Got stuck on SPARK-14898. Switch to working on this one now. Miao

[jira] [Commented] (SPARK-15074) Spark shuffle service bottlenecked while fetching large amount of intermediate data

2016-05-02 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267659#comment-15267659 ] Sital Kedia commented on SPARK-15074: - [~rxin], [~srowen] - Please let me know what you think of the

[jira] [Commented] (SPARK-15076) Improve ConstantFolding optimizer to use integral associative property

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267657#comment-15267657 ] Apache Spark commented on SPARK-15076: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-15076) Improve ConstantFolding optimizer to use integral associative property

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15076: Assignee: (was: Apache Spark) > Improve ConstantFolding optimizer to use integral

[jira] [Assigned] (SPARK-15076) Improve ConstantFolding optimizer to use integral associative property

2016-05-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15076: Assignee: Apache Spark > Improve ConstantFolding optimizer to use integral associative

[jira] [Resolved] (SPARK-15068) Use proper metastore warehouse path

2016-05-02 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-15068. --- Resolution: Duplicate > Use proper metastore warehouse path > --- >

[jira] [Commented] (SPARK-14226) Caching a table with 1,100 columns and a few million rows fails

2016-05-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267651#comment-15267651 ] Davies Liu commented on SPARK-14226: [~falaki] Could you reproduce this with latest master? (or 2.0

[jira] [Comment Edited] (SPARK-12344) Remove env-based configurations

2016-05-02 Thread Amit Shinde (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267647#comment-15267647 ] Amit Shinde edited comment on SPARK-12344 at 5/2/16 10:38 PM: -- Hi: I'm

[jira] [Commented] (SPARK-12344) Remove env-based configurations

2016-05-02 Thread Amit Shinde (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267647#comment-15267647 ] Amit Shinde commented on SPARK-12344: - Hi: I'm working on the other env vars as well. I filed a PR
