[jira] [Resolved] (SPARK-13150) Flaky test: org.apache.spark.sql.hive.thriftserver.SingleSessionSuite.test single session

2016-02-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-13150. - Resolution: Fixed Assignee: Herman van Hovell (was: Cheng Lian) > Flaky test: org.apache.s

[jira] [Assigned] (SPARK-13164) Replace deprecated synchronizedBuffer in core

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13164: Assignee: Apache Spark > Replace deprecated synchronizedBuffer in core > -

[jira] [Commented] (SPARK-13164) Replace deprecated synchronizedBuffer in core

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131126#comment-15131126 ] Apache Spark commented on SPARK-13164: -- User 'holdenk' has created a pull request fo

[jira] [Assigned] (SPARK-13164) Replace deprecated synchronizedBuffer in core

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13164: Assignee: (was: Apache Spark) > Replace deprecated synchronizedBuffer in core > --

[jira] [Commented] (SPARK-13116) TungstenAggregate though it is supposedly capable of all processing unsafe & safe rows, fails if the input is safe rows

2016-02-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131138#comment-15131138 ] Davies Liu commented on SPARK-13116: Could you provide a test to reproduce this issue

[jira] [Resolved] (SPARK-12739) Details of batch in Streaming tab uses two Duration columns

2016-02-03 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-12739. -- Resolution: Fixed > Details of batch in Streaming tab uses two Duration columns > -

[jira] [Commented] (SPARK-13046) Partitioning looks broken in 1.6

2016-02-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131170#comment-15131170 ] Davies Liu commented on SPARK-13046: I tried Spark 1.6 and master with a directory li

[jira] [Commented] (SPARK-13116) TungstenAggregate though it is supposedly capable of all processing unsafe & safe rows, fails if the input is safe rows

2016-02-03 Thread Asif Hussain Shahid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131175#comment-15131175 ] Asif Hussain Shahid commented on SPARK-13116: - I will check if my tests encou

[jira] [Assigned] (SPARK-11316) isEmpty before coalesce seems to cause huge performance issue in setupGroups

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11316: Assignee: (was: Apache Spark) > isEmpty before coalesce seems to cause huge performanc

[jira] [Commented] (SPARK-11316) isEmpty before coalesce seems to cause huge performance issue in setupGroups

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131196#comment-15131196 ] Apache Spark commented on SPARK-11316: -- User 'zhuoliu' has created a pull request fo

[jira] [Assigned] (SPARK-11316) isEmpty before coalesce seems to cause huge performance issue in setupGroups

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11316: Assignee: Apache Spark > isEmpty before coalesce seems to cause huge performance issue in

[jira] [Commented] (SPARK-13131) Use median time in benchmark

2016-02-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131205#comment-15131205 ] Davies Liu commented on SPARK-13131: [~piccolbo] Thanks for your comments, we also hav

[jira] [Updated] (SPARK-13131) Use best time and average time in micro benchmark

2016-02-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13131: --- Summary: Use best time and average time in micro benchmark (was: Use median time in benchmark) > U

[jira] [Updated] (SPARK-13131) Use best time and average time in micro benchmark

2016-02-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13131: --- Description: Best time should be more stable than average time in benchmark, together with average ti

[jira] [Created] (SPARK-13166) Remove DataStreamReader/Writer

2016-02-03 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-13166: --- Summary: Remove DataStreamReader/Writer Key: SPARK-13166 URL: https://issues.apache.org/jira/browse/SPARK-13166 Project: Spark Issue Type: Sub-task C

[jira] [Assigned] (SPARK-13166) Remove DataStreamReader/Writer

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13166: Assignee: Reynold Xin (was: Apache Spark) > Remove DataStreamReader/Writer >

[jira] [Commented] (SPARK-13166) Remove DataStreamReader/Writer

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131252#comment-15131252 ] Apache Spark commented on SPARK-13166: -- User 'rxin' has created a pull request for t

[jira] [Assigned] (SPARK-13166) Remove DataStreamReader/Writer

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13166: Assignee: Apache Spark (was: Reynold Xin) > Remove DataStreamReader/Writer >

[jira] [Created] (SPARK-13167) JDBC data source does not include null value partition columns rows in the result.

2016-02-03 Thread Suresh Thalamati (JIRA)
Suresh Thalamati created SPARK-13167: Summary: JDBC data source does not include null value partition columns rows in the result. Key: SPARK-13167 URL: https://issues.apache.org/jira/browse/SPARK-13167

[jira] [Commented] (SPARK-13167) JDBC data source does not include null value partition columns rows in the result.

2016-02-03 Thread Suresh Thalamati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131258#comment-15131258 ] Suresh Thalamati commented on SPARK-13167: -- I am working on a fix for this issue.

[jira] [Commented] (SPARK-13167) JDBC data source does not include null value partition columns rows in the result.

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131285#comment-15131285 ] Apache Spark commented on SPARK-13167: -- User 'sureshthalamati' has created a pull re

[jira] [Assigned] (SPARK-13167) JDBC data source does not include null value partition columns rows in the result.

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13167: Assignee: Apache Spark > JDBC data source does not include null value partition columns ro

[jira] [Assigned] (SPARK-13167) JDBC data source does not include null value partition columns rows in the result.

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13167: Assignee: (was: Apache Spark) > JDBC data source does not include null value partition

[jira] [Commented] (SPARK-12992) Vectorize parquet decoding using ColumnarBatch

2016-02-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131336#comment-15131336 ] Davies Liu commented on SPARK-12992: [~nongli] We usually have one PR for one JIRA, t

[jira] [Commented] (SPARK-13068) Extend pyspark ml paramtype conversion to support lists

2016-02-03 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131344#comment-15131344 ] holdenk commented on SPARK-13068: - This seems like a good direction, the current approach

[jira] [Created] (SPARK-13168) Collapse adjacent Repartition operations

2016-02-03 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-13168: -- Summary: Collapse adjacent Repartition operations Key: SPARK-13168 URL: https://issues.apache.org/jira/browse/SPARK-13168 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-13168) Collapse adjacent Repartition operations

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13168: Assignee: Josh Rosen (was: Apache Spark) > Collapse adjacent Repartition operations > ---

[jira] [Commented] (SPARK-13168) Collapse adjacent Repartition operations

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131348#comment-15131348 ] Apache Spark commented on SPARK-13168: -- User 'JoshRosen' has created a pull request

[jira] [Assigned] (SPARK-13168) Collapse adjacent Repartition operations

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13168: Assignee: Apache Spark (was: Josh Rosen) > Collapse adjacent Repartition operations > ---

[jira] [Updated] (SPARK-13149) Add FileStreamSource

2016-02-03 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-13149: - Summary: Add FileStreamSource (was: Add FileStreamSource and a simple version of FileStreamSink)

[jira] [Commented] (SPARK-13095) improve performance of hash join with dimension table

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131352#comment-15131352 ] Apache Spark commented on SPARK-13095: -- User 'davies' has created a pull request for

[jira] [Commented] (SPARK-13131) Use best time and average time in micro benchmark

2016-02-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131353#comment-15131353 ] Sean Owen commented on SPARK-13131: --- Isn't best time in fact the best estimator of what

[jira] [Resolved] (SPARK-13160) PySpark CDH 5

2016-02-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13160. --- Resolution: Invalid Not the place for questions - u...@spark.apache.org > PySpark CDH 5 > --

[jira] [Assigned] (SPARK-13165) Replace deprecated synchronizedBuffer in streaming

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13165: Assignee: Apache Spark > Replace deprecated synchronizedBuffer in streaming >

[jira] [Commented] (SPARK-13165) Replace deprecated synchronizedBuffer in streaming

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131383#comment-15131383 ] Apache Spark commented on SPARK-13165: -- User 'holdenk' has created a pull request fo

[jira] [Assigned] (SPARK-13165) Replace deprecated synchronizedBuffer in streaming

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13165: Assignee: (was: Apache Spark) > Replace deprecated synchronizedBuffer in streaming > -

[jira] [Resolved] (SPARK-6715) Eliminate duplicate filters from pushdown predicates

2016-02-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-6715. --- Resolution: Won't Fix I believe that this has been addressed by https://issues.apache.org/jira/browse

[jira] [Created] (SPARK-13170) Investigate replacing SynchronizedQueue as it is deprecated

2016-02-03 Thread holdenk (JIRA)
holdenk created SPARK-13170: --- Summary: Investigate replacing SynchronizedQueue as it is deprecated Key: SPARK-13170 URL: https://issues.apache.org/jira/browse/SPARK-13170 Project: Spark Issue Type

[jira] [Created] (SPARK-13169) CROSS JOIN slow or fails on tiny table

2016-02-03 Thread Antonio Piccolboni (JIRA)
Antonio Piccolboni created SPARK-13169: -- Summary: CROSS JOIN slow or fails on tiny table Key: SPARK-13169 URL: https://issues.apache.org/jira/browse/SPARK-13169 Project: Spark Issue Type

[jira] [Commented] (SPARK-7376) Python: Add validation functionality to individual Param

2016-02-03 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131401#comment-15131401 ] Seth Hendrickson commented on SPARK-7376: - I am seeing this Jira now after several

[jira] [Created] (SPARK-13171) Update promise & future to Promise and Future as the old ones are deprecated

2016-02-03 Thread holdenk (JIRA)
holdenk created SPARK-13171: --- Summary: Update promise & future to Promise and Future as the old ones are deprecated Key: SPARK-13171 URL: https://issues.apache.org/jira/browse/SPARK-13171 Project: Spark

[jira] [Created] (SPARK-13172) Stop using RichException.getStackTrace it is deprecated

2016-02-03 Thread holdenk (JIRA)
holdenk created SPARK-13172: --- Summary: Stop using RichException.getStackTrace it is deprecated Key: SPARK-13172 URL: https://issues.apache.org/jira/browse/SPARK-13172 Project: Spark Issue Type: Imp

[jira] [Resolved] (SPARK-3611) Show number of cores for each executor in application web UI

2016-02-03 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-3611. - Resolution: Fixed Assignee: Alex Bozarth Fix Version/s: 2.0.0 > Show number of cor

[jira] [Created] (SPARK-13173) Fail to load CSV file with NPE

2016-02-03 Thread Davies Liu (JIRA)
Davies Liu created SPARK-13173: -- Summary: Fail to load CSV file with NPE Key: SPARK-13173 URL: https://issues.apache.org/jira/browse/SPARK-13173 Project: Spark Issue Type: Bug Report

[jira] [Created] (SPARK-13174) Add API and options for csv data sources

2016-02-03 Thread Davies Liu (JIRA)
Davies Liu created SPARK-13174: -- Summary: Add API and options for csv data sources Key: SPARK-13174 URL: https://issues.apache.org/jira/browse/SPARK-13174 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-13131) Use best time and average time in micro benchmark

2016-02-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131413#comment-15131413 ] Davies Liu commented on SPARK-13131: [~srowen] Fully agreed with you, that's my first

[jira] [Created] (SPARK-13175) Scala 2.11 deprecation warnings cleanup

2016-02-03 Thread holdenk (JIRA)
holdenk created SPARK-13175: --- Summary: Scala 2.11 deprecation warnings cleanup Key: SPARK-13175 URL: https://issues.apache.org/jira/browse/SPARK-13175 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-13172) Stop using RichException.getStackTrace it is deprecated

2016-02-03 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-13172: Issue Type: Sub-task (was: Improvement) Parent: SPARK-13175 > Stop using RichException.getStackTra

[jira] [Updated] (SPARK-13171) Update promise & future to Promise and Future as the old ones are deprecated

2016-02-03 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-13171: Issue Type: Sub-task (was: Improvement) Parent: SPARK-13175 > Update promise & future to Promise a

[jira] [Updated] (SPARK-13170) Investigate replacing SynchronizedQueue as it is deprecated

2016-02-03 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-13170: Issue Type: Sub-task (was: Improvement) Parent: SPARK-13175 > Investigate replacing SynchronizedQu

[jira] [Commented] (SPARK-13046) Partitioning looks broken in 1.6

2016-02-03 Thread Julien Baley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131418#comment-15131418 ] Julien Baley commented on SPARK-13046: -- Hi Davies, I have no other file in the midd

[jira] [Updated] (SPARK-13176) Ignore deprecation warning for ProcessBuilder lines_!

2016-02-03 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-13176: Issue Type: Sub-task (was: Improvement) Parent: SPARK-13175 > Ignore deprecation warning for Proce

[jira] [Created] (SPARK-13176) Ignore deprecation warning for ProcessBuilder lines_!

2016-02-03 Thread holdenk (JIRA)
holdenk created SPARK-13176: --- Summary: Ignore deprecation warning for ProcessBuilder lines_! Key: SPARK-13176 URL: https://issues.apache.org/jira/browse/SPARK-13176 Project: Spark Issue Type: Impro

[jira] [Created] (SPARK-13177) Update ActorWordCount example to not directly use low level linked list as it is deprecated.

2016-02-03 Thread holdenk (JIRA)
holdenk created SPARK-13177: --- Summary: Update ActorWordCount example to not directly use low level linked list as it is deprecated. Key: SPARK-13177 URL: https://issues.apache.org/jira/browse/SPARK-13177 Pr

[jira] [Created] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Xusen Yin (JIRA)
Xusen Yin created SPARK-13178: - Summary: RRDD faces with concurrency issue in case of rdd.zip(rdd).count() Key: SPARK-13178 URL: https://issues.apache.org/jira/browse/SPARK-13178 Project: Spark

[jira] [Updated] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xusen Yin updated SPARK-13178: -- Description: In the KMeans algorithm, there is a zip operation before taking samples, i.e. https://github.

[jira] [Resolved] (SPARK-13166) Remove DataStreamReader/Writer

2016-02-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-13166. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11062 [htt

[jira] [Updated] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xusen Yin updated SPARK-13178: -- Description: In the KMeans algorithm, there is a zip operation before taking samples, i.e. https://github.

[jira] [Resolved] (SPARK-13101) Dataset complex types mapping to DataFrame (element nullability) mismatch

2016-02-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-13101. -- Resolution: Fixed Fix Version/s: 1.6.1 Issue resolved by pull request 11042 [htt

[jira] [Assigned] (SPARK-13101) Dataset complex types mapping to DataFrame (element nullability) mismatch

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13101: Assignee: Apache Spark (was: Wenchen Fan) > Dataset complex types mapping to DataFrame (

[jira] [Reopened] (SPARK-13101) Dataset complex types mapping to DataFrame (element nullability) mismatch

2016-02-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reopened SPARK-13101: -- Assignee: Wenchen Fan > Dataset complex types mapping to DataFrame (element nullabil

[jira] [Updated] (SPARK-13101) Dataset complex types mapping to DataFrame (element nullability) mismatch

2016-02-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-13101: - Target Version/s: 1.6.1, 2.0.0 (was: 1.6.1) > Dataset complex types mapping to DataFrame

[jira] [Assigned] (SPARK-13101) Dataset complex types mapping to DataFrame (element nullability) mismatch

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13101: Assignee: Wenchen Fan (was: Apache Spark) > Dataset complex types mapping to DataFrame (

[jira] [Commented] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131430#comment-15131430 ] Xusen Yin commented on SPARK-13178: --- Ping [~mengxr] [~shivaram] to know about the concu

[jira] [Updated] (SPARK-13175) Cleanup deprecation warnings from Scala 2.11 upgrade

2016-02-03 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-13175: Summary: Cleanup deprecation warnings from Scala 2.11 upgrade (was: Scala 2.11 deprecation warnings cleanu

[jira] [Created] (SPARK-13179) pyspark row name collision 'count'

2016-02-03 Thread David Fagnan (JIRA)
David Fagnan created SPARK-13179: Summary: pyspark row name collision 'count' Key: SPARK-13179 URL: https://issues.apache.org/jira/browse/SPARK-13179 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131436#comment-15131436 ] Shivaram Venkataraman commented on SPARK-13178: --- Hmm this is tricky to debu

[jira] [Updated] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xusen Yin updated SPARK-13178: -- Description: In the KMeans algorithm, there is a zip operation before taking samples, i.e. https://github.

[jira] [Updated] (SPARK-13179) pyspark row name collision 'count'

2016-02-03 Thread David Fagnan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Fagnan updated SPARK-13179: - Description: The following example from the documentation results in a name collision: {code:none

[jira] [Commented] (SPARK-12514) Spark MetricsSystem can fill disks/cause OOMs when using GangliaSink

2016-02-03 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131439#comment-15131439 ] Jonathan Kelly commented on SPARK-12514: As of Spark 1.6.0, there don't seem to b

[jira] [Commented] (SPARK-12720) SQL generation support for cube, rollup, and grouping set

2016-02-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131441#comment-15131441 ] Xiao Li commented on SPARK-12720: - Since CUBE and ROLLUP are just syntactic sugar for GRO

[jira] [Commented] (SPARK-12720) SQL generation support for cube, rollup, and grouping set

2016-02-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131447#comment-15131447 ] Xiao Li commented on SPARK-12720: - CUBE(a, b, c) = GROUPING SETS((a,b,c), (a,b), (a,c), (

[jira] [Commented] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131455#comment-15131455 ] Xusen Yin commented on SPARK-13178: --- I don't zip RRDD with itself. Actually, the bug ex

[jira] [Comment Edited] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131455#comment-15131455 ] Xusen Yin edited comment on SPARK-13178 at 2/4/16 12:41 AM: I

[jira] [Comment Edited] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131455#comment-15131455 ] Xusen Yin edited comment on SPARK-13178 at 2/4/16 12:43 AM: I

[jira] [Commented] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131463#comment-15131463 ] Xusen Yin commented on SPARK-13178: --- We can work around this by just adding a cache for th

[jira] [Comment Edited] (SPARK-12514) Spark MetricsSystem can fill disks/cause OOMs when using GangliaSink

2016-02-03 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131439#comment-15131439 ] Jonathan Kelly edited comment on SPARK-12514 at 2/4/16 12:58 AM: --

[jira] [Commented] (SPARK-13046) Partitioning looks broken in 1.6

2016-02-03 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131484#comment-15131484 ] Yin Huai commented on SPARK-13046: -- [~julien.baley] Can you add how you load those dirs?

[jira] [Resolved] (SPARK-13131) Use best time and average time in micro benchmark

2016-02-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13131. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11018 [https://github.

[jira] [Created] (SPARK-13180) Protect against SessionState being null when accessing HiveClientImpl#conf

2016-02-03 Thread Ted Yu (JIRA)
Ted Yu created SPARK-13180: -- Summary: Protect against SessionState being null when accessing HiveClientImpl#conf Key: SPARK-13180 URL: https://issues.apache.org/jira/browse/SPARK-13180 Project: Spark

[jira] [Assigned] (SPARK-13180) Protect against SessionState being null when accessing HiveClientImpl#conf

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13180: Assignee: Apache Spark > Protect against SessionState being null when accessing HiveClient

[jira] [Commented] (SPARK-13180) Protect against SessionState being null when accessing HiveClientImpl#conf

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131516#comment-15131516 ] Apache Spark commented on SPARK-13180: -- User 'tedyu' has created a pull request for

[jira] [Assigned] (SPARK-13180) Protect against SessionState being null when accessing HiveClientImpl#conf

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13180: Assignee: (was: Apache Spark) > Protect against SessionState being null when accessing

[jira] [Assigned] (SPARK-13079) Provide an in-memory implementation of the catalog API

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13079: Assignee: Apache Spark > Provide an in-memory implementation of the catalog API >

[jira] [Commented] (SPARK-13079) Provide an in-memory implementation of the catalog API

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131530#comment-15131530 ] Apache Spark commented on SPARK-13079: -- User 'andrewor14' has created a pull request

[jira] [Assigned] (SPARK-13079) Provide an in-memory implementation of the catalog API

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13079: Assignee: (was: Apache Spark) > Provide an in-memory implementation of the catalog API

[jira] [Resolved] (SPARK-13152) Fix task metrics deprecation warning

2016-02-03 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-13152. --- Resolution: Fixed Assignee: holdenk Fix Version/s: 2.0.0 Target Version/s

[jira] [Updated] (SPARK-12145) Hive Authorization V2 interface requires the username information from SessionState

2016-02-03 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated SPARK-12145: - Summary: Hive Authorization V2 interface requires the username information from SessionState (wa

[jira] [Updated] (SPARK-12145) Hive Authorization V2 interface requires the username information from SessionState

2016-02-03 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated SPARK-12145: - Description: We need to pass username information to SessionState in order to initialize the Hive au

[jira] [Commented] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131556#comment-15131556 ] Shivaram Venkataraman commented on SPARK-13178: --- Ah I see - so the problem

[jira] [Commented] (SPARK-13178) RRDD faces with concurrency issue in case of rdd.zip(rdd).count()

2016-02-03 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131634#comment-15131634 ] Sun Rui commented on SPARK-13178: - [~xusen] Could you first use a DataFrame created from

[jira] [Resolved] (SPARK-13079) Provide an in-memory implementation of the catalog API

2016-02-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-13079. - Resolution: Fixed Assignee: Andrew Or Fix Version/s: 2.0.0 > Provide an in-memory

[jira] [Commented] (SPARK-12720) SQL generation support for cube, rollup, and grouping set

2016-02-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131723#comment-15131723 ] Xiao Li commented on SPARK-12720: - Will submit a PR tomorrow. Thanks! > SQL generation s

[jira] [Resolved] (SPARK-12828) support natural join

2016-02-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12828. - Resolution: Fixed Assignee: Adrian Wang Fix Version/s: 2.0.0 > support natural jo

[jira] [Created] (SPARK-13181) Spark delay in task scheduling within executor

2016-02-03 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-13181: - Summary: Spark delay in task scheduling within executor Key: SPARK-13181 URL: https://issues.apache.org/jira/browse/SPARK-13181 Project: Spark Issue Type:

[jira] [Updated] (SPARK-13181) Spark delay in task scheduling within executor

2016-02-03 Thread Prabhu Joseph (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated SPARK-13181: -- Attachment: ran3.JPG > Spark delay in task scheduling within executor > ---

[jira] [Commented] (SPARK-12828) support natural join

2016-02-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131780#comment-15131780 ] Apache Spark commented on SPARK-12828: -- User 'rxin' has created a pull request for t

[jira] [Created] (SPARK-13182) Spark Executor retries infinitely

2016-02-03 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-13182: - Summary: Spark Executor retries infinitely Key: SPARK-13182 URL: https://issues.apache.org/jira/browse/SPARK-13182 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-10814) API design: convergence of batch and streaming DataFrame

2016-02-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-10814. - Resolution: Fixed Fix Version/s: 2.0.0 > API design: convergence of batch and streaming Da

[jira] [Created] (SPARK-13183) Bytebuffers occupy a large amount of heap memory

2016-02-03 Thread dylanzhou (JIRA)
dylanzhou created SPARK-13183: - Summary: Bytebuffers occupy a large amount of heap memory Key: SPARK-13183 URL: https://issues.apache.org/jira/browse/SPARK-13183 Project: Spark Issue Type: Bug
