[jira] [Commented] (SPARK-18538) Concurrent Fetching DataFrameReader JDBC APIs Do Not Work

2016-11-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711221#comment-15711221 ] Wenchen Fan commented on SPARK-18538: - already merged to master, will resolve this ti

[jira] [Commented] (SPARK-18620) Spark Streaming + Kinesis : Receiver MaxRate is violated

2016-11-30 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711171#comment-15711171 ] Takeshi Yamamuro commented on SPARK-18620: -- I quickly checked and I found that t

[jira] [Created] (SPARK-18667) input_file_name function does not work with UDF

2016-11-30 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-18667: Summary: input_file_name function does not work with UDF Key: SPARK-18667 URL: https://issues.apache.org/jira/browse/SPARK-18667 Project: Spark Issue Type: B

[jira] [Commented] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711100#comment-15711100 ] Apache Spark commented on SPARK-18665: -- User 'cenyuhai' has created a pull request f

[jira] [Assigned] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18665: Assignee: (was: Apache Spark) > Spark ThriftServer jobs where are canceled are still “

[jira] [Assigned] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18665: Assignee: Apache Spark > Spark ThriftServer jobs where are canceled are still “STARTED” >

[jira] [Commented] (SPARK-12347) Write script to run all MLlib examples for testing

2016-11-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711095#comment-15711095 ] Nick Pentreath commented on SPARK-12347: Since the PR is still WIP and this is no

[jira] [Updated] (SPARK-12347) Write script to run all MLlib examples for testing

2016-11-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-12347: --- Target Version/s: 2.2.0 (was: 2.1.0) > Write script to run all MLlib examples for testing >

[jira] [Updated] (SPARK-18638) Upgrade sbt, zinc and maven plugins

2016-11-30 Thread Weiqing Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18638: - Description: v2.1.0-rc1 has been out. For 2.2.x, it is better to keep sbt up-to-date, and upgrade

[jira] [Updated] (SPARK-18638) Upgrade sbt, zinc and maven plugins

2016-11-30 Thread Weiqing Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18638: - Summary: Upgrade sbt, zinc and maven plugins (was: Upgrade sbt to 0.13.13) > Upgrade sbt, zinc a

[jira] [Commented] (SPARK-18617) Close "kryo auto pick" feature for Spark Streaming

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710928#comment-15710928 ] Apache Spark commented on SPARK-18617: -- User 'uncleGen' has created a pull request f

[jira] [Updated] (SPARK-18666) Remove the codes checking deprecated config spark.sql.unsafe.enabled

2016-11-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-18666: Description: spark.sql.unsafe.enabled has been deprecated since 1.6. There is still code in Web

[jira] [Assigned] (SPARK-18666) Remove the codes checking deprecated config spark.sql.unsafe.enabled

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18666: Assignee: (was: Apache Spark) > Remove the codes checking deprecated config spark.sql.

[jira] [Assigned] (SPARK-18666) Remove the codes checking deprecated config spark.sql.unsafe.enabled

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18666: Assignee: Apache Spark > Remove the codes checking deprecated config spark.sql.unsafe.enab

[jira] [Commented] (SPARK-18666) Remove the codes checking deprecated config spark.sql.unsafe.enabled

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710891#comment-15710891 ] Apache Spark commented on SPARK-18666: -- User 'viirya' has created a pull request for

[jira] [Created] (SPARK-18666) Remove the codes checking deprecated config spark.sql.unsafe.enabled

2016-11-30 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-18666: --- Summary: Remove the codes checking deprecated config spark.sql.unsafe.enabled Key: SPARK-18666 URL: https://issues.apache.org/jira/browse/SPARK-18666 Project: S

[jira] [Commented] (SPARK-17583) Remove unused rowSeparator variable and set auto-expanding buffer as default for maxCharsPerColumn option in CSV

2016-11-30 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710871#comment-15710871 ] koert kuipers commented on SPARK-17583: --- i see. so you are saying in spark 2.0.x it

[jira] [Resolved] (SPARK-18476) SparkR Logistic Regression should support output original label.

2016-11-30 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-18476. - Resolution: Fixed Assignee: Miao Wang Fix Version/s: 2.1.0 > SparkR Logistic Regr

[jira] [Updated] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2016-11-30 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-18665: -- Attachment: 1179ACF7-3E62-44C5-B01D-CA71C876ECCE.png > Spark ThriftServer jobs where are canceled are s

[jira] [Updated] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2016-11-30 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-18665: -- Attachment: 83C5E8AD-59DE-4A85-A483-2BE3FB83F378.png > Spark ThriftServer jobs where are canceled are s

[jira] [Created] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2016-11-30 Thread cen yuhai (JIRA)
cen yuhai created SPARK-18665: - Summary: Spark ThriftServer jobs where are canceled are still “STARTED” Key: SPARK-18665 URL: https://issues.apache.org/jira/browse/SPARK-18665 Project: Spark Iss

[jira] [Assigned] (SPARK-18541) Add pyspark.sql.Column.aliasWithMetadata to allow dynamic metadata management in pyspark SQL API

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18541: Assignee: Apache Spark > Add pyspark.sql.Column.aliasWithMetadata to allow dynamic metadat

[jira] [Commented] (SPARK-18541) Add pyspark.sql.Column.aliasWithMetadata to allow dynamic metadata management in pyspark SQL API

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710705#comment-15710705 ] Apache Spark commented on SPARK-18541: -- User 'shea-parkes' has created a pull reques

[jira] [Assigned] (SPARK-18541) Add pyspark.sql.Column.aliasWithMetadata to allow dynamic metadata management in pyspark SQL API

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18541: Assignee: (was: Apache Spark) > Add pyspark.sql.Column.aliasWithMetadata to allow dyna

[jira] [Commented] (SPARK-16026) Cost-based Optimizer framework

2016-11-30 Thread Ron Hu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710667#comment-15710667 ] Ron Hu commented on SPARK-16026: Hi Reynold, I previously worked on filter cardinality es

[jira] [Commented] (SPARK-16026) Cost-based Optimizer framework

2016-11-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710618#comment-15710618 ] Reynold Xin commented on SPARK-16026: - [~ZenWzh] can we start working on operator car

[jira] [Commented] (SPARK-18663) Simplify CountMinSketch aggregate implementation

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710616#comment-15710616 ] Apache Spark commented on SPARK-18663: -- User 'rxin' has created a pull request for t

[jira] [Assigned] (SPARK-18663) Simplify CountMinSketch aggregate implementation

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18663: Assignee: Apache Spark (was: Reynold Xin) > Simplify CountMinSketch aggregate implementat

[jira] [Assigned] (SPARK-18663) Simplify CountMinSketch aggregate implementation

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18663: Assignee: Reynold Xin (was: Apache Spark) > Simplify CountMinSketch aggregate implementat

[jira] [Updated] (SPARK-18664) Don't respond to HTTP OPTIONS in HTTP-based UIs

2016-11-30 Thread meiyoula (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] meiyoula updated SPARK-18664: - Description: This was flagged a while ago during a routine security scan (AWVS): the HTTP-based Spark ser

[jira] [Commented] (SPARK-18664) Don't respond to HTTP OPTIONS in HTTP-based UIs

2016-11-30 Thread meiyoula (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710591#comment-15710591 ] meiyoula commented on SPARK-18664: -- [~srowen] It is similar to SPARK-5983, should we fix

[jira] [Created] (SPARK-18664) Don't respond to HTTP OPTIONS in HTTP-based UIs

2016-11-30 Thread meiyoula (JIRA)
meiyoula created SPARK-18664: Summary: Don't respond to HTTP OPTIONS in HTTP-based UIs Key: SPARK-18664 URL: https://issues.apache.org/jira/browse/SPARK-18664 Project: Spark Issue Type: Improveme

[jira] [Created] (SPARK-18663) Simplify CountMinSketch aggregate implementation

2016-11-30 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18663: --- Summary: Simplify CountMinSketch aggregate implementation Key: SPARK-18663 URL: https://issues.apache.org/jira/browse/SPARK-18663 Project: Spark Issue Type: Su

[jira] [Resolved] (SPARK-18644) spark-submit fails to run python scripts with specific names

2016-11-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-18644. -- Resolution: Not A Problem > spark-submit fails to run python scripts with specific names >

[jira] [Commented] (SPARK-18644) spark-submit fails to run python scripts with specific names

2016-11-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710507#comment-15710507 ] Bryan Cutler commented on SPARK-18644: -- Yeah, [~vanzin] is right, it's a python thin

[jira] [Updated] (SPARK-18662) Move cluster managers into their own sub-directory

2016-11-30 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anirudh Ramanathan updated SPARK-18662: --- External issue URL: https://github.com/apache/spark/pull/16092 > Move cluster manager

[jira] [Assigned] (SPARK-18662) Move cluster managers into their own sub-directory

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18662: Assignee: Apache Spark > Move cluster managers into their own sub-directory >

[jira] [Commented] (SPARK-18662) Move cluster managers into their own sub-directory

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710391#comment-15710391 ] Apache Spark commented on SPARK-18662: -- User 'foxish' has created a pull request for

[jira] [Assigned] (SPARK-18662) Move cluster managers into their own sub-directory

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18662: Assignee: (was: Apache Spark) > Move cluster managers into their own sub-directory > -

[jira] [Commented] (SPARK-18650) race condition in FileScanRDD.scala

2016-11-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710358#comment-15710358 ] Hyukjin Kwon commented on SPARK-18650: -- Would it be possible to share your data/sa

[jira] [Resolved] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18655. -- Resolution: Fixed Fix Version/s: 2.1.0 > Ignore Structured Streaming 2.0.2 logs in histo

[jira] [Commented] (SPARK-18617) Close "kryo auto pick" feature for Spark Streaming

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710308#comment-15710308 ] Apache Spark commented on SPARK-18617: -- User 'zsxwing' has created a pull request fo

[jira] [Commented] (SPARK-18560) Receiver data can not be dataSerialized properly.

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710309#comment-15710309 ] Apache Spark commented on SPARK-18560: -- User 'zsxwing' has created a pull request fo

[jira] [Assigned] (SPARK-18122) Fallback to Kryo for unknown classes in ExpressionEncoder

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18122: Assignee: (was: Apache Spark) > Fallback to Kryo for unknown classes in ExpressionEnco

[jira] [Assigned] (SPARK-18122) Fallback to Kryo for unknown classes in ExpressionEncoder

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18122: Assignee: Apache Spark > Fallback to Kryo for unknown classes in ExpressionEncoder > -

[jira] [Created] (SPARK-18662) Move cluster managers into their own sub-directory

2016-11-30 Thread Anirudh Ramanathan (JIRA)
Anirudh Ramanathan created SPARK-18662: -- Summary: Move cluster managers into their own sub-directory Key: SPARK-18662 URL: https://issues.apache.org/jira/browse/SPARK-18662 Project: Spark

[jira] [Reopened] (SPARK-18122) Fallback to Kryo for unknown classes in ExpressionEncoder

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reopened SPARK-18122: -- I'm going to reopen this. I think the benefits outweigh the compatibility concerns. > Fa

[jira] [Closed] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Sina Sohangir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sina Sohangir closed SPARK-18656. - Resolution: Later > org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles >

[jira] [Commented] (SPARK-18661) Creating a partitioned datasource table should not scan all files for table

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710233#comment-15710233 ] Apache Spark commented on SPARK-18661: -- User 'ericl' has created a pull request for

[jira] [Assigned] (SPARK-18661) Creating a partitioned datasource table should not scan all files for table

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18661: Assignee: Apache Spark > Creating a partitioned datasource table should not scan all files

[jira] [Assigned] (SPARK-18661) Creating a partitioned datasource table should not scan all files for table

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18661: Assignee: (was: Apache Spark) > Creating a partitioned datasource table should not sca

[jira] [Created] (SPARK-18661) Creating a partitioned datasource table should not scan all files in filesystem

2016-11-30 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18661: -- Summary: Creating a partitioned datasource table should not scan all files in filesystem Key: SPARK-18661 URL: https://issues.apache.org/jira/browse/SPARK-18661 Project:

[jira] [Updated] (SPARK-18661) Creating a partitioned datasource table should not scan all files for table

2016-11-30 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18661: --- Summary: Creating a partitioned datasource table should not scan all files for table (was: Creating

[jira] [Commented] (SPARK-17583) Remove unused rowSeparator variable and set auto-expanding buffer as default for maxCharsPerColumn option in CSV

2016-11-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710199#comment-15710199 ] Hyukjin Kwon commented on SPARK-17583: -- For example, please refer to the discussion in

[jira] [Commented] (SPARK-18374) Incorrect words in StopWords/english.txt

2016-11-30 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710198#comment-15710198 ] yuhao yang commented on SPARK-18374: I checked with some other lists of stopwords and

[jira] [Commented] (SPARK-17583) Remove unused rowSeparator variable and set auto-expanding buffer as default for maxCharsPerColumn option in CSV

2016-11-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710195#comment-15710195 ] Hyukjin Kwon commented on SPARK-17583: -- Ah, that would not be related to this JIRA

[jira] [Commented] (SPARK-17583) Remove unused rowSeparator variable and set auto-expanding buffer as default for maxCharsPerColumn option in CSV

2016-11-30 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710172#comment-15710172 ] koert kuipers commented on SPARK-17583: --- i just tested out inhouse unit test (which

[jira] [Updated] (SPARK-17939) Spark-SQL Nullability: Optimizations vs. Enforcement Clarification

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17939: - Target Version/s: 2.1.0 > Spark-SQL Nullability: Optimizations vs. Enforcement Clarificat

[jira] [Updated] (SPARK-17939) Spark-SQL Nullability: Optimizations vs. Enforcement Clarification

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17939: - Target Version/s: 2.2.0 (was: 2.1.0) > Spark-SQL Nullability: Optimizations vs. Enforcem

[jira] [Assigned] (SPARK-18658) Writing to a text DataSource buffers one or more lines in memory

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18658: Assignee: (was: Apache Spark) > Writing to a text DataSource buffers one or more lines

[jira] [Assigned] (SPARK-18658) Writing to a text DataSource buffers one or more lines in memory

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18658: Assignee: Apache Spark > Writing to a text DataSource buffers one or more lines in memory

[jira] [Commented] (SPARK-18658) Writing to a text DataSource buffers one or more lines in memory

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710116#comment-15710116 ] Apache Spark commented on SPARK-18658: -- User 'NathanHowell' has created a pull reque

[jira] [Updated] (SPARK-18481) ML 2.1 QA: Remove deprecated methods for ML

2016-11-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18481: -- Description: Remove deprecated methods for ML. This task removed the following (deprec

[jira] [Created] (SPARK-18660) Parquet complains "Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl

2016-11-30 Thread Yin Huai (JIRA)
Yin Huai created SPARK-18660: Summary: Parquet complains "Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl " Key: SPARK-18660

[jira] [Resolved] (SPARK-18546) UnsafeShuffleWriter corrupts encrypted shuffle files when merging

2016-11-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-18546. Resolution: Fixed Fix Version/s: 2.1.1 > UnsafeShuffleWriter corrupts encrypted shuf

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2016-11-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709943#comment-15709943 ] Marcelo Vanzin commented on SPARK-18085: I uploaded code for milestone 3 from the

[jira] [Updated] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18659: --- Description: The first three test cases fail due to a crash in hive client when dropping partitions

[jira] [Assigned] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18659: Assignee: (was: Apache Spark) > Incorrect behaviors in overwrite table for datasource

[jira] [Commented] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709938#comment-15709938 ] Apache Spark commented on SPARK-18659: -- User 'ericl' has created a pull request for

[jira] [Assigned] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18659: Assignee: Apache Spark > Incorrect behaviors in overwrite table for datasource tables > --

[jira] [Updated] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18659: --- Description: The following test cases fail due to a crash in hive client when dropping partitions th

[jira] [Assigned] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18656: Assignee: Apache Spark > org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQ

[jira] [Commented] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Sina Sohangir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709930#comment-15709930 ] Sina Sohangir commented on SPARK-18656: --- Create a PR: https://github.com/apache/spa

[jira] [Issue Comment Deleted] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Sina Sohangir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sina Sohangir updated SPARK-18656: -- Comment: was deleted (was: Created a PR: https://github.com/apache/spark/pull/16087 ) > org.a

[jira] [Comment Edited] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Sina Sohangir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709930#comment-15709930 ] Sina Sohangir edited comment on SPARK-18656 at 11/30/16 10:03 PM: -

[jira] [Commented] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709929#comment-15709929 ] Apache Spark commented on SPARK-18656: -- User 'sinasohangirsc' has created a pull req

[jira] [Assigned] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18656: Assignee: (was: Apache Spark) > org.apache.spark.sql.execution.stat.StatFunctions#mult

[jira] [Updated] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18659: --- Summary: Incorrect behaviors in overwrite table for datasource tables (was: Crash in overwrite table

[jira] [Commented] (SPARK-18251) DataSet API | RuntimeException: Null value appeared in non-nullable field when holding Option Case Class

2016-11-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709869#comment-15709869 ] Cheng Lian commented on SPARK-18251: One more comment about why we shouldn't allow a

[jira] [Updated] (SPARK-18251) DataSet API | RuntimeException: Null value appeared in non-nullable field when holding Option Case Class

2016-11-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-18251: --- Assignee: Wenchen Fan > DataSet API | RuntimeException: Null value appeared in non-nullable field >

[jira] [Resolved] (SPARK-18251) DataSet API | RuntimeException: Null value appeared in non-nullable field when holding Option Case Class

2016-11-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-18251. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 15979 [https://github.

[jira] [Created] (SPARK-18658) Writing to a text DataSource buffers one or more lines in memory

2016-11-30 Thread Nathan Howell (JIRA)
Nathan Howell created SPARK-18658: - Summary: Writing to a text DataSource buffers one or more lines in memory Key: SPARK-18658 URL: https://issues.apache.org/jira/browse/SPARK-18658 Project: Spark

[jira] [Created] (SPARK-18659) Crash in overwrite table partitions due to hive metastore integration

2016-11-30 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18659: -- Summary: Crash in overwrite table partitions due to hive metastore integration Key: SPARK-18659 URL: https://issues.apache.org/jira/browse/SPARK-18659 Project: Spark

[jira] [Commented] (SPARK-18318) ML, Graph 2.1 QA: API: New Scala APIs, docs

2016-11-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709811#comment-15709811 ] Joseph K. Bradley commented on SPARK-18318: --- I did a quick check too and did no

[jira] [Resolved] (SPARK-18318) ML, Graph 2.1 QA: API: New Scala APIs, docs

2016-11-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-18318. --- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 Issue resolved

[jira] [Created] (SPARK-18657) Persist UUID across query restart

2016-11-30 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-18657: Summary: Persist UUID across query restart Key: SPARK-18657 URL: https://issues.apache.org/jira/browse/SPARK-18657 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-18274) Memory leak in PySpark StringIndexer

2016-11-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18274: -- Target Version/s: 2.0.3, 2.1.1, 2.2.0 (was: 2.0.3, 2.1.0) > Memory leak in PySpark Str

[jira] [Updated] (SPARK-18563) mapWithState: initialState should have a timeout setting per record

2016-11-30 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18563: - Component/s: (was: Structured Streaming) DStreams > mapWithState: initialSta

[jira] [Updated] (SPARK-18588) KafkaSourceStressForDontFailOnDataLossSuite is flaky

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18588: - Target Version/s: 2.1.0 > KafkaSourceStressForDontFailOnDataLossSuite is flaky >

[jira] [Resolved] (SPARK-16545) Structured Streaming : foreachSink creates the Physical Plan multiple times per TriggerInterval

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-16545. -- Resolution: Later > Structured Streaming : foreachSink creates the Physical Plan multip

[jira] [Updated] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18655: - Fix Version/s: (was: 2.1.0) > Ignore Structured Streaming 2.0.2 logs in history serve

[jira] [Updated] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18655: - Target Version/s: 2.1.0 > Ignore Structured Streaming 2.0.2 logs in history server >

[jira] [Created] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Sina Sohangir (JIRA)
Sina Sohangir created SPARK-18656: - Summary: org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns Key: SPARK-18656 URL: https://issues.apache.

[jira] [Commented] (SPARK-18536) Failed to save to hive table when case class with empty field

2016-11-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709551#comment-15709551 ] Reynold Xin commented on SPARK-18536: - We need to add a PreWriteCheck for Parquet.

[jira] [Updated] (SPARK-18536) Failed to save to hive table when case class with empty field

2016-11-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18536: Description: {code}import scala.collection.mutable.Queue import org.apache.spark.SparkConf import

[jira] [Commented] (SPARK-18653) Dataset.show() generates incorrect padding for Unicode Character

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709468#comment-15709468 ] Apache Spark commented on SPARK-18653: -- User 'kiszk' has created a pull request for

[jira] [Assigned] (SPARK-18653) Dataset.show() generates incorrect padding for Unicode Character

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18653: Assignee: Apache Spark > Dataset.show() generates incorrect padding for Unicode Character

[jira] [Assigned] (SPARK-18653) Dataset.show() generates incorrect padding for Unicode Character

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18653: Assignee: (was: Apache Spark) > Dataset.show() generates incorrect padding for Unicode

[jira] [Assigned] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18655: Assignee: Apache Spark (was: Shixiong Zhu) > Ignore Structured Streaming 2.0.2 logs in hi

[jira] [Assigned] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18655: Assignee: Shixiong Zhu (was: Apache Spark) > Ignore Structured Streaming 2.0.2 logs in hi
