[jira] [Commented] (SPARK-10590) Spark with YARN build is broken

2016-10-12 Thread Nirman Narang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570963#comment-15570963 ] Nirman Narang commented on SPARK-10590: --- [~rxin], Should this ticket be reopened? > Spark with

[jira] [Updated] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2016-10-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17867: Assignee: Liang-Chi Hsieh > Dataset.dropDuplicates (i.e. distinct) should consider the columns

[jira] [Resolved] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2016-10-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17867. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15427

[jira] [Updated] (SPARK-17866) Dataset.dropDuplicates (i.e., distinct) should not change the output of child plan

2016-10-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17866: Assignee: Liang-Chi Hsieh > Dataset.dropDuplicates (i.e., distinct) should not change the output

[jira] [Resolved] (SPARK-17866) Dataset.dropDuplicates (i.e., distinct) should not change the output of child plan

2016-10-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17866. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15427

[jira] [Created] (SPARK-17902) collect() ignores stringsAsFactors

2016-10-12 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17902: -- Summary: collect() ignores stringsAsFactors Key: SPARK-17902 URL: https://issues.apache.org/jira/browse/SPARK-17902 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-17901) NettyRpcEndpointRef: Error sending message and Caused by: java.util.ConcurrentModificationException

2016-10-12 Thread Harish (JIRA)
Harish created SPARK-17901: -- Summary: NettyRpcEndpointRef: Error sending message and Caused by: java.util.ConcurrentModificationException Key: SPARK-17901 URL: https://issues.apache.org/jira/browse/SPARK-17901

[jira] [Comment Edited] (SPARK-16599) java.util.NoSuchElementException: None.get at at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)

2016-10-12 Thread Shivansh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570826#comment-15570826 ] Shivansh edited comment on SPARK-16599 at 10/13/16 4:53 AM: [~srowen],

[jira] [Assigned] (SPARK-17899) add a debug mode to keep raw table properties in HiveExternalCatalog

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17899: Assignee: Wenchen Fan (was: Apache Spark) > add a debug mode to keep raw table

[jira] [Assigned] (SPARK-17899) add a debug mode to keep raw table properties in HiveExternalCatalog

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17899: Assignee: Apache Spark (was: Wenchen Fan) > add a debug mode to keep raw table

[jira] [Commented] (SPARK-17899) add a debug mode to keep raw table properties in HiveExternalCatalog

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570837#comment-15570837 ] Apache Spark commented on SPARK-17899: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Commented] (SPARK-16599) java.util.NoSuchElementException: None.get at at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)

2016-10-12 Thread Shivansh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570826#comment-15570826 ] Shivansh commented on SPARK-16599: -- [~srowen], [~joshrosen]: Any updates on this issue ? We are also

[jira] [Resolved] (SPARK-17876) Write StructuredStreaming WAL to a stream instead of materializing all at once

2016-10-12 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-17876. -- Resolution: Fixed Assignee: Burak Yavuz Fix Version/s: 2.1.0

[jira] [Updated] (SPARK-17900) Mark the following Spark SQL APIs as stable

2016-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17900: Description: Mark the following stable: Dataset/DataFrame - functions, since 1.3 - ColumnName,

[jira] [Created] (SPARK-17899) add a debug mode to keep raw table properties in HiveExternalCatalog

2016-10-12 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-17899: --- Summary: add a debug mode to keep raw table properties in HiveExternalCatalog Key: SPARK-17899 URL: https://issues.apache.org/jira/browse/SPARK-17899 Project: Spark

[jira] [Created] (SPARK-17900) Mark the following Spark SQL APIs as stable

2016-10-12 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-17900: --- Summary: Mark the following Spark SQL APIs as stable Key: SPARK-17900 URL: https://issues.apache.org/jira/browse/SPARK-17900 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17830) Annotate Spark SQL public APIs with InterfaceStability

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570726#comment-15570726 ] Apache Spark commented on SPARK-17830: -- User 'rxin' has created a pull request for this issue:

[jira] [Updated] (SPARK-16827) Stop reporting spill metrics as shuffle metrics

2016-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16827: Fix Version/s: 2.0.2 > Stop reporting spill metrics as shuffle metrics >

[jira] [Commented] (SPARK-16827) Stop reporting spill metrics as shuffle metrics

2016-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570716#comment-15570716 ] Reynold Xin commented on SPARK-16827: - See https://github.com/apache/spark/pull/15455/files The

[jira] [Created] (SPARK-17898) --repositories needs username and password

2016-10-12 Thread lichenglin (JIRA)
lichenglin created SPARK-17898: -- Summary: --repositories needs username and password Key: SPARK-17898 URL: https://issues.apache.org/jira/browse/SPARK-17898 Project: Spark Issue Type: Wish

[jira] [Resolved] (SPARK-17835) Optimize NaiveBayes mllib wrapper to eliminate extra pass on data

2016-10-12 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-17835. - Resolution: Fixed Assignee: Yanbo Liang Fix Version/s: 2.1.0 > Optimize

[jira] [Resolved] (SPARK-17745) Update Python API for NB to support weighted instances

2016-10-12 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-17745. - Resolution: Fixed Assignee: Weichen Xu Fix Version/s: 2.1.0 > Update Python API

[jira] [Updated] (SPARK-17888) Memory leak in streaming driver when use SparkSQL in Streaming

2016-10-12 Thread weilin.chen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] weilin.chen updated SPARK-17888: Summary: Memory leak in streaming driver when use SparkSQL in Streaming (was: Mseory leak in

[jira] [Created] (SPARK-17897) not isnotnull is converted to the always false condition isnotnull && not isnotnull

2016-10-12 Thread Jordan Halterman (JIRA)
Jordan Halterman created SPARK-17897: Summary: not isnotnull is converted to the always false condition isnotnull && not isnotnull Key: SPARK-17897 URL: https://issues.apache.org/jira/browse/SPARK-17897

[jira] [Assigned] (SPARK-17686) Propose to print Scala version in "spark-submit --version" command

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17686: Assignee: Apache Spark > Propose to print Scala version in "spark-submit --version"

[jira] [Assigned] (SPARK-17686) Propose to print Scala version in "spark-submit --version" command

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17686: Assignee: (was: Apache Spark) > Propose to print Scala version in "spark-submit

[jira] [Commented] (SPARK-17686) Propose to print Scala version in "spark-submit --version" command

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570572#comment-15570572 ] Apache Spark commented on SPARK-17686: -- User 'jerryshao' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-16827) Stop reporting spill metrics as shuffle metrics

2016-10-12 Thread Gaoxiang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570446#comment-15570446 ] Gaoxiang Liu edited comment on SPARK-16827 at 10/13/16 1:16 AM: [~rxin],

[jira] [Comment Edited] (SPARK-16827) Stop reporting spill metrics as shuffle metrics

2016-10-12 Thread Gaoxiang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570446#comment-15570446 ] Gaoxiang Liu edited comment on SPARK-16827 at 10/13/16 1:14 AM: [~rxin],

[jira] [Commented] (SPARK-16827) Stop reporting spill metrics as shuffle metrics

2016-10-12 Thread Gaoxiang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570446#comment-15570446 ] Gaoxiang Liu commented on SPARK-16827: -- [~rxin], for this one, if I want to add spill metrics, do

[jira] [Comment Edited] (SPARK-17074) generate histogram information for column

2016-10-12 Thread Zhenhua Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570366#comment-15570366 ] Zhenhua Wang edited comment on SPARK-17074 at 10/13/16 12:55 AM: - Well,

[jira] [Comment Edited] (SPARK-17074) generate histogram information for column

2016-10-12 Thread Zhenhua Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570366#comment-15570366 ] Zhenhua Wang edited comment on SPARK-17074 at 10/13/16 12:29 AM: - Well,

[jira] [Commented] (SPARK-17074) generate histogram information for column

2016-10-12 Thread Zhenhua Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570366#comment-15570366 ] Zhenhua Wang commented on SPARK-17074: -- Well, I've got stuck here for a few days. I went through the

[jira] [Updated] (SPARK-17845) Improve window function frame boundary API in DataFrame

2016-10-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17845: --- Fix Version/s: 2.1.0 > Improve window function frame boundary API in DataFrame >

[jira] [Resolved] (SPARK-17845) Improve window function frame boundary API in DataFrame

2016-10-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17845. Resolution: Fixed > Improve window function frame boundary API in DataFrame >

[jira] [Closed] (SPARK-15408) Spark streaming app crashes with NotLeaderForPartitionException

2016-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger closed SPARK-15408. -- Resolution: Cannot Reproduce > Spark streaming app crashes with NotLeaderForPartitionException

[jira] [Commented] (SPARK-15272) DirectKafkaInputDStream doesn't work with window operation

2016-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570221#comment-15570221 ] Cody Koeninger commented on SPARK-15272: Checking to see if the 0.10 consumer's handling of

[jira] [Comment Edited] (SPARK-15272) DirectKafkaInputDStream doesn't work with window operation

2016-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570221#comment-15570221 ] Cody Koeninger edited comment on SPARK-15272 at 10/12/16 11:33 PM: ---

[jira] [Commented] (SPARK-11698) Add option to ignore kafka messages that are out of limit rate

2016-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570208#comment-15570208 ] Cody Koeninger commented on SPARK-11698: Would a custom ConsumerStrategy for the new consumer

[jira] [Updated] (SPARK-17850) HadoopRDD should not swallow EOFException

2016-10-12 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17850: - Fix Version/s: 2.1.0, 2.0.2 > HadoopRDD should not swallow EOFException >

[jira] [Resolved] (SPARK-10320) Kafka Support new topic subscriptions without requiring restart of the streaming context

2016-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-10320. Resolution: Fixed Fix Version/s: 2.0.0 SPARK-12177 added the new consumer, which

[jira] [Closed] (SPARK-9947) Separate Metadata and State Checkpoint Data

2016-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger closed SPARK-9947. - Resolution: Won't Fix The direct DStream api already gives access to offsets, and it seems clear

[jira] [Commented] (SPARK-16827) Stop reporting spill metrics as shuffle metrics

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570193#comment-15570193 ] Apache Spark commented on SPARK-16827: -- User 'dafrista' has created a pull request for this issue:

[jira] [Commented] (SPARK-14516) Clustering evaluator

2016-10-12 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570185#comment-15570185 ] Saikat Kanjilal commented on SPARK-14516: - Hello, New to spark and am interested in helping with

[jira] [Commented] (SPARK-8337) KafkaUtils.createDirectStream for python is lacking API/feature parity with the Scala/Java version

2016-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570186#comment-15570186 ] Cody Koeninger commented on SPARK-8337: --- Can this be closed, given that the subtasks are resolved

[jira] [Closed] (SPARK-5505) ConsumerRebalanceFailedException from Kafka consumer

2016-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger closed SPARK-5505. - Resolution: Won't Fix The old kafka High Level Consumer has been abandoned at this point.

[jira] [Resolved] (SPARK-5718) Add native offset management for ReliableKafkaReceiver

2016-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-5718. --- Resolution: Fixed Fix Version/s: 2.0.0 SPARK-12177 added support for the native kafka

[jira] [Commented] (SPARK-17850) HadoopRDD should not swallow EOFException

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570154#comment-15570154 ] Apache Spark commented on SPARK-17850: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-12372) Document limitations of MLlib local linear algebra

2016-10-12 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570152#comment-15570152 ] Saikat Kanjilal commented on SPARK-12372: - @josephkb, new to contributing to spark, is this

[jira] [Resolved] (SPARK-16827) Stop reporting spill metrics as shuffle metrics

2016-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-16827. - Resolution: Fixed Assignee: Brian Cho Fix Version/s: 2.1.0 > Stop reporting

[jira] [Commented] (SPARK-9487) Use the same num. worker threads in Scala/Python unit tests

2016-10-12 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570122#comment-15570122 ] Saikat Kanjilal commented on SPARK-9487: Hello All, Can I help with this in anyway? Thanks > Use

[jira] [Commented] (SPARK-10815) API design: data sources and sinks

2016-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570139#comment-15570139 ] Cody Koeninger commented on SPARK-10815: Another unfortunate thing about the Sink api is that it

[jira] [Comment Edited] (SPARK-14212) Add configuration element for --packages option

2016-10-12 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570125#comment-15570125 ] Saikat Kanjilal edited comment on SPARK-14212 at 10/12/16 10:53 PM:

[jira] [Commented] (SPARK-14212) Add configuration element for --packages option

2016-10-12 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570125#comment-15570125 ] Saikat Kanjilal commented on SPARK-14212: - heldenk@ can I help out with this? > Add

[jira] [Comment Edited] (SPARK-14212) Add configuration element for --packages option

2016-10-12 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570125#comment-15570125 ] Saikat Kanjilal edited comment on SPARK-14212 at 10/12/16 10:54 PM:

[jira] [Created] (SPARK-17896) Dataset groupByKey + reduceGroups fails with codegen-related exception

2016-10-12 Thread Adam Breindel (JIRA)
Adam Breindel created SPARK-17896: - Summary: Dataset groupByKey + reduceGroups fails with codegen-related exception Key: SPARK-17896 URL: https://issues.apache.org/jira/browse/SPARK-17896 Project:

[jira] [Resolved] (SPARK-17782) Kafka 010 test is flaky

2016-10-12 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-17782. -- Resolution: Fixed Assignee: Cody Koeninger Resolved by

[jira] [Updated] (SPARK-17782) Kafka 010 test is flaky

2016-10-12 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17782: - Fix Version/s: 2.1.0, 2.0.2 > Kafka 010 test is flaky >

[jira] [Updated] (SPARK-17782) Kafka 010 test is flaky

2016-10-12 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17782: - Affects Version/s: 2.0.0, 2.0.1 > Kafka 010 test is flaky >

[jira] [Commented] (SPARK-17892) Query in CTAS is Optimized Twice (branch-2.0)

2016-10-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570042#comment-15570042 ] Xiao Li commented on SPARK-17892: - Will do it! : ) > Query in CTAS is Optimized Twice (branch-2.0) >

[jira] [Updated] (SPARK-17894) Uniqueness of TaskSetManager name

2016-10-12 Thread Eren Avsarogullari (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eren Avsarogullari updated SPARK-17894: --- Description: TaskSetManager should have unique name to avoid adding duplicate ones

[jira] [Created] (SPARK-17895) Improve documentation of "rowsBetween" and "rangeBetween"

2016-10-12 Thread Weiluo Ren (JIRA)
Weiluo Ren created SPARK-17895: -- Summary: Improve documentation of "rowsBetween" and "rangeBetween" Key: SPARK-17895 URL: https://issues.apache.org/jira/browse/SPARK-17895 Project: Spark Issue

[jira] [Created] (SPARK-17894) Uniqueness of TaskSetManager name

2016-10-12 Thread Eren Avsarogullari (JIRA)
Eren Avsarogullari created SPARK-17894: -- Summary: Uniqueness of TaskSetManager name Key: SPARK-17894 URL: https://issues.apache.org/jira/browse/SPARK-17894 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-17675) Add Blacklisting of Executors & Nodes within one TaskSet

2016-10-12 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-17675. -- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15249

[jira] [Comment Edited] (SPARK-12787) Dataset to support custom encoder

2016-10-12 Thread Aleksander Eskilson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569928#comment-15569928 ] Aleksander Eskilson edited comment on SPARK-12787 at 10/12/16 9:37 PM:

[jira] [Commented] (SPARK-12787) Dataset to support custom encoder

2016-10-12 Thread Aleksander Eskilson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569928#comment-15569928 ] Aleksander Eskilson commented on SPARK-12787: - [~Zariel], I've put together an implementation

[jira] [Updated] (SPARK-16827) Stop reporting spill metrics as shuffle metrics

2016-10-12 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sital Kedia updated SPARK-16827: Summary: Stop reporting spill metrics as shuffle metrics (was: Query with Join produces excessive

[jira] [Updated] (SPARK-17834) Fetch the earliest offsets manually in KafkaSource instead of counting on KafkaConsumer

2016-10-12 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17834: - Target Version/s: 2.0.2, 2.1.0 > Fetch the earliest offsets manually in KafkaSource instead of

[jira] [Assigned] (SPARK-17770) Make ObjectType SQL Type Public

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17770: Assignee: (was: Apache Spark) > Make ObjectType SQL Type Public >

[jira] [Assigned] (SPARK-17770) Make ObjectType SQL Type Public

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17770: Assignee: Apache Spark > Make ObjectType SQL Type Public >

[jira] [Commented] (SPARK-17770) Make ObjectType SQL Type Public

2016-10-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569840#comment-15569840 ] Apache Spark commented on SPARK-17770: -- User 'bdrillard' has created a pull request for this issue:

[jira] [Updated] (SPARK-17770) Make ObjectType SQL Type Public

2016-10-12 Thread Aleksander Eskilson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksander Eskilson updated SPARK-17770: Affects Version/s: 2.0.0 > Make ObjectType SQL Type Public >

[jira] [Updated] (SPARK-17770) Make ObjectType SQL Type Public

2016-10-12 Thread Aleksander Eskilson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksander Eskilson updated SPARK-17770: Target Version/s: (was: 2.0.2) > Make ObjectType SQL Type Public >

[jira] [Updated] (SPARK-17770) Make ObjectType SQL Type Public

2016-10-12 Thread Aleksander Eskilson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksander Eskilson updated SPARK-17770: Priority: Major (was: Critical) > Make ObjectType SQL Type Public >

[jira] [Updated] (SPARK-17770) Make ObjectType SQL Type Public

2016-10-12 Thread Aleksander Eskilson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksander Eskilson updated SPARK-17770: Priority: Critical (was: Minor) > Make ObjectType SQL Type Public >

[jira] [Commented] (SPARK-12664) Expose raw prediction scores in MultilayerPerceptronClassificationModel

2016-10-12 Thread Guo-Xun Yuan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569792#comment-15569792 ] Guo-Xun Yuan commented on SPARK-12664: -- I would also vote this as an important features. Also, Is

[jira] [Comment Edited] (SPARK-12664) Expose raw prediction scores in MultilayerPerceptronClassificationModel

2016-10-12 Thread Guo-Xun Yuan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569792#comment-15569792 ] Guo-Xun Yuan edited comment on SPARK-12664 at 10/12/16 8:31 PM: I would

[jira] [Created] (SPARK-17893) Window functions should also allow looking back in time

2016-10-12 Thread Raviteja Lokineni (JIRA)
Raviteja Lokineni created SPARK-17893: - Summary: Window functions should also allow looking back in time Key: SPARK-17893 URL: https://issues.apache.org/jira/browse/SPARK-17893 Project: Spark

[jira] [Updated] (SPARK-17892) Query in CTAS is Optimized Twice (branch-2.0)

2016-10-12 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-17892: - Assignee: Xiao Li > Query in CTAS is Optimized Twice (branch-2.0) >

[jira] [Commented] (SPARK-17892) Query in CTAS is Optimized Twice (branch-2.0)

2016-10-12 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569587#comment-15569587 ] Yin Huai commented on SPARK-17892: -- cc [~smilegator] > Query in CTAS is Optimized Twice (branch-2.0) >

[jira] [Created] (SPARK-17892) Query in CTAS is Optimized Twice (branch-2.0)

2016-10-12 Thread Yin Huai (JIRA)
Yin Huai created SPARK-17892: Summary: Query in CTAS is Optimized Twice (branch-2.0) Key: SPARK-17892 URL: https://issues.apache.org/jira/browse/SPARK-17892 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-17863) SELECT distinct does not work if there is a order by clause

2016-10-12 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-17863: - Priority: Blocker (was: Critical) > SELECT distinct does not work if there is a order by clause >

[jira] [Updated] (SPARK-17891) SQL-based three column join loses first column

2016-10-12 Thread Eli Miller (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Miller updated SPARK-17891: --- Attachment: test.tgz A simple maven project that contains TripleJoin.java and sample data files

[jira] [Created] (SPARK-17891) SQL-based three column join loses first column

2016-10-12 Thread Eli Miller (JIRA)
Eli Miller created SPARK-17891: -- Summary: SQL-based three column join loses first column Key: SPARK-17891 URL: https://issues.apache.org/jira/browse/SPARK-17891 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-12 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569493#comment-15569493 ] Kazuaki Ishizaki commented on SPARK-16845: -- Thank you for preparing the case. I noticed that the

[jira] [Issue Comment Deleted] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-12 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-16845: - Comment: was deleted (was: Thank you for preparing the case. I noticed that the

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-12 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569489#comment-15569489 ] Kazuaki Ishizaki commented on SPARK-16845: -- Thank you for preparing the case. I noticed that the

[jira] [Resolved] (SPARK-17840) Add some pointers for wiki/CONTRIBUTING.md in README.md and some warnings in PULL_REQUEST_TEMPLATE

2016-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17840. - Resolution: Fixed Assignee: Sean Owen Fix Version/s: 2.1.0 > Add some pointers

[jira] [Resolved] (SPARK-17790) Support for parallelizing R data.frame larger than 2GB

2016-10-12 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-17790. -- Resolution: Fixed Assignee: Hossein Falaki Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-17814) spark submit arguments are truncated in yarn-cluster mode

2016-10-12 Thread shreyas subramanya (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569349#comment-15569349 ] shreyas subramanya commented on SPARK-17814: yarn version is 2.6.4 > spark submit arguments

[jira] [Commented] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-12 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569325#comment-15569325 ] Felix Cheung commented on SPARK-17781: -- Thanks for the investigation. This might seem like a R

[jira] [Resolved] (SPARK-17884) In the cast expression, casting from empty string to interval type throws NullPointerException

2016-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17884. - Resolution: Fixed Assignee: Priyanka Garg Fix Version/s: 2.1.0

[jira] [Resolved] (SPARK-14761) PySpark DataFrame.join should reject invalid join methods even when join columns are not specified

2016-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-14761. - Resolution: Fixed Assignee: Bijay Kumar Pathak Fix Version/s: 2.1.0 > PySpark

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-12 Thread K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569265#comment-15569265 ] K commented on SPARK-16845: --- Code and data are also here as well.

[jira] [Updated] (SPARK-17890) scala.ScalaReflectionException

2016-10-12 Thread Khalid Reid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Khalid Reid updated SPARK-17890: Description: Hello, I am seeing an error message in spark-shell when I map a DataFrame to a

[jira] [Updated] (SPARK-17890) scala.ScalaReflectionException

2016-10-12 Thread Khalid Reid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Khalid Reid updated SPARK-17890: Description: Hello, I am seeing a very cryptic error message in spark-shell when I map a

[jira] [Comment Edited] (SPARK-17883) Possible typo in comments of Row.scala

2016-10-12 Thread Weiluo Ren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569203#comment-15569203 ] Weiluo Ren edited comment on SPARK-17883 at 10/12/16 5:01 PM: -- Sure, I will

[jira] [Created] (SPARK-17890) scala.ScalaReflectionException

2016-10-12 Thread Khalid Reid (JIRA)
Khalid Reid created SPARK-17890: --- Summary: scala.ScalaReflectionException Key: SPARK-17890 URL: https://issues.apache.org/jira/browse/SPARK-17890 Project: Spark Issue Type: Bug Affects

[jira] [Commented] (SPARK-17883) Possible typo in comments of Row.scala

2016-10-12 Thread Weiluo Ren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569203#comment-15569203 ] Weiluo Ren commented on SPARK-17883: Sure, I will create a JIRA for this typo and others related (if

[jira] [Commented] (SPARK-17827) StatisticsColumnSuite failures on big endian platforms

2016-10-12 Thread Pete Robbins (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569048#comment-15569048 ] Pete Robbins commented on SPARK-17827: -- So this looks like the max field is being written as an Int
