[jira] [Commented] (SPARK-26059) Spark standalone mode, does not correctly record a failed Spark Job.

2018-11-28 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702809#comment-16702809 ] Prashant Sharma commented on SPARK-26059: - This will be a won't fix, as to fix it, 1) One

[jira] [Comment Edited] (SPARK-26059) Spark standalone mode, does not correctly record a failed Spark Job.

2018-11-28 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702809#comment-16702809 ] Prashant Sharma edited comment on SPARK-26059 at 11/29/18 7:33 AM: ---

[jira] [Resolved] (SPARK-26059) Spark standalone mode, does not correctly record a failed Spark Job.

2018-11-28 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma resolved SPARK-26059. - Resolution: Won't Fix > Spark standalone mode, does not correctly record a failed Spark

[jira] [Assigned] (SPARK-26211) Fix InSet for binary, and struct and array with null.

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26211: Assignee: (was: Apache Spark) > Fix InSet for binary, and struct and array with

[jira] [Assigned] (SPARK-26211) Fix InSet for binary, and struct and array with null.

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26211: Assignee: Apache Spark > Fix InSet for binary, and struct and array with null. >

[jira] [Commented] (SPARK-26211) Fix InSet for binary, and struct and array with null.

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702801#comment-16702801 ] Apache Spark commented on SPARK-26211: -- User 'ueshin' has created a pull request for this issue:

[jira] [Created] (SPARK-26211) Fix InSet for binary, and struct and array with null.

2018-11-28 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-26211: - Summary: Fix InSet for binary, and struct and array with null. Key: SPARK-26211 URL: https://issues.apache.org/jira/browse/SPARK-26211 Project: Spark

[jira] [Updated] (SPARK-26206) Spark structured streaming with kafka integration fails in update mode

2018-11-28 Thread indraneel r (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] indraneel r updated SPARK-26206: Description: Spark structured streaming with kafka integration fails in update mode with

[jira] [Updated] (SPARK-26206) Spark structured streaming with kafka integration fails in update mode

2018-11-28 Thread indraneel r (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] indraneel r updated SPARK-26206: Description: Spark structured streaming with kafka integration fails in update mode with

[jira] [Updated] (SPARK-26206) Spark structured streaming with kafka integration fails in update mode

2018-11-28 Thread indraneel r (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] indraneel r updated SPARK-26206: Description: Spark structured streaming with kafka integration fails in update mode with

[jira] [Commented] (SPARK-26206) Spark structured streaming with kafka integration fails in update mode

2018-11-28 Thread indraneel r (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702757#comment-16702757 ] indraneel r commented on SPARK-26206: - [~kabhwan]  Have added the details in the description. >

[jira] [Created] (SPARK-26210) Streaming of syslog

2018-11-28 Thread Manoj (JIRA)
Manoj created SPARK-26210: - Summary: Streaming of syslog Key: SPARK-26210 URL: https://issues.apache.org/jira/browse/SPARK-26210 Project: Spark Issue Type: Question Components: Spark

[jira] [Commented] (SPARK-23410) Unable to read jsons in charset different from UTF-8

2018-11-28 Thread xuqianjin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702714#comment-16702714 ] xuqianjin commented on SPARK-23410: --- [~maxgekk] Thank you very much. I'll get started on this as soon

[jira] [Commented] (SPARK-26142) Implement shuffle read metrics in SQL

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702687#comment-16702687 ] Apache Spark commented on SPARK-26142: -- User 'xuanyuanking' has created a pull request for this

[jira] [Assigned] (SPARK-26133) Remove deprecated OneHotEncoder and rename OneHotEncoderEstimator to OneHotEncoder

2018-11-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26133: - Assignee: Liang-Chi Hsieh > Remove deprecated OneHotEncoder and rename OneHotEncoderEstimator

[jira] [Resolved] (SPARK-26133) Remove deprecated OneHotEncoder and rename OneHotEncoderEstimator to OneHotEncoder

2018-11-28 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai resolved SPARK-26133. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23100

[jira] [Updated] (SPARK-26207) add PowerIterationClustering (PIC) doc in 2.4 branch

2018-11-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26207: -- Priority: Minor (was: Major) > add PowerIterationClustering (PIC) doc in 2.4 branch >

[jira] [Commented] (SPARK-24498) Add JDK compiler for runtime codegen

2018-11-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702515#comment-16702515 ] Reynold Xin commented on SPARK-24498: - Why don't we close the ticket? I heard we would get mixed

[jira] [Updated] (SPARK-26188) Spark 2.4.0 Partitioning behavior breaks backwards compatibility

2018-11-28 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-26188: Target Version/s: 2.4.1 > Spark 2.4.0 Partitioning behavior breaks backwards compatibility >

[jira] [Updated] (SPARK-26188) Spark 2.4.0 Partitioning behavior breaks backwards compatibility

2018-11-28 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-26188: Component/s: (was: Spark Core) SQL > Spark 2.4.0 Partitioning behavior breaks

[jira] [Updated] (SPARK-26188) Spark 2.4.0 Partitioning behavior breaks backwards compatibility

2018-11-28 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-26188: Priority: Critical (was: Minor) > Spark 2.4.0 Partitioning behavior breaks backwards compatibility >

[jira] [Commented] (SPARK-26206) Spark structured streaming with kafka integration fails in update mode

2018-11-28 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702465#comment-16702465 ] Jungtaek Lim commented on SPARK-26206: -- [~indraneelrr] Could you also provide how UserEvent is

[jira] [Created] (SPARK-26209) Allow for dataframe bucketization without Hive

2018-11-28 Thread Walt Elder (JIRA)
Walt Elder created SPARK-26209: -- Summary: Allow for dataframe bucketization without Hive Key: SPARK-26209 URL: https://issues.apache.org/jira/browse/SPARK-26209 Project: Spark Issue Type:

[jira] [Commented] (SPARK-26194) Support automatic spark.authenticate secret in Kubernetes backend

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702397#comment-16702397 ] Apache Spark commented on SPARK-26194: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-26194) Support automatic spark.authenticate secret in Kubernetes backend

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26194: Assignee: (was: Apache Spark) > Support automatic spark.authenticate secret in

[jira] [Commented] (SPARK-26194) Support automatic spark.authenticate secret in Kubernetes backend

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702395#comment-16702395 ] Apache Spark commented on SPARK-26194: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-26194) Support automatic spark.authenticate secret in Kubernetes backend

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26194: Assignee: Apache Spark > Support automatic spark.authenticate secret in Kubernetes

[jira] [Commented] (SPARK-23904) Big execution plan cause OOM

2018-11-28 Thread Dave DeCaprio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702363#comment-16702363 ] Dave DeCaprio commented on SPARK-23904: --- I've created a pull request that will address this.  It

[jira] [Commented] (SPARK-25380) Generated plans occupy over 50% of Spark driver memory

2018-11-28 Thread Dave DeCaprio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702362#comment-16702362 ] Dave DeCaprio commented on SPARK-25380: --- I've created a PR that should address this.  It limits

[jira] [Assigned] (SPARK-26208) Empty dataframe does not roundtrip for csv with header

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26208: Assignee: (was: Apache Spark) > Empty dataframe does not roundtrip for csv with

[jira] [Commented] (SPARK-26208) Empty dataframe does not roundtrip for csv with header

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702345#comment-16702345 ] Apache Spark commented on SPARK-26208: -- User 'koertkuipers' has created a pull request for this

[jira] [Assigned] (SPARK-26208) Empty dataframe does not roundtrip for csv with header

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26208: Assignee: Apache Spark > Empty dataframe does not roundtrip for csv with header >

[jira] [Commented] (SPARK-25957) Skip building spark-r docker image if spark distribution does not have R support

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702313#comment-16702313 ] Apache Spark commented on SPARK-25957: -- User 'vanzin' has created a pull request for this issue:

[jira] [Commented] (SPARK-26205) Optimize In expression for bytes, shorts, ints

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702295#comment-16702295 ] Apache Spark commented on SPARK-26205: -- User 'aokolnychyi' has created a pull request for this

[jira] [Commented] (SPARK-26205) Optimize In expression for bytes, shorts, ints

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702293#comment-16702293 ] Apache Spark commented on SPARK-26205: -- User 'aokolnychyi' has created a pull request for this

[jira] [Assigned] (SPARK-26205) Optimize In expression for bytes, shorts, ints

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26205: Assignee: Apache Spark > Optimize In expression for bytes, shorts, ints >

[jira] [Assigned] (SPARK-26205) Optimize In expression for bytes, shorts, ints

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26205: Assignee: (was: Apache Spark) > Optimize In expression for bytes, shorts, ints >

[jira] [Commented] (SPARK-14948) Exception when joining DataFrames derived form the same DataFrame

2018-11-28 Thread mayur (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702225#comment-16702225 ] mayur commented on SPARK-14948: --- I am also facing this issue. Any idea of the ETA would be great to know! >

[jira] [Created] (SPARK-26208) Empty dataframe does not roundtrip for csv with header

2018-11-28 Thread koert kuipers (JIRA)
koert kuipers created SPARK-26208: - Summary: Empty dataframe does not roundtrip for csv with header Key: SPARK-26208 URL: https://issues.apache.org/jira/browse/SPARK-26208 Project: Spark

[jira] [Commented] (SPARK-26103) OutOfMemory error with large query plans

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702213#comment-16702213 ] Apache Spark commented on SPARK-26103: -- User 'DaveDeCaprio' has created a pull request for this

[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702228#comment-16702228 ] Apache Spark commented on SPARK-24423: -- User 'wangyum' has created a pull request for this issue:

[jira] [Commented] (SPARK-26103) OutOfMemory error with large query plans

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702219#comment-16702219 ] Apache Spark commented on SPARK-26103: -- User 'DaveDeCaprio' has created a pull request for this

[jira] [Commented] (SPARK-26207) add PowerIterationClustering (PIC) doc in 2.4 branch

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702205#comment-16702205 ] Apache Spark commented on SPARK-26207: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Commented] (SPARK-26207) add PowerIterationClustering (PIC) doc in 2.4 branch

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702204#comment-16702204 ] Apache Spark commented on SPARK-26207: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-26207) add PowerIterationClustering (PIC) doc in 2.4 branch

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26207: Assignee: (was: Apache Spark) > add PowerIterationClustering (PIC) doc in 2.4

[jira] [Assigned] (SPARK-26207) add PowerIterationClustering (PIC) doc in 2.4 branch

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26207: Assignee: Apache Spark > add PowerIterationClustering (PIC) doc in 2.4 branch >

[jira] [Created] (SPARK-26207) add PowerIterationClustering (PIC) doc in 2.4 branch

2018-11-28 Thread Huaxin Gao (JIRA)
Huaxin Gao created SPARK-26207: -- Summary: add PowerIterationClustering (PIC) doc in 2.4 branch Key: SPARK-26207 URL: https://issues.apache.org/jira/browse/SPARK-26207 Project: Spark Issue

[jira] [Updated] (SPARK-26206) Spark structured streaming with kafka integration fails in update mode

2018-11-28 Thread indraneel r (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] indraneel r updated SPARK-26206: Environment: Operating system : MacOS Mojave spark version : 2.4.0 spark-sql-kafka-0-10 : 2.4.0

[jira] [Commented] (SPARK-26024) Dataset API: repartitionByRange(...) has inconsistent behaviour

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702146#comment-16702146 ] Apache Spark commented on SPARK-26024: -- User 'srowen' has created a pull request for this issue:

[jira] [Updated] (SPARK-26206) Spark structured streaming with kafka integration fails in update mode

2018-11-28 Thread indraneel r (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] indraneel r updated SPARK-26206: Environment: Operating system : MacOS Mojave spark version : 2.4.0 spark-streaming-kafka-0-10 :

[jira] [Commented] (SPARK-26201) python broadcast.value on driver fails with disk encryption enabled

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702118#comment-16702118 ] Apache Spark commented on SPARK-26201: -- User 'redsanket' has created a pull request for this issue:

[jira] [Created] (SPARK-26205) Optimize In expression for bytes, shorts, ints

2018-11-28 Thread Anton Okolnychyi (JIRA)
Anton Okolnychyi created SPARK-26205: Summary: Optimize In expression for bytes, shorts, ints Key: SPARK-26205 URL: https://issues.apache.org/jira/browse/SPARK-26205 Project: Spark Issue

[jira] [Created] (SPARK-26206) Spark structured streaming with kafka integration fails in update mode

2018-11-28 Thread indraneel r (JIRA)
indraneel r created SPARK-26206: --- Summary: Spark structured streaming with kafka integration fails in update mode Key: SPARK-26206 URL: https://issues.apache.org/jira/browse/SPARK-26206 Project: Spark

[jira] [Updated] (SPARK-26205) Optimize In expression for bytes, shorts, ints

2018-11-28 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Okolnychyi updated SPARK-26205: - Description: Currently, {{In}} expressions are compiled into a sequence of if-else

[jira] [Assigned] (SPARK-26201) python broadcast.value on driver fails with disk encryption enabled

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26201: Assignee: Apache Spark > python broadcast.value on driver fails with disk encryption

[jira] [Assigned] (SPARK-26201) python broadcast.value on driver fails with disk encryption enabled

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26201: Assignee: (was: Apache Spark) > python broadcast.value on driver fails with disk

[jira] [Updated] (SPARK-26204) Optimize InSet expression

2018-11-28 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Okolnychyi updated SPARK-26204: - Description: The {{InSet}} expression was introduced in SPARK-3711 to avoid O\(n\) time

[jira] [Updated] (SPARK-26204) Optimize InSet expression

2018-11-28 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Okolnychyi updated SPARK-26204: - Attachment: heap size.png > Optimize InSet expression > - > >

[jira] [Updated] (SPARK-26204) Optimize InSet expression

2018-11-28 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Okolnychyi updated SPARK-26204: - Attachment: (was: fastutils.png) > Optimize InSet expression >

[jira] [Updated] (SPARK-26204) Optimize InSet expression

2018-11-28 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Okolnychyi updated SPARK-26204: - Attachment: fastutils.png > Optimize InSet expression > - > >

[jira] [Created] (SPARK-26204) Optimize InSet expression

2018-11-28 Thread Anton Okolnychyi (JIRA)
Anton Okolnychyi created SPARK-26204: Summary: Optimize InSet expression Key: SPARK-26204 URL: https://issues.apache.org/jira/browse/SPARK-26204 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-26203) Benchmark performance of In and InSet expressions

2018-11-28 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Okolnychyi updated SPARK-26203: - Description: {{OptimizeIn}} rule replaces {{In}} with {{InSet}} if the number of

[jira] [Commented] (SPARK-26188) Spark 2.4.0 Partitioning behavior breaks backwards compatibility

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702096#comment-16702096 ] Apache Spark commented on SPARK-26188: -- User 'gengliangwang' has created a pull request for this

[jira] [Assigned] (SPARK-26188) Spark 2.4.0 Partitioning behavior breaks backwards compatibility

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26188: Assignee: Apache Spark > Spark 2.4.0 Partitioning behavior breaks backwards

[jira] [Commented] (SPARK-26188) Spark 2.4.0 Partitioning behavior breaks backwards compatibility

2018-11-28 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702095#comment-16702095 ] Gengliang Wang commented on SPARK-26188: [~ddgirard] Thanks for the investigation. I have created

[jira] [Assigned] (SPARK-26188) Spark 2.4.0 Partitioning behavior breaks backwards compatibility

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26188: Assignee: (was: Apache Spark) > Spark 2.4.0 Partitioning behavior breaks backwards

[jira] [Updated] (SPARK-26203) Benchmark performance of In and InSet expressions

2018-11-28 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Okolnychyi updated SPARK-26203: - Description: {{OptimizeIn}} rule replaces {{In}} with {{InSet}} if the number of

[jira] [Created] (SPARK-26203) Benchmark performance of In and InSet expressions

2018-11-28 Thread Anton Okolnychyi (JIRA)
Anton Okolnychyi created SPARK-26203: Summary: Benchmark performance of In and InSet expressions Key: SPARK-26203 URL: https://issues.apache.org/jira/browse/SPARK-26203 Project: Spark

[jira] [Created] (SPARK-26202) R bucketBy

2018-11-28 Thread Huaxin Gao (JIRA)
Huaxin Gao created SPARK-26202: -- Summary: R bucketBy Key: SPARK-26202 URL: https://issues.apache.org/jira/browse/SPARK-26202 Project: Spark Issue Type: Improvement Components: SparkR

[jira] [Updated] (SPARK-21291) R partitionBy API

2018-11-28 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-21291: --- Summary: R partitionBy API (was: R bucketBy partitionBy API) > R partitionBy API >

[jira] [Updated] (SPARK-25829) remove duplicated map keys with last wins policy

2018-11-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-25829: Summary: remove duplicated map keys with last wins policy (was: Duplicated map keys are not

[jira] [Resolved] (SPARK-25829) Duplicated map keys are not handled consistently

2018-11-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25829. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23124

[jira] [Commented] (SPARK-26201) python broadcast.value on driver fails with disk encryption enabled

2018-11-28 Thread Sanket Reddy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702046#comment-16702046 ] Sanket Reddy commented on SPARK-26201: -- Thanks [~tgraves] will put up the patch shortly > python

[jira] [Resolved] (SPARK-25831) should apply "earlier entry wins" in hive map value converter

2018-11-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25831. - Resolution: Invalid > should apply "earlier entry wins" in hive map value converter >

[jira] [Resolved] (SPARK-25824) Remove duplicated map entries in `showString`

2018-11-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25824. - Resolution: Fixed > Remove duplicated map entries in `showString` >

[jira] [Resolved] (SPARK-25830) should apply "earlier entry wins" in Dataset.collect

2018-11-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25830. - Resolution: Invalid > should apply "earlier entry wins" in Dataset.collect >

[jira] [Assigned] (SPARK-25829) Duplicated map keys are not handled consistently

2018-11-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25829: --- Assignee: Wenchen Fan > Duplicated map keys are not handled consistently >

[jira] [Commented] (SPARK-26201) python broadcast.value on driver fails with disk encryption enabled

2018-11-28 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702028#comment-16702028 ] Thomas Graves commented on SPARK-26201: --- the issue here seems to be that it isn't decrypting the

[jira] [Resolved] (SPARK-25989) OneVsRestModel handle empty outputCols incorrectly

2018-11-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25989. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22991

[jira] [Assigned] (SPARK-25989) OneVsRestModel handle empty outputCols incorrectly

2018-11-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-25989: - Assignee: zhengruifeng > OneVsRestModel handle empty outputCols incorrectly >

[jira] [Created] (SPARK-26201) python broadcast.value on driver fails with disk encryption enabled

2018-11-28 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-26201: - Summary: python broadcast.value on driver fails with disk encryption enabled Key: SPARK-26201 URL: https://issues.apache.org/jira/browse/SPARK-26201 Project: Spark

[jira] [Resolved] (SPARK-25998) TorrentBroadcast holds strong reference to broadcast object

2018-11-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25998. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22995

[jira] [Assigned] (SPARK-25998) TorrentBroadcast holds strong reference to broadcast object

2018-11-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-25998: - Assignee: Brandon Krieger > TorrentBroadcast holds strong reference to broadcast object >

[jira] [Resolved] (SPARK-26137) Linux file separator is hard coded in DependencyUtils used in deploy process

2018-11-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26137. --- Resolution: Fixed Fix Version/s: 2.4.1 3.0.0 2.3.3

[jira] [Assigned] (SPARK-26137) Linux file separator is hard coded in DependencyUtils used in deploy process

2018-11-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26137: - Assignee: Mark Pavey > Linux file separator is hard coded in DependencyUtils used in deploy

[jira] [Updated] (SPARK-26198) Metadata serialize null values throw NPE

2018-11-28 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-26198: Description: How to reproduce this issue: {code} scala> val meta = new

[jira] [Commented] (SPARK-26155) Spark SQL performance degradation after apply SPARK-21052 with Q19 of TPC-DS in 3TB scale

2018-11-28 Thread Yang Jie (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701964#comment-16701964 ] Yang Jie commented on SPARK-26155: -- [~Jk_Self] Can you try to add `-XX:+UseSuperWord` to

[jira] [Created] (SPARK-26200) Column values are incorrectly transposed when a field in a PySpark Row requires serialization

2018-11-28 Thread David Lyness (JIRA)
David Lyness created SPARK-26200: Summary: Column values are incorrectly transposed when a field in a PySpark Row requires serialization Key: SPARK-26200 URL: https://issues.apache.org/jira/browse/SPARK-26200

[jira] [Updated] (SPARK-26200) Column values are incorrectly transposed when a field in a PySpark Row requires serialization

2018-11-28 Thread David Lyness (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Lyness updated SPARK-26200: - Environment: Spark version 2.4.0 Using Scala version 2.11.12, Java HotSpot(TM) 64-Bit Server

[jira] [Resolved] (SPARK-26147) Python UDFs in join condition fail even when using columns from only one side of join

2018-11-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-26147. - Resolution: Fixed Fix Version/s: 2.4.1 3.0.0 Issue resolved by pull

[jira] [Assigned] (SPARK-26147) Python UDFs in join condition fail even when using columns from only one side of join

2018-11-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-26147: --- Assignee: Wenchen Fan > Python UDFs in join condition fail even when using columns from

[jira] [Commented] (SPARK-26198) Metadata serialize null values throw NPE

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701818#comment-16701818 ] Apache Spark commented on SPARK-26198: -- User 'wangyum' has created a pull request for this issue:

[jira] [Created] (SPARK-26199) Long expressions cause mutate to fail

2018-11-28 Thread João Rafael (JIRA)
João Rafael created SPARK-26199: --- Summary: Long expressions cause mutate to fail Key: SPARK-26199 URL: https://issues.apache.org/jira/browse/SPARK-26199 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-26198) Metadata serialize null values throw NPE

2018-11-28 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-26198: Description: How to reproduce this issue: {code:scala} scala> val meta = new

[jira] [Assigned] (SPARK-26198) Metadata serialize null values throw NPE

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26198: Assignee: (was: Apache Spark) > Metadata serialize null values throw NPE >

[jira] [Assigned] (SPARK-26198) Metadata serialize null values throw NPE

2018-11-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26198: Assignee: Apache Spark > Metadata serialize null values throw NPE >

[jira] [Assigned] (SPARK-26114) Memory leak of PartitionedPairBuffer when coalescing after repartitionAndSortWithinPartitions

2018-11-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-26114: --- Assignee: Sergey Zhemzhitsky > Memory leak of PartitionedPairBuffer when coalescing after

[jira] [Resolved] (SPARK-26114) Memory leak of PartitionedPairBuffer when coalescing after repartitionAndSortWithinPartitions

2018-11-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-26114. - Resolution: Fixed Fix Version/s: 2.4.1 3.0.0 Issue resolved by pull

[jira] [Updated] (SPARK-26198) Metadata serialize null values throw NPE

2018-11-28 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-26198: Summary: Metadata serialize null values throw NPE (was: Metadata serialize null value throw NPE)

[jira] [Created] (SPARK-26198) Metadata serialize null value throw NPE

2018-11-28 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-26198: --- Summary: Metadata serialize null value throw NPE Key: SPARK-26198 URL: https://issues.apache.org/jira/browse/SPARK-26198 Project: Spark Issue Type: Bug
