[jira] [Updated] (SPARK-27613) Caching an RDD composed of Row Objects produces some kind of key recombination

2019-04-30 Thread Andres Fernandez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres Fernandez updated SPARK-27613: - Description: (Code included at the bottom) The function "+create_dataframes_from_azure_

[jira] [Commented] (SPARK-22796) Add multiple column support to PySpark QuantileDiscretizer

2019-04-30 Thread Dor Kedem (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830867#comment-16830867 ] Dor Kedem commented on SPARK-22796: --- Thanks for your work. Isn't there a PR for this i

[jira] [Commented] (SPARK-24530) Sphinx doesn't render autodoc_docstring_signature correctly (with Python 2?) and pyspark.ml docs are broken

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830846#comment-16830846 ] Apache Spark commented on SPARK-24530: -- User 'gatorsmile' has created a pull reques

[jira] [Updated] (SPARK-27613) Caching an RDD composed of Row Objects produces some kind of key recombination

2019-04-30 Thread Andres Fernandez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres Fernandez updated SPARK-27613: - Description: (Code included at the bottom) The function "+create_dataframes_from_azure_

[jira] [Commented] (SPARK-27602) SparkSQL CBO can't get true size of partition table after partition pruning

2019-04-30 Thread angerszhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830836#comment-16830836 ] angerszhu commented on SPARK-27602: --- Want to do this need to change the calculate mode

[jira] [Updated] (SPARK-27613) Caching an RDD composed of Row Objects produces some kind of key recombination

2019-04-30 Thread Andres Fernandez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres Fernandez updated SPARK-27613: - Description: (Code included at the bottom) The function "+create_dataframes_from_azure_

[jira] [Updated] (SPARK-27613) Caching an RDD composed of Row Objects produces some kind of key recombination

2019-04-30 Thread Andres Fernandez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres Fernandez updated SPARK-27613: - Description: (Code included at the bottom) The function "+create_dataframes_from_azure_

[jira] [Updated] (SPARK-27613) Caching an RDD composed of Row Objects produces some kind of key recombination

2019-04-30 Thread Andres Fernandez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres Fernandez updated SPARK-27613: - Description: (Code included at the bottom) The function "+create_dataframes_from_azure_

[jira] [Updated] (SPARK-27613) Caching an RDD composed of Row Objects produces some kind of key recombination

2019-04-30 Thread Andres Fernandez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres Fernandez updated SPARK-27613: - Description: (Code included at the bottom) The function "+create_dataframes_from_azure_

[jira] [Updated] (SPARK-27613) Caching an RDD composed of Row Objects produces some kind of key recombination

2019-04-30 Thread Andres Fernandez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres Fernandez updated SPARK-27613: - Component/s: (was: Spark Core) PySpark > Caching an RDD composed of

[jira] [Updated] (SPARK-27613) Caching an RDD composed of Row Objects produces some kind of key recombination

2019-04-30 Thread Andres Fernandez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres Fernandez updated SPARK-27613: - Description: (Code included at the bottom) The function "+create_dataframes_from_azure_

[jira] [Created] (SPARK-27613) Caching an RDD composed of Row Objects produces some kind of key recombination

2019-04-30 Thread Andres Fernandez (JIRA)
Andres Fernandez created SPARK-27613: Summary: Caching an RDD composed of Row Objects produces some kind of key recombination Key: SPARK-27613 URL: https://issues.apache.org/jira/browse/SPARK-27613

[jira] [Updated] (SPARK-24422) Add JDK11 in our Jenkins' build servers

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24422: -- Fix Version/s: 3.0.0 > Add JDK11 in our Jenkins' build servers > -

[jira] [Commented] (SPARK-24422) Add JDK11 in our Jenkins' build servers

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830823#comment-16830823 ] Dongjoon Hyun commented on SPARK-24422: --- Thank you, [~shaneknapp]! :D > Add JDK11

[jira] [Resolved] (SPARK-27608) Upgrade Surefire plugin to 3.0.0-M3

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-27608. --- Resolution: Fixed Assignee: Dongjoon Hyun Fix Version/s: 3.0.0 This is resol

[jira] [Commented] (SPARK-24422) Add JDK11 in our Jenkins' build servers

2019-04-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830772#comment-16830772 ] shane knapp commented on SPARK-24422: - yeah, i think so! On Tue, Apr 30, 2019 at 2:

[jira] [Resolved] (SPARK-24422) Add JDK11 in our Jenkins' build servers

2019-04-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp resolved SPARK-24422. - Resolution: Fixed > Add JDK11 in our Jenkins' build servers > --

[jira] [Commented] (SPARK-18406) Race between end-of-task and completion iterator read lock release

2019-04-30 Thread Xingbo Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830770#comment-16830770 ] Xingbo Jiang commented on SPARK-18406: -- This problem still exists in PythonRunner,

[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830760#comment-16830760 ] Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:44 PM: ---

[jira] [Commented] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830760#comment-16830760 ] Stavros Kontopoulos commented on SPARK-27598: - @[~dongjoon] I am wondering i

[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830758#comment-16830758 ] Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:39 PM: ---

[jira] [Commented] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830758#comment-16830758 ] Stavros Kontopoulos commented on SPARK-27598: - Btw if I do the trick and put

[jira] [Resolved] (SPARK-27591) A bug in UnivocityParser prevents using UDT

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27591. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24496 [https://gi

[jira] [Assigned] (SPARK-27591) A bug in UnivocityParser prevents using UDT

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-27591: Assignee: Artem Kalchenko > A bug in UnivocityParser prevents using UDT > ---

[jira] [Commented] (SPARK-27519) Pandas udf corrupting data

2019-04-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830755#comment-16830755 ] Bryan Cutler commented on SPARK-27519: -- I made SPARK-27612 for the problem with {{R

[jira] [Updated] (SPARK-27612) Creating a DataFrame in PySpark with ArrayType produces some Rows with Arrays of None

2019-04-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-27612: - Description: When creating a DataFrame with type {{ArrayType(IntegerType(), True)}} there ends

[jira] [Created] (SPARK-27612) Creating a DataFrame in PySpark with ArrayType produces some Rows with Arrays of None

2019-04-30 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-27612: Summary: Creating a DataFrame in PySpark with ArrayType produces some Rows with Arrays of None Key: SPARK-27612 URL: https://issues.apache.org/jira/browse/SPARK-27612

[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830740#comment-16830740 ] Dongjoon Hyun edited comment on SPARK-27598 at 4/30/19 11:24 PM: -

[jira] [Commented] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830752#comment-16830752 ] Dongjoon Hyun commented on SPARK-27598: --- Thanks! > DStreams checkpointing does no

[jira] [Updated] (SPARK-27611) Redundant javax.activation dependencies in the Maven build

2019-04-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27611: -- Priority: Minor (was: Major) Agree, exclude the jakarta one. > Redundant javax.activation dependenci

[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830745#comment-16830745 ] Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:21 PM: ---

[jira] [Commented] (SPARK-27593) CSV Parser returns 2 DataFrame - Valid and Malformed DFs

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830747#comment-16830747 ] Hyukjin Kwon commented on SPARK-27593: -- Malformed column is just optional additiona

[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830745#comment-16830745 ] Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:16 PM: ---

[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830745#comment-16830745 ] Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:15 PM: ---

[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830745#comment-16830745 ] Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:14 PM: ---

[jira] [Comment Edited] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830745#comment-16830745 ] Stavros Kontopoulos edited comment on SPARK-27598 at 4/30/19 11:13 PM: ---

[jira] [Updated] (SPARK-27598) DStreams checkpointing does not work with the Spark Shell

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-27598: Summary: DStreams checkpointing does not work with the Spark Shell (was: DStreams

[jira] [Commented] (SPARK-27598) DStreams checkpointing does not work with Scala 2.12

2019-04-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830745#comment-16830745 ] Stavros Kontopoulos commented on SPARK-27598: - I will remove the language v

[jira] [Created] (SPARK-27611) Redundant javax.activation dependencies in the Maven build

2019-04-30 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-27611: -- Summary: Redundant javax.activation dependencies in the Maven build Key: SPARK-27611 URL: https://issues.apache.org/jira/browse/SPARK-27611 Project: Spark Issue

[jira] [Resolved] (SPARK-27519) Pandas udf corrupting data

2019-04-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-27519. -- Resolution: Fixed Fix Version/s: 3.0.0 Problem does not happen when running the latest

[jira] [Comment Edited] (SPARK-27519) Pandas udf corrupting data

2019-04-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830743#comment-16830743 ] Bryan Cutler edited comment on SPARK-27519 at 4/30/19 10:49 PM: --

[jira] [Updated] (SPARK-27519) Pandas udf corrupting data

2019-04-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-27519: - Affects Version/s: (was: 3.0.0) > Pandas udf corrupting data > -- >

[jira] [Commented] (SPARK-27519) Pandas udf corrupting data

2019-04-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830742#comment-16830742 ] Bryan Cutler commented on SPARK-27519: -- Thanks for the script [~f7faf8ba36], I was

[jira] [Commented] (SPARK-27598) DStreams checkpointing does not work with Scala 2.12

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830740#comment-16830740 ] Dongjoon Hyun commented on SPARK-27598: --- If this fails with Spark 2.3 with Scala 2

[jira] [Commented] (SPARK-27598) DStreams checkpointing does not work with Scala 2.12

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830738#comment-16830738 ] Dongjoon Hyun commented on SPARK-27598: --- Thank you for reporting, [~skonto]. BTW,

[jira] [Assigned] (SPARK-27610) Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27610: Assignee: Apache Spark > Yarn external shuffle service fails to start when spark.shuffle.

[jira] [Assigned] (SPARK-27610) Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27610: Assignee: (was: Apache Spark) > Yarn external shuffle service fails to start when spa

[jira] [Updated] (SPARK-27610) Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL

2019-04-30 Thread Adrian Muraru (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Muraru updated SPARK-27610: -- Description: Enabling netty epoll mode in yarn shuffle service ({{spark.shuffle.io.mode=EPOLL

[jira] [Created] (SPARK-27610) Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL

2019-04-30 Thread Adrian Muraru (JIRA)
Adrian Muraru created SPARK-27610: - Summary: Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL Key: SPARK-27610 URL: https://issues.apache.org/jira/browse/SPARK-27610 Proje

[jira] [Assigned] (SPARK-27608) Upgrade Surefire plugin to 3.0.0-M3

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27608: Assignee: Apache Spark > Upgrade Surefire plugin to 3.0.0-M3 > --

[jira] [Assigned] (SPARK-27608) Upgrade Surefire plugin to 3.0.0-M3

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27608: Assignee: (was: Apache Spark) > Upgrade Surefire plugin to 3.0.0-M3 > ---

[jira] [Created] (SPARK-27609) [Documentation Issue?] from_json expects values of options dictionary to be

2019-04-30 Thread Zachary Jablons (JIRA)
Zachary Jablons created SPARK-27609: --- Summary: [Documentation Issue?] from_json expects values of options dictionary to be Key: SPARK-27609 URL: https://issues.apache.org/jira/browse/SPARK-27609 Pr

[jira] [Updated] (SPARK-27608) Upgrade Surefire plugin to 3.0.0-M3

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27608: -- Component/s: (was: Tests) Build > Upgrade Surefire plugin to 3.0.0-M3 > -

[jira] [Created] (SPARK-27608) Upgrade Surefire plugin to 3.0.0-M3

2019-04-30 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-27608: - Summary: Upgrade Surefire plugin to 3.0.0-M3 Key: SPARK-27608 URL: https://issues.apache.org/jira/browse/SPARK-27608 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-24422) Add JDK11 in our Jenkins' build servers

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830718#comment-16830718 ] Dongjoon Hyun commented on SPARK-24422: --- Hi, [~dbtsai] and [~shaneknapp]. Can we r

[jira] [Created] (SPARK-27607) Improve performance of Row.toString()

2019-04-30 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-27607: -- Summary: Improve performance of Row.toString() Key: SPARK-27607 URL: https://issues.apache.org/jira/browse/SPARK-27607 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-27548) PySpark toLocalIterator does not raise errors from worker

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830694#comment-16830694 ] Apache Spark commented on SPARK-27548: -- User 'BryanCutler' has created a pull reque

[jira] [Assigned] (SPARK-27548) PySpark toLocalIterator does not raise errors from worker

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27548: Assignee: (was: Apache Spark) > PySpark toLocalIterator does not raise errors from wo

[jira] [Assigned] (SPARK-27548) PySpark toLocalIterator does not raise errors from worker

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27548: Assignee: Apache Spark > PySpark toLocalIterator does not raise errors from worker >

[jira] [Updated] (SPARK-24601) Bump Jackson version to 2.9.6

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24601: -- Fix Version/s: (was: 2.4.3) > Bump Jackson version to 2.9.6 >

[jira] [Updated] (SPARK-27051) Bump Jackson version to 2.9.8

2019-04-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27051: -- Fix Version/s: (was: 2.4.3) > Bump Jackson version to 2.9.8 >

[jira] [Comment Edited] (SPARK-27593) CSV Parser returns 2 DataFrame - Valid and Malformed DFs

2019-04-30 Thread Ladislav Jech (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830596#comment-16830596 ] Ladislav Jech edited comment on SPARK-27593 at 4/30/19 7:12 PM: --

[jira] [Reopened] (SPARK-27593) CSV Parser returns 2 DataFrame - Valid and Malformed DFs

2019-04-30 Thread Ladislav Jech (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ladislav Jech reopened SPARK-27593: --- As per comment, please review again. > CSV Parser returns 2 DataFrame - Valid and Malformed DFs

[jira] [Commented] (SPARK-27593) CSV Parser returns 2 DataFrame - Valid and Malformed DFs

2019-04-30 Thread Ladislav Jech (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830596#comment-16830596 ] Ladislav Jech commented on SPARK-27593: --- [~hyukjin.kwon] - Then return optionally

[jira] [Commented] (SPARK-27597) RuntimeConfig should be serializable

2019-04-30 Thread Nick Dimiduk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830564#comment-16830564 ] Nick Dimiduk commented on SPARK-27597: -- {quote}bq. Do you want to access {{SparkSes

[jira] [Commented] (SPARK-27463) SPIP: Support Dataframe Cogroup via Pandas UDFs

2019-04-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830535#comment-16830535 ] Bryan Cutler commented on SPARK-27463: -- I left some comments on the doc. Overall, I

[jira] [Commented] (SPARK-27566) SIGSEV in Spark SQL during broadcast

2019-04-30 Thread Martin Studer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830530#comment-16830530 ] Martin Studer commented on SPARK-27566: --- Hi [~hyukjin.kwon], I try to collect more

[jira] [Commented] (SPARK-17859) persist should not impede with spark's ability to perform a broadcast join.

2019-04-30 Thread colin fang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830504#comment-16830504 ] colin fang commented on SPARK-17859: The above case works for me in v2.4 {code:java}

[jira] [Resolved] (SPARK-27566) SIGSEV in Spark SQL during broadcast

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27566. -- Resolution: Incomplete No feedback > SIGSEV in Spark SQL during broadcast > -

[jira] [Resolved] (SPARK-27574) spark on kubernetes driver pod phase changed from running to pending and starts another container in pod

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27574. -- Resolution: Invalid If it's asking for help to diagnose symptoms, ask it to mailing list befor

[jira] [Resolved] (SPARK-27582) Add Dataset DSL for left_anti and left_semi joins

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27582. -- Resolution: Won't Fix I don't think we should add a set of aliases. The way of them looks alre

[jira] [Resolved] (SPARK-27585) No such method error (sun.nio.ch.DirectBuffer.cleaner())

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27585. -- Resolution: Duplicate > No such method error (sun.nio.ch.DirectBuffer.cleaner()) > ---

[jira] [Commented] (SPARK-27585) No such method error (sun.nio.ch.DirectBuffer.cleaner())

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830464#comment-16830464 ] Hyukjin Kwon commented on SPARK-27585: -- It was not added. Please search JIRAs next

[jira] [Updated] (SPARK-27585) No such method error (sun.nio.ch.DirectBuffer.cleaner())

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-27585: - Issue Type: Sub-task (was: Bug) Parent: SPARK-24417 > No such method error (sun.nio.ch.

[jira] [Resolved] (SPARK-27587) No such method error (sun.nio.ch.DirectBuffer.cleaner()) when reading big table from JDBC (with one slow query)

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27587. -- Resolution: Duplicate > No such method error (sun.nio.ch.DirectBuffer.cleaner()) when reading

[jira] [Resolved] (SPARK-27593) CSV Parser returns 2 DataFrame - Valid and Malformed DFs

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27593. -- Resolution: Won't Fix Malformed column is just an informative field. I don't think we need a s

[jira] [Commented] (SPARK-27595) Spark couldn't read partitioned(string type) Orc column correctly if the value contains Float/Double value

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830460#comment-16830460 ] Hyukjin Kwon commented on SPARK-27595: -- Don't set Critical+ which is reserved for c

[jira] [Resolved] (SPARK-27594) spark.sql.orc.enableVectorizedReader causes milliseconds in Timestamp to be read incorrectly

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27594. -- Resolution: Cannot Reproduce > spark.sql.orc.enableVectorizedReader causes milliseconds in Tim

[jira] [Resolved] (SPARK-27595) Spark couldn't read partitioned(string type) Orc column correctly if the value contains Float/Double value

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27595. -- Resolution: Not A Problem > Spark couldn't read partitioned(string type) Orc column correctly

[jira] [Updated] (SPARK-27595) Spark couldn't read partitioned(string type) Orc column correctly if the value contains Float/Double value

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-27595: - Description: {code} create external table unique_keys ( key string ,locator_id string , creat

[jira] [Updated] (SPARK-27595) Spark couldn't read partitioned(string type) Orc column correctly if the value contains Float/Double value

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-27595: - Priority: Major (was: Critical) > Spark couldn't read partitioned(string type) Orc column corre

[jira] [Resolved] (SPARK-27597) RuntimeConfig should be serializable

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27597. -- Resolution: Not A Problem > RuntimeConfig should be serializable > ---

[jira] [Commented] (SPARK-27597) RuntimeConfig should be serializable

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830457#comment-16830457 ] Hyukjin Kwon commented on SPARK-27597: -- Yes .. I don't think it's meant to be acces

[jira] [Commented] (SPARK-27600) Unable to start Spark Hive Thrift Server when multiple hive server server share the same metastore

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830456#comment-16830456 ] Hyukjin Kwon commented on SPARK-27600: -- Doesn't look like Spark issue. looks no evi

[jira] [Resolved] (SPARK-27600) Unable to start Spark Hive Thrift Server when multiple hive server server share the same metastore

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27600. -- Resolution: Invalid > Unable to start Spark Hive Thrift Server when multiple hive server serve

[jira] [Commented] (SPARK-27602) SparkSQL CBO can't get true size of partition table after partition pruning

2019-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830455#comment-16830455 ] Hyukjin Kwon commented on SPARK-27602: -- So, what's proposal to fix it? > SparkSQL

[jira] [Assigned] (SPARK-27606) Deprecate `extended` field in ExpressionDescription/ExpressionInfo

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27606: Assignee: (was: Apache Spark) > Deprecate `extended` field in ExpressionDescription/E

[jira] [Assigned] (SPARK-27606) Deprecate `extended` field in ExpressionDescription/ExpressionInfo

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27606: Assignee: Apache Spark > Deprecate `extended` field in ExpressionDescription/ExpressionIn

[jira] [Created] (SPARK-27606) Deprecate `extended` field in ExpressionDescription/ExpressionInfo

2019-04-30 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-27606: Summary: Deprecate `extended` field in ExpressionDescription/ExpressionInfo Key: SPARK-27606 URL: https://issues.apache.org/jira/browse/SPARK-27606 Project: Spark

[jira] [Assigned] (SPARK-25888) Service requests for persist() blocks via external service after dynamic deallocation

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25888: Assignee: Apache Spark > Service requests for persist() blocks via external service after

[jira] [Assigned] (SPARK-25888) Service requests for persist() blocks via external service after dynamic deallocation

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25888: Assignee: (was: Apache Spark) > Service requests for persist() blocks via external se

[jira] [Assigned] (SPARK-27605) Add new column "Partition ID" to the tasks table in stages page

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27605: Assignee: Apache Spark > Add new column "Partition ID" to the tasks table in stages page

[jira] [Assigned] (SPARK-27605) Add new column "Partition ID" to the tasks table in stages page

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27605: Assignee: (was: Apache Spark) > Add new column "Partition ID" to the tasks table in s

[jira] [Created] (SPARK-27605) Add new column "Partition ID" to the tasks table in stages page

2019-04-30 Thread Parth Gandhi (JIRA)
Parth Gandhi created SPARK-27605: Summary: Add new column "Partition ID" to the tasks table in stages page Key: SPARK-27605 URL: https://issues.apache.org/jira/browse/SPARK-27605 Project: Spark

[jira] [Assigned] (SPARK-27591) A bug in UnivocityParser prevents using UDT

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27591: Assignee: Apache Spark > A bug in UnivocityParser prevents using UDT > --

[jira] [Assigned] (SPARK-27591) A bug in UnivocityParser prevents using UDT

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27591: Assignee: (was: Apache Spark) > A bug in UnivocityParser prevents using UDT > ---

[jira] [Commented] (SPARK-27597) RuntimeConfig should be serializable

2019-04-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830179#comment-16830179 ] Liang-Chi Hsieh commented on SPARK-27597: - Do you want to access {{SparkSession}

[jira] [Assigned] (SPARK-27604) Enhance constant and constraint propagation

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27604: Assignee: Apache Spark > Enhance constant and constraint propagation > --

[jira] [Assigned] (SPARK-27604) Enhance constant and constraint propagation

2019-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27604: Assignee: (was: Apache Spark) > Enhance constant and constraint propagation > ---

[jira] [Commented] (SPARK-27595) Spark couldn't read partitioned(string type) Orc column correctly if the value contains Float/Double value

2019-04-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830165#comment-16830165 ] Liang-Chi Hsieh commented on SPARK-27595: - Is turning off {{spark.sql.sources.pa
