[jira] [Assigned] (SPARK-27590) do not consider skipped tasks when scheduling speculative tasks

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27590: Assignee: Wenchen Fan (was: Apache Spark) > do not consider skipped tasks when schedulin

[jira] [Assigned] (SPARK-27590) do not consider skipped tasks when scheduling speculative tasks

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27590: Assignee: Apache Spark (was: Wenchen Fan) > do not consider skipped tasks when schedulin

[jira] [Created] (SPARK-27590) do not consider skipped tasks when scheduling speculative tasks

2019-04-29 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-27590: --- Summary: do not consider skipped tasks when scheduling speculative tasks Key: SPARK-27590 URL: https://issues.apache.org/jira/browse/SPARK-27590 Project: Spark

[jira] [Commented] (SPARK-23191) Workers registration failes in case of network drop

2019-04-29 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829051#comment-16829051 ] zuotingbing commented on SPARK-23191: - we faced the same issue in standalone HA mode

[jira] [Comment Edited] (SPARK-23191) Workers registration failes in case of network drop

2019-04-29 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829051#comment-16829051 ] zuotingbing edited comment on SPARK-23191 at 4/29/19 9:28 AM:

[jira] [Created] (SPARK-27591) A bug in UnivocityParser prevents using UDT

2019-04-29 Thread Artem Kalchenko (JIRA)
Artem Kalchenko created SPARK-27591: --- Summary: A bug in UnivocityParser prevents using UDT Key: SPARK-27591 URL: https://issues.apache.org/jira/browse/SPARK-27591 Project: Spark Issue Type:

[jira] [Commented] (SPARK-27591) A bug in UnivocityParser prevents using UDT

2019-04-29 Thread Artem Kalchenko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829067#comment-16829067 ] Artem Kalchenko commented on SPARK-27591: - I change the line UnivocityParser:184

[jira] [Assigned] (SPARK-27580) Implement `doCanonicalize` in BatchScanExec for comparing query plan results

2019-04-29 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-27580: --- Assignee: Gengliang Wang > Implement `doCanonicalize` in BatchScanExec for comparing query

[jira] [Resolved] (SPARK-27580) Implement `doCanonicalize` in BatchScanExec for comparing query plan results

2019-04-29 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-27580. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24475 [https://gith

[jira] [Created] (SPARK-27592) Write the data of table write information to metadata

2019-04-29 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-27592: --- Summary: Write the data of table write information to metadata Key: SPARK-27592 URL: https://issues.apache.org/jira/browse/SPARK-27592 Project: Spark Issue Typ

[jira] [Commented] (SPARK-21827) Task fail due to executor exception when enable Sasl Encryption

2019-04-29 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-21827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829089#comment-16829089 ] Sébastien BARNOUD commented on SPARK-21827: --- Hi, I was investigating timeo

[jira] [Comment Edited] (SPARK-21827) Task fail due to executor exception when enable Sasl Encryption

2019-04-29 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-21827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829089#comment-16829089 ] Sébastien BARNOUD edited comment on SPARK-21827 at 4/29/19 10:03 AM: -

[jira] [Comment Edited] (SPARK-21827) Task fail due to executor exception when enable Sasl Encryption

2019-04-29 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-21827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829089#comment-16829089 ] Sébastien BARNOUD edited comment on SPARK-21827 at 4/29/19 10:05 AM: -

[jira] [Assigned] (SPARK-27592) Write the data of table write information to metadata

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27592: Assignee: Apache Spark > Write the data of table write information to metadata >

[jira] [Assigned] (SPARK-27592) Write the data of table write information to metadata

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27592: Assignee: (was: Apache Spark) > Write the data of table write information to metadata

[jira] [Updated] (SPARK-27592) Write the data of table write information to metadata

2019-04-29 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-27592: Description: We hint Hive using incorrect InputFormat(org.apache.hadoop.mapred.SequenceFileInputF

[jira] [Commented] (SPARK-20880) When spark SQL is used with Avro-backed HIVE tables, NPE from org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.supportedCategories.

2019-04-29 Thread DineshPandian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829093#comment-16829093 ] DineshPandian commented on SPARK-20880: --- Hi, any update on this? > When spark SQL

[jira] [Created] (SPARK-27593) CSV Parser returns 2 DataFrame - Valid and Malformed DFs

2019-04-29 Thread Ladislav Jech (JIRA)
Ladislav Jech created SPARK-27593: - Summary: CSV Parser returns 2 DataFrame - Valid and Malformed DFs Key: SPARK-27593 URL: https://issues.apache.org/jira/browse/SPARK-27593 Project: Spark Is

[jira] [Updated] (SPARK-27593) CSV Parser returns 2 DataFrame - Valid and Malformed DFs

2019-04-29 Thread Ladislav Jech (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ladislav Jech updated SPARK-27593: -- Description: When we process CSV in any kind of data warehouse, it's common procedure to repor

[jira] [Updated] (SPARK-27593) CSV Parser returns 2 DataFrame - Valid and Malformed DFs

2019-04-29 Thread Ladislav Jech (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ladislav Jech updated SPARK-27593: -- Description: When we process CSV in any kind of data warehouse, it's common procedure to repor

[jira] [Updated] (SPARK-27593) CSV Parser returns 2 DataFrame - Valid and Malformed DFs

2019-04-29 Thread Ladislav Jech (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ladislav Jech updated SPARK-27593: -- Issue Type: New Feature (was: Improvement) > CSV Parser returns 2 DataFrame - Valid and Malfo

[jira] [Commented] (SPARK-24771) Upgrade AVRO version from 1.7.7 to 1.8.2

2019-04-29 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829155#comment-16829155 ] Steve Loughran commented on SPARK-24771: Update: Hadoop is going to update its av

[jira] [Comment Edited] (SPARK-21827) Task fail due to executor exception when enable Sasl Encryption

2019-04-29 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-21827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829089#comment-16829089 ] Sébastien BARNOUD edited comment on SPARK-21827 at 4/29/19 12:15 PM: -

[jira] [Resolved] (SPARK-27581) DataFrame countDistinct("*") fails with AnalysisException: "Invalid usage of '*' in expression 'count'"

2019-04-29 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-27581. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24482 [https://gith

[jira] [Assigned] (SPARK-27581) DataFrame countDistinct("*") fails with AnalysisException: "Invalid usage of '*' in expression 'count'"

2019-04-29 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-27581: --- Assignee: Liang-Chi Hsieh > DataFrame countDistinct("*") fails with AnalysisException: "Inv

[jira] [Commented] (SPARK-23191) Workers registration failes in case of network drop

2019-04-29 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829260#comment-16829260 ] Wenchen Fan commented on SPARK-23191: - [~Ngone51] can you take a look please? > Wor

[jira] [Assigned] (SPARK-27555) cannot create table by using the hive default fileformat in both hive-site.xml and spark-defaults.conf

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27555: Assignee: Apache Spark > cannot create table by using the hive default fileformat in both

[jira] [Assigned] (SPARK-27555) cannot create table by using the hive default fileformat in both hive-site.xml and spark-defaults.conf

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27555: Assignee: (was: Apache Spark) > cannot create table by using the hive default filefor

[jira] [Commented] (SPARK-23191) Workers registration failes in case of network drop

2019-04-29 Thread wuyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829272#comment-16829272 ] wuyi commented on SPARK-23191: -- [~cloud_fan] Ok, I'll have a deep look after 5.1 holiday.

[jira] [Created] (SPARK-27594) spark.sql.orc.enableVectorizedReader causes milliseconds in Timestamp to be read incorrectly

2019-04-29 Thread Jan-Willem van der Sijp (JIRA)
Jan-Willem van der Sijp created SPARK-27594: --- Summary: spark.sql.orc.enableVectorizedReader causes milliseconds in Timestamp to be read incorrectly Key: SPARK-27594 URL: https://issues.apache.org/jira/br

[jira] [Commented] (SPARK-27574) spark on kubernetes driver pod phase changed from running to pending and starts another container in pod

2019-04-29 Thread Udbhav Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829292#comment-16829292 ] Udbhav Agrawal commented on SPARK-27574: Hey [~zyfo2] can you share driver pod l

[jira] [Commented] (SPARK-27586) Improve binary comparison: replace Scala's for-comprehension if statements with while loop

2019-04-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829303#comment-16829303 ] Josh Rosen commented on SPARK-27586: Good find! This sounds pretty straightforward t

[jira] [Commented] (SPARK-27213) Unexpected results when filter is used after distinct

2019-04-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829309#comment-16829309 ] Josh Rosen commented on SPARK-27213: Hmm, this must have been fixed relatively recen

[jira] [Commented] (SPARK-27591) A bug in UnivocityParser prevents using UDT

2019-04-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829336#comment-16829336 ] Liang-Chi Hsieh commented on SPARK-27591: - Are you returning string in {{seriali

[jira] [Updated] (SPARK-27574) spark on kubernetes driver pod phase changed from running to pending and starts another container in pod

2019-04-29 Thread Will Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Zhang updated SPARK-27574: --- Attachment: driver-pod-logs.zip > spark on kubernetes driver pod phase changed from running to pendi

[jira] [Commented] (SPARK-27574) spark on kubernetes driver pod phase changed from running to pending and starts another container in pod

2019-04-29 Thread Will Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829337#comment-16829337 ] Will Zhang commented on SPARK-27574: Hi [~Udbhav Agrawal], the driver log is nothin

[jira] [Comment Edited] (SPARK-27574) spark on kubernetes driver pod phase changed from running to pending and starts another container in pod

2019-04-29 Thread Will Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829337#comment-16829337 ] Will Zhang edited comment on SPARK-27574 at 4/29/19 3:21 PM: -

[jira] [Comment Edited] (SPARK-27574) spark on kubernetes driver pod phase changed from running to pending and starts another container in pod

2019-04-29 Thread Will Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829337#comment-16829337 ] Will Zhang edited comment on SPARK-27574 at 4/29/19 3:21 PM: -

[jira] [Updated] (SPARK-27574) spark on kubernetes driver pod phase changed from running to pending and starts another container in pod

2019-04-29 Thread Will Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Zhang updated SPARK-27574: --- Description: I'm using spark-on-kubernetes to submit spark app to kubernetes. most of the time, it r

[jira] [Commented] (SPARK-27591) A bug in UnivocityParser prevents using UDT

2019-04-29 Thread Artem Kalchenko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829354#comment-16829354 ] Artem Kalchenko commented on SPARK-27591: - [~viirya], yes, I'm returning a strin

[jira] [Commented] (SPARK-27591) A bug in UnivocityParser prevents using UDT

2019-04-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829370#comment-16829370 ] Liang-Chi Hsieh commented on SPARK-27591: - oh, you're right. I've misread the de

[jira] [Resolved] (SPARK-27472) Docuement binary file data source in Spark user guide

2019-04-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-27472. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24484 [https://

[jira] [Commented] (SPARK-27591) A bug in UnivocityParser prevents using UDT

2019-04-29 Thread Artem Kalchenko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829379#comment-16829379 ] Artem Kalchenko commented on SPARK-27591: - yes, I will later today > A bug in U

[jira] [Resolved] (SPARK-27536) Code improvements for 3.0: existentials edition

2019-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27536. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24431 [https://github.c

[jira] [Created] (SPARK-27595) Spark couldn't read partitioned(string type) Orc column correctly if the value contains Float/Double value

2019-04-29 Thread Ameer Basha Pattan (JIRA)
Ameer Basha Pattan created SPARK-27595: -- Summary: Spark couldn't read partitioned(string type) Orc column correctly if the value contains Float/Double value Key: SPARK-27595 URL: https://issues.apache.org/jir

[jira] [Resolved] (SPARK-27571) Spark 3.0 build warnings: reflectiveCalls edition

2019-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27571. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24463 [https://github.c

[jira] [Created] (SPARK-27596) The JDBC 'query' option doesn't work for Oracle database

2019-04-29 Thread Xiao Li (JIRA)
Xiao Li created SPARK-27596: --- Summary: The JDBC 'query' option doesn't work for Oracle database Key: SPARK-27596 URL: https://issues.apache.org/jira/browse/SPARK-27596 Project: Spark Issue Type: Im

[jira] [Resolved] (SPARK-23014) Migrate MemorySink fully to v2

2019-04-29 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-23014. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24403 [https:

[jira] [Assigned] (SPARK-23014) Migrate MemorySink fully to v2

2019-04-29 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-23014: -- Assignee: Gabor Somogyi > Migrate MemorySink fully to v2 > --

[jira] [Resolved] (SPARK-27575) Spark overwrites existing value of spark.yarn.dist.* instead of merging value

2019-04-29 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-27575. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24465 [https:

[jira] [Assigned] (SPARK-27575) Spark overwrites existing value of spark.yarn.dist.* instead of merging value

2019-04-29 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-27575: -- Assignee: Jungtaek Lim > Spark overwrites existing value of spark.yarn.dist.* instead

[jira] [Commented] (SPARK-21492) Memory leak in SortMergeJoin

2019-04-29 Thread Tao Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829545#comment-16829545 ] Tao Luo commented on SPARK-21492: - The problem is that the task won't complete because o

[jira] [Commented] (SPARK-27567) Spark Streaming consumers (from Kafka) intermittently die with 'SparkException: Couldn't find leaders for Set'

2019-04-29 Thread Dmitry Goldenberg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829555#comment-16829555 ] Dmitry Goldenberg commented on SPARK-27567: --- Hi Liang-Chi, So you must be ref

[jira] [Created] (SPARK-27597) RuntimeConfig should be serializable

2019-04-29 Thread Nick Dimiduk (JIRA)
Nick Dimiduk created SPARK-27597: Summary: RuntimeConfig should be serializable Key: SPARK-27597 URL: https://issues.apache.org/jira/browse/SPARK-27597 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-27597) RuntimeConfig should be serializable

2019-04-29 Thread Nick Dimiduk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829612#comment-16829612 ] Nick Dimiduk commented on SPARK-27597: -- It would be nice if there was an API buil

[jira] [Updated] (SPARK-27396) SPIP: Public APIs for extended Columnar Processing Support

2019-04-29 Thread Robert Joseph Evans (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated SPARK-27396: Description: *SPIP: Columnar Processing Without Arrow Formatting Guarantees.*  

[jira] [Commented] (SPARK-27396) SPIP: Public APIs for extended Columnar Processing Support

2019-04-29 Thread Robert Joseph Evans (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829631#comment-16829631 ] Robert Joseph Evans commented on SPARK-27396: - I have updated this SPIP to c

[jira] [Created] (SPARK-27598) DStreams checkpointing does not work with Scala 2.12

2019-04-29 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-27598: --- Summary: DStreams checkpointing does not work with Scala 2.12 Key: SPARK-27598 URL: https://issues.apache.org/jira/browse/SPARK-27598 Project: Spark

[jira] [Updated] (SPARK-27598) DStreams checkpointing does not work with Scala 2.12

2019-04-29 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-27598: Description: When I restarted a stream with checkpointing enabled I got this: {quo

[jira] [Updated] (SPARK-27598) DStreams checkpointing does not work with Scala 2.12

2019-04-29 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-27598: Description: When I restarted a stream with checkpointing enabled I got this: {quo

[jira] [Updated] (SPARK-27598) DStreams checkpointing does not work with Scala 2.12

2019-04-29 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-27598: Priority: Critical (was: Blocker) > DStreams checkpointing does not work with Sca

[jira] [Updated] (SPARK-27599) DataFrameWriter.partitionBy should be optional when writing to a hive table

2019-04-29 Thread Nick Dimiduk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated SPARK-27599: - Summary: DataFrameWriter.partitionBy should be optional when writing to a hive table (was: Data

[jira] [Updated] (SPARK-27598) DStreams checkpointing does not work with Scala 2.12

2019-04-29 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-27598: Description: When I restarted a stream with checkpointing enabled I got this: {quo

[jira] [Created] (SPARK-27599) DataFrameWriter$partitionBy should be optional when writing to a hive table

2019-04-29 Thread Nick Dimiduk (JIRA)
Nick Dimiduk created SPARK-27599: Summary: DataFrameWriter$partitionBy should be optional when writing to a hive table Key: SPARK-27599 URL: https://issues.apache.org/jira/browse/SPARK-27599 Project:

[jira] [Updated] (SPARK-27547) fix DataFrame self-join problems

2019-04-29 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27547: -- Labels: correctness (was: ) > fix DataFrame self-join problems >

[jira] [Assigned] (SPARK-13587) Support virtualenv in PySpark

2019-04-29 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-13587: -- Assignee: (was: Marcelo Vanzin) > Support virtualenv in PySpark > ---

[jira] [Updated] (SPARK-13587) Support virtualenv in PySpark

2019-04-29 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-13587: --- Target Version/s: (was: 3.0.0) > Support virtualenv in PySpark > -

[jira] [Resolved] (SPARK-16367) Wheelhouse Support for PySpark

2019-04-29 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-16367. Resolution: Duplicate This is somewhat similar to SPARK-13587 so let's keep the discussion

[jira] [Assigned] (SPARK-13587) Support virtualenv in PySpark

2019-04-29 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-13587: -- Assignee: Marcelo Vanzin > Support virtualenv in PySpark > --

[jira] [Resolved] (SPARK-27588) Fail fast if binary file data source will load a file that is bigger than 2GB

2019-04-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-27588. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24483 [https://

[jira] [Commented] (SPARK-27548) PySpark toLocalIterator does not raise errors from worker

2019-04-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829861#comment-16829861 ] Bryan Cutler commented on SPARK-27548: -- This is not that easy to fix by itself. Sin

[jira] [Comment Edited] (SPARK-27548) PySpark toLocalIterator does not raise errors from worker

2019-04-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829861#comment-16829861 ] Bryan Cutler edited comment on SPARK-27548 at 4/30/19 12:46 AM: --

[jira] [Comment Edited] (SPARK-27548) PySpark toLocalIterator does not raise errors from worker

2019-04-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829861#comment-16829861 ] Bryan Cutler edited comment on SPARK-27548 at 4/30/19 12:46 AM: --

[jira] [Updated] (SPARK-27598) DStreams checkpointing does not work with Scala 2.12

2019-04-29 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-27598: Description: When I restarted a stream with checkpointing enabled I got this: {quo

[jira] [Updated] (SPARK-27598) DStreams checkpointing does not work with Scala 2.12

2019-04-29 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-27598: Priority: Major (was: Critical) > DStreams checkpointing does not work with Scala

[jira] [Created] (SPARK-27600) Unable to start Spark Hive Thrift Server when multiple hive server server share the same metastore

2019-04-29 Thread pin_zhang (JIRA)
pin_zhang created SPARK-27600: - Summary: Unable to start Spark Hive Thrift Server when multiple hive server server share the same metastore Key: SPARK-27600 URL: https://issues.apache.org/jira/browse/SPARK-27600

[jira] [Updated] (SPARK-27525) Exclude commons-httpclient when interacting with different versions of the HiveMetastoreClient

2019-04-29 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-27525: Description: {noformat} Error Message java.lang.RuntimeException: [download failed: commons-httpc

[jira] [Resolved] (SPARK-27525) Exclude commons-httpclient when interacting with different versions of the HiveMetastoreClient

2019-04-29 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-27525. - Resolution: Not A Problem > Exclude commons-httpclient when interacting with different versions

[jira] [Commented] (SPARK-27525) Exclude commons-httpclient when interacting with different versions of the HiveMetastoreClient

2019-04-29 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829880#comment-16829880 ] Yuming Wang commented on SPARK-27525: - It's jenkins issue. > Exclude commons-httpcl

[jira] [Created] (SPARK-27601) Upgrade stream-lib to 2.9.6

2019-04-29 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-27601: --- Summary: Upgrade stream-lib to 2.9.6 Key: SPARK-27601 URL: https://issues.apache.org/jira/browse/SPARK-27601 Project: Spark Issue Type: Improvement C

[jira] [Updated] (SPARK-27601) Upgrade stream-lib to 2.9.6

2019-04-29 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-27601: Description: 1. Improve HyperLogLogPlus.merge and HyperLogLogPlus.mergeEstimators by using native

[jira] [Created] (SPARK-27602) SparkSQL CBO can't get true size of partition table after partition pruning

2019-04-29 Thread angerszhu (JIRA)
angerszhu created SPARK-27602: - Summary: SparkSQL CBO can't get true size of partition table after partition pruning Key: SPARK-27602 URL: https://issues.apache.org/jira/browse/SPARK-27602 Project: Spark

[jira] [Assigned] (SPARK-27601) Upgrade stream-lib to 2.9.6

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27601: Assignee: (was: Apache Spark) > Upgrade stream-lib to 2.9.6 > ---

[jira] [Assigned] (SPARK-27601) Upgrade stream-lib to 2.9.6

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27601: Assignee: Apache Spark > Upgrade stream-lib to 2.9.6 > --- > >

[jira] [Commented] (SPARK-27051) Bump Jackson version to 2.9.8

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829959#comment-16829959 ] Apache Spark commented on SPARK-27051: -- User 'gatorsmile' has created a pull reques

[jira] [Commented] (SPARK-27051) Bump Jackson version to 2.9.8

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829960#comment-16829960 ] Apache Spark commented on SPARK-27051: -- User 'gatorsmile' has created a pull reques

[jira] [Assigned] (SPARK-27586) Improve binary comparison: replace Scala's for-comprehension if statements with while loop

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27586: Assignee: (was: Apache Spark) > Improve binary comparison: replace Scala's for-compre

[jira] [Assigned] (SPARK-27586) Improve binary comparison: replace Scala's for-comprehension if statements with while loop

2019-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27586: Assignee: Apache Spark > Improve binary comparison: replace Scala's for-comprehension if

[jira] [Updated] (SPARK-27298) Dataset except operation gives different results(dataset count) on Spark 2.3.0 Windows and Spark 2.3.0 Linux environment

2019-04-29 Thread Mahima Khatri (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahima Khatri updated SPARK-27298: -- Attachment: console-result-spark-2.4.2-windows > Dataset except operation gives different resu

[jira] [Updated] (SPARK-27298) Dataset except operation gives different results(dataset count) on Spark 2.3.0 Windows and Spark 2.3.0 Linux environment

2019-04-29 Thread Mahima Khatri (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahima Khatri updated SPARK-27298: -- Attachment: console-result-spark-2.4.2-linux > Dataset except operation gives different result

[jira] [Commented] (SPARK-27298) Dataset except operation gives different results(dataset count) on Spark 2.3.0 Windows and Spark 2.3.0 Linux environment

2019-04-29 Thread Mahima Khatri (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829995#comment-16829995 ] Mahima Khatri commented on SPARK-27298: --- I have tested the code with Spark-2.4.2 v