[jira] [Updated] (SPARK-25602) SparkPlan.getByteArrayRdd should not consume the input when not necessary

2018-10-03 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-25602: Summary: SparkPlan.getByteArrayRdd should not consume the input when not necessary (was: range me

[jira] [Resolved] (SPARK-25601) Register Grouped aggregate UDF Vectorized UDFs for SQL Statement

2018-10-03 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-25601. -- Resolution: Fixed Assignee: Hyukjin Kwon Fix Version/s: 3.0.0

[jira] [Comment Edited] (SPARK-25461) PySpark Pandas UDF outputs incorrect results when input columns contain None

2018-10-03 Thread Chongyuan Xiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637657#comment-16637657 ] Chongyuan Xiang edited comment on SPARK-25461 at 10/4/18 12:37 AM: ---

[jira] [Commented] (SPARK-25461) PySpark Pandas UDF outputs incorrect results when input columns contain None

2018-10-03 Thread Chongyuan Xiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637657#comment-16637657 ] Chongyuan Xiang commented on SPARK-25461: - Hi all, thanks for looking into the i

[jira] [Comment Edited] (SPARK-25461) PySpark Pandas UDF outputs incorrect results when input columns contain None

2018-10-03 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637626#comment-16637626 ] Bryan Cutler edited comment on SPARK-25461 at 10/3/18 11:53 PM: --

[jira] [Commented] (SPARK-25538) incorrect row counts after distinct()

2018-10-03 Thread Steven Rand (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637630#comment-16637630 ] Steven Rand commented on SPARK-25538: - Thanks all! > incorrect row counts after dis

[jira] [Commented] (SPARK-25461) PySpark Pandas UDF outputs incorrect results when input columns contain None

2018-10-03 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637626#comment-16637626 ] Bryan Cutler commented on SPARK-25461: -- I filed ARROW-3428, which deals with the inc

[jira] [Assigned] (SPARK-25637) SparkException: Could not find CoarseGrainedScheduler occurs during the application stop

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25637: Assignee: Apache Spark > SparkException: Could not find CoarseGrainedScheduler occurs dur

[jira] [Updated] (SPARK-25586) toString method of GeneralizedLinearRegressionTrainingSummary runs in infinite loop throwing StackOverflowError

2018-10-03 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-25586: --- Issue Type: Bug (was: Improvement) > toString method of GeneralizedLinearRegressionTraining

[jira] [Assigned] (SPARK-25637) SparkException: Could not find CoarseGrainedScheduler occurs during the application stop

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25637: Assignee: (was: Apache Spark) > SparkException: Could not find CoarseGrainedScheduler

[jira] [Commented] (SPARK-25637) SparkException: Could not find CoarseGrainedScheduler occurs during the application stop

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637606#comment-16637606 ] Apache Spark commented on SPARK-25637: -- User 'devaraj-kavali' has created a pull re

[jira] [Commented] (SPARK-25586) toString method of GeneralizedLinearRegressionTrainingSummary runs in infinite loop throwing StackOverflowError

2018-10-03 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637605#comment-16637605 ] Marcelo Vanzin commented on SPARK-25586: bq. This is not a bug Actually it's a

[jira] [Commented] (SPARK-25637) SparkException: Could not find CoarseGrainedScheduler occurs during the application stop

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637604#comment-16637604 ] Apache Spark commented on SPARK-25637: -- User 'devaraj-kavali' has created a pull re

[jira] [Resolved] (SPARK-25586) toString method of GeneralizedLinearRegressionTrainingSummary runs in infinite loop throwing StackOverflowError

2018-10-03 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-25586. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22616 [https:

[jira] [Assigned] (SPARK-25586) toString method of GeneralizedLinearRegressionTrainingSummary runs in infinite loop throwing StackOverflowError

2018-10-03 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-25586: -- Assignee: Ankur Gupta > toString method of GeneralizedLinearRegressionTrainingSummary

[jira] [Created] (SPARK-25637) SparkException: Could not find CoarseGrainedScheduler occurs during the application stop

2018-10-03 Thread Devaraj K (JIRA)
Devaraj K created SPARK-25637: - Summary: SparkException: Could not find CoarseGrainedScheduler occurs during the application stop Key: SPARK-25637 URL: https://issues.apache.org/jira/browse/SPARK-25637 Pr

[jira] [Commented] (SPARK-23781) Merge YARN and Mesos token renewal code

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637582#comment-16637582 ] Apache Spark commented on SPARK-23781: -- User 'vanzin' has created a pull request fo

[jira] [Commented] (SPARK-23781) Merge YARN and Mesos token renewal code

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637581#comment-16637581 ] Apache Spark commented on SPARK-23781: -- User 'vanzin' has created a pull request fo

[jira] [Assigned] (SPARK-23781) Merge YARN and Mesos token renewal code

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23781: Assignee: Apache Spark > Merge YARN and Mesos token renewal code > --

[jira] [Assigned] (SPARK-23781) Merge YARN and Mesos token renewal code

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23781: Assignee: (was: Apache Spark) > Merge YARN and Mesos token renewal code > ---

[jira] [Commented] (SPARK-25005) Structured streaming doesn't support kafka transaction (creating empty offset with abort & markers)

2018-10-03 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637564#comment-16637564 ] Shixiong Zhu commented on SPARK-25005: -- [~qambard] Not sure about your question. If

[jira] [Commented] (SPARK-25005) Structured streaming doesn't support kafka transaction (creating empty offset with abort & markers)

2018-10-03 Thread Quentin Ambard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637560#comment-16637560 ] Quentin Ambard commented on SPARK-25005: ok I see, great idea, and the consumer

[jira] [Commented] (SPARK-25005) Structured streaming doesn't support kafka transaction (creating empty offset with abort & markers)

2018-10-03 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637541#comment-16637541 ] Shixiong Zhu commented on SPARK-25005: -- [~qambard] If `poll` returns and offset get

[jira] [Commented] (SPARK-25005) Structured streaming doesn't support kafka transaction (creating empty offset with abort & markers)

2018-10-03 Thread Quentin Ambard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637532#comment-16637532 ] Quentin Ambard commented on SPARK-25005: How do you make difference between data

[jira] [Assigned] (SPARK-25636) spark-submit swallows the failure reason when there is an error connecting to master

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25636: Assignee: Apache Spark > spark-submit swallows the failure reason when there is an error

[jira] [Commented] (SPARK-25636) spark-submit swallows the failure reason when there is an error connecting to master

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637530#comment-16637530 ] Apache Spark commented on SPARK-25636: -- User 'devaraj-kavali' has created a pull re

[jira] [Assigned] (SPARK-25636) spark-submit swallows the failure reason when there is an error connecting to master

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25636: Assignee: (was: Apache Spark) > spark-submit swallows the failure reason when there i

[jira] [Created] (SPARK-25636) spark-submit swallows the failure reason when there is an error connecting to master

2018-10-03 Thread Devaraj K (JIRA)
Devaraj K created SPARK-25636: - Summary: spark-submit swallows the failure reason when there is an error connecting to master Key: SPARK-25636 URL: https://issues.apache.org/jira/browse/SPARK-25636 Projec

[jira] [Assigned] (SPARK-25635) Support selective direct encoding in native ORC write

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25635: Assignee: Dongjoon Hyun (was: Apache Spark) > Support selective direct encoding in nativ

[jira] [Commented] (SPARK-25635) Support selective direct encoding in native ORC write

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637506#comment-16637506 ] Apache Spark commented on SPARK-25635: -- User 'dongjoon-hyun' has created a pull req

[jira] [Assigned] (SPARK-25635) Support selective direct encoding in native ORC write

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25635: Assignee: Apache Spark (was: Dongjoon Hyun) > Support selective direct encoding in nativ

[jira] [Assigned] (SPARK-25635) Support selective direct encoding in native ORC write

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-25635: - Assignee: Dongjoon Hyun > Support selective direct encoding in native ORC write > -

[jira] [Created] (SPARK-25635) Support selective direct encoding in native ORC write

2018-10-03 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-25635: - Summary: Support selective direct encoding in native ORC write Key: SPARK-25635 URL: https://issues.apache.org/jira/browse/SPARK-25635 Project: Spark Issue

[jira] [Commented] (SPARK-25633) Performance Improvement for Drools Spark Jobs.

2018-10-03 Thread Koushik (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637485#comment-16637485 ] Koushik commented on SPARK-25633: - yes we can connect @ 11 AM ET tomorrow. > Performanc

[jira] [Commented] (SPARK-17895) Improve documentation of "rowsBetween" and "rangeBetween"

2018-10-03 Thread Antonio Pedro de Sousa Vieira (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637464#comment-16637464 ] Antonio Pedro de Sousa Vieira commented on SPARK-17895: --- These cha

[jira] [Updated] (SPARK-25633) Performance Improvement for Drools Spark Jobs.

2018-10-03 Thread Koushik (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koushik updated SPARK-25633: Attachment: RTTA Performance Issue.pptx > Performance Improvement for Drools Spark Jobs. > ---

[jira] [Commented] (SPARK-25634) New Metrics in External Shuffle Service to help identify abusing application

2018-10-03 Thread Ye Zhou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637436#comment-16637436 ] Ye Zhou commented on SPARK-25634: - [~felixcheung] [~vanzin] [~tgraves] [~irashid] [~

[jira] [Created] (SPARK-25634) New Metrics in External Shuffle Service to help identify abusing application

2018-10-03 Thread Ye Zhou (JIRA)
Ye Zhou created SPARK-25634: --- Summary: New Metrics in External Shuffle Service to help identify abusing application Key: SPARK-25634 URL: https://issues.apache.org/jira/browse/SPARK-25634 Project: Spark

[jira] [Created] (SPARK-25633) Performance Improvement for Drools Spark Jobs.

2018-10-03 Thread Koushik (JIRA)
Koushik created SPARK-25633: --- Summary: Performance Improvement for Drools Spark Jobs. Key: SPARK-25633 URL: https://issues.apache.org/jira/browse/SPARK-25633 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-25632) KafkaRDDSuite: compacted topic 2 min 5 sec.

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25632: --- Summary: KafkaRDDSuite: compacted topic 2 min 5 sec. Key: SPARK-25632 URL: https://issues.apache.org/jira/browse/SPARK-25632 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-25631) KafkaRDDSuite: basic usage 2 min 4 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25631: --- Summary: KafkaRDDSuite: basic usage 2 min 4 sec Key: SPARK-25631 URL: https://issues.apache.org/jira/browse/SPARK-25631 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-25630) HiveOrcHadoopFsRelationSuite: SPARK-8406: Avoids name collision while writing files 21 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25630: --- Summary: HiveOrcHadoopFsRelationSuite: SPARK-8406: Avoids name collision while writing files 21 sec Key: SPARK-25630 URL: https://issues.apache.org/jira/browse/SPARK-25630 Proj

[jira] [Created] (SPARK-25629) ParquetFilterSuite: filter pushdown - decimal 16 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25629: --- Summary: ParquetFilterSuite: filter pushdown - decimal 16 sec Key: SPARK-25629 URL: https://issues.apache.org/jira/browse/SPARK-25629 Project: Spark Issue Type: Sub-ta

[jira] [Created] (SPARK-25628) DistributedSuite: recover from repeated node failures during shuffle-reduce 40 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25628: --- Summary: DistributedSuite: recover from repeated node failures during shuffle-reduce 40 seconds Key: SPARK-25628 URL: https://issues.apache.org/jira/browse/SPARK-25628 Project:

[jira] [Created] (SPARK-25627) ContinuousStressSuite - 8 mins 13 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25627: --- Summary: ContinuousStressSuite - 8 mins 13 sec Key: SPARK-25627 URL: https://issues.apache.org/jira/browse/SPARK-25627 Project: Spark Issue Type: Sub-task Co

[jira] [Created] (SPARK-25626) HiveClientSuites: getPartitionsByFilter returns all partitions when hive.metastore.try.direct.sql=false 46 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25626: --- Summary: HiveClientSuites: getPartitionsByFilter returns all partitions when hive.metastore.try.direct.sql=false 46 sec Key: SPARK-25626 URL: https://issues.apache.org/jira/browse/SPARK-256

[jira] [Created] (SPARK-25625) LogisticRegressionSuite.binary logistic regression with intercept with ElasticNet regularization - 33 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25625: --- Summary: LogisticRegressionSuite.binary logistic regression with intercept with ElasticNet regularization - 33 sec Key: SPARK-25625 URL: https://issues.apache.org/jira/browse/SPARK-25625

[jira] [Created] (SPARK-25624) LogisticRegressionSuite.multinomial logistic regression with intercept with elasticnet regularization 56 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25624: --- Summary: LogisticRegressionSuite.multinomial logistic regression with intercept with elasticnet regularization 56 seconds Key: SPARK-25624 URL: https://issues.apache.org/jira/browse/SPARK-2

[jira] [Created] (SPARK-25623) LogisticRegressionSuite: multinomial logistic regression with intercept with L1 regularization 1 min 10 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25623: --- Summary: LogisticRegressionSuite: multinomial logistic regression with intercept with L1 regularization 1 min 10 sec Key: SPARK-25623 URL: https://issues.apache.org/jira/browse/SPARK-25623

[jira] [Created] (SPARK-25622) BucketedReadWithHiveSupportSuite: read partitioning bucketed tables with bucket pruning filters - 42 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25622: --- Summary: BucketedReadWithHiveSupportSuite: read partitioning bucketed tables with bucket pruning filters - 42 seconds Key: SPARK-25622 URL: https://issues.apache.org/jira/browse/SPARK-25622

[jira] [Created] (SPARK-25621) BucketedReadWithHiveSupportSuite: read partitioning bucketed tables having composite filters 45 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25621: --- Summary: BucketedReadWithHiveSupportSuite: read partitioning bucketed tables having composite filters 45 sec Key: SPARK-25621 URL: https://issues.apache.org/jira/browse/SPARK-25621

[jira] [Updated] (SPARK-25620) WithAggregationKinesisStreamSuite: failure recovery 1 min 36 seconds

2018-10-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-25620: Description: org.apache.spark.streaming.kinesis.WithAggregationKinesisStreamSuite.failure recovery Took

[jira] [Updated] (SPARK-25619) WithAggregationKinesisStreamSuite: split and merge shards in a stream 2 min 15 sec

2018-10-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-25619: Description: org.apache.spark.streaming.kinesis.WithAggregationKinesisStreamSuite.split and merge shards

[jira] [Reopened] (SPARK-25582) Error in Spark logs when using the org.apache.spark:spark-sql_2.11:2.2.0 Java library

2018-10-03 Thread Thomas Brugiere (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Brugiere reopened SPARK-25582: - > Error in Spark logs when using the org.apache.spark:spark-sql_2.11:2.2.0 Java > library >

[jira] [Resolved] (SPARK-25582) Error in Spark logs when using the org.apache.spark:spark-sql_2.11:2.2.0 Java library

2018-10-03 Thread Thomas Brugiere (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Brugiere resolved SPARK-25582. - Resolution: Later > Error in Spark logs when using the org.apache.spark:spark-sql_2.11:2

[jira] [Created] (SPARK-25620) WithAggregationKinesisStreamSuite: failure recovery 1 min 36 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25620: --- Summary: WithAggregationKinesisStreamSuite: failure recovery 1 min 36 seconds Key: SPARK-25620 URL: https://issues.apache.org/jira/browse/SPARK-25620 Project: Spark I

[jira] [Created] (SPARK-25619) WithAggregationKinesisStreamSuite: split and merge shards in a stream 2 min 15 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25619: --- Summary: WithAggregationKinesisStreamSuite: split and merge shards in a stream 2 min 15 sec Key: SPARK-25619 URL: https://issues.apache.org/jira/browse/SPARK-25619 Project: Spa

[jira] [Commented] (SPARK-25501) Kafka delegation token support

2018-10-03 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637268#comment-16637268 ] Gabor Somogyi commented on SPARK-25501: --- Yeah, it's posted on the dev list. To an

[jira] [Created] (SPARK-25618) KafkaContinuousSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false 1 min 1 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25618: --- Summary: KafkaContinuousSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false 1 min 1 sec Key: SPARK-25618 URL: https://issues.apache.org/jira/browse/SPARK-25618

[jira] [Created] (SPARK-25617) KafkaContinuousSinkSuite: generic - write big data with small producer buffer 56 secs

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25617: --- Summary: KafkaContinuousSinkSuite: generic - write big data with small producer buffer 56 secs Key: SPARK-25617 URL: https://issues.apache.org/jira/browse/SPARK-25617 Project:

[jira] [Created] (SPARK-25616) KafkaSinkSuite: generic - write big data with small producer buffer 57 secs

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25616: --- Summary: KafkaSinkSuite: generic - write big data with small producer buffer 57 secs Key: SPARK-25616 URL: https://issues.apache.org/jira/browse/SPARK-25616 Project: Spark

[jira] [Created] (SPARK-25615) KafkaSinkSuite: streaming - write to non-existing topic 1 min

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25615: --- Summary: KafkaSinkSuite: streaming - write to non-existing topic 1 min Key: SPARK-25615 URL: https://issues.apache.org/jira/browse/SPARK-25615 Project: Spark Issue Ty

[jira] [Created] (SPARK-25614) HiveSparkSubmitSuite: SPARK-18989: DESC TABLE should not fail with format class not found 38 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25614: --- Summary: HiveSparkSubmitSuite: SPARK-18989: DESC TABLE should not fail with format class not found 38 seconds Key: SPARK-25614 URL: https://issues.apache.org/jira/browse/SPARK-25614

[jira] [Created] (SPARK-25613) HiveSparkSubmitSuite: dir 1 min 3 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25613: --- Summary: HiveSparkSubmitSuite: dir 1 min 3 seconds Key: SPARK-25613 URL: https://issues.apache.org/jira/browse/SPARK-25613 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-25612) CompressionCodecSuite: table-level compression is not set but session-level compressions 47 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25612: --- Summary: CompressionCodecSuite: table-level compression is not set but session-level compressions 47 seconds Key: SPARK-25612 URL: https://issues.apache.org/jira/browse/SPARK-25612

[jira] [Created] (SPARK-25611) CompressionCodecSuite: both table-level and session-level compression are set 2 min 20 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25611: --- Summary: CompressionCodecSuite: both table-level and session-level compression are set 2 min 20 sec Key: SPARK-25611 URL: https://issues.apache.org/jira/browse/SPARK-25611 Proj

[jira] [Created] (SPARK-25610) DatasetCacheSuite: cache UDF result correctly 25 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25610: --- Summary: DatasetCacheSuite: cache UDF result correctly 25 seconds Key: SPARK-25610 URL: https://issues.apache.org/jira/browse/SPARK-25610 Project: Spark Issue Type: Su

[jira] [Created] (SPARK-25609) DataFrameSuite: SPARK-22226: splitExpressions should not generate codes beyond 64KB 49 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25609: --- Summary: DataFrameSuite: SPARK-22226: splitExpressions should not generate codes beyond 64KB 49 seconds Key: SPARK-25609 URL: https://issues.apache.org/jira/browse/SPARK-25609

[jira] [Commented] (SPARK-25501) Kafka delegation token support

2018-10-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637248#comment-16637248 ] Thomas Graves commented on SPARK-25501: --- the spip title has "Structured Streaming"

[jira] [Created] (SPARK-25608) HashAggregationQueryWithControlledFallbackSuite: multiple distinct multiple columns sets 38 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25608: --- Summary: HashAggregationQueryWithControlledFallbackSuite: multiple distinct multiple columns sets 38 seconds Key: SPARK-25608 URL: https://issues.apache.org/jira/browse/SPARK-25608

[jira] [Created] (SPARK-25607) HashAggregationQueryWithControlledFallbackSuite: single distinct column set 42 seconds

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25607: --- Summary: HashAggregationQueryWithControlledFallbackSuite: single distinct column set 42 seconds Key: SPARK-25607 URL: https://issues.apache.org/jira/browse/SPARK-25607 Project:

[jira] [Commented] (SPARK-25501) Kafka delegation token support

2018-10-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637240#comment-16637240 ] Thomas Graves commented on SPARK-25501: --- did you post SPIP to the dev list, I didn

[jira] [Created] (SPARK-25606) DateExpressionsSuite: Hour 1 min

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25606: --- Summary: DateExpressionsSuite: Hour 1 min Key: SPARK-25606 URL: https://issues.apache.org/jira/browse/SPARK-25606 Project: Spark Issue Type: Sub-task Compone

[jira] [Created] (SPARK-25605) CastSuite: cast string to timestamp 2 mins 31 sec

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25605: --- Summary: CastSuite: cast string to timestamp 2 mins 31 sec Key: SPARK-25605 URL: https://issues.apache.org/jira/browse/SPARK-25605 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-25604) Reduce the overall time costs in Jenkins tests

2018-10-03 Thread Xiao Li (JIRA)
Xiao Li created SPARK-25604: --- Summary: Reduce the overall time costs in Jenkins tests Key: SPARK-25604 URL: https://issues.apache.org/jira/browse/SPARK-25604 Project: Spark Issue Type: Umbrella

[jira] [Commented] (SPARK-25062) Clean up BlockLocations in FileStatus objects

2018-10-03 Thread Andrei Stankevich (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637089#comment-16637089 ] Andrei Stankevich commented on SPARK-25062: --- Hi [~dongjoon], yes, it an improv

[jira] [Resolved] (SPARK-25538) incorrect row counts after distinct()

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-25538. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 22602 [https://

[jira] [Assigned] (SPARK-25538) incorrect row counts after distinct()

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-25538: - Assignee: Marco Gaido > incorrect row counts after distinct() > ---

[jira] [Created] (SPARK-25603) `Projection` expression pushdown through `coalesce` and `limit`

2018-10-03 Thread DB Tsai (JIRA)
DB Tsai created SPARK-25603: --- Summary: `Projection` expression pushdown through `coalesce` and `limit` Key: SPARK-25603 URL: https://issues.apache.org/jira/browse/SPARK-25603 Project: Spark Issue

[jira] [Commented] (SPARK-25602) range metrics can be wrong if the result rows are not fully consumed

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636995#comment-16636995 ] Apache Spark commented on SPARK-25602: -- User 'cloud-fan' has created a pull request

[jira] [Assigned] (SPARK-25602) range metrics can be wrong if the result rows are not fully consumed

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25602: Assignee: Apache Spark (was: Wenchen Fan) > range metrics can be wrong if the result row

[jira] [Commented] (SPARK-25602) range metrics can be wrong if the result rows are not fully consumed

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636994#comment-16636994 ] Apache Spark commented on SPARK-25602: -- User 'cloud-fan' has created a pull request

[jira] [Assigned] (SPARK-25602) range metrics can be wrong if the result rows are not fully consumed

2018-10-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25602: Assignee: Wenchen Fan (was: Apache Spark) > range metrics can be wrong if the result row

[jira] [Commented] (SPARK-21402) Java encoders - switch fields on collectAsList

2018-10-03 Thread Paul Praet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636957#comment-16636957 ] Paul Praet commented on SPARK-21402: Still there in Spark 2.3.1. > Java encoders -

[jira] [Comment Edited] (SPARK-25062) Clean up BlockLocations in FileStatus objects

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636900#comment-16636900 ] Dongjoon Hyun edited comment on SPARK-25062 at 10/3/18 12:43 PM: -

[jira] [Commented] (SPARK-25062) Clean up BlockLocations in FileStatus objects

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636900#comment-16636900 ] Dongjoon Hyun commented on SPARK-25062: --- Hi, [~petertoth]. According to your descr

[jira] [Updated] (SPARK-25062) Clean up BlockLocations in FileStatus objects

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25062: -- Issue Type: Improvement (was: Bug) > Clean up BlockLocations in FileStatus objects >

[jira] [Created] (SPARK-25602) range metrics can be wrong if the result rows are not fully consumed

2018-10-03 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-25602: --- Summary: range metrics can be wrong if the result rows are not fully consumed Key: SPARK-25602 URL: https://issues.apache.org/jira/browse/SPARK-25602 Project: Spark

[jira] [Commented] (SPARK-25436) Bump master branch version to 2.5.0-SNAPSHOT

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636833#comment-16636833 ] Dongjoon Hyun commented on SPARK-25436: --- I updated the versions to 3.0.0 since we

[jira] [Updated] (SPARK-25436) Bump master branch version to 2.5.0-SNAPSHOT

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25436: -- Affects Version/s: (was: 2.5.0) 3.0.0 > Bump master branch version

[jira] [Updated] (SPARK-16323) Avoid unnecessary cast when doing integral divide

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-16323: -- Affects Version/s: (was: 2.5.0) 3.0.0 > Avoid unnecessary cast when

[jira] [Updated] (SPARK-25423) Output "dataFilters" in DataSourceScanExec.metadata

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25423: -- Affects Version/s: (was: 2.5.0) 3.0.0 > Output "dataFilters" in Dat

[jira] [Updated] (SPARK-25390) data source V2 API refactoring

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25390: -- Target Version/s: 3.0.0 (was: 2.5.0) > data source V2 API refactoring > -

[jira] [Updated] (SPARK-25390) data source V2 API refactoring

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25390: -- Affects Version/s: (was: 2.5.0) 3.0.0 > data source V2 API refactor

[jira] [Updated] (SPARK-25444) Refactor GenArrayData.genCodeToCreateArrayData() method

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25444: -- Affects Version/s: (was: 2.5.0) 3.0.0 > Refactor GenArrayData.genCo

[jira] [Updated] (SPARK-25442) Support STS to run in K8S deployment with spark deployment mode as cluster

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25442: -- Affects Version/s: (was: 2.5.0) 3.0.0 > Support STS to run in K8S d

[jira] [Updated] (SPARK-25457) IntegralDivide (div) should not always return long

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25457: -- Affects Version/s: (was: 2.5.0) 3.0.0 > IntegralDivide (div) should

[jira] [Updated] (SPARK-25475) Refactor all benchmark to save the result as a separate file

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25475: -- Affects Version/s: (was: 2.5.0) 3.0.0 > Refactor all benchmark to s

[jira] [Updated] (SPARK-25458) Support FOR ALL COLUMNS in ANALYZE TABLE

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25458: -- Affects Version/s: (was: 2.5.0) 3.0.0 > Support FOR ALL COLUMNS in

[jira] [Updated] (SPARK-25476) Refactor AggregateBenchmark to use main method

2018-10-03 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25476: -- Affects Version/s: (was: 2.5.0) 3.0.0 > Refactor AggregateBenchmark
