[jira] [Created] (SPARK-30421) Dropped columns still available for filtering

2020-01-05 Thread Tobias Hermann (Jira)
Tobias Hermann created SPARK-30421: -- Summary: Dropped columns still available for filtering Key: SPARK-30421 URL: https://issues.apache.org/jira/browse/SPARK-30421 Project: Spark Issue Type:

[jira] [Commented] (SPARK-30421) Dropped columns still available for filtering

2020-01-05 Thread Tobias Hermann (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008301#comment-17008301 ] Tobias Hermann commented on SPARK-30421: see: [https://stackoverflow.com/questi

[jira] [Created] (SPARK-30422) deprecate UserDefinedAggregateFunction in favor of SPARK-27296

2020-01-05 Thread Erik Erlandson (Jira)
Erik Erlandson created SPARK-30422: -- Summary: deprecate UserDefinedAggregateFunction in favor of SPARK-27296 Key: SPARK-30422 URL: https://issues.apache.org/jira/browse/SPARK-30422 Project: Spark

[jira] [Created] (SPARK-30423) Deprecate UserDefinedAggregateFunction

2020-01-05 Thread Erik Erlandson (Jira)
Erik Erlandson created SPARK-30423: -- Summary: Deprecate UserDefinedAggregateFunction Key: SPARK-30423 URL: https://issues.apache.org/jira/browse/SPARK-30423 Project: Spark Issue Type: Task

[jira] [Created] (SPARK-30424) Change ExpressionEncoder toRow method to return UnsafeRow

2020-01-05 Thread Erik Erlandson (Jira)
Erik Erlandson created SPARK-30424: -- Summary: Change ExpressionEncoder toRow method to return UnsafeRow Key: SPARK-30424 URL: https://issues.apache.org/jira/browse/SPARK-30424 Project: Spark

[jira] [Commented] (SPARK-25603) Generalize Nested Column Pruning

2020-01-05 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008438#comment-17008438 ] Dongjoon Hyun commented on SPARK-25603: --- [~maropu]. Thank you for pinging me. Let'

[jira] [Commented] (SPARK-25603) Generalize Nested Column Pruning

2020-01-05 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008447#comment-17008447 ] Takeshi Yamamuro commented on SPARK-25603: -- Looks nice, thanks, [~dongjoon] >

[jira] [Resolved] (SPARK-25464) Dropping database can remove the hive warehouse directory contents

2020-01-05 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-25464. -- Resolution: Not A Problem > Dropping database can remove the hive warehouse directory contents

[jira] [Resolved] (SPARK-27258) The value of "spark.app.name" or "--name" starts with number , which causes resourceName does not match regular expression

2020-01-05 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-27258. -- Resolution: Won't Fix > The value of "spark.app.name" or "--name" starts with number , which c

[jira] [Assigned] (SPARK-30418) make FM call super class method extractLabeledPoints

2020-01-05 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-30418: Assignee: Huaxin Gao > make FM call super class method extractLabeledPoints > ---

[jira] [Resolved] (SPARK-30418) make FM call super class method extractLabeledPoints

2020-01-05 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-30418. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 27093 [https://gi

[jira] [Commented] (SPARK-23432) Expose executor memory metrics in the web UI for executors

2020-01-05 Thread Zhongwei Zhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008493#comment-17008493 ] Zhongwei Zhu commented on SPARK-23432: -- I'll work on this. > Expose executor memo

[jira] [Created] (SPARK-30425) FileScan of Data Source V2 doesn't implement Partition Pruning

2020-01-05 Thread Haifeng Chen (Jira)
Haifeng Chen created SPARK-30425: Summary: FileScan of Data Source V2 doesn't implement Partition Pruning Key: SPARK-30425 URL: https://issues.apache.org/jira/browse/SPARK-30425 Project: Spark

[jira] [Created] (SPARK-30426) Fix disorder the structured-streaming-kafka-integration page

2020-01-05 Thread Yuanjian Li (Jira)
Yuanjian Li created SPARK-30426: --- Summary: Fix disorder the structured-streaming-kafka-integration page Key: SPARK-30426 URL: https://issues.apache.org/jira/browse/SPARK-30426 Project: Spark I

[jira] [Updated] (SPARK-30426) Fix the disorder of structured-streaming-kafka-integration page

2020-01-05 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanjian Li updated SPARK-30426: Summary: Fix the disorder of structured-streaming-kafka-integration page (was: Fix disorder the s

[jira] [Commented] (SPARK-29596) Task duration not updating for running tasks

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008536#comment-17008536 ] Hyukjin Kwon commented on SPARK-29596: -- [~726575...@qq.com] have you made some prog

[jira] [Commented] (SPARK-30196) Bump lz4-java version to 1.7.0

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008537#comment-17008537 ] Hyukjin Kwon commented on SPARK-30196: -- [~larsfrancke] does that happen after this

[jira] [Resolved] (SPARK-30426) Fix the disorder of structured-streaming-kafka-integration page

2020-01-05 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-30426. - Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 27098 [https://gith

[jira] [Assigned] (SPARK-30426) Fix the disorder of structured-streaming-kafka-integration page

2020-01-05 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-30426: --- Assignee: Yuanjian Li > Fix the disorder of structured-streaming-kafka-integration page > -

[jira] [Resolved] (SPARK-30422) deprecate UserDefinedAggregateFunction in favor of SPARK-27296

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-30422. -- Resolution: Duplicate > deprecate UserDefinedAggregateFunction in favor of SPARK-27296 > -

[jira] [Updated] (SPARK-30364) The spark-streaming-kafka-0-10_2.11 test cases are failing on ppc64le

2020-01-05 Thread AK97 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AK97 updated SPARK-30364: - Description: I have been trying to build the Apache Spark on rhel_7.6/ppc64le; however, the spark-streaming-kaf

[jira] [Created] (SPARK-30427) Add config item for limiting partition number when calculating statistics through HDFS

2020-01-05 Thread Hu Fuwang (Jira)
Hu Fuwang created SPARK-30427: - Summary: Add config item for limiting partition number when calculating statistics through HDFS Key: SPARK-30427 URL: https://issues.apache.org/jira/browse/SPARK-30427 Proj

[jira] [Commented] (SPARK-29596) Task duration not updating for running tasks

2020-01-05 Thread daile (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008552#comment-17008552 ] daile commented on SPARK-29596: --- [~hyukjin.kwon] task detail list use task.taskMetrics in

[jira] [Updated] (SPARK-30411) saveAsTable does not honor spark.hadoop.hive.warehouse.subdir.inherit.perms

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30411: - Description: {code} -bash-4.2$ hdfs dfs -ls /tmp | grep my_databases drwxr-x--T - redsanket use

[jira] [Updated] (SPARK-30411) saveAsTable does not honor spark.hadoop.hive.warehouse.subdir.inherit.perms

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30411: - Description: {code} -bash-4.2$ hdfs dfs -ls /tmp | grep my_databases drwxr-x--T - redsanket use

[jira] [Updated] (SPARK-30411) saveAsTable does not honor spark.hadoop.hive.warehouse.subdir.inherit.perms

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30411: - Description: {code} -bash-4.2$ hdfs dfs -ls /tmp | grep my_databases drwxr-x--T - redsanket use

[jira] [Commented] (SPARK-30411) saveAsTable does not honor spark.hadoop.hive.warehouse.subdir.inherit.perms

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008556#comment-17008556 ] Hyukjin Kwon commented on SPARK-30411: -- I think it's because it uses Spark's native

[jira] [Updated] (SPARK-30400) Test failure in SQL module on ppc64le

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30400: - Description: I have been trying to build the Apache Spark on rhel_7.6/ppc64le; however, the tes

[jira] [Resolved] (SPARK-30399) Bucketing does not compatible with partitioning in practice

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-30399. -- Resolution: Invalid [~shay_elbaz], please ask questions into mailing lists (see https://spark

[jira] [Updated] (SPARK-30397) [pyspark] Writer applied to custom model changes type of keys' dict from int to str

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30397: - Component/s: ML > [pyspark] Writer applied to custom model changes type of keys' dict from int

[jira] [Resolved] (SPARK-30397) [pyspark] Writer applied to custom model changes type of keys' dict from int to str

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-30397. -- Resolution: Not A Problem > [pyspark] Writer applied to custom model changes type of keys' dic

[jira] [Resolved] (SPARK-30393) Too much ProvisionedThroughputExceededException while recover from checkpoint

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-30393. -- Resolution: Invalid Please ask questions into mailing list or stackoverflow (see https://spar

[jira] [Resolved] (SPARK-30372) Modify spark-redshift to work with iam on aws

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-30372. -- Resolution: Invalid It's not a Spark issue. please report at https://github.com/databricks/sp

[jira] [Resolved] (SPARK-30365) When deploy mode is a client, why doesn't it support remote "spark.files" download?

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-30365. -- Resolution: Invalid Please ask questions into mailing list or stackoverflow (see https://spar

[jira] [Updated] (SPARK-30364) The spark-streaming-kafka-0-10_2.11 test cases are failing on ppc64le

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30364: - Description: I have been trying to build the Apache Spark on rhel_7.6/ppc64le; however, the spa

[jira] [Updated] (SPARK-30364) The spark-streaming-kafka-0-10_2.11 test cases are failing on ppc64le

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30364: - Component/s: DStreams > The spark-streaming-kafka-0-10_2.11 test cases are failing on ppc64le >

[jira] [Updated] (SPARK-30357) SparkContext: Invoking stop() from shutdown hook

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30357: - Description: I'm getting below error while running spark-submit job in kubernetes , i didn't ge

[jira] [Updated] (SPARK-30357) SparkContext: Invoking stop() from shutdown hook

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30357: - Target Version/s: (was: 2.4.4) > SparkContext: Invoking stop() from shutdown hook > --

[jira] [Updated] (SPARK-30357) SparkContext: Invoking stop() from shutdown hook

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30357: - Labels: (was: newbie) > SparkContext: Invoking stop() from shutdown hook > ---

[jira] [Updated] (SPARK-30340) Python tests failed on arm64/x86

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-30340: - Description: Jenkins job spark-master-test-python-arm failed after the commit c6ab7165dd11a0a7b

[jira] [Commented] (SPARK-30357) SparkContext: Invoking stop() from shutdown hook

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008564#comment-17008564 ] Hyukjin Kwon commented on SPARK-30357: -- Please just don't copy and paste the logs,

[jira] [Resolved] (SPARK-30357) SparkContext: Invoking stop() from shutdown hook

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-30357. -- Resolution: Incomplete > SparkContext: Invoking stop() from shutdown hook > --

[jira] [Commented] (SPARK-30335) Clarify behavior of FIRST and LAST without OVER clause.

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008566#comment-17008566 ] Hyukjin Kwon commented on SPARK-30335: -- They are not deterministic. > Clarify beha

[jira] [Commented] (SPARK-30332) When running sql query with limit catalyst throw StackOverFlow exception

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008571#comment-17008571 ] Hyukjin Kwon commented on SPARK-30332: -- Can you narrow down the problem? Looks impo

[jira] [Commented] (SPARK-30328) Fail to write local files with RDD.saveTextFile when setting the incorrect Hadoop configuration files

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008572#comment-17008572 ] Hyukjin Kwon commented on SPARK-30328: -- Why don't you set the Hadoop configuration

[jira] [Resolved] (SPARK-30316) data size boom after shuffle writing dataframe save as parquet

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-30316. -- Resolution: Invalid Please show reproducible steps and output files (at least {{ls -al}}). Ot

[jira] [Commented] (SPARK-30275) Add gitlab-ci.yml file for reproducible builds

2020-01-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008576#comment-17008576 ] Hyukjin Kwon commented on SPARK-30275: -- What's the benefit of adding it? > Add git

[jira] [Created] (SPARK-30428) File source V2: support partition pruning

2020-01-05 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-30428: -- Summary: File source V2: support partition pruning Key: SPARK-30428 URL: https://issues.apache.org/jira/browse/SPARK-30428 Project: Spark Issue Type: Sub

[jira] [Updated] (SPARK-29800) Rewrite non-correlated EXISTS subquery use ScalaSubquery to optimize perf

2020-01-05 Thread angerszhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angerszhu updated SPARK-29800: -- Summary: Rewrite non-correlated EXISTS subquery use ScalaSubquery to optimize perf (was: Rewrite non-

[jira] [Created] (SPARK-30429) WideSchemaBenchmark fails with OOM

2020-01-05 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30429: -- Summary: WideSchemaBenchmark fails with OOM Key: SPARK-30429 URL: https://issues.apache.org/jira/browse/SPARK-30429 Project: Spark Issue Type: Bug Comp

[jira] [Updated] (SPARK-30429) WideSchemaBenchmark fails with OOM

2020-01-05 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-30429: --- Attachment: WideSchemaBenchmark_console.txt > WideSchemaBenchmark fails with OOM > -