[jira] [Commented] (SPARK-29166) Add parameters to limit the number of dynamic partitions for data source table

2019-09-23 Thread Lantao Jin (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936401#comment-16936401 ] Lantao Jin commented on SPARK-29166: Thank you [~idomi], I have already worked on this.

[jira] [Comment Edited] (SPARK-29217) How to read streaming output path by ignoring metadata log files

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936386#comment-16936386 ] Jungtaek Lim edited comment on SPARK-29217 at 9/24/19 4:40 AM: --- The

[jira] [Comment Edited] (SPARK-29217) How to read streaming output path by ignoring metadata log files

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936386#comment-16936386 ] Jungtaek Lim edited comment on SPARK-29217 at 9/24/19 4:39 AM: --- The

[jira] [Commented] (SPARK-29217) How to read streaming output path by ignoring metadata log files

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936388#comment-16936388 ] Jungtaek Lim commented on SPARK-29217: -- Btw, ideally I encourage asking questions to either users

[jira] [Comment Edited] (SPARK-29217) How to read streaming output path by ignoring metadata log files

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936386#comment-16936386 ] Jungtaek Lim edited comment on SPARK-29217 at 9/24/19 4:36 AM: --- The

[jira] [Commented] (SPARK-29217) How to read streaming output path by ignoring metadata log files

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936386#comment-16936386 ] Jungtaek Lim commented on SPARK-29217: -- The metadata is leveraged to provide end-to-end exactly

[jira] [Commented] (SPARK-28137) Data Type Formatting Functions: `to_number`

2019-09-23 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936375#comment-16936375 ] jiaan.geng commented on SPARK-28137: I'm working on it. > Data Type Formatting Functions: `to_number`

[jira] [Updated] (SPARK-29218) Widths of checkboxes in StagePage are too narrow.

2019-09-23 Thread Kousuke Saruta (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-29218: --- Environment: I've noticed this issue occurs in at least the following environments. Firefox

[jira] [Commented] (SPARK-29221) Flaky test: SQLQueryTestSuite.sql (subquery/scalar-subquery/scalar-subquery-select.sql)

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936354#comment-16936354 ] Jungtaek Lim commented on SPARK-29221: --

[jira] [Updated] (SPARK-29221) Flaky test: SQLQueryTestSuite.sql (subquery/scalar-subquery/scalar-subquery-select.sql)

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29221: - Summary: Flaky test: SQLQueryTestSuite.sql

[jira] [Commented] (SPARK-29211) Second invocation of custom UDF results in exception (when invoked from shell)

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936352#comment-16936352 ] Dongjoon Hyun commented on SPARK-29211: --- [~angerszhuuu]. We need to know the Spark version numbers

[jira] [Commented] (SPARK-28558) DatasetWriter partitionBy is changing the group file permissions in 2.4 for parquets

2019-09-23 Thread Stephen Pearson (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936349#comment-16936349 ] Stephen Pearson commented on SPARK-28558: - [~holden] I am using MapR 5.1.0   [~nladuguie] have

[jira] [Commented] (SPARK-29211) Second invocation of custom UDF results in exception (when invoked from shell)

2019-09-23 Thread angerszhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936332#comment-16936332 ] angerszhu commented on SPARK-29211: --- [~dkbiswal] [~dongjoon] spark-shell has some problem about

[jira] [Updated] (SPARK-28845) Enable spark.sql.execution.sortBeforeRepartition only for retried stages

2019-09-23 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanjian Li updated SPARK-28845: Description: For fixing the correctness bug of SPARK-28699, we disable radix sort for the

[jira] [Commented] (SPARK-28845) Enable spark.sql.execution.sortBeforeRepartition only for retried stages

2019-09-23 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936331#comment-16936331 ] Yuanjian Li commented on SPARK-28845: - Sorry for the typo, I mean we will do the optimization by

[jira] [Updated] (SPARK-29225) Spark SQL 'DESC FORMATTED TABLE' show different format with hive

2019-09-23 Thread angerszhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angerszhu updated SPARK-29225: -- Description: Currently, `DESC FORMATTED TABLE` shows a different table description format; this problem causes HUE

[jira] [Updated] (SPARK-29225) Spark SQL 'DESC FORMATTED TABLE' show different format with hive

2019-09-23 Thread angerszhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angerszhu updated SPARK-29225: -- Description: Currently, `DESC FORMATTED TABLE` shows a different table description format; this problem causes HUE

[jira] [Created] (SPARK-29225) Spark SQL 'DESC FORMATTED TABLE' show different format with hive

2019-09-23 Thread angerszhu (Jira)
angerszhu created SPARK-29225: - Summary: Spark SQL 'DESC FORMATTED TABLE' show different format with hive Key: SPARK-29225 URL: https://issues.apache.org/jira/browse/SPARK-29225 Project: Spark

[jira] [Commented] (SPARK-28653) Create table using DDL statement should not auto create the destination folder

2019-09-23 Thread angerszhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936314#comment-16936314 ] angerszhu commented on SPARK-28653: --- [~thanida.t] Maybe you should put more information. In 3.0 , I

[jira] [Commented] (SPARK-29224) Implement Factorization Machines as a ml-pipeline component

2019-09-23 Thread mob-ai (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936311#comment-16936311 ] mob-ai commented on SPARK-29224: PR: [https://github.com/apache/spark/pull/25909] > Implement

[jira] [Commented] (SPARK-29217) How to read streaming output path by ignoring metadata log files

2019-09-23 Thread Thanida (Jira)
(Trigger.ProcessingTime(3)) .outputMode("append") .format("parquet") .option("path", "path/destination") .partitionBy("dt").start(); {code} I got data in the output as {code:java} - _spark_metadata/.. - dt=20190923/part-0-parquet - dt=201909

[jira] [Commented] (SPARK-29224) Implement Factorization Machines as a ml-pipeline component

2019-09-23 Thread mob-ai (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936308#comment-16936308 ] mob-ai commented on SPARK-29224: This is my implementation of FactorizationMachines:

[jira] [Created] (SPARK-29224) Implement Factorization Machines as a ml-pipeline component

2019-09-23 Thread mob-ai (Jira)
mob-ai created SPARK-29224: -- Summary: Implement Factorization Machines as a ml-pipeline component Key: SPARK-29224 URL: https://issues.apache.org/jira/browse/SPARK-29224 Project: Spark Issue Type:

[jira] [Closed] (SPARK-29216) Modify the run() method case statement structure in Executor without adding a new case

2019-09-23 Thread Jiaqi Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiaqi Li closed SPARK-29216. > Modify the run() method case statement structure in Executor without adding a > new case >

[jira] [Resolved] (SPARK-29216) Modify the run() method case statement structure in Executor without adding a new case

2019-09-23 Thread Jiaqi Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiaqi Li resolved SPARK-29216. -- Resolution: Not A Problem > Modify the run() method case statement structure in Executor without

[jira] [Resolved] (SPARK-29214) Oracle sql's to_number equivalent in spark sql

2019-09-23 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-29214. - Resolution: Duplicate {code:sql} psql (11.3 (Debian 11.3-1.pgdg90+1)) Type "help" for help.

[jira] [Commented] (SPARK-28653) Create table using DDL statement should not auto create the destination folder

2019-09-23 Thread Thanida (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936297#comment-16936297 ] Thanida commented on SPARK-28653: - [~holden] I still have the same issue. > Create table using DDL

[jira] [Commented] (SPARK-29220) Flaky test: org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.handle large number of containers and tasks (SPARK-18750) [hadoop-3.2][java11]

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936294#comment-16936294 ] Jungtaek Lim commented on SPARK-29220: -- [~dongjoon] Ah yes, I guess you have been dealing with

[jira] [Created] (SPARK-29223) Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"

2019-09-23 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29223: Summary: Kafka source: offset by timestamp - allow specifying timestamp for "all partitions" Key: SPARK-29223 URL: https://issues.apache.org/jira/browse/SPARK-29223

[jira] [Commented] (SPARK-29223) Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936287#comment-16936287 ] Jungtaek Lim commented on SPARK-29223: -- Working on this. > Kafka source: offset by timestamp -

[jira] [Commented] (SPARK-29220) Flaky test: org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.handle large number of containers and tasks (SPARK-18750) [hadoop-3.2][java11]

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936286#comment-16936286 ] Dongjoon Hyun commented on SPARK-29220: --- Maybe, [~shaneknapp] knows that~ > Flaky test: >

[jira] [Updated] (SPARK-29218) Widths of checkboxes in StagePage are too narrow.

2019-09-23 Thread Kousuke Saruta (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-29218: --- Environment: Firefox 67.0 and 69.0 on Pop!_OS, an Ubuntu based OS 19.04. > Widths of

[jira] [Updated] (SPARK-29220) Flaky test: org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.handle large number of containers and tasks (SPARK-18750) [hadoop-3.2][java11]

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29220: - Summary: Flaky test: org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.handle large

[jira] [Updated] (SPARK-29221) Flaky test: SQLQueryTestSuite.sql (subquery/scalar-subquery/scalar-subquery-select.sql) [hadoop-3.2][java11]

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29221: - Summary: Flaky test: SQLQueryTestSuite.sql

[jira] [Updated] (SPARK-29221) Flaky test: SQLQueryTestSuite.sql (subquery/scalar-subquery/scalar-subquery-select.sql)

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29221: - Description:

[jira] [Commented] (SPARK-29220) Flaky test: org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.handle large number of containers and tasks (SPARK-18750)

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936285#comment-16936285 ] Jungtaek Lim commented on SPARK-29220: -- Stack trace only shows verification failed, whereas what we

[jira] [Commented] (SPARK-29138) Flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLogisticRegressionWithSGDTests.test_parameter_accuracy

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936275#comment-16936275 ] Jungtaek Lim commented on SPARK-29138: --

[jira] [Created] (SPARK-29222) Flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests.test_parameter_convergence

2019-09-23 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29222: Summary: Flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests.test_parameter_convergence Key: SPARK-29222 URL:

[jira] [Created] (SPARK-29221) Flaky test: SQLQueryTestSuite.sql (subquery/scalar-subquery/scalar-subquery-select.sql)

2019-09-23 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29221: Summary: Flaky test: SQLQueryTestSuite.sql (subquery/scalar-subquery/scalar-subquery-select.sql) Key: SPARK-29221 URL: https://issues.apache.org/jira/browse/SPARK-29221

[jira] [Assigned] (SPARK-26848) Introduce new option to Kafka source - specify timestamp to start and end offset

2019-09-23 Thread Sean Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26848: - Assignee: Jungtaek Lim > Introduce new option to Kafka source - specify timestamp to start and

[jira] [Resolved] (SPARK-26848) Introduce new option to Kafka source - specify timestamp to start and end offset

2019-09-23 Thread Sean Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26848. --- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 23747

[jira] [Updated] (SPARK-29220) Flaky test: org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.handle large number of containers and tasks (SPARK-18750)

2019-09-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29220: - Component/s: Tests > Flaky test: >

[jira] [Created] (SPARK-29220) Flaky test: org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.handle large number of containers and tasks (SPARK-18750)

2019-09-23 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29220: Summary: Flaky test: org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.handle large number of containers and tasks (SPARK-18750) Key: SPARK-29220 URL:

[jira] [Commented] (SPARK-29214) Oracle sql's to_number equivalent in spark sql

2019-09-23 Thread Sakthi Kavin SS (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936234#comment-16936234 ] Sakthi Kavin SS commented on SPARK-29214: - Yes the same issue. Eg:

[jira] [Commented] (SPARK-29214) Oracle sql's to_number equivalent in spark sql

2019-09-23 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936232#comment-16936232 ] Yuming Wang commented on SPARK-29214: - Hi [~Sakthikavin] Is SPARK-28137 what you want? > Oracle

[jira] [Commented] (SPARK-29206) Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads

2019-09-23 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936219#comment-16936219 ] Min Shen commented on SPARK-29206: -- [~tgraves], A PR is put up for this. The actual fix itself is

[jira] [Updated] (SPARK-28599) Fix `Execution Time` and `Duration` column sorting for ThriftServerSessionPage

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28599: -- Component/s: SQL > Fix `Execution Time` and `Duration` column sorting for

[jira] [Updated] (SPARK-28599) Fix `Execution Time` and `Duration` column sorting for ThriftServerSessionPage

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28599: -- Component/s: (was: SQL) Web UI > Fix `Execution Time` and `Duration`

[jira] [Updated] (SPARK-28599) Fix `Execution Time` and `Duration` column sorting for ThriftServerSessionPage

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28599: -- Affects Version/s: 2.4.0 2.4.1 2.4.2

[jira] [Updated] (SPARK-28599) Fix `Execution Time` and `Duration` column sorting for ThriftServerSessionPage

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28599: -- Fix Version/s: 2.4.5 > Fix `Execution Time` and `Duration` column sorting for

[jira] [Commented] (SPARK-28360) The serviceAccountName configuration item does not take effect in client mode.

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936207#comment-16936207 ] holdenk commented on SPARK-28360: - Don't we need a service account name to create the executor pods? >

[jira] [Comment Edited] (SPARK-28362) Error communicating with MapOutputTracker when many tasks are launched concurrently

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936206#comment-16936206 ] holdenk edited comment on SPARK-28362 at 9/23/19 9:34 PM: -- Why is your default

[jira] [Commented] (SPARK-28362) Error communicating with MapOutputTracker when many tasks are launched concurrently

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936206#comment-16936206 ] holdenk commented on SPARK-28362: - Why is your default parallelism configured to `49 * 13 (cores) * 20 =

[jira] [Updated] (SPARK-28403) Executor Allocation Manager can add an extra executor when speculative tasks

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-28403: Shepherd: holdenk > Executor Allocation Manager can add an extra executor when speculative tasks >

[jira] [Commented] (SPARK-28517) pyspark with --conf spark.jars.packages causes duplicate jars to be uploaded

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936201#comment-16936201 ] holdenk commented on SPARK-28517: - cc [~bryanc] / [~ifilonenko] > pyspark with --conf

[jira] [Commented] (SPARK-28558) DatasetWriter partitionBy is changing the group file permissions in 2.4 for parquets

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936200#comment-16936200 ] holdenk commented on SPARK-28558: - What storage system are y'all using [~nladuguie] & [~spearson] ? >

[jira] [Commented] (SPARK-28592) Mark new Shuffle apis as @Experimental (instead of @Private)

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936199#comment-16936199 ] holdenk commented on SPARK-28592: - Should we set this to blocker so we don't forget? > Mark new Shuffle

[jira] [Commented] (SPARK-28653) Create table using DDL statement should not auto create the destination folder

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936198#comment-16936198 ] holdenk commented on SPARK-28653: - [~thanida.t] can you confirm if you're still experiencing this issue

[jira] [Commented] (SPARK-28727) Request for partial least square (PLS) regression model

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936196#comment-16936196 ] holdenk commented on SPARK-28727: - I don't believe we'll be adding new algorithms to Spark ML in the

[jira] [Updated] (SPARK-28781) Unneccesary persist in PeriodicCheckpointer.update()

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-28781: Issue Type: Improvement (was: Bug) > Unneccesary persist in PeriodicCheckpointer.update() >

[jira] [Commented] (SPARK-28845) Enable spark.sql.execution.sortBeforeRepartition only for retried stages

2019-09-23 Thread Yifan Xing (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936194#comment-16936194 ] Yifan Xing commented on SPARK-28845: Can you please clarify what you mean by "before repartition

[jira] [Updated] (SPARK-28978) PySpark: Can't pass more than 256 arguments to a UDF

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-28978: Target Version/s: 3.0.0 > PySpark: Can't pass more than 256 arguments to a UDF >

[jira] [Assigned] (SPARK-29083) Speed up toLocalIterator with prefetching when enabled

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-29083: --- Assignee: holdenk > Speed up toLocalIterator with prefetching when enabled >

[jira] [Commented] (SPARK-29217) How to read streaming output path by ignoring metadata log files

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936186#comment-16936186 ] holdenk commented on SPARK-29217: - Can you clarify what you mean by "Moving some files in the output

[jira] [Commented] (SPARK-25411) Implement range partition in Spark

2019-09-23 Thread Christopher Hoshino-Fish (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936176#comment-16936176 ] Christopher Hoshino-Fish commented on SPARK-25411: -- I've also done this in the past to

[jira] [Commented] (SPARK-29204) Remove `Spark Release` Jenkins tab and its four jobs

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936150#comment-16936150 ] Dongjoon Hyun commented on SPARK-29204: --- Thank you, [~shaneknapp]! > Remove `Spark Release`

[jira] [Commented] (SPARK-29211) Second invocation of custom UDF results in exception (when invoked from shell)

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936148#comment-16936148 ] Dongjoon Hyun commented on SPARK-29211: --- Could you try this with older versions too in order to

[jira] [Updated] (SPARK-29211) Second invocation of custom UDF results in exception (when invoked from shell)

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-29211: -- Affects Version/s: 3.0.0 > Second invocation of custom UDF results in exception (when invoked

[jira] [Commented] (SPARK-29211) Second invocation of custom UDF results in exception (when invoked from shell)

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936145#comment-16936145 ] Dongjoon Hyun commented on SPARK-29211: --- Thank you for pinging me, [~dkbiswal]. > Second

[jira] [Commented] (SPARK-29102) Read gzipped file into multiple partitions without full gzip expansion on a single-node

2019-09-23 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936077#comment-16936077 ] Nicholas Chammas commented on SPARK-29102: -- I wonder if

[jira] [Updated] (SPARK-29219) DataSourceV2: Support all SaveModes in DataFrameWriter.save

2019-09-23 Thread Burak Yavuz (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-29219: Description: We currently don't support all save modes in DataFrameWriter.save as the

[jira] [Created] (SPARK-29219) DataSourceV2: Support all SaveModes in DataFrameWriter.save

2019-09-23 Thread Burak Yavuz (Jira)
Burak Yavuz created SPARK-29219: --- Summary: DataSourceV2: Support all SaveModes in DataFrameWriter.save Key: SPARK-29219 URL: https://issues.apache.org/jira/browse/SPARK-29219 Project: Spark

[jira] [Updated] (SPARK-29218) Widths of checkboxes in StagePage are too narrow.

2019-09-23 Thread Kousuke Saruta (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-29218: --- Summary: Widths of checkboxes in StagePage are too narrow. (was: Widths of checkboxes in

[jira] [Updated] (SPARK-29218) Widths of checkboxes in StagePage are not proper.

2019-09-23 Thread Kousuke Saruta (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-29218: --- Attachment: before-modified2.png > Widths of checkboxes in StagePage are not proper. >

[jira] [Updated] (SPARK-29218) Widths of checkboxes in StagePage are not proper.

2019-09-23 Thread Kousuke Saruta (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-29218: --- Attachment: before-modified1.png > Widths of checkboxes in StagePage are not proper. >

[jira] [Created] (SPARK-29218) Widths of checkboxes in StagePage are not proper.

2019-09-23 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-29218: -- Summary: Widths of checkboxes in StagePage are not proper. Key: SPARK-29218 URL: https://issues.apache.org/jira/browse/SPARK-29218 Project: Spark Issue

[jira] [Resolved] (SPARK-29204) Remove `Spark Release` Jenkins tab and its four jobs

2019-09-23 Thread shane knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp resolved SPARK-29204. - Assignee: shane knapp Resolution: Fixed > Remove `Spark Release` Jenkins tab and its four

[jira] [Commented] (SPARK-29204) Remove `Spark Release` Jenkins tab and its four jobs

2019-09-23 Thread shane knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936066#comment-16936066 ] shane knapp commented on SPARK-29204: - done, done and done. > Remove `Spark Release` Jenkins tab

[jira] [Commented] (SPARK-29204) Remove `Spark Release` Jenkins tab and its four jobs

2019-09-23 Thread shane knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936060#comment-16936060 ] shane knapp commented on SPARK-29204: - PR merged. deleting jobs + views now. thanks [~yhuai]! >

[jira] [Commented] (SPARK-29207) Document LIST JAR in SQL Reference

2019-09-23 Thread Huaxin Gao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936052#comment-16936052 ] Huaxin Gao commented on SPARK-29207: I am OK with either one or two jiras. The only reason I opened

[jira] [Commented] (SPARK-29163) Provide a mixin to simplify HadoopConf access patterns in DataSource V2

2019-09-23 Thread holdenk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936053#comment-16936053 ] holdenk commented on SPARK-29163: - I'm going to try and do some work on this before the end of the

[jira] [Commented] (SPARK-29204) Remove `Spark Release` Jenkins tab and its four jobs

2019-09-23 Thread shane knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936051#comment-16936051 ] shane knapp commented on SPARK-29204: - for those w/perms to see the JJB databricks repo:

[jira] [Commented] (SPARK-29204) Remove `Spark Release` Jenkins tab and its four jobs

2019-09-23 Thread shane knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936034#comment-16936034 ] shane knapp commented on SPARK-29204: - today, in order, i will: 1) delete the configs from the

[jira] [Resolved] (SPARK-29016) Update LICENSE and NOTICE for Hive 2.3

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-29016. --- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25896

[jira] [Assigned] (SPARK-29016) Update LICENSE and NOTICE for Hive 2.3

2019-09-23 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-29016: - Assignee: Yuming Wang > Update LICENSE and NOTICE for Hive 2.3 >

[jira] [Assigned] (SPARK-25903) Flaky test: BarrierTaskContextSuite.throw exception on barrier() call timeout

2019-09-23 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25903: --- Assignee: Liang-Chi Hsieh > Flaky test: BarrierTaskContextSuite.throw exception on

[jira] [Resolved] (SPARK-25903) Flaky test: BarrierTaskContextSuite.throw exception on barrier() call timeout

2019-09-23 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25903. - Fix Version/s: 3.0.0 2.4.5 Resolution: Fixed Issue resolved by pull

[jira] [Assigned] (SPARK-29203) Reduce shuffle partitions in SQLQueryTestSuite

2019-09-23 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang reassigned SPARK-29203: --- Assignee: Yuming Wang > Reduce shuffle partitions in SQLQueryTestSuite >

[jira] [Resolved] (SPARK-29203) Reduce shuffle partitions in SQLQueryTestSuite

2019-09-23 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-29203. - Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25891

[jira] [Commented] (SPARK-29206) Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads

2019-09-23 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935968#comment-16935968 ] Thomas Graves commented on SPARK-29206: --- that is definitely not the behavior I was expecting from

[jira] [Created] (SPARK-29217) How to read streaming output path by ignoring metadata log files

2019-09-23 Thread Thanida (Jira)
Thanida created SPARK-29217: --- Summary: How to read streaming output path by ignoring metadata log files Key: SPARK-29217 URL: https://issues.apache.org/jira/browse/SPARK-29217 Project: Spark

[jira] [Commented] (SPARK-29204) Remove `Spark Release` Jenkins tab and its four jobs

2019-09-23 Thread shane knapp (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935946#comment-16935946 ] shane knapp commented on SPARK-29204: - yeah, i'll get around to this later today (as well as the JJB

[jira] [Updated] (SPARK-29216) Modify the run() method case statement structure in Executor without adding a new case

2019-09-23 Thread Jiaqi Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiaqi Li updated SPARK-29216: - Affects Version/s: (was: 2.3.5) 3.0.0 > Modify the run() method case

[jira] [Created] (SPARK-29216) Modify the run() method case statement structure in Executor without adding a new case

2019-09-23 Thread Jiaqi Li (Jira)
Jiaqi Li created SPARK-29216: Summary: Modify the run() method case statement structure in Executor without adding a new case Key: SPARK-29216 URL: https://issues.apache.org/jira/browse/SPARK-29216

[jira] [Created] (SPARK-29215) current namespace should be tracked in SessionCatalog if the current catalog is session catalog

2019-09-23 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-29215: --- Summary: current namespace should be tracked in SessionCatalog if the current catalog is session catalog Key: SPARK-29215 URL: https://issues.apache.org/jira/browse/SPARK-29215

[jira] [Updated] (SPARK-29053) Sort does not work on some columns

2019-09-23 Thread Sean Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-29053: -- Fix Version/s: 2.4.5 > Sort does not work on some columns > -- > >

[jira] [Commented] (SPARK-28558) DatasetWriter partitionBy is changing the group file permissions in 2.4 for parquets

2019-09-23 Thread Nicolas Laduguie (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935848#comment-16935848 ] Nicolas Laduguie commented on SPARK-28558: -- Hi, I also face this issue. From my point of

[jira] [Created] (SPARK-29214) Oracle sql's to_number equivalent in spark sql

2019-09-23 Thread Sakthi Kavin SS (Jira)
Sakthi Kavin SS created SPARK-29214: --- Summary: Oracle sql's to_number equivalent in spark sql Key: SPARK-29214 URL: https://issues.apache.org/jira/browse/SPARK-29214 Project: Spark Issue

[jira] [Resolved] (SPARK-29036) SparkThriftServer may can't cancel job after call a cancel before start.

2019-09-23 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-29036. - Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25743

[jira] [Assigned] (SPARK-29036) SparkThriftServer may can't cancel job after call a cancel before start.

2019-09-23 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang reassigned SPARK-29036: --- Assignee: angerszhu > SparkThriftServer may can't cancel job after call a cancel before
