[jira] [Resolved] (SPARK-30730) Wrong results of `converTz` for different session and system time zones

2020-02-06 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-30730. Resolution: Won't Fix > Wrong results of `converTz` for different session and system time zones >

[jira] [Resolved] (SPARK-30708) first_value/last_value window function throws ParseException

2020-02-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng resolved SPARK-30708. Resolution: Won't Fix > first_value/last_value window function throws ParseException > ---

[jira] [Commented] (SPARK-30619) org.slf4j.Logger and org.apache.commons.collections classes not built as part of hadoop-provided profile

2020-02-06 Thread Abhishek Rao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032106#comment-17032106 ] Abhishek Rao commented on SPARK-30619: -- Hi, Any updates on this? > org.slf4j.Logg

[jira] [Commented] (SPARK-30712) Estimate sizeInBytes from file metadata for parquet files

2020-02-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032102#comment-17032102 ] Hyukjin Kwon commented on SPARK-30712: -- SPARK-24914 JIRA and PR are still open. >

[jira] [Commented] (SPARK-30712) Estimate sizeInBytes from file metadata for parquet files

2020-02-06 Thread liupengcheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032076#comment-17032076 ] liupengcheng commented on SPARK-30712: -- [~hyukjin.kwon] SPARK-24914 seems already c

[jira] [Updated] (SPARK-30394) Skip collecting stats in DetermineTableStats rule when hive table is convertible to datasource tables

2020-02-06 Thread liupengcheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liupengcheng updated SPARK-30394: - Description: Currently, if `spark.sql.statistics.fallBackToHdfs` is enabled, then spark will sc

[jira] [Commented] (SPARK-30712) Estimate sizeInBytes from file metadata for parquet files

2020-02-06 Thread liupengcheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032066#comment-17032066 ] liupengcheng commented on SPARK-30712: -- OK, thanks! [~hyukjin.kwon]. > Estimate si

[jira] [Resolved] (SPARK-30735) Improving writing performance by adding repartition based on columns to partitionBy for DataFrameWriter

2020-02-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-30735. -- Resolution: Won't Fix > Improving writing performance by adding repartition based on columns t

[jira] [Commented] (SPARK-30712) Estimate sizeInBytes from file metadata for parquet files

2020-02-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032036#comment-17032036 ] Hyukjin Kwon commented on SPARK-30712: -- SPARK-24914 is trying to add the base for t

[jira] [Commented] (SPARK-30732) BroadcastExchangeExec does not fully honor "spark.broadcast.compress"

2020-02-06 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031996#comment-17031996 ] L. C. Hsieh commented on SPARK-30732: - `getByteArrayRdd` is not used only there. And

[jira] [Commented] (SPARK-27298) Dataset except operation gives different results(dataset count) on Spark 2.3.0 Windows and Spark 2.3.0 Linux environment

2020-02-06 Thread Sunitha Kambhampati (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031938#comment-17031938 ] Sunitha Kambhampati commented on SPARK-27298: - Thanks for trying it out in y

[jira] [Commented] (SPARK-24655) [K8S] Custom Docker Image Expectations and Documentation

2020-02-06 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031870#comment-17031870 ] Thomas Graves commented on SPARK-24655: --- some other discussions on this from https

[jira] [Commented] (SPARK-30668) to_timestamp failed to parse 2020-01-27T20:06:11.847-0800 using pattern "yyyy-MM-dd'T'HH:mm:ss.SSSz"

2020-02-06 Thread Xiao Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031860#comment-17031860 ] Xiao Li commented on SPARK-30668: - I think this is still not resolved. Spark 3.0 should

[jira] [Reopened] (SPARK-30668) to_timestamp failed to parse 2020-01-27T20:06:11.847-0800 using pattern "yyyy-MM-dd'T'HH:mm:ss.SSSz"

2020-02-06 Thread Xiao Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reopened SPARK-30668: - > to_timestamp failed to parse 2020-01-27T20:06:11.847-0800 using pattern > "yyyy-MM-dd'T'HH:mm:ss.SSSz" >

[jira] [Commented] (SPARK-30752) Wrong result of to_utc_timestamp() on daylight saving day

2020-02-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031834#comment-17031834 ] Dongjoon Hyun commented on SPARK-30752: --- The next release 2.4.6 is scheduled but t

[jira] [Commented] (SPARK-30752) Wrong result of to_utc_timestamp() on daylight saving day

2020-02-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031833#comment-17031833 ] Dongjoon Hyun commented on SPARK-30752: --- Thanks, but you don't need to ping me fro

[jira] [Resolved] (SPARK-30719) AQE should not issue a "not supported" warning for queries being by-passed

2020-02-06 Thread Xiao Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-30719. - Fix Version/s: 3.0.0 Assignee: Wenchen Fan Resolution: Fixed > AQE should not issue a "n

[jira] [Commented] (SPARK-30730) Wrong results of `converTz` for different session and system time zones

2020-02-06 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031776#comment-17031776 ] Maxim Gekk commented on SPARK-30730: I am going to close this ticket w.r.t  https://

[jira] [Commented] (SPARK-30752) Wrong result of to_utc_timestamp() on daylight saving day

2020-02-06 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031774#comment-17031774 ] Maxim Gekk commented on SPARK-30752: [~dongjoon] FYI, 2.4 has the bug. > Wrong resu

[jira] [Created] (SPARK-30752) Wrong result of to_utc_timestamp() on daylight saving day

2020-02-06 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30752: -- Summary: Wrong result of to_utc_timestamp() on daylight saving day Key: SPARK-30752 URL: https://issues.apache.org/jira/browse/SPARK-30752 Project: Spark Issue T

[jira] [Updated] (SPARK-30751) Combine the skewed readers into one in AQE skew join optimizations

2020-02-06 Thread Wei Xue (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Xue updated SPARK-30751: Description: Assume we have N partitions based on the original join keys, and for a specific partition id

[jira] [Created] (SPARK-30751) Combine the skewed readers into one in AQE skew join optimizations

2020-02-06 Thread Wei Xue (Jira)
Wei Xue created SPARK-30751: --- Summary: Combine the skewed readers into one in AQE skew join optimizations Key: SPARK-30751 URL: https://issues.apache.org/jira/browse/SPARK-30751 Project: Spark Iss

[jira] [Created] (SPARK-30750) stage level scheduling: Add ability to set dynamic allocation configs

2020-02-06 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30750: - Summary: stage level scheduling: Add ability to set dynamic allocation configs Key: SPARK-30750 URL: https://issues.apache.org/jira/browse/SPARK-30750 Project: Spar

[jira] [Updated] (SPARK-30749) stage level scheduling: Better cleanup of Resource profiles

2020-02-06 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-30749: -- Summary: stage level scheduling: Better cleanup of Resource profiles (was: Better cleanup of

[jira] [Created] (SPARK-30749) Better cleanup of Resource profiles

2020-02-06 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30749: - Summary: Better cleanup of Resource profiles Key: SPARK-30749 URL: https://issues.apache.org/jira/browse/SPARK-30749 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-27570) java.io.EOFException Reached the end of stream - Reading Parquet from Swift

2020-02-06 Thread Hadrien Negros (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031683#comment-17031683 ] Hadrien Negros commented on SPARK-27570: I have the same problem with reading pr

[jira] [Commented] (SPARK-24615) SPIP: Accelerator-aware task scheduling for Spark

2020-02-06 Thread Sujith Chacko (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031647#comment-17031647 ] Sujith Chacko commented on SPARK-24615: --- Great!! Thanks for the update. > SPIP: Ac

[jira] [Commented] (SPARK-30712) Estimate sizeInBytes from file metadata for parquet files

2020-02-06 Thread liupengcheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031644#comment-17031644 ] liupengcheng commented on SPARK-30712: -- [~hyukjin.kwon] We use the rowCount info in

[jira] [Commented] (SPARK-24615) SPIP: Accelerator-aware task scheduling for Spark

2020-02-06 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031637#comment-17031637 ] Thomas Graves commented on SPARK-24615: --- yes it will be in 3.0, the feature is com

[jira] [Commented] (SPARK-30712) Estimate sizeInBytes from file metadata for parquet files

2020-02-06 Thread liupengcheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031636#comment-17031636 ] liupengcheng commented on SPARK-30712: -- [~hyukjin.kwon] Yes, in our customized spark

[jira] [Comment Edited] (SPARK-30688) Spark SQL Unix Timestamp produces incorrect result with unix_timestamp UDF

2020-02-06 Thread Attila Zsolt Piros (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031614#comment-17031614 ] Attila Zsolt Piros edited comment on SPARK-30688 at 2/6/20 2:22 PM: --

[jira] [Commented] (SPARK-30688) Spark SQL Unix Timestamp produces incorrect result with unix_timestamp UDF

2020-02-06 Thread Attila Zsolt Piros (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031614#comment-17031614 ] Attila Zsolt Piros commented on SPARK-30688: I have checked on Spark 3.0.0-p

[jira] [Created] (SPARK-30748) Storage Memory in Spark Web UI means

2020-02-06 Thread islandshinji (Jira)
islandshinji created SPARK-30748: Summary: Storage Memory in Spark Web UI means Key: SPARK-30748 URL: https://issues.apache.org/jira/browse/SPARK-30748 Project: Spark Issue Type: Question

[jira] [Resolved] (SPARK-30744) Optimize AnalyzePartitionCommand by calculating location sizes in parallel

2020-02-06 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-30744. - Fix Version/s: 3.1.0 Assignee: wuyi Resolution: Fixed > Optimize AnalyzePartitio

[jira] [Commented] (SPARK-30747) Update roxygen2 to 7.0.1

2020-02-06 Thread Maciej Szymkiewicz (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031471#comment-17031471 ] Maciej Szymkiewicz commented on SPARK-30747: CC [~felixcheung] [~hyukjin.kwo

[jira] [Updated] (SPARK-30747) Update roxygen2 to 7.0.1

2020-02-06 Thread Maciej Szymkiewicz (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz updated SPARK-30747: --- Description: Currently Spark uses {{roxygen2}} 5.0.1. It is already pretty old (2015

[jira] [Created] (SPARK-30747) Update roxygen2 to 7.0.1

2020-02-06 Thread Maciej Szymkiewicz (Jira)
Maciej Szymkiewicz created SPARK-30747: -- Summary: Update roxygen2 to 7.0.1 Key: SPARK-30747 URL: https://issues.apache.org/jira/browse/SPARK-30747 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-30739) unable to turn off Hadoop's trash feature

2020-02-06 Thread Ohad Raviv (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031437#comment-17031437 ] Ohad Raviv commented on SPARK-30739: Closing as I realized this is actually the docu

[jira] [Resolved] (SPARK-30739) unable to turn off Hadoop's trash feature

2020-02-06 Thread Ohad Raviv (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ohad Raviv resolved SPARK-30739. Resolution: Workaround > unable to turn off Hadoop's trash feature > -

[jira] [Commented] (SPARK-24615) SPIP: Accelerator-aware task scheduling for Spark

2020-02-06 Thread Sujith Chacko (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031378#comment-17031378 ] Sujith Chacko commented on SPARK-24615: --- Will this feature be a part of Spark 3.0?