[jira] [Assigned] (SPARK-28662) Create Hive Partitioned Table without specifying data type for partition columns will success unexpectedly

2019-08-19 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-28662: --- Assignee: Li Hao > Create Hive Partitioned Table without specifying data type for partiti

[jira] [Resolved] (SPARK-28662) Create Hive Partitioned Table without specifying data type for partition columns will success unexpectedly

2019-08-19 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-28662. - Resolution: Fixed Issue resolved by pull request 25390 [https://github.com/apache/spark/pull/253

[jira] [Resolved] (SPARK-28483) Canceling a spark job using barrier mode but barrier tasks do not exit

2019-08-19 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-28483. - Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25235 [https://gith

[jira] [Assigned] (SPARK-28483) Canceling a spark job using barrier mode but barrier tasks do not exit

2019-08-19 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-28483: --- Assignee: Weichen Xu > Canceling a spark job using barrier mode but barrier tasks do not ex

[jira] [Commented] (SPARK-28672) [UDF] Duplicate function creation should not allow

2019-08-19 Thread Liang-Chi Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911007#comment-16911007 ] Liang-Chi Hsieh commented on SPARK-28672: - Is there any rule in Hive regarding t

[jira] [Resolved] (SPARK-28426) Metadata Handling in Thrift Server

2019-08-19 Thread Xiao Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-28426. - Resolution: Fixed > Metadata Handling in Thrift Server > -- > >

[jira] [Commented] (SPARK-28672) [UDF] Duplicate function creation should not allow

2019-08-19 Thread pavithra ramachandran (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911003#comment-16911003 ] pavithra ramachandran commented on SPARK-28672: --- [~maropu] [~viirya]  The

[jira] [Commented] (SPARK-28672) [UDF] Duplicate function creation should not allow

2019-08-19 Thread pavithra ramachandran (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911002#comment-16911002 ] pavithra ramachandran commented on SPARK-28672: --- [~abhishek.akg] -  When w

[jira] [Commented] (SPARK-28774) ReusedExchangeExec cannot be columnar

2019-08-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910999#comment-16910999 ] Hyukjin Kwon commented on SPARK-28774: -- Please avoid to set target version which is

[jira] [Updated] (SPARK-28774) ReusedExchangeExec cannot be columnar

2019-08-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28774: - Target Version/s: (was: 3.0.0) > ReusedExchangeExec cannot be columnar > -

[jira] [Commented] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910994#comment-16910994 ] Dongjoon Hyun commented on SPARK-28699: --- Thank you for the update, [~XuanYuan]! >

[jira] [Updated] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanjian Li updated SPARK-28699: Description: It's another case for the indeterminate stage/RDD rerun while stage rerun happened.

[jira] [Updated] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28699: -- Affects Version/s: 2.3.3 2.4.3 > Cache an indeterminate RDD could lead

[jira] [Commented] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910990#comment-16910990 ] Yuanjian Li commented on SPARK-28699: - [~dongjoon] Sure, the affects version is spar

[jira] [Updated] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanjian Li updated SPARK-28699: Affects Version/s: 2.3.3 2.4.3 > Cache an indeterminate RDD could lead to i

[jira] [Updated] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28699: -- Target Version/s: 2.3.4, 2.4.4 Affects Version/s: (was: 2.4.3)

[jira] [Commented] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910987#comment-16910987 ] Dongjoon Hyun commented on SPARK-28699: --- :) BTW, I updated this to `Blocker` acco

[jira] [Updated] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28699: -- Priority: Blocker (was: Major) > Cache an indeterminate RDD could lead to incorrect result wh

[jira] [Updated] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28699: -- Description: It's another case for the indeterminate stage/RDD rerun while stage rerun happen

[jira] [Updated] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28699: -- Description: Related with SPARK-23207 SPARK-23243 It's another case for the indeterminate sta

[jira] [Commented] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Kazuaki Ishizaki (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910983#comment-16910983 ] Kazuaki Ishizaki commented on SPARK-28699: -- [~dongjoon] Thank you for pointing

[jira] [Comment Edited] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905250#comment-16905250 ] Yuanjian Li edited comment on SPARK-28699 at 8/20/19 4:09 AM:

[jira] [Updated] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanjian Li updated SPARK-28699: Description: Related with SPARK-23207 SPARK-23243 It's another case for the indeterminate stage/R

[jira] [Commented] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910974#comment-16910974 ] Dongjoon Hyun commented on SPARK-28699: --- ? [~kiszk]. `2.4.4-rc1` is `branch-2.4` a

[jira] [Resolved] (SPARK-28777) Pyspark sql function "format_string" has the wrong parameters in doc string

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-28777. --- Fix Version/s: 2.4.4 2.3.4 3.0.0 Resolution: Fix

[jira] [Assigned] (SPARK-28777) Pyspark sql function "format_string" has the wrong parameters in doc string

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-28777: - Assignee: Darren Tirto > Pyspark sql function "format_string" has the wrong parameters

[jira] [Closed] (SPARK-28712) spark structured stream with kafka don't really delete temp files in spark standalone cluster

2019-08-19 Thread 凭落 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] 凭落 closed SPARK-28712. -- solved by SPARK-28025  > spark structured stream with kafka don't really delete temp files in spark > standalone cluster

[jira] [Commented] (SPARK-28712) spark structured stream with kafka don't really delete temp files in spark standalone cluster

2019-08-19 Thread 凭落 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910957#comment-16910957 ] 凭落 commented on SPARK-28712: [~kabhwan] thanks a lot! It really helps! > spark structured s

[jira] [Commented] (SPARK-28712) spark structured stream with kafka don't really delete temp files in spark standalone cluster

2019-08-19 Thread 凭落 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910956#comment-16910956 ] 凭落 commented on SPARK-28712: [~hyukjin.kwon] I'm sorry about it, next time I'll use mail lis

[jira] [Commented] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Kazuaki Ishizaki (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910907#comment-16910907 ] Kazuaki Ishizaki commented on SPARK-28699: -- [~smilegator] Thank you for cc. I w

[jira] [Commented] (SPARK-28777) Pyspark sql function "format_string" has the wrong parameters in doc string

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910893#comment-16910893 ] Dongjoon Hyun commented on SPARK-28777: --- Welcome, [~darrentirto]. Thank you for fi

[jira] [Resolved] (SPARK-28775) DateTimeUtilsSuite fails for JDKs using the tzdata2018i or newer timezone database

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-28775. --- Fix Version/s: 2.4.4 2.3.4 3.0.0 Resolution: Fix

[jira] [Resolved] (SPARK-28224) Check overflow in decimal Sum aggregate

2019-08-19 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro resolved SPARK-28224. -- Fix Version/s: 3.0.0 Assignee: Mick Jermsurawong Resolution: Fixed Res

[jira] [Commented] (SPARK-28777) Pyspark sql function "format_string" has the wrong parameters in doc string

2019-08-19 Thread Darren Tirto (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910865#comment-16910865 ] Darren Tirto commented on SPARK-28777: -- Sorry, I'm a little new to this Jira board.

[jira] [Commented] (SPARK-27648) In Spark2.4 Structured Streaming:The executor storage memory increasing over time

2019-08-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910864#comment-16910864 ] Jungtaek Lim commented on SPARK-27648: -- Even better if you could reproduce with loc

[jira] [Updated] (SPARK-28777) Pyspark sql function "format_string" has the wrong parameters in doc string

2019-08-19 Thread Darren Tirto (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Darren Tirto updated SPARK-28777: - Shepherd: (was: Darren Tirto) > Pyspark sql function "format_string" has the wrong parameters

[jira] [Updated] (SPARK-28777) Pyspark sql function "format_string" has the wrong parameters in doc string

2019-08-19 Thread Darren Tirto (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Darren Tirto updated SPARK-28777: - Shepherd: Darren Tirto > Pyspark sql function "format_string" has the wrong parameters in doc st

[jira] [Commented] (SPARK-27648) In Spark2.4 Structured Streaming:The executor storage memory increasing over time

2019-08-19 Thread Puneet Loya (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910861#comment-16910861 ] Puneet Loya commented on SPARK-27648: - Cassandra Sink is nothing but Cassandra Forea

[jira] [Commented] (SPARK-22390) Aggregate push down

2019-08-19 Thread Huaxin Gao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-22390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910860#comment-16910860 ] Huaxin Gao commented on SPARK-22390: I haven't looked this Datasource V2 implementat

[jira] [Commented] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Xiao Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910859#comment-16910859 ] Xiao Li commented on SPARK-28699: - Also cc [~kiszk] Let us wait for this before starting

[jira] [Assigned] (SPARK-28749) Fix PySpark tests not to require kafka-0-8 in branch-2.4

2019-08-19 Thread Sean Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-28749: - Assignee: Matt Foley > Fix PySpark tests not to require kafka-0-8 in branch-2.4 > -

[jira] [Resolved] (SPARK-28749) Fix PySpark tests not to require kafka-0-8 in branch-2.4

2019-08-19 Thread Sean Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-28749. --- Fix Version/s: 2.4.5 Resolution: Fixed Issue resolved by pull request 25482 [https://github.c

[jira] [Updated] (SPARK-28775) DateTimeUtilsSuite fails for JDKs using the tzdata2018i or newer timezone database

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28775: -- Issue Type: Bug (was: Improvement) > DateTimeUtilsSuite fails for JDKs using the tzdata2018i

[jira] [Updated] (SPARK-28775) DateTimeUtilsSuite fails for JDKs using the tzdata2018i or newer timezone database

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28775: -- Component/s: Tests > DateTimeUtilsSuite fails for JDKs using the tzdata2018i or newer timezone

[jira] [Created] (SPARK-28779) CSV writer doesn't handle older Mac line endings

2019-08-19 Thread nicolas paris (Jira)
nicolas paris created SPARK-28779: - Summary: CSV writer doesn't handle older Mac line endings Key: SPARK-28779 URL: https://issues.apache.org/jira/browse/SPARK-28779 Project: Spark Issue Type

[jira] [Updated] (SPARK-28778) Shuffle jobs fail due to incorrect advertised address when running in virtual network

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28778: -- Summary: Shuffle jobs fail due to incorrect advertised address when running in virtual network

[jira] [Created] (SPARK-28778) [MESOS] Shuffle jobs fail due to incorrect advertised address when running in virtual network

2019-08-19 Thread Anton Kirillov (Jira)
Anton Kirillov created SPARK-28778: -- Summary: [MESOS] Shuffle jobs fail due to incorrect advertised address when running in virtual network Key: SPARK-28778 URL: https://issues.apache.org/jira/browse/SPARK-28778

[jira] [Commented] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910795#comment-16910795 ] Dongjoon Hyun commented on SPARK-28699: --- Hi, [~XuanYuan]. Could you check old Spar

[jira] [Commented] (SPARK-28466) FileSystem closed error when to call Hive.moveFile

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910790#comment-16910790 ] Dongjoon Hyun commented on SPARK-28466: --- Hi, [~angerszhuuu]. For the `Improvement`

[jira] [Updated] (SPARK-28466) FileSystem closed error when to call Hive.moveFile

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28466: -- Affects Version/s: (was: 2.4.0) (was: 2.3.0) > FileSystem clos

[jira] [Updated] (SPARK-28590) Add sort_stats Setter for Custom Profiler

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28590: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Add sort_stats Setter for C

[jira] [Updated] (SPARK-28594) Allow event logs for running streaming apps to be rolled over.

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28594: -- Affects Version/s: (was: 2.4.0) (was: 2.2.1)

[jira] [Updated] (SPARK-28590) Add sort_stats Setter for Custom Profiler

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28590: -- Target Version/s: (was: 2.4.0) > Add sort_stats Setter for Custom Profiler > ---

[jira] [Updated] (SPARK-28597) Spark streaming terminated when close meta data log error

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28597: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Spark streaming terminated

[jira] [Updated] (SPARK-28547) Make it work for wide (> 10K columns data)

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28547: -- Affects Version/s: (was: 2.4.4) (was: 2.4.3)

[jira] [Updated] (SPARK-28560) Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28560: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Optimize shuffle reader to

[jira] [Updated] (SPARK-28715) Introduce collectInPlanAndSubqueries and subqueriesAll in QueryPlan

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28715: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Introduce collectInPlanAndS

[jira] [Updated] (SPARK-28552) The URL prefix lowercase of MySQL is not necessary, but it is necessary in spark

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28552: -- Affects Version/s: (was: 2.4.3) 3.0.0 > The URL prefix lowercase of

[jira] [Updated] (SPARK-28631) Update Kinesis dependencies to the Apache version licensed versions

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28631: -- Affects Version/s: (was: 2.4.3) > Update Kinesis dependencies to the Apache version licens

[jira] [Updated] (SPARK-28678) Specify that start index is 1-based in docstring of pyspark.sql.functions.slice

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28678: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Specify that start index is

[jira] [Updated] (SPARK-28746) Add repartitionby hint to support RepartitionByExpression

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28746: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Add repartitionby hint to s

[jira] [Updated] (SPARK-28596) Use Java 8 time API in date_trunc

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28596: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Use Java 8 time API in date

[jira] [Updated] (SPARK-28771) Join partitioned dataframes on superset of partitioning columns without shuffle

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28771: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Join partitioned dataframes

[jira] [Updated] (SPARK-28577) Ensure executorMemoryHead requested value not less than MEMORY_OFFHEAP_SIZE when MEMORY_OFFHEAP_ENABLED is true

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28577: -- Affects Version/s: (was: 2.4.3) (was: 2.3.3)

[jira] [Updated] (SPARK-28727) Request for partial least square (PLS) regression model

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28727: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Request for partial least s

[jira] [Updated] (SPARK-28415) Add messageHandler to Kafka 10 direct stream API

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28415: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Add messageHandler to Kafka

[jira] [Updated] (SPARK-28716) Add id to Exchange and Subquery's stringArgs method for easier identifying their reuses in query plans

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28716: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Add id to Exchange and Subq

[jira] [Updated] (SPARK-28762) Read JAR main class if JAR is not located in local file system

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28762: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Read JAR main class if JAR

[jira] [Updated] (SPARK-28573) Convert InsertIntoTable(HiveTableRelation) to Datasource inserting for partitioned table

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28573: -- Affects Version/s: (was: 2.4.3) (was: 2.3.3)

[jira] [Updated] (SPARK-28751) Imporve java serializer deserialization performance

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28751: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Imporve java serializer des

[jira] [Updated] (SPARK-28655) Support to cut the event log, and solve the history server was too slow when event log is too large.

2019-08-19 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28655: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Support to cut the event lo

[jira] [Resolved] (SPARK-28434) Decision Tree model isn't equal after save and load

2019-08-19 Thread Sean Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-28434. --- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25485 [https://github.c

[jira] [Assigned] (SPARK-28775) DateTimeUtilsSuite fails for JDKs using the tzdata2018i or newer timezone database

2019-08-19 Thread Herman van Hovell (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell reassigned SPARK-28775: - Assignee: Sean Owen (was: Herman van Hovell) > DateTimeUtilsSuite fails for JD

[jira] [Created] (SPARK-28777) Pyspark sql function "format_string" has the wrong parameters in doc string

2019-08-19 Thread Darren Tirto (Jira)
Darren Tirto created SPARK-28777: Summary: Pyspark sql function "format_string" has the wrong parameters in doc string Key: SPARK-28777 URL: https://issues.apache.org/jira/browse/SPARK-28777 Project:

[jira] [Updated] (SPARK-28775) DateTimeUtilsSuite fails for JDKs using the tzdata2018i or newer timezone database

2019-08-19 Thread Sean Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-28775: -- Summary: DateTimeUtilsSuite fails for JDKs using the tzdata2018i or newer timezone database (was: Dat

[jira] [Created] (SPARK-28776) SparkML MLWriter gets hadoop conf from spark context instead of session

2019-08-19 Thread Helen Yu (Jira)
Helen Yu created SPARK-28776: Summary: SparkML MLWriter gets hadoop conf from spark context instead of session Key: SPARK-28776 URL: https://issues.apache.org/jira/browse/SPARK-28776 Project: Spark

[jira] [Created] (SPARK-28775) DateTimeUtilsSuite fails for JDKs using the tzdata2018h or newer timezone database

2019-08-19 Thread Herman van Hovell (Jira)
Herman van Hovell created SPARK-28775: - Summary: DateTimeUtilsSuite fails for JDKs using the tzdata2018h or newer timezone database Key: SPARK-28775 URL: https://issues.apache.org/jira/browse/SPARK-28775

[jira] [Commented] (SPARK-25603) Generalize Nested Column Pruning

2019-08-19 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910738#comment-16910738 ] Nicholas Chammas commented on SPARK-25603: -- [~dbtsai] - Just watched [your Spar

[jira] [Comment Edited] (SPARK-4502) Spark SQL reads unneccesary nested fields from Parquet

2019-08-19 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910655#comment-16910655 ] Nicholas Chammas edited comment on SPARK-4502 at 8/19/19 7:55 PM: -

[jira] [Comment Edited] (SPARK-25150) Joining DataFrames derived from the same source yields confusing/incorrect results

2019-08-19 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910691#comment-16910691 ] Nicholas Chammas edited comment on SPARK-25150 at 8/19/19 7:39 PM: ---

[jira] [Commented] (SPARK-27648) In Spark2.4 Structured Streaming:The executor storage memory increasing over time

2019-08-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910692#comment-16910692 ] Jungtaek Lim commented on SPARK-27648: -- [~ploya] Looks like you're running load te

[jira] [Updated] (SPARK-25150) Joining DataFrames derived from the same source yields confusing/incorrect results

2019-08-19 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-25150: - Affects Version/s: 2.4.3 Labels: correctness (was: ) I haven't been able

[jira] [Commented] (SPARK-27648) In Spark2.4 Structured Streaming:The executor storage memory increasing over time

2019-08-19 Thread Puneet Loya (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910680#comment-16910680 ] Puneet Loya commented on SPARK-27648: - Had posted about high storage memory issue on

[jira] [Updated] (SPARK-19248) Regex_replace works in 1.6 but not in 2.0

2019-08-19 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-19248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-19248: - Labels: correctness (was: ) Tagging this as a correctness issue since Spark 2+'s output

[jira] [Updated] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2019-08-19 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-18084: - Affects Version/s: 2.4.3 Retested and confirmed that this issue is still present in Spar

[jira] [Updated] (SPARK-10892) Join with Data Frame returns wrong results

2019-08-19 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-10892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-10892: - Affects Version/s: 2.4.0 Labels: correctness (was: ) Updating affected v

[jira] [Commented] (SPARK-4502) Spark SQL reads unneccesary nested fields from Parquet

2019-08-19 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910655#comment-16910655 ] Nicholas Chammas commented on SPARK-4502: - Thanks for your notes [~Bartalos]. Jus

[jira] [Assigned] (SPARK-28634) Failed to start SparkSession with Keytab file

2019-08-19 Thread Marcelo Vanzin (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-28634: -- Assignee: Marcelo Vanzin > Failed to start SparkSession with Keytab file > -

[jira] [Resolved] (SPARK-28634) Failed to start SparkSession with Keytab file

2019-08-19 Thread Marcelo Vanzin (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-28634. Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25467 [https:

[jira] [Resolved] (SPARK-25262) Support tmpfs for local dirs in k8s

2019-08-19 Thread Marcelo Vanzin (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-25262. Fix Version/s: 3.0.0 Assignee: Rob Vesse Resolution: Fixed > Support tmpfs

[jira] [Updated] (SPARK-25262) Support tmpfs for local dirs in k8s

2019-08-19 Thread Marcelo Vanzin (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-25262: --- Summary: Support tmpfs for local dirs in k8s (was: Make Spark local dir volumes configurabl

[jira] [Commented] (SPARK-25262) Make Spark local dir volumes configurable with Spark on Kubernetes

2019-08-19 Thread Marcelo Vanzin (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910616#comment-16910616 ] Marcelo Vanzin commented on SPARK-25262: Full configurability was actually added

[jira] [Created] (SPARK-28774) ReusedExchangeExec cannot be columnar

2019-08-19 Thread Robert Joseph Evans (Jira)
Robert Joseph Evans created SPARK-28774: --- Summary: ReusedExchangeExec cannot be columnar Key: SPARK-28774 URL: https://issues.apache.org/jira/browse/SPARK-28774 Project: Spark Issue Typ

[jira] [Created] (SPARK-28773) NULL Handling

2019-08-19 Thread Xiao Li (Jira)
Xiao Li created SPARK-28773: --- Summary: NULL Handling Key: SPARK-28773 URL: https://issues.apache.org/jira/browse/SPARK-28773 Project: Spark Issue Type: Sub-task Components: Documentation

[jira] [Resolved] (SPARK-28734) Create a table of content in the left hand side bar for SQL doc.

2019-08-19 Thread Xiao Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-28734. - Fix Version/s: 3.0.0 Resolution: Fixed > Create a table of content in the left hand side bar for

[jira] [Assigned] (SPARK-28734) Create a table of content in the left hand side bar for SQL doc.

2019-08-19 Thread Xiao Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-28734: --- Assignee: Dilip Biswal > Create a table of content in the left hand side bar for SQL doc. > ---

[jira] [Created] (SPARK-28772) Upgrade breeze to 1.0

2019-08-19 Thread Yuming Wang (Jira)
Yuming Wang created SPARK-28772: --- Summary: Upgrade breeze to 1.0 Key: SPARK-28772 URL: https://issues.apache.org/jira/browse/SPARK-28772 Project: Spark Issue Type: Sub-task Components

[jira] [Updated] (SPARK-28699) Cache an indeterminate RDD could lead to incorrect result while stage rerun

2019-08-19 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-28699: --- Labels: correctness (was: ) > Cache an indeterminate RDD could lead to incorrect result while stage