[jira] [Created] (SPARK-31526) Add a new test suite for ExpressionInfo

2020-04-23 Thread Takeshi Yamamuro (Jira)
Takeshi Yamamuro created SPARK-31526:


 Summary: Add a new test suite for ExpressionInfo
 Key: SPARK-31526
 URL: https://issues.apache.org/jira/browse/SPARK-31526
 Project: Spark
  Issue Type: Test
  Components: SQL
Affects Versions: 3.1.0
Reporter: Takeshi Yamamuro









[jira] [Commented] (SPARK-27039) toPandas with Arrow swallows maxResultSize errors

2020-04-23 Thread peay (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090352#comment-17090352
 ] 

peay commented on SPARK-27039:
--

[~hyukjin.kwon] do you know if this was eventually backported in 2.4.x? This 
was more than a year ago, and I suspect 3.0 is still some ways off (taking into 
account the official release and deployment for us), while this is a rather 
sneaky correctness bug.

> toPandas with Arrow swallows maxResultSize errors
> -
>
> Key: SPARK-27039
> URL: https://issues.apache.org/jira/browse/SPARK-27039
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.0
>Reporter: peay
>Priority: Minor
>
> I am running the following simple `toPandas` with {{maxResultSize}} set to 
> 1mb:
> {code:java}
> import pyspark.sql.functions as F
> df = spark.range(1000 * 1000)
> df_pd = df.withColumn("test", F.lit("this is a long string that should make 
> the resulting dataframe too large for maxResult which is 1m")).toPandas()
> {code}
>  
> With {{spark.sql.execution.arrow.enabled}} set to {{true}}, this returns an 
> empty Pandas dataframe without any error:
> {code:python}
> df_pd.info()
> # <class 'pandas.core.frame.DataFrame'>
> # Index: 0 entries
> # Data columns (total 2 columns):
> # id  0 non-null object
> # test0 non-null object
> # dtypes: object(2)
> # memory usage: 0.0+ bytes
> {code}
> The driver stderr does have an error, and so does the Spark UI:
> {code:java}
> ERROR TaskSetManager: Total size of serialized results of 1 tasks (52.8 MB) 
> is bigger than spark.driver.maxResultSize (1024.0 KB)
> ERROR TaskSetManager: Total size of serialized results of 2 tasks (105.7 MB) 
> is bigger than spark.driver.maxResultSize (1024.0 KB)
> Exception in thread "serve-Arrow" org.apache.spark.SparkException: Job 
> aborted due to stage failure: Total size of serialized results of 1 tasks 
> (52.8 MB) is bigger than spark.driver.maxResultSize (1024.0 KB)
>  at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:2039)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:2027)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:2026)
>  at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>  at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2026)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:966)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:966)
>  at scala.Option.foreach(Option.scala:257)
>  at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:966)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2260)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2209)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2198)
>  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
>  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:777)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
>  at 
> org.apache.spark.sql.Dataset$$anonfun$collectAsArrowToPython$1$$anonfun$apply$17.apply(Dataset.scala:3313)
>  at 
> org.apache.spark.sql.Dataset$$anonfun$collectAsArrowToPython$1$$anonfun$apply$17.apply(Dataset.scala:3282)
>  at 
> org.apache.spark.api.python.PythonRDD$$anonfun$6$$anonfun$apply$1.apply$mcV$sp(PythonRDD.scala:435)
>  at 
> org.apache.spark.api.python.PythonRDD$$anonfun$6$$anonfun$apply$1.apply(PythonRDD.scala:435)
>  at 
> org.apache.spark.api.python.PythonRDD$$anonfun$6$$anonfun$apply$1.apply(PythonRDD.scala:435)
>  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
>  at 
> org.apache.spark.api.python.PythonRDD$$anonfun$6.apply(PythonRDD.scala:436)
>  at 
> org.apache.spark.api.python.PythonRDD$$anonfun$6.apply(PythonRDD.scala:432)
>  at org.apache.spark.api.python.PythonServer$$anon$1.run(PythonRDD.scala:862)
> {code}
> With {{spark.sql.execution.arrow.enabled}} set to {{false}}, the Python call 
> to {{toPandas}} does fail as expected.
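
For anyone hitting this before a fix lands, a minimal defensive sketch (an 
assumption about how to detect the silent failure, not an official 
workaround): an empty Pandas frame from a DataFrame known to be non-empty is a 
signal that the Arrow collection may have failed silently.

{code:python}
import pyspark.sql.functions as F

# `spark` is assumed to be an active SparkSession, as in the report above.
df = spark.range(1000 * 1000).withColumn("test", F.lit("some long string"))

df_pd = df.toPandas()
# With Arrow enabled, a swallowed maxResultSize error currently yields an
# empty frame instead of raising, so guard explicitly:
if df_pd.empty and not df.rdd.isEmpty():
    raise RuntimeError("toPandas() returned an empty frame for a non-empty "
                       "DataFrame; check the driver logs for a maxResultSize error.")
{code}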






[jira] [Created] (SPARK-31527) date add/subtract interval should only allow day precision in ANSI mode

2020-04-23 Thread Kent Yao (Jira)
Kent Yao created SPARK-31527:


 Summary: date add/subtract interval should only allow day precision 
in ANSI mode
 Key: SPARK-31527
 URL: https://issues.apache.org/jira/browse/SPARK-31527
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.0.0, 3.1.0
Reporter: Kent Yao


Under ANSI mode, we should not allow date add/subtract with intervals that 
carry sub-day fields (hours, minutes, ... microseconds).
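
For illustration, a sketch of the intended behavior; the exact error type and 
message here are assumptions:

{code:python}
spark.conf.set("spark.sql.ansi.enabled", "true")

# Day precision: allowed.
spark.sql("SELECT date '2020-04-23' + interval 2 days").show()

# Sub-day precision: should be rejected under this proposal,
# e.g. with an AnalysisException.
spark.sql("SELECT date '2020-04-23' + interval 2 hours")
{code}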






[jira] [Updated] (SPARK-31344) Polish implementation of barrier() and allGather()

2020-04-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan updated SPARK-31344:

Fix Version/s: (was: 3.1.0)
   3.0.0

> Polish implementation of barrier() and allGather()
> --
>
> Key: SPARK-31344
> URL: https://issues.apache.org/jira/browse/SPARK-31344
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, Spark Core
>Affects Versions: 3.0.0
>Reporter: wuyi
>Assignee: wuyi
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently, the implementations of barrier() and allGather() have a lot of 
> duplicated code; we should polish them to make the code simpler.
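
For context, a minimal PySpark sketch of the two APIs being unified ({{sc}} is 
assumed to be an active SparkContext):

{code:python}
from pyspark import BarrierTaskContext

def exchange(iterator):
    ctx = BarrierTaskContext.get()
    ctx.barrier()  # wait until every task in the barrier stage reaches this point
    # allGather() also synchronizes, and additionally returns every task's message.
    msgs = ctx.allGather(str(ctx.partitionId()))
    yield msgs

print(sc.parallelize(range(4), 2).barrier().mapPartitions(exchange).collect())
{code}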






[jira] [Commented] (SPARK-31527) date add/subtract interval should only allow day precision in ANSI mode

2020-04-23 Thread charity johnson johnson (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090438#comment-17090438
 ] 

charity johnson johnson commented on SPARK-31527:
-

Ty

> date add/subtract interval should only allow day precision in ANSI mode
> --
>
> Key: SPARK-31527
> URL: https://issues.apache.org/jira/browse/SPARK-31527
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Kent Yao
>Priority: Major
>
> Under ANSI mode, we should not allow date add/subtract with intervals that 
> carry sub-day fields (hours, minutes, ... microseconds).






[jira] [Created] (SPARK-31528) Remove millennium, century, decade from trunc/date_trunc functions

2020-04-23 Thread Kent Yao (Jira)
Kent Yao created SPARK-31528:


 Summary: Remove millennium, century, decade from trunc/date_trunc 
functions
 Key: SPARK-31528
 URL: https://issues.apache.org/jira/browse/SPARK-31528
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.0.0, 3.1.0
Reporter: Kent Yao


As with SPARK-31507, millennium, century, and decade are not commonly 
supported in most modern platforms.

For example:

Not supported:
https://docs.snowflake.com/en/sql-reference/functions-date-time.html#supported-date-and-time-parts
https://prestodb.io/docs/current/functions/datetime.html#date_trunc
https://teradata.github.io/presto/docs/148t/functions/datetime.html#date_trunc
https://www.oracletutorial.com/oracle-date-functions/oracle-trunc/

Supported:
https://docs.aws.amazon.com/redshift/latest/dg/r_Dateparts_for_datetime_functions.html
https://www.postgresql.org/docs/9.1/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC
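
For illustration, a sketch of the calls that would change (assuming, per the 
description, that master currently accepts these fields):

{code:python}
# Accepted on master at the time of writing; under this proposal the
# millennium/century/decade fields would be removed from trunc/date_trunc.
spark.sql("SELECT date_trunc('MILLENNIUM', timestamp '2020-04-23 12:00:00')").show()

# Commonly supported fields remain unchanged.
spark.sql("SELECT date_trunc('YEAR', timestamp '2020-04-23 12:00:00')").show()
{code}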






[jira] [Created] (SPARK-31529) Remove redundant whitespace in the formatted explain

2020-04-23 Thread wuyi (Jira)
wuyi created SPARK-31529:


 Summary: Remove redundant whitespace in the formatted explain
 Key: SPARK-31529
 URL: https://issues.apache.org/jira/browse/SPARK-31529
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.0.0
Reporter: wuyi


The formatted explain includes redundant whitespace. Even the number of spaces 
differs between master and branch-3.0, which leads to failing explain tests 
when changes are backported to branch-3.0.
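
For reference, a minimal sketch of producing the explain output in question 
(Spark 3.0+):

{code:python}
df = spark.range(10).selectExpr("id % 2 AS k").groupBy("k").count()
# Prints the formatted plan whose whitespace differs between branches.
df.explain(mode="formatted")
{code}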






[jira] [Updated] (SPARK-31529) Remove extra whitespace in the formatted explain

2020-04-23 Thread wuyi (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wuyi updated SPARK-31529:
-
Summary: Remove extra whitespace in the formatted explain  (was: Remove 
redundant whitespace in the formatted explain)

> Remove extra whitespace in the formatted explain
> -
>
> Key: SPARK-31529
> URL: https://issues.apache.org/jira/browse/SPARK-31529
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: wuyi
>Priority: Major
>
> The formatted explain includes redundant whitespace. Even the number of 
> spaces differs between master and branch-3.0, which leads to failing explain 
> tests when changes are backported to branch-3.0.






[jira] [Updated] (SPARK-31529) Remove extra whitespace in the formatted explain

2020-04-23 Thread wuyi (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wuyi updated SPARK-31529:
-
Description: The formatted explain includes extra whitespace. Even the number 
of spaces differs between master and branch-3.0, which leads to failing 
explain tests when changes are backported to branch-3.0.  (was: The formatted 
explain includes redundant whitespace. Even the number of spaces differs 
between master and branch-3.0, which leads to failing explain tests when 
changes are backported to branch-3.0.)

> Remove extra whitespace in the formatted explain
> -
>
> Key: SPARK-31529
> URL: https://issues.apache.org/jira/browse/SPARK-31529
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: wuyi
>Priority: Major
>
> The formatted explain includes extra whitespace. Even the number of spaces 
> differs between master and branch-3.0, which leads to failing explain tests 
> when changes are backported to branch-3.0.






[jira] [Created] (SPARK-31530) Spark submit fails if we provide extraJavaOption which contains Xmx as substring

2020-04-23 Thread Mayank (Jira)
Mayank created SPARK-31530:
--

 Summary: Spark submit fails if we provide extraJavaOption which 
contains Xmx as substring
 Key: SPARK-31530
 URL: https://issues.apache.org/jira/browse/SPARK-31530
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit
Affects Versions: 2.4.0
Reporter: Mayank


Spark submit doesn't allow Xmx anywhere in spark.driver.extraJavaOptions. For 
example:
{code:java}
bin\spark-submit --class org.apache.spark.examples.SparkPi --master local[*] 
--conf "spark.driver.extraJavaOptions=-DmyKey=MyValueContainsXmx" 
examples\jars\spark-examples_2.11-2.4.4.jar
Error: Not allowed to specify max heap(Xmx) memory settings through java 
options (was -DmyKey=MyValueContainsXmx). Use the corresponding --driver-memory 
or spark.driver.memory configuration instead.{code}
https://github.com/apache/spark/blob/v2.4.4/launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java#L102

Can we update the above condition to check more specifically, e.g. for -Xmx?
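
For illustration, the stricter matching being requested might look like the 
following sketch (Python is used here only to show the logic; the real check 
lives in the Java launcher):

{code:python}
def has_xmx_option(java_opts: str) -> bool:
    # Treat -Xmx as an option-token prefix rather than a bare substring,
    # so values like -DmyKey=MyValueContainsXmx pass. (Naive whitespace
    # tokenization; quoting is ignored in this sketch.)
    return any(tok.startswith("-Xmx") for tok in java_opts.split())

assert not has_xmx_option("-DmyKey=MyValueContainsXmx")
assert has_xmx_option("-Xmx2g -DmyKey=v")
{code}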

 






[jira] [Updated] (SPARK-31530) Spark submit fails if we provide extraJavaOption which contains Xmx as substring

2020-04-23 Thread Mayank (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank updated SPARK-31530:
---
Labels: 2.4.0 Spark Submit  (was: )

> Spark submit fails if we provide extraJavaOption which contains Xmx as 
> substring
> -
>
> Key: SPARK-31530
> URL: https://issues.apache.org/jira/browse/SPARK-31530
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.4.0
>Reporter: Mayank
>Priority: Major
>  Labels: 2.4.0, Spark, Submit
>
> Spark submit doesn't allow Xmx anywhere in spark.driver.extraJavaOptions. 
> For example:
> {code:java}
> bin\spark-submit --class org.apache.spark.examples.SparkPi --master local[*] 
> --conf "spark.driver.extraJavaOptions=-DmyKey=MyValueContainsXmx" 
> examples\jars\spark-examples_2.11-2.4.4.jar
> Error: Not allowed to specify max heap(Xmx) memory settings through java 
> options (was -DmyKey=MyValueContainsXmx). Use the corresponding 
> --driver-memory or spark.driver.memory configuration instead.{code}
> https://github.com/apache/spark/blob/v2.4.4/launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java#L102
> Can we update the above condition to check more specifically, e.g. for -Xmx?
>  






[jira] [Commented] (SPARK-25075) Build and test Spark against Scala 2.13

2020-04-23 Thread Guillaume Martres (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-25075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090582#comment-17090582
 ] 

Guillaume Martres commented on SPARK-25075:
---

> SPARK-30132 which is blocked on Scala 2.13.2

2.13.2 is out now.

> SPARK-27683 and SPARK-30090 sound like non-negligible effort as well

The first one isn't actually a blocker: as noted in that issue, TraversableOnce 
is still there (as an alias) in 2.13.

> Build and test Spark against Scala 2.13
> ---
>
> Key: SPARK-25075
> URL: https://issues.apache.org/jira/browse/SPARK-25075
> Project: Spark
>  Issue Type: Umbrella
>  Components: Build, MLlib, Project Infra, Spark Core, SQL
>Affects Versions: 3.0.0
>Reporter: Guillaume Massé
>Priority: Major
>
> This umbrella JIRA tracks the requirements for building and testing Spark 
> against the current Scala 2.13 milestone.






[jira] [Commented] (SPARK-30090) Update REPL for 2.13

2020-04-23 Thread Seth Tisue (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090590#comment-17090590
 ] 

Seth Tisue commented on SPARK-30090:


some further REPL changes have now landed in Scala 2.13.2: 
https://github.com/scala/scala/releases/tag/v2.13.2

> Update REPL for 2.13
> 
>
> Key: SPARK-30090
> URL: https://issues.apache.org/jira/browse/SPARK-30090
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Shell
>Affects Versions: 3.0.0
>Reporter: Sean R. Owen
>Priority: Major
>
> The Spark REPL is a modified Scala REPL. It changed significantly in 2.13. We 
> will need to at least re-hack it, and along the way, see if we can do what's 
> necessary to customize it without so many invasive changes.






[jira] [Created] (SPARK-31531) sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner() method not found during spark-submit

2020-04-23 Thread shayoni Halder (Jira)
shayoni Halder created SPARK-31531:
--

 Summary: sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner() method 
not found during spark-submit
 Key: SPARK-31531
 URL: https://issues.apache.org/jira/browse/SPARK-31531
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit
Affects Versions: 2.4.5
Reporter: shayoni Halder


I am trying to run the following Spark submit from a VM using YARN cluster mode.
./spark-submit --master yarn --deploy-mode client test_spark_yarn.py

The VM has Java 11 and spark-2.4.5, while the YARN cluster has Java 8 and 
spark-2.4.0. I am getting the error below:

!image-2020-04-23-16-41-54-449.png!






[jira] [Updated] (SPARK-31531) sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner() method not found during spark-submit

2020-04-23 Thread shayoni Halder (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shayoni Halder updated SPARK-31531:
---
Attachment: error.PNG

> sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner() method not found during 
> spark-submit
> ---
>
> Key: SPARK-31531
> URL: https://issues.apache.org/jira/browse/SPARK-31531
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.4.5
>Reporter: shayoni Halder
>Priority: Major
> Attachments: error.PNG
>
>
> I am trying to run the following Spark submit from a VM using YARN cluster 
> mode.
> ./spark-submit --master yarn --deploy-mode client test_spark_yarn.py
> The VM has Java 11 and spark-2.4.5, while the YARN cluster has Java 8 and 
> spark-2.4.0. I am getting the error below:
> !image-2020-04-23-16-41-54-449.png!






[jira] [Resolved] (SPARK-31472) allGather() may return null messages

2020-04-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-31472.
-
Fix Version/s: 3.0.0
 Assignee: wuyi
   Resolution: Fixed

> allGather() may return null messages 
> -
>
> Key: SPARK-31472
> URL: https://issues.apache.org/jira/browse/SPARK-31472
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: wuyi
>Assignee: wuyi
>Priority: Major
> Fix For: 3.0.0
>
>
> {code:java}
> [info] BarrierTaskContextSuite:
> [info] - share messages with allGather() call *** FAILED *** (18 seconds, 705 
> milliseconds)
> [info]   org.apache.spark.SparkException: Job aborted due to stage failure: 
> Could not recover from a failed barrier ResultStage. Most recent failure 
> reason: Stage failed because barrier task ResultTask(0, 2) finished 
> unsuccessfully.
> [info] java.lang.NullPointerException
> [info]at 
> scala.collection.mutable.ArrayOps$ofRef$.length$extension(ArrayOps.scala:204)
> [info]at 
> scala.collection.mutable.ArrayOps$ofRef.length(ArrayOps.scala:204)
> [info]at 
> scala.collection.IndexedSeqOptimized.toList(IndexedSeqOptimized.scala:285)
> [info]at 
> scala.collection.IndexedSeqOptimized.toList$(IndexedSeqOptimized.scala:284)
> [info]at 
> scala.collection.mutable.ArrayOps$ofRef.toList(ArrayOps.scala:198)
> [info]at 
> org.apache.spark.scheduler.BarrierTaskContextSuite.$anonfun$new$4(BarrierTaskContextSuite.scala:68)
> [info]at 
> org.apache.spark.rdd.RDDBarrier.$anonfun$mapPartitions$2(RDDBarrier.scala:51)
> [info]at 
> org.apache.spark.rdd.RDDBarrier.$anonfun$mapPartitions$2$adapted(RDDBarrier.scala:51)
> [info]at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
> [info]at 
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
> [info]at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
> [info]at 
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
> [info]at org.apache.spark.scheduler.Task.run(Task.scala:127)
> [info]at 
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:460)
> [info]at 
> org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
> [info]at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:463)
> [info]at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [info]at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [info]at java.lang.Thread.run(Thread.java:748)
> [info]   at 
> org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2094)
> [info]   at 
> org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2043)
> [info]   at 
> org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2042)
> [info]   at 
> scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
> [info]   at 
> scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
> [info]   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> [info]   at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2042)
> [info]   at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1831)
> [info]   at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2271)
> [info]   at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2223)
> [info]   at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2212)
> [info]   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
> [info]   at 
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:822)
> [info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2108)
> [info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2129)
> [info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2148)
> [info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2173)
> [info]   at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1030)
> [info]   at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> [info]   at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
> [info]   at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
> [info]   at org.apache.spark.rdd.RDD.collect(RDD.scala:1029)
> [info]   at 
> org.apache.spark.scheduler.BarrierTaskContextSuite.$anonfun$new$3(BarrierTaskContextSuite.scala:71)
> [info]   at 
> scala.r

[jira] [Updated] (SPARK-31531) sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner() method not found during spark-submit

2020-04-23 Thread shayoni Halder (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shayoni Halder updated SPARK-31531:
---
Description: 
I am trying to run the following Spark submit from a VM using YARN cluster mode.
 ./spark-submit --master yarn --deploy-mode client test_spark_yarn.py

The VM has Java 11 and spark-2.4.5, while the YARN cluster has Java 8 and 
spark-2.4.0. I am getting the error below:

!error.PNG!

  was:
I am trying to run the following Spark submit from a VM using YARN cluster mode.
./spark-submit --master yarn --deploy-mode client test_spark_yarn.py

The VM has Java 11 and spark-2.4.5, while the YARN cluster has Java 8 and 
spark-2.4.0. I am getting the error below:

!image-2020-04-23-16-41-54-449.png!


> sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner() method not found during 
> spark-submit
> ---
>
> Key: SPARK-31531
> URL: https://issues.apache.org/jira/browse/SPARK-31531
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.4.5
>Reporter: shayoni Halder
>Priority: Major
> Attachments: error.PNG
>
>
> I am trying to run the following Spark submit from a VM using YARN cluster 
> mode.
>  ./spark-submit --master yarn --deploy-mode client test_spark_yarn.py
> The VM has Java 11 and spark-2.4.5, while the YARN cluster has Java 8 and 
> spark-2.4.0. I am getting the error below:
> !error.PNG!






[jira] [Resolved] (SPARK-31522) Hive metastore client initialization related configurations should be static

2020-04-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-31522.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 28302
[https://github.com/apache/spark/pull/28302]

> Hive metastore client initialization related configurations should be static 
> -
>
> Key: SPARK-31522
> URL: https://issues.apache.org/jira/browse/SPARK-31522
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.5, 3.0.0, 3.1.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
> Fix For: 3.0.0
>
>
> The following configurations defined in HiveUtils should be considered static:
>  # spark.sql.hive.metastore.version - used to determine the Hive version in 
> Spark
>  # spark.sql.hive.version - a fake alias of the above
>  # spark.sql.hive.metastore.jars - the location of the Hive metastore jars 
> that Spark uses to create the Hive client
>  # spark.sql.hive.metastore.sharedPrefixes and 
> spark.sql.hive.metastore.barrierPrefixes - packages of classes that are 
> shared or separated between the SparkContext loader and the Hive client 
> class loader
> These are used only once, when creating the Hive metastore client. They 
> should be static in SQLConf so they are retrieved correctly, and we should 
> avoid users changing them with the SET/RESET command. They should 
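
For illustration, a sketch of the intended usage once these become static (the 
config values here are placeholders):

{code:python}
from pyspark.sql import SparkSession

# Static configs must be supplied when the session is built...
spark = (SparkSession.builder
         .config("spark.sql.hive.metastore.version", "2.3.7")
         .config("spark.sql.hive.metastore.jars", "builtin")
         .enableHiveSupport()
         .getOrCreate())

# ...and, with the fix, changing them afterwards should fail like any other
# static config:
# spark.sql("SET spark.sql.hive.metastore.version=1.2.1")  # expected to raise
{code}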






[jira] [Assigned] (SPARK-31522) Hive metastore client initialization related configurations should be static

2020-04-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-31522:
---

Assignee: Kent Yao

> Hive metastore client initialization related configurations should be static 
> -
>
> Key: SPARK-31522
> URL: https://issues.apache.org/jira/browse/SPARK-31522
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.5, 3.0.0, 3.1.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>
> The following configurations defined in HiveUtils should be considered static:
>  # spark.sql.hive.metastore.version - used to determine the Hive version in 
> Spark
>  # spark.sql.hive.version - a fake alias of the above
>  # spark.sql.hive.metastore.jars - the location of the Hive metastore jars 
> that Spark uses to create the Hive client
>  # spark.sql.hive.metastore.sharedPrefixes and 
> spark.sql.hive.metastore.barrierPrefixes - packages of classes that are 
> shared or separated between the SparkContext loader and the Hive client 
> class loader
> These are used only once, when creating the Hive metastore client. They 
> should be static in SQLConf so they are retrieved correctly, and we should 
> avoid users changing them with the SET/RESET command. They should 






[jira] [Created] (SPARK-31532) SparkSessionBuilder should not propagate static sql configurations to the existing active/default SparkSession

2020-04-23 Thread Kent Yao (Jira)
Kent Yao created SPARK-31532:


 Summary: SparkSessionBuilder should not propagate static sql 
configurations to the existing active/default SparkSession
 Key: SPARK-31532
 URL: https://issues.apache.org/jira/browse/SPARK-31532
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.0.0
Reporter: Kent Yao


Clearly, this is a bug.
{code:java}
scala> spark.sql("set spark.sql.warehouse.dir").show
+++
| key|   value|
+++
|spark.sql.warehou...|file:/Users/kenty...|
+++


scala> spark.sql("set spark.sql.warehouse.dir=2");
org.apache.spark.sql.AnalysisException: Cannot modify the value of a static 
config: spark.sql.warehouse.dir;
  at 
org.apache.spark.sql.RuntimeConfig.requireNonStaticConf(RuntimeConfig.scala:154)
  at org.apache.spark.sql.RuntimeConfig.set(RuntimeConfig.scala:42)
  at 
org.apache.spark.sql.execution.command.SetCommand.$anonfun$x$7$6(SetCommand.scala:100)
  at org.apache.spark.sql.execution.command.SetCommand.run(SetCommand.scala:156)
  at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
  at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
  at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:229)
  at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3644)
  at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
  at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
  at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
  at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3642)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:229)
  at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:607)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:602)
  ... 47 elided

scala> import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.SparkSession

scala> SparkSession.builder.config("spark.sql.warehouse.dir", "xyz").get
getClass   getOrCreate

scala> SparkSession.builder.config("spark.sql.warehouse.dir", "xyz").getOrCreate
20/04/23 23:49:13 WARN SparkSession$Builder: Using an existing SparkSession; 
some configuration may not take effect.
res7: org.apache.spark.sql.SparkSession = 
org.apache.spark.sql.SparkSession@6403d574

scala> spark.sql("set spark.sql.warehouse.dir").show
++-+
| key|value|
++-+
|spark.sql.warehou...|  xyz|
++-+


scala>
{code}







[jira] [Created] (SPARK-31533) Enable DB2IntegrationSuite test and upgrade the DB2 docker inside

2020-04-23 Thread Gabor Somogyi (Jira)
Gabor Somogyi created SPARK-31533:
-

 Summary: Enable DB2IntegrationSuite test and upgrade the DB2 
docker inside
 Key: SPARK-31533
 URL: https://issues.apache.org/jira/browse/SPARK-31533
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.1.0
Reporter: Gabor Somogyi









[jira] [Commented] (SPARK-31533) Enable DB2IntegrationSuite test and upgrade the DB2 docker inside

2020-04-23 Thread Gabor Somogyi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090725#comment-17090725
 ] 

Gabor Somogyi commented on SPARK-31533:
---

Started to work on this.

> Enable DB2IntegrationSuite test and upgrade the DB2 docker inside
> -
>
> Key: SPARK-31533
> URL: https://issues.apache.org/jira/browse/SPARK-31533
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Gabor Somogyi
>Priority: Major
>







[jira] [Updated] (SPARK-31533) Enable DB2IntegrationSuite test and upgrade the DB2 docker inside

2020-04-23 Thread Gabor Somogyi (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Somogyi updated SPARK-31533:
--
Component/s: Tests

> Enable DB2IntegrationSuite test and upgrade the DB2 docker inside
> -
>
> Key: SPARK-31533
> URL: https://issues.apache.org/jira/browse/SPARK-31533
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.1.0
>Reporter: Gabor Somogyi
>Priority: Major
>







[jira] [Updated] (SPARK-31532) SparkSessionBuilder should not propagate static sql configurations to the existing active/default SparkSession

2020-04-23 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao updated SPARK-31532:
-
Affects Version/s: 3.1.0
   2.0.2
   2.1.3
   2.2.3
   2.3.4
   2.4.5

> SparkSessionBuilder should not propagate static sql configurations to the 
> existing active/default SparkSession
> -
>
> Key: SPARK-31532
> URL: https://issues.apache.org/jira/browse/SPARK-31532
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.5, 3.0.0, 3.1.0
>Reporter: Kent Yao
>Priority: Major
>
> Clearly, this is a bug.
> {code:java}
> scala> spark.sql("set spark.sql.warehouse.dir").show
> +++
> | key|   value|
> +++
> |spark.sql.warehou...|file:/Users/kenty...|
> +++
> scala> spark.sql("set spark.sql.warehouse.dir=2");
> org.apache.spark.sql.AnalysisException: Cannot modify the value of a static 
> config: spark.sql.warehouse.dir;
>   at 
> org.apache.spark.sql.RuntimeConfig.requireNonStaticConf(RuntimeConfig.scala:154)
>   at org.apache.spark.sql.RuntimeConfig.set(RuntimeConfig.scala:42)
>   at 
> org.apache.spark.sql.execution.command.SetCommand.$anonfun$x$7$6(SetCommand.scala:100)
>   at 
> org.apache.spark.sql.execution.command.SetCommand.run(SetCommand.scala:156)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:229)
>   at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3644)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3642)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:229)
>   at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:607)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:602)
>   ... 47 elided
> scala> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.SparkSession
> scala> SparkSession.builder.config("spark.sql.warehouse.dir", "xyz").get
> getClass   getOrCreate
> scala> SparkSession.builder.config("spark.sql.warehouse.dir", 
> "xyz").getOrCreate
> 20/04/23 23:49:13 WARN SparkSession$Builder: Using an existing SparkSession; 
> some configuration may not take effect.
> res7: org.apache.spark.sql.SparkSession = 
> org.apache.spark.sql.SparkSession@6403d574
> scala> spark.sql("set spark.sql.warehouse.dir").show
> ++-+
> | key|value|
> ++-+
> |spark.sql.warehou...|  xyz|
> ++-+
> scala>
> {code}






[jira] [Updated] (SPARK-31530) Spark submit fails if we provide extraJavaOption which contains Xmx as substring

2020-04-23 Thread Mayank (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank updated SPARK-31530:
---
Description: 
Spark submit doesn't allow Xmx anywhere in spark.driver.extraJavaOptions. For 
example:
{code:java}
bin\spark-submit --class org.apache.spark.examples.SparkPi --master local[*] 
--conf "spark.driver.extraJavaOptions=-DmyKey=MyValueContainsXmx" 
examples\jars\spark-examples_2.11-2.4.4.jar
Error: Not allowed to specify max heap(Xmx) memory settings through java 
options (was -DmyKey=MyValueContainsXmx). Use the corresponding --driver-memory 
or spark.driver.memory configuration instead.{code}
https://github.com/apache/spark/blob/v2.4.4/launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java#L102
https://github.com/apache/spark/blob/master/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java#L302

Can we update the above condition to check more specifically, e.g. for -Xmx?

  was:
Spark submit doesn't allow Xmx anywhere in spark.driver.extraJavaOptions. For 
example:
{code:java}
bin\spark-submit --class org.apache.spark.examples.SparkPi --master local[*] 
--conf "spark.driver.extraJavaOptions=-DmyKey=MyValueContainsXmx" 
examples\jars\spark-examples_2.11-2.4.4.jar
Error: Not allowed to specify max heap(Xmx) memory settings through java 
options (was -DmyKey=MyValueContainsXmx). Use the corresponding --driver-memory 
or spark.driver.memory configuration instead.{code}
https://github.com/apache/spark/blob/v2.4.4/launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java#L102

Can we update the above condition to check more specifically, e.g. for -Xmx?


> Spark submit fails if we provide extraJavaOption which contains Xmx as 
> substring
> -
>
> Key: SPARK-31530
> URL: https://issues.apache.org/jira/browse/SPARK-31530
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.4.0
>Reporter: Mayank
>Priority: Major
>  Labels: 2.4.0, Spark, Submit
>
> Spark submit doesn't allow Xmx anywhere in spark.driver.extraJavaOptions. 
> For example:
> {code:java}
> bin\spark-submit --class org.apache.spark.examples.SparkPi --master local[*] 
> --conf "spark.driver.extraJavaOptions=-DmyKey=MyValueContainsXmx" 
> examples\jars\spark-examples_2.11-2.4.4.jar
> Error: Not allowed to specify max heap(Xmx) memory settings through java 
> options (was -DmyKey=MyValueContainsXmx). Use the corresponding 
> --driver-memory or spark.driver.memory configuration instead.{code}
> https://github.com/apache/spark/blob/v2.4.4/launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java#L102
> https://github.com/apache/spark/blob/master/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java#L302
> Can we update the above condition to check more specifically, e.g. for -Xmx?






[jira] [Created] (SPARK-31534) Text for tooltip should be escaped

2020-04-23 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-31534:
--

 Summary: Text for tooltip should be escaped
 Key: SPARK-31534
 URL: https://issues.apache.org/jira/browse/SPARK-31534
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 3.1.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


The Timeline View for applications and jobs, and the DAG Viz for jobs, show 
tooltips, but their text is not escaped for HTML, so they cannot be shown 
properly.
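
For illustration, the fix amounts to HTML-escaping the tooltip text before it 
is rendered; a sketch of the escaping itself (the actual UI code is 
JavaScript, so this shows only the concept):

{code:python}
import html

# Characters that break the tooltip markup get escaped to entities.
print(html.escape('label with <tags> & "quotes"'))
# label with &lt;tags&gt; &amp; &quot;quotes&quot;
{code}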






[jira] [Commented] (SPARK-29106) Add jenkins arm test for spark

2020-04-23 Thread Geoffrey Blake (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090773#comment-17090773
 ] 

Geoffrey Blake commented on SPARK-29106:


Hi all,

Regarding the leveldbjni dependency, why is the openlabtesting version of 
leveldbjni-1.8-all.jar not active for all builds of Spark going forward? To 
get Spark, most users will likely download from the Apache mirrors, and those 
tarballs are built on an x86 system, so the aarch64-supported version of 
leveldbjni is hidden from everyone except those brave enough to build from 
source.

Thanks,

Geoff Blake

> Add jenkins arm test for spark
> --
>
> Key: SPARK-29106
> URL: https://issues.apache.org/jira/browse/SPARK-29106
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 3.0.0
>Reporter: huangtianhua
>Assignee: Shane Knapp
>Priority: Minor
> Attachments: R-ansible.yml, R-libs.txt, 
> SparkR-and-pyspark36-testing.txt, arm-python36.txt
>
>
> Add ARM test jobs to amplab Jenkins for Spark.
> So far we have made two periodic ARM test jobs for Spark in OpenLab: one is 
> based on master with Hadoop 2.7 (similar to the QA test of amplab Jenkins), 
> the other is based on a new branch we created on 09-09, see 
> http://status.openlabtesting.org/builds/job/spark-master-unit-test-hadoop-2.7-arm64 
> and 
> http://status.openlabtesting.org/builds/job/spark-unchanged-branch-unit-test-hadoop-2.7-arm64. 
> We only have to care about the first one when integrating the ARM test with 
> amplab Jenkins.
> About the k8s test on ARM, we have tested it, see 
> https://github.com/theopenlab/spark/pull/17; maybe we can integrate it 
> later. 
> We also plan to test other stable branches, and we can integrate them into 
> amplab when they are ready.
> We have offered an ARM instance and sent the info to Shane Knapp; thanks, 
> Shane, for adding the first ARM job to amplab Jenkins :) 
> The other important thing is the leveldbjni dependency 
> https://github.com/fusesource/leveldbjni: Spark depends on leveldbjni-all-1.8 
> https://mvnrepository.com/artifact/org.fusesource.leveldbjni/leveldbjni-all/1.8, 
> which has no arm64 support. So we built an arm64-supporting release of 
> leveldbjni, see 
> https://mvnrepository.com/artifact/org.openlabtesting.leveldbjni/leveldbjni-all/1.8, 
> but we can't modify the Spark pom.xml directly with something like a 
> 'property'/'profile' to choose the correct jar on the ARM or x86 platform, 
> because Spark depends on some Hadoop packages like hadoop-hdfs that depend 
> on leveldbjni-all-1.8 too, unless Hadoop releases a new leveldbjni jar with 
> ARM support. For now we download the leveldbjni-all-1.8 from openlabtesting 
> and 'mvn install' it when testing Spark on ARM.
> PS: The issues found and fixed:
>  SPARK-28770
>  [https://github.com/apache/spark/pull/25673]
>   
>  SPARK-28519
>  [https://github.com/apache/spark/pull/25279]
>   
>  SPARK-28433
>  [https://github.com/apache/spark/pull/25186]
>  
> SPARK-28467
> [https://github.com/apache/spark/pull/25864]
>  
> SPARK-29286
> [https://github.com/apache/spark/pull/26021]
>  
>  






[jira] [Updated] (SPARK-29458) Document scalar functions usage in APIs in SQL getting started.

2020-04-23 Thread Huaxin Gao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huaxin Gao updated SPARK-29458:
---
Affects Version/s: (was: 3.1.0)
   3.0.0

> Document scalar functions usage in APIs in SQL getting started.
> ---
>
> Key: SPARK-29458
> URL: https://issues.apache.org/jira/browse/SPARK-29458
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 3.0.0
>Reporter: Dilip Biswal
>Priority: Major
>







[jira] [Created] (SPARK-31535) Fix nested CTE substitution

2020-04-23 Thread Peter Toth (Jira)
Peter Toth created SPARK-31535:
--

 Summary: Fix nested CTE substitution
 Key: SPARK-31535
 URL: https://issues.apache.org/jira/browse/SPARK-31535
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.0.0
Reporter: Peter Toth


The following nested CTE should return an empty result instead of {{1}}, 
because the inner {{t}} shadows the outer one: the subquery should yield 
{{2}}, and {{c IN (2)}} should filter out the only row.

{noformat}
WITH t(c) AS (SELECT 1)
SELECT * FROM t
WHERE c IN (
  WITH t(c) AS (SELECT 2)
  SELECT * FROM t
)
{noformat}






[jira] [Commented] (SPARK-21529) Improve the error message for unsupported Uniontype

2020-04-23 Thread Sudharshann D. (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-21529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090862#comment-17090862
 ] 

Sudharshann D. commented on SPARK-21529:


If this issue hasn't been solved yet, may I please work on it? :)

> Improve the error message for unsupported Uniontype
> ---
>
> Key: SPARK-21529
> URL: https://issues.apache.org/jira/browse/SPARK-21529
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
> Environment: Qubole, DataBricks
>Reporter: Elliot West
>Priority: Major
>  Labels: hive, starter, uniontype
>
> We encounter errors when attempting to read Hive tables whose schema contains 
> the {{uniontype}}. It appears that Catalyst does not support the 
> {{uniontype}}, which renders these tables unreadable by Spark (2.1). Although 
> {{uniontype}} is arguably incomplete in the Hive query engine, it is fully 
> supported by the storage engine and also by the Avro data format, which we 
> use for these tables. Therefore, I believe it is a valid, usable type 
> construct that should be supported by Spark.
> We've attempted to read the table as follows:
> {code}
> spark.sql("select * from etl.tbl where acquisition_instant='20170706T133545Z' 
> limit 5").show
> val tblread = spark.read.table("etl.tbl")
> {code}
> But this always results in the same error message. The pertinent error 
> messages are as follows (full stack trace below):
> {code}
> org.apache.spark.SparkException: Cannot recognize hive type string: 
> uniontype ...
> Caused by: org.apache.spark.sql.catalyst.parser.ParseException: 
> mismatched input '<' expecting
> {<EOF>, '('}
> (line 1, pos 9)
> == SQL ==
> uniontype -^^^
> {code}
> h2. Full stack trace
> {code}
> org.apache.spark.SparkException: Cannot recognize hive type string: 
> uniontype>>,n:boolean,o:string,p:bigint,q:string>,struct,ag:boolean,ah:string,ai:bigint,aj:string>>
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.fromHiveColumn(HiveClientImpl.scala:800)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getTableOption$1$$anonfun$apply$11$$anonfun$7.apply(HiveClientImpl.scala:377)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getTableOption$1$$anonfun$apply$11$$anonfun$7.apply(HiveClientImpl.scala:377)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getTableOption$1$$anonfun$apply$11.apply(HiveClientImpl.scala:377)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getTableOption$1$$anonfun$apply$11.apply(HiveClientImpl.scala:373)
> at scala.Option.map(Option.scala:146)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getTableOption$1.apply(HiveClientImpl.scala:373)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getTableOption$1.apply(HiveClientImpl.scala:371)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:290)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:231)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:230)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:273)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl.getTableOption(HiveClientImpl.scala:371)
> at 
> org.apache.spark.sql.hive.client.HiveClient$class.getTable(HiveClient.scala:74)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl.getTable(HiveClientImpl.scala:79)
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$getRawTable$1.apply(HiveExternalCatalog.scala:118)
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$getRawTable$1.apply(HiveExternalCatalog.scala:118)
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog.org$apache$spark$sql$hive$HiveExternalCatalog$$getRawTable(HiveExternalCatalog.scala:117)
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getTable$1.apply(HiveExternalCatalog.scala:648)
> at 
> 

[jira] [Updated] (SPARK-31535) Fix nested CTE substitution

2020-04-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-31535:
--
Target Version/s: 3.0.0
Priority: Blocker  (was: Major)

> Fix nested CTE substitution
> ---
>
> Key: SPARK-31535
> URL: https://issues.apache.org/jira/browse/SPARK-31535
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Peter Toth
>Priority: Blocker
>  Labels: correctness
>
> The following nested CTE should return an empty result instead of {{1}}
> {noformat}
> WITH t(c) AS (SELECT 1)
> SELECT * FROM t
> WHERE c IN (
>   WITH t(c) AS (SELECT 2)
>   SELECT * FROM t
> )
> {noformat}






[jira] [Updated] (SPARK-31535) Fix nested CTE substitution

2020-04-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-31535:
--
Labels: correctness  (was: )

> Fix nested CTE substitution
> ---
>
> Key: SPARK-31535
> URL: https://issues.apache.org/jira/browse/SPARK-31535
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Peter Toth
>Priority: Major
>  Labels: correctness
>
> The following nested CTE should return an empty result instead of {{1}}
> {noformat}
> WITH t(c) AS (SELECT 1)
> SELECT * FROM t
> WHERE c IN (
>   WITH t(c) AS (SELECT 2)
>   SELECT * FROM t
> )
> {noformat}






[jira] [Commented] (SPARK-31535) Fix nested CTE substitution

2020-04-23 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090872#comment-17090872
 ] 

Dongjoon Hyun commented on SPARK-31535:
---

I added the `correctness` label and raised the priority to `Blocker` with 
`Target Version: 3.0.0`. Thanks, [~petertoth] and [~cloud_fan].

> Fix nested CTE substitution
> ---
>
> Key: SPARK-31535
> URL: https://issues.apache.org/jira/browse/SPARK-31535
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Peter Toth
>Priority: Blocker
>  Labels: correctness
>
> The following nested CTE should return an empty result instead of {{1}}
> {noformat}
> WITH t(c) AS (SELECT 1)
> SELECT * FROM t
> WHERE c IN (
>   WITH t(c) AS (SELECT 2)
>   SELECT * FROM t
> )
> {noformat}






[jira] [Created] (SPARK-31536) Backport SPARK-25407 Allow nested access for non-existent field for Parquet file when nested pruning is enabled

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31536:


 Summary: Backport SPARK-25407 Allow nested access for non-existent 
field for Parquet file when nested pruning is enabled
 Key: SPARK-31536
 URL: https://issues.apache.org/jira/browse/SPARK-31536
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.4.6
Reporter: Holden Karau


Consider backporting SPARK-25407 (Allow nested access for non-existent field 
for Parquet file when nested pruning is enabled) to 2.4.6.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31537) Backport SPARK-25559 Remove the unsupported predicates in Parquet when possible

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31537:


 Summary: Backport SPARK-25559  Remove the unsupported predicates 
in Parquet when possible
 Key: SPARK-31537
 URL: https://issues.apache.org/jira/browse/SPARK-31537
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.4.6
Reporter: Holden Karau
Assignee: DB Tsai
 Fix For: 2.4.6


Consider backporting SPARK-25559 (Remove the unsupported predicates in 
Parquet when possible) to 2.4.6.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31538) Backport SPARK-25338 Ensure to call super.beforeAll() and super.afterAll() in test cases

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31538:


 Summary: Backport SPARK-25338   Ensure to call 
super.beforeAll() and super.afterAll() in test cases
 Key: SPARK-31538
 URL: https://issues.apache.org/jira/browse/SPARK-31538
 Project: Spark
  Issue Type: Bug
  Components: Tests
Affects Versions: 2.4.6
Reporter: Holden Karau


Backport SPARK-25338       Ensure to call super.beforeAll() and 
super.afterAll() in test cases



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31540) Backport SPARK-27981 Remove `Illegal reflective access` warning for `java.nio.Bits.unaligned()` in JDK9+

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31540:


 Summary: Backport SPARK-27981   Remove `Illegal reflective 
access` warning for `java.nio.Bits.unaligned()` in JDK9+
 Key: SPARK-31540
 URL: https://issues.apache.org/jira/browse/SPARK-31540
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.4.6
Reporter: Holden Karau


SPARK-27981       Remove `Illegal reflective access` warning for 
`java.nio.Bits.unaligned()` in JDK9+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31539) Backport SPARK-27138 Remove AdminUtils calls (fixes deprecation)

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31539:


 Summary: Backport SPARK-27138   Remove AdminUtils calls (fixes 
deprecation)
 Key: SPARK-31539
 URL: https://issues.apache.org/jira/browse/SPARK-31539
 Project: Spark
  Issue Type: Improvement
  Components: Tests
Affects Versions: 2.4.6
Reporter: Holden Karau


SPARK-27138       Remove AdminUtils calls (fixes deprecation)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31541) Backport SPARK-26095 Disable parallelization in make-distribution.sh. (Avoid build hanging)

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31541:


 Summary: Backport SPARK-26095   Disable parallelization in 
make-distribution.sh. (Avoid build hanging)
 Key: SPARK-31541
 URL: https://issues.apache.org/jira/browse/SPARK-31541
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 2.4.6
Reporter: Holden Karau


Backport SPARK-26095       Disable parallelization in make-distribution.sh. 
(Avoid build hanging)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31542) Backport SPARK-25692 Remove static initialization of worker eventLoop handling chunk fetch requests within TransportContext. This fixes ChunkFetchIntegrationSuite

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31542:


 Summary: Backport SPARK-25692   Remove static initialization 
of worker eventLoop handling chunk fetch requests within TransportContext. This 
fixes ChunkFetchIntegrationSuite as well
 Key: SPARK-31542
 URL: https://issues.apache.org/jira/browse/SPARK-31542
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 2.4.6
Reporter: Holden Karau


Backport SPARK-25692       Remove static initialization of worker eventLoop 
handling chunk fetch requests within TransportContext. This fixes 
ChunkFetchIntegrationSuite as well.

While the test was only flaky in the 3.0 branch, it seems possible the same 
code path could be triggered in 2.4, so consider it for backport.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31543) Backport SPARK-26306 More memory to de-flake SorterSuite

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31543:


 Summary: Backport SPARK-26306   More memory to de-flake 
SorterSuite
 Key: SPARK-31543
 URL: https://issues.apache.org/jira/browse/SPARK-31543
 Project: Spark
  Issue Type: Bug
  Components: Tests
Affects Versions: 2.4.6
Reporter: Holden Karau


SPARK-26306       More memory to de-flake SorterSuite



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31544) Backport SPARK-30199 Recover `spark.(ui|blockManager).port` from checkpoint

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31544:


 Summary: Backport SPARK-30199   Recover 
`spark.(ui|blockManager).port` from checkpoint
 Key: SPARK-31544
 URL: https://issues.apache.org/jira/browse/SPARK-31544
 Project: Spark
  Issue Type: Bug
  Components: DStreams
Affects Versions: 2.4.6
Reporter: Holden Karau


Backport SPARK-30199       Recover `spark.(ui|blockManager).port` from 
checkpoint

cc [~dongjoon] in case you think this is a good candidate
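
For context, a minimal sketch (app name, source, and paths hypothetical) of the 
DStreams restart scenario this backport targets: ports set in the new SparkConf 
should take effect after recovery from a checkpoint instead of the stale 
checkpointed values.

{code:java}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

def createContext(): StreamingContext = {
  val conf = new SparkConf()
    .setMaster("local[2]")
    .setAppName("checkpoint-port-demo")           // hypothetical app
    .set("spark.ui.port", "4050")                 // should apply after recovery
    .set("spark.blockManager.port", "42000")      // likewise
  val ssc = new StreamingContext(conf, Seconds(10))
  ssc.checkpoint("/tmp/checkpoint-demo")          // hypothetical path
  ssc.socketTextStream("localhost", 9999).print() // hypothetical source
  ssc
}

// On restart, getOrCreate recovers from the checkpoint; without the fix, the
// checkpointed port values win over the ones set in createContext.
val ssc = StreamingContext.getOrCreate("/tmp/checkpoint-demo", createContext _)
ssc.start()
ssc.awaitTermination()
{code}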



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-31545) Backport SPARK-27676 InMemoryFileIndex should respect spark.sql.files.ignoreMissingFiles

2020-04-23 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-31545:
-
Component/s: (was: DStreams)
 SQL

> Backport SPARK-27676   InMemoryFileIndex should respect 
> spark.sql.files.ignoreMissingFiles
> --
>
> Key: SPARK-31545
> URL: https://issues.apache.org/jira/browse/SPARK-31545
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-27676       InMemoryFileIndex should respect 
> spark.sql.files.ignoreMissingFiles
> cc [~joshrosen] I think backporting this was requested in the original 
> ticket; do you have any objections?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31545) Backport SPARK-27676 InMemoryFileIndex should respect spark.sql.files.ignoreMissingFiles

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31545:


 Summary: Backport SPARK-27676   InMemoryFileIndex should 
respect spark.sql.files.ignoreMissingFiles
 Key: SPARK-31545
 URL: https://issues.apache.org/jira/browse/SPARK-31545
 Project: Spark
  Issue Type: Bug
  Components: DStreams
Affects Versions: 2.4.6
Reporter: Holden Karau


Backport SPARK-27676       InMemoryFileIndex should respect 
spark.sql.files.ignoreMissingFiles

cc [~joshrosen] I think backporting this was requested in the original ticket; 
do you have any objections?
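
A minimal sketch (path hypothetical) of the behavior in question; with the 
backport, InMemoryFileIndex honors the flag and skips files that disappear 
while the index lists directories, instead of failing:

{code:java}
spark.conf.set("spark.sql.files.ignoreMissingFiles", "true")
// Files deleted under /data/table/ while the file index is listing them
// should be skipped rather than fail the query.
val df = spark.read.parquet("/data/table/")
df.count()
{code}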



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-31542) Backport SPARK-25692 Remove static initialization of worker eventLoop handling chunk fetch requests within TransportContext. This fixes ChunkFetchIntegrationSuite

2020-04-23 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-31542:
-
Component/s: (was: Build)
 Spark Core

> Backport SPARK-25692   Remove static initialization of worker eventLoop 
> handling chunk fetch requests within TransportContext. This fixes 
> ChunkFetchIntegrationSuite as well
> 
>
> Key: SPARK-31542
> URL: https://issues.apache.org/jira/browse/SPARK-31542
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-25692       Remove static initialization of worker eventLoop 
> handling chunk fetch requests within TransportContext. This fixes 
> ChunkFetchIntegrationSuite as well.
> While the test was only flaky in the 3.0 branch, it seems possible the same 
> code path could be triggered in 2.4, so consider it for backport.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31546) Backport SPARK-25595 Ignore corrupt Avro file if flag IGNORE_CORRUPT_FILES enabled

2020-04-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-31546:


 Summary: Backport SPARK-25595   Ignore corrupt Avro file if 
flag IGNORE_CORRUPT_FILES enabled
 Key: SPARK-31546
 URL: https://issues.apache.org/jira/browse/SPARK-31546
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.4.6
Reporter: Holden Karau


Backport SPARK-25595       Ignore corrupt Avro file if flag 
IGNORE_CORRUPT_FILES enabled

cc [~Gengliang.Wang] & [~hyukjin.kwon] for comments
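
For context, a minimal sketch (path hypothetical; assumes the external 
spark-avro package on 2.4.x) of the behavior the flag controls; with the 
backport, enabling it lets Avro reads skip corrupt files instead of failing 
the job:

{code:java}
spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")
val df = spark.read.format("avro").load("/data/events/")  // corrupt files skipped
df.count()
{code}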



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31543) Backport SPARK-26306 More memory to de-flake SorterSuite

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090901#comment-17090901
 ] 

Holden Karau commented on SPARK-31543:
--

cc [~gsomogyi] & [~srowen] for thoughts

> Backport SPARK-26306   More memory to de-flake SorterSuite
> --
>
> Key: SPARK-31543
> URL: https://issues.apache.org/jira/browse/SPARK-31543
> Project: Spark
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-26306       More memory to de-flake SorterSuite



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31537) Backport SPARK-25559 Remove the unsupported predicates in Parquet when possible

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090900#comment-17090900
 ] 

Holden Karau commented on SPARK-31537:
--

cc [~dbtsai] for thoughts

> Backport SPARK-25559  Remove the unsupported predicates in Parquet when 
> possible
> 
>
> Key: SPARK-31537
> URL: https://issues.apache.org/jira/browse/SPARK-31537
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Assignee: DB Tsai
>Priority: Major
> Fix For: 2.4.6
>
>
> Consider backporting SPARK-25559 (Remove the unsupported predicates in 
> Parquet when possible) to 2.4.6.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31544) Backport SPARK-30199 Recover `spark.(ui|blockManager).port` from checkpoint

2020-04-23 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090903#comment-17090903
 ] 

Dongjoon Hyun commented on SPARK-31544:
---

Got it. I'll make a PR.

> Backport SPARK-30199   Recover `spark.(ui|blockManager).port` from 
> checkpoint
> -
>
> Key: SPARK-31544
> URL: https://issues.apache.org/jira/browse/SPARK-31544
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-30199       Recover `spark.(ui|blockManager).port` from 
> checkpoint
> cc [~dongjoon] in case you think this is a good candidate



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-31544) Backport SPARK-30199 Recover `spark.(ui|blockManager).port` from checkpoint

2020-04-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-31544:
-

Assignee: Dongjoon Hyun

> Backport SPARK-30199   Recover `spark.(ui|blockManager).port` from 
> checkpoint
> -
>
> Key: SPARK-31544
> URL: https://issues.apache.org/jira/browse/SPARK-31544
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Assignee: Dongjoon Hyun
>Priority: Major
>
> Backport SPARK-30199       Recover `spark.(ui|blockManager).port` from 
> checkpoint
> cc [~dongjoon] in case you think this is a good candidate



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31542) Backport SPARK-25692 Remove static initialization of worker eventLoop handling chunk fetch requests within TransportContext. This fixes ChunkFetchIntegrationSuite

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090902#comment-17090902
 ] 

Holden Karau commented on SPARK-31542:
--

cc [~sanket991] & [~zsxwing] for thoughts

> Backport SPARK-25692   Remove static initialization of worker eventLoop 
> handling chunk fetch requests within TransportContext. This fixes 
> ChunkFetchIntegrationSuite as well
> 
>
> Key: SPARK-31542
> URL: https://issues.apache.org/jira/browse/SPARK-31542
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-25692       Remove static initialization of worker eventLoop 
> handling chunk fetch requests within TransportContext. This fixes 
> ChunkFetchIntegrationSuite as well.
> While the test was only flaky in the 3.0 branch, it seems possible the same 
> code path could be triggered in 2.4, so consider it for backport.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31541) Backport SPARK-26095 Disable parallelization in make-distribution.sh. (Avoid build hanging)

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090904#comment-17090904
 ] 

Holden Karau commented on SPARK-31541:
--

cc [~vanzin] for thoughts

> Backport SPARK-26095   Disable parallelization in make-distribution.sh. 
> (Avoid build hanging)
> 
>
> Key: SPARK-31541
> URL: https://issues.apache.org/jira/browse/SPARK-31541
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-26095       Disable parallelization in make-distribution.sh. 
> (Avoid build hanging)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31540) Backport SPARK-27981 Remove `Illegal reflective access` warning for `java.nio.Bits.unaligned()` in JDK9+

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090905#comment-17090905
 ] 

Holden Karau commented on SPARK-31540:
--

cc [~dongjoon] for thoughts

> Backport SPARK-27981   Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+
> --
>
> Key: SPARK-31540
> URL: https://issues.apache.org/jira/browse/SPARK-31540
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27981       Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31539) Backport SPARK-27138 Remove AdminUtils calls (fixes deprecation)

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090906#comment-17090906
 ] 

Holden Karau commented on SPARK-31539:
--

cc [~DylanGuedes] & [~srowen] for thoughts

> Backport SPARK-27138   Remove AdminUtils calls (fixes deprecation)
> --
>
> Key: SPARK-31539
> URL: https://issues.apache.org/jira/browse/SPARK-31539
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27138       Remove AdminUtils calls (fixes deprecation)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31538) Backport SPARK-25338 Ensure to call super.beforeAll() and super.afterAll() in test cases

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090907#comment-17090907
 ] 

Holden Karau commented on SPARK-31538:
--

cc [~kiszk] for thoughts

> Backport SPARK-25338   Ensure to call super.beforeAll() and 
> super.afterAll() in test cases
> --
>
> Key: SPARK-31538
> URL: https://issues.apache.org/jira/browse/SPARK-31538
> Project: Spark
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-25338       Ensure to call super.beforeAll() and 
> super.afterAll() in test cases



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31536) Backport SPARK-25407 Allow nested access for non-existent field for Parquet file when nested pruning is enabled

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090908#comment-17090908
 ] 

Holden Karau commented on SPARK-31536:
--

cc [~michael] & [~hyukjin.kwon] for thoughts

> Backport SPARK-25407   Allow nested access for non-existent field for 
> Parquet file when nested pruning is enabled
> -
>
> Key: SPARK-31536
> URL: https://issues.apache.org/jira/browse/SPARK-31536
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Consider backporting SPARK-25407 (Allow nested access for non-existent field 
> for Parquet file when nested pruning is enabled) to 2.4.6.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31543) Backport SPARK-26306 More memory to de-flake SorterSuite

2020-04-23 Thread Sean R. Owen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090911#comment-17090911
 ] 

Sean R. Owen commented on SPARK-31543:
--

Does it need to be a new JIRA? But if it's a simple test change, I think it's 
plausible to back-port if it affects 2.4.x.

> Backport SPARK-26306   More memory to de-flake SorterSuite
> --
>
> Key: SPARK-31543
> URL: https://issues.apache.org/jira/browse/SPARK-31543
> Project: Spark
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-26306       More memory to de-flake SorterSuite



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31539) Backport SPARK-27138 Remove AdminUtils calls (fixes deprecation)

2020-04-23 Thread Sean R. Owen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090910#comment-17090910
 ] 

Sean R. Owen commented on SPARK-31539:
--

Is this deprecated w.r.t. the version used in 2.4.x? Is it important to fix? If 
it's just fixing something deprecated, I think I'd not touch it, all else being 
equal. Why would this be a new JIRA?

> Backport SPARK-27138   Remove AdminUtils calls (fixes deprecation)
> --
>
> Key: SPARK-31539
> URL: https://issues.apache.org/jira/browse/SPARK-31539
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27138       Remove AdminUtils calls (fixes deprecation)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-31485) Barrier stage can hang if only partial tasks launched

2020-04-23 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-31485:
-
Target Version/s: 2.4.6, 3.0.0

> Barrier stage can hang if only partial tasks launched
> -
>
> Key: SPARK-31485
> URL: https://issues.apache.org/jira/browse/SPARK-31485
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: wuyi
>Priority: Major
>
> The issue can be reproduced by the following test:
>  
> {code:java}
> initLocalClusterSparkContext(2)
> val rdd0 = sc.parallelize(Seq(0, 1, 2, 3), 2)
> val dep = new OneToOneDependency[Int](rdd0)
> val rdd = new MyRDD(sc, 2, List(dep), 
> Seq(Seq("executor_h_0"),Seq("executor_h_0")))
> rdd.barrier().mapPartitions { iter =>
>   BarrierTaskContext.get().barrier()
>   iter
> }.collect()
> {code}
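> 
> Both partitions declare the same preferred location ({{executor_h_0}}, a 
> test-only placement), so the scheduler can end up launching only one of the 
> two barrier tasks; the launched task then blocks in {{barrier()}} forever, 
> waiting for the task that was never launched.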
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-31485) Barrier stage can hang if only partial tasks launched

2020-04-23 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-31485:
-
Shepherd: Holden Karau

> Barrier stage can hang if only partial tasks launched
> -
>
> Key: SPARK-31485
> URL: https://issues.apache.org/jira/browse/SPARK-31485
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: wuyi
>Priority: Major
>
> The issue can be reproduced by the following test:
>  
> {code:java}
> initLocalClusterSparkContext(2)
> val rdd0 = sc.parallelize(Seq(0, 1, 2, 3), 2)
> val dep = new OneToOneDependency[Int](rdd0)
> val rdd = new MyRDD(sc, 2, List(dep), 
> Seq(Seq("executor_h_0"),Seq("executor_h_0")))
> rdd.barrier().mapPartitions { iter =>
>   BarrierTaskContext.get().barrier()
>   iter
> }.collect()
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31539) Backport SPARK-27138 Remove AdminUtils calls (fixes deprecation)

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090914#comment-17090914
 ] 

Holden Karau commented on SPARK-31539:
--

The other Jira is marked as resolved, and I want to track the outstanding 
issues for 2.4.6 to make sure we don't leave anything behind.

> Backport SPARK-27138   Remove AdminUtils calls (fixes deprecation)
> --
>
> Key: SPARK-31539
> URL: https://issues.apache.org/jira/browse/SPARK-31539
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27138       Remove AdminUtils calls (fixes deprecation)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31539) Backport SPARK-27138 Remove AdminUtils calls (fixes deprecation)

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090915#comment-17090915
 ] 

Holden Karau commented on SPARK-31539:
--

The main reason I'd see to backport the change is that it's test-only, and it 
might be useful if someone wants to build & test with a newer Kafka library. 
But now that I think about it some more, it's probably not worth it; I'll close 
as Won't Fix.

> Backport SPARK-27138   Remove AdminUtils calls (fixes deprecation)
> --
>
> Key: SPARK-31539
> URL: https://issues.apache.org/jira/browse/SPARK-31539
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27138       Remove AdminUtils calls (fixes deprecation)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-31539) Backport SPARK-27138 Remove AdminUtils calls (fixes deprecation)

2020-04-23 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau resolved SPARK-31539.
--
Resolution: Won't Fix

> Backport SPARK-27138   Remove AdminUtils calls (fixes deprecation)
> --
>
> Key: SPARK-31539
> URL: https://issues.apache.org/jira/browse/SPARK-31539
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27138       Remove AdminUtils calls (fixes deprecation)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31544) Backport SPARK-30199 Recover `spark.(ui|blockManager).port` from checkpoint

2020-04-23 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090917#comment-17090917
 ] 

Dongjoon Hyun commented on SPARK-31544:
---

BTW, I kept the original authorship from the beginning. This will be the same 
for `branch-2.4`.

> Backport SPARK-30199   Recover `spark.(ui|blockManager).port` from 
> checkpoint
> -
>
> Key: SPARK-31544
> URL: https://issues.apache.org/jira/browse/SPARK-31544
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Assignee: Dongjoon Hyun
>Priority: Major
>
> Backport SPARK-30199       Recover `spark.(ui|blockManager).port` from 
> checkpoint
> cc [~dongjoon] in case you think this is a good candidate



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31540) Backport SPARK-27981 Remove `Illegal reflective access` warning for `java.nio.Bits.unaligned()` in JDK9+

2020-04-23 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090921#comment-17090921
 ] 

Dongjoon Hyun commented on SPARK-31540:
---

[~holden], could you also link the original JIRA? Embedding it in the 
description doesn't provide bi-directional visibility.

> Backport SPARK-27981   Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+
> --
>
> Key: SPARK-31540
> URL: https://issues.apache.org/jira/browse/SPARK-31540
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27981       Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31540) Backport SPARK-27981 Remove `Illegal reflective access` warning for `java.nio.Bits.unaligned()` in JDK9+

2020-04-23 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090922#comment-17090922
 ] 

Dongjoon Hyun commented on SPARK-31540:
---

cc [~srowen]

> Backport SPARK-27981   Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+
> --
>
> Key: SPARK-31540
> URL: https://issues.apache.org/jira/browse/SPARK-31540
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27981       Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31540) Backport SPARK-27981 Remove `Illegal reflective access` warning for `java.nio.Bits.unaligned()` in JDK9+

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090924#comment-17090924
 ] 

Holden Karau commented on SPARK-31540:
--

Gotcha, I'll go through the backport JIRAs and link them.

> Backport SPARK-27981   Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+
> --
>
> Key: SPARK-31540
> URL: https://issues.apache.org/jira/browse/SPARK-31540
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27981       Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31544) Backport SPARK-30199 Recover `spark.(ui|blockManager).port` from checkpoint

2020-04-23 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090926#comment-17090926
 ] 

Dongjoon Hyun commented on SPARK-31544:
---

I made a PR, [~holden].
- https://github.com/apache/spark/pull/28320

> Backport SPARK-30199   Recover `spark.(ui|blockManager).port` from 
> checkpoint
> -
>
> Key: SPARK-31544
> URL: https://issues.apache.org/jira/browse/SPARK-31544
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Assignee: Dongjoon Hyun
>Priority: Major
>
> Backport SPARK-30199       Recover `spark.(ui|blockManager).port` from 
> checkpoint
> cc [~dongjoon] in case you think this is a good candidate



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31544) Backport SPARK-30199 Recover `spark.(ui|blockManager).port` from checkpoint

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090928#comment-17090928
 ] 

Holden Karau commented on SPARK-31544:
--

Thanks!

> Backport SPARK-30199   Recover `spark.(ui|blockManager).port` from 
> checkpoint
> -
>
> Key: SPARK-31544
> URL: https://issues.apache.org/jira/browse/SPARK-31544
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Assignee: Dongjoon Hyun
>Priority: Major
>
> Backport SPARK-30199       Recover `spark.(ui|blockManager).port` from 
> checkpoint
> cc [~dongjoon] in case you think this is a good candidate



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31540) Backport SPARK-27981 Remove `Illegal reflective access` warning for `java.nio.Bits.unaligned()` in JDK9+

2020-04-23 Thread Sean R. Owen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090944#comment-17090944
 ] 

Sean R. Owen commented on SPARK-31540:
--

The backport is probably harmless, but why is it needed for 2.4.x? This helps 
JDK 11 compatibility, but 2.4 won't work with JDK 11.

> Backport SPARK-27981   Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+
> --
>
> Key: SPARK-31540
> URL: https://issues.apache.org/jira/browse/SPARK-31540
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27981       Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31540) Backport SPARK-27981 Remove `Illegal reflective access` warning for `java.nio.Bits.unaligned()` in JDK9+

2020-04-23 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090954#comment-17090954
 ] 

Holden Karau commented on SPARK-31540:
--

I was thinking some folks might build 2.4 with newer JDKs.

> Backport SPARK-27981   Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+
> --
>
> Key: SPARK-31540
> URL: https://issues.apache.org/jira/browse/SPARK-31540
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27981       Remove `Illegal reflective access` warning for 
> `java.nio.Bits.unaligned()` in JDK9+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25075) Build and test Spark against Scala 2.13

2020-04-23 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-25075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090996#comment-17090996
 ] 

Dongjoon Hyun commented on SPARK-25075:
---

Thank you for the updates, [~smarter].

> Build and test Spark against Scala 2.13
> ---
>
> Key: SPARK-25075
> URL: https://issues.apache.org/jira/browse/SPARK-25075
> Project: Spark
>  Issue Type: Umbrella
>  Components: Build, MLlib, Project Infra, Spark Core, SQL
>Affects Versions: 3.0.0
>Reporter: Guillaume Massé
>Priority: Major
>
> This umbrella JIRA tracks the requirements for building and testing Spark 
> against the current Scala 2.13 milestone.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31547) Upgrade Genjavadoc to 0.16

2020-04-23 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-31547:
-

 Summary: Upgrade Genjavadoc to 0.16
 Key: SPARK-31547
 URL: https://issues.apache.org/jira/browse/SPARK-31547
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.1.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27891) Long running spark jobs fail because of HDFS delegation token expires

2020-04-23 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved SPARK-27891.
--
Resolution: Cannot Reproduce

SPARK-23361 is in Spark 2.4.0, and the fix is not going to be backported to 
2.3.x as 2.3.x is EOL. Please reopen if anyone encounters this in 2.4.x.

> Long running spark jobs fail because of HDFS delegation token expires
> -
>
> Key: SPARK-27891
> URL: https://issues.apache.org/jira/browse/SPARK-27891
> Project: Spark
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 2.0.1, 2.1.0, 2.3.1, 2.4.1
>Reporter: hemshankar sahu
>Priority: Critical
> Attachments: application_1559242207407_0001.log, 
> spark_2.3.1_failure.log
>
>
> When the Spark job runs on a secured cluster for longer than the time 
> mentioned in the dfs.namenode.delegation.token.renew-interval property of 
> hdfs-site.xml, the Spark job fails.
> The following command was used to submit the Spark job:
> bin/spark-submit --principal acekrbuser --keytab ~/keytabs/acekrbuser.keytab 
> --master yarn --deploy-mode cluster examples/src/main/python/wordcount.py 
> /tmp/ff1.txt
>  
> Application Logs attached
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31464) Upgrade Kafka to 2.5.0

2020-04-23 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091051#comment-17091051
 ] 

Dongjoon Hyun commented on SPARK-31464:
---

Thank you, [~ijuma]! :)

> Upgrade Kafka to 2.5.0
> --
>
> Key: SPARK-31464
> URL: https://issues.apache.org/jira/browse/SPARK-31464
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Structured Streaming
>Affects Versions: 3.1.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 3.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-31542) Backport SPARK-25692 Remove static initialization of worker eventLoop handling chunk fetch requests within TransportContext. This fixes ChunkFetchIntegrationSuite

2020-04-23 Thread Shixiong Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu resolved SPARK-31542.
--
Resolution: Not A Problem

> Backport SPARK-25692   Remove static initialization of worker eventLoop 
> handling chunk fetch requests within TransportContext. This fixes 
> ChunkFetchIntegrationSuite as well
> 
>
> Key: SPARK-31542
> URL: https://issues.apache.org/jira/browse/SPARK-31542
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-25692       Remove static initialization of worker eventLoop 
> handling chunk fetch requests within TransportContext. This fixes 
> ChunkFetchIntegrationSuite as well.
> While the test was only flaky in the 3.0 branch, it seems possible the same 
> code path could be triggered in 2.4, so consider it for backport.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31542) Backport SPARK-25692 Remove static initialization of worker eventLoop handling chunk fetch requests within TransportContext. This fixes ChunkFetchIntegrationSuite

2020-04-23 Thread Shixiong Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091060#comment-17091060
 ] 

Shixiong Zhu commented on SPARK-31542:
--

[~holden] The flaky test was caused by a new improvement in 3.0: SPARK-24355. 
It doesn't impact branch-2.4.

> Backport SPARK-25692   Remove static initialization of worker eventLoop 
> handling chunk fetch requests within TransportContext. This fixes 
> ChunkFetchIntegrationSuite as well
> 
>
> Key: SPARK-31542
> URL: https://issues.apache.org/jira/browse/SPARK-31542
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-25692       Remove static initialization of worker eventLoop 
> handling chunk fetch requests within TransportContext. This fixes 
> ChunkFetchIntegrationSuite as well.
> While the test was only flaky in the 3.0 branch, it seems possible the same 
> code path could be triggered in 2.4, so consider it for backport.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31532) SparkSessionBuilder should not propagate static sql configurations to the existing active/default SparkSession

2020-04-23 Thread JinxinTang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091067#comment-17091067
 ] 

JinxinTang commented on SPARK-31532:


Thanks for your issue. The following configurations cannot be modified after 
SparkSession startup, by design:

[spark.sql.codegen.comments, spark.sql.queryExecutionListeners, 
spark.sql.catalogImplementation, spark.sql.subquery.maxThreadThreshold, 
spark.sql.globalTempDatabase, spark.sql.codegen.cache.maxEntries, 
spark.sql.filesourceTableRelationCacheSize, 
spark.sql.streaming.streamingQueryListeners, spark.sql.ui.retainedExecutions, 
spark.sql.hive.thriftServer.singleSession, spark.sql.extensions, 
spark.sql.debug, spark.sql.sources.schemaStringLengthThreshold, 
spark.sql.warehouse.dir] 
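
For reference, a minimal sketch of the intended usage (warehouse path 
hypothetical): a static conf only takes effect when it is set before the first 
SparkSession is created.

{code:java}
import org.apache.spark.sql.SparkSession

// Static confs such as spark.sql.warehouse.dir must be set at creation time;
// setting them via the builder against an existing session should have no
// effect (the bug reported here is that the value leaks into the existing
// session anyway).
val spark = SparkSession.builder
  .master("local[*]")
  .config("spark.sql.warehouse.dir", "/tmp/my-warehouse")
  .getOrCreate()
{code}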

> SparkSessionBuilder should not propagate static sql configurations to the 
> existing active/default SparkSession
> -
>
> Key: SPARK-31532
> URL: https://issues.apache.org/jira/browse/SPARK-31532
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.5, 3.0.0, 3.1.0
>Reporter: Kent Yao
>Priority: Major
>
> Clearly, this is a bug.
> {code:java}
> scala> spark.sql("set spark.sql.warehouse.dir").show
> +++
> | key|   value|
> +++
> |spark.sql.warehou...|file:/Users/kenty...|
> +++
> scala> spark.sql("set spark.sql.warehouse.dir=2");
> org.apache.spark.sql.AnalysisException: Cannot modify the value of a static 
> config: spark.sql.warehouse.dir;
>   at 
> org.apache.spark.sql.RuntimeConfig.requireNonStaticConf(RuntimeConfig.scala:154)
>   at org.apache.spark.sql.RuntimeConfig.set(RuntimeConfig.scala:42)
>   at 
> org.apache.spark.sql.execution.command.SetCommand.$anonfun$x$7$6(SetCommand.scala:100)
>   at 
> org.apache.spark.sql.execution.command.SetCommand.run(SetCommand.scala:156)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:229)
>   at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3644)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3642)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:229)
>   at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:607)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:602)
>   ... 47 elided
> scala> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.SparkSession
> scala> SparkSession.builder.config("spark.sql.warehouse.dir", "xyz").get
> getClass   getOrCreate
> scala> SparkSession.builder.config("spark.sql.warehouse.dir", 
> "xyz").getOrCreate
> 20/04/23 23:49:13 WARN SparkSession$Builder: Using an existing SparkSession; 
> some configuration may not take effect.
> res7: org.apache.spark.sql.SparkSession = 
> org.apache.spark.sql.SparkSession@6403d574
> scala> spark.sql("set spark.sql.warehouse.dir").show
> ++-+
> | key|value|
> ++-+
> |spark.sql.warehou...|  xyz|
> ++-+
> scala>
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-31532) SparkSessionBuilder should not propagate static sql configurations to the existing active/default SparkSession

2020-04-23 Thread JinxinTang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091067#comment-17091067
 ] 

JinxinTang edited comment on SPARK-31532 at 4/24/20, 1:08 AM:
--

Thanks for your issue. The following configurations may not be modified after 
SparkSession startup, by design:

[spark.sql.codegen.comments, spark.sql.queryExecutionListeners, 
spark.sql.catalogImplementation, spark.sql.subquery.maxThreadThreshold, 
spark.sql.globalTempDatabase, spark.sql.codegen.cache.maxEntries, 
spark.sql.filesourceTableRelationCacheSize, 
spark.sql.streaming.streamingQueryListeners, spark.sql.ui.retainedExecutions, 
spark.sql.hive.thriftServer.singleSession, spark.sql.extensions, 
spark.sql.debug, spark.sql.sources.schemaStringLengthThreshold, 
spark.sql.warehouse.dir] 

So it might not be a bug.


was (Author: jinxintang):
Thanks for your issue. The following configurations cannot be modified after 
SparkSession startup, by design:

[spark.sql.codegen.comments, spark.sql.queryExecutionListeners, 
spark.sql.catalogImplementation, spark.sql.subquery.maxThreadThreshold, 
spark.sql.globalTempDatabase, spark.sql.codegen.cache.maxEntries, 
spark.sql.filesourceTableRelationCacheSize, 
spark.sql.streaming.streamingQueryListeners, spark.sql.ui.retainedExecutions, 
spark.sql.hive.thriftServer.singleSession, spark.sql.extensions, 
spark.sql.debug, spark.sql.sources.schemaStringLengthThreshold, 
spark.sql.warehouse.dir] 

> SparkSessionBuilder should not propagate static sql configurations to the 
> existing active/default SparkSession
> -
>
> Key: SPARK-31532
> URL: https://issues.apache.org/jira/browse/SPARK-31532
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.5, 3.0.0, 3.1.0
>Reporter: Kent Yao
>Priority: Major
>
> Clearly, this is a bug.
> {code:java}
> scala> spark.sql("set spark.sql.warehouse.dir").show
> +++
> | key|   value|
> +++
> |spark.sql.warehou...|file:/Users/kenty...|
> +++
> scala> spark.sql("set spark.sql.warehouse.dir=2");
> org.apache.spark.sql.AnalysisException: Cannot modify the value of a static 
> config: spark.sql.warehouse.dir;
>   at 
> org.apache.spark.sql.RuntimeConfig.requireNonStaticConf(RuntimeConfig.scala:154)
>   at org.apache.spark.sql.RuntimeConfig.set(RuntimeConfig.scala:42)
>   at 
> org.apache.spark.sql.execution.command.SetCommand.$anonfun$x$7$6(SetCommand.scala:100)
>   at 
> org.apache.spark.sql.execution.command.SetCommand.run(SetCommand.scala:156)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:229)
>   at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3644)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3642)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:229)
>   at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:607)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:602)
>   ... 47 elided
> scala> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.SparkSession
> scala> SparkSession.builder.config("spark.sql.warehouse.dir", "xyz").get
> getClass   getOrCreate
> scala> SparkSession.builder.config("spark.sql.warehouse.dir", 
> "xyz").getOrCreate
> 20/04/23 23:49:13 WARN SparkSession$Builder: Using an existing SparkSession; 
> some configuration may not take effect.
> res7: org.apache.spark.sql.SparkSession = 
> org.apache.spark.sql.SparkSession@6403d574
> scala> spark.sql("set spark.sql

[jira] [Commented] (SPARK-26385) YARN - Spark Stateful Structured streaming HDFS_DELEGATION_TOKEN not found in cache

2020-04-23 Thread Jungtaek Lim (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091075#comment-17091075
 ] 

Jungtaek Lim commented on SPARK-26385:
--

The symptoms are mixed up. Please clarify where the exception occurs (driver, 
AM, executor, somewhere else?), which mode you use, and which configuration you 
used to try to mitigate it.

Please file a new issue with the above information per case. Adding comments 
about different cases here might emphasize the importance of the issue, but it 
is not helpful for investigating the issue. Please also note that we need the 
driver / AM / executor logs, because we should check the interaction among them 
(how delegation tokens were passed).

> YARN - Spark Stateful Structured streaming HDFS_DELEGATION_TOKEN not found in 
> cache
> ---
>
> Key: SPARK-26385
> URL: https://issues.apache.org/jira/browse/SPARK-26385
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 2.4.0
> Environment: Hadoop 2.6.0, Spark 2.4.0
>Reporter: T M
>Priority: Major
>
>  
> Hello,
>  
> I have Spark Structured Streaming job which is runnning on YARN(Hadoop 2.6.0, 
> Spark 2.4.0). After 25-26 hours, my job stops working with following error:
> {code:java}
> 2018-12-16 22:35:17 ERROR 
> org.apache.spark.internal.Logging$class.logError(Logging.scala:91): Query 
> TestQuery[id = a61ce197-1d1b-4e82-a7af-60162953488b, runId = 
> a56878cf-dfc7-4f6a-ad48-02cf738ccc2f] terminated with error 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for REMOVED: HDFS_DELEGATION_TOKEN owner=REMOVED, renewer=yarn, 
> realUser=, issueDate=1544903057122, maxDate=1545507857122, 
> sequenceNumber=10314, masterKeyId=344) can't be found in cache at 
> org.apache.hadoop.ipc.Client.call(Client.java:1470) at 
> org.apache.hadoop.ipc.Client.call(Client.java:1401) at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>  at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source) at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
>  at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>  at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>  at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) at 
> org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1977) at 
> org.apache.hadoop.fs.Hdfs.getFileStatus(Hdfs.java:133) at 
> org.apache.hadoop.fs.FileContext$14.next(FileContext.java:1120) at 
> org.apache.hadoop.fs.FileContext$14.next(FileContext.java:1116) at 
> org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90) at 
> org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1116) at 
> org.apache.hadoop.fs.FileContext$Util.exists(FileContext.java:1581) at 
> org.apache.spark.sql.execution.streaming.FileContextBasedCheckpointFileManager.exists(CheckpointFileManager.scala:326)
>  at 
> org.apache.spark.sql.execution.streaming.HDFSMetadataLog.get(HDFSMetadataLog.scala:142)
>  at 
> org.apache.spark.sql.execution.streaming.HDFSMetadataLog.add(HDFSMetadataLog.scala:110)
>  at 
> org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$1.apply$mcV$sp(MicroBatchExecution.scala:544)
>  at 
> org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$1.apply(MicroBatchExecution.scala:542)
>  at 
> org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$1.apply(MicroBatchExecution.scala:542)
>  at 
> org.apache.spark.sql.execution.streaming.MicroBatchExecution.withProgressLocked(MicroBatchExecution.scala:554)
>  at 
> org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch(MicroBatchExecution.scala:542)
>  at 
> org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:198)
>  at 
> org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166)
>  at 
> org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActi

[jira] [Assigned] (SPARK-31526) Add a new test suite for ExpressionInfo

2020-04-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-31526:


Assignee: Takeshi Yamamuro

> Add a new test suite for ExpressionInfo
> ---
>
> Key: SPARK-31526
> URL: https://issues.apache.org/jira/browse/SPARK-31526
> Project: Spark
>  Issue Type: Test
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Takeshi Yamamuro
>Assignee: Takeshi Yamamuro
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-31526) Add a new test suite for ExpressionInfo

2020-04-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-31526.
--
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 28308
[https://github.com/apache/spark/pull/28308]

> Add a new test suite for ExpressionInfo
> ---
>
> Key: SPARK-31526
> URL: https://issues.apache.org/jira/browse/SPARK-31526
> Project: Spark
>  Issue Type: Test
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Takeshi Yamamuro
>Assignee: Takeshi Yamamuro
>Priority: Minor
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-31488) Support `java.time.LocalDate` in Parquet filter pushdown

2020-04-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-31488.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 28259
[https://github.com/apache/spark/pull/28259]

> Support `java.time.LocalDate` in Parquet filter pushdown
> 
>
> Key: SPARK-31488
> URL: https://issues.apache.org/jira/browse/SPARK-31488
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Maxim Gekk
>Assignee: Maxim Gekk
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently, ParquetFilters supports only java.sql.Date values for DateType, and 
> explicitly casts Any to java.sql.Date, see
> https://github.com/apache/spark/blob/cb0db213736de5c5c02b09a2d5c3e17254708ce1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala#L176
> So, any filters that refer to date values are not pushed down to Parquet when 
> spark.sql.datetime.java8API.enabled is true.
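> A minimal PySpark sketch of the affected path, for illustration only (the 
> dataset location and column name `d` are hypothetical); the effect is visible 
> in the plan's PushedFilters:
> {code:python}
> from pyspark.sql import SparkSession
> 
> # java8API is a static SQL conf, so it must be set before the session starts.
> spark = (SparkSession.builder
>          .config("spark.sql.datetime.java8API.enabled", "true")
>          .getOrCreate())
> 
> df = spark.read.parquet("/tmp/dates").filter("d > DATE'2020-01-01'")
> # With java8API enabled, filter values reach ParquetFilters as
> # java.time.LocalDate; before this fix, the java.sql.Date-only match meant
> # the date predicates were silently dropped from PushedFilters.
> df.explain()
> {code}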



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-31488) Support `java.time.LocalDate` in Parquet filter pushdown

2020-04-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-31488:
---

Assignee: Maxim Gekk

> Support `java.time.LocalDate` in Parquet filter pushdown
> 
>
> Key: SPARK-31488
> URL: https://issues.apache.org/jira/browse/SPARK-31488
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Maxim Gekk
>Assignee: Maxim Gekk
>Priority: Major
>
> Currently, ParquetFilters supports only java.sql.Date values for DateType, and 
> explicitly casts Any to java.sql.Date, see
> https://github.com/apache/spark/blob/cb0db213736de5c5c02b09a2d5c3e17254708ce1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala#L176
> So, any filters that refer to date values are not pushed down to Parquet when 
> spark.sql.datetime.java8API.enabled is true.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31548) Refactor pyspark code for common methods in JavaParams and Pipeline/OneVsRest

2020-04-23 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-31548:
--

 Summary: Refactor pyspark code for common methods in JavaParams 
and Pipeline/OneVsRest
 Key: SPARK-31548
 URL: https://issues.apache.org/jira/browse/SPARK-31548
 Project: Spark
  Issue Type: Improvement
  Components: PySpark
Affects Versions: 3.0.0
Reporter: Weichen Xu


Background: See discussion here
https://github.com/apache/spark/pull/28273#discussion_r411462216
and
https://github.com/apache/spark/pull/28279#discussion_r412699397




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31549) PySpark SparkContext.cancelJobGroup does not work correctly

2020-04-23 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-31549:
--

 Summary: PySpark SparkContext.cancelJobGroup does not work correctly
 Key: SPARK-31549
 URL: https://issues.apache.org/jira/browse/SPARK-31549
 Project: Spark
  Issue Type: Improvement
  Components: PySpark
Affects Versions: 2.4.5, 3.0.0
Reporter: Weichen Xu


PySpark SparkContext.cancelJobGroup does not work correctly. This issue has 
existed for a long time. It happens because PySpark threads are not pinned to 
JVM threads when invoking Java-side methods, so every PySpark API that relies 
on Java thread-local variables works incorrectly (including 
`sc.setLocalProperty`, `sc.cancelJobGroup`, `sc.setJobDescription`, and so on).

This is a serious issue. An experimental PySpark 'PIN_THREAD' mode added in 
Spark 3.0 addresses it, but that mode has two remaining issues:
* It is disabled by default; an additional environment variable must be set to 
enable it.
* It has a memory leak issue that has not been addressed yet.

A number of projects such as hyperopt-spark and spark-joblib rely on the 
`sc.cancelJobGroup` API (they use it to stop running jobs). So it is critical 
to address this issue, and we hope it works under the default PySpark mode. 
One possible approach is to implement methods like `rdd.setGroupAndCollect`; a 
sketch of the affected pattern follows.
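
A minimal sketch of the usage pattern that breaks (the group name and workload 
below are made up); under the default, non-pinned mode the cancellation can 
silently do nothing because the job group was attached to the wrong JVM thread:
{code:python}
import threading
import time

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

def run_job():
    # setJobGroup stores the group id in a JVM thread-local. Without pinned
    # threads, this Python thread may be served by a different JVM thread on
    # each call, so the group can land on a thread that never runs the job.
    sc.setJobGroup("my-group", "cancellable job")
    sc.parallelize(range(10**7)).map(lambda x: x * x).count()

t = threading.Thread(target=run_job)
t.start()
time.sleep(1)
sc.cancelJobGroup("my-group")  # may be a no-op if the group never attached
t.join()
{code}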





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-31549) PySpark SparkContext.cancelJobGroup does not work correctly

2020-04-23 Thread Weichen Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weichen Xu updated SPARK-31549:
---
Issue Type: Bug  (was: Improvement)

> PySpark SparkContext.cancelJobGroup does not work correctly
> -
>
> Key: SPARK-31549
> URL: https://issues.apache.org/jira/browse/SPARK-31549
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.5, 3.0.0
>Reporter: Weichen Xu
>Priority: Critical
>
> PySpark SparkContext.cancelJobGroup does not work correctly. This issue has 
> existed for a long time. It happens because PySpark threads are not pinned to 
> JVM threads when invoking Java-side methods, so every PySpark API that relies 
> on Java thread-local variables works incorrectly (including 
> `sc.setLocalProperty`, `sc.cancelJobGroup`, `sc.setJobDescription`, and so 
> on).
> This is a serious issue. An experimental PySpark 'PIN_THREAD' mode added in 
> Spark 3.0 addresses it, but that mode has two remaining issues:
> * It is disabled by default; an additional environment variable must be set 
> to enable it.
> * It has a memory leak issue that has not been addressed yet.
> A number of projects such as hyperopt-spark and spark-joblib rely on the 
> `sc.cancelJobGroup` API (they use it to stop running jobs). So it is critical 
> to address this issue, and we hope it works under the default PySpark mode. 
> One possible approach is to implement methods like 
> `rdd.setGroupAndCollect`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-30804) Measure and log elapsed time for "compact" operation in CompactibleFileStreamLog

2020-04-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-30804.
--
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 27557
[https://github.com/apache/spark/pull/27557]

> Measure and log elapsed time for "compact" operation in 
> CompactibleFileStreamLog
> 
>
> Key: SPARK-30804
> URL: https://issues.apache.org/jira/browse/SPARK-30804
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Affects Versions: 3.1.0
>Reporter: Jungtaek Lim
>Assignee: Jungtaek Lim
>Priority: Major
> Fix For: 3.0.0
>
>
> "compact" operation in FileStreamSourceLog and FileStreamSinkLog is 
> introduced to solve "small files" problem, but introduced non-trivial latency 
> which is another headache in long run query.
> There're bunch of reports from community for the same issue (see SPARK-24295, 
> SPARK-29995, SPARK-30462) - before trying to solve the problem, it would be 
> better to measure the latency (elapsed time) and log to help indicating the 
> issue when the additional latency becomes concerns.
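> The change itself lives in Spark's Scala internals; as an illustration of the 
> pattern only (the function and names below are made up, not Spark's API), 
> measuring and logging the elapsed time of an operation looks like:
> {code:python}
> import logging
> import time
> 
> logger = logging.getLogger("compact")
> 
> def timed_compact(compact_fn, batch_id):
>     # Wrap a (hypothetical) compact function and log its duration, so that
>     # growing compaction latency shows up directly in the logs.
>     start = time.monotonic()
>     result = compact_fn(batch_id)
>     elapsed_ms = (time.monotonic() - start) * 1000
>     logger.info("Compacting metadata log for batch %d took %.1f ms",
>                 batch_id, elapsed_ms)
>     return result
> {code}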



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-30804) Measure and log elapsed time for "compact" operation in CompactibleFileStreamLog

2020-04-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-30804:


Assignee: Jungtaek Lim

> Measure and log elapsed time for "compact" operation in 
> CompactibleFileStreamLog
> 
>
> Key: SPARK-30804
> URL: https://issues.apache.org/jira/browse/SPARK-30804
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Affects Versions: 3.1.0
>Reporter: Jungtaek Lim
>Assignee: Jungtaek Lim
>Priority: Major
>
> "compact" operation in FileStreamSourceLog and FileStreamSinkLog is 
> introduced to solve "small files" problem, but introduced non-trivial latency 
> which is another headache in long run query.
> There're bunch of reports from community for the same issue (see SPARK-24295, 
> SPARK-29995, SPARK-30462) - before trying to solve the problem, it would be 
> better to measure the latency (elapsed time) and log to help indicating the 
> issue when the additional latency becomes concerns.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31546) Backport SPARK-25595 Ignore corrupt Avro file if flag IGNORE_CORRUPT_FILES enabled

2020-04-23 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091146#comment-17091146
 ] 

Hyukjin Kwon commented on SPARK-31546:
--

I think it's fine to port back.

> Backport SPARK-25595 Ignore corrupt Avro file if flag 
> IGNORE_CORRUPT_FILES enabled
> 
>
> Key: SPARK-31546
> URL: https://issues.apache.org/jira/browse/SPARK-31546
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-25595 Ignore corrupt Avro file if flag 
> IGNORE_CORRUPT_FILES enabled
> cc [~Gengliang.Wang] & [~hyukjin.kwon] for comments
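> For reference, the behaviour being backported is gated by an existing SQL 
> conf; a minimal sketch (the input path is hypothetical, and the spark-avro 
> package is assumed to be on the classpath):
> {code:python}
> from pyspark.sql import SparkSession
> 
> spark = SparkSession.builder.getOrCreate()
> spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")
> # With SPARK-25595, corrupt or truncated .avro files are logged and skipped
> # instead of failing the whole read.
> df = spark.read.format("avro").load("/data/logs/*.avro")
> {code}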



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31536) Backport SPARK-25407 Allow nested access for non-existent field for Parquet file when nested pruning is enabled

2020-04-23 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091148#comment-17091148
 ] 

Hyukjin Kwon commented on SPARK-31536:
--

To backport this, we should port SPARK-31116 together. I tend to think we 
shouldn't backport this either, given that this feature is disabled by default 
in Spark 2.4 - it also affects the code path even when the option is disabled.

> Backport SPARK-25407 Allow nested access for non-existent field for 
> Parquet file when nested pruning is enabled
> -
>
> Key: SPARK-31536
> URL: https://issues.apache.org/jira/browse/SPARK-31536
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Consider backporting SPARK-25407 Allow nested access for non-existent 
> field for Parquet file when nested pruning is enabled to 2.4.6



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31537) Backport SPARK-25559 Remove the unsupported predicates in Parquet when possible

2020-04-23 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091149#comment-17091149
 ] 

Hyukjin Kwon commented on SPARK-31537:
--

I wouldn't port this back, per the guidelines in our versioning policy 
(https://spark.apache.org/versioning-policy.html): improvements are usually 
not ported back.

> Backport SPARK-25559 Remove the unsupported predicates in Parquet when 
> possible
> 
>
> Key: SPARK-31537
> URL: https://issues.apache.org/jira/browse/SPARK-31537
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Assignee: DB Tsai
>Priority: Major
> Fix For: 2.4.6
>
>
> Consider backporting SPARK-25559 Remove the unsupported predicates in 
> Parquet when possible to 2.4.6



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-31537) Backport SPARK-25559 Remove the unsupported predicates in Parquet when possible

2020-04-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-31537:
-
Target Version/s: 2.4.6

> Backport SPARK-25559 Remove the unsupported predicates in Parquet when 
> possible
> 
>
> Key: SPARK-31537
> URL: https://issues.apache.org/jira/browse/SPARK-31537
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Assignee: DB Tsai
>Priority: Major
>
> Consider backporting SPARK-25559 Remove the unsupported predicates in 
> Parquet when possible to 2.4.6



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-31537) Backport SPARK-25559 Remove the unsupported predicates in Parquet when possible

2020-04-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-31537:
-
Fix Version/s: (was: 2.4.6)

> Backport SPARK-25559 Remove the unsupported predicates in Parquet when 
> possible
> 
>
> Key: SPARK-31537
> URL: https://issues.apache.org/jira/browse/SPARK-31537
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Assignee: DB Tsai
>Priority: Major
>
> Consider backporting SPARK-25559 Remove the unsupported predicates in 
> Parquet when possible to 2.4.6



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31538) Backport SPARK-25338 Ensure to call super.beforeAll() and super.afterAll() in test cases

2020-04-23 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091150#comment-17091150
 ] 

Hyukjin Kwon commented on SPARK-31538:
--

Hm, I wonder why we should backport this; it was just a test-only cleanup.
BTW, do we need to file a JIRA for each backport? I think you can just use 
the existing JIRA, backport, and fix the Fix Version.

> Backport SPARK-25338 Ensure to call super.beforeAll() and 
> super.afterAll() in test cases
> --
>
> Key: SPARK-31538
> URL: https://issues.apache.org/jira/browse/SPARK-31538
> Project: Spark
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-25338 Ensure to call super.beforeAll() and 
> super.afterAll() in test cases
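> The pattern being enforced, sketched with Python's unittest purely for 
> illustration (the actual Spark change touches Scala test suites; the class 
> and fixture names here are made up):
> {code:python}
> import unittest
> 
> class SparkTestBase(unittest.TestCase):
>     @classmethod
>     def setUpClass(cls):
>         cls.resources = ["session"]  # stand-in for shared test fixtures
> 
>     @classmethod
>     def tearDownClass(cls):
>         cls.resources.clear()
> 
> class MySuiteTest(SparkTestBase):
>     @classmethod
>     def setUpClass(cls):
>         super().setUpClass()  # the point of SPARK-25338: never skip the parent hook
> 
>     def test_fixture_available(self):
>         self.assertIn("session", self.resources)
> {code}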



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31545) Backport SPARK-27676 InMemoryFileIndex should respect spark.sql.files.ignoreMissingFiles

2020-04-23 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091152#comment-17091152
 ] 

Hyukjin Kwon commented on SPARK-31545:
--

I think not: it causes a behaviour change which can be pretty critical in 
Structured Streaming cases (see the updated migration guide).

> Backport SPARK-27676 InMemoryFileIndex should respect 
> spark.sql.files.ignoreMissingFiles
> --
>
> Key: SPARK-31545
> URL: https://issues.apache.org/jira/browse/SPARK-31545
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> Backport SPARK-27676 InMemoryFileIndex should respect 
> spark.sql.files.ignoreMissingFiles
> cc [~joshrosen] I think backporting this was requested in the original 
> ticket; do you have any objections?
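> For context, the flag in question is an existing SQL conf; a minimal sketch 
> (the input path is hypothetical):
> {code:python}
> from pyspark.sql import SparkSession
> 
> spark = SparkSession.builder.getOrCreate()
> spark.conf.set("spark.sql.files.ignoreMissingFiles", "true")
> # With SPARK-27676, files that disappear while InMemoryFileIndex is listing
> # the input directories are skipped instead of failing the query.
> df = spark.read.parquet("/data/events")
> df.count()
> {code}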



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-31547) Upgrade Genjavadoc to 0.16

2020-04-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-31547.
--
Fix Version/s: 3.1.0
   Resolution: Fixed

Fixed in https://github.com/apache/spark/pull/28321

> Upgrade Genjavadoc to 0.16
> --
>
> Key: SPARK-31547
> URL: https://issues.apache.org/jira/browse/SPARK-31547
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Dongjoon Hyun
>Priority: Major
> Fix For: 3.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-31547) Upgrade Genjavadoc to 0.16

2020-04-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-31547:


Assignee: Dongjoon Hyun

> Upgrade Genjavadoc to 0.16
> --
>
> Key: SPARK-31547
> URL: https://issues.apache.org/jira/browse/SPARK-31547
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 3.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31532) SparkSessionBuilder should not propagate static sql configurations to the existing active/default SparkSession

2020-04-23 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091158#comment-17091158
 ] 

Hyukjin Kwon commented on SPARK-31532:
--

The problem is that the static configuration was changed during runtime.

> SparkSessionBuilder should not propagate static sql configurations to the 
> existing active/default SparkSession
> -
>
> Key: SPARK-31532
> URL: https://issues.apache.org/jira/browse/SPARK-31532
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.5, 3.0.0, 3.1.0
>Reporter: Kent Yao
>Priority: Major
>
> Clearly, this is a bug.
> {code:java}
> scala> spark.sql("set spark.sql.warehouse.dir").show
> +--------------------+--------------------+
> |                 key|               value|
> +--------------------+--------------------+
> |spark.sql.warehou...|file:/Users/kenty...|
> +--------------------+--------------------+
> scala> spark.sql("set spark.sql.warehouse.dir=2");
> org.apache.spark.sql.AnalysisException: Cannot modify the value of a static 
> config: spark.sql.warehouse.dir;
>   at 
> org.apache.spark.sql.RuntimeConfig.requireNonStaticConf(RuntimeConfig.scala:154)
>   at org.apache.spark.sql.RuntimeConfig.set(RuntimeConfig.scala:42)
>   at 
> org.apache.spark.sql.execution.command.SetCommand.$anonfun$x$7$6(SetCommand.scala:100)
>   at 
> org.apache.spark.sql.execution.command.SetCommand.run(SetCommand.scala:156)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:229)
>   at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3644)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3642)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:229)
>   at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:607)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:602)
>   ... 47 elided
> scala> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.SparkSession
> scala> SparkSession.builder.config("spark.sql.warehouse.dir", "xyz").get
> getClass   getOrCreate
> scala> SparkSession.builder.config("spark.sql.warehouse.dir", 
> "xyz").getOrCreate
> 20/04/23 23:49:13 WARN SparkSession$Builder: Using an existing SparkSession; 
> some configuration may not take effect.
> res7: org.apache.spark.sql.SparkSession = 
> org.apache.spark.sql.SparkSession@6403d574
> scala> spark.sql("set spark.sql.warehouse.dir").show
> +--------------------+-----+
> |                 key|value|
> +--------------------+-----+
> |spark.sql.warehou...|  xyz|
> +--------------------+-----+
> scala>
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30199) Recover spark.ui.port and spark.blockManager.port from checkpoint

2020-04-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-30199:
--
Issue Type: Bug  (was: Improvement)

> Recover spark.ui.port and spark.blockManager.port from checkpoint
> -
>
> Key: SPARK-30199
> URL: https://issues.apache.org/jira/browse/SPARK-30199
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams
>Affects Versions: 2.4.4, 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Aaruna Godthi
>Priority: Major
> Fix For: 2.4.6, 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


