[jira] [Assigned] (SPARK-33827) Unload State Store asap once it becomes inactive
[ https://issues.apache.org/jira/browse/SPARK-33827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-33827: Assignee: L. C. Hsieh > Unload State Store asap once it becomes inactive > > > Key: SPARK-33827 > URL: https://issues.apache.org/jira/browse/SPARK-33827 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > > SS maintains state stores in executors across batches. Due to the nature of > Spark scheduling, a state store might be allocated on another executor in the > next batch, and the state store from the previous batch becomes inactive. > Currently we run a maintenance task periodically to unload inactive state stores, > so there is some delay between a state store becoming inactive and it being unloaded. > Per the discussion on https://github.com/apache/spark/pull/30770 with > [~kabhwan], the preference is to unload inactive state stores asap. > However, we can force Spark to always allocate a state store to the same > executor by using the task locality configuration. This reduces the > possibility of having inactive state stores. > Normally, with the locality configuration, we should rarely see inactive > state stores at all. There is still a chance that an executor fails and is > reallocated, but in that case the inactive state store is lost too, so it is > not an issue. > So unloading inactive stores asap is only useful when we don't use task > locality to force state store locality across batches. > The required change to make state store management bi-directional between the > driver and executors looks non-trivial. If we can already reduce the > possibility of inactive stores, is such a non-trivial change still worth it?
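[Editor's note] For reference, a minimal sketch of the two knobs this discussion relies on; the values shown are illustrative, not recommendations:
{code:scala}
import org.apache.spark.sql.SparkSession

// spark.sql.streaming.stateStore.maintenanceInterval drives the periodic
// maintenance task that (eventually) unloads inactive state stores.
// spark.locality.wait is the generic task-locality knob: a longer wait makes
// the scheduler prefer the executor that already holds the state store.
val spark = SparkSession.builder()
  .appName("state-store-locality-sketch")
  .config("spark.sql.streaming.stateStore.maintenanceInterval", "60s") // the default
  .config("spark.locality.wait", "10s") // illustrative; the default is 3s
  .getOrCreate()
{code}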
[jira] [Resolved] (SPARK-33827) Unload State Store asap once it becomes inactive
[ https://issues.apache.org/jira/browse/SPARK-33827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-33827. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 30827 [https://github.com/apache/spark/pull/30827] > Unload State Store asap once it becomes inactive > > > Key: SPARK-33827 > URL: https://issues.apache.org/jira/browse/SPARK-33827 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > Fix For: 3.2.0 > > > SS maintains state stores in executors across batches. Due to the nature of > Spark scheduling, a state store might be allocated on another executor in the > next batch, and the state store from the previous batch becomes inactive. > Currently we run a maintenance task periodically to unload inactive state stores, > so there is some delay between a state store becoming inactive and it being unloaded. > Per the discussion on https://github.com/apache/spark/pull/30770 with > [~kabhwan], the preference is to unload inactive state stores asap. > However, we can force Spark to always allocate a state store to the same > executor by using the task locality configuration. This reduces the > possibility of having inactive state stores. > Normally, with the locality configuration, we should rarely see inactive > state stores at all. There is still a chance that an executor fails and is > reallocated, but in that case the inactive state store is lost too, so it is > not an issue. > So unloading inactive stores asap is only useful when we don't use task > locality to force state store locality across batches. > The required change to make state store management bi-directional between the > driver and executors looks non-trivial. If we can already reduce the > possibility of inactive stores, is such a non-trivial change still worth it?
[jira] [Commented] (SPARK-31685) Spark structured streaming with Kafka fails with HDFS_DELEGATION_TOKEN expiration issue
[ https://issues.apache.org/jira/browse/SPARK-31685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255446#comment-17255446 ] Dongjoon Hyun commented on SPARK-31685: --- Thank you for pinging me, [~Qin Yao]. cc [~viirya] since he is looking at streaming. > Spark structured streaming with Kafka fails with HDFS_DELEGATION_TOKEN > expiration issue > --- > > Key: SPARK-31685 > URL: https://issues.apache.org/jira/browse/SPARK-31685 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.4 > Environment: spark-2.4.4-bin-hadoop2.7 >Reporter: Rajeev Kumar >Priority: Major > > I am facing issue for spark-2.4.4-bin-hadoop2.7. I am using spark structured > streaming with Kafka. Reading the stream from Kafka and saving it to HBase. > I get this error on the driver after 24 hours. > > {code:java} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): > token (HDFS_DELEGATION_TOKEN token 6972072 for ) is expired > at org.apache.hadoop.ipc.Client.call(Client.java:1475) > at org.apache.hadoop.ipc.Client.call(Client.java:1412) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) > at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771) > at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108) > at org.apache.hadoop.fs.Hdfs.getFileStatus(Hdfs.java:130) > at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1169) > at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1165) > at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90) > at > org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1171) > at org.apache.hadoop.fs.FileContext$Util.exists(FileContext.java:1630) > at > org.apache.spark.sql.execution.streaming.FileContextBasedCheckpointFileManager.exists(CheckpointFileManager.scala:326) > at > org.apache.spark.sql.execution.streaming.HDFSMetadataLog.get(HDFSMetadataLog.scala:142) > at > org.apache.spark.sql.execution.streaming.HDFSMetadataLog.add(HDFSMetadataLog.scala:110) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply$mcV$sp(MicroBatchExecution.scala:382) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply(MicroBatchExecution.scala:381) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply(MicroBatchExecution.scala:381) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351) > at > 
org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply$mcZ$sp(MicroBatchExecution.scala:381) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:337) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:337) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution.withProgressLocked(MicroBatchExecution.scala:557) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch(MicroBatchExecution.
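[Editor's note] For long-running Kerberized applications, the usual mitigation (independent of the specific bug report above) is to give Spark a principal and keytab at submit time so it can periodically re-obtain delegation tokens instead of relying on the initially issued HDFS_DELEGATION_TOKEN, which typically expires after about 24 hours. A hedged sketch; the principal, keytab path, class, and jar are placeholders:
{code:bash}
# Sketch only: --principal/--keytab let Spark log in and refresh delegation
# tokens for a long-running (e.g. streaming) job on YARN.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --principal user@EXAMPLE.COM \
  --keytab /path/to/user.keytab \
  --class com.example.StreamingJob \
  streaming-job.jar
{code}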
[jira] [Updated] (SPARK-33920) We cannot pass a schema to the createDataFrame function in Scala, however we can do this in Python.
[ https://issues.apache.org/jira/browse/SPARK-33920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-33920: - Component/s: (was: Build) > We cannot pass a schema to the createDataFrame function in Scala, however we > can do this in Python. > --- > > Key: SPARK-33920 > URL: https://issues.apache.org/jira/browse/SPARK-33920 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.1 >Reporter: Abdul Rafay Abdul Rafay >Priority: Major > Original Estimate: 168h > Remaining Estimate: 168h > > {{spark.createDataFrame(data, schema)}} > I am able to pass a schema as a parameter to the createDataFrame function in > Python, but cannot do this in Scala for static data.
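[Editor's note] For what it's worth, Scala does expose a schema-accepting overload; it takes the data as RDD[Row] (or java.util.List[Row]) rather than a Seq of case classes, which may be the source of the confusion. A minimal sketch with illustrative columns and values:
{code:scala}
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").appName("schema-sketch").getOrCreate()

val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)))

// createDataFrame(RDD[Row], StructType): the explicit-schema variant in Scala.
val rows = spark.sparkContext.parallelize(Seq(Row(1, "a"), Row(2, "b")))
val df = spark.createDataFrame(rows, schema)
df.printSchema()
{code}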
[jira] [Updated] (SPARK-33920) We cannot pass a schema to the createDataFrame function in Scala, however we can do this in Python.
[ https://issues.apache.org/jira/browse/SPARK-33920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-33920: - Priority: Major (was: Critical) > We cannot pass a schema to the createDataFrame function in Scala, however we > can do this in Python. > --- > > Key: SPARK-33920 > URL: https://issues.apache.org/jira/browse/SPARK-33920 > Project: Spark > Issue Type: Improvement > Components: Build, SQL >Affects Versions: 3.0.1 >Reporter: Abdul Rafay Abdul Rafay >Priority: Major > Original Estimate: 168h > Remaining Estimate: 168h > > {{spark.createDataFrame(data, schema)}} > I am able to pass a schema as a parameter to the createDataFrame function in > Python, but cannot do this in Scala for static data.
[jira] [Updated] (SPARK-33920) We cannot pass a schema to the createDataFrame function in Scala, however we can do this in Python.
[ https://issues.apache.org/jira/browse/SPARK-33920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-33920: - Target Version/s: (was: 3.0.1) > We cannot pass a schema to the createDataFrame function in Scala, however we > can do this in Python. > --- > > Key: SPARK-33920 > URL: https://issues.apache.org/jira/browse/SPARK-33920 > Project: Spark > Issue Type: Improvement > Components: Build, SQL >Affects Versions: 3.0.1 >Reporter: Abdul Rafay Abdul Rafay >Priority: Critical > Original Estimate: 168h > Remaining Estimate: 168h > > {{spark.createDataFrame(data, schema)}} > I am able to pass a schema as a parameter to the createDataFrame function in > Python, but cannot do this in Scala for static data.
[jira] [Commented] (SPARK-33922) Fix error test SparkLauncherSuite.testSparkLauncherGetError
[ https://issues.apache.org/jira/browse/SPARK-33922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255445#comment-17255445 ] Hyukjin Kwon commented on SPARK-33922: -- [~dengziming] this passes in CI. Can you elaborate on how you run the tests? > Fix error test SparkLauncherSuite.testSparkLauncherGetError > --- > > Key: SPARK-33922 > URL: https://issues.apache.org/jira/browse/SPARK-33922 > Project: Spark > Issue Type: Improvement > Components: Tests >Affects Versions: 3.0.1 >Reporter: dengziming >Priority: Minor > > org.apache.spark.launcher.SparkLauncherSuite.testSparkLauncherGetError fails > every time it is executed; note that it is not a flaky test because it fails > every time. > ``` > java.lang.AssertionError at > org.junit.Assert.fail(Assert.java:87) at > org.junit.Assert.assertTrue(Assert.java:42) at > org.junit.Assert.assertTrue(Assert.java:53) at > org.apache.spark.launcher.SparkLauncherSuite.testSparkLauncherGetError > ```
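[Editor's note] For context, one way to run the single suite locally, assuming it lives in the core module as its package suggests; the command shape follows the usual Spark developer workflow and is a sketch, not a verified invocation:
{code:bash}
# Sketch only: run just SparkLauncherSuite through the sbt build.
build/sbt "core/testOnly org.apache.spark.launcher.SparkLauncherSuite"
{code}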
[jira] [Commented] (SPARK-33923) Fix some tests with AQE enabled
[ https://issues.apache.org/jira/browse/SPARK-33923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255439#comment-17255439 ] Apache Spark commented on SPARK-33923: -- User 'Ngone51' has created a pull request for this issue: https://github.com/apache/spark/pull/30941 > Fix some tests with AQE enabled > --- > > Key: SPARK-33923 > URL: https://issues.apache.org/jira/browse/SPARK-33923 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0, 3.2.0 >Reporter: wuyi >Priority: Major > > e.g., > DataFrameAggregateSuite > DataFrameJoinSuite > JoinSuite > PlannerSuite > BucketedReadSuite
[jira] [Commented] (SPARK-33923) Fix some tests with AQE enabled
[ https://issues.apache.org/jira/browse/SPARK-33923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255438#comment-17255438 ] Apache Spark commented on SPARK-33923: -- User 'Ngone51' has created a pull request for this issue: https://github.com/apache/spark/pull/30941 > Fix some tests with AQE enabled > --- > > Key: SPARK-33923 > URL: https://issues.apache.org/jira/browse/SPARK-33923 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0, 3.2.0 >Reporter: wuyi >Priority: Major > > e.g., > DataFrameAggregateSuite > DataFrameJoinSuite > JoinSuite > PlannerSuite > BucketedReadSuite
[jira] [Assigned] (SPARK-33923) Fix some tests with AQE enabled
[ https://issues.apache.org/jira/browse/SPARK-33923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33923: Assignee: Apache Spark > Fix some tests with AQE enabled > --- > > Key: SPARK-33923 > URL: https://issues.apache.org/jira/browse/SPARK-33923 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0, 3.2.0 >Reporter: wuyi >Assignee: Apache Spark >Priority: Major > > e.g., > DataFrameAggregateSuite > DataFrameJoinSuite > JoinSuite > PlannerSuite > BucketedReadSuite
[jira] [Assigned] (SPARK-33923) Fix some tests with AQE enabled
[ https://issues.apache.org/jira/browse/SPARK-33923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33923: Assignee: (was: Apache Spark) > Fix some tests with AQE enabled > --- > > Key: SPARK-33923 > URL: https://issues.apache.org/jira/browse/SPARK-33923 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0, 3.2.0 >Reporter: wuyi >Priority: Major > > e.g., > DataFrameAggregateSuite > DataFrameJoinSuite > JoinSuite > PlannerSuite > BucketedReadSuite
[jira] [Created] (SPARK-33923) Fix some tests with AQE enabled
wuyi created SPARK-33923: Summary: Fix some tests with AQE enabled Key: SPARK-33923 URL: https://issues.apache.org/jira/browse/SPARK-33923 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.1.0, 3.2.0 Reporter: wuyi e.g., DataFrameAggregateSuite DataFrameJoinSuite JoinSuite PlannerSuite BucketedReadSuite
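[Editor's note] For reference, the flag the ticket refers to; a minimal sketch of enabling adaptive query execution on a session (in Spark's own test suites this is usually toggled per-test with withSQLConf):
{code:scala}
import org.apache.spark.sql.SparkSession

// spark.sql.adaptive.enabled turns on adaptive query execution (AQE), which can
// re-plan joins and shuffles at runtime and thereby change what these suites assert.
val spark = SparkSession.builder()
  .master("local[*]")
  .config("spark.sql.adaptive.enabled", "true")
  .getOrCreate()
{code}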
[jira] [Commented] (SPARK-33907) Only prune columns of from_json if parsing options are empty
[ https://issues.apache.org/jira/browse/SPARK-33907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255436#comment-17255436 ] Apache Spark commented on SPARK-33907: -- User 'viirya' has created a pull request for this issue: https://github.com/apache/spark/pull/30944 > Only prune columns of from_json if parsing options are empty > --- > > Key: SPARK-33907 > URL: https://issues.apache.org/jira/browse/SPARK-33907 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0, 3.2.0 >Reporter: L. C. Hsieh >Assignee: Apache Spark >Priority: Major > Fix For: 3.1.0 > > > For safety, we should only prune columns from the from_json expression if the > parsing options are empty.
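[Editor's note] To illustrate the distinction (the schema, JSON, and options here are illustrative): with empty options the optimizer can safely narrow the parsed schema to the referenced field, while non-default options such as FAILFAST mode depend on parsing the full record, so pruning is skipped there.
{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

val schema = StructType(Seq(StructField("a", IntegerType), StructField("b", StringType)))
val df = Seq("""{"a": 1, "b": "x"}""").toDF("json")

// Empty options: only field "a" is referenced, so the parsed schema may be pruned to it.
val pruned = df.select(from_json($"json", schema).getField("a"))
// Non-empty options (e.g. FAILFAST): pruning could mask parse errors in "b",
// which is why the fix restricts pruning to the empty-options case.
val unpruned = df.select(from_json($"json", schema, Map("mode" -> "FAILFAST")).getField("a"))
{code}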
[jira] [Resolved] (SPARK-33914) Describe the structure of unified v1 and v2 tests
[ https://issues.apache.org/jira/browse/SPARK-33914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-33914. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 30929 [https://github.com/apache/spark/pull/30929] > Describe the structure of unified v1 and v2 tests > - > > Key: SPARK-33914 > URL: https://issues.apache.org/jira/browse/SPARK-33914 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Maxim Gekk >Assignee: Maxim Gekk >Priority: Major > Fix For: 3.2.0 > > > Add comments for unified v1 and v2 tests and describe their structure.
[jira] [Assigned] (SPARK-33914) Describe the structure of unified v1 and v2 tests
[ https://issues.apache.org/jira/browse/SPARK-33914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-33914: --- Assignee: Maxim Gekk > Describe the structure of unified v1 and v2 tests > - > > Key: SPARK-33914 > URL: https://issues.apache.org/jira/browse/SPARK-33914 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Maxim Gekk >Assignee: Maxim Gekk >Priority: Major > > Add comments for unified v1 and v2 tests and describe their structure.
[jira] [Resolved] (SPARK-33908) Refactor SparkSubmitUtils.resolveMavenCoordinates return parameter
[ https://issues.apache.org/jira/browse/SPARK-33908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-33908. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 30922 [https://github.com/apache/spark/pull/30922] > Refactor SparkSubmitUtils.resolveMavenCoordinates return parameter > > > Key: SPARK-33908 > URL: https://issues.apache.org/jira/browse/SPARK-33908 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > Per the discussion in https://github.com/apache/spark/pull/29966#discussion_r531917374
[jira] [Assigned] (SPARK-33908) Refactor SparkSubmitUtils.resolveMavenCoordinates return parameter
[ https://issues.apache.org/jira/browse/SPARK-33908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-33908: Assignee: angerszhu > Refactor SparkSubmitUtils.resolveMavenCoordinates return parameter > > > Key: SPARK-33908 > URL: https://issues.apache.org/jira/browse/SPARK-33908 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > > Per the discussion in https://github.com/apache/spark/pull/29966#discussion_r531917374
[jira] [Resolved] (SPARK-33901) Char and Varchar display error after DDLs
[ https://issues.apache.org/jira/browse/SPARK-33901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-33901. - Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 30918 [https://github.com/apache/spark/pull/30918] > Char and Varchar display error after DDLs > - > > Key: SPARK-33901 > URL: https://issues.apache.org/jira/browse/SPARK-33901 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.0 >Reporter: Kent Yao >Assignee: Apache Spark >Priority: Major > Fix For: 3.1.0 > > > CTAS / CREATE TABLE LIKE / CVAS / ALTER TABLE ADD COLUMNS
[jira] [Commented] (SPARK-31685) Spark structured streaming with Kafka fails with HDFS_DELEGATION_TOKEN expiration issue
[ https://issues.apache.org/jira/browse/SPARK-31685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255420#comment-17255420 ] Kent Yao commented on SPARK-31685: -- Hi, [~rajeevkumar], does this issue still exist in the latest release 3.0.1, or the master branch? If so I guess this should be fixed as soon as possible for the 2.4 LTS version and the coming 3.1.0. Stability for long-running applications is essential. And I guess it is not that hard to fix it. cc [~cloud_fan] [~hyukjin.kwon] [~dongjoon] > Spark structured streaming with Kafka fails with HDFS_DELEGATION_TOKEN > expiration issue > --- > > Key: SPARK-31685 > URL: https://issues.apache.org/jira/browse/SPARK-31685 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.4 > Environment: spark-2.4.4-bin-hadoop2.7 >Reporter: Rajeev Kumar >Priority: Major > > I am facing issue for spark-2.4.4-bin-hadoop2.7. I am using spark structured > streaming with Kafka. Reading the stream from Kafka and saving it to HBase. > I get this error on the driver after 24 hours. > > {code:java} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): > token (HDFS_DELEGATION_TOKEN token 6972072 for ) is expired > at org.apache.hadoop.ipc.Client.call(Client.java:1475) > at org.apache.hadoop.ipc.Client.call(Client.java:1412) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) > at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771) > at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108) > at org.apache.hadoop.fs.Hdfs.getFileStatus(Hdfs.java:130) > at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1169) > at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1165) > at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90) > at > org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1171) > at org.apache.hadoop.fs.FileContext$Util.exists(FileContext.java:1630) > at > org.apache.spark.sql.execution.streaming.FileContextBasedCheckpointFileManager.exists(CheckpointFileManager.scala:326) > at > org.apache.spark.sql.execution.streaming.HDFSMetadataLog.get(HDFSMetadataLog.scala:142) > at > org.apache.spark.sql.execution.streaming.HDFSMetadataLog.add(HDFSMetadataLog.scala:110) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply$mcV$sp(MicroBatchExecution.scala:382) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply(MicroBatchExecution.scala:381) > at > 
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply(MicroBatchExecution.scala:381) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351) > at > org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply$mcZ$sp(MicroBatchExecution.scala:381) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:337) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:337) > at > org.apache.spark.sql.execution.streami
[jira] [Commented] (SPARK-30789) Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE
[ https://issues.apache.org/jira/browse/SPARK-30789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255418#comment-17255418 ] Apache Spark commented on SPARK-30789: -- User 'beliefer' has created a pull request for this issue: https://github.com/apache/spark/pull/30943 > Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE > -- > > Key: SPARK-30789 > URL: https://issues.apache.org/jira/browse/SPARK-30789 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > > All of LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE should support IGNORE NULLS > | RESPECT NULLS. For example: > {code:java} > LEAD (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > LAG (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > NTH_VALUE (expr, offset) > [ IGNORE NULLS | RESPECT NULLS ] > OVER > ( [ PARTITION BY window_partition ] > [ ORDER BY window_ordering > frame_clause ] ){code} > > *Oracle:* > [https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/NTH_VALUE.html#GUID-F8A0E88C-67E5-4AA6-9515-95D03A7F9EA0] > *Redshift* > [https://docs.aws.amazon.com/redshift/latest/dg/r_WF_NTH.html] > *Presto* > [https://prestodb.io/docs/current/functions/window.html] > *DB2* > [https://www.ibm.com/support/knowledgecenter/SSGU8G_14.1.0/com.ibm.sqls.doc/ids_sqs_1513.htm] > *Teradata* > [https://docs.teradata.com/r/756LNiPSFdY~4JcCCcR5Cw/GjCT6l7trjkIEjt~7Dhx4w] > *Snowflake* > [https://docs.snowflake.com/en/sql-reference/functions/lead.html] > [https://docs.snowflake.com/en/sql-reference/functions/lag.html] >
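[Editor's note] A sketch of the semantics being proposed, using the ticket's syntax against a hypothetical table t(id, v) where v is nullable; this assumes a SparkSession named spark is in scope and a Spark version with the feature implemented:
{code:scala}
// IGNORE NULLS makes LAG skip over null rows; RESPECT NULLS (the default)
// returns whatever the previous row holds, null included.
spark.sql("""
  SELECT id,
         LAG(v) IGNORE NULLS  OVER (ORDER BY id) AS prev_non_null,
         LAG(v) RESPECT NULLS OVER (ORDER BY id) AS prev_raw
  FROM t
""")
{code}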
[jira] [Assigned] (SPARK-30789) Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE
[ https://issues.apache.org/jira/browse/SPARK-30789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-30789: Assignee: Apache Spark > Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE > -- > > Key: SPARK-30789 > URL: https://issues.apache.org/jira/browse/SPARK-30789 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Assignee: Apache Spark >Priority: Major > > All of LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE should support IGNORE NULLS > | RESPECT NULLS. For example: > {code:java} > LEAD (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > LAG (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > NTH_VALUE (expr, offset) > [ IGNORE NULLS | RESPECT NULLS ] > OVER > ( [ PARTITION BY window_partition ] > [ ORDER BY window_ordering > frame_clause ] ){code} > > *Oracle:* > [https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/NTH_VALUE.html#GUID-F8A0E88C-67E5-4AA6-9515-95D03A7F9EA0] > *Redshift* > [https://docs.aws.amazon.com/redshift/latest/dg/r_WF_NTH.html] > *Presto* > [https://prestodb.io/docs/current/functions/window.html] > *DB2* > [https://www.ibm.com/support/knowledgecenter/SSGU8G_14.1.0/com.ibm.sqls.doc/ids_sqs_1513.htm] > *Teradata* > [https://docs.teradata.com/r/756LNiPSFdY~4JcCCcR5Cw/GjCT6l7trjkIEjt~7Dhx4w] > *Snowflake* > [https://docs.snowflake.com/en/sql-reference/functions/lead.html] > [https://docs.snowflake.com/en/sql-reference/functions/lag.html] >
[jira] [Assigned] (SPARK-30789) Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE
[ https://issues.apache.org/jira/browse/SPARK-30789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-30789: Assignee: (was: Apache Spark) > Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE > -- > > Key: SPARK-30789 > URL: https://issues.apache.org/jira/browse/SPARK-30789 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > > All of LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE should support IGNORE NULLS > | RESPECT NULLS. For example: > {code:java} > LEAD (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > LAG (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > NTH_VALUE (expr, offset) > [ IGNORE NULLS | RESPECT NULLS ] > OVER > ( [ PARTITION BY window_partition ] > [ ORDER BY window_ordering > frame_clause ] ){code} > > *Oracle:* > [https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/NTH_VALUE.html#GUID-F8A0E88C-67E5-4AA6-9515-95D03A7F9EA0] > *Redshift* > [https://docs.aws.amazon.com/redshift/latest/dg/r_WF_NTH.html] > *Presto* > [https://prestodb.io/docs/current/functions/window.html] > *DB2* > [https://www.ibm.com/support/knowledgecenter/SSGU8G_14.1.0/com.ibm.sqls.doc/ids_sqs_1513.htm] > *Teradata* > [https://docs.teradata.com/r/756LNiPSFdY~4JcCCcR5Cw/GjCT6l7trjkIEjt~7Dhx4w] > *Snowflake* > [https://docs.snowflake.com/en/sql-reference/functions/lead.html] > [https://docs.snowflake.com/en/sql-reference/functions/lag.html] >
[jira] [Updated] (SPARK-30789) Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE
[ https://issues.apache.org/jira/browse/SPARK-30789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-30789: --- Description: All of LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE should support IGNORE NULLS | RESPECT NULLS. For example: {code:java} LEAD (value_expr [, offset ]) [ IGNORE NULLS | RESPECT NULLS ] OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} {code:java} LAG (value_expr [, offset ]) [ IGNORE NULLS | RESPECT NULLS ] OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} {code:java} NTH_VALUE (expr, offset) [ IGNORE NULLS | RESPECT NULLS ] OVER ( [ PARTITION BY window_partition ] [ ORDER BY window_ordering frame_clause ] ){code} *Oracle:* [https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/NTH_VALUE.html#GUID-F8A0E88C-67E5-4AA6-9515-95D03A7F9EA0] *Redshift* [https://docs.aws.amazon.com/redshift/latest/dg/r_WF_NTH.html] *Presto* [https://prestodb.io/docs/current/functions/window.html] *DB2* [https://www.ibm.com/support/knowledgecenter/SSGU8G_14.1.0/com.ibm.sqls.doc/ids_sqs_1513.htm] *Teradata* [https://docs.teradata.com/r/756LNiPSFdY~4JcCCcR5Cw/GjCT6l7trjkIEjt~7Dhx4w] *Snowflake* [https://docs.snowflake.com/en/sql-reference/functions/lead.html] [https://docs.snowflake.com/en/sql-reference/functions/lag.html] was: All of LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE support IGNORE NULLS | RESPECT NULLS. For example: {code:java} LEAD (value_expr [, offset ]) [ IGNORE NULLS | RESPECT NULLS ] OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} {code:java} LAG (value_expr [, offset ]) [ IGNORE NULLS | RESPECT NULLS ] OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} {code:java} NTH_VALUE (expr, offset) [ IGNORE NULLS | RESPECT NULLS ] OVER ( [ PARTITION BY window_partition ] [ ORDER BY window_ordering frame_clause ] ){code} *Oracle:* [https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/NTH_VALUE.html#GUID-F8A0E88C-67E5-4AA6-9515-95D03A7F9EA0] *Redshift* [https://docs.aws.amazon.com/redshift/latest/dg/r_WF_NTH.html] *Presto* [https://prestodb.io/docs/current/functions/window.html] *DB2* [https://www.ibm.com/support/knowledgecenter/SSGU8G_14.1.0/com.ibm.sqls.doc/ids_sqs_1513.htm] *Teradata* [https://docs.teradata.com/r/756LNiPSFdY~4JcCCcR5Cw/GjCT6l7trjkIEjt~7Dhx4w] *Snowflake* [https://docs.snowflake.com/en/sql-reference/functions/lead.html] [https://docs.snowflake.com/en/sql-reference/functions/lag.html] > Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE > -- > > Key: SPARK-30789 > URL: https://issues.apache.org/jira/browse/SPARK-30789 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > > All of LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE should support IGNORE NULLS > | RESPECT NULLS. 
For example: > {code:java} > LEAD (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > LAG (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > NTH_VALUE (expr, offset) > [ IGNORE NULLS | RESPECT NULLS ] > OVER > ( [ PARTITION BY window_partition ] > [ ORDER BY window_ordering > frame_clause ] ){code} > > *Oracle:* > [https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/NTH_VALUE.html#GUID-F8A0E88C-67E5-4AA6-9515-95D03A7F9EA0] > *Redshift* > [https://docs.aws.amazon.com/redshift/latest/dg/r_WF_NTH.html] > *Presto* > [https://prestodb.io/docs/current/functions/window.html] > *DB2* > [https://www.ibm.com/support/knowledgecenter/SSGU8G_14.1.0/com.ibm.sqls.doc/ids_sqs_1513.htm] > *Teradata* > [https://docs.teradata.com/r/756LNiPSFdY~4JcCCcR5Cw/GjCT6l7trjkIEjt~7Dhx4w] > *Snowflake* > [https://docs.snowflake.com/en/sql-reference/functions/lead.html] > [https://docs.snowflake.com/en/sql-reference/functions/lag.html] >
[jira] [Commented] (SPARK-33801) Cleanup "Unicode escapes in triple quoted strings are deprecated" compilation warnings
[ https://issues.apache.org/jira/browse/SPARK-33801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255410#comment-17255410 ] Apache Spark commented on SPARK-33801: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/30926 > Cleanup "Unicode escapes in triple quoted strings are deprecated" compilation > warnings > -- > > Key: SPARK-33801 > URL: https://issues.apache.org/jira/browse/SPARK-33801 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Fix For: 3.2.0 > > > There are total 15 compilation warnings about this > {code:java} > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2930: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2931: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2932: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2933: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2934: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2935: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2936: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2937: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtils.scala:82: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtilsSuite.scala:32: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtilsSuite.scala:79: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ParserUtilsSuite.scala:97: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ParserUtilsSuite.scala:101: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonParsingOptionsSuite.scala:76: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > 
/spark-source/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonParsingOptionsSuite.scala:83: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > {code}
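[Editor's note] A minimal sketch of the deprecated pattern and the two usual replacements (string contents are illustrative):
{code:scala}
// Deprecated in newer Scala compilers: a \uXXXX escape inside a triple-quoted string.
val deprecated = """delimiter: \u00e9"""
// Fix 1: use the literal character, as the warning suggests.
val literal = """delimiter: é"""
// Fix 2: move the escape into a normal string literal, where it is still processed.
val escaped = "delimiter: \u00e9"
{code}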
[jira] [Commented] (SPARK-33801) Cleanup "Unicode escapes in triple quoted strings are deprecated" compilation warnings
[ https://issues.apache.org/jira/browse/SPARK-33801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255409#comment-17255409 ] Hyukjin Kwon commented on SPARK-33801: -- Fixed in https://github.com/apache/spark/pull/30926 > Cleanup "Unicode escapes in triple quoted strings are deprecated" compilation > warnings > -- > > Key: SPARK-33801 > URL: https://issues.apache.org/jira/browse/SPARK-33801 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > > There are total 15 compilation warnings about this > {code:java} > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2930: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2931: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2932: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2933: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2934: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2935: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2936: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2937: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtils.scala:82: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtilsSuite.scala:32: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtilsSuite.scala:79: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ParserUtilsSuite.scala:97: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ParserUtilsSuite.scala:101: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonParsingOptionsSuite.scala:76: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonParsingOptionsSuite.scala:83: > Unicode escapes in 
triple quoted strings are deprecated, use the literal > character instead > {code}
[jira] [Resolved] (SPARK-33801) Cleanup "Unicode escapes in triple quoted strings are deprecated" compilation warnings
[ https://issues.apache.org/jira/browse/SPARK-33801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-33801. -- Fix Version/s: 3.2.0 Assignee: Yang Jie Resolution: Fixed > Cleanup "Unicode escapes in triple quoted strings are deprecated" compilation > warnings > -- > > Key: SPARK-33801 > URL: https://issues.apache.org/jira/browse/SPARK-33801 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Fix For: 3.2.0 > > > There are total 15 compilation warnings about this > {code:java} > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2930: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2931: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2932: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2933: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2934: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2935: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2936: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/core/src/main/scala/org/apache/spark/util/Utils.scala:2937: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtils.scala:82: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtilsSuite.scala:32: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtilsSuite.scala:79: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ParserUtilsSuite.scala:97: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ParserUtilsSuite.scala:101: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonParsingOptionsSuite.scala:76: > Unicode escapes in triple quoted strings are deprecated, use the literal > character instead > [WARNING] > /spark-source/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonParsingOptionsSuite.scala:83: > Unicode escapes in triple 
quoted strings are deprecated, use the literal > character instead > {code}
[jira] [Updated] (SPARK-31168) Upgrade Scala to 2.12.13
[ https://issues.apache.org/jira/browse/SPARK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-31168: -- Affects Version/s: (was: 3.1.0) 3.2.0 > Upgrade Scala to 2.12.13 > > > Key: SPARK-31168 > URL: https://issues.apache.org/jira/browse/SPARK-31168 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: Yuming Wang >Priority: Major > > h2. Highlights > * Performance improvements in the collections library: algorithmic > improvements and changes to avoid unnecessary allocations ([list of > PRs|https://github.com/scala/scala/pulls?q=is%3Apr+milestone%3A2.12.11+is%3Aclosed+sort%3Acreated-desc+label%3Alibrary%3Acollections+label%3Aperformance]) > * Performance improvements in the compiler ([list of > PRs|https://github.com/scala/scala/pulls?q=is%3Apr+milestone%3A2.12.11+is%3Aclosed+sort%3Acreated-desc+-label%3Alibrary%3Acollections+label%3Aperformance+], > minor [effects in our > benchmarks|https://scala-ci.typesafe.com/grafana/dashboard/db/scala-benchmark?orgId=1&from=1567985515850&to=1584355915694&var-branch=2.12.x&var-source=All&var-bench=HotScalacBenchmark.compile&var-host=scalabench@scalabench@]) > * Improvements to {{-Yrepl-class-based}}, an alternative internal REPL > encoding that avoids deadlocks (details on > [#8712|https://github.com/scala/scala/pull/8712]) > * A new {{-Yrepl-use-magic-imports}} flag that avoids deep class nesting in > the REPL, which can lead to deteriorating performance in long sessions > ([#8576|https://github.com/scala/scala/pull/8576]) > * Fix some {{toX}} methods that could expose the underlying mutability of a > {{ListBuffer}}-generated collection > ([#8674|https://github.com/scala/scala/pull/8674]) > h3. JDK 9+ support > * ASM was upgraded to 7.3.1, allowing the optimizer to run on JDK 13+ > ([#8676|https://github.com/scala/scala/pull/8676]) > * {{:javap}} in the REPL now works on JDK 9+ > ([#8400|https://github.com/scala/scala/pull/8400]) > h3. Other changes > * Support new labels for creating durations for consistency: > {{Duration("1m")}}, {{Duration("3 hrs")}} > ([#8325|https://github.com/scala/scala/pull/8325], > [#8450|https://github.com/scala/scala/pull/8450]) > * Fix memory leak in runtime reflection's {{TypeTag}} caches > ([#8470|https://github.com/scala/scala/pull/8470]) and some thread safety > issues in runtime reflection > ([#8433|https://github.com/scala/scala/pull/8433]) > * When using compiler plugins, the ordering of compiler phases may change > due to [#8427|https://github.com/scala/scala/pull/8427] > For more details, see [https://github.com/scala/scala/releases/tag/v2.12.11]. >
[jira] [Updated] (SPARK-33884) Simplify CaseWhen clauses with (true and false) and (false and true)
[ https://issues.apache.org/jira/browse/SPARK-33884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-33884: Description: Simplify CaseWhen clauses with (true and false) and (false and true): ||Expression||After simplify|| |case when cond then true else false end|cond| |case when cond then false else true end|!cond| was: This PR simplifies {{CaseWhen}} when there is only one branch and one clause is null and the other is boolean. This simplification is similar to SPARK-32721. ||Expression||After simplify|| |case when cond then true else false end|cond| |case when cond then false else true end|!cond| > Simplify CaseWhen clauses with (true and false) and (false and true) > --- > > Key: SPARK-33884 > URL: https://issues.apache.org/jira/browse/SPARK-33884 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Yuming Wang >Priority: Major > > Simplify CaseWhen clauses with (true and false) and (false and true): > ||Expression||After simplify|| > |case when cond then true else false end|cond| > |case when cond then false else true end|!cond|
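[Editor's note] Expressed as queries, assuming a SparkSession named spark and a hypothetical table t with column a; null-handling details of the actual optimizer rule are elided:
{code:scala}
// CASE WHEN cond THEN TRUE ELSE FALSE END  simplifies to  cond
spark.sql("SELECT CASE WHEN a > 0 THEN TRUE ELSE FALSE END FROM t") // ~> a > 0
// CASE WHEN cond THEN FALSE ELSE TRUE END  simplifies to  NOT cond
spark.sql("SELECT CASE WHEN a > 0 THEN FALSE ELSE TRUE END FROM t") // ~> NOT (a > 0)
{code}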
[jira] [Updated] (SPARK-33884) Simplify CaseWhen clauses with (true and false) and (false and true)
[ https://issues.apache.org/jira/browse/SPARK-33884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-33884: Description: This PR simplifies {{CaseWhen}} when there is only one branch and one clause is null and the other is boolean. This simplification is similar to SPARK-32721. ||Expression||After simplify|| |case when cond then true else false end|cond| |case when cond then false else true end|!cond| was: This PR simplifies {{CaseWhen}} when there is only one branch and one clause is null and the other is boolean. This simplification is similar to SPARK-32721. ||Expression||After simplify|| |case when cond then null else false end|and(cond, null)| |case when cond then null else true end|or(not(cond), null)| |case when cond then false else null end|and(not(cond), null)| |case when cond then false end|and(not(cond), null)| |case when cond then true else null end|or(cond, null)| |case when cond then true end|or(cond, null)| > Simplify CaseWhen clauses with (true and false) and (false and true) > --- > > Key: SPARK-33884 > URL: https://issues.apache.org/jira/browse/SPARK-33884 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Yuming Wang >Priority: Major > > This PR simplifies {{CaseWhen}} when there is only one branch and one clause is > null and the other is boolean. This simplification is similar to SPARK-32721. > ||Expression||After simplify|| > |case when cond then true else false end|cond| > |case when cond then false else true end|!cond|
[jira] [Updated] (SPARK-33884) Simplify CaseWhen clauses with (true and false) and (false and true)
[ https://issues.apache.org/jira/browse/SPARK-33884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-33884: Summary: Simplify CaseWhen clauses with (true and false) and (false and true) (was: Simplify conditional if all branches are foldable boolean type) > Simplify CaseWhen clauses with (true and false) and (false and true) > --- > > Key: SPARK-33884 > URL: https://issues.apache.org/jira/browse/SPARK-33884 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Yuming Wang >Priority: Major > > This PR simplifies {{CaseWhen}} when there is only one branch, one clause is null, and the other is boolean. This simplification is similar to SPARK-32721. > ||Expression||After simplification|| > |case when cond then null else false end|and(cond, null)| > |case when cond then null else true end|or(not(cond), null)| > |case when cond then false else null end|and(not(cond), null)| > |case when cond then false end|and(not(cond), null)| > |case when cond then true else null end|or(cond, null)| > |case when cond then true end|or(cond, null)| -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33913) Upgrade Kafka to 2.7.0
[ https://issues.apache.org/jira/browse/SPARK-33913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255395#comment-17255395 ] L. C. Hsieh commented on SPARK-33913: - It seems we cannot upgrade to Kafka 2.7.0. Because Kafka core inlines the Scala library, you cannot use a different Scala patch version than the one Kafka used to compile its jars: https://github.com/embeddedkafka/embedded-kafka/issues/202 Kafka 2.7.0 uses Scala 2.12.12 while Spark currently uses Scala 2.12.10, so there will be {{java.lang.NoClassDefFoundError: scala/math/Ordering$$anon$7}} errors. Due to an issue in Scala 2.12.12 (https://github.com/scala/bug/issues/12096), Spark will skip Scala 2.12.12 and wait for Scala 2.12.13. So it seems that, for Kafka, Spark needs to wait for the next Kafka version, which should be built with Scala 2.12.13 too. > Upgrade Kafka to 2.7.0 > -- > > Key: SPARK-33913 > URL: https://issues.apache.org/jira/browse/SPARK-33913 > Project: Spark > Issue Type: Improvement > Components: Build, DStreams >Affects Versions: 3.2.0 >Reporter: dengziming >Priority: Major > > > The Apache Kafka community has released Apache Kafka 2.7.0. Some of its features are useful, for example the KAFKA-9893 configurable TCP connection timeout; more details: > https://downloads.apache.org/kafka/2.7.0/RELEASE_NOTES.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
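A hedged sketch of the version constraint described in the comment above, expressed as an sbt build fragment; the embedded-kafka coordinates and versions are illustrative, not a tested configuration:
{code:scala}
// build.sbt (illustrative): Kafka's jars inline scala-library internals,
// so the project's Scala patch version must match the one Kafka was
// compiled with, or tests hit NoClassDefFoundError at runtime.
scalaVersion := "2.12.12" // must match Kafka 2.7.0's Scala patch version

libraryDependencies ++= Seq(
  "org.apache.kafka"        %% "kafka"          % "2.7.0" % Test,
  "io.github.embeddedkafka" %% "embedded-kafka" % "2.7.0" % Test
)
{code}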
[jira] [Closed] (SPARK-33921) Upgrade Scala version to 2.12.12
[ https://issues.apache.org/jira/browse/SPARK-33921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun closed SPARK-33921. - We tried Scala 2.12.12 at SPARK-31168 already and revised SPARK-31168 to target Scala 2.12.13 to avoid a Scala compiler bug. > Upgrade Scala version to 2.12.12 > > > Key: SPARK-33921 > URL: https://issues.apache.org/jira/browse/SPARK-33921 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Priority: Major > > Upgrade Scala 2.12 patch version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-33921) Upgrade Scala version to 2.12.12
[ https://issues.apache.org/jira/browse/SPARK-33921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255394#comment-17255394 ] Dongjoon Hyun edited comment on SPARK-33921 at 12/28/20, 5:59 AM: -- We tried Scala 2.12.12 at SPARK-31168 already and revised SPARK-31168 to target Scala 2.12.13 to avoid a Scala compiler bug. was (Author: dongjoon): We tried Scala 2.12.12 at SPARK-33168 already and revised SPARK-33168 to target Scala 2.12.13 to avoid Scala compiler bug. > Upgrade Scala version to 2.12.12 > > > Key: SPARK-33921 > URL: https://issues.apache.org/jira/browse/SPARK-33921 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Priority: Major > > Upgrade Scala 2.12 patch version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-33921) Upgrade Scala version to 2.12.12
[ https://issues.apache.org/jira/browse/SPARK-33921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-33921. --- Assignee: (was: L. C. Hsieh) Resolution: Duplicate > Upgrade Scala version to 2.12.12 > > > Key: SPARK-33921 > URL: https://issues.apache.org/jira/browse/SPARK-33921 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Priority: Major > > Upgrade Scala 2.12 patch version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-33918) UnresolvedView should retain SQL text position
[ https://issues.apache.org/jira/browse/SPARK-33918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-33918. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 30936 [https://github.com/apache/spark/pull/30936] > UnresolvedView should retain SQL text position > -- > > Key: SPARK-33918 > URL: https://issues.apache.org/jira/browse/SPARK-33918 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Terry Kim >Assignee: Terry Kim >Priority: Major > Fix For: 3.2.0 > > > UnresolvedView should retain SQL text position. The following commands will > be handled: > "DROP VIEW v" > "ALTER VIEW v SET TBLPROPERTIES ('k'='v')" > "ALTER VIEW v UNSET TBLPROPERTIES ('k')" > "ALTER VIEW v AS SELECT 1" -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33915) Allow json expression to be pushable column
[ https://issues.apache.org/jira/browse/SPARK-33915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255389#comment-17255389 ] Ted Yu commented on SPARK-33915: Here is the plan prior to predicate pushdown: {code} 2020-12-26 03:28:59,926 (Time-limited test) [DEBUG - org.apache.spark.internal.Logging.logDebug(Logging.scala:61)] Adaptive execution enabled for plan: Sort [id#34 ASC NULLS FIRST], true, 0 +- Project [id#34, address#35, phone#37, get_json_object(phone#37, $.code) AS phone#33] +- Filter (get_json_object(phone#37, $.phone) = 1200) +- BatchScan[id#34, address#35, phone#37] Cassandra Scan: test.person - Cassandra Filters: [] - Requested Columns: [id,address,phone] {code} Here is the plan with pushdown: {code} 2020-12-28 01:40:08,150 (Time-limited test) [DEBUG - org.apache.spark.internal.Logging.logDebug(Logging.scala:61)] Adaptive execution enabled for plan: Sort [id#34 ASC NULLS FIRST], true, 0 +- Project [id#34, address#35, phone#37, get_json_object(phone#37, $.code) AS phone#33] +- BatchScan[id#34, address#35, phone#37] Cassandra Scan: test.person - Cassandra Filters: [["`GetJsonObject(phone#37,$.phone)`" = ?, 1200]] - Requested Columns: [id,address,phone] {code} > Allow json expression to be pushable column > --- > > Key: SPARK-33915 > URL: https://issues.apache.org/jira/browse/SPARK-33915 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.0.1 >Reporter: Ted Yu >Priority: Major > > Currently PushableColumnBase provides no support for json / jsonb expressions. > An example of a json expression: > {code} > get_json_object(phone, '$.code') = '1200' > {code} > If a non-string literal is part of the expression, the presence of cast() would complicate the situation. > The implication is that implementations of SupportsPushDownFilters don't get a chance to perform pushdown even if the third-party DB engine supports json expression pushdown. > This issue is for discussion and implementation of Spark core changes that would allow a json expression to be recognized as a pushable column. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
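For orientation, a hypothetical Scala reconstruction of the query behind the two plans above; the table and column names come from the plans themselves, while the connector format and options are assumptions:
{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.get_json_object

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Assumed Cassandra connector options matching "Cassandra Scan: test.person".
val person = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "test", "table" -> "person"))
  .load()

// The json-expression filter is the predicate SPARK-33915 wants to make pushable.
person
  .filter(get_json_object($"phone", "$.phone") === "1200")
  .select($"id", $"address", $"phone", get_json_object($"phone", "$.code").as("phone"))
  .orderBy($"id")
  .explain(true)
{code}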
[jira] [Assigned] (SPARK-33918) UnresolvedView should retain SQL text position
[ https://issues.apache.org/jira/browse/SPARK-33918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-33918: --- Assignee: Terry Kim > UnresolvedView should retain SQL text position > -- > > Key: SPARK-33918 > URL: https://issues.apache.org/jira/browse/SPARK-33918 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Terry Kim >Assignee: Terry Kim >Priority: Major > > UnresolvedView should retain SQL text position. The following commands will > be handled: > "DROP VIEW v" > "ALTER VIEW v SET TBLPROPERTIES ('k'='v')" > "ALTER VIEW v UNSET TBLPROPERTIES ('k')" > "ALTER VIEW v AS SELECT 1" -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33921) Upgrade Scala version to 2.12.12
[ https://issues.apache.org/jira/browse/SPARK-33921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255383#comment-17255383 ] Apache Spark commented on SPARK-33921: -- User 'viirya' has created a pull request for this issue: https://github.com/apache/spark/pull/30939 > Upgrade Scala version to 2.12.12 > > > Key: SPARK-33921 > URL: https://issues.apache.org/jira/browse/SPARK-33921 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > > Upgrade Scala 2.12 patch version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33921) Upgrade Scala version to 2.12.12
[ https://issues.apache.org/jira/browse/SPARK-33921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255381#comment-17255381 ] Apache Spark commented on SPARK-33921: -- User 'viirya' has created a pull request for this issue: https://github.com/apache/spark/pull/30939 > Upgrade Scala version to 2.12.12 > > > Key: SPARK-33921 > URL: https://issues.apache.org/jira/browse/SPARK-33921 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > > Upgrade Scala 2.12 patch version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-33532) Remove unreachable branch in SpecificParquetRecordReaderBase.initialize method
[ https://issues.apache.org/jira/browse/SPARK-33532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-33532. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 30484 [https://github.com/apache/spark/pull/30484] > Remove unreachable branch in SpecificParquetRecordReaderBase.initialize method > -- > > Key: SPARK-33532 > URL: https://issues.apache.org/jira/browse/SPARK-33532 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Fix For: 3.2.0 > > > There are two places that call the "SpecificParquetRecordReaderBase.initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptContext)" method: one is in ParquetFileFormat and the other is in ParquetPartitionReaderFactory. The "inputSplit.rowGroupOffsets" passed in both places is null, so it seems that the "rowGroupOffsets != null" branch is unreachable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-33532) Remove unreachable branch in SpecificParquetRecordReaderBase.initialize method
[ https://issues.apache.org/jira/browse/SPARK-33532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-33532: Assignee: Yang Jie > Remove unreachable branch in SpecificParquetRecordReaderBase.initialize method > -- > > Key: SPARK-33532 > URL: https://issues.apache.org/jira/browse/SPARK-33532 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > > There are two places that call the "SpecificParquetRecordReaderBase.initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptContext)" method: one is in ParquetFileFormat and the other is in ParquetPartitionReaderFactory. The "inputSplit.rowGroupOffsets" passed in both places is null, so it seems that the "rowGroupOffsets != null" branch is unreachable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-24913) Make `AssertTrue` and `AssertNotNull` non-deterministic
[ https://issues.apache.org/jira/browse/SPARK-24913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24913. -- Resolution: Not A Problem > Make `AssertTrue` and `AssertNotNull` non-deterministic > --- > > Key: SPARK-24913 > URL: https://issues.apache.org/jira/browse/SPARK-24913 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.3.0 >Reporter: DB Tsai >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-32685) Script transform hive serde default field.delimit is '\t'
[ https://issues.apache.org/jira/browse/SPARK-32685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32685: Assignee: Apache Spark > Script transform hive serde default field.delimit is '\t' > - > > Key: SPARK-32685 > URL: https://issues.apache.org/jira/browse/SPARK-32685 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: angerszhu >Assignee: Apache Spark >Priority: Major > > > {code:java} > select split(value, "\t") from ( > SELECT TRANSFORM(a, b, c, null) > USING 'cat' > FROM (select 1 as a, 2 as b, 3 as c) t > ) temp; > result is : > _c0 > ["2","3","\\N"]{code} > > {code:java} > select split(value, "\t") from ( > SELECT TRANSFORM(a, b, c, null) > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > USING 'cat' > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > WITH SERDEPROPERTIES ( >'serialization.last.column.takes.rest' = 'true' > ) > FROM (select 1 as a, 2 as b, 3 as c) t > ) temp; > result is : > _c0 > ["2","3","\\N"]{code} > > > > {code:java} > select split(value, "\t") from ( > SELECT TRANSFORM(a, b, c, null) > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > USING 'cat' > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > FROM (select 1 as a, 2 as b, 3 as c) t > ) temp; > result is : > _c0 > ["2"] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-32685) Script transform hive serde default field.delimit is '\t'
[ https://issues.apache.org/jira/browse/SPARK-32685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32685: Assignee: (was: Apache Spark) > Script transform hive serde default field.delimit is '\t' > - > > Key: SPARK-32685 > URL: https://issues.apache.org/jira/browse/SPARK-32685 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: angerszhu >Priority: Major > > > {code:java} > select split(value, "\t") from ( > SELECT TRANSFORM(a, b, c, null) > USING 'cat' > FROM (select 1 as a, 2 as b, 3 as c) t > ) temp; > result is : > _c0 > ["2","3","\\N"]{code} > > {code:java} > select split(value, "\t") from ( > SELECT TRANSFORM(a, b, c, null) > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > USING 'cat' > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > WITH SERDEPROPERTIES ( >'serialization.last.column.takes.rest' = 'true' > ) > FROM (select 1 as a, 2 as b, 3 as c) t > ) temp; > result is : > _c0 > ["2","3","\\N"]{code} > > > > {code:java} > select split(value, "\t") from ( > SELECT TRANSFORM(a, b, c, null) > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > USING 'cat' > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > FROM (select 1 as a, 2 as b, 3 as c) t > ) temp; > result is : > _c0 > ["2"] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-32685) Script transform hive serde default field.delimit is '\t'
[ https://issues.apache.org/jira/browse/SPARK-32685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255365#comment-17255365 ] Apache Spark commented on SPARK-32685: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/30942 > Script transform hive serde default field.delimit is '\t' > - > > Key: SPARK-32685 > URL: https://issues.apache.org/jira/browse/SPARK-32685 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: angerszhu >Priority: Major > > > {code:java} > select split(value, "\t") from ( > SELECT TRANSFORM(a, b, c, null) > USING 'cat' > FROM (select 1 as a, 2 as b, 3 as c) t > ) temp; > result is : > _c0 > ["2","3","\\N"]{code} > > {code:java} > select split(value, "\t") from ( > SELECT TRANSFORM(a, b, c, null) > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > USING 'cat' > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > WITH SERDEPROPERTIES ( >'serialization.last.column.takes.rest' = 'true' > ) > FROM (select 1 as a, 2 as b, 3 as c) t > ) temp; > result is : > _c0 > ["2","3","\\N"]{code} > > > > {code:java} > select split(value, "\t") from ( > SELECT TRANSFORM(a, b, c, null) > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > USING 'cat' > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > FROM (select 1 as a, 2 as b, 3 as c) t > ) temp; > result is : > _c0 > ["2"] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33680) Fix PrunePartitionSuiteBase/BucketedReadWithHiveSupportSuite not to depend on the default conf
[ https://issues.apache.org/jira/browse/SPARK-33680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255364#comment-17255364 ] Apache Spark commented on SPARK-33680: -- User 'Ngone51' has created a pull request for this issue: https://github.com/apache/spark/pull/30941 > Fix PrunePartitionSuiteBase/BucketedReadWithHiveSupportSuite not to depend on > the default conf > -- > > Key: SPARK-33680 > URL: https://issues.apache.org/jira/browse/SPARK-33680 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 3.1.0, 3.2.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 3.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33680) Fix PrunePartitionSuiteBase/BucketedReadWithHiveSupportSuite not to depend on the default conf
[ https://issues.apache.org/jira/browse/SPARK-33680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255362#comment-17255362 ] Apache Spark commented on SPARK-33680: -- User 'Ngone51' has created a pull request for this issue: https://github.com/apache/spark/pull/30941 > Fix PrunePartitionSuiteBase/BucketedReadWithHiveSupportSuite not to depend on > the default conf > -- > > Key: SPARK-33680 > URL: https://issues.apache.org/jira/browse/SPARK-33680 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 3.1.0, 3.2.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 3.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-33922) Fix failing test SparkLauncherSuite.testSparkLauncherGetError
dengziming created SPARK-33922: -- Summary: Fix failing test SparkLauncherSuite.testSparkLauncherGetError Key: SPARK-33922 URL: https://issues.apache.org/jira/browse/SPARK-33922 Project: Spark Issue Type: Improvement Components: Tests Affects Versions: 3.0.1 Reporter: dengziming org.apache.spark.launcher.SparkLauncherSuite.testSparkLauncherGetError fails every time it is executed; note that it is not a flaky test because it fails every time. {code} java.lang.AssertionError at org.junit.Assert.fail(Assert.java:87) at org.junit.Assert.assertTrue(Assert.java:42) at org.junit.Assert.assertTrue(Assert.java:53) at org.apache.spark.launcher.SparkLauncherSuite.testSparkLauncherGetError {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
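For reference, a minimal Scala sketch of the API this test exercises; the app resource and main class are placeholders, and the polling loop is simplified compared to the suite's actual synchronization:
{code:scala}
import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

// Launch an application that is expected to fail, then inspect the error;
// SparkAppHandle.getError is what testSparkLauncherGetError asserts on.
val handle: SparkAppHandle = new SparkLauncher()
  .setAppResource("/path/to/failing-app.jar") // placeholder
  .setMainClass("example.FailingApp")         // placeholder
  .setMaster("local")
  .startApplication()

while (!handle.getState.isFinal) Thread.sleep(100)
handle.getError.ifPresent(t => println(s"application failed: ${t.getMessage}"))
{code}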
[jira] [Commented] (SPARK-33921) Upgrade Scala version to 2.12.12
[ https://issues.apache.org/jira/browse/SPARK-33921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255346#comment-17255346 ] Apache Spark commented on SPARK-33921: -- User 'viirya' has created a pull request for this issue: https://github.com/apache/spark/pull/30940 > Upgrade Scala version to 2.12.12 > > > Key: SPARK-33921 > URL: https://issues.apache.org/jira/browse/SPARK-33921 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > > Upgrade Scala 2.12 patch version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-33921) Upgrade Scala version to 2.12.12
[ https://issues.apache.org/jira/browse/SPARK-33921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33921: Assignee: Apache Spark (was: L. C. Hsieh) > Upgrade Scala version to 2.12.12 > > > Key: SPARK-33921 > URL: https://issues.apache.org/jira/browse/SPARK-33921 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Assignee: Apache Spark >Priority: Major > > Upgrade Scala 2.12 patch version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-33921) Upgrade Scala version to 2.12.12
[ https://issues.apache.org/jira/browse/SPARK-33921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33921: Assignee: L. C. Hsieh (was: Apache Spark) > Upgrade Scala version to 2.12.12 > > > Key: SPARK-33921 > URL: https://issues.apache.org/jira/browse/SPARK-33921 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > > Upgrade Scala 2.12 patch version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33921) Upgrade Scala version to 2.12.12
[ https://issues.apache.org/jira/browse/SPARK-33921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255345#comment-17255345 ] Apache Spark commented on SPARK-33921: -- User 'viirya' has created a pull request for this issue: https://github.com/apache/spark/pull/30940 > Upgrade Scala version to 2.12.12 > > > Key: SPARK-33921 > URL: https://issues.apache.org/jira/browse/SPARK-33921 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > > Upgrade Scala 2.12 patch version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33896) Make Spark DAGScheduler datasource cache aware when scheduling tasks in a multi-replication HDFS
[ https://issues.apache.org/jira/browse/SPARK-33896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255342#comment-17255342 ] Xudingyu commented on SPARK-33896: -- [~sro...@scient.com][~sro...@yahoo.com][~sowen] > Make Spark DAGScheduler datasource cache aware when scheduling tasks in a > multi-replication HDFS > > > Key: SPARK-33896 > URL: https://issues.apache.org/jira/browse/SPARK-33896 > Project: Spark > Issue Type: New Feature > Components: Spark Core >Affects Versions: 3.0.0 >Reporter: Xudingyu >Priority: Critical > > *Goals:* > • Make the Spark 3.0 scheduler datasource-cache-aware in a multi-replication HDFS cluster > • Performance gain in E2E workloads when enabling this feature > *Problem Statement:* > Spark's DAGScheduler currently schedules tasks according to an RDD's preferredLocations, which respects HDFS BlockLocation. In a multi-replication cluster, HDFS BlockLocation can be returned as an Array[BlockLocation], and Spark chooses one of the BlockLocations to run tasks on. +However, tasks can run faster if scheduled to the nodes holding the datasource cache they need. Currently there is no datasource cache locality provision mechanism in Spark, even if nodes in the cluster have cached data+. > This project aims to add a cache-locality-aware mechanism so that the Spark DAGScheduler can schedule tasks to the nodes with datasource cache according to cache locality in a multi-replication HDFS. > *Basic idea:* > The basic idea is to expose a datasource cache locality provider interface in Spark whose default implementation respects HDFS BlockLocation. Worker nodes' datasource cache metadata (like offset and length) needs to be stored in an external DB like Redis. The Spark driver can look up this cache metadata and customize the task locality algorithm to choose the most efficient node. > *CBL (Cost Based Locality)* > CBL (cost based locality) takes cache size, disk IO, network IO, etc. into account when scheduling tasks. > Say there are 3 nodes A, B, C in a 2-replication HDFS cluster. When Spark schedules task1, nodeB has on disk all the data replicas that task1 needs; at the same time, nodeA has 20% of the data in the datasource cache and 50% of the data replicas on disk. > Then we calculate the cost of scheduling task1 on nodeA, nodeB, and nodeC: > CostA = CalculateCost(20% read from cache) + CalculateCost(50% read from disk) + CalculateCost(30% read from remote) > CostB = CalculateCost(100% read from disk) > CostC = CalculateCost(100% read from remote) > Return the node with minimal cost. > *Modifications:* > A config is needed to decide which cache locality provider to use; it can be as follows > {code:java} > SQLConf.PARTITIONED_FILE_PREFERREDLOC_IMPL > {code} > For Spark 3.0, FilePartition.scala's preferredLocations() needs to be modified, which can be as follows > {code:java} > override def preferredLocations(): Array[String] = { > Utils.classForName(SparkEnv.get.conf.get(SQLConf.PARTITIONED_FILE_PREFERREDLOC_IMPL)) > .getConstructor() > .newInstance() > .getPreferredLocs() > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
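To make the CBL cost comparison above concrete, here is an illustrative-only Scala sketch; the weights are made-up constants, whereas the real proposal would derive costs from measured cache, disk IO, and network IO characteristics:
{code:scala}
// Fractions of a task's input served from local cache and local disk;
// the remainder is assumed to be read from remote nodes.
case class NodeStats(cacheFraction: Double, diskFraction: Double)

def cost(s: NodeStats,
         cacheCost: Double = 1.0,  // assumed relative cost weights,
         diskCost: Double = 5.0,   // not measured values
         remoteCost: Double = 20.0): Double = {
  val remoteFraction = 1.0 - s.cacheFraction - s.diskFraction
  s.cacheFraction * cacheCost + s.diskFraction * diskCost + remoteFraction * remoteCost
}

// NodeA: 20% cached + 50% on disk; NodeB: all on disk; NodeC: all remote.
val nodes = Map("A" -> NodeStats(0.2, 0.5), "B" -> NodeStats(0.0, 1.0), "C" -> NodeStats(0.0, 0.0))
val (best, _) = nodes.minBy { case (_, s) => cost(s) }
// Under these weights nodeB wins (5.0 vs 8.7 vs 20.0): avoiding remote reads
// outweighs nodeA's partial cache hits. The decision is weight-dependent.
println(s"schedule task1 on node $best")
{code}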
[jira] [Created] (SPARK-33921) Upgrade Scala version to 2.12.12
L. C. Hsieh created SPARK-33921: --- Summary: Upgrade Scala version to 2.12.12 Key: SPARK-33921 URL: https://issues.apache.org/jira/browse/SPARK-33921 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 3.2.0 Reporter: L. C. Hsieh Assignee: L. C. Hsieh Upgrade Scala 2.12 patch version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33913) Upgrade Kafka to 2.7.0
[ https://issues.apache.org/jira/browse/SPARK-33913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255324#comment-17255324 ] Apache Spark commented on SPARK-33913: -- User 'viirya' has created a pull request for this issue: https://github.com/apache/spark/pull/30939 > Upgrade Kafka to 2.7.0 > -- > > Key: SPARK-33913 > URL: https://issues.apache.org/jira/browse/SPARK-33913 > Project: Spark > Issue Type: Improvement > Components: Build, DStreams >Affects Versions: 3.2.0 >Reporter: dengziming >Priority: Major > > > The Apache Kafka community has released Apache Kafka 2.7.0. Some of its features are useful, for example the KAFKA-9893 configurable TCP connection timeout; more details: > https://downloads.apache.org/kafka/2.7.0/RELEASE_NOTES.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-33913) Upgrade Kafka to 2.7.0
[ https://issues.apache.org/jira/browse/SPARK-33913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33913: Assignee: (was: Apache Spark) > Upgrade Kafka to 2.7.0 > -- > > Key: SPARK-33913 > URL: https://issues.apache.org/jira/browse/SPARK-33913 > Project: Spark > Issue Type: Improvement > Components: Build, DStreams >Affects Versions: 3.2.0 >Reporter: dengziming >Priority: Major > > > The Apache Kafka community has released Apache Kafka 2.7.0. Some of its features are useful, for example the KAFKA-9893 configurable TCP connection timeout; more details: > https://downloads.apache.org/kafka/2.7.0/RELEASE_NOTES.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-33913) Upgrade Kafka to 2.7.0
[ https://issues.apache.org/jira/browse/SPARK-33913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33913: Assignee: Apache Spark > Upgrade Kafka to 2.7.0 > -- > > Key: SPARK-33913 > URL: https://issues.apache.org/jira/browse/SPARK-33913 > Project: Spark > Issue Type: Improvement > Components: Build, DStreams >Affects Versions: 3.2.0 >Reporter: dengziming >Assignee: Apache Spark >Priority: Major > > > The Apache Kafka community has released Apache Kafka 2.7.0. Some of its features are useful, for example the KAFKA-9893 configurable TCP connection timeout; more details: > https://downloads.apache.org/kafka/2.7.0/RELEASE_NOTES.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33920) We cannot pass a schema to the createDataFrame function in Scala; however, we can do this in Python.
[ https://issues.apache.org/jira/browse/SPARK-33920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255321#comment-17255321 ] L. C. Hsieh commented on SPARK-33920: - There is {{def createDataFrame(rowRDD: RDD[Row], schema: StructType)}} in the Scala API. If you mean {{def createDataFrame[A <: Product : TypeTag](data: Seq[A])}}, the Scala API uses Scala reflection to infer the schema of the given Product. Why do you need a {{schema}} parameter here? > We cannot pass a schema to the createDataFrame function in Scala; however, we can do this in Python. > --- > > Key: SPARK-33920 > URL: https://issues.apache.org/jira/browse/SPARK-33920 > Project: Spark > Issue Type: Improvement > Components: Build, SQL >Affects Versions: 3.0.1 >Reporter: Abdul Rafay Abdul Rafay >Priority: Critical > Original Estimate: 168h > Remaining Estimate: 168h > > {{spark.createDataFrame(data, schema)}} > I am able to pass a schema as a parameter to the createDataFrame function in Python but cannot do this in Scala for static data. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
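To illustrate the overload the comment above points to, a self-contained Scala example using the existing {{createDataFrame(rowRDD: RDD[Row], schema: StructType)}} API; only the sample data is made up:
{code:scala}
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").appName("schema-demo").getOrCreate()

// Explicit schema, passed to createDataFrame just like in the Python API.
val schema = StructType(Seq(
  StructField("name", StringType, nullable = true),
  StructField("age", IntegerType, nullable = true)
))
val rows = spark.sparkContext.parallelize(Seq(Row("alice", 30), Row("bob", 25)))

val df = spark.createDataFrame(rows, schema)
df.printSchema()
df.show()
{code}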
[jira] [Commented] (SPARK-33824) Restructure and improve Python package management page
[ https://issues.apache.org/jira/browse/SPARK-33824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255229#comment-17255229 ] Apache Spark commented on SPARK-33824: -- User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/30938 > Restructure and improve Python package management page > -- > > Key: SPARK-33824 > URL: https://issues.apache.org/jira/browse/SPARK-33824 > Project: Spark > Issue Type: Sub-task > Components: docs, PySpark >Affects Versions: 3.1.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 3.1.0 > > > I recently wrote a blog post (to be published soon) about Python dependency management. > This JIRA aims to add some of the contents of the blog post to the PySpark documentation for users. > Please see the linked PR. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33824) Restructure and improve Python package management page
[ https://issues.apache.org/jira/browse/SPARK-33824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255228#comment-17255228 ] Apache Spark commented on SPARK-33824: -- User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/30938 > Restructure and improve Python package management page > -- > > Key: SPARK-33824 > URL: https://issues.apache.org/jira/browse/SPARK-33824 > Project: Spark > Issue Type: Sub-task > Components: docs, PySpark >Affects Versions: 3.1.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 3.1.0 > > > I recently wrote a blog post (to be published soon) about Python dependency management. > This JIRA aims to add some of the contents of the blog post to the PySpark documentation for users. > Please see the linked PR. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-33920) We cannot pass a schema to the createDataFrame function in Scala; however, we can do this in Python.
Abdul Rafay Abdul Rafay created SPARK-33920: --- Summary: We cannot pass a schema to the createDataFrame function in Scala; however, we can do this in Python. Key: SPARK-33920 URL: https://issues.apache.org/jira/browse/SPARK-33920 Project: Spark Issue Type: Improvement Components: Build, SQL Affects Versions: 3.0.1 Reporter: Abdul Rafay Abdul Rafay {{spark.createDataFrame(data, schema)}} I am able to pass a schema as a parameter to the createDataFrame function in Python but cannot do this in Scala for static data. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-33911) Update SQL migration guide about changes in HiveClientImpl
[ https://issues.apache.org/jira/browse/SPARK-33911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-33911: Assignee: Maxim Gekk > Update SQL migration guide about changes in HiveClientImpl > -- > > Key: SPARK-33911 > URL: https://issues.apache.org/jira/browse/SPARK-33911 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 2.4.8, 3.0.2, 3.1.0, 3.2.0 >Reporter: Maxim Gekk >Assignee: Maxim Gekk >Priority: Major > > 1. https://github.com/apache/spark/pull/30802 > 2. https://github.com/apache/spark/pull/30711 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-33911) Update SQL migration guide about changes in HiveClientImpl
[ https://issues.apache.org/jira/browse/SPARK-33911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-33911. -- Fix Version/s: 2.4.8 3.1.0 Resolution: Fixed Issue resolved by pull request 30933 [https://github.com/apache/spark/pull/30933] > Update SQL migration guide about changes in HiveClientImpl > -- > > Key: SPARK-33911 > URL: https://issues.apache.org/jira/browse/SPARK-33911 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 2.4.8, 3.0.2, 3.1.0, 3.2.0 >Reporter: Maxim Gekk >Assignee: Maxim Gekk >Priority: Major > Fix For: 3.1.0, 2.4.8 > > > 1. https://github.com/apache/spark/pull/30802 > 2. https://github.com/apache/spark/pull/30711 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33919) Unify v1 and v2 SHOW NAMESPACES tests
[ https://issues.apache.org/jira/browse/SPARK-33919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255190#comment-17255190 ] Apache Spark commented on SPARK-33919: -- User 'MaxGekk' has created a pull request for this issue: https://github.com/apache/spark/pull/30937 > Unify v1 and v2 SHOW NAMESPACES tests > - > > Key: SPARK-33919 > URL: https://issues.apache.org/jira/browse/SPARK-33919 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Maxim Gekk >Priority: Major > > Write unified tests for SHOW DATABASES and SHOW NAMESPACES that can be run > for v1 and v2 catalogs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-33919) Unify v1 and v2 SHOW NAMESPACES tests
[ https://issues.apache.org/jira/browse/SPARK-33919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33919: Assignee: (was: Apache Spark) > Unify v1 and v2 SHOW NAMESPACES tests > - > > Key: SPARK-33919 > URL: https://issues.apache.org/jira/browse/SPARK-33919 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Maxim Gekk >Priority: Major > > Write unified tests for SHOW DATABASES and SHOW NAMESPACES that can be run > for v1 and v2 catalogs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-33919) Unify v1 and v2 SHOW NAMESPACES tests
[ https://issues.apache.org/jira/browse/SPARK-33919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33919: Assignee: Apache Spark > Unify v1 and v2 SHOW NAMESPACES tests > - > > Key: SPARK-33919 > URL: https://issues.apache.org/jira/browse/SPARK-33919 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Maxim Gekk >Assignee: Apache Spark >Priority: Major > > Write unified tests for SHOW DATABASES and SHOW NAMESPACES that can be run > for v1 and v2 catalogs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33919) Unify v1 and v2 SHOW NAMESPACES tests
[ https://issues.apache.org/jira/browse/SPARK-33919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255189#comment-17255189 ] Apache Spark commented on SPARK-33919: -- User 'MaxGekk' has created a pull request for this issue: https://github.com/apache/spark/pull/30937 > Unify v1 and v2 SHOW NAMESPACES tests > - > > Key: SPARK-33919 > URL: https://issues.apache.org/jira/browse/SPARK-33919 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Maxim Gekk >Priority: Major > > Write unified tests for SHOW DATABASES and SHOW NAMESPACES that can be run > for v1 and v2 catalogs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org