[spark] branch branch-3.5 updated: [SPARK-45484][SQL][3.5] Deprecated the incorrect parquet compression codec lz4raw

2023-10-16 Thread beliefer
This is an automated email from the ASF dual-hosted git repository.

beliefer pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
 new b2103731bcf [SPARK-45484][SQL][3.5] Deprecated the incorrect parquet 
compression codec lz4raw
b2103731bcf is described below

commit b2103731bcfe7e0bee3b1302c773e46f80badcc9
Author: Jiaan Geng 
AuthorDate: Tue Oct 17 09:50:39 2023 +0800

[SPARK-45484][SQL][3.5] Deprecated the incorrect parquet compression codec 
lz4raw

### What changes were proposed in this pull request?
According to the discussion at 
https://github.com/apache/spark/pull/43310#issuecomment-1757139681, this PR 
deprecates the incorrect parquet compression codec `lz4raw` in Spark 3.5.1 
and adds a warning log.

The warning log informs users that `lz4raw` will be removed in Apache 
Spark 4.0.0.

### Why are the changes needed?
Deprecated the incorrect parquet compression codec `lz4raw`.

### Does this PR introduce _any_ user-facing change?
'Yes'.
Users will see the warning log below.
`Parquet compression codec 'lz4raw' is deprecated, please use 'lz4_raw'`

### How was this patch tested?
Existing test cases and new test cases.

### Was this patch authored or co-authored using generative AI tooling?
'No'.
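
A minimal sketch of the user-facing behavior, assuming a `SparkSession` named 
`spark` and a DataFrame `df` (both placeholders, not part of this commit): the 
new `lz4_raw` spelling works both through the session conf and the per-write 
`compression` option, while `lz4raw` still works in 3.5.x but emits the 
warning above.

```scala
// Sketch only: `spark` and `df` are assumed to exist; the output path is a
// placeholder. Prefer the new spelling "lz4_raw"; "lz4raw" is deprecated.
spark.conf.set("spark.sql.parquet.compression.codec", "lz4_raw")

// The per-write `compression` option takes precedence over the session conf.
df.write
  .option("compression", "lz4_raw")
  .parquet("/tmp/parquet-out")
```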

Closes #43330 from beliefer/SPARK-45484_3.5.

Authored-by: Jiaan Geng 
Signed-off-by: Jiaan Geng 
---
 .../org/apache/spark/sql/internal/SQLConf.scala| 14 ++--
 .../datasources/parquet/ParquetOptions.scala   |  8 ++-
 .../datasources/FileSourceCodecSuite.scala |  2 +-
 .../ParquetCompressionCodecPrecedenceSuite.scala   | 25 ++
 4 files changed, 41 insertions(+), 8 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 73d3756ef6b..427d0480190 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -995,12 +995,22 @@ object SQLConf {
   "`parquet.compression` is specified in the table-specific 
options/properties, the " +
   "precedence would be `compression`, `parquet.compression`, " +
   "`spark.sql.parquet.compression.codec`. Acceptable values include: none, 
uncompressed, " +
-  "snappy, gzip, lzo, brotli, lz4, lz4raw, zstd.")
+  "snappy, gzip, lzo, brotli, lz4, lz4raw, lz4_raw, zstd.")
 .version("1.1.1")
 .stringConf
 .transform(_.toLowerCase(Locale.ROOT))
 .checkValues(
-  Set("none", "uncompressed", "snappy", "gzip", "lzo", "brotli", "lz4", 
"lz4raw", "zstd"))
+  Set(
+"none",
+"uncompressed",
+"snappy",
+"gzip",
+"lzo",
+"brotli",
+"lz4",
+"lz4raw",
+"lz4_raw",
+"zstd"))
 .createWithDefault("snappy")
 
   val PARQUET_FILTER_PUSHDOWN_ENABLED = 
buildConf("spark.sql.parquet.filterPushdown")
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala
index 023d2460959..95869b6fbb9 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala
@@ -22,6 +22,7 @@ import java.util.Locale
 import org.apache.parquet.hadoop.ParquetOutputFormat
 import org.apache.parquet.hadoop.metadata.CompressionCodecName
 
+import org.apache.spark.internal.Logging
 import org.apache.spark.sql.catalyst.{DataSourceOptions, FileSourceOptions}
 import org.apache.spark.sql.catalyst.util.CaseInsensitiveMap
 import org.apache.spark.sql.internal.SQLConf
@@ -32,7 +33,7 @@ import org.apache.spark.sql.internal.SQLConf
 class ParquetOptions(
 @transient private val parameters: CaseInsensitiveMap[String],
 @transient private val sqlConf: SQLConf)
-  extends FileSourceOptions(parameters) {
+  extends FileSourceOptions(parameters) with Logging {
 
   import ParquetOptions._
 
@@ -59,6 +60,9 @@ class ParquetOptions(
   throw new IllegalArgumentException(s"Codec [$codecName] " +
 s"is not available. Available codecs are ${availableCodecs.mkString(", 
")}.")
 }
+if (codecName == "lz4raw") {
+  log.warn("Parquet compression codec 'lz4raw' is deprecated, please use 
'lz4_raw'")
+}
 shortParquetCompressionCodecNames(codecName).name()
   }
 
@@ -96,7 +100,9 @@ object ParquetOptions extends DataSourceOptions {
 "lzo" -> CompressionCodecName.LZO,
 "brotli" -> CompressionCodecName.BROTLI,
 "lz4" -> CompressionCodecName.LZ4,
+// Deprecated, to be removed at Spark 4.0.0, 

Re: [PR] Add canonical links to the PySpark docs page for published docs [spark-website]

2023-10-16 Thread via GitHub


panbingkun commented on PR #482:
URL: https://github.com/apache/spark-website/pull/482#issuecomment-1765503134

   > > Because the location of the same document may change in different 
versions
   > 
   > Yes exactly, and we should not change the URL structure of any 
documentation published in the future. I think the URL structure stays the same 
for docs after 3.2 (correct me if I am wrong).
   > 
   > @panbingkun if it's not too much trouble, can we manually update the 
canonical link for docs < version 3.2? And we need to make sure we don't change 
the doc URL structure again in the future. cc @allanf-db @HyukjinKwon 
@zhengruifeng
   
   Of course. I wrote a small tool last time; with a slight modification to its 
logic it should be able to handle this, but I need to check carefully to ensure 
it is completely accurate this time. Please wait for me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-45517][CONNECT] Expand more exception constructors to support error framework parameters

2023-10-16 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 28961a6ce00 [SPARK-45517][CONNECT] Expand more exception constructors 
to support error framework parameters
28961a6ce00 is described below

commit 28961a6ce001e0c25c780a39a726fdd825542cee
Author: Yihong He 
AuthorDate: Tue Oct 17 09:11:22 2023 +0900

[SPARK-45517][CONNECT] Expand more exception constructors to support error 
framework parameters

### What changes were proposed in this pull request?

- Expand more exception constructors to support error framework parameters

### Why are the changes needed?

- Better integration with the error framework
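
A hedged sketch of what this enables on the caller side (not code from this 
PR): with the error-framework fields populated, client code can branch on the 
error class or SQLSTATE instead of parsing message text. The `spark` session 
is assumed; the `SparkThrowable` accessors shown already exist in the error 
framework.

```scala
// Sketch only: assumes a SparkSession `spark`. SparkThrowable exposes the
// error class, SQLSTATE, and message parameters carried by these exceptions.
import org.apache.spark.SparkThrowable

try {
  spark.sql("SELECT assert_true(1 = 2)").collect()
} catch {
  case e: SparkThrowable =>
    println(s"errorClass=${e.getErrorClass}, sqlState=${e.getSqlState}")
    println(e.getMessageParameters)
}
```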

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

`build/sbt "connect-client-jvm/testOnly *SparkConnectClientSuite"`

### Was this patch authored or co-authored using generative AI tooling?

Closes #43368 from heyihong/SPARK-45517.

Authored-by: Yihong He 
Signed-off-by: Hyukjin Kwon 
---
 .../scala/org/apache/spark/SparkException.scala| 16 ++--
 .../sql/streaming/StreamingQueryException.scala|  3 +-
 .../connect/client/SparkConnectClientSuite.scala   | 28 ++-
 .../connect/client/GrpcExceptionConverter.scala| 93 ++
 .../catalyst/analysis/alreadyExistException.scala  |  6 +-
 .../catalyst/analysis/noSuchItemsExceptions.scala  |  4 +-
 6 files changed, 119 insertions(+), 31 deletions(-)

diff --git a/common/utils/src/main/scala/org/apache/spark/SparkException.scala 
b/common/utils/src/main/scala/org/apache/spark/SparkException.scala
index 828948b48c1..3bcdd0a7c29 100644
--- a/common/utils/src/main/scala/org/apache/spark/SparkException.scala
+++ b/common/utils/src/main/scala/org/apache/spark/SparkException.scala
@@ -133,7 +133,7 @@ private[spark] case class ExecutorDeadException(message: 
String)
 /**
  * Exception thrown when Spark returns different result after upgrading to a 
new version.
  */
-private[spark] class SparkUpgradeException private(
+private[spark] class SparkUpgradeException(
   message: String,
   cause: Option[Throwable],
   errorClass: Option[String],
@@ -169,7 +169,7 @@ private[spark] class SparkUpgradeException private(
 /**
  * Arithmetic exception thrown from Spark with an error class.
  */
-private[spark] class SparkArithmeticException private(
+private[spark] class SparkArithmeticException(
 message: String,
 errorClass: Option[String],
 messageParameters: Map[String, String],
@@ -207,7 +207,7 @@ private[spark] class SparkArithmeticException private(
 /**
  * Unsupported operation exception thrown from Spark with an error class.
  */
-private[spark] class SparkUnsupportedOperationException private(
+private[spark] class SparkUnsupportedOperationException(
   message: String,
   errorClass: Option[String],
   messageParameters: Map[String, String])
@@ -271,7 +271,7 @@ private[spark] class SparkConcurrentModificationException(
 /**
  * Datetime exception thrown from Spark with an error class.
  */
-private[spark] class SparkDateTimeException private(
+private[spark] class SparkDateTimeException(
 message: String,
 errorClass: Option[String],
 messageParameters: Map[String, String],
@@ -324,7 +324,7 @@ private[spark] class SparkFileNotFoundException(
 /**
  * Number format exception thrown from Spark with an error class.
  */
-private[spark] class SparkNumberFormatException private(
+private[spark] class SparkNumberFormatException private[spark](
 message: String,
 errorClass: Option[String],
 messageParameters: Map[String, String],
@@ -363,7 +363,7 @@ private[spark] class SparkNumberFormatException private(
 /**
  * Illegal argument exception thrown from Spark with an error class.
  */
-private[spark] class SparkIllegalArgumentException private(
+private[spark] class SparkIllegalArgumentException(
 message: String,
 cause: Option[Throwable],
 errorClass: Option[String],
@@ -403,7 +403,7 @@ private[spark] class SparkIllegalArgumentException private(
   override def getQueryContext: Array[QueryContext] = context
 }
 
-private[spark] class SparkRuntimeException private(
+private[spark] class SparkRuntimeException(
 message: String,
 cause: Option[Throwable],
 errorClass: Option[String],
@@ -480,7 +480,7 @@ private[spark] class SparkSecurityException(
 /**
  * Array index out of bounds exception thrown from Spark with an error class.
  */
-private[spark] class SparkArrayIndexOutOfBoundsException private(
+private[spark] class SparkArrayIndexOutOfBoundsException(
   message: String,
   errorClass: Option[String],
   messageParameters: Map[String, String],
diff --git 
a/common/utils/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryException.scala
 

Re: [PR] Add canonical links to the PySpark docs page for published docs [spark-website]

2023-10-16 Thread via GitHub


allisonwang-db commented on PR #482:
URL: https://github.com/apache/spark-website/pull/482#issuecomment-1765322679

   > Because the location of the same document may change in different versions
   
   Yes exactly, and we should not change the URL structure of any documentation 
published in the future. I think the URL structure stays the same for docs 
after 3.2 (correct me if I am wrong).
   
   @panbingkun if it's not too much trouble, can we manually update the 
canonical link for docs < version 3.2? And we need to make sure we don't change 
the doc URL structure again in the future. cc @allanf-db @HyukjinKwon 
@zhengruifeng 
   





[spark] branch master updated: [MINOR][CORE] Add `@Deprecated` for `SparkLauncher#DEPRECATED_CHILD_CONNECTION_TIMEOUT`

2023-10-16 Thread yangjie01
This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 721ea9bbb2f [MINOR][CORE] Add `@Deprecated` for 
`SparkLauncher#DEPRECATED_CHILD_CONNECTION_TIMEOUT`
721ea9bbb2f is described below

commit 721ea9bbb2ff77b6d2f575fdca0aeda84990cc3b
Author: yangjie01 
AuthorDate: Mon Oct 16 23:37:38 2023 +0800

[MINOR][CORE] Add `@Deprecated` for 
`SparkLauncher#DEPRECATED_CHILD_CONNECTION_TIMEOUT`

### What changes were proposed in this pull request?
This PR just adds `@Deprecated` for 
`SparkLauncher#DEPRECATED_CHILD_CONNECTION_TIMEOUT`.

### Why are the changes needed?
From the javadoc, `DEPRECATED_CHILD_CONNECTION_TIMEOUT` has been 
deprecated, so it should carry the `@Deprecated` annotation.
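
For reference, a minimal sketch of the non-deprecated usage (the application 
jar path is a placeholder, not from this commit):

```scala
// Sketch only: prefer the CHILD_CONNECTION_TIMEOUT constant over the
// deprecated (and misspelled) DEPRECATED_CHILD_CONNECTION_TIMEOUT key.
import org.apache.spark.launcher.SparkLauncher

val launcher = new SparkLauncher()
  .setAppResource("/path/to/app.jar") // placeholder path
  .setConf(SparkLauncher.CHILD_CONNECTION_TIMEOUT, "10000")
```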

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43374 from 
LuciferYang/DEPRECATED_CHILD_CONNECTION_TIMEOUT-Deprecated.

Authored-by: yangjie01 
Signed-off-by: yangjie01 
---
 launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java | 1 +
 1 file changed, 1 insertion(+)

diff --git 
a/launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java 
b/launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java
index 61624779027..bcff454b99a 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java
@@ -99,6 +99,7 @@ public class SparkLauncher extends 
AbstractLauncher {
* @deprecated use `CHILD_CONNECTION_TIMEOUT`
* @since 1.6.0
*/
+  @Deprecated
   public static final String DEPRECATED_CHILD_CONNECTION_TIMEOUT =
 "spark.launcher.childConectionTimeout";
 





[spark] branch master updated: [SPARK-45540][BUILD] Upgrade jetty to 9.4.53.v20231009

2023-10-16 Thread yangjie01
This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 86001c13865 [SPARK-45540][BUILD] Upgrade jetty to 9.4.53.v20231009
86001c13865 is described below

commit 86001c13865eae6bfcab4dd7c8e0390a1cbb5adc
Author: yangjie01 
AuthorDate: Mon Oct 16 22:47:28 2023 +0800

[SPARK-45540][BUILD] Upgrade jetty to 9.4.53.v20231009

### What changes were proposed in this pull request?
This pr aims to upgrade jetty from 9.4.52.v20230823 to 9.4.53.v20231009

### Why are the changes needed?
This version fixes 2 CVEs:

- [CVE-2023-36478](https://github.com/advisories/GHSA-wgh7-54f2-x98r) | 
https://github.com/apache/spark/security/dependabot/77
- [CVE-2023-44487](https://github.com/advisories/GHSA-qppj-fm5r-hxr3)

The full release notes are as follows:
- https://github.com/jetty/jetty.project/releases/tag/jetty-9.4.53.v20231009

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43375 from LuciferYang/SPARK-45540.

Lead-authored-by: yangjie01 
Co-authored-by: YangJie 
Signed-off-by: yangjie01 
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 4 ++--
 pom.xml   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 
b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 7286b4bd131..f896df11923 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -129,8 +129,8 @@ 
jersey-container-servlet/2.40//jersey-container-servlet-2.40.jar
 jersey-hk2/2.40//jersey-hk2-2.40.jar
 jersey-server/2.40//jersey-server-2.40.jar
 jettison/1.5.4//jettison-1.5.4.jar
-jetty-util-ajax/9.4.52.v20230823//jetty-util-ajax-9.4.52.v20230823.jar
-jetty-util/9.4.52.v20230823//jetty-util-9.4.52.v20230823.jar
+jetty-util-ajax/9.4.53.v20231009//jetty-util-ajax-9.4.53.v20231009.jar
+jetty-util/9.4.53.v20231009//jetty-util-9.4.53.v20231009.jar
 jline/2.14.6//jline-2.14.6.jar
 jline/3.22.0//jline-3.22.0.jar
 jna/5.13.0//jna-5.13.0.jar
diff --git a/pom.xml b/pom.xml
index 4741afd1a64..b6804dfb75f 100644
--- a/pom.xml
+++ b/pom.xml
@@ -143,7 +143,7 @@
 1.13.1
 1.9.1
 shaded-protobuf
-9.4.52.v20230823
+9.4.53.v20231009
 4.0.3
 0.10.0
 

[spark] branch master updated: Revert "[SPARK-45502][BUILD] Upgrade Kafka to 3.6.0"

2023-10-16 Thread yangjie01
This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 62653b96c9c Revert "[SPARK-45502][BUILD] Upgrade Kafka to 3.6.0"
62653b96c9c is described below

commit 62653b96c9c9fbe98117f656d8000bf39a29498e
Author: yangjie01 
AuthorDate: Mon Oct 16 17:42:43 2023 +0800

Revert "[SPARK-45502][BUILD] Upgrade Kafka to 3.6.0"

This reverts commit d1bd21a2a219ebe6c5ac3fcb1e17db75af3c670c.

### What changes were proposed in this pull request?
This PR aims to revert SPARK-45502 to make the test case 
`KafkaSourceStressSuite` stable.

### Why are the changes needed?
The test case `KafkaSourceStressSuite` has become very unstable since 
SPARK-45502 was merged, with 10 of the last 22 runs failing because of it. 
Revert it for now; we can upgrade Kafka again after resolving the test issues.

- https://github.com/apache/spark/actions/runs/6497999347/job/17648385705
- https://github.com/apache/spark/actions/runs/6502219014/job/17660900989
- https://github.com/apache/spark/actions/runs/6502591917/job/17661861797
- https://github.com/apache/spark/actions/runs/6503144598/job/17663199041
- https://github.com/apache/spark/actions/runs/6503233514/job/17663413817
- https://github.com/apache/spark/actions/runs/6504416528/job/17666334238
- https://github.com/apache/spark/actions/runs/6509796846/job/17682130466
- https://github.com/apache/spark/actions/runs/6510877112/job/17685502094
- https://github.com/apache/spark/actions/runs/6512948316/job/17691625228
- https://github.com/apache/spark/actions/runs/6516366232/job/17699813649

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43379 from LuciferYang/Revert-SPARK-45502.

Authored-by: yangjie01 
Signed-off-by: yangjie01 
---
 .../apache/spark/sql/kafka010/KafkaTestUtils.scala |  4 ++--
 .../spark/streaming/kafka010/KafkaRDDSuite.scala   | 16 ++
 .../spark/streaming/kafka010/KafkaTestUtils.scala  |  4 ++--
 .../streaming/kafka010/mocks/MockScheduler.scala   | 25 +++---
 pom.xml|  2 +-
 5 files changed, 25 insertions(+), 26 deletions(-)

diff --git 
a/connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
 
b/connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
index 2b0c13ed443..c54afc6290b 100644
--- 
a/connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
+++ 
b/connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
@@ -28,6 +28,7 @@ import scala.io.Source
 import scala.jdk.CollectionConverters._
 
 import com.google.common.io.Files
+import kafka.api.Request
 import kafka.server.{HostedPartition, KafkaConfig, KafkaServer}
 import kafka.server.checkpoints.OffsetCheckpointFile
 import kafka.zk.KafkaZkClient
@@ -39,7 +40,6 @@ import org.apache.kafka.clients.producer._
 import org.apache.kafka.common.TopicPartition
 import org.apache.kafka.common.config.SaslConfigs
 import org.apache.kafka.common.network.ListenerName
-import org.apache.kafka.common.requests.FetchRequest
 import org.apache.kafka.common.security.auth.SecurityProtocol.{PLAINTEXT, 
SASL_PLAINTEXT}
 import org.apache.kafka.common.serialization.StringSerializer
 import org.apache.kafka.common.utils.SystemTime
@@ -597,7 +597,7 @@ class KafkaTestUtils(
 .getPartitionInfo(topic, partition) match {
   case Some(partitionState) =>
 zkClient.getLeaderForPartition(new TopicPartition(topic, 
partition)).isDefined &&
-  FetchRequest.isValidBrokerId(partitionState.leader) &&
+  Request.isValidBrokerId(partitionState.leader) &&
   !partitionState.replicas.isEmpty
 
   case _ =>
diff --git 
a/connector/kafka-0-10/src/test/scala/org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala
 
b/connector/kafka-0-10/src/test/scala/org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala
index ae941b1fddd..735ec2f7b44 100644
--- 
a/connector/kafka-0-10/src/test/scala/org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala
+++ 
b/connector/kafka-0-10/src/test/scala/org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala
@@ -24,14 +24,12 @@ import scala.concurrent.duration._
 import scala.jdk.CollectionConverters._
 import scala.util.Random
 
-import kafka.log.{LogCleaner, UnifiedLog}
-import kafka.server.BrokerTopicStats
+import kafka.log.{CleanerConfig, LogCleaner, LogConfig, 
ProducerStateManagerConfig, UnifiedLog}
+import kafka.server.{BrokerTopicStats, LogDirFailureChannel}
 import kafka.utils.Pool
 

[spark] branch branch-3.5 updated: [SPARK-44619][INFRA][3.5] Free up disk space for container jobs

2023-10-16 Thread yangjie01
This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
 new 0dc1962374d [SPARK-44619][INFRA][3.5] Free up disk space for container 
jobs
0dc1962374d is described below

commit 0dc1962374dceea29a0fa7802881dfeff335d3c9
Author: Ruifeng Zheng 
AuthorDate: Mon Oct 16 17:10:26 2023 +0800

[SPARK-44619][INFRA][3.5] Free up disk space for container jobs

### What changes were proposed in this pull request?
Free up disk space for container jobs

### Why are the changes needed?
increase the available disk space

before this PR

![image](https://github.com/apache/spark/assets/7322292/64230324-607b-4c1d-ac2d-84b9bcaab12a)

after this PR

![image](https://github.com/apache/spark/assets/7322292/aafed2d6-5d26-4f7f-b020-1efe4f551a8f)

### Does this PR introduce _any_ user-facing change?
No, infra-only

### How was this patch tested?
updated CI

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43381 from LuciferYang/SPARK-44619-35.

Authored-by: Ruifeng Zheng 
Signed-off-by: yangjie01 
---
 .github/workflows/build_and_test.yml |  6 ++
 dev/free_disk_space_container| 33 +
 2 files changed, 39 insertions(+)

diff --git a/.github/workflows/build_and_test.yml 
b/.github/workflows/build_and_test.yml
index 1fcca7e4c39..674e5950851 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -407,6 +407,8 @@ jobs:
 key: pyspark-coursier-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
 restore-keys: |
   pyspark-coursier-
+- name: Free up disk space
+  run: ./dev/free_disk_space_container
 - name: Install Java ${{ matrix.java }}
   uses: actions/setup-java@v3
   with:
@@ -504,6 +506,8 @@ jobs:
 key: sparkr-coursier-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
 restore-keys: |
   sparkr-coursier-
+- name: Free up disk space
+  run: ./dev/free_disk_space_container
 - name: Install Java ${{ inputs.java }}
   uses: actions/setup-java@v3
   with:
@@ -612,6 +616,8 @@ jobs:
 key: docs-maven-${{ hashFiles('**/pom.xml') }}
 restore-keys: |
   docs-maven-
+- name: Free up disk space
+  run: ./dev/free_disk_space_container
 - name: Install Java 8
   uses: actions/setup-java@v3
   with:
diff --git a/dev/free_disk_space_container b/dev/free_disk_space_container
new file mode 100755
index 000..cc3b74643e4
--- /dev/null
+++ b/dev/free_disk_space_container
@@ -0,0 +1,33 @@
+#!/usr/bin/env bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+echo "=="
+echo "Free up disk space on CI system"
+echo "=="
+
+echo "Listing 100 largest packages"
+dpkg-query -Wf '${Installed-Size}\t${Package}\n' | sort -n | tail -n 100
+df -h
+
+echo "Removing large packages"
+rm -rf /__t/CodeQL
+rm -rf /__t/go
+rm -rf /__t/node
+
+df -h





[spark] branch master updated: [SPARK-44262][SQL] Add `dropTable` and `getInsertStatement` to JdbcDialect

2023-10-16 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 6994bad5e6e [SPARK-44262][SQL] Add `dropTable` and 
`getInsertStatement` to JdbcDialect
6994bad5e6e is described below

commit 6994bad5e6ea8700d48cbe20e9b406b89925adc7
Author: Jia Fan 
AuthorDate: Mon Oct 16 13:55:24 2023 +0500

[SPARK-44262][SQL] Add `dropTable` and `getInsertStatement` to JdbcDialect

### What changes were proposed in this pull request?
1. This PR adds a `dropTable` function to `JdbcDialect`, so users can override 
the DROP TABLE SQL in other JdbcDialects such as Neo4j.
Neo4j drop case:
```sql
MATCH (m:Person {name: 'Mark'})
DELETE m
```
2. It also adds `getInsertStatement` for the same reason.
Neo4j insert case:
```sql
MATCH (p:Person {name: 'Jennifer'})
SET p.birthdate = date('1980-01-01')
RETURN p
```
Neo4j's query language (in fact named `CQL`) is not like normal SQL, but Neo4j 
has a JDBC driver.
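
A rough, hedged sketch of a dialect using the two new hooks (the CQL strings 
and the exact `dropTable` signature are illustrative assumptions; 
`insertIntoTable` follows the signature shown in the diff below):

```scala
// Sketch only: a custom dialect that supplies its own drop/insert statements.
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
import org.apache.spark.sql.types.StructField

object Neo4jLikeDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:neo4j")

  // Assumed shape: returns the statement that JdbcUtils.dropTable executes.
  override def dropTable(table: String): String =
    s"MATCH (m:$table) DETACH DELETE m"

  // Matches the signature added to JdbcDialect in this PR.
  override def insertIntoTable(table: String, fields: Array[StructField]): String = {
    val props = fields.map(f => s"${f.name}: ?").mkString(", ")
    s"CREATE (:$table {$props})"
  }
}

// Register the dialect so it is picked up for matching JDBC URLs.
JdbcDialects.registerDialect(Neo4jLikeDialect)
```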

### Why are the changes needed?
Make JdbcDialect more useful

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing tests.

Closes #41855 from Hisoka-X/SPARK-44262_JDBCUtils_improve.

Authored-by: Jia Fan 
Signed-off-by: Max Gekk 
---
 .../sql/execution/datasources/jdbc/JdbcUtils.scala | 14 +--
 .../org/apache/spark/sql/jdbc/JdbcDialects.scala   | 29 ++
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
index fb9e11df188..f2b84810175 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
@@ -78,7 +78,8 @@ object JdbcUtils extends Logging with SQLConfHelper {
* Drops a table from the JDBC database.
*/
   def dropTable(conn: Connection, table: String, options: JDBCOptions): Unit = 
{
-executeStatement(conn, options, s"DROP TABLE $table")
+val dialect = JdbcDialects.get(options.url)
+executeStatement(conn, options, dialect.dropTable(table))
   }
 
   /**
@@ -114,22 +115,19 @@ object JdbcUtils extends Logging with SQLConfHelper {
   isCaseSensitive: Boolean,
   dialect: JdbcDialect): String = {
 val columns = if (tableSchema.isEmpty) {
-  rddSchema.fields.map(x => dialect.quoteIdentifier(x.name)).mkString(",")
+  rddSchema.fields
 } else {
   // The generated insert statement needs to follow rddSchema's column 
sequence and
   // tableSchema's column names. When appending data into some 
case-sensitive DBMSs like
   // PostgreSQL/Oracle, we need to respect the existing case-sensitive 
column names instead of
   // RDD column names for user convenience.
-  val tableColumnNames = tableSchema.get.fieldNames
   rddSchema.fields.map { col =>
-val normalizedName = tableColumnNames.find(f => conf.resolver(f, 
col.name)).getOrElse {
+tableSchema.get.find(f => conf.resolver(f.name, col.name)).getOrElse {
   throw QueryCompilationErrors.columnNotFoundInSchemaError(col, 
tableSchema)
 }
-dialect.quoteIdentifier(normalizedName)
-  }.mkString(",")
+  }
 }
-val placeholders = rddSchema.fields.map(_ => "?").mkString(",")
-s"INSERT INTO $table ($columns) VALUES ($placeholders)"
+dialect.insertIntoTable(table, columns)
   }
 
   /**
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
index 22625523a04..37c378c294c 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
@@ -193,6 +193,24 @@ abstract class JdbcDialect extends Serializable with 
Logging {
 statement.executeUpdate(s"CREATE TABLE $tableName ($strSchema) 
$createTableOptions")
   }
 
+  /**
+   * Returns an Insert SQL statement template for inserting a row into the 
target table via JDBC
+   * conn. Use "?" as placeholder for each value to be inserted.
+   * E.g. `INSERT INTO t ("name", "age", "gender") VALUES (?, ?, ?)`
+   *
+   * @param table The name of the table.
+   * @param fields The fields of the row that will be inserted.
+   * @return The SQL query to use for insert data into table.
+   */
+  @Since("4.0.0")
+  def insertIntoTable(
+  table: String,
+  fields: Array[StructField]): String = {
+val placeholders = fields.map(_ => "?").mkString(",")
+val columns = fields.map(x => quoteIdentifier(x.name)).mkString(",")
+s"INSERT INTO $table ($columns) VALUES 

[spark] branch branch-3.5 updated: [SPARK-45538][PYTHON][CONNECT] pyspark connect overwrite_partitions bug

2023-10-16 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
 new daa3281e6a6 [SPARK-45538][PYTHON][CONNECT] pyspark connect 
overwrite_partitions bug
daa3281e6a6 is described below

commit daa3281e6a68845943fcf61ba7ad1d2d3c8be28f
Author: xieshuaihu 
AuthorDate: Mon Oct 16 17:01:18 2023 +0900

[SPARK-45538][PYTHON][CONNECT] pyspark connect overwrite_partitions bug

Fix a bug in pyspark connect.

DataFrameWriterV2.overwritePartitions sets the mode to overwrite_partitions 
[pyspark/sql/connect/readwriter.py, line 825], but WriteOperationV2 expects 
overwrite_partition [pyspark/sql/connect/plan.py, line 1660].

Make dataframe.writeTo(table).overwritePartitions() work.

No

No test. This bug is very obvious.

No

Closes #43367 from xieshuaihu/python_connect_overwrite.

Authored-by: xieshuaihu 
Signed-off-by: Hyukjin Kwon 
(cherry picked from commit 9bdad31039134b492caeeba430120d5978a085ee)
Signed-off-by: Hyukjin Kwon 
---
 python/pyspark/sql/connect/plan.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/pyspark/sql/connect/plan.py 
b/python/pyspark/sql/connect/plan.py
index b7ea1f94993..9af5823dd8b 100644
--- a/python/pyspark/sql/connect/plan.py
+++ b/python/pyspark/sql/connect/plan.py
@@ -1655,7 +1655,7 @@ class WriteOperationV2(LogicalPlan):
 plan.write_operation_v2.mode = 
proto.WriteOperationV2.Mode.MODE_CREATE
 elif wm == "overwrite":
 plan.write_operation_v2.mode = 
proto.WriteOperationV2.Mode.MODE_OVERWRITE
-elif wm == "overwrite_partition":
+elif wm == "overwrite_partitions":
 plan.write_operation_v2.mode = 
proto.WriteOperationV2.Mode.MODE_OVERWRITE_PARTITIONS
 elif wm == "append":
 plan.write_operation_v2.mode = 
proto.WriteOperationV2.Mode.MODE_APPEND





[spark] branch master updated: [SPARK-45538][PYTHON][CONNECT] pyspark connect overwrite_partitions bug

2023-10-16 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 9bdad310391 [SPARK-45538][PYTHON][CONNECT] pyspark connect 
overwrite_partitions bug
9bdad310391 is described below

commit 9bdad31039134b492caeeba430120d5978a085ee
Author: xieshuaihu 
AuthorDate: Mon Oct 16 17:01:18 2023 +0900

[SPARK-45538][PYTHON][CONNECT] pyspark connect overwrite_partitions bug

### What changes were proposed in this pull request?

Fix a bug in pyspark connect.

DataFrameWriterV2.overwritePartitions sets the mode to overwrite_partitions 
[pyspark/sql/connect/readwriter.py, line 825], but WriteOperationV2 expects 
overwrite_partition [pyspark/sql/connect/plan.py, line 1660].

### Why are the changes needed?

Make dataframe.writeTo(table).overwritePartitions() work.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

No test. This bug is very obvious.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43367 from xieshuaihu/python_connect_overwrite.

Authored-by: xieshuaihu 
Signed-off-by: Hyukjin Kwon 
---
 python/pyspark/sql/connect/plan.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/pyspark/sql/connect/plan.py 
b/python/pyspark/sql/connect/plan.py
index 10565b9965a..0121d4c3d57 100644
--- a/python/pyspark/sql/connect/plan.py
+++ b/python/pyspark/sql/connect/plan.py
@@ -1743,7 +1743,7 @@ class WriteOperationV2(LogicalPlan):
 plan.write_operation_v2.mode = 
proto.WriteOperationV2.Mode.MODE_CREATE
 elif wm == "overwrite":
 plan.write_operation_v2.mode = 
proto.WriteOperationV2.Mode.MODE_OVERWRITE
-elif wm == "overwrite_partition":
+elif wm == "overwrite_partitions":
 plan.write_operation_v2.mode = 
proto.WriteOperationV2.Mode.MODE_OVERWRITE_PARTITIONS
 elif wm == "append":
 plan.write_operation_v2.mode = 
proto.WriteOperationV2.Mode.MODE_APPEND





[spark] branch master updated: [SPARK-45531][SQL][DOCS] Add more comments and rename some variable name for InjectRuntimeFilter

2023-10-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 7796d8a6331 [SPARK-45531][SQL][DOCS] Add more comments and rename some 
variable name for InjectRuntimeFilter
7796d8a6331 is described below

commit 7796d8a63318d560b08d4d6a8b4d68ea0112bd3e
Author: Jiaan Geng 
AuthorDate: Mon Oct 16 15:40:17 2023 +0800

[SPARK-45531][SQL][DOCS] Add more comments and rename some variable name 
for InjectRuntimeFilter

### What changes were proposed in this pull request?
After many improvements, `InjectRuntimeFilter` has become a bit complex. We need 
to add more comments to capture the design details and rename some variables so 
that `InjectRuntimeFilter` has better readability and maintainability.

The core of a runtime filter is the join keys, but the suffix `Exp` is not 
intuitive, so it is better to use the suffix `Key` directly. Rename as follows:
`filterApplicationSideExp` -> `filterApplicationSideKey`
`filterCreationSideExp` -> `filterCreationSideKey`
`findBloomFilterWithExp` -> `findBloomFilterWithKey`
`expr` -> `joinKey`
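
For context, a hedged illustration of the kind of plan this rule targets (the 
conf key and the table/column names are assumptions, not part of this diff): 
the selective side of the join (creation side) yields a runtime filter on the 
join key that prunes the scan of the large side (application side).

```scala
// Sketch only: `spark` is an assumed SparkSession; "sales"/"stores" are
// hypothetical tables. The conf key is an assumption based on
// conf.runtimeFilterBloomFilterEnabled referenced in this rule.
spark.conf.set("spark.sql.optimizer.runtime.bloomFilter.enabled", "true")

import spark.implicits._

// "stores" with a selective filter is the creation side; "sales" is the
// application side that a bloom filter on store_id can prune at scan time.
val result = spark.table("sales")
  .join(spark.table("stores").where($"country" === "DE"), Seq("store_id"))
```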

### Why are the changes needed?
Improve the readability and maintainability.

### Does this PR introduce _any_ user-facing change?
'No'.

### How was this patch tested?
N/A

### Was this patch authored or co-authored using generative AI tooling?
'No'.

Closes #43359 from beliefer/SPARK-45531.

Authored-by: Jiaan Geng 
Signed-off-by: Wenchen Fan 
---
 .../catalyst/optimizer/InjectRuntimeFilter.scala   | 76 --
 1 file changed, 40 insertions(+), 36 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala
index 614ab4a1d01..8737082e571 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala
@@ -29,48 +29,50 @@ import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 
 /**
- * Insert a filter on one side of the join if the other side has a selective 
predicate.
- * The filter could be an IN subquery (converted to a semi join), a bloom 
filter, or something
- * else in the future.
+ * Insert a runtime filter on one side of the join (we call this side the 
application side) if
+ * we can extract a runtime filter from the other side (creation side). A 
simple case is that
+ * the creation side is a table scan with a selective filter.
+ * The runtime filter is logically an IN subquery with the join keys 
(converted to a semi join),
+ * but can be something different physically, such as a bloom filter.
  */
 object InjectRuntimeFilter extends Rule[LogicalPlan] with PredicateHelper with 
JoinSelectionHelper {
 
-  // Wraps `expr` with a hash function if its byte size is larger than an 
integer.
-  private def mayWrapWithHash(expr: Expression): Expression = {
-if (expr.dataType.defaultSize > IntegerType.defaultSize) {
-  new Murmur3Hash(Seq(expr))
+  // Wraps `joinKey` with a hash function if its byte size is larger than an 
integer.
+  private def mayWrapWithHash(joinKey: Expression): Expression = {
+if (joinKey.dataType.defaultSize > IntegerType.defaultSize) {
+  new Murmur3Hash(Seq(joinKey))
 } else {
-  expr
+  joinKey
 }
   }
 
   private def injectFilter(
-  filterApplicationSideExp: Expression,
+  filterApplicationSideKey: Expression,
   filterApplicationSidePlan: LogicalPlan,
-  filterCreationSideExp: Expression,
+  filterCreationSideKey: Expression,
   filterCreationSidePlan: LogicalPlan): LogicalPlan = {
 require(conf.runtimeFilterBloomFilterEnabled || 
conf.runtimeFilterSemiJoinReductionEnabled)
 if (conf.runtimeFilterBloomFilterEnabled) {
   injectBloomFilter(
-filterApplicationSideExp,
+filterApplicationSideKey,
 filterApplicationSidePlan,
-filterCreationSideExp,
+filterCreationSideKey,
 filterCreationSidePlan
   )
 } else {
   injectInSubqueryFilter(
-filterApplicationSideExp,
+filterApplicationSideKey,
 filterApplicationSidePlan,
-filterCreationSideExp,
+filterCreationSideKey,
 filterCreationSidePlan
   )
 }
   }
 
   private def injectBloomFilter(
-  filterApplicationSideExp: Expression,
+  filterApplicationSideKey: Expression,
   filterApplicationSidePlan: LogicalPlan,
-  filterCreationSideExp: Expression,
+  filterCreationSideKey: Expression,
   filterCreationSidePlan: LogicalPlan): LogicalPlan = {
 // Skip if the filter creation 

[spark] branch master updated: [SPARK-45491] Add missing SQLSTATES 2/2

2023-10-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new c69117742b7 [SPARK-45491] Add missing SQLSTATES 2/2
c69117742b7 is described below

commit c69117742b7c05fe58ca2b94cd713231dc8f7a3d
Author: srielau 
AuthorDate: Mon Oct 16 14:15:16 2023 +0800

[SPARK-45491] Add missing SQLSTATES 2/2

### What changes were proposed in this pull request?

Complete the addition of SQLSTATEs to all named error classes.

### Why are the changes needed?

We need SQLSTATEs to classify errors and catch them in JDBC/ODBC.

### Does this PR introduce _any_ user-facing change?

Yes, SQLSTATEs are documented.

### How was this patch tested?

Run existing QA

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43376 from srielau/SPARK-45491-tenp-errors-sqlstates-2.

Authored-by: srielau 
Signed-off-by: Wenchen Fan 
---
 common/utils/src/main/resources/error/README.md|  14 +-
 .../src/main/resources/error/error-classes.json| 174 ++---
 .../org/apache/spark/SparkThrowableSuite.scala |   4 +
 ...-conditions-unsupported-add-file-error-class.md |   2 +-
 ...itions-unsupported-default-value-error-class.md |   2 +-
 ...conditions-unsupported-generator-error-class.md |   2 +-
 ...or-conditions-unsupported-insert-error-class.md |   2 +-
 ...ions-unsupported-merge-condition-error-class.md |   2 +-
 ...conditions-unsupported-overwrite-error-class.md |   2 +-
 ...conditions-unsupported-save-mode-error-class.md |   2 +-
 docs/sql-error-conditions.md   | 126 ---
 .../spark/sql/errors/QueryExecutionErrors.scala|   2 +-
 .../ceil-floor-with-scale-param.sql.out|   8 +-
 .../sql-tests/analyzer-results/extract.sql.out |   4 +-
 .../analyzer-results/group-analytics.sql.out   |  14 +-
 .../named-function-arguments.sql.out   |   4 +-
 .../postgreSQL/window_part3.sql.out|   2 +
 .../table-valued-functions.sql.out |   7 +-
 .../udf/udf-group-analytics.sql.out|  14 +-
 .../results/ceil-floor-with-scale-param.sql.out|   8 +-
 .../resources/sql-tests/results/extract.sql.out|   4 +-
 .../sql-tests/results/group-analytics.sql.out  |  14 +-
 .../results/named-function-arguments.sql.out   |   4 +-
 .../results/postgreSQL/window_part3.sql.out|   2 +
 .../results/table-valued-functions.sql.out |   7 +-
 .../results/udf/udf-group-analytics.sql.out|  14 +-
 26 files changed, 279 insertions(+), 161 deletions(-)

diff --git a/common/utils/src/main/resources/error/README.md 
b/common/utils/src/main/resources/error/README.md
index 3402f1a..ac388c29250 100644
--- a/common/utils/src/main/resources/error/README.md
+++ b/common/utils/src/main/resources/error/README.md
@@ -868,6 +868,10 @@ The following SQLSTATEs are collated from:
 |42K0D|42   |Syntax error or Access Rule violation |K0D 
|Invalid lambda function |Spark  |N 
  |Spark
   |
 |42K0E|42   |Syntax error or Access Rule violation |K0E 
|An expression is not valid in teh context it is used|Spark  |N 
  |Spark
   |
 |42K0F|42   |Syntax error or Access Rule violation |K0F |A 
persisted object cannot reference a temporary object. |Spark  |N
   |Spark   
|
+|42K0G|42   |Syntax error or Access Rule violation |K0G |A 
protobuf is invalid   |Spark  |N
   |Spark   
|
+|42K0H|42   |Syntax error or Access Rule violation |K0H |A 
cyclic invocation has been detected.  |Spark  |N
   |Spark   
|
+|42K0I|42   |Syntax error or Access Rule violation |K0I 
|SQL Config not found.   |Spark  |N 
  |Spark
   |
+|42K0J|42   |Syntax error or Access Rule violation |K0J 
|Property not found. |Spark  |N 
  |Spark
   |
 |42KD0|42   |Syntax error or Access Rule violation |KD0 
|Ambiguous name reference.   |Databricks