This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
     new 64242bf6a64 [SPARK-43380][SQL][FOLLOW-UP] Fix slowdown in Avro read
64242bf6a64 is described below

commit 64242bf6a6425274b83bc1191230437c2d3fbc71
Author: zeruibao <zerui....@databricks.com>
AuthorDate: Tue Oct 31 16:46:40 2023 -0700

    [SPARK-43380][SQL][FOLLOW-UP] Fix slowdown in Avro read
    
    ### What changes were proposed in this pull request?
    Fix a slowdown in Avro reads. An earlier change,
    https://github.com/apache/spark/pull/42503, caused a performance
    regression. It seems that `SQLConf.get.getConf(confKey)` is very costly,
    so this change moves the lookup out of the `newWriter` function and into
    a `lazy val` on `AvroDeserializer`.
    
    ### Why are the changes needed?
    To fix the performance regression in Avro reads.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    Existing unit tests.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No
    
    Closes #43606 from zeruibao/SPARK-43380-FIX-SLOWDOWN.
    
    Authored-by: zeruibao <zerui....@databricks.com>
    Signed-off-by: Gengliang Wang <gengli...@apache.org>
    (cherry picked from commit 45f73bc69655a236323be1bcb2988341d2aa5203)
    Signed-off-by: Gengliang Wang <gengli...@apache.org>
---
 .../src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala  | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala b/connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
index fe0bd7392b6..ec34d10a5ff 100644
--- a/connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
+++ b/connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
@@ -105,6 +105,9 @@ private[sql] class AvroDeserializer(
       s"Cannot convert Avro type $rootAvroType to SQL type ${rootCatalystType.sql}.", ise)
   }
 
+  private lazy val preventReadingIncorrectType = !SQLConf.get
+    .getConf(SQLConf.LEGACY_AVRO_ALLOW_INCOMPATIBLE_SCHEMA)
+
   def deserialize(data: Any): Option[Any] = converter(data)
 
   /**
@@ -122,8 +125,6 @@ private[sql] class AvroDeserializer(
         s"schema is incompatible (avroType = $avroType, sqlType = ${catalystType.sql})"
 
     val realDataType = SchemaConverters.toSqlType(avroType, useStableIdForUnionType).dataType
-    val confKey = SQLConf.LEGACY_AVRO_ALLOW_INCOMPATIBLE_SCHEMA
-    val preventReadingIncorrectType = !SQLConf.get.getConf(confKey)
 
     (avroType.getType, catalystType) match {
       case (NULL, NullType) => (updater, ordinal, _) =>
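
The change in the diff above can be sketched in isolation. This is a minimal illustration only, not Spark code: `FakeConf`, `PerCallDeserializer`, and `HoistedDeserializer` are hypothetical names, and `FakeConf.getFlag` is a stand-in for a configuration lookup assumed to be costly. It assumes `newWriter` is called many times per deserializer instance, as the commit message implies.

```scala
// Hypothetical stand-in for an expensive configuration lookup (not Spark's SQLConf).
object FakeConf {
  var lookups = 0 // counts how many times the lookup ran
  def getFlag(key: String): Boolean = {
    lookups += 1
    false // pretend the legacy flag is disabled
  }
}

// Before: the flag is looked up on every call to newWriter.
class PerCallDeserializer {
  def newWriter(field: String): String = {
    val preventReadingIncorrectType = !FakeConf.getFlag("allowIncompatibleSchema")
    s"$field:$preventReadingIncorrectType"
  }
}

// After: the flag is resolved once per instance, on first use.
class HoistedDeserializer {
  private lazy val preventReadingIncorrectType =
    !FakeConf.getFlag("allowIncompatibleSchema")

  def newWriter(field: String): String =
    s"$field:$preventReadingIncorrectType"
}
```

One consequence of the `lazy val` form is that the value is fixed after its first evaluation, so an instance does not observe later changes to the configuration.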


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
