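[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform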

GitBox Wed, 30 Dec 2020 07:24:17 -0800


AngersZhuuuu commented on a change in pull request #30957:
URL: https://github.com/apache/spark/pull/30957#discussion_r550230592




##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/BaseScriptTransformationExec.scala
##########
@@ -329,14 +332,45 @@ case class ScriptTransformationIOSchema(
     schemaLess: Boolean) extends Serializable {
   import ScriptTransformationIOSchema._
 
-  val inputRowFormatMap = inputRowFormat.toMap.withDefault((k) => defaultFormat(k))
-  val outputRowFormatMap = outputRowFormat.toMap.withDefault((k) => defaultFormat(k))
+  val inputRowFormatMap = inputRowFormat.toMap.withDefault(k => defaultFormat(k))
+  val outputRowFormatMap = outputRowFormat.toMap.withDefault(k => defaultFormat(k))
+
+  val separators = (getByte(inputRowFormatMap("TOK_TABLEROWFORMATFIELD"), 0.toByte) ::
+    getByte(inputRowFormatMap("TOK_TABLEROWFORMATCOLLITEMS"), 1.toByte) ::
+    getByte(inputRowFormatMap("TOK_TABLEROWFORMATMAPKEYS"), 2.toByte) :: Nil) ++
+    (4 to 8).map(_.toByte)

Review comment:
       > we cannot write a custom parser for nested arrays just like a json parser?
   
   Do you mean something similar to `DelimitedJSONSerDe`? I think the default behavior should follow `LazySimpleSerDe` and stay consistent with Hive.
   Constructing the JSON string should be handled by the Spark SQL script transform (no-serde) mode's own serde. @Alfozan has already done this and will raise a PR.
   https://github.com/apache/spark/pull/29085#issuecomment-658131729
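
   To illustrate the `LazySimpleSerDe`-style approach, here is a minimal sketch of depth-indexed delimiter serialization. It is not the code in this PR: `NestedDelimiterSketch`, `serialize`, and `serializeRow` are hypothetical names, and the separators are printable stand-ins chosen for readability rather than the control bytes built in the diff above. The idea is that each nesting level joins its children with the separator at its own depth, and a map uses the next separator between key and value.

   ```scala
   object NestedDelimiterSketch {
     // Hypothetical, human-readable separators, one per nesting level.
     // Index 0 joins top-level fields; deeper indexes join nested children.
     val separators: Vector[Char] = Vector('\t', ',', ':', '|')

     // Serialize `value`, whose children are joined by separators(depth).
     def serialize(value: Any, depth: Int): String = value match {
       case null => "\\N" // assumed null marker, following Hive's text convention
       case m: Map[_, _] =>
         // Entries are joined by separators(depth); key and value by
         // separators(depth + 1); keys and values themselves nest at depth + 2.
         m.iterator.map { case (k, v) =>
           serialize(k, depth + 2) + separators(depth + 1) + serialize(v, depth + 2)
         }.mkString(separators(depth).toString)
       case s: Seq[_] =>
         // Array or struct-like values: children nest one level deeper.
         s.map(serialize(_, depth + 1)).mkString(separators(depth).toString)
       case other => other.toString
     }

     // Top-level columns are joined by separators(0); each column starts at depth 1.
     def serializeRow(row: Seq[Any]): String =
       row.map(serialize(_, 1)).mkString(separators(0).toString)
   }
   ```

   With these stand-in separators, `NestedDelimiterSketch.serializeRow(Seq("x", Seq(1, 2), Map("a" -> Seq(3, 4))))` yields `x`, `1,2`, and `a:3|4` joined by tabs. Because the format has no quoting or escaping, the number of supported nesting levels is bounded by the number of separators, which is the trade-off compared with a JSON-based serde.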





[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform
