[I] code implementation to support structured streaming [hudi]

via GitHub Sat, 29 Nov 2025 19:33:37 -0800


hudi-bot opened a new issue, #14631:
URL: https://github.com/apache/hudi/issues/14631


   code implementation to support structured streaming
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-1126
   - Type: Sub-task
   
   
   ---
   
   
   ## Comments
   
   06/Aug/20 07:47;linshan;After a few days of thinking, trial and error, I 
have no idea.My implementation is as follows
   {code:java}
   // code placeholder
   override def getBatch(start: Option[Offset], end: Offset): DataFrame = {
     val rdd:RDD[Row] = new IncrementalRelation(sqlContext, path.get, 
parameters, schema).buildScan()
     val rddInternalRow:RDD[InternalRow]  = rdd.map(i=>InternalRow(i))
     sqlContext.internalCreateDataFrame(
       rddInternalRow, schema,  true)
   }{code}
   but it is error.
   {code:java}
   // code placeholder
   [ERROR] 2020-08-06 15:39:52,495(9152) --> [Executor task launch worker for 
task 3] org.apache.spark.internal.Logging$class.logError(Logging.scala:91): 
Exception in task 3.0 in stage 0.0 (TID 3)  [ERROR] 2020-08-06 
15:39:52,495(9152) --> [Executor task launch worker for task 3] 
org.apache.spark.internal.Logging$class.logError(Logging.scala:91): Exception 
in task 3.0 in stage 0.0 (TID 3)  java.lang.ClassCastException: 
org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be cast 
to org.apache.spark.unsafe.types.UTF8String at 
org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getUTF8String(rows.scala:46)
 at 
org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195)
 at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_0$(generated.java:91)
 at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(generated.java:30)
 at org.apache.spark.s
 
ql.execution.RDDScanExec$$anonfun$doExecute$2$$anonfun$apply$5.apply(ExistingRDD.scala:194)
 at 
org.apache.spark.sql.execution.RDDScanExec$$anonfun$doExecute$2$$anonfun$apply$5.apply(ExistingRDD.scala:192)
 at scala.collection.Iterator$$anon$11.next(Iterator.scala:410) at 
org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:118)
 at 
org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
 at 
org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
 at 
org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
 at 
org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
 at 
org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
 at org.apac
 he.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at 
org.apache.spark.scheduler.Task.run(Task.scala:121) at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
 at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748)
   {code}
   I need help. thank you
   
   Other relevant links are below
   
   https://issues.apache.org/jira/browse/HUDI-1109
   
   https://issues.apache.org/jira/browse/HUDI-1125
   
   the pr 
   
   [https://github.com/apache/hudi/pull/1880];;;


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[I] code implementation to support structured streaming [hudi]

Reply via email to