hudi-bot opened a new issue, #14631: URL: https://github.com/apache/hudi/issues/14631
code implementation to support structured streaming ## JIRA info - Link: https://issues.apache.org/jira/browse/HUDI-1126 - Type: Sub-task --- ## Comments 06/Aug/20 07:47;linshan;After a few days of thinking, trial and error, I have no idea.My implementation is as follows {code:java} // code placeholder override def getBatch(start: Option[Offset], end: Offset): DataFrame = { val rdd:RDD[Row] = new IncrementalRelation(sqlContext, path.get, parameters, schema).buildScan() val rddInternalRow:RDD[InternalRow] = rdd.map(i=>InternalRow(i)) sqlContext.internalCreateDataFrame( rddInternalRow, schema, true) }{code} but it is error. {code:java} // code placeholder [ERROR] 2020-08-06 15:39:52,495(9152) --> [Executor task launch worker for task 3] org.apache.spark.internal.Logging$class.logError(Logging.scala:91): Exception in task 3.0 in stage 0.0 (TID 3) [ERROR] 2020-08-06 15:39:52,495(9152) --> [Executor task launch worker for task 3] org.apache.spark.internal.Logging$class.logError(Logging.scala:91): Exception in task 3.0 in stage 0.0 (TID 3) java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be cast to org.apache.spark.unsafe.types.UTF8String at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getUTF8String(rows.scala:46) at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_0$(generated.java:91) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(generated.java:30) at org.apache.spark.s ql.execution.RDDScanExec$$anonfun$doExecute$2$$anonfun$apply$5.apply(ExistingRDD.scala:194) at org.apache.spark.sql.execution.RDDScanExec$$anonfun$doExecute$2$$anonfun$apply$5.apply(ExistingRDD.scala:192) at scala.collection.Iterator$$anon$11.next(Iterator.scala:410) at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:118) at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394) at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146) at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67) at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66) at org.apac he.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:121) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {code} I need help. thank you Other relevant links are below https://issues.apache.org/jira/browse/HUDI-1109 https://issues.apache.org/jira/browse/HUDI-1125 the pr [https://github.com/apache/hudi/pull/1880];;; -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
