[ 
https://issues.apache.org/jira/browse/NIFI-15866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18075194#comment-18075194
 ] 

David Handermann commented on NIFI-15866:
-----------------------------------------

Thanks for describing the problem and including the stack trace and Avro 
Schema configuration. On initial review, it seems like the problem might be 
related to the use of the choice type including both {{null}} and the 
logical type. I should be able to take a closer look soon.

> Inserting Date values in Iceberg tables results in Exception
> ------------------------------------------------------------
>
>                 Key: NIFI-15866
>                 URL: https://issues.apache.org/jira/browse/NIFI-15866
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 2.9.0
>            Reporter: Sönke Liebau
>            Priority: Minor
>
> The PutIcebergRecord processor currently does not support columns with type 
> Date. 
> When inserting into a table with a Date column, a ClassCastException between 
> java.sql.Date (from the FlowFile) and java.time.LocalDate (required by 
> PutIcebergRecord) is thrown every time.
> This behavior occurs with Avro as well as Parquet and their specific readers 
> configured in PutIcebergRecord, with a hardcoded Avro Schema referencing the 
> Date column.
>  
> This bug is very similar to an already fixed bug regarding 
> Datetimes/Timestamps: https://issues.apache.org/jira/browse/NIFI-15568.
>  
> *How to reproduce*
>  * Create an Iceberg table with one Date column
>  * Generate a record FlowFile with GenerateFlowFile containing a value for 
> this column, for example as CSV or JSON. Provide an explicit Avro Schema as 
> an attribute in GenerateFlowFile:
>  
> +CSV:+
>  
> {code:java}
> dateCol
> 234234
> {code}
>  
>  
> +Avro Schema:+
>  
> {code:java}
> {
>   "type": "record",
>   "name": "Document",
>   "namespace": "com.example",
>   "fields": [
>     {
>       "name": "dateCol",
>       "type": [
>         "null",
>         {
>           "type": "int",
>           "logicalType": "date"
>         }
>       ]
>     }
>   ]
> }{code}
>  
>  * Use a ConvertRecord to transform the CSV to an Avro FlowFile, using an 
> AvroRecordSetWriter with the setting "Use 'Schema Text' Property"
>  * Write into Iceberg with PutIcebergRecord and a default AvroReader
>  
> Stacktrace: 
> {code:java}
> PutIcebergRecord[id=8e5f6f95-9ec0-37d7-ad34-9aa4500355f7] Write Rows to Table 
> [xyz.test] failed FlowFile[filename=b13a8af8-851e-4f00-93dd-69f40a36e7b5]: 
> java.lang.ClassCastException: class java.sql.Date cannot be cast to class 
> java.time.LocalDate (java.sql.Date is in module java.sql of loader 
> 'platform'; java.time.LocalDate is in module java.base of loader 'bootstrap')
> java.lang.ClassCastException: class java.sql.Date cannot be cast to class 
> java.time.LocalDate (java.sql.Date is in module java.sql of loader 
> 'platform'; java.time.LocalDate is in module java.base of loader 'bootstrap')
>     at org.apache.iceberg.data.parquet.GenericParquetWriter$DateWriter.write(GenericParquetWriter.java:91)
>     at org.apache.iceberg.parquet.ParquetValueWriters$OptionWriter.write(ParquetValueWriters.java:421)
>     at org.apache.iceberg.parquet.ParquetValueWriters$StructWriter.write(ParquetValueWriters.java:665)
>     at org.apache.iceberg.parquet.ParquetWriter.add(ParquetWriter.java:138)
>     at org.apache.iceberg.io.DataWriter.write(DataWriter.java:71)
>     at org.apache.iceberg.io.BaseTaskWriter$RollingFileWriter.write(BaseTaskWriter.java:401)
>     at org.apache.iceberg.io.BaseTaskWriter$RollingFileWriter.write(BaseTaskWriter.java:384)
>     at org.apache.iceberg.io.BaseTaskWriter$BaseRollingWriter.write(BaseTaskWriter.java:311)
>     at org.apache.iceberg.io.UnpartitionedWriter.write(UnpartitionedWriter.java:42)
>     at org.apache.nifi.services.iceberg.parquet.io.ParquetIcebergRowWriter.write(ParquetIcebergRowWriter.java:39)
>     at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(Unknown Source)
>     at java.base/java.lang.reflect.Method.invoke(Unknown Source)
>     at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:251)
>     at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler$ProxiedReturnObjectInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:237)
>     at jdk.proxy515/jdk.proxy515.$Proxy706.write(Unknown Source)
>     at org.apache.nifi.processors.iceberg.PutIcebergRecord.writeRecords(PutIcebergRecord.java:240)
>     at org.apache.nifi.processors.iceberg.PutIcebergRecord.processFlowFiles(PutIcebergRecord.java:176)
>     at org.apache.nifi.processors.iceberg.PutIcebergRecord.onTrigger(PutIcebergRecord.java:156)
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1274)
>     at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:229)
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>     at org.apache.nifi.engine.FlowEngine.lambda$wrap$1(FlowEngine.java:105)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>     at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
>     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.base/java.lang.Thread.run(Unknown Source)
> {code}
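
For illustration only, the type mismatch reported in the stack trace can be reproduced and worked around outside of NiFi. The following is a minimal sketch, not the actual fix for this issue: the class {{DateCoercionSketch}} and the helper {{toLocalDate}} are hypothetical names that do not exist in NiFi or Iceberg. It assumes a record value for a Date column may arrive as a {{java.sql.Date}}, a {{java.time.LocalDate}}, or an {{Integer}} holding days since the Unix epoch (which is how the Avro {{date}} logical type encodes its value), and normalizes all three to {{java.time.LocalDate}}.

```java
import java.time.LocalDate;

public class DateCoercionSketch {

    // Hypothetical helper (not a NiFi or Iceberg API): normalizes the value
    // types assumed above to java.time.LocalDate instead of casting blindly.
    public static LocalDate toLocalDate(Object value) {
        if (value == null) {
            // The schema in this issue is a union with "null", so null is a valid value
            return null;
        }
        if (value instanceof LocalDate) {
            return (LocalDate) value;
        }
        if (value instanceof java.sql.Date) {
            // java.sql.Date is not a LocalDate, so a direct cast throws
            // ClassCastException; toLocalDate() performs the conversion
            return ((java.sql.Date) value).toLocalDate();
        }
        if (value instanceof Integer) {
            // Avro "date" logical type: int counting days since 1970-01-01
            return LocalDate.ofEpochDay((Integer) value);
        }
        throw new IllegalArgumentException(
                "Unsupported date value type: " + value.getClass().getName());
    }

    public static void main(String[] args) {
        Object fromRecord = java.sql.Date.valueOf("2024-02-29");

        try {
            // Same kind of cast that fails in the reported stack trace
            LocalDate broken = (LocalDate) fromRecord;
            System.out.println(broken);
        } catch (ClassCastException e) {
            System.out.println("Direct cast failed: " + e.getMessage());
        }

        System.out.println(toLocalDate(fromRecord)); // prints 2024-02-29
        System.out.println(toLocalDate(234234));     // epoch-day value from the CSV sample
    }
}
```

Checking the runtime type before converting avoids the blind cast to {{java.time.LocalDate}} that fails in the trace above. Where such a conversion would belong in the NiFi code base is not determined by this sketch.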



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
