[ 
https://issues.apache.org/jira/browse/BEAM-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17549077#comment-17549077
 ] 

Danny McCormick commented on BEAM-10934:
----------------------------------------

This issue has been migrated to https://github.com/apache/beam/issues/20685

> handling Date type in HCatToRow
> -------------------------------
>
>                 Key: BEAM-10934
>                 URL: https://issues.apache.org/jira/browse/BEAM-10934
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-hcatalog, sdk-java-core
>            Reporter: chie hayashida
>            Priority: P3
>              Labels: Clarified, starter
>
> When I convert an HCatRecord that contains a Date-type field to a Row, the 
> conversion fails with the following error.
> * the code
> ```
>     PCollection<Row> p =
>         pipeline
>             /*
>              * Step #1: Read hive table rows from Hive.
>              */
>             .apply(
>                 "Read from Hive source",
>                 HCatToRow.fromSpec(
>                     HCatalogIO.read()
>                         .withConfigProperties(configProperties)
>                         .withDatabase(options.getHiveDatabaseName())
>                         .withTable(options.getHiveTableName())
>                         .withFilter(options.getFilterString())));
> ```
> * error log
> ```
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.IllegalArgumentException: For field name submissiondate and DATETIME type got unexpected class class java.sql.Date
>         at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:348)
>         at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:318)
>         at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:213)
>         at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
>         at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
>         at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303)
>         at com.google.cloud.teleport.v2.templates.HiveToBigQuery.run(HiveToBigQuery.java:234)
>         at com.google.cloud.teleport.v2.templates.HiveToBigQuery.main(HiveToBigQuery.java:176)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: For field name submissiondate and DATETIME type got unexpected class class java.sql.Date
>         at org.apache.beam.sdk.values.Row$Builder.verifyDateTime(Row.java:828)
>         at org.apache.beam.sdk.values.Row$Builder.verifyPrimitiveType(Row.java:755)
>         at org.apache.beam.sdk.values.Row$Builder.verify(Row.java:654)
>         at org.apache.beam.sdk.values.Row$Builder.verify(Row.java:635)
>         at org.apache.beam.sdk.values.Row$Builder.build(Row.java:840)
>         at org.apache.beam.sdk.io.hcatalog.HCatToRow$HCatToRowFn.processElement(HCatToRow.java:84)
> ```
> This happens because HCatalogIO reads Date columns as java.sql.Date in 
> HCatRecord, but the Row class doesn't support Date and HCatToRow doesn't 
> handle it.
> I think there are two possible solutions.
> 1. Make Row support the Date type (java.util.Date or java.sql.Date).
>    I don't know the other IO classes well, but some of them may have the 
>    same problem, and this solution might fix those as well.
> 2. Add logic to HCatToRow that converts Date values to the Datetime type.
>    The impact of this change would be smaller than that of 1. because it 
>    doesn't change the Row class.
> Which would be better?
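
A minimal sketch of the conversion step that option 2 describes, under several assumptions. The class and both helper names are hypothetical and are not part of Beam or HCatToRow. The sketch uses only the JDK and produces a java.time.Instant; as far as I know, Beam's DATETIME field expects a Joda-Time instant, so a real patch would likely wrap the resulting epoch millis in an org.joda.time.Instant instead. Treating the date as midnight UTC is also an assumption, not something the issue specifies.

```java
import java.sql.Date;
import java.time.Instant;
import java.time.ZoneOffset;

final class SqlDateToDatetimeSketch {

    // Hypothetical helper, not a Beam API: maps a java.sql.Date to the
    // instant at midnight UTC of the same calendar day (an assumption).
    static Instant toUtcMidnight(Date date) {
        // java.sql.Date.toInstant() throws UnsupportedOperationException,
        // so the conversion goes through LocalDate instead.
        return date.toLocalDate().atStartOfDay(ZoneOffset.UTC).toInstant();
    }

    // Hypothetical helper: converts only java.sql.Date values and passes
    // every other field value through unchanged.
    static Object normalize(Object value) {
        if (value instanceof Date) {
            return toUtcMidnight((Date) value);
        }
        return value;
    }

    public static void main(String[] args) {
        // Illustrative input only; the date is not taken from the issue.
        Object converted = normalize(Date.valueOf("2020-09-21"));
        System.out.println(converted);
    }
}
```

In HCatToRow, a step like normalize would presumably run over each HCatRecord value before the values are handed to the Row builder, which is where the stack trace above shows the type check failing.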



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
