added to contributors and you have the jira now ;)

On Fri, Nov 1, 2019 at 10:41 AM Gurudatt Kulkarni <guruak...@gmail.com>
wrote:

> Hi Vinoth,
> Sure, I can work on that issue, my jira username is gurudatt.
>
> On Fri, Nov 1, 2019 at 10:05 PM Vinoth Chandar <vin...@apache.org> wrote:
>
> > Hi Gurudatt,
> >
> > Right now, we are assuming longs to be unix_timestamp. It should very
> easy
> > to add support for milli seconds.
> > https://issues.apache.org/jira/browse/HUDI-324 Do you want to give it a
> > shot?
> > If not, i can give you a patch.For you, can also create your own key
> > extractor class ..
> >
> >
> > Thanks
> > Vinoth
> >
> > On Fri, Nov 1, 2019 at 5:12 AM Gurudatt Kulkarni <guruak...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > I have a use-case where the data from the database is being pulled by
> > Kafka
> > > Connect using JDBC Connector, and the data is in Avro format. I am
> trying
> > > to partition the table based on date, but the column is in long type.
> > Below
> > > is the avro schema for the column
> > >
> > >         {
> > >           "type": "long",
> > >           "connect.version": 1,
> > >           "connect.name": "org.apache.kafka.connect.data.Timestamp",
> > >           "logicalType": "timestamp-millis"
> > >         }
> > >
> > > I have added the following configuration for DeltaStreamer
> > >
> > > hoodie.datasource.write.partitionpath.field=datetime_column
> > >
> > >
> >
> hoodie.datasource.write.keygenerator.class=org.apache.hudi.utilities.keygen.TimestampBasedKeyGenerator
> > > hoodie.deltastreamer.keygen.timebased.timestamp.type=UNIX_TIMESTAMP
> > > hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd
> > > hoodie.datasource.hive_sync.partition_fields=datetime_column
> > >
> > > Data is getting written successfully, but it fails to sync partitions
> due
> > > to data type mismatch of the column. Also, I
> > > checked TimestampBasedKeyGenerator, it only supports timestamp in
> > seconds.
> > > So the timestamp generated is wrong.
> > >
> > > Here's the error
> > >
> > > 19/11/01 15:00:52 ERROR yarn.ApplicationMaster: User class threw
> > > exception: org.apache.hudi.hive.HoodieHiveSyncException: Failed to
> > > sync partitions for table table_name
> > > org.apache.hudi.hive.HoodieHiveSyncException: Failed to sync
> > > partitions for table table_name
> > >   at
> > > org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:172)
> > >   at
> > >
> org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:107)
> > >   at
> > > org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:71)
> > >   at
> > >
> >
> org.apache.hudi.utilities.deltastreamer.DeltaSync.syncHive(DeltaSync.java:440)
> > >   at
> > >
> >
> org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:382)
> > >   at
> > >
> >
> org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:226)
> > >   at
> > >
> >
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:120)
> > >   at
> > >
> >
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:292)
> > >   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >   at
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >   at
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >   at java.lang.reflect.Method.invoke(Method.java:498)
> > >   at
> > >
> >
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:688)
> > > Caused by: org.apache.hudi.hive.HoodieHiveSyncException: Failed in
> > > executing SQL ALTER TABLE default.table_name ADD IF NOT EXISTS
> > > PARTITION (`datetime_column`='11536-04-01') LOCATION
> > > 'hdfs:/tmp/hoodie/tables/table_name/11536/04/01'
> > >   at
> > >
> >
> org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:476)
> > >   at
> > >
> >
> org.apache.hudi.hive.HoodieHiveClient.addPartitionsToTable(HoodieHiveClient.java:139)
> > >   at
> > > org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:167)
> > >   ... 12 more
> > >
> > > Caused by: org.apache.hive.service.cli.HiveSQLException: Error while
> > > compiling statement: FAILED: SemanticException [Error 10248]: Cannot
> > > add partition column datetime_column of type string as it cannot be
> > > converted to type bigint
> > >
> > >   at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:241)
> > >   at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:227)
> > >   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:255)
> > >   at
> > >
> >
> org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:474)
> > >   ... 14 more
> > >
> > >
> > >
> > > Anyone who's faced this use-case before, and any thoughts on handling
> > this?
> > >
> > >
> > > Regards,
> > >
> > > Gurudatt
> > >
> >
>

Reply via email to