Added to contributors, and you have the JIRA now ;)

On Fri, Nov 1, 2019 at 10:41 AM Gurudatt Kulkarni <guruak...@gmail.com> wrote:
> Hi Vinoth,
>
> Sure, I can work on that issue. My JIRA username is gurudatt.
>
> On Fri, Nov 1, 2019 at 10:05 PM Vinoth Chandar <vin...@apache.org> wrote:
>
> > Hi Gurudatt,
> >
> > Right now, we are assuming longs to be unix_timestamp. It should be very
> > easy to add support for milliseconds.
> > https://issues.apache.org/jira/browse/HUDI-324
> > Do you want to give it a shot? If not, I can give you a patch. You can
> > also create your own key extractor class.
> >
> > Thanks
> > Vinoth
> >
> > On Fri, Nov 1, 2019 at 5:12 AM Gurudatt Kulkarni <guruak...@gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > I have a use case where data from the database is pulled by Kafka
> > > Connect using the JDBC Connector, and the data is in Avro format. I am
> > > trying to partition the table by date, but the column is of type long.
> > > Below is the Avro schema for the column:
> > >
> > > {
> > >   "type": "long",
> > >   "connect.version": 1,
> > >   "connect.name": "org.apache.kafka.connect.data.Timestamp",
> > >   "logicalType": "timestamp-millis"
> > > }
> > >
> > > I have added the following configuration for DeltaStreamer:
> > >
> > > hoodie.datasource.write.partitionpath.field=datetime_column
> > > hoodie.datasource.write.keygenerator.class=org.apache.hudi.utilities.keygen.TimestampBasedKeyGenerator
> > > hoodie.deltastreamer.keygen.timebased.timestamp.type=UNIX_TIMESTAMP
> > > hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd
> > > hoodie.datasource.hive_sync.partition_fields=datetime_column
> > >
> > > Data is written successfully, but it fails to sync partitions due to a
> > > data type mismatch on the column. Also, I checked
> > > TimestampBasedKeyGenerator, and it only supports timestamps in seconds,
> > > so the generated timestamp is wrong.
> > >
> > > Here's the error:
> > >
> > > 19/11/01 15:00:52 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.hudi.hive.HoodieHiveSyncException: Failed to sync partitions for table table_name
> > > org.apache.hudi.hive.HoodieHiveSyncException: Failed to sync partitions for table table_name
> > >     at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:172)
> > >     at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:107)
> > >     at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:71)
> > >     at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncHive(DeltaSync.java:440)
> > >     at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:382)
> > >     at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:226)
> > >     at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:120)
> > >     at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:292)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >     at java.lang.reflect.Method.invoke(Method.java:498)
> > >     at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:688)
> > > Caused by: org.apache.hudi.hive.HoodieHiveSyncException: Failed in executing SQL ALTER TABLE default.table_name ADD IF NOT EXISTS PARTITION (`datetime_column`='11536-04-01') LOCATION 'hdfs:/tmp/hoodie/tables/table_name/11536/04/01'
> > >     at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:476)
> > >     at org.apache.hudi.hive.HoodieHiveClient.addPartitionsToTable(HoodieHiveClient.java:139)
> > >     at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:167)
> > >     ... 12 more
> > > Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10248]: Cannot add partition column datetime_column of type string as it cannot be converted to type bigint
> > >     at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:241)
> > >     at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:227)
> > >     at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:255)
> > >     at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:474)
> > >     ... 14 more
> > >
> > > Has anyone faced this use case before? Any thoughts on handling it?
> > >
> > > Regards,
> > > Gurudatt
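
The seconds-versus-milliseconds mismatch discussed in this thread can be illustrated with a small sketch. It is written in Python only to show the arithmetic and is not Hudi's key generator code. The function names `partition_path` and `approx_year` and the sample value are hypothetical. The sketch assumes UTC and mirrors the `yyyy/MM/dd` output format from the configuration above.

```python
from datetime import datetime, timezone

# Average Gregorian year in seconds; used only for a rough year estimate.
SECONDS_PER_YEAR = 31_556_952


def partition_path(epoch_value: int, unit: str = "millis") -> str:
    """Format an epoch value as a yyyy/MM/dd partition path in UTC.

    `unit` says how to interpret `epoch_value`: "millis" or "seconds".
    """
    if unit == "millis":
        seconds = epoch_value // 1000
    elif unit == "seconds":
        seconds = epoch_value
    else:
        raise ValueError(f"unsupported unit: {unit!r}")
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y/%m/%d")


def approx_year(epoch_seconds: int) -> int:
    """Rough calendar year for an epoch value interpreted as seconds."""
    return 1970 + epoch_seconds // SECONDS_PER_YEAR


# Hypothetical sample: 2019-11-01T00:00:00Z expressed in milliseconds.
sample_millis = 1_572_566_400_000

# Interpreted correctly as milliseconds, the path is a sensible date.
print(partition_path(sample_millis, "millis"))

# Interpreted as seconds, the same number lands thousands of years in the
# future, which is the kind of partition value seen in the error above.
print(approx_year(sample_millis))
```

A timestamp-millis column read as seconds is off by a factor of 1000, so any fix (a milliseconds option in the key generator, or a custom key extractor class) has to divide by 1000 before formatting the date.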