Hi,

The GitHub issue https://github.com/apache/incubator-hudi/issues/547
has resulted in the JIRA issue https://issues.apache.org/jira/browse/HUDI-12.
The requirement is to be able to parse a timestamp from CSV and store it as a 
timestamp in the Parquet table. Does anyone have a working example along these lines?
Going by the Hudi example on GitHub, the timestamp is encoded in Avro as a double:
https://github.com/apache/incubator-hudi/blob/master/hoodie-client/src/test/java/com/uber/hoodie/common/HoodieTestDataGenerator.java#L69

The end result is that the Parquet field for the timestamp is not of a timestamp type (INT96).
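
To make the question concrete, here is a minimal sketch of the conversion I have in mind. It is only an illustration under assumptions: the column name "ts", the "yyyy-MM-dd HH:mm:ss" input format, the UTC time zone, and the helper names are all hypothetical, and I have not verified whether Hudi's writer carries the Avro logical type through to a Parquet timestamp type. The idea is to parse the CSV string into microseconds since the epoch, which is the value an Avro long annotated with the timestamp-micros logical type would hold, instead of a plain double.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical Avro field definition: a long annotated with the
# timestamp-micros logical type described in the Avro specification.
TIMESTAMP_FIELD = {
    "name": "ts",
    "type": {"type": "long", "logicalType": "timestamp-micros"},
}

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_micros(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a timestamp string (assumed to be UTC) into microseconds since the epoch."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    delta = parsed - EPOCH
    # Integer arithmetic only, to avoid floating-point rounding.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def read_rows(csv_text):
    """Yield CSV rows with the assumed 'ts' column converted to epoch microseconds."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["ts"] = to_epoch_micros(row["ts"])
        yield row
```

For example, calling to_epoch_micros("1970-01-01 00:00:01") gives 1000000. What I do not know is how to get such a value written out by Hudi as a proper timestamp column.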

My best guess is that this would have been a requirement at Uber (tracking 
trips in minutes and seconds), so I would like to know how it is being handled there.

If anyone else has handled this and has an example that can be shared, it will 
be much appreciated.
Kabeer Ahmed, http://www.linkedin.com/in/kabeerahmed
