hudi-bot opened a new issue, #14718:
URL: https://github.com/apache/hudi/issues/14718

   Hi, all:
    We plan to use Hudi to sync MySQL binlog data. A Flink ETL task will
consume binlog records from Kafka and save the data to Hudi every hour.
The binlog records are also grouped by hour, and all records of one hour
are saved in one commit. The data transmission pipeline looks like this:
binlog -> Kafka -> Flink -> Parquet.
   
   After the data is synced to Hudi, we want to query the historical hourly
versions of the Hudi table in Hive SQL.
   
   Here is a more detailed description of our issue, along with a simple design
of Time Travel for Hudi. The design is still under development and testing:
   
   
[https://docs.google.com/document/d/1r0iwUsklw9aKSDMzZaiq43dy57cSJSAqT9KCvgjbtUo/edit?usp=sharing]
   
   We need to support Time Travel soon for our business needs. We have also
seen
[RFC 07|https://cwiki.apache.org/confluence/display/HUDI/RFC+-+07+%3A+Point+in+time+Time-Travel+queries+on+Hudi+table].
    We would be glad to receive any suggestions or discussion.
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-1460
   - Type: New Feature
   
   
   ---
   
   
   ## Comments
   
   14/Dec/20 16:07;xleesf;[~qian heng] sorry, I could not access the Google doc
you provided. It would be better if you sent a discussion email to the dev
ML. ;;;
   
   ---
   
   14/Dec/20 18:50;nishith29;[~qian heng] As [~xleesf] pointed out, I was also
unable to access the Google doc. Could you please start a discussion thread on
the dev mailing list? This will help you get feedback from other members as
well. Based on that, we can see whether this needs a separate RFC or whether we
can make changes to RFC-07;;;
   
   ---
   
   15/Dec/20 00:02;vinoth;+1 if we can keep discussions to the mailing list and
then onto the cwiki, that would be great.
   
   Happy to provide any access/permissions as needed. ;;;
   
   ---
   
   15/Dec/20 06:20;qian heng;The doc is accessible now, sorry for the
mistake;;;
   
   ---
   
   12/Mar/22 14:12;xushiyan;[~x1q1j1] can you please go through the description
and design doc to see if any further work is needed?;;;
   
   ---
   
   13/Mar/22 05:39;x1q1j1;hi [~qian heng]
   1. Spark SQL already supports time travel queries on Hudi tables
(HUDI-3221).
   2. Hive SQL needs syntax support added to the Hive source code. (This
will be implemented later than Presto.)
   3. Presto/Trino SQL time travel queries on Hudi tables will be implemented
next.;;;
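
For readers who want to try the Spark route mentioned in item 1, the following is a minimal sketch, not the design proposed in this issue. It assumes the `TIMESTAMP AS OF` clause and the `as.of.instant` read option described in the Hudi time travel documentation. It also assumes each hourly commit completes before the hour boundary being queried, so that truncating to the hour selects the intended version. The table name `binlog_orders` and the helper names are hypothetical.

```python
from datetime import datetime


def hourly_instant(hour: datetime) -> str:
    """Truncate to the hour and format as a Hudi-style instant (yyyyMMddHHmmss)."""
    truncated = hour.replace(minute=0, second=0, microsecond=0)
    return truncated.strftime("%Y%m%d%H%M%S")


def time_travel_sql(table: str, hour: datetime) -> str:
    """Build a Spark SQL time travel query for the table state at the given hour."""
    truncated = hour.replace(minute=0, second=0, microsecond=0)
    ts = truncated.strftime("%Y-%m-%d %H:%M:%S")
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{ts}'"


def read_as_of(spark, base_path: str, hour: datetime):
    """DataFrame variant of the same query.

    Assumption: `spark` is a SparkSession started with a Hudi Spark bundle
    on the classpath; this function is not exercised without one.
    """
    return (
        spark.read.format("hudi")
        .option("as.of.instant", hourly_instant(hour))
        .load(base_path)
    )
```

For example, `time_travel_sql("binlog_orders", datetime(2022, 3, 13, 5, 39))` produces a query for the table as of 05:00 on that day, which can be passed to `spark.sql(...)`. If hourly commits can land late, the instant to query should be taken from the commit timeline rather than computed from the wall clock.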


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
