[GitHub] [hudi] vinothchandar commented on pull request #2485: [HUDI-1109] Support Spark Structured Streaming read from Hudi table

2021-02-06 Thread GitBox


vinothchandar commented on pull request #2485:
URL: https://github.com/apache/hudi/pull/2485#issuecomment-774528989


   >how can we know the max commit_seq_no in the commit
   
   I think we should do what would be done for Kafka's case, or just use an 
accumulator to obtain this on each commit. Either way, let's file a follow-up 
JIRA to allow record-level streaming; we can do that in a follow-up.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] vinothchandar commented on pull request #2485: [HUDI-1109] Support Spark Structured Streaming read from Hudi table

2021-02-03 Thread GitBox


vinothchandar commented on pull request #2485:
URL: https://github.com/apache/hudi/pull/2485#issuecomment-772797323


   @pengzhiwei2018 I am planning to spend some time on this as well. 
   
   High-level question: does the `offset` for the streaming read map to 
`_hoodie_commit_seq_no` in this implementation? That way we can actually do 
record-level streams and even resume. 







[GitHub] [hudi] vinothchandar commented on pull request #2485: [HUDI-1109] Support Spark Structured Streaming read from Hudi table

2021-01-25 Thread GitBox


vinothchandar commented on pull request #2485:
URL: https://github.com/apache/hudi/pull/2485#issuecomment-766593559


   cc @garyli1019 mind taking a first pass at this PR? :) 






