[GitHub] [hudi] pengzhiwei2018 commented on pull request #1880: [WIP] [HUDI-1125] build framework to support structured streaming

2021-01-25 Thread GitBox


pengzhiwei2018 commented on pull request #1880:
URL: https://github.com/apache/hudi/pull/1880#issuecomment-766562247


   > Hello,
   > 
   > Hudi will have nice features like clustering, and clustering will probably 
rewrite a lot of data. Is it possible for such rewrites, which add no new data, 
not to affect downstream Spark Structured Streaming consumers?
   > 
   > This is similar to what Delta Lake offers for its compaction operation:
   > 
   > https://docs.delta.io/latest/best-practices.html
   > 
   > Its compaction uses .option("dataChange", "false"), so downstream 
consumers are not affected.
   > 
   > Thank you.
   
   Hi @leesf @n3nash @rubenssoto, a new PR has been proposed at 
https://github.com/apache/hudi/pull/2485; we can move the discussion there.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] pengzhiwei2018 commented on pull request #1880: [WIP] [HUDI-1125] build framework to support structured streaming

2021-01-19 Thread GitBox


pengzhiwei2018 commented on pull request #1880:
URL: https://github.com/apache/hudi/pull/1880#issuecomment-762813931


   > > @yanghua @leesf Any update on this PR ?
   > 
   > @n3nash hi, about this work. @pengzhiwei2018 is taking over this.
   
   Hi @n3nash @leesf, I am still working on this feature. I will probably 
provide a new version of the Structured Streaming source next week.


