Re: [I] [SUPPORT] Datasource incremental subsequent read same as first read [hudi]

2024-01-31 Thread via GitHub
parisni commented on issue #7846: URL: https://github.com/apache/hudi/issues/7846#issuecomment-1920644875 Thanks both for your insight. I am wondering if this behavior also applies to Iceberg and Delta. If not, Hudi might align with them and disable this cache by default. -- This is an automated messa

Re: [I] [SUPPORT] Datasource incremental subsequent read same as first read [hudi]

2024-01-31 Thread via GitHub
beyond1920 commented on issue #7846: URL: https://github.com/apache/hudi/issues/7846#issuecomment-1920428504 @parisni I agree with @ad1happy2go: the caching happens in Spark rather than in Hudi. Spark caches by `dbName`.`tableName`. https://github.com/apache/hudi/assets/1525333/2557fd

Re: [I] [SUPPORT] Datasource incremental subsequent read same as first read [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #7846: URL: https://github.com/apache/hudi/issues/7846#issuecomment-1919582070 @parisni This is a similar issue, related to Spark SQL caching the results. This is done to optimise subsequent reads from the table in the running terminal.
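
For readers hitting the same caching behaviour: the snippets here do not show which workaround was suggested, so the following is only a minimal sketch of one common approach, asking Spark to invalidate its cached entry for the table before reading it again. The helper name `read_latest` is hypothetical; `spark` is assumed to be a `pyspark.sql.SparkSession`, and `hudi_table` is the table name used elsewhere in this thread. Whether this is sufficient for a given Hudi setup is not confirmed by the thread.

```python
def read_latest(spark, table_name):
    """Refresh Spark's cached entry for `table_name`, then re-read the table.

    Assumption: `spark` is a pyspark.sql.SparkSession (or any object exposing
    the same `catalog.refreshTable` and `read.table` methods).
    """
    # Catalog.refreshTable invalidates and refreshes the cached data and
    # metadata that Spark holds for this table name.
    spark.catalog.refreshTable(table_name)
    # A read issued after the refresh is planned against the refreshed entry.
    return spark.read.table(table_name)
```

In a running session this would be called as `df = read_latest(spark, "hudi_table")`. The equivalent SQL statement is `REFRESH TABLE hudi_table`.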

Re: [I] [SUPPORT] Datasource incremental subsequent read same as first read [hudi]

2024-01-31 Thread via GitHub
parisni commented on issue #7846: URL: https://github.com/apache/hudi/issues/7846#issuecomment-1919296148 We recently faced a more general problem with the Spark datasource, where subsequent `read.table("hudi_table")` calls are cached and won't reflect Hudi commits unless you restart the context (or

Re: [I] [SUPPORT] Datasource incremental subsequent read same as first read [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #7846: URL: https://github.com/apache/hudi/issues/7846#issuecomment-1918831478 adding @beyond1920 @yihua @nsivabalan for more insights here.

Re: [I] [SUPPORT] Datasource incremental subsequent read same as first read [hudi]

2024-01-26 Thread via GitHub
parisni commented on issue #7846: URL: https://github.com/apache/hudi/issues/7846#issuecomment-1912794241 @ad1happy2go thanks for the suggestion. Yes, it does work. However, I would say the caching should not happen: Hudi should deliver the latest version of the table as soon as it is available.