parisni commented on issue #7846:
URL: https://github.com/apache/hudi/issues/7846#issuecomment-1920644875
Thanks both for your insight. I am wondering whether this behavior also applies to
Iceberg and Delta. If not, Hudi might align with them and disable this cache by default.
--
This is an automated messa
beyond1920 commented on issue #7846:
URL: https://github.com/apache/hudi/issues/7846#issuecomment-1920428504
@parisni I agree with @ad1happy2go: the caching happens in Spark rather than
in Hudi. Spark caches by `dbName`.`tableName`.
https://github.com/apache/hudi/assets/1525333/2557fd
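For readers hitting the same stale reads, here is a minimal PySpark sketch of one possible workaround. It assumes the staleness comes from Spark's per-table cache described above. `read_fresh` is a hypothetical helper name, not part of Spark or Hudi, and whether a refresh is sufficient for a given Hudi setup is an assumption to verify.

```python
def read_fresh(spark, table_name):
    """Refresh Spark's cached entry for `table_name`, then read the table.

    `spark` is expected to be a pyspark.sql.SparkSession.
    `read_fresh` is a hypothetical helper, not a Spark or Hudi API.
    """
    # Invalidate the cached data and metadata Spark holds for this table.
    spark.catalog.refreshTable(table_name)
    # Read the table again so the new DataFrame is planned after the refresh.
    return spark.read.table(table_name)
```

Usage would look like `df = read_fresh(spark, "dbName.tableName")`. The Spark SQL statement `REFRESH TABLE dbName.tableName` is the equivalent from a SQL session.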
ad1happy2go commented on issue #7846:
URL: https://github.com/apache/hudi/issues/7846#issuecomment-1919582070
@parisni This is a similar issue, related to Spark SQL caching the results. This
is done to optimise subsequent reads from the table in the running terminal.
parisni commented on issue #7846:
URL: https://github.com/apache/hudi/issues/7846#issuecomment-1919296148
We recently faced a more general problem with the Spark datasource, where
subsequent `read.table("hudi_table")` calls are cached and won't reflect Hudi commits
unless you restart the context (or
ad1happy2go commented on issue #7846:
URL: https://github.com/apache/hudi/issues/7846#issuecomment-1918831478
Adding @beyond1920 @yihua @nsivabalan for more insights here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and us
parisni commented on issue #7846:
URL: https://github.com/apache/hudi/issues/7846#issuecomment-1912794241
@ad1happy2go thanks for the suggestion. Yes, it does work.
However, I would say the caching should not happen. Hudi should deliver the
latest version of the table as soon as it is available.