Re: [DISCUSS] Storing Table Metadata in the Metastore

2025-05-27 Thread Michael Collado
I've been experimenting with storing table metadata in the persistence layer, and I think I'm a fan, though I haven't yet collected real numbers. Personally, though, I think storing it as separate entities is the right approach rather than putting it in table properties. I appreciate that we don't

Re: [DISCUSS] Storing Table Metadata in the Metastore

2025-05-26 Thread Eric Maynard
> Loading internal properties requires parsing the whole text, which will include Iceberg metadata, even when the metadata is not used by the code accessing internal properties.
This concern should definitely be considered in the design -- but it should also be borne out in benchmarks if there is

Re: [DISCUSS] Storing Table Metadata in the Metastore

2025-05-26 Thread Dmitri Bourlatchkov
> An older version of the implementation took this approach, but some months ago Dennis left comments suggesting that we use an internal property and the PR was updated
Apologies for missing out on earlier reviews on this change. Why do we need the best approach to the first implementation

Re: [DISCUSS] Storing Table Metadata in the Metastore

2025-05-23 Thread Jean-Baptiste Onofré
Hi, thanks Eric. We briefly discussed the metadata JSON together in London, so we are aligned. The caching makes sense, but I wonder about the persistence layer. Maybe we should clearly state who is responsible for what. Overall it looks like a great idea. Regards, JB
On Fri, May 23, 2025 at 23:52, Yufei G

Re: [DISCUSS] Storing Table Metadata in the Metastore

2025-05-23 Thread Eric Maynard
An older version of the implementation took this approach, but some months ago Dennis left comments suggesting that we use an internal property and the PR was updated. Replies to your specific concerns inline:
> Regarding the specific proposal of using an internal property of entities for holding t

Re: [DISCUSS] Storing Table Metadata in the Metastore

2025-05-23 Thread Dmitri Bourlatchkov
Thanks for starting this discussion, Eric! Overall, I think the idea of storing (some) table metadata in the Polaris database is very relevant and a sound approach to improving query performance on the engine side. Regarding the specific proposal of using an internal property of entities for holdi

Re: [DISCUSS] Storing Table Metadata in the Metastore

2025-05-23 Thread Yufei Gu
Thanks for doing this, Eric! It will boost performance a lot for tables with reasonably sized metadata.json files. We also automatically get an in-memory cache since the Polaris entity is cached by default. Agreed to defer any separate caching mechanism so that we don't have to care about consisten

[DISCUSS] Storing Table Metadata in the Metastore

2025-05-23 Thread Eric Maynard
Hi all, Some time ago I opened this PR which proposes to store/cache TableMetadata in the Polaris metastore, avoiding a trip to object storage in many cases. Based on this recent comment