Hi everyone,

Our team has started exploring the integration of Hudi into Qbeast (qbeast.io). Over the last few months, I've been diving into Hudi's internals to identify the best place for the integration and the most suitable components to use.
In Qbeast, we’re introducing a new way of indexing data, which requires additional metadata for each Parquet file. Currently, we’re storing this metadata in the extraMetadata field of the commit files, as Hudi allows user-defined metadata there. However, I’m wondering if this is the best approach or if it would be better to store this information in the metadata table.

From my understanding:
- The extraMetadata field is flexible and user-defined, which makes it easy to use for custom metadata.
- The metadata table seems more "closed" and focused on specific system-level functionalities.

Would it make more sense to continue using extraMetadata, or is there a recommended way to extend the metadata table to include custom fields like these?

Any guidance or best practices would be greatly appreciated!

Thanks in advance!