While it may be common knowledge among Iceberg devs that writing the root JSON file on commit is somewhat optional with the right catalog, what can we do in V4 to solve this problem for everyone? My concern is the suboptimal behavior that new users get by default with HMS or Hadoop catalogs and how this impacts their perception of Iceberg. We are doing a bunch of work for streaming (e.g. changelog scans, single-file commits, etc.), but the need to write the root JSON file may cancel all of that out.
Let me throw some ideas out there.

- Describe in the spec how catalogs can make the generation of the root metadata file optional. Ideally, implement that in a built-in catalog of choice as a reference implementation.
- Offload portions of the root metadata file to external files and keep references to them.

Thoughts?

- Anton
