Hi folks,
I'm new to the Iceberg community, currently contributing to Polaris OSS on the tagging design. Before going deeper into a design doc, I want to surface the direction on this list and invite early input from people with more context on how concepts at the Iceberg REST Catalog (IRC) level get shaped here.
Polaris users are asking for a classification primitive that covers compliance (PII, sensitivity, data domain), ownership and cost attribution, and AI or semantic hints on columns. My read is that we will build this regardless, but designing it inside Polaris alone reduces its value. Governance tools would need per-catalog adapters. If the shape is standardized at the IRC level, the ecosystem benefits far more broadly.
Across catalogs and governance platforms, the tag concept has independently converged on a similar shape: a first-class Tag entity with identity (name + namespace), optional schema (allowed values, inheritability), and attachments to objects carrying a value. Snowflake tags, Unity Catalog governed tags, Google Cloud Dataplex tag templates, Apache Atlas classifications, Apache Gravitino tags, and DataHub tags all expose this pattern, across ownership, FinOps, AI reasoning, and governance use cases. When independent products converge, my read is that the shape is the natural decomposition rather than a vendor-specific artifact.
Two adjacent efforts are already in flight. The read-restrictions proposal (apache/iceberg#13879 <https://github.com/apache/iceberg/issues/13879>) delivers enforcement to engines. A Tag proposal would complement it as the classification input side, so catalogs can resolve tag-driven enforcement internally and deliver the outcome via read-restrictions. The labels proposal (apache/iceberg#15521 <https://github.com/apache/iceberg/issues/15521>) serves generic catalog-managed metadata. My read is that a first-class Tag with identity and lifecycle is distinct from labels; they solve different problems and can coexist.
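To make the converged shape described above a bit more concrete (identity, optional schema, attachments carrying a value), here is a rough Python sketch. Every name in it (Tag, TagAttachment, attach, the field names, the string form of the target) is hypothetical and chosen only for illustration; none of it comes from the IRC spec, Polaris, or any of the products listed.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Tag:
    """Hypothetical first-class tag: identity plus an optional schema."""
    namespace: Tuple[str, ...]                        # identity, part 1
    name: str                                         # identity, part 2
    allowed_values: Optional[FrozenSet[str]] = None   # None means any value
    inheritable: bool = False                         # assumed schema flag


@dataclass(frozen=True)
class TagAttachment:
    """Hypothetical attachment of a tag to an object, carrying a value."""
    tag: Tag
    target: str                  # e.g. "db.orders"; the format is an assumption
    value: Optional[str] = None


def attach(tag: Tag, target: str, value: Optional[str] = None) -> TagAttachment:
    """Build an attachment, rejecting values outside the tag's schema.

    In this sketch, a tag that declares allowed_values also requires a value.
    """
    if tag.allowed_values is not None and value not in tag.allowed_values:
        raise ValueError(
            f"value {value!r} is not allowed for tag {tag.name!r}"
        )
    return TagAttachment(tag=tag, target=target, value=value)


# Usage: a constrained tag and a free-form tag.
pii = Tag(("governance",), "pii", allowed_values=frozenset({"high", "low"}))
cost_center = Tag(("finops",), "cost_center")

orders_pii = attach(pii, "db.orders", "high")
orders_cost = attach(cost_center, "db.orders", "cc-42")
```

The only point of the sketch is that the tag definition and its attachments are separate objects, so the schema lives in one place while values are set per attachment.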
At a high level, I think the minimum valuable scope in the IRC spec is:
- a Tag entity with CRUD at the namespace level
- tag attachments with target and value, applied to tables, columns (via field-id), views, and namespaces
- a reverse lookup endpoint for "find objects with tag X"
- tag attachment retrieval via a dedicated endpoint
- a small set of normative clauses on privilege enforcement, visibility filtering, and rename atomicity
Resolved tags do not need to live in LoadTableResult.
Things I'd like to keep out of the core spec as layered extensions, not first pass:
- typed multi-field per-attachment values (Atlas, Dataplex; addable non-breaking later)
- a Governed-vs-Standard type distinction (Unity Catalog's pattern can be expressed through configuration)
- tag-to-policy binding (belongs in a separate Policy authoring phase)
What I'm asking: early feedback on whether this direction fits the IRC roadmap, pointers to prior discussions I may have missed, and interest in co-championing from contributors outside Polaris.
I'll follow up with a full design doc in the coming week. An issue placeholder is at apache/iceberg#16165 <https://github.com/apache/iceberg/issues/16165> for tracking.
-ej
