> It introduces a prefix hierarchy - /db1/my/table1/ in your example - that doesn't exist conceptually.
I'm less concerned about this, as users can introduce a similar hierarchy with custom locations. The location overlap check will occur in both cases: a custom location and an extra hierarchy derived from the name.

"worth considering" was a typo introduced by Gmail autocorrection; it flipped the logic without my noticing :-(. My original words were "still concerning".

Let me reiterate my thoughts for clarity: the slash (/) is supported by certain engines (e.g., Spark) for a cosmetic benefit (users can include the slash in the name), but it adds a burden for the whole ecosystem. We may reject it by default.

Yufei

On Mon, Apr 27, 2026 at 9:36 AM Alexandre Dutra <[email protected]> wrote:

> Hi again,
>
> I have been reviewing the documentation for various storage providers [1] [2] [3] to identify which characters they restrict or advise against using.
>
> The slash remains the most prominent issue because, obviously, it's accepted by all storage providers, but it has a special meaning for Polaris-created locations.
>
> That said, other characters may cause trouble as well. I wonder if we shouldn't add them to the list of forbidden chars:
>
> - Control characters
> - Backslash `\`
> - Path segments equal to `.` or `..`
> - Commonly discouraged symbols: * ? " < > | #
>
> Given that most storage providers already reject or discourage these, formalizing their exclusion seems like a safe step. Prohibiting these characters explicitly prevents issues with invalid locations that could hinder client access, while simultaneously addressing potential security vulnerabilities.
>
> What do you all think?
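As a rough illustration of the guardrails discussed in this thread (no slash, no empty names, no leading or trailing whitespace, plus the extra characters listed above), here is a minimal Java sketch. The class and method names (`NameGuardrailsSketch`, `violation`) are hypothetical, and the rule set simply combines the suggestions made in the thread; it is not taken from the PR and makes no claim about what Polaris actually implements.

```java
import java.util.Optional;

// Hypothetical sketch, not Polaris code: checks one namespace element or
// table/view name against the rules suggested in this thread.
final class NameGuardrailsSketch {

  // Slash, backslash, and the "commonly discouraged" symbols from the list.
  private static final String FORBIDDEN = "/\\*?\"<>|#";

  // Returns a description of the first violation, or empty if the name passes.
  static Optional<String> violation(String name) {
    if (name == null || name.isEmpty()) {
      return Optional.of("name is empty");
    }
    if (!name.equals(name.strip())) {
      return Optional.of("leading or trailing whitespace");
    }
    if (name.equals(".") || name.equals("..")) {
      return Optional.of("name is '.' or '..'");
    }
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      if (Character.isISOControl(c)) {
        return Optional.of("control character at index " + i);
      }
      if (FORBIDDEN.indexOf(c) >= 0) {
        return Optional.of("forbidden character '" + c + "'");
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    for (String name : new String[] {"table1", "my/table1", " padded", ".."}) {
      System.out.println("[" + name + "] -> " + violation(name).orElse("ok"));
    }
  }
}
```

A server-side check along these lines would presumably run at create time and translate a non-empty result into a 400 response; whether any rule is overridable per realm is the open feature-flag question in this thread.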
>
> Thanks,
> Alex
>
> [1]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
> [2]: https://docs.cloud.google.com/storage/docs/objects#naming
> [3]: https://learn.microsoft.com/en-us/rest/api/storageservices/naming-and-referencing-containers--blobs--and-metadata
>
> On Sun, Apr 26, 2026 at 5:15 PM Alexandre Dutra <[email protected]> wrote:
> >
> > Hi Yufei,
> >
> > > the name is persisted verbatim in Polaris's catalog entity and baked as a directory boundary in the S3 location (s3://bucket123/db1/my/table1/…)
> >
> > While your research suggests this is a positive outcome, in fact this is *exactly* why I am concerned about using slashes. It introduces a prefix hierarchy - /db1/my/table1/ in your example - that doesn't exist conceptually.
> >
> > I'm also finding the conclusion of your research a bit unclear. Although it mentions the slash is "worth considering," it then provides three arguments against it before ultimately suggesting it's "not worth fighting." And among the 3 action items your research recommends, the first two are already implemented in the PR.
> >
> > About the feature flag idea: in my opinion, a feature flag is only viable if we also strengthen the URL construction logic; otherwise, I believe slashes should be prohibited unconditionally.
> >
> > Thanks,
> > Alex
> >
> > On Fri, Apr 24, 2026 at 8:43 PM Yufei Gu <[email protected]> wrote:
> > >
> > > Thanks for the PR, Alex! I researched whether we should block the slash in table names.
> > >
> > > Here is what I tested. I created *db1.my/table1* in a Polaris quickstart catalog (RustFS-backed, in-memory metastore) and exercised it against three client surfaces. All three surfaces work well:
> > >
> > > 1. Iceberg REST API via curl. Create, list, and load all worked. The slash must be percent-encoded as %2F in the path (e.g. .../tables/my%2Ftable1); the name is persisted verbatim in Polaris's catalog entity and baked as a directory boundary in the S3 location (s3://bucket123/db1/my/table1/…).
> > > 2. PyIceberg (RestCatalog). list_namespaces, list_tables, load_table, and scan().to_arrow() round-tripped the slash correctly end-to-end, including fetching metadata JSON from storage with vended credentials.
> > > 3. Spark SQL. The name is addressable via single-part backticks: polaris.db1.`my/table1`. Other engines need their own quoting (Trino: double quotes, etc.).
> > >
> > > Why the slash is still worth considering:
> > >
> > > - URI-level fragility. %2F encodes a reserved character; intermediaries routinely reject it (Apache default `AllowEncodedSlashes Off` results in a 404, ALB results in a 400) or silently normalize it to / (some nginx configs, API Gateway REST, CloudFront), which would dispatch the request to a different namespace/table entirely. These failures surface only once a proxy/WAF/CDN is in the call path.
> > > - Storage-layout collision. Polaris builds default locations as <warehouse>/<namespace>/<name>. A table named my/table1 shares a prefix with a hypothetical future namespace db1.my, which could let vended credentials for one leak into the blast radius of another.
> > > - Engine quoting drift and bad UX. Every downstream engine has its own identifier-quoting rules. Slashes survive in Spark with backticks and in Trino with double quotes, but tools, dashboards, and DDL generators frequently drop or mangle them. Users have to think about which quote to use.
> > >
> > > *My recommendation: Not worth fighting.* The features work today in isolated testing, but keeping them working requires every future hop (proxy, WAF, CDN, ingress, engine, SDK) to handle URLs exactly right, forever. The upside is purely cosmetic (the slash in the name). I suggest putting the restriction behind a feature flag, defaulted to reject. Here are action items:
> > >
> > > - Validate table and namespace names server-side at create time, which the PR does already.
> > > - Reject with a clear 400 and an error message pointing to the flag.
> > > - Flag can be flipped on per realm for teams that genuinely need exotic names, with a documented warning about proxy-chain testing.
> > >
> > > This gets us the robustness benefits immediately, keeps the door open for backward compatibility and niche use cases, and avoids a long tail of "it works on my laptop, fails in prod" tickets. WDYT?
> > >
> > > Yufei
> > >
> > > On Thu, Apr 23, 2026 at 6:07 AM Alexandre Dutra <[email protected]> wrote:
> > > >
> > > > Hi Yufei,
> > > >
> > > > Yes, I think we can view storage location sanitizing as a parallel effort.
> > > >
> > > > With that, here is a simple PR that aims at forbidding slashes and a few other pathological cases for Iceberg and Generic Tables entities at creation time:
> > > >
> > > > https://github.com/apache/polaris/pull/4282
> > > >
> > > > Thanks,
> > > > Alex
> > > >
> > > > On Thu, Apr 23, 2026 at 1:14 AM Yufei Gu <[email protected]> wrote:
> > > > >
> > > > > Hi Alex, it's a good point that the storage location build is also affected, but it feels less controversial and somewhat separate from the main question here.
> > > > >
> > > > > The immediate discussion, at least from my perspective, is about entity naming guardrails and externally visible behavior, for example preventing names that are ambiguous or likely to break REST access and cross client behavior.
> > > > >
> > > > > Storage location construction is important too, but that feels more like an internal implementation hardening task than a spec or user-facing semantics question. I would view it as a parallel track rather than something that should block agreement on the narrower entity name issue. I'm also fine if someone wants to tackle the location building issue first. That could provide useful context for resolving the user-facing naming questions.
> > > > >
> > > > > Yufei
> > > > >
> > > > > On Wed, Apr 22, 2026 at 8:28 AM Alexandre Dutra <[email protected]> wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Disallowing the most problematic cases seems the right way to go. I can provide a PR to quickly implement that.
> > > > > >
> > > > > > However, we must keep in mind that disallowing a few chars will not solve all our problems. IMHO we need to consistently replace all string concatenations that we use today for creating storage locations with a proper location builder that will take care of proper path escaping and sanitization. That part of the job is way more complex, due to the blast radius.
> > > > > >
> > > > > > Thanks,
> > > > > > Alex
> > > > > >
> > > > > > On Wed, Apr 22, 2026 at 2:07 AM Yufei Gu <[email protected]> wrote:
> > > > > > >
> > > > > > > Sorry for jumping into this thread a bit late.
> > > > > > >
> > > > > > > I'm supportive of introducing some guardrails for namespace and table or view names. Specifically, I think we should disallow a few problematic cases to avoid ambiguity and downstream issues:
> > > > > > >
> > > > > > > - Disallow the slash character "/"
> > > > > > > - Disallow empty strings
> > > > > > > - Disallow leading or trailing whitespace
> > > > > > >
> > > > > > > These constraints seem reasonable given the interactions across REST, storage paths, and different client behaviors. Adding clear guardrails early can prevent subtle bugs and inconsistencies later on. Curious to hear if others see any concerns or edge cases with this approach.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Yufei
> > > > > > >
> > > > > > > On Thu, Apr 16, 2026 at 9:11 AM Alexandre Dutra <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Do you think it's worth having a separate discussion about guardrails for namespace elements and table/view names? [...]
> > > > > > > >
> > > > > > > > Completely agree here. I think the slash character in particular should definitely be banned.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Alex
> > > > > > > >
> > > > > > > > On Thu, Apr 16, 2026 at 6:03 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Do you think it's worth having a separate discussion about guardrails for namespace elements and table/view names? [...]
> > > > > > > > >
> > > > > > > > > Definitely!
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Dmitri.
> > > > > > > > >
> > > > > > > > > On Thu, Apr 16, 2026 at 6:57 AM Robert Stupp <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > > spark-sql ()> create namespace `n/s`;
> > > > > > > > > > > However, the S3 location in this case gets a proper directory breakdown:
> > > > > > > > > > > ... and table metadata has: "location":"s3://pol/n/s/t1"
> > > > > > > > > > > ... but that is probably a different issue.
> > > > > > > > > >
> > > > > > > > > > Yea, it's different from the URL en/decoding topic. Do you think it's worth having a separate discussion about guardrails for namespace elements and table/view names? For example, disallowing '/', disallowing empty/blank namespace elements and table/view names, disallowing leading/trailing whitespace? Sure, some of these checks already happen, but not at every level/layer (defense-in-depth).
> > > > > > > > > >
> > > > > > > > > > > when Iceberg itself will introduce configurable separators, we MAY ask ourselves if Polaris should allow them to be configurable or not. [...] separator is just a REST layer thing
> > > > > > > > > >
> > > > > > > > > > True, the separator is primarily a REST-layer namespace en/decoding thing. What worries me slightly is that (existing) namespace elements with the configured separator character could become inaccessible. However, "configurable separator" is IMO a different discussion.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Robert
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 15, 2026 at 8:20 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi All,
> > > > > > > > > > >
> > > > > > > > > > > My understanding of the need to make namespace separators configurable is that there exists a rather narrow set of deployment cases where the ASCII "0x1F" (unit separator) character is not permitted in URL paths by some infrastructure components.
> > > > > > > > > > >
> > > > > > > > > > > It might be worth allowing users to define a different separator, but since no one has brought this up yet, I assume it is not a priority.
> > > > > > > > > > >
> > > > > > > > > > > In any case, using a different separator is completely a REST API concern and should not affect how Polaris stores data internally.
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Dmitri.
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Apr 15, 2026 at 2:03 PM Alexandre Dutra <[email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > >
> > > > > > > > > > > > > I wonder how namespace elements and table/view names with a slash ('/') character in the middle behave. Or other characters like '&' or '?' or '#'.
> > > > > > > > > > > >
> > > > > > > > > > > > For the REST layer, these will be percent-encoded, and with my PR to fix a double-decoding issue, these characters "survive" the REST layer just fine.
> > > > > > > > > > > >
> > > > > > > > > > > > The issue now is in some layers beneath: as I pointed out and as Dmitri demonstrated, we are unfortunately concatenating identifiers together to create storage locations, without proper escaping. This currently results in corrupted storage locations.
> > > > > > > > > > > >
> > > > > > > > > > > > I'm trying to fix the REST layer first, then I'll move to the storage layer.
> > > > > > > > > > > >
> > > > > > > > > > > > > What's your take on leveraging jakarta.ws.rs.ext.ParamConverterProvider / jakarta.ws.rs.ext.ParamConverter for the path parameters and have centralized helpers that deal with "proper" URL encoding/decoding?
> > > > > > > > > > > >
> > > > > > > > > > > > For now I don't see a valid usage in Polaris for that, since Jersey handles decoding path parameters already.
> > > > > > > > > > > >
> > > > > > > > > > > > > I also agree that the "configurable namespace separator" must never change. Is my assumption correct, that it must always be the same character as it is today?
> > > > > > > > > > > >
> > > > > > > > > > > > In Polaris, we are using the namespace separator in two different use cases:
> > > > > > > > > > > >
> > > > > > > > > > > > 1) For path parameters in the REST layer
> > > > > > > > > > > > 2) For storing namespaces in Polaris entities
> > > > > > > > > > > >
> > > > > > > > > > > > What is clear is that in the second use case, the namespace separator must NEVER change. I just opened a PR for that: https://github.com/apache/polaris/pull/4214
> > > > > > > > > > > >
> > > > > > > > > > > > Regarding the first use case, once we solve all our encoding/decoding issues, and when Iceberg itself will introduce configurable separators, we MAY ask ourselves if Polaris should allow them to be configurable or not. I don't have strong opinions, but if the separator is just a REST layer thing, it should be possible to change it without breaking the storage layer or the metastore.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Alex
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Apr 15, 2026 at 7:47 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi All,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Slashes in namespace seem to work fine (Spark 3.5 + Iceberg 1.10.0):
> > > > > > > > > > > > >
> > > > > > > > > > > > > spark-sql ()> create namespace `n/s`;
> > > > > > > > > > > > > Time taken: 0.335 seconds
> > > > > > > > > > > > > spark-sql ()> show namespaces;
> > > > > > > > > > > > > `n/s`
> > > > > > > > > > > > > Time taken: 0.232 seconds, Fetched 1 row(s)
> > > > > > > > > > > > > spark-sql ()> use `n/s`;
> > > > > > > > > > > > > Time taken: 0.028 seconds
> > > > > > > > > > > > > spark-sql (`n/s`)> create table t1 (n string);
> > > > > > > > > > > > > Time taken: 0.702 seconds
> > > > > > > > > > > > >
> > > > > > > > > > > > > The URLs appear to be encoded properly, e.g. (from Polaris log):
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2026-04-15 13:41:17,594 INFO [io.qua.htt.access-log] [dee1505c-ec1d-4f90-a9de-154eac66a40c_0000000000000000013,POLARIS] [,,,] (executor-thread-1) 127.0.0.1 - root [15/Apr/2026:13:41:17 -0400] "GET /api/catalog/v1/polaris/namespaces/n%2Fs/tables?pageToken= HTTP/1.1" 200 74
> > > > > > > > > > > > >
> > > > > > > > > > > > > I did not test trickier chars, but adding CI coverage for them would be good.
> > > > > > > > > > > > >
> > > > > > > > > > > > > However, the S3 location in this case gets a proper directory breakdown:
> > > > > > > > > > > > >
> > > > > > > > > > > > > $ mc ls rustfs/pol/n/s
> > > > > > > > > > > > > [2026-04-15 13:44:37 EDT] 0B t1/
> > > > > > > > > > > > >
> > > > > > > > > > > > > ... and table metadata has: "location":"s3://pol/n/s/t1"
> > > > > > > > > > > > >
> > > > > > > > > > > > > ... but that is probably a different issue.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > Dmitri.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Apr 15, 2026 at 10:35 AM Robert Stupp <[email protected]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks Alex for the thorough investigation!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > URL en/decoding is really not that easy. I wonder how namespace elements and table/view names with a slash ('/') character in the middle behave. Or other characters like '&' or '?' or '#'.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Overall, I agree with your idea to implement correct URL encoding/decoding in the Polaris code base to protect Polaris from upstream behavior changes that can seriously break or even corrupt things.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What's your take on leveraging jakarta.ws.rs.ext.ParamConverterProvider / jakarta.ws.rs.ext.ParamConverter for the path parameters and have centralized helpers that deal with "proper" URL encoding/decoding?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I also agree that the "configurable namespace separator" must never change. Is my assumption correct, that it must always be the same character as it is today?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Robert
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Apr 15, 2026 at 3:48 PM Alexandre Dutra <[email protected]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > FYI I created a first PR to address the double-decoding issue:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > https://github.com/apache/polaris/pull/4210
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > Alex
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, Apr 14, 2026 at 9:56 PM Alexandre Dutra <[email protected]> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I would also point out that Polaris uses RESTUtil.encodeNamespace and RESTUtil.decodeNamespace for encoding and decoding the parent namespace within a NamespaceEntity [1].
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > These methods also exhibit the faulty space encoding behavior. Therefore, we must exercise **extreme caution** regarding any upcoming Iceberg project fixes for space-encoding issues. If these methods are modified, it is imperative that we retain the legacy versions specifically for encoding and decoding NamespaceEntity properties; otherwise we could end up with a corrupted database.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The same goes for the future namespace separator coming with Iceberg 1.11: for the sake of encoding and decoding NamespaceEntity properties, the separator must never change.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I would actually be in favor of proactively internalizing the encoding/decoding algorithm used in NamespaceEntity. What do you think?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > Alex
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [1]: https://github.com/apache/polaris/blob/8ad8f74f62258ab6238190271603e4d4c8a75998/polaris-core/src/main/java/org/apache/polaris/core/entity/NamespaceEntity.java#L92
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Tue, Apr 14, 2026 at 7:43 PM Alexandre Dutra <[email protected]> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > A discussion on the Iceberg ML [1] recently highlighted that URL path segments are not being decoded correctly according to RFC 3986, specifically regarding space encoding.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I investigated the situation in Polaris, and found many problems:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > TLDR
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - Table names with the + sign can be created but cannot be retrieved
> > > > > > > > > > > > > > > > > - Namespace names with the + sign are OK (can be created and retrieved)
> > > > > > > > > > > > > > > > > - Table names with spaces cannot be created
> > > > > > > > > > > > > > > > > - Namespace names with spaces cannot be created
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > DISCUSSION
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Table names such as "foo+bar" can be created (via POST, where the name is in the request body). But they cannot be retrieved: when reading tables, the name is part of the URL path. Polaris incorrectly performs a second decoding step using RESTUtil.decodeString(table), even though the REST framework has already decoded it. Consequently, a client sends "foo%2Bbar" which is first decoded to "foo+bar" by the framework (correct) and then re-decoded by Polaris to "foo bar" (incorrect), resulting in a "not found" error.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Table and namespace names like "foo bar" simply cannot be created at all. This is because in IcebergCatalog.defaultWarehouseLocation() and other similar places, we create locations merely by joining identifiers together, without any form of URL encoding: see [2] [3].
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > And even if tables like "foo bar" could be created, they couldn't be retrieved by Java clients. This occurs because current Java clients incorrectly encode that name as "foo+bar", which the REST framework does not modify. Consequently, Polaris would look for a table named "foo+bar" instead and throw a "not found" error. (Other clients would send "foo%20bar" which would be correctly decoded by the framework as "foo bar", and thus it would succeed.)
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > PROPOSAL
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > To resolve the issue with the + sign in table names, we simply need to eliminate the redundant decoding step. I can open a PR for that shortly.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > To resolve the issue with spaces in table and namespace names, we could fix all the methods that incorrectly join together identifiers without proper URL encoding.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Finally, addressing the Java clients encoding problem is complex, but we could consider implementing a workaround as follows:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 1) If the client is Java and lacks the upcoming Iceberg fix for space encoding, manually replace "+" with a space to correct the client's faulty encoding.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2) For non-Java clients or those with the fix, no workaround would be required.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > What are your thoughts on this?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > Alex
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > [1]: https://lists.apache.org/thread/c498svln0x18vvm42998b9nm9j6ck5yh
> > > > > > > > > > > > > > > > > [2]: https://github.com/apache/polaris/blob/e94fdff63852dc41635c9e7eb62b3627ba562b85/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java#L379
> > > > > > > > > > > > > > > > > [3]: https://github.com/apache/polaris/blob/e94fdff63852dc41635c9e7eb62b3627ba562b85/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java#L571
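The double-decoding problem described in the original message of this thread can be reproduced with the JDK alone. The sketch below is only an illustration under stated assumptions: `frameworkDecode` stands in for the path decoding a REST framework performs (percent-escapes decoded, `+` left alone), and `formDecode` stands in for a second, form-style decoding step using `java.net.URLDecoder`. Both helper names are hypothetical and neither is Polaris or Iceberg code.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the "foo%2Bbar" -> "foo+bar" -> "foo bar" chain.
final class DoubleDecodingSketch {

  // Stand-in for RFC 3986 path-segment decoding: percent-escapes are
  // decoded, a literal '+' is left untouched.
  static String frameworkDecode(String rawSegment) {
    return URI.create("http://localhost/" + rawSegment).getPath().substring(1);
  }

  // Stand-in for a form-style decoder: '+' is turned into a space.
  static String formDecode(String value) {
    return URLDecoder.decode(value, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String once = frameworkDecode("foo%2Bbar");
    String twice = formDecode(once);
    System.out.println(once);  // foo+bar
    System.out.println(twice); // foo bar

    // Form-style encoding of a space, as described for current Java clients.
    System.out.println(URLEncoder.encode("foo bar", StandardCharsets.UTF_8)); // foo+bar
  }
}
```

The second decode is what turns a correctly received "foo+bar" into "foo bar", and the last line shows why a form-encoded "foo bar" arrives as "foo+bar" once the path is decoded only once.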
