Hi Yufei,

>  the name is persisted verbatim in Polaris's catalog entity and baked as a 
> directory boundary in the S3 location (s3://bucket123/db1/my/table1/…)

While your research suggests this is a positive outcome, in fact this
is *exactly* why I am concerned about using slashes. It introduces a
prefix hierarchy - /db1/my/table1/ in your example - that doesn't
exist conceptually.
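
To make the collision concrete, here is a minimal Python sketch. It
assumes default locations are built by plainly joining the warehouse,
the namespace levels and the table name, as described in your message;
default_location is a hypothetical helper for illustration, not
Polaris code.

```python
def default_location(warehouse: str, namespace: list[str], table: str) -> str:
    """Hypothetical stand-in for naive location building: join parts with '/'."""
    return "/".join([warehouse.rstrip("/"), *namespace, table])


# Table "my/table1" in namespace db1 ...
slash_table = default_location("s3://bucket123", ["db1"], "my/table1")
# ... and table "table1" in the nested namespace db1.my
nested_table = default_location("s3://bucket123", ["db1", "my"], "table1")

print(slash_table)
print(nested_table)
print(slash_table == nested_table)
```

Under that assumption, two conceptually unrelated tables end up at the
same storage location.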

I'm also finding the conclusion of your research a bit unclear.
Although it mentions the slash is "worth considering," it then
provides three arguments against it before ultimately suggesting it's
"not worth fighting." And among the 3 action items your research
recommends, the first two are already implemented in the PR.

About the feature flag idea: in my opinion, a feature flag is only
viable if we also strengthen the URL construction logic; otherwise, I
believe slashes should be prohibited unconditionally.
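
As a rough idea of what "strengthening" could look like, here is a
small Python sketch that escapes every identifier as a single path
segment before joining. build_location is a hypothetical name, and
percent-encoding is only one possible escaping scheme, assumed here
for illustration; it is not what Polaris does today.

```python
from urllib.parse import quote


def build_location(warehouse: str, namespace: list[str], table: str) -> str:
    """Hypothetical location builder that escapes each identifier separately."""
    # safe="" makes quote() escape "/" as well, so an identifier can never
    # introduce an extra path level.
    segments = [quote(part, safe="") for part in [*namespace, table]]
    return "/".join([warehouse.rstrip("/"), *segments])


slash_table = build_location("s3://bucket123", ["db1"], "my/table1")
nested_table = build_location("s3://bucket123", ["db1", "my"], "table1")

print(slash_table)
print(nested_table)
```

With per-segment escaping the two tables no longer share a location. A
real builder would also have to deal with other pathological segments
such as "." and "..", which this sketch leaves untouched.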

Thanks,
Alex


On Fri, Apr 24, 2026 at 8:43 PM Yufei Gu <[email protected]> wrote:
>
> Thanks for the PR, Alex! I researched whether we should block slashes in
> table names.
>
> Here is what I tested. Created *db1.my/table1* in a
> Polaris quickstart catalog (RustFS-backed, in-memory metastore) and
> exercised it against three client surfaces. All three surfaces work well:
>   1. Iceberg REST API via curl. Create, list, and load all worked. The
> slash must be percent-encoded as %2F in the path (e.g.
> .../tables/my%2Ftable1); the name is persisted verbatim in
> Polaris's catalog entity and baked as a directory boundary in the S3
> location (s3://bucket123/db1/my/table1/…).
>   2. PyIceberg (RestCatalog). list_namespaces, list_tables, load_table, and
> scan().to_arrow() round-tripped the slash correctly end-to-end, including
> fetching metadata JSON from storage with vended credentials.
>   3. Spark SQL. The name is addressable via single-part backticks:
> polaris.db1.`my/table1`. Other engines need their own quoting (Trino:
> double quotes, etc.).
>
> Why the slash is still worth considering:
>
>    - URI-level fragility. %2F encodes the reserved character /; intermediaries
>    routinely reject it (Apache default `AllowEncodedSlashes Off` results in a
>    404, ALB results in a 400) or silently normalize it to / (some nginx
>    configs, API Gateway REST, CloudFront), which would dispatch the request to
>    a different namespace/table entirely. These failures surface only once a
>    proxy/WAF/CDN is in the call path.
>    - Storage-layout collision. Polaris builds default locations as
>    <warehouse>/<namespace>/<name>. A table named my/table1 shares a prefix
>    with a hypothetical future namespace db1.my, which could let vended
>    credentials for one leak into the blast radius of another.
>    - Engine quoting drift and bad UX. Every downstream engine has its own
>    identifier-quoting rules. Slashes survive in Spark with backticks and in
>    Trino with double quotes, but tools, dashboards, and DDL generators
>    frequently drop or mangle them. Users have to think about which quote to
>    use.
>
> *My recommendation: Not worth fighting.* The features work today in
> isolated testing, but keeping them working requires every future hop
> (proxy, WAF, CDN, ingress, engine, and SDK) to handle URLs exactly
> right, forever. The upside is purely cosmetic (the slash in the name). I
> suggest putting the restriction behind a feature flag, defaulted to reject.
> Here are action items:
>
>    - Validate table and namespace names server-side at create time, which
>    the PR already does.
>    - Reject with a clear 400 and an error message pointing to the flag.
>    - Flag can be flipped on per realm for teams that genuinely need exotic
>    names, with a documented warning about proxy-chain testing.
>
> This gets us the robustness benefits immediately, keeps the door open for
> backward compatibility and niche use cases, and avoids a long tail of "it
> works on my laptop, fails in prod" tickets. WDYT?
>
> Yufei
>
>
> On Thu, Apr 23, 2026 at 6:07 AM Alexandre Dutra <[email protected]> wrote:
>
> > Hi Yufei,
> >
> > Yes, I think we can view storage location sanitizing as a parallel effort.
> >
> > With that, here is a simple PR that aims at forbidding slashes and a
> > few other pathological cases for Iceberg and Generic Tables entities
> > at creation time:
> >
> > https://github.com/apache/polaris/pull/4282
> >
> > Thanks,
> > Alex
> >
> > On Thu, Apr 23, 2026 at 1:14 AM Yufei Gu <[email protected]> wrote:
> > >
> > > Hi Alex, it's a good point that the storage location build is also
> > > affected, but it feels less controversial and somewhat separate from the
> > > main question here.
> > >
> > > The immediate discussion, at least from my perspective, is about entity
> > > naming guardrails and externally visible behavior, for example preventing
> > > names that are ambiguous or likely to break REST access and cross client
> > > behavior.
> > >
> > > > Storage location construction is important too, but that feels more like an
> > > > internal implementation hardening task than a spec or user-facing semantics
> > > > question. I would view it as a parallel track rather than something that
> > > > should block agreement on the narrower entity name issue. I'm also fine if
> > > > someone wants to tackle the location building issue first. That could
> > > > provide useful context for resolving the user-facing naming questions.
> > >
> > > Yufei
> > >
> > >
> > > > On Wed, Apr 22, 2026 at 8:28 AM Alexandre Dutra <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Disallowing the most problematic cases seems the right way to go. I
> > > > can provide a PR to quickly implement that.
> > > >
> > > > However, we must keep in mind that disallowing a few chars will not
> > > > solve all our problems. IMHO we need to consistently replace all
> > > > string concatenations that we use today for creating storage locations
> > > > with a proper location builder that will take care of proper path
> > > > escaping and sanitization. That part of the job is way more complex,
> > > > due to the blast radius.
> > > >
> > > > Thanks,
> > > > Alex
> > > >
> > > >
> > > > On Wed, Apr 22, 2026 at 2:07 AM Yufei Gu <[email protected]> wrote:
> > > > >
> > > > > Sorry for jumping into this thread a bit late.
> > > > >
> > > > > > I’m supportive of introducing some guardrails for namespace and table or
> > > > > > view names. Specifically, I think we should disallow a few problematic
> > > > > > cases to avoid ambiguity and downstream issues:
> > > > >
> > > > >    - Disallow the slash character “/”
> > > > >    - Disallow empty strings
> > > > >    - Disallow leading or trailing whitespace
> > > > >
> > > > > > These constraints seem reasonable given the interactions across REST,
> > > > > > storage paths, and different client behaviors. Adding clear guardrails
> > > > > > early can prevent subtle bugs and inconsistencies later on. Curious to hear
> > > > > > if others see any concerns or edge cases with this approach.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Yufei
> > > > >
> > > > >
> > > > > > On Thu, Apr 16, 2026 at 9:11 AM Alexandre Dutra <[email protected]> wrote:
> > > > >
> > > > > > > > Do you think it's worth having a separate discussion about guardrails for
> > > > > > > > namespace elements and table/view names? [...]
> > > > > >
> > > > > > Completely agree here. I think the slash character in particular
> > > > > > should definitely be banned.
> > > > > >
> > > > > > Thanks,
> > > > > > Alex
> > > > > >
> > > > > > > On Thu, Apr 16, 2026 at 6:03 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > >
> > > > > > > > > Do you think it's worth having a separate discussion about guardrails for
> > > > > > > > > namespace elements and table/view names? [...]
> > > > > > >
> > > > > > > Definitely!
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Dmitri.
> > > > > > >
> > > > > > > > On Thu, Apr 16, 2026 at 6:57 AM Robert Stupp <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > > spark-sql ()> create namespace `n/s`;
> > > > > > > > > However, the S3 location in this case gets a proper directory
> > > > > > breakdown:
> > > > > > > > > ... and table metadata has: "location":"s3://pol/n/s/t1"
> > > > > > > > > ... but that is probably a different issue.
> > > > > > > >
> > > > > > > > > Yea, it's different from the URL en/decoding topic. Do you think it's worth
> > > > > > > > > having a separate discussion about guardrails for namespace elements and
> > > > > > > > > table/view names? For example, disallowing '/', disallowing empty/blank
> > > > > > > > > namespace elements and table/view names, disallowing leading/trailing
> > > > > > > > > whitespaces? Sure, some of these checks already happen, but not at every
> > > > > > > > > level/layer (defense-in-depth).
> > > > > > > >
> > > > > > > > > > when Iceberg itself will introduce configurable separators, we MAY ask
> > > > > > > > > > ourselves if Polaris should allow them to be configurable or not. [...]
> > > > > > > > > > separator is just a REST layer thing
> > > > > > > >
> > > > > > > > > True, the separator is primarily a REST-layer namespace en/decoding
> > > > > > > > > thing. What worries me slightly is that (existing) namespace elements with
> > > > > > > > > the configured separator character could become inaccessible. However,
> > > > > > > > > "configurable separator" is IMO a different discussion.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Robert
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Wed, Apr 15, 2026 at 8:20 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi All,
> > > > > > > > >
> > > > > > > > > > My understanding of the need to make namespace separators configurable is
> > > > > > > > > > that there exists a rather narrow set of deployment cases where the ASCII
> > > > > > > > > > "0x1F" (unit separator) character is not permitted in URL paths by some
> > > > > > > > > > infrastructure components.
> > > > > > > > >
> > > > > > > > > > It might be worth allowing users to define a different separator, but since
> > > > > > > > > > no one has brought this up yet, I assume it is not a priority.
> > > > > > > > >
> > > > > > > > > > In any case, using a different separator is completely a REST API
> > > > > > > > > > concern and should not affect how Polaris stores data internally.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Dmitri.
> > > > > > > > >
> > > > > > > > > On Wed, Apr 15, 2026 at 2:03 PM Alexandre Dutra <
> > > > [email protected]>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > > I wonder how namespace elements and table/view names
> > with a
> > > > slash
> > > > > > > > ('/')
> > > > > > > > > > character in the middle behave. Or other characters like
> > '&' or
> > > > > > '?' or
> > > > > > > > > '#'.
> > > > > > > > > >
> > > > > > > > > > For the REST layer, these will be percent-encoded, and
> > with my
> > > > PR
> > > > > > to
> > > > > > > > > > fix a double-decoding issue, these characters "survive" the
> > > > REST
> > > > > > layer
> > > > > > > > > > just fine.
> > > > > > > > > >
> > > > > > > > > > The issue now is in some layers beneath: as I pointed out
> > and
> > > > as
> > > > > > > > > > Dmitri demonstrated, we are unfortunately concatenating
> > > > identifiers
> > > > > > > > > > together to create storage locations, without proper
> > escaping.
> > > > This
> > > > > > > > > > currently results in corrupted storage locations.
> > > > > > > > > >
> > > > > > > > > > > I'm trying to fix the REST layer first, then I'll move to the
> > > > > > > > > > > storage layer.
> > > > > > > > > >
> > > > > > > > > > > What's your take on leveraging
> > > > > > > > jakarta.ws.rs.ext.ParamConverterProvider
> > > > > > > > > > / jakarta.ws.rs.ext.ParamConverter for the path parameters
> > and
> > > > have
> > > > > > > > > > centralized helpers that deal with "proper" URL
> > > > encoding/decoding?
> > > > > > > > > >
> > > > > > > > > > For now I don't see a valid usage in Polaris for that,
> > since
> > > > Jersey
> > > > > > > > > > handles decoding path parameters already.
> > > > > > > > > >
> > > > > > > > > > > I also agree that the "configurable namespace separator"
> > must
> > > > > > never
> > > > > > > > > > change. Is my assumption correct, that it must always be
> > the
> > > > same
> > > > > > > > > character
> > > > > > > > > > as it is today?
> > > > > > > > > >
> > > > > > > > > > In Polaris, we are using the namespace separator in two
> > > > different
> > > > > > use
> > > > > > > > > > cases:
> > > > > > > > > >
> > > > > > > > > > 1) For path parameters in the REST layer
> > > > > > > > > > 2) For storing namespaces in Polaris entities
> > > > > > > > > >
> > > > > > > > > > > What is clear is that in the second use case, the separator must NEVER
> > > > > > > > > > > change. I just opened a PR for that:
> > > > > > > > > > > https://github.com/apache/polaris/pull/4214
> > > > > > > > > >
> > > > > > > > > > Regarding the first use case, once we solve all our
> > > > > > encoding/decoding
> > > > > > > > > > issues, and when Iceberg itself will introduce configurable
> > > > > > > > > > separators, we MAY ask ourselves if Polaris should allow
> > them
> > > > to be
> > > > > > > > > > configurable or not. I don't have strong opinions, but if
> > the
> > > > > > > > > > separator is just a REST layer thing, it should be
> > possible to
> > > > > > change
> > > > > > > > > > it without breaking the storage layer or the metastore.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Alex
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 15, 2026 at 7:47 PM Dmitri Bourlatchkov <
> > > > > > [email protected]>
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi All,
> > > > > > > > > > >
> > > > > > > > > > > Slashes in namespace seem to work fine (Spark 3.5 +
> > Iceberg
> > > > > > 1.10.0):
> > > > > > > > > > >
> > > > > > > > > > > spark-sql ()> create namespace `n/s`;
> > > > > > > > > > > Time taken: 0.335 seconds
> > > > > > > > > > > spark-sql ()> show namespaces;
> > > > > > > > > > > `n/s`
> > > > > > > > > > > Time taken: 0.232 seconds, Fetched 1 row(s)
> > > > > > > > > > > spark-sql ()> use `n/s`;
> > > > > > > > > > > Time taken: 0.028 seconds
> > > > > > > > > > > spark-sql (`n/s`)> create table t1 (n string);
> > > > > > > > > > > Time taken: 0.702 seconds
> > > > > > > > > > >
> > > > > > > > > > > The URLs appear to be encoded properly, e.g. (from
> > Polaris
> > > > log):
> > > > > > > > > > >
> > > > > > > > > > > 2026-04-15 13:41:17,594 INFO  [io.qua.htt.access-log]
> > > > > > > > > > >
> > > > > > [dee1505c-ec1d-4f90-a9de-154eac66a40c_0000000000000000013,POLARIS]
> > > > > > > > > [,,,]
> > > > > > > > > > > (executor-thread-1) 127.0.0.1 - root
> > [15/Apr/2026:13:41:17
> > > > -0400]
> > > > > > > > "GET
> > > > > > > > > > >
> > /api/catalog/v1/polaris/namespaces/n%2Fs/tables?pageToken=
> > > > > > HTTP/1.1"
> > > > > > > > > 200
> > > > > > > > > > 74
> > > > > > > > > > >
> > > > > > > > > > > I did not test trickier chars, but adding CI coverage for
> > > > them
> > > > > > would
> > > > > > > > be
> > > > > > > > > > > good.
> > > > > > > > > > >
> > > > > > > > > > > However, the S3 location in this case gets a proper
> > directory
> > > > > > > > > breakdown:
> > > > > > > > > > >
> > > > > > > > > > > $ mc ls rustfs/pol/n/s
> > > > > > > > > > > [2026-04-15 13:44:37 EDT]     0B t1/
> > > > > > > > > > >
> > > > > > > > > > > ... and table metadata has: "location":"s3://pol/n/s/t1"
> > > > > > > > > > >
> > > > > > > > > > > ... but that is probably a different issue.
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Dmitri.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Apr 15, 2026 at 10:35 AM Robert Stupp <
> > > > [email protected]>
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks Alex for the thorough investigation!
> > > > > > > > > > > >
> > > > > > > > > > > > URL en/decoding is really not that easy.
> > > > > > > > > > > > I wonder how namespace elements and table/view names
> > with a
> > > > > > slash
> > > > > > > > > ('/')
> > > > > > > > > > > > character in the middle behave. Or other characters
> > like
> > > > '&'
> > > > > > or '?'
> > > > > > > > > or
> > > > > > > > > > '#'.
> > > > > > > > > > > >
> > > > > > > > > > > > Overall, I agree with your idea to implement correct
> > URL
> > > > > > > > > > encoding/decoding
> > > > > > > > > > > > in the Polaris code base to protect Polaris from
> > upstream
> > > > > > behavior
> > > > > > > > > > changes
> > > > > > > > > > > > that can seriously break or even corrupt things.
> > > > > > > > > > > >
> > > > > > > > > > > > What's your take on leveraging
> > > > > > > > > jakarta.ws.rs.ext.ParamConverterProvider
> > > > > > > > > > > > / jakarta.ws.rs.ext.ParamConverter for the path
> > parameters
> > > > and
> > > > > > have
> > > > > > > > > > > > centralized helpers that deal with "proper" URL
> > > > > > encoding/decoding?
> > > > > > > > > > > >
> > > > > > > > > > > > I also agree that the "configurable namespace
> > separator"
> > > > must
> > > > > > never
> > > > > > > > > > change.
> > > > > > > > > > > > Is my assumption correct, that it must always be the
> > same
> > > > > > character
> > > > > > > > > as
> > > > > > > > > > it
> > > > > > > > > > > > is today?
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Robert
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Apr 15, 2026 at 3:48 PM Alexandre Dutra <
> > > > > > [email protected]
> > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > >
> > > > > > > > > > > > > FYI I created a first PR to address the
> > double-decoding
> > > > > > issue:
> > > > > > > > > > > > >
> > > > > > > > > > > > > https://github.com/apache/polaris/pull/4210
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Alex
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Apr 14, 2026 at 9:56 PM Alexandre Dutra <
> > > > > > > > [email protected]
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I would also point out that Polaris uses
> > > > > > > > RESTUtil.encodeNamespace
> > > > > > > > > > and
> > > > > > > > > > > > > > RESTUtil.decodeNamespace for encoding and decoding
> > the
> > > > > > parent
> > > > > > > > > > > > > > namespace within a NamespaceEntity [1].
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > These methods also exhibit the faulty space
> > encoding
> > > > > > behavior.
> > > > > > > > > > > > > > Therefore, we must exercise **extreme caution**
> > > > regarding
> > > > > > any
> > > > > > > > > > upcoming
> > > > > > > > > > > > > > Iceberg project fixes for space-encoding issues. If
> > > > these
> > > > > > > > methods
> > > > > > > > > > are
> > > > > > > > > > > > > > modified, it is imperative that we retain the
> > legacy
> > > > > > versions
> > > > > > > > > > > > > > specifically for encoding and decoding
> > NamespaceEntity
> > > > > > > > > properties –
> > > > > > > > > > > > > > otherwise we could end up with a corrupted
> > database.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The same goes for the future namespace separator
> > coming
> > > > > > with
> > > > > > > > > > Iceberg
> > > > > > > > > > > > > > 1.11: for the sake of encoding and decoding
> > > > NamespaceEntity
> > > > > > > > > > > > > > properties, the separator must never change.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I would actually be in favor of proactively
> > > > internalizing
> > > > > > the
> > > > > > > > > > > > > > encoding/decoding algorithm used in
> > NamespaceEntity.
> > > > What
> > > > > > do
> > > > > > > > you
> > > > > > > > > > > > > > think?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > Alex
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1]:
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/polaris/blob/8ad8f74f62258ab6238190271603e4d4c8a75998/polaris-core/src/main/java/org/apache/polaris/core/entity/NamespaceEntity.java#L92
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Apr 14, 2026 at 7:43 PM Alexandre Dutra <
> > > > > > > > > [email protected]
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > A discussion on the Iceberg ML [1] recently
> > > > highlighted
> > > > > > that
> > > > > > > > > URL
> > > > > > > > > > path
> > > > > > > > > > > > > > > segments are not being decoded correctly
> > according
> > > > to RFC
> > > > > > > > 3986,
> > > > > > > > > > > > > > > specifically regarding space encoding.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I investigated the situation in Polaris, and
> > found
> > > > many
> > > > > > > > > problems:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > TLDR
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - Table names with the + sign can be created but
> > > > cannot
> > > > > > be
> > > > > > > > > > retrieved
> > > > > > > > > > > > > > > - Namespace names with the + sign are OK (can be
> > > > created
> > > > > > and
> > > > > > > > > > > > retrieved)
> > > > > > > > > > > > > > > - Table names with spaces cannot be created
> > > > > > > > > > > > > > > - Namespace names with spaces cannot be created
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > DISCUSSION
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Table names such as "foo+bar" can be created (via
> > > > POST,
> > > > > > where
> > > > > > > > > the
> > > > > > > > > > > > name
> > > > > > > > > > > > > > > is in the request body). But they cannot be
> > > > retrieved:
> > > > > > when
> > > > > > > > > > reading
> > > > > > > > > > > > > > > tables, the name is part of the URL path. Polaris
> > > > > > incorrectly
> > > > > > > > > > > > performs
> > > > > > > > > > > > > > > a second decoding step using
> > > > > > RESTUtil.decodeString(table),
> > > > > > > > even
> > > > > > > > > > > > though
> > > > > > > > > > > > > > > the REST framework has already decoded it.
> > > > Consequently,
> > > > > > a
> > > > > > > > > client
> > > > > > > > > > > > > > > sends "foo%2Bbar" which is first decoded to
> > > > "foo+bar" by
> > > > > > the
> > > > > > > > > > > > framework
> > > > > > > > > > > > > > > (correct) and then re-decoded by Polaris to "foo
> > bar"
> > > > > > > > > > (incorrect),
> > > > > > > > > > > > > > > resulting in a "not found" error.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Table and namespace names like "foo bar" simply
> > > > cannot be
> > > > > > > > > > created at
> > > > > > > > > > > > > > > all. This is because in
> > > > > > > > > > IcebergCatalog.defaultWarehouseLocation() and
> > > > > > > > > > > > > > > other similar places, we create locations merely
> > by
> > > > > > joining
> > > > > > > > > > > > > > > identifiers together, without any form of URL
> > > > encoding:
> > > > > > see
> > > > > > > > [2]
> > > > > > > > > > [3].
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > And even if tables like "foo bar" could be
> > created,
> > > > they
> > > > > > > > > > couldn't be
> > > > > > > > > > > > > > > retrieved by Java clients. This occurs because
> > > > current
> > > > > > Java
> > > > > > > > > > clients
> > > > > > > > > > > > > > > incorrectly encode that name as "foo+bar", which
> > the
> > > > REST
> > > > > > > > > > framework
> > > > > > > > > > > > > > > does not modify. Consequently, Polaris would look
> > > > for a
> > > > > > table
> > > > > > > > > > named
> > > > > > > > > > > > > > > "foo+bar" instead and throw a "not found" error.
> > > > (Other
> > > > > > > > clients
> > > > > > > > > > would
> > > > > > > > > > > > > > > send "foo%20bar" which would be correctly
> > decoded by
> > > > the
> > > > > > > > > > framework as
> > > > > > > > > > > > > > > "foo bar", and thus it would succeed.)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > PROPOSAL
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > To resolve the issue with the + sign in table
> > names,
> > > > we
> > > > > > > > simply
> > > > > > > > > > need
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > eliminate the redundant decoding step. I can
> > open a
> > > > PR
> > > > > > for
> > > > > > > > that
> > > > > > > > > > > > > > > shortly.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > To resolve the issue with spaces in table and
> > > > namespace
> > > > > > > > names,
> > > > > > > > > we
> > > > > > > > > > > > > > > could fix all the methods that incorrectly join
> > > > together
> > > > > > > > > > identifiers
> > > > > > > > > > > > > > > without proper URL encoding.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Finally, addressing the Java clients encoding
> > > > problem is
> > > > > > > > > > complex, but
> > > > > > > > > > > > > > > we could consider implementing a workaround as
> > > > follows:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1) If the client is Java and lacks the upcoming
> > > > Iceberg
> > > > > > fix
> > > > > > > > for
> > > > > > > > > > space
> > > > > > > > > > > > > > > encoding, manually replace "+" with a space to
> > > > correct
> > > > > > the
> > > > > > > > > > client's
> > > > > > > > > > > > > > > faulty encoding.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2) For non-Java clients or those with the fix, no
> > > > > > workaround
> > > > > > > > > > would be
> > > > > > > > > > > > > required.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > What are your thoughts on this?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > Alex
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > [1]:
> > > > > > > > > > > >
> > > > > > https://lists.apache.org/thread/c498svln0x18vvm42998b9nm9j6ck5yh
> > > > > > > > > > > > > > > [2]:
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/polaris/blob/e94fdff63852dc41635c9e7eb62b3627ba562b85/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java#L379
> > > > > > > > > > > > > > > [3]:
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/polaris/blob/e94fdff63852dc41635c9e7eb62b3627ba562b85/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalog.java#L571
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >
