Re: [Community Meeting] NoSQL Support in Apache Polaris

2025-07-09 Thread Adam Christian
This meeting is to:
1. Walk through this proposal
https://docs.google.com/document/d/1POUWe0xMZOBoaJ6Rgiw35ziEoc6OEYCiW7Zk6bR9H6M/edit?tab=t.0#heading=h.nx9vzhg2x8v2
2. Maybe look at some of the high-level modules in the draft PR
https://github.com/apache/polaris/pull/1189

Regarding Spanner, this proposal handles most databases that can:
1. Do single-row compare-and-swap
2. Do optimized point queries

So, I would expect that Spanner could be covered under this, but right now,
it's not an explicit goal.
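
To make those two capabilities concrete, here is a minimal in-memory sketch. It is not Polaris code: `InMemoryRowStore`, `get`, and `compare_and_swap` are hypothetical names chosen for illustration, and a real backend would use the database's own conditional-write primitive instead of a lock.

```python
import threading
from typing import Dict, Optional


class InMemoryRowStore:
    """Toy stand-in for a database offering point reads and single-row CAS."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[str]:
        # Point query: fetch exactly one row by its primary key.
        with self._lock:
            return self._rows.get(key)

    def compare_and_swap(self, key: str, expected: Optional[str], new: str) -> bool:
        # Single-row CAS: write `new` only if the row still holds `expected`.
        with self._lock:
            if self._rows.get(key) != expected:
                return False
            self._rows[key] = new
            return True


store = InMemoryRowStore()
first_write = store.compare_and_swap("pointer", None, "commit-1")  # row absent: succeeds
stale_write = store.compare_and_swap("pointer", None, "commit-2")  # stale expectation: fails
current = store.get("pointer")
```

The second write is rejected because its expectation (row absent) no longer matches what is stored, which is the behavior a backend would need to provide.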

Go team,

Adam Christian
Principal Software Engineer
adam.christ...@dremio.com
LinkedIn Profile <https://www.linkedin.com/in/adam-christian-40309824/>

<https://hello.dremio.com/email-signature-url>



On Wed, Jul 9, 2025 at 11:37 AM Eric Maynard 
wrote:

> I know that there was a spanner implementation mentioned in a recent
> community sync, is that also part of the same discussion?
>
> On Wed, Jul 9, 2025 at 11:12 AM Jean-Baptiste Onofré 
> wrote:
>
> > Actually the invite was not correct: the meeting is on 7/17 as
> > originally in my first email.
> >
> > Sorry for the confusion.
> >
> > Regards
> > JB
> >
> > On Wed, Jul 9, 2025 at 2:06 PM Jean-Baptiste Onofré 
> > wrote:
> > >
> > > My bad, yes, it's 7/16.
> > >
> > > Regards
> > > JB
> > >
> > > On Wed, Jul 9, 2025 at 2:04 PM Robert Stupp  wrote:
> > > >
> > > > The invite says July 16th?
> > > >
> > > > On Wed, Jul 9, 2025 at 8:02 PM Jean-Baptiste Onofré wrote:
> > > > >
> > > > > Hi folks,
> > > > >
> > > > > Several contributors asked for an update about NoSQL support in
> > > > > Apache Polaris.
> > > > >
> > > > > Now that Polaris 1.0 is almost there :) I propose to have a meeting
> > > > > to resume the discussion.
> > > > >
> > > > > I scheduled a community meeting for July 17th 9am PST:
> > > > >
> > > > > https://calendar.app.google/8gBEGDxq2vqoQePz5
> > > > >
> > > > > If you are in the Polaris group, you should have received the
> > > > > invite already.
> > > > >
> > > > > Thoughts ?
> > > > >
> > > > > Thanks !
> > > > > Regards
> > > > > JB
> >
>


Re: [ANNOUNCE] Apache Polaris 1.0.0-incubating

2025-07-10 Thread Adam Christian
Congrats everyone!

Go team,

Adam Christian

On Wed, Jul 9, 2025 at 11:03 PM Omar Al-Safi  wrote:

> Congratulations folks for this milestone!
>
> On Thu, 10 Jul 2025, 02:41 Yufei Gu,  wrote:
>
> > The Apache Polaris team is pleased to announce Apache Polaris
> > 1.0.0-incubating.
> >
> > Apache Polaris is an open-source, fully-featured catalog for Apache
> > Iceberg™. It implements Iceberg's REST API, enabling seamless
> > multi-engine interoperability across a wide range of platforms,
> > including Apache Doris™, Apache Flink®, Apache Spark™, Dremio®,
> > StarRocks, and Trino.
> >
> > This release can be downloaded from:
> > https://polaris.apache.org/downloads/
> >
> > Release notes: https://polaris.apache.org/downloads/#100-release
> >
> > Artifacts are available on Maven Central.
> >
> > Website: https://polaris.apache.org
> > Mailing list: dev@polaris.apache.org
> >
> > Enjoy!
> >
> > Yufei
> >
>


Any volunteers for the next Polaris Release Manager?

2025-07-10 Thread Adam Christian
Howdy community!

Per our conversation in the community sync, do we have any volunteers for
our next release manager?

Go team,

Adam Christian


[DISCUSS] Dedicated Community Backlog Meeting

2025-07-14 Thread Adam Christian
Howdy Polaris Community,

Now that 1.0 has shipped, I think it might be nice to have a dedicated
community meeting where we discuss what large Polaris enhancements we want
to work on in the next few months.

From what I've learned from folks, back in February, the community came
together and had a rough idea of what would be on the "Polaris backlog" for
the next few releases: https://github.com/apache/polaris/discussions/1028

I know that Yufei has done a good job of updating this as we have shipped
large enhancements, but I believe it would be nice to have the whole
community at the table, so we can iterate together. Just to be clear, I'm
not saying that anything generated from these talks would be considered a
"fixed" commitment and I don't want to tie specific enhancements to
specific releases, but I do think it would be helpful to speak about where
this project is going.

What do y'all think?

Go team,

Adam Christian


Re: [DISCUSS] Apache Polaris (incubating) 1.1.0 release mid August ?

2025-07-18 Thread Adam Christian
This time frame looks good to me. It would be great to have the regular
cadence of monthly releases for our users (especially as so much is coming
in)!

Do we have an idea of what sort of enhancements would likely make it into
the release? I think MinIO support would be fully in place by that point.
Anything else?

Go team,

Adam Christian

On Thu, Jul 17, 2025 at 10:07 PM Jean-Baptiste Onofré 
wrote:

> Hi folks,
>
> I propose to have 1.1.0-incubating release around August 20.
>
> On github:
> 1. I updated 1.1.0 milestone due date
> 2. I will move the open issues still on 1.0.0 milestone to 1.1.0
> 3. I will close the 1.0.0 milestone (as it has been released)
>
> My main focus is to review/update/improve the release guide and move
> forward on semi-automatic release.
>
> Thoughts ?
>
> If you agree, feel free to create/assign issues for the 1.1.0 milestone.
>
> Thanks !
> Regards
> JB
>


[DISCUSS] - NoSQL Persistence

2025-07-18 Thread Adam Christian
Howdy folks!

Thanks for joining the community meeting about the NoSQL presentation
yesterday. In this mail, I'd love to:
1. Detail the plan of moving forward with NoSQL
2. Gather more feedback on the work

*Moving Forward*
In terms of moving forward, I'll be opening a series of PRs that
are based upon the initial implementation PR:
https://github.com/apache/polaris/pull/1189 . The goal is to break this
into smaller, cohesive PRs. Those PRs will only cover items that are not
actively under discussion. For example, some frameworks built in that
original PR, such as the code that deals with Snowflake ID generation, can
easily be carved out and reviewed on their own. Like everything else, I'll
be working with folks on what we think is best for the community.

*Feedback*
In terms of feedback, I gathered a few items from the session. I'll answer
them here and, also, start an FAQ Doc
<https://docs.google.com/document/d/1NvZp9ro9FXvK_jkUlKSym03BhpVQU9gqIkGFC_M5bOg/edit?tab=t.0#heading=h.h7yhlew0hwvq>
where we can keep track of the frequently asked questions as this is a
large chunk of work and I expect that we might have the same question asked
a few times. :)

During our discussion, I noted 5 main pieces of feedback:

   1. Is there a bottleneck on the catalog content "named pointer" during
   commits?
   2. How are we handling caching with this approach?
   3. Are there any scenarios where we are going to be crossing "named
   pointers", but we need to be able to ensure consistency?
   4. In the initial implementation PR, there were a few modules
   concerning some authorization stuff
   <https://github.com/apache/polaris/pull/1189/files#diff-82794bfe7193249c378e723e9a4ca243212e18b195d353248b1c470fa9f89104>.
   Can you explain how this interfaces with the existing Polaris authorization
   system?
   5. Can we revisit the name of "Named Pointer"?


*#1 - Catalog Content Bottleneck*
*Question Details:* The catalog content “Named Pointer” needs to be updated
anytime there is a write to any catalog content. This could be a bottleneck
because the compare-and-swap (CAS) of the “Named Pointer” will only succeed
if the new commit is applied directly on top of the commit that was current
when the commit retry loop started. If that is not the case, the commit
will have to be partially rebuilt.
*Answer:* While it is true that the commit will have to be partially
rebuilt if it fails the CAS, Pierre Laporte has done extensive scale
testing of the initial implementation PR and found that this does not limit
high concurrency in practice. I'll work with Pierre to send out the scale
testing information to the team.
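
The retry behavior described in this question can be sketched roughly as follows. This is a simplified, single-process illustration and not the code from the PR: `PointerStore`, `cas_head`, and `commit_with_retry` are hypothetical names, and the "concurrent writer" is simulated inside the change function so that the conflict is deterministic.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

State = Tuple[str, ...]


@dataclass(frozen=True)
class Commit:
    commit_id: int
    parent_id: Optional[int]
    state: State


class PointerStore:
    """Holds immutable commits plus one mutable 'named pointer' (head)."""

    def __init__(self) -> None:
        self.commits: Dict[int, Commit] = {}
        self.head: Optional[int] = None
        self._ids = itertools.count(1)

    def next_id(self) -> int:
        return next(self._ids)

    def cas_head(self, expected: Optional[int], new_commit: Commit) -> bool:
        # Succeeds only if head is still the commit the writer started from.
        if self.head != expected:
            return False
        self.commits[new_commit.commit_id] = new_commit
        self.head = new_commit.commit_id
        return True


def commit_with_retry(store: PointerStore, change: Callable[[State], State],
                      max_attempts: int = 5) -> Commit:
    for _ in range(max_attempts):
        base_id = store.head
        base_state = store.commits[base_id].state if base_id is not None else ()
        # Rebuild the commit on top of whatever head is now.
        candidate = Commit(store.next_id(), base_id, change(base_state))
        if store.cas_head(base_id, candidate):
            return candidate
    raise RuntimeError("gave up after repeated CAS conflicts")


store = PointerStore()
commit_with_retry(store, lambda s: s + ("table_a",))

attempts = []

def add_table_b(state: State) -> State:
    attempts.append(state)
    if len(attempts) == 1:
        # Simulate a concurrent writer landing a commit mid-attempt.
        commit_with_retry(store, lambda s: s + ("table_c",))
    return state + ("table_b",)

final = commit_with_retry(store, add_table_b)
```

In this run the first attempt loses the CAS because another commit moved the pointer, so the change is rebuilt once on top of the new head and then succeeds.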

*#2 - Caching*
*Question Details:* There is some discussion about interfacing with a cache
in a layer above the persistence layer versus having the persistence layer
own the cache.
*Answer:* This model mostly relies on immutable objects, which helps with
caching. This implementation does its own caching and, because of the
object immutability, does not necessarily need an EntityCache at a higher
level of abstraction above the persistence layer.
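
As a rough illustration of why immutability helps here: an object that never changes after it is written can be cached by ID with no invalidation logic. The sketch below is an assumption-laden toy, not the caching in the PR; `ImmutableObjectCache` and the `backend` dictionary are hypothetical.

```python
from typing import Callable, Dict, Tuple

StoredObject = Tuple[str, str]  # immutable value, e.g. (kind, name)


class ImmutableObjectCache:
    """Caches objects by ID; entries are never invalidated because the
    objects they hold are assumed never to change once written."""

    def __init__(self, load: Callable[[str], StoredObject]) -> None:
        self._load = load
        self._cache: Dict[str, StoredObject] = {}
        self.backend_reads = 0

    def get(self, object_id: str) -> StoredObject:
        if object_id not in self._cache:
            # Cache miss: read from the backend exactly once per ID.
            self.backend_reads += 1
            self._cache[object_id] = self._load(object_id)
        return self._cache[object_id]


backend = {"obj-1": ("table", "orders"), "obj-2": ("namespace", "sales")}
cache = ImmutableObjectCache(backend.__getitem__)
first = cache.get("obj-1")
second = cache.get("obj-1")
```

Only the mutable pointer would still need a fresh read; everything reachable from it can be served from the cache.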

*#3 - Cross "Named Pointer" Consistency*
*Question Details:* For example, when a user creates a catalog, the code
has to create a catalog role, a grant record associated with a principal
role, and a catalog. This crosses three separate “Named Pointers”. How
should we solve this?
*Answer:* If the code serializes the creation of the grant record, the
catalog role, and the catalog, this should be solved in practice, as long
as there is an out-of-band clean-up to ensure proper consistency.
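
One way to picture this answer is the sketch below. It assumes a design in which the catalog is written last and a separate clean-up pass removes grant records and catalog roles whose catalog never appeared. All names (`Stores`, `create_catalog`, `clean_up_orphans`) are hypothetical and the stores are plain in-memory collections.

```python
from typing import Dict, Set


class Stores:
    """Three independently updated collections, one per 'named pointer'."""

    def __init__(self) -> None:
        self.grants: Dict[str, str] = {}         # grant id -> catalog name
        self.catalog_roles: Dict[str, str] = {}  # role id -> catalog name
        self.catalogs: Set[str] = set()


def create_catalog(stores: Stores, name: str, fail_before_catalog: bool = False) -> None:
    # Writes happen one after another, with the catalog itself written last.
    stores.grants[f"grant:{name}"] = name
    stores.catalog_roles[f"role:{name}"] = name
    if fail_before_catalog:
        raise RuntimeError("simulated crash before the catalog was written")
    stores.catalogs.add(name)


def clean_up_orphans(stores: Stores) -> int:
    # Out-of-band pass: drop rows that point at a catalog that does not exist.
    removed = 0
    for table in (stores.grants, stores.catalog_roles):
        for key in [k for k, cat in table.items() if cat not in stores.catalogs]:
            del table[key]
            removed += 1
    return removed


stores = Stores()
create_catalog(stores, "sales")
try:
    create_catalog(stores, "hr", fail_before_catalog=True)
except RuntimeError:
    pass  # the process "crashed"; orphaned rows for "hr" remain
removed = clean_up_orphans(stores)
```

A real clean-up would also need to avoid deleting rows that belong to a creation still in flight, for example by only considering rows older than some threshold.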

*#4 - Authorization Items*
*Answer:* The initial implementation was done on March 17th. A lot has
changed since then around project maturity. At this point, Robert & I would
only bring in the necessary items to persist grants. If the privilege
checking mechanism implemented in the initial implementation is helpful in
the future, someone can file a different enhancement issue to incorporate
it.

*#5 - "Named Pointer" Name*
*Question Details:* The name of "Named Pointer" tells exactly what the
thing is but not necessarily how it functions in the codebase.
*Answer:* I would be amenable to changing this name. I could see several
other names:
1. Consistency Boundary
2. State Reference
3. State Pointer
4. Consistency Groupings
In my opinion, we could probably solve this during PRs, but I understand
that names are important and hard.

If you made it to the end of this mail, congrats! Let me know if you have
any further questions or feedback!

Go team,

Adam Christian


[DISCUSS] - When to remove EclipseLink?

2025-09-17 Thread Adam Christian
Howdy Polaris Community!

I was going through our open bugs and I noticed that there are around 5 to
10 bugs related to EclipseLink persistence. I was wondering when we
believe a good time to remove EclipseLink would be.

Personally, I think we could probably start doing it now since it's been
deprecated since 1.0.0 and we have a clear alternative. I believe there are
several pros for our users, such as streamlined documentation, and benefits
to the contributors, such as fewer issues, dependencies, and modules.

How do y'all feel about this?

If we are aligned, I can create the issue and start working on it.

Cheers,

Adam


[DISCUSS] Enabling New ErrorProne Rules

2025-09-18 Thread Adam Christian
Howdy Polaris Community,

I was digging through some TODOs in our codebase and I saw that there were
some ErrorProne rules that we could enable. I have enabled them here
https://github.com/apache/polaris/pull/2600 . Since this was a Java
project-wide setting, I wanted to give folks some opportunity to weigh in.

Please take a gander!

Thanks,

Adam


Re: [DISCUSS] - When to remove EclipseLink?

2025-09-17 Thread Adam Christian
You are right, Russell. We should make a clear migration path, so our
EclipseLink users are able to easily transition off of EclipseLink. I know
that this has come up before [1]. Let me investigate a few options on what
guidance we can give or what tooling we can produce.

[1] https://github.com/apache/polaris/issues/1875

Cheers,

Adam

On Wed, Sep 17, 2025 at 3:49 PM Dmitri Bourlatchkov 
wrote:

> We have two migration tools:
>
> * https://github.com/apache/polaris-tools/tree/main/iceberg-catalog-migrator
>
> * https://github.com/apache/polaris-tools/tree/main/polaris-synchronizer
>
> I'm pretty confident that iceberg-catalog-migrator works well, but it can
> only migrate tables, not principals.
>
> I never personally used polaris-synchronizer, still it's supposed to
> migrate all Polaris data, including principals.
>
> Cheers,
> Dmitri.
>
> On Wed, Sep 17, 2025 at 3:13 PM Russell Spitzer wrote:
>
> > +1 I think removing EclipseLink should happen soon now that we have 2
> > releases with it deprecated. I haven't looked too deeply into this but
> > do we have a migration plan for users already on EclipseLink to get
> > over to the JDBC Impl?
> >
> > On Wed, Sep 17, 2025 at 12:53 PM Dmitri Bourlatchkov 
> > wrote:
> >
> > > Thanks for bringing this issue up, Adam!
> > >
> > > I support removing EclipseLink code immediately.
> > >
> > > My rationale:
> > >
> > > * Due to EclipseLink deprecation, non-trivial new features are not
> > > necessarily implemented there [1]
> > >
> > > * Any new bugs reported for EclipseLink are not likely to get attention
> > > because this backend is in decline.
> > >
> > > * Users had better migrate to a supported backend earlier. If migration
> > > is deferred, it will likely mean that any issues related to migration
> > > will take even longer to be found.
> > >
> > > * Polaris 1.1.0 still has EclipseLink, which offers users a supported
> > > version where critical issues could still be fixed, if they are found.
> > >
> > > * Having EclipseLink in the codebase adds overhead for new features
> > > that touch Persistence.
> > >
> > > [1]
> > > https://github.com/apache/polaris/pull/2197/files#diff-59a870c7af1578200236f22d35fd2eb75dc2a1e73e51218464eb7ba089217da7R759
> > >
> > > Cheers,
> > > Dmitri.
> > >


Re: [DISCUSS] Enabling New ErrorProne Rules

2025-09-19 Thread Adam Christian
This PR has been merged. Thanks, y'all!

On Thu, Sep 18, 2025 at 12:04 PM Robert Stupp  wrote:

> 2nd-ing my approval on the PR: +1
>
> On Thu, Sep 18, 2025 at 6:03 PM Alexandre Dutra  wrote:
> >
> > +1 to all proposed rules.
> >
> > On Thu, Sep 18, 2025 at 5:29 PM Russell Spitzer
> >  wrote:
> > >
> > > +1
> > >
> > > On Thu, Sep 18, 2025 at 9:53 AM Jean-Baptiste Onofré 
> > > wrote:
> > >
> > > > +1
> > > >
> > > > It sounds like a good idea to me.
> > > >
> > > > Regards
> > > > JB
> > > >


Re: Declarative RBAC Config

2025-09-25 Thread Adam Christian
Howdy Graeme,

Personally, I think this would be a very interesting enhancement for
Polaris. Right now, I do not believe that there has been much thought given
to this area yet, but there have been a few things on our "discussed
backlog" [1] such as attribute-based access control which would benefit
from something like this.

In terms of why this is interesting, I have seen several providers starting
to offer permissions as policies. It helps security professionals audit and
ensure that the right privileges are given to the right folks. As
organizations scale, having this as a configuration file makes building
compliance tooling much easier. For example, I know that some people have
found Cedar [2] to be a useful standard since it allows for both RBAC & ABAC.
It was open-sourced in 2023 under the Apache 2.0 License and it seems to
have a robust ecosystem around it.
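
To illustrate what an idempotent "apply" could look like, here is a small sketch that diffs a desired set of grants (as it might be parsed from a configuration file) against the current set. It is only a sketch under stated assumptions: `Grant`, `plan_apply`, and `apply_grants` are hypothetical names, the role and privilege strings are illustrative, and nothing here calls a real Polaris API.

```python
from typing import NamedTuple, Set, Tuple

Grant = Tuple[str, str, str]  # (catalog role, privilege, securable)


class Plan(NamedTuple):
    to_add: Set[Grant]
    to_revoke: Set[Grant]


def plan_apply(desired: Set[Grant], current: Set[Grant]) -> Plan:
    # Only the difference is acted on, so re-running yields an empty plan.
    return Plan(to_add=desired - current, to_revoke=current - desired)


def apply_grants(desired: Set[Grant], current: Set[Grant]) -> Set[Grant]:
    plan = plan_apply(desired, current)
    return (current | plan.to_add) - plan.to_revoke


desired = {
    ("analyst", "TABLE_READ_DATA", "sales"),
    ("admin", "CATALOG_MANAGE_CONTENT", "sales"),
}
current = {
    ("analyst", "TABLE_READ_DATA", "sales"),
    ("intern", "TABLE_WRITE_DATA", "sales"),
}

first_plan = plan_apply(desired, current)
after = apply_grants(desired, current)
second_plan = plan_apply(desired, after)  # nothing left to do
```

Because the second plan is empty, running the same configuration again changes nothing, which is the property that makes such a file safe to apply repeatedly and easy to audit.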

[1] - https://github.com/apache/polaris/discussions/1028
[2] - https://www.cedarpolicy.com/en

Cheers,

Adam

On Tue, Sep 23, 2025 at 11:30 AM Graeme Hendrickson
 wrote:

> Hi folks,
>
> One of the things that’s been a little painful for us operating Polaris is
> configuring new catalogs or ensuring that a catalog has the roles and
> grants configured that we expect. Has there been any interest or thought
> put into an idempotent “apply” action for principal roles, catalog roles,
> and privilege grants based on some sort of configuration file? If not, is
> that something that’s interesting to this group?
>
> Cheers,
> Graeme
>