I apologize for the delay in responding.

I'm pleased to see the development of an open-source REST catalog
implementation, and the potential transition of Gravitino to an ASF project
is certainly promising.
However, the REST catalog server implementation will be only a small part of
the Gravitino ASF project, which covers much more than the catalog.

While I understand Iceberg's focus on the table format specification and
its implementation,
*I would like to propose the creation of a sub-project for the REST catalog
server implementation under the Iceberg repository (similar to pyiceberg,
iceberg-rust, etc.).*
This suggestion is based on several reasons:

   - Every time we change the REST spec, there is no reference
   implementation to consult or update alongside it.
   - Many companies such as AWS, Apple, Tabular, and Datastrato are each
   implementing their own REST servers.
   Consolidating efforts within a sub-project could lead to efficiency
   gains and potential collaboration opportunities.
   - From the perspective of open-source users, the absence of an
   open-source implementation for the REST catalog within Iceberg may be
   inconvenient or frustrating.

I believe creating a dedicated sub-project would address these concerns and
enhance the overall usability and collaborative nature of the Iceberg
ecosystem.
I also think we could have sub-projects for kafka-connect and the Iceberg
tools (delta converter, catalog migrator, etc.), as they need not depend on
the Iceberg release cycle and are independent of the table format spec.

Let me know your thoughts on this. I can open a separate thread for
discussion if required.

- Ajantha


On Wed, Jan 31, 2024 at 5:32 AM Jack Ye <yezhao...@gmail.com> wrote:

> +1 for using test-jar!
>
> -Jack
>
> On Fri, Jan 26, 2024 at 10:48 AM Ryan Blue <b...@tabular.io> wrote:
>
>> I think I'd be fine exposing this through a test Jar, but it seems to me
>> that if we were to put it into a normal package it would turn into the
>> situation we want to avoid. People would use it for unintended purposes and
>> it would become a distraction.
>>
>> What do you think about using the tests Jar for this?
>>
>> On Thu, Jan 25, 2024 at 12:48 PM Jack Ye <yezhao...@gmail.com> wrote:
>>
>>> Yes, sorry I did not make it clear, I also agree it is not the right
>>> direction to invest a lot of community effort. I am more talking about
>>> casual use cases like importing a server for unit tests outside Iceberg,
>>> running some local debugging, etc. I think it would be valuable to provide
>>> a server in Iceberg for that purpose, and maybe vend it as test utils.
>>> Thoughts?
>>>
>>> -Jack
>>>
>>> On Thu, Jan 25, 2024 at 11:35 AM Ryan Blue <b...@tabular.io> wrote:
>>>
>>>> > I know we have the RESTCatalogAdapter and RESTCatalogServlet for unit
>>>> tests, and technically we have a very similar Jetty server implementation
>>>> in TestRESTCatalog. Should we think about making those components out of
>>>> the tests into an iceberg-rest-server module for this use case, and merge
>>>> with the implementation that Gravitino has?
>>>>
>>>> I think that this would take the Iceberg project in the wrong
>>>> direction. Iceberg has always been a library and I think it should continue
>>>> to be. Concerns about runtime should be left to other projects that need to
>>>> fit into existing infrastructure or skillsets of people maintaining them.
>>>> The question of whether to use Jetty or Tomcat or whatever else is a
>>>> serious consideration, as is how to monitor that application and send
>>>> metrics. I think it would slow down the core purpose of Iceberg if we got
>>>> distracted by these things.
>>>>
>>>> In fact, I think that this project shows that the library is getting
>>>> the balance right: it is using `CatalogHandlers` for their intended
>>>> purpose. It has opinions about how to run the actual HTTP service and
>>>> people that agree can use it. Other people could use `CatalogHandlers` to
>>>> build on a different foundation.
>>>>
>>>> On Thu, Jan 25, 2024 at 11:13 AM Jack Ye <yezhao...@gmail.com> wrote:
>>>>
>>>>> Really cool project!
>>>>>
>>>>> I browsed a bit of the codebase, and see this implementation of the
>>>>> REST service backend:
>>>>> -
>>>>> https://github.com/datastrato/gravitino/blob/main/catalogs/catalog-lakehouse-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/IcebergRESTService.java#L39
>>>>> -
>>>>> https://github.com/datastrato/gravitino/blob/main/catalogs/catalog-lakehouse-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/ops/IcebergTableOps.java#L42-L51
>>>>>
>>>>>  Looks like it is initializing a Jetty server that uses
>>>>> CatalogHandlers to delegate the execution to a specific Java Catalog
>>>>> implementation.
>>>>>
>>>>> I think this is actually something that is lacking today in Iceberg,
>>>>> which is an easy way for users to start an actual REST HTTP server.
>>>>>
>>>>> I know we have the RESTCatalogAdapter and RESTCatalogServlet for unit
>>>>> tests, and technically we have a very similar Jetty server implementation
>>>>> in TestRESTCatalog. Should we think about making those components out of
>>>>> the tests into an iceberg-rest-server module for this use case, and merge
>>>>> with the implementation that Gravitino has?
>>>>>
>>>>> Best,
>>>>> Jack Ye
>>>>>
>>>>> On Thu, Jan 25, 2024 at 10:47 AM Yufei Gu <flyrain...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks, Justin, for sharing.
>>>>>>
>>>>>> It's pretty cool to see an open source REST catalog implementation in
>>>>>> action. Having dabbled a bit in the early development of Gravitino myself,
>>>>>> I'm really excited about its potential with the Iceberg REST catalog.
>>>>>>
>>>>>> The idea of Gravitino moving to an ASF project is promising. It’ll
>>>>>> surely boost its visibility and open up more doors for collaboration and
>>>>>> adoption.
>>>>>>
>>>>>> Looking forward to where this goes. Keep up the fantastic work!
>>>>>>
>>>>>> Yufei
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 25, 2024 at 5:55 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Justin,
>>>>>>>
>>>>>>> I talked with Junping a couple of months ago about Gravitino. Thanks
>>>>>>> for sharing!
>>>>>>>
>>>>>>> Regards
>>>>>>> JB
>>>>>>>
>>>>>>> On Thu, Jan 25, 2024 at 12:15 AM Justin Mclean <
>>>>>>> jus...@classsoftware.com> wrote:
>>>>>>> >
>>>>>>> > Hi,
>>>>>>> >
>>>>>>> > We open-sourced a new project, Gravitino, in December and have
>>>>>>> been working on growing the community and adding new functionality. We
>>>>>>> plan to donate the project to the ASF this year. Gravitino is a unified
>>>>>>> metadata lake solution offering a unified approach to managing datasets
>>>>>>> from diverse sources and regions across multiple cloud platforms. Its
>>>>>>> core is an Iceberg REST catalog service implementation to manage Iceberg
>>>>>>> tables efficiently.
>>>>>>> >
>>>>>>> > If this sounds like something you would be interested in, then the
>>>>>>> following resources will help:
>>>>>>> > -  Blog post:
>>>>>>> https://datastrato.ai/blog/gravitino-iceberg-rest-catalog-service/
>>>>>>> > -  Gravitino documentation: https://datastrato.ai/docs/0.3.1/
>>>>>>> > -  Iceberg REST service documentation:
>>>>>>> https://datastrato.ai/docs/0.3.1/iceberg-rest-service
>>>>>>> >
>>>>>>> > We welcome any feedback and suggestions you have, and as always,
>>>>>>> all contributions are welcome. You can find the source code at
>>>>>>> https://github.com/datastrato/gravitino.
>>>>>>> >
>>>>>>> > Kind Regards,
>>>>>>> > Justin
>>>>>>>
>>>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Tabular
>>>>
>>>
>>
>> --
>> Ryan Blue
>> Tabular
>>
>
