Thanks Jack and Ryan for the insights!

On Thu, Feb 9, 2023 at 5:15 PM Ryan Blue <[email protected]> wrote:

> There were a few different reasons for building the REST catalog:
>
>    1. Standardize an interface for catalogs, like the Hive Thrift API.
>    This makes customization easy: you don’t need to get Dremio or Athena to
>    put your custom catalog’s Jar in their classpath. It also makes catalogs
>    across languages more reliable and easier to manage. For example, having a
>    slightly different JDBC implementation in every language is going to lead
>    to inconsistencies and problems.
>    2. Enable new catalog features: It has been difficult to fit Iceberg
>    into existing catalogs and this has limited better features. For example,
>    we want to be able to un-drop tables for a certain period of time, to
>    support multi-table transactions, and to enable authentication and
>    authorization. I don’t think the right strategy is to try to adapt the Hive
>    MetaStore for it.
>    3. Fix some problems: There are some boring things we can fix like the
>    time it takes to load metadata files with lots of snapshots. There are also
>    issues with the way catalogs handle metadata locations and FileIO with
>    respect to tables. Ideally, these would be configured for each table, but
>    you have to create a FileIO to read the metadata file.
>
> I think Jack’s answer validates that the first goal has a lot of value for
> the community. But your question sounds like it is mainly addressing the
> second goal, where new features and investment will be.
>
> No one intends to stop development on other catalogs or to limit new
> features to only the REST catalog. In fact, the upcoming 1.2.0 release has
> a new catalog contributed by Snowflake. But, I think that it will be much,
> much easier to develop new features for the REST catalog because one of the
> goals was to avoid needing to go to crazy lengths to make things fit with
> the Hive MetaStore. That will naturally lead to new features that can’t or
> won’t be implemented in other catalogs.
>
> Ryan
>
> On Thu, Feb 9, 2023 at 1:34 PM Jack Ye <[email protected]> wrote:
>
>> Most of the development of the REST catalog comes from Tabular at the
>> moment, so I will let them comment more on this.
>>
>> Speaking from the AWS perspective, we have been recommending the REST
>> catalog for organizations that have their own in-house catalog systems.
>> The REST catalog provides a really well-designed and standardized API spec
>> that organizations can translate requests to and from their existing
>> catalog system, so that it can (1) work with Iceberg tables, or (2) even
>> expose their non-Iceberg tables as Iceberg tables for querying. That way
>> they can standardize their readers and writers on Iceberg alone and reduce
>> the maintenance burden of multiple readers and writers for different table
>> and file formats.
>>
>> I cannot say new feature development will ONLY happen in REST, because,
>> for example, we will likely continue to support the AWS Glue catalog
>> integration, and it has an active roadmap. But overall, interest in REST
>> is strong compared to other catalog types like Hive, JDBC, and DynamoDB.
>>
>> Best,
>> Jack Ye
>>
>> On Thu, Feb 9, 2023 at 1:24 PM Xinyi Lu <[email protected]>
>> wrote:
>>
>>> Hi Community,
>>>
>>> We’ve been evaluating the RestCatalog and would like your feedback on
>>> the best scenarios for using it and its current adoption status in the
>>> industry. Will this be the Iceberg catalog standard going forward, to
>>> encourage users to move metadata transactions to the server side? Are
>>> we looking to add more features that are only supported by the
>>> RestCatalog?
>>>
>>> Thanks,
>>> Xinyi
>>
>>
>
> --
> Ryan Blue
> Tabular
>
