Thanks Jack and Ryan for the insights!

On Thu, Feb 9, 2023 at 5:15 PM Ryan Blue <[email protected]> wrote:
> There were a few different reasons for building the REST catalog:
>
> 1. Standardize an interface for catalogs, like the Hive Thrift API.
>    This makes customization easy: you don’t need to get Dremio or Athena to
>    put your custom catalog’s Jar in their classpath. It also makes catalogs
>    across languages more reliable and easier to manage. For example, having a
>    slightly different JDBC implementation in every language is going to lead
>    to inconsistencies and problems.
> 2. Enable new catalog features: It has been difficult to fit Iceberg
>    into existing catalogs and this has limited better features. For example,
>    we want to be able to un-drop tables for a certain period of time, to
>    support multi-table transactions, and to enable authentication and
>    authorization. I don’t think the right strategy is to try to adapt the Hive
>    MetaStore for it.
> 3. Fix some problems: There are some boring things we can fix like the
>    time it takes to load metadata files with lots of snapshots. There are also
>    issues with the way catalogs handle metadata locations and FileIO with
>    respect to tables. Ideally, these would be configured for each table, but
>    you have to create a FileIO to read the metadata file.
>
> I think Jack’s answer validates that the first goal has a lot of value for
> the community. But your question sounds like it is mainly addressing the
> second goal, where new features and investment will be.
>
> No one intends to stop development on other catalogs or to limit new
> features to only the REST catalog. In fact, the upcoming 1.2.0 release has
> a new catalog contributed by Snowflake. But, I think that it will be much,
> much easier to develop new features for the REST catalog because one of the
> goals was to avoid needing to go to crazy lengths to make things fit with
> the Hive MetaStore. That will naturally lead to new features that can’t or
> won’t be implemented.
>
> Ryan
>
> On Thu, Feb 9, 2023 at 1:34 PM Jack Ye <[email protected]> wrote:
>
>> Most of the development of REST catalog comes from Tabular at this
>> moment, I will let them comment more about this.
>>
>> Speaking from AWS perspective, we have been recommending REST catalog for
>> organizations that have their internal in house catalog systems. The REST
>> catalog provides a really well-designed and standardized API spec that
>> organizations can translate requests to and from their existing catalog
>> system, so that it can (1) work with Iceberg tables, or (2) even translate
>> their non-Iceberg tables to be exposed as an Iceberg table for query, so
>> they can standardize their readers and writers just to Iceberg and reduce
>> maintenance burden of multiple readers and writers of different table and
>> file formats.
>>
>> I cannot say ONLY for new feature development, because for example we
>> will likely continue to support AWS Glue catalog integration and it has an
>> active roadmap. But the interest in REST is overall strong compared to the
>> other catalog types like Hive, JDBC and DynamoDB.
>>
>> Best,
>> Jack Ye
>>
>> On Thu, Feb 9, 2023 at 1:24 PM Xinyi Lu <[email protected]> wrote:
>>
>>> Hi Community,
>>>
>>> We’ve been evaluating the RestCatalog and want to know your feedback on
>>> what’s the best scenarios for using RestCatalog and the current adoption
>>> status in the industry. Will this be the iceberg catalog standard going
>>> forward to encourage users to move metadata transactions to the server
>>> side? Are we looking to adding more features which are only supported by
>>> the RestCatalog?
>>>
>>> Thanks,
>>> Xinyi
>>
>
> --
> Ryan Blue
> Tabular
