mchades commented on code in PR #9173:
URL: https://github.com/apache/gravitino/pull/9173#discussion_r2596939589


##########
docs/lakehouse-generic-catalog.md:
##########
@@ -0,0 +1,587 @@
+---
+title: "Generic Lakehouse Catalog"
+slug: /lakehouse-generic-catalog
+keywords:
+  - lakehouse
+  - lance
+  - metadata
+  - generic catalog
+  - file system
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Overview
+
+The Generic Lakehouse Catalog is a Gravitino catalog implementation designed 
to seamlessly integrate with lakehouse storage systems built on file 
system-based architectures. This catalog enables unified metadata management 
for lakehouse tables stored on various storage backends, providing a consistent 
interface for data discovery, governance, and access control.
+
+### What is a Lakehouse?
+
+A lakehouse combines the best features of data lakes and data warehouses:
+
+- **Data Lake Benefits**: 
+  - Low-cost storage for massive volumes of raw data
+  - Support for diverse data formats (structured, semi-structured, 
unstructured)
+  - Decoupled storage and compute for flexible scaling
+
+- **Data Warehouse Benefits**:
+  - ACID transactions for data consistency
+  - Schema enforcement and evolution
+  - High-performance analytical queries
+  - Time travel and versioning
+
+### Supported Storage Systems
+
+The catalog works with lakehouse systems built on top of:
+
+**Storage Backends:**
+- **Object Stores:** Amazon S3, Azure Blob Storage, Google Cloud Storage, MinIO
+- **Distributed File Systems:** HDFS, Apache Ozone
+- **Local File Systems:** For development and testing
+
+**Lakehouse Formats:**
+- **Lance** ✅ (We only support Lance format fully at present)
+
+:::info Current Support Status
+While the architecture is designed to support various lakehouse formats, 
Gravitino currently provides **native production support only for Lance-based 
lakehouse systems** with comprehensive testing and optimization.
+:::
+
+### Why Use Generic Lakehouse Catalog?
+
+1. **Unified Metadata Management**: Single source of truth for table metadata 
across multiple storage backends
+2. **Multi-Format Support**: Extensible architecture to support various 
lakehouse table formats
+3. **Storage Flexibility**: Work with any file system - local, HDFS, or cloud 
object stores
+4. **Gravitino Integration**: Leverage Gravitino's access control, lineage 
tracking, and data discovery
+5. **Easy Migration**: Register existing lakehouse tables without data movement
+
+### System Requirements
+
+**Storage Requirements:**
+- Lakehouse storage system must support standard file system operations:
+  - Directory listing and navigation
+  - File reading and writing with atomic operations
+  - File deletion and renaming
+  - Path-based access control (optional but recommended)
+
+**Gravitino Requirements:**
+- Gravitino server version 1.1.0 or later
+- Configured metalake for catalog creation
+- Appropriate permissions for catalog management
+
+**Network Requirements:**
+- Network connectivity between Gravitino server and storage backend
+- For cloud storage: Internet access and valid credentials
+- For HDFS: Proper Hadoop configuration and network access
+
+## Catalog Management
+
+### Capabilities
+
+The Generic Lakehouse Catalog provides comprehensive relational metadata 
management capabilities equivalent to standard relational catalogs:
+
+**Supported Operations:**
+- ✅ Create, read, update, and delete catalogs
+- ✅ List all catalogs in a metalake
+- ✅ Manage catalog properties and metadata
+- ✅ Set and modify catalog locations
+- ✅ Configure storage backend credentials
+
+For detailed information on available operations, see [Manage Relational 
Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md).
+
+### Properties
+
+#### Required Properties
+
+| Property   | Description                                          | Example  
                        | Required |

Review Comment:
   Should we add a `Since Version` column here?



##########
docs/lakehouse-generic-catalog.md:
##########


Review Comment:
   You should also update this doc:
https://github.com/apache/gravitino/blob/main/docs/manage-relational-metadata-using-gravitino.md



##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,397 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+  - Lance REST
+  - Lance datasets
+  - REST API
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Overview
+
+The Lance REST service provides a RESTful interface for managing Lance 
datasets through HTTP endpoints. Introduced in Gravitino version 1.1.0, this 
service enables seamless interaction with Lance datasets for data operations 
and metadata management.
+
+The service implements the [Lance REST API 
specification](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/lancedb/lance-namespace/refs/heads/main/docs/src/rest.yaml).
 For detailed specification documentation, see the [official Lance REST 
documentation](https://lance.org/format/namespace/impls/rest/).
+
+### What is Lance?
+
+[Lance](https://lancedb.github.io/lance/) is a modern columnar data format 
designed for AI/ML workloads. It provides:
+
+- **High-performance vector search**: Native support for similarity search on 
high-dimensional embeddings
+- **Columnar storage**: Optimized for analytical queries and machine learning 
pipelines
+- **Fast random access**: Efficient row-level operations unlike traditional 
columnar formats
+- **Version control**: Built-in dataset versioning and time-travel capabilities
+- **Incremental updates**: Append and update data without full rewrites
+
+### Architecture
+
+The Lance REST service acts as a bridge between Lance datasets and 
applications:
+
+```
+┌─────────────────┐
+│   Applications  │
+│  (Python/Java)  │
+└────────┬────────┘
+         │ HTTP/REST
+         ▼
+┌─────────────────┐
+│  Lance REST     │◄──── Gravitino Metalake
+│    Service      │      (Metadata Backend)
+└────────┬────────┘
+         │ File System Operations
+         ▼
+┌─────────────────┐
+│  Lance Datasets │
+│ (S3/HDFS/Local) │
+└─────────────────┘
+```
+
+**Key Features:**
+- Full compliance with Lance REST API specification
+- Can run standalone or integrated with Gravitino server
+- Support for namespace and table management
+- Index creation and management capabilities
+- Metadata stored in Gravitino for unified governance
+
+## Supported Operations
+
+The Lance REST service provides comprehensive support for namespace 
management, table management, and index operations. The table below lists all 
supported operations:
+
+| Operation         | Description                                                       | HTTP Method | Endpoint Pattern                    | Since Version |
+|-------------------|-------------------------------------------------------------------|-------------|-------------------------------------|---------------|
+| CreateNamespace   | Create a new Lance namespace                                      | POST        | `/lance/v1/namespace/{id}/create`   | 1.1.0         |
+| ListNamespaces    | List all namespaces under a parent namespace                      | GET         | `/lance/v1/namespace/{parent}/list` | 1.1.0         |
+| DescribeNamespace | Retrieve detailed information about a specific namespace          | POST        | `/lance/v1/namespace/{id}/describe` | 1.1.0         |
+| DropNamespace     | Delete a namespace                                                | POST        | `/lance/v1/namespace/{id}/drop`     | 1.1.0         |
+| NamespaceExists   | Check whether a namespace exists                                  | POST        | `/lance/v1/namespace/{id}/exists`   | 1.1.0         |
+| ListTables        | List all tables in a namespace                                    | GET         | `/lance/v1/table/{namespace}/list`  | 1.1.0         |
+| CreateTable       | Create a new table in a namespace                                 | POST        | `/lance/v1/table/{id}/create`       | 1.1.0         |
+| DropTable         | Delete a table including both metadata and data                   | POST        | `/lance/v1/table/{id}/drop`         | 1.1.0         |
+| TableExists       | Check whether a table exists                                      | POST        | `/lance/v1/table/{id}/exists`       | 1.1.0         |
+| RegisterTable     | Register an existing Lance table to a namespace                   | POST        | `/lance/v1/table/{id}/register`     | 1.1.0         |
+| DeregisterTable   | Unregister a table from a namespace (metadata only, data remains) | POST        | `/lance/v1/table/{id}/deregister`   | 1.1.0         |
+
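To make the endpoint patterns in the table concrete, here is a minimal Python sketch that turns an operation name and an identifier into an HTTP method and URL. It is an illustration only, not part of the PR: `BASE_URL`, `ENDPOINTS`, and `build_request` are hypothetical names, the host and port assume a local service on the default `httpPort` of 9101, and the `{parent}` and `{namespace}` placeholders from the table are normalized to `{id}` for brevity. Request bodies are omitted because the doc does not describe them.

```python
from urllib.parse import quote

# Assumption: the service runs locally on the default port 9101.
BASE_URL = "http://localhost:9101"

# Method and endpoint pattern per operation, taken from the table above.
# The table's {parent} and {namespace} placeholders are written as {id} here.
ENDPOINTS = {
    "CreateNamespace": ("POST", "/lance/v1/namespace/{id}/create"),
    "ListNamespaces": ("GET", "/lance/v1/namespace/{id}/list"),
    "DropNamespace": ("POST", "/lance/v1/namespace/{id}/drop"),
    "ListTables": ("GET", "/lance/v1/table/{id}/list"),
    "CreateTable": ("POST", "/lance/v1/table/{id}/create"),
    "RegisterTable": ("POST", "/lance/v1/table/{id}/register"),
    "DeregisterTable": ("POST", "/lance/v1/table/{id}/deregister"),
}


def build_request(operation: str, identifier: str) -> tuple[str, str]:
    """Return (HTTP method, URL) for an operation on a `$`-delimited identifier."""
    method, pattern = ENDPOINTS[operation]
    # safe="" makes quote() percent-encode "$" as %24 instead of leaving it as-is.
    return method, BASE_URL + pattern.format(id=quote(identifier, safe=""))


method, url = build_request("CreateTable", "lance_catalog$schema$table01")
print(method, url)
```

Keeping the method next to the pattern in one mapping avoids mixing up the two `GET` list operations with the `POST` operations.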
+### Operation Details
+
+#### Namespace Operations
+
+**CreateNamespace** supports three modes:
+- `create`: Fails if namespace already exists
+- `exist_ok`: Succeeds even if namespace exists  
+- `overwrite`: Replaces existing namespace
+
+**DropNamespace** behavior:
+- Recursively deletes all child namespaces and tables
+- Deletes both metadata and Lance data files
+- Operation is irreversible
+
+#### Table Operations
+
+**RegisterTable vs CreateTable**:
+- **RegisterTable**: Links existing Lance datasets into Gravitino catalog 
without data movement
+- **CreateTable**: Creates new Lance table with schema and writes data files
+
+**DropTable vs DeregisterTable**:
+- **DropTable**: Permanently deletes metadata and data files from storage
+- **DeregisterTable**: Removes metadata from Gravitino but preserves Lance 
data files
+
+:::note
+Index deletion is not supported in version 1.1.0.
+:::
+
+## Deployment
+
+### Running with Gravitino Server
+
+To enable the Lance REST service within Gravitino server, configure the 
following properties in your Gravitino configuration file:
+
+| Configuration Property                    | Description                                                                  | Default Value           | Required | Since Version |
+|-------------------------------------------|------------------------------------------------------------------------------|-------------------------|----------|---------------|
+| `gravitino.auxService.names`              | Auxiliary services to run. Include `lance-rest` to enable Lance REST service | iceberg-rest,lance-rest | Yes      | 0.2.0         |
+| `gravitino.lance-rest.classpath`          | Classpath for Lance REST service, relative to Gravitino home directory       | lance-rest-server/libs  | Yes      | 1.1.0         |
+| `gravitino.lance-rest.httpPort`           | Port number for Lance REST service                                           | 9101                    | Yes      | 1.1.0         |
+| `gravitino.lance-rest.host`               | Hostname for Lance REST service                                              | 0.0.0.0                 | Yes      | 1.1.0         |
+| `gravitino.lance-rest.namespace-backend`  | Namespace metadata backend (currently only `gravitino` is supported)         | gravitino               | Yes      | 1.1.0         |
+| `gravitino.lance-rest.gravitino-uri`      | Gravitino server URI (required when namespace-backend is `gravitino`)        | http://localhost:8090   | Yes      | 1.1.0         |
+| `gravitino.lance-rest.gravitino-metalake` | Gravitino metalake name (required when namespace-backend is `gravitino`)     | (none)                  | Yes      | 1.1.0         |
+
+**Example Configuration:**
+
+```properties
+gravitino.auxService.names = lance-rest
+gravitino.lance-rest.httpPort = 9101
+gravitino.lance-rest.host = 0.0.0.0
+gravitino.lance-rest.namespace-backend = gravitino
+gravitino.lance-rest.gravitino.uri = http://localhost:8090
+gravitino.lance-rest.gravitino.metalake-name = my_metalake
+```
+
+### Running Standalone
+
+To run Lance REST service independently without Gravitino server:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+Configure the service by editing `gravitino-lance-rest-server.conf` or passing 
command-line arguments:
+
+| Configuration Property                         | Description                 | Default Value         | Required | Since Version |
+|------------------------------------------------|-----------------------------|-----------------------|----------|---------------|
+| `gravitino.lance-rest.namespace-backend`       | Namespace metadata backend  | gravitino             | Yes      | 1.1.0         |
+| `gravitino.lance-rest.gravitino.uri`           | Gravitino server URI        | http://localhost:8090 | Yes      | 1.1.0         |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name     | (none)                | Yes      | 1.1.0         |
+| `gravitino.lance-rest.httpPort`                | Service port number         | 9101                  | No       | 1.1.0         |
+| `gravitino.lance-rest.host`                    | Service hostname            | 0.0.0.0               | No       | 1.1.0         |
+
+:::tip
+In most cases, you only need to configure 
`gravitino.lance-rest.gravitino.metalake-name`. Other properties can use their 
default values.
+:::
+
+## Usage Guidelines
+
+When using Lance REST service with Gravitino backend, keep the following 
considerations in mind:
+
+### Prerequisites
+- A running Gravitino server with a created metalake
+- A generic-lakehouse catalog created in Gravitino metalake
+
+### Namespace Hierarchy
+Gravitino follows a three-level hierarchy: **catalog → schema → table**. When 
creating namespaces or tables:
+
+1. **Parent must exist:** Before creating `lance_catalog/schema`, ensure 
`lance_catalog` catalog exists in Gravitino metalake
+2. **Two-level limit:** You can create `lance_catalog/schema`, but **not** 
`lance_catalog/schema/sub_schema`
+3. **Table placement:** Tables can only be created under 
`lance_catalog/schema`, not at catalog level
+
+**Example Hierarchy:**
+```
+metalake
+└── lance_catalog (catalog - must pre-exist in Gravitino)
+    └── schema (namespace - create via REST)
+        └── table01 (table - create via REST)
+```
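The three hierarchy rules above can be sketched as a client-side pre-check in Python. This is a hypothetical helper (`check_creatable` is not a Gravitino or Lance API) and it only counts levels; the server applies its own validation, including whether the parent catalog actually exists.

```python
def check_creatable(kind: str, levels: list[str]) -> None:
    """Raise ValueError if `levels` cannot be created as `kind`.

    Namespaces are only creatable as catalog/schema (2 levels);
    tables are only creatable as catalog/schema/table (3 levels).
    """
    expected = {"namespace": 2, "table": 3}
    if kind not in expected:
        raise ValueError(f"unknown kind: {kind!r}")
    if any(not level for level in levels):
        raise ValueError("every level must be a non-empty string")
    if len(levels) != expected[kind]:
        raise ValueError(
            f"a {kind} needs exactly {expected[kind]} levels, got {len(levels)}"
        )


# Allowed: a schema under a catalog, and a table under that schema.
check_creatable("namespace", ["lance_catalog", "schema"])
check_creatable("table", ["lance_catalog", "schema", "table01"])

# Rejected: a nested sub-schema exceeds the two-level namespace limit.
try:
    check_creatable("namespace", ["lance_catalog", "schema", "sub_schema"])
except ValueError as err:
    print(f"rejected: {err}")
```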
+
+### Delimiter Convention
+
+The Lance REST API uses `$` as the default delimiter to separate namespace 
levels in URIs. When making HTTP requests:
+
+- **URL Encoding Required**: `$` must be URL-encoded as `%24`
+- **Example**: `lance_catalog$schema$table01` becomes 
`lance_catalog%24schema%24table01` in URLs
+
+**Common Delimiters:**
+```
+Namespace path:     lance_catalog.schema.table01
+URI representation: lance_catalog$schema$table01  
+URL encoded:        lance_catalog%24schema%24table01
+```
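The encoding step can be checked with Python's standard library. This is a small sketch, assuming the default `$` delimiter; `encode_identifier` and `decode_identifier` are illustrative names, not part of any Lance or Gravitino client.

```python
from urllib.parse import quote, unquote

DELIMITER = "$"  # default delimiter described above


def encode_identifier(*levels: str) -> str:
    """Join namespace levels with the delimiter and percent-encode the result."""
    if not levels or any(not level for level in levels):
        raise ValueError("every namespace level must be a non-empty string")
    # safe="" forces quote() to escape "$" rather than pass it through.
    return quote(DELIMITER.join(levels), safe="")


def decode_identifier(encoded: str) -> list[str]:
    """Reverse encode_identifier: percent-decode, then split on the delimiter."""
    return unquote(encoded).split(DELIMITER)


encoded = encode_identifier("lance_catalog", "schema", "table01")
print(encoded)  # lance_catalog%24schema%24table01
print(decode_identifier(encoded))
```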
+
+:::caution Important Limitations
+- Currently supports only **two levels of namespaces** before tables
+- Tables **cannot** be nested deeper than schema level  
+- Parent catalog must be created in Gravitino before using Lance REST API
+- Metadata operations require Gravitino server to be available
+- Namespace deletion is recursive and irreversible
+:::
+- Currently supports only **two levels of namespaces** before tables
+- Tables **cannot** be nested deeper than schema level
+- Parent catalog must be created in Gravitino before using Lance REST API
+:::

Review Comment:
   Duplicated content



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
