Copilot commented on code in PR #9173:
URL: https://github.com/apache/gravitino/pull/9173#discussion_r2598246511
##########
docs/lakehouse-generic-catalog.md:
##########
@@ -0,0 +1,189 @@
+---
+title: "Generic Lakehouse Catalog"
+slug: /lakehouse-generic-catalog
+keywords:
+ - lakehouse
+ - lance
+ - metadata
+ - generic catalog
+ - file system
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Overview
+
+The Generic Lakehouse Catalog is a Gravitino catalog implementation designed
to seamlessly integrate with lakehouse storage systems built on file
system-based architectures. This catalog enables unified metadata management
for lakehouse tables stored on various storage backends, providing a consistent
interface for data discovery, governance, and access control.
+
+Currently, Gravitino fully supports the **Lance** lakehouse format, with plans
to extend support to additional formats in the future.
+
+### Why Use Generic Lakehouse Catalog?
+
+1. **Unified Metadata Management**: Single source of truth for table metadata
across multiple storage backends
+2. **Multi-Format Support**: Extensible architecture to support various
lakehouse table formats such as Lance, Iceberg, Hudi, etc.
+3. **Storage Flexibility**: Work with any file system - local, or cloud object
stores
+4. **Gravitino Integration**: Leverage Gravitino's metadata management, access
control, lineage tracking, and data discovery
+5. **Easy Migration**: Register existing lakehouse tables without data movement
+
+## Catalog Management
+
+### Capabilities
+
+The Generic Lakehouse Catalog provides comprehensive relational metadata
management capabilities equivalent to standard relational catalogs:
+
+**Supported Operations:**
+- ✅ Create, read, update, and delete catalogs
+- ✅ List all catalogs in a metalake
+- ✅ Manage catalog properties and metadata
+- ✅ Set and modify catalog locations
+- ✅ Configure storage backend credentials
+
+For detailed information on available operations, see [Manage Relational
Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md).
+
+### Catalog Properties
+
+#### Required Properties
+
+| Property   | Description                                  | Example                  | Required | Since Version |
+|------------|----------------------------------------------|--------------------------|----------|---------------|
+| `provider` | Catalog provider type                        | `lakehouse-generic`      | Yes      | 1.1.0         |
+| `location` | Root storage path for all schemas and tables | `s3://buecket/lakehouse` | False    | 1.1.0         |
Review Comment:
Typo in example value: "buecket" should be "bucket".
```suggestion
| `location` | Root storage path for all schemas and tables | `s3://bucket/lakehouse` | False | 1.1.0 |
```
##########
docs/lakehouse-generic-lance-table.md:
##########
@@ -0,0 +1,335 @@
+
+
+---
+title: "Generic lakehouse catalog with Lance"
+slug: /lakehouse-generic-catalog-with-lance
+keywords:
+- lakehouse
+- lance
+- metadata
+- generic catalog
+- file system
+ license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+## Overview
+
+This document describes how to use Apache Gravitino to manage a generic
lakehouse catalog using Lance as the underlying table format.
+
+
+## Table Management
+
+### Supported Operations
+
+For Lance tables in a Generic Lakehouse Catalog, the following table
summarizes supported operations:
+
+| Operation | Support Status |
+|-----------|----------------|
+| List | ✅ Full |
+| Load | ✅ Full |
+| Alter | No support now |
+| Create | ✅ Full |
+| Register | ✅ Full |
+| Drop | ✅ Full |
+| Truncate | ✅ Full |
+
+:::note Feature Limitations
+- **Partitioning:** Not currently supported
+- **Sort Orders:** Not currently supported
+- **Distributions:** Not currently supported
+ :::
+
+### Data Type Mappings
+
+Lance uses Apache Arrow for table schemas. The following table shows type
mappings between Gravitino and Arrow:
+
+| Gravitino Type | Arrow Type |
+|----------------------------------|-----------------------------------------|
+| `Struct` | `Struct` |
+| `Map` | `Map` |
+| `List` | `Array` |
+| `Boolean` | `Boolean` |
+| `Byte` | `Int8` |
+| `Short` | `Int16` |
+| `Integer` | `Int32` |
+| `Long` | `Int64` |
+| `Float` | `Float` |
+| `Double` | `Double` |
+| `String` | `Utf8` |
+| `Binary` | `Binary` |
+| `Decimal(p, s)` | `Decimal(p, s)` (128-bit) |
+| `Date` | `Date` |
+| `Timestamp`/`Timestamp(6)` | `TimestampType withoutZone` |
+| `Timestamp(0)` | `TimestampType Second withoutZone` |
+| `Timestamp(3)` | `TimestampType Millisecond withoutZone` |
+| `Timestamp(9)` | `TimestampType Nanosecond withoutZone` |
+| `Timestamp_tz`/`Timestamp_tz(6)` | `TimestampType Microsecond withUtc` |
+| `Timestamp_tz(0)` | `TimestampType Second withUtc` |
+| `Timestamp_tz(3)` | `TimestampType Millisecond withUtc` |
+| `Timestamp_tz(9)` | `TimestampType Nanosecond withUtc` |
+| `Time`/`Time(9)` | `Time Nanosecond` |
+| `Null` | `Null` |
+| `Fixed(n)` | `Fixed-Size Binary(n)` |
+| `Interval_year` | `Interval(YearMonth)` |
+| `Interval_day` | `Duration(Microsecond)` |
+| `External(arrow_field_json_str)` | Any Arrow Field |
+
+### External Type Support
+
+For Arrow types not natively mapped in Gravitino, use the
`External(arrow_field_json_str)` type, which accepts a JSON string
representation of an Arrow `Field`.
+
+**Requirements:**
+- JSON must conform to Apache Arrow [Field
specification](https://github.com/apache/arrow-java/blob/ed81e5981a2bee40584b3a411ed755cb4cc5b91f/vector/src/main/java/org/apache/arrow/vector/types/pojo/Field.java#L80C1-L86C68)
+- `name` attribute must match column name exactly
+- `nullable` attribute must match column nullability
+- `children` array:
+ - Empty for primitive types
+ - Contains child field definitions for complex types (Struct, List)
+
+**Examples:**
+
+| Arrow Type | External Type Definition
|
+|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `Large Utf8` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largeutf8\"},\"children\":[]}")`
|
+| `Large Binary` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largebinary\"},\"children\":[]}")`
|
+| `Large List` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largelist\"},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",\"bitWidth\":32,\"isSigned\":true},\"children\":[]}]}")`
|
+| `Fixed-Size List` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"fixedsizelist\",\"listSize\":10},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",\"bitWidth\":32,\"isSigned\":true},\"children\":[]}]}")`
|
+
+### Table Properties
+
+Required and optional properties for tables in a Generic Lakehouse Catalog:
+
+| Property | Description
| Default | Required | Since
Version |
+|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|--------------|---------------|
+| `format` | Table format: `lance`, `iceberg`, etc. (currently
only `lance` is fully supported)
| (none) | Yes | 1.1.0
|
+| `location` | Storage path for table metadata and data, Lance
currently supports: S3, GCS, OSS, AZ, File, Memory and file-object-store.
| (none) | Conditional* |
1.1.0 |
+| `external` | Whether the data directory is an external location.
If it's `true`, dropping a table will only remove metadata in Gravitino and
will not delete the data directory, and purge table will delete both. For a
non-external table, dropping will drop both.
| false | No |
1.1.0 |
+| `lance.creation-mode` | Create mode: for create table, it can be `CREATE`, `EXIST_OK` or `OVERWRITE`. and it should be `CREATE` and `OVERWRITE` for registering tables | `CREATE` | No | 1.1.0 |
Review Comment:
Grammatical error: "`CREATE` and `OVERWRITE`" should be "`CREATE` or
`OVERWRITE`" for clarity and consistency. The current phrasing is ambiguous
about whether both modes are required or either can be used.
```suggestion
| `lance.creation-mode` | Create mode: for create table, it can be `CREATE`, `EXIST_OK` or `OVERWRITE`. and it should be `CREATE` or `OVERWRITE` for registering tables | `CREATE` | No | 1.1.0 |
```
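To make the intended "either, not both" reading concrete, here is a minimal
sketch of the rule as this comment understands it. The names
`CREATE_TABLE_MODES`, `REGISTER_TABLE_MODES`, and `is_allowed_mode` are
hypothetical and are not taken from the Gravitino code base; the allowed values
are copied from the table row above.

```python
# Hypothetical helper, not Gravitino code: allowed values are taken from the
# documented description of `lance.creation-mode`.
CREATE_TABLE_MODES = {"CREATE", "EXIST_OK", "OVERWRITE"}
REGISTER_TABLE_MODES = {"CREATE", "OVERWRITE"}


def is_allowed_mode(mode: str, registering: bool) -> bool:
    """Return True if `mode` is one of the values documented for the operation."""
    allowed = REGISTER_TABLE_MODES if registering else CREATE_TABLE_MODES
    return mode in allowed


# A single mode is supplied per request: either value works when registering.
print(is_allowed_mode("CREATE", registering=True))
print(is_allowed_mode("OVERWRITE", registering=True))
# `EXIST_OK` is documented for create only.
print(is_allowed_mode("EXIST_OK", registering=True))
print(is_allowed_mode("EXIST_OK", registering=False))
```

Writing "or" in the docs matches this behavior: a caller passes one value, and
for registration that value is expected to be `CREATE` or `OVERWRITE`.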
##########
docs/manage-relational-metadata-using-gravitino.md:
##########
@@ -1007,22 +1011,24 @@ The following table shows the column auto-increment
that Gravitino supports for
| `jdbc-doris` | ✘
|
| `jdbc-oceanbase` |
✔([limitations](./jdbc-oceanbase-catalog.md#table-column-auto-increment))
|
| `jdbc-starrocks` | ✔
|
+| `lakehouse-generic` | ✘
|
#### Table property and type mapping
The following is the table property that Gravitino supports:
-| Catalog provider | Table property
| Type mapping
|
-|---------------------|-----------------------------------------------------------------------------|----------------------------------------------------------------------------|
-| `hive` | [Hive table
property](./apache-hive-catalog.md#table-properties) | [Hive type
mapping](./apache-hive-catalog.md#table-column-types) |
-| `lakehouse-iceberg` | [Iceberg table
property](./lakehouse-iceberg-catalog.md#table-properties) | [Iceberg type
mapping](./lakehouse-iceberg-catalog.md#table-column-types) |
-| `lakehouse-paimon` | [Paimon table
property](./lakehouse-paimon-catalog.md#table-properties) | [Paimon type
mapping](./lakehouse-paimon-catalog.md#table-column-types) |
-| `lakehouse-hudi` | [Hudi table
property](./lakehouse-hudi-catalog.md#table-properties) | [Hudi type
mapping](./lakehouse-hudi-catalog.md#table-column-types) |
-| `jdbc-mysql` | [MySQL table
property](./jdbc-mysql-catalog.md#table-properties) | [MySQL type
mapping](./jdbc-mysql-catalog.md#table-column-types) |
-| `jdbc-postgresql` | [PostgreSQL table
property](./jdbc-postgresql-catalog.md#table-properties) | [PostgreSQL type
mapping](./jdbc-postgresql-catalog.md#table-column-types) |
-| `jdbc-doris` | [Doris table
property](./jdbc-doris-catalog.md#table-properties) | [Doris type
mapping](./jdbc-doris-catalog.md#table-column-types) |
-| `jdbc-oceanbase` | [OceanBase table
property](./jdbc-oceanbase-catalog.md#table-properties) | [OceanBase type
mapping](./jdbc-oceanbase-catalog.md#table-column-types) |
-| `jdbc-starrocks` | [StarRocks table
property](./jdbc-starrocks-catalog.md#table-properties) | [StarRocks type
mapping](./jdbc-starrocks-catalog.md#table-column-types) |
+| Catalog provider | Table property
| Type mapping
|
+|---------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `hive` | [Hive table
property](./apache-hive-catalog.md#table-properties)
| [Hive type
mapping](./apache-hive-catalog.md#table-column-types)
|
+| `lakehouse-iceberg` | [Iceberg table
property](./lakehouse-iceberg-catalog.md#table-properties)
| [Iceberg type
mapping](./lakehouse-iceberg-catalog.md#table-column-types)
|
+| `lakehouse-paimon` | [Paimon table
property](./lakehouse-paimon-catalog.md#table-properties)
| [Paimon type
mapping](./lakehouse-paimon-catalog.md#table-column-types)
|
+| `lakehouse-hudi` | [Hudi table
property](./lakehouse-hudi-catalog.md#table-properties)
| [Hudi type
mapping](./lakehouse-hudi-catalog.md#table-column-types)
|
+| `jdbc-mysql` | [MySQL table
property](./jdbc-mysql-catalog.md#table-properties)
| [MySQL type
mapping](./jdbc-mysql-catalog.md#table-column-types)
|
+| `jdbc-postgresql` | [PostgreSQL table
property](./jdbc-postgresql-catalog.md#table-properties)
| [PostgreSQL type
mapping](./jdbc-postgresql-catalog.md#table-column-types)
|
+| `jdbc-doris` | [Doris table
property](./jdbc-doris-catalog.md#table-properties)
| [Doris type
mapping](./jdbc-doris-catalog.md#table-column-types)
|
+| `jdbc-oceanbase` | [OceanBase table
property](./jdbc-oceanbase-catalog.md#table-properties)
| [OceanBase type
mapping](./jdbc-oceanbase-catalog.md#table-column-types)
|
+| `jdbc-starrocks` | [StarRocks table
property](./jdbc-starrocks-catalog.md#table-properties)
| [StarRocks type
mapping](./jdbc-starrocks-catalog.md#table-column-types)
|
+| `lakehouse-generic` | Lakehouse generic table property depends on specific table implementation, for Lance table, please refer to [doc](./lakehouse-generic-lance-table#table-properties), other table format, please refer to related docs. | Lakehouse generic type mapping. Similar to table properties, for Lance table, please refer to [docs](./lakehouse-generic-lance-table#data-type-mappings) |
Review Comment:
Broken documentation links - missing `.md` file extension. The links should
include the file extension for proper navigation:
- `./lakehouse-generic-lance-table#table-properties` should be
`./lakehouse-generic-lance-table.md#table-properties`
- `./lakehouse-generic-lance-table#data-type-mappings` should be
`./lakehouse-generic-lance-table.md#data-type-mappings`
```suggestion
| `lakehouse-generic` | Lakehouse generic table property depends on specific table implementation, for Lance table, please refer to [doc](./lakehouse-generic-lance-table.md#table-properties), other table format, please refer to related docs. | Lakehouse generic type mapping. Similar to table properties, for Lance table, please refer to [docs](./lakehouse-generic-lance-table.md#data-type-mappings) |
```
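A check along these lines could catch the same problem elsewhere in the docs.
This is a rough sketch, assuming the docs build resolves relative links by file
path; `links_missing_md` is a hypothetical helper, and it will also flag
relative links to images or directories, so its output needs a manual look.

```python
import re

# Matches inline Markdown links such as [text](./page#anchor).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def links_missing_md(markdown: str) -> list[str]:
    """Return relative link targets whose path part does not end in .md."""
    missing = []
    for target in LINK_RE.findall(markdown):
        # Skip absolute URLs and same-page anchors.
        if "://" in target or target.startswith("#"):
            continue
        path = target.split("#", 1)[0]
        if not path.endswith(".md"):
            missing.append(target)
    return missing


row = (
    "[doc](./lakehouse-generic-lance-table#table-properties) "
    "[docs](./lakehouse-generic-lance-table.md#data-type-mappings) "
    "[spec](https://lance.org/format/)"
)
print(links_missing_md(row))
```

Only the first link in `row` is reported, because the second already carries
the `.md` extension and the third is an absolute URL.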
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,390 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+ - Lance datasets
+ - REST API
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Overview
+
+The Lance REST service provides a RESTful interface for managing Lance
datasets through HTTP endpoints. Introduced in Gravitino version 1.1.0, this
service enables seamless interaction with Lance datasets for data operations
and metadata management.
+
+The service implements the [Lance REST API
specification](https://docs.lancedb.com/api-reference/introduction). For
detailed specification documentation, see the [official Lance REST
documentation](https://lance.org/format/namespace/rest/catalog-spec/).
+
+### What is Lance?
+
+[Lance](https://lance.org/format/) is a modern columnar data format designed
for AI/ML workloads. It provides:
+
+- **High-performance vector search**: Native support for similarity search on
high-dimensional embeddings
+- **Columnar storage**: Optimized for analytical queries and machine learning
pipelines
+- **Fast random access**: Efficient row-level operations unlike traditional
columnar formats
+- **Version control**: Built-in dataset versioning and time-travel capabilities
+- **Incremental updates**: Append and update data without full rewrites
+
+### Architecture
+
+The Lance REST service acts as a bridge between Lance datasets and
applications:
+
+```
+┌─────────────────┐
+│ Applications │
+│ (Python/Java) │
+└────────┬────────┘
+ │ HTTP/REST
+ ▼
+┌─────────────────┐
+│ Lance REST │◄──── Gravitino Metalake
+│ Service │ (Metadata Backend)
+└────────┬────────┘
+ │ File System Operations
+ ▼
+┌─────────────────┐
+│ Lance Datasets │
+│ (S3/GCS/Local) │
+└─────────────────┘
+```
+
+**Key Features:**
+- Full compliance with Lance REST API specification
+- Can run standalone or integrated with Gravitino server
+- Support for namespace and table management
+- Index creation and management capabilities (Index operations are not
supported in version 1.1.0)
+- Metadata stored in Gravitino for unified governance
+
+## Supported Operations
+
+The Lance REST service provides comprehensive support for namespace
management, table management, and index operations. The table below lists all
supported operations:
+
+| Operation | Description
| HTTP Method | Endpoint Pattern | Since Version |
+|-------------------|-------------------------------------------------------------------|-------------|-------------------------------------|---------------|
+| CreateNamespace | Create a new Lance namespace
| POST | `/lance/v1/namespace/{id}/create` | 1.1.0 |
+| ListNamespaces | List all namespaces under a parent namespace
| GET | `/lance/v1/namespace/{parent}/list` | 1.1.0 |
+| DescribeNamespace | Retrieve detailed information about a specific namespace
| POST | `/lance/v1/namespace/{id}/describe` | 1.1.0 |
+| DropNamespace | Delete a namespace
| POST | `/lance/v1/namespace/{id}/drop` | 1.1.0 |
+| NamespaceExists | Check whether a namespace exists
| POST | `/lance/v1/namespace/{id}/exists` | 1.1.0 |
+| ListTables | List all tables in a namespace
| GET | `/lance/v1/table/{namespace}/list` | 1.1.0 |
+| CreateTable | Create a new table in a namespace
| POST | `/lance/v1/table/{id}/create` | 1.1.0 |
+| DropTable | Delete a table including both metadata and data
| POST | `/lance/v1/table/{id}/drop` | 1.1.0 |
+| TableExists | Check whether a table exists
| POST | `/lance/v1/table/{id}/exists` | 1.1.0 |
+| RegisterTable | Register an existing Lance table to a namespace
| POST | `/lance/v1/table/{id}/register` | 1.1.0 |
+| DeregisterTable | Unregister a table from a namespace (metadata only, data
remains) | POST | `/lance/v1/table/{id}/deregister` | 1.1.0 |
+
+More details, please refer to the [Lance REST API
specification](https://lance.org/format/namespace/rest/catalog-spec/)
+
+### Operation Details
+
+Some operations have specific behaviors and modes. Below are important details
to consider:
+
+#### Namespace Operations
+
+**CreateNamespace** supports three modes:
+- `create`: Fails if namespace already exists
+- `exist_ok`: Succeeds even if namespace exists
+- `overwrite`: Replaces existing namespace
+
+**DropNamespace** behavior:
+- Recursively deletes all child namespaces and tables
+- Deletes both metadata and Lance data files
+- Operation is irreversible
+
+#### Table Operations
+
+**RegisterTable vs CreateTable**:
+- **RegisterTable**: Links existing Lance datasets into Gravitino catalog
without data movement
+- **CreateTable**: Creates new Lance table with schema and writes data files
+
+**DropTable vs DeregisterTable**:
+- **DropTable**: Permanently deletes metadata and data files from storage
+- **DeregisterTable**: Removes metadata from Gravitino but preserves Lance
data files
+
+
+## Deployment
+
+### Running with Gravitino Server
+
+To enable the Lance REST service within Gravitino server, configure the
following properties in your Gravitino configuration file:
+
+| Configuration Property | Description
| Default Value |
Required | Since Version |
+|-------------------------------------------|------------------------------------------------------------------------------|-------------------------|----------|---------------|
+| `gravitino.auxService.names` | Auxiliary services to run.
Include `lance-rest` to enable Lance REST service | iceberg-rest,lance-rest |
Yes | 0.2.0 |
+| `gravitino.lance-rest.classpath` | Classpath for Lance REST
service, relative to Gravitino home directory | lance-rest-server/libs |
Yes | 1.1.0 |
+| `gravitino.lance-rest.httpPort` | Port number for Lance REST
service | 9101 |
Yes | 1.1.0 |
+| `gravitino.lance-rest.host` | Hostname for Lance REST service
| 0.0.0.0 | Yes
| 1.1.0 |
+| `gravitino.lance-rest.namespace-backend` | Namespace metadata backend
(currently only `gravitino` is supported) | gravitino |
Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino-uri` | Gravitino server URI (required
when namespace-backend is `gravitino`) | http://localhost:8090 | Yes
| 1.1.0 |
+| `gravitino.lance-rest.gravitino-metalake` | Gravitino metalake name
(required when namespace-backend is `gravitino`) | (none)
| Yes | 1.1.0 |
+
+**Example Configuration:**
+
+```properties
+gravitino.auxService.names = lance-rest
+gravitino.lance-rest.httpPort = 9101
+gravitino.lance-rest.host = 0.0.0.0
+gravitino.lance-rest.namespace-backend = gravitino
+gravitino.lance-rest.gravitino.uri = http://localhost:8090
+gravitino.lance-rest.gravitino.metalake-name = my_metalake
+```
+
+### Running Standalone
+
+To run Lance REST service independently without Gravitino server:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+Configure the service by editing
`{GRAVITINO_HOME}/conf/gravitino-lance-rest-server.conf` or passing
command-line arguments:
+
+| Configuration Property | Description
| Default Value | Required | Since Version |
+|------------------------------------------------|-----------------------------|-----------------------|----------|---------------|
+| `gravitino.lance-rest.namespace-backend` | Namespace metadata backend
| gravitino | Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI
| http://localhost:8090 | Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name
| (none) | Yes | 1.1.0 |
+| `gravitino.lance-rest.httpPort` | Service port number
| 9101 | No | 1.1.0 |
+| `gravitino.lance-rest.host` | Service hostname
| 0.0.0.0 | No | 1.1.0 |
+
+:::tip
+In most cases, you only need to configure
`gravitino.lance-rest.gravitino.metalake-name` and other properties can use
their default values.
+:::
+
+
+### Running with Docker
+
+Launch Lance REST service using Docker:
+
+```shell
+docker run -d --name lance-rest-service -p 9101:9101 \
+ -e LANCE_REST_GRAVITINO_URI=http://gravitino-host:8090 \
+ -e LANCE_REST_GRAVITINO_METALAKE_NAME=your_metalake_name \
+ apache/gravitino-lance-rest:latest
+```
+
+Access the service at `http://localhost:9101`.
+
+**Environment Variables:**
+
+| Environment Variable | Configuration Property
| Required | Default Value | Since Version |
+|--------------------------------------|-------------------------------------------|----------|-------------------------|---------------|
+| `LANCE_REST_NAMESPACE_BACKEND` |
`gravitino.lance-rest.namespace-backend` | No | `gravitino`
| 1.1.0 |
+| `LANCE_REST_GRAVITINO_METALAKE_NAME` |
`gravitino.lance-rest.gravitino-metalake` | Yes | (none)
| 1.1.0 |
+| `LANCE_REST_GRAVITINO_URI` | `gravitino.lance-rest.gravitino-uri`
| No | `http://localhost:8090` | 1.1.0 |
+| `LANCE_REST_HOST` | `gravitino.lance-rest.host`
| No | `0.0.0.0` | 1.1.0 |
+| `LANCE_REST_PORT` | `gravitino.lance-rest.httpPort`
| No | `9101` | 1.1.0 |
+
+:::tip Configuration Tips
+- **Required:** Set `LANCE_REST_GRAVITINO_METALAKE_NAME` to your Gravitino
metalake name
+- **Conditional:** Update `LANCE_REST_GRAVITINO_URI` if Gravitino server is
not on `localhost`
+- **Optional:** Other variables can use default values unless you have
specific requirements
+
+
+## Usage Guidelines
+
+When using Lance REST service with Gravitino backend, keep the following
considerations in mind:
+
+### Prerequisites
+- A running Gravitino server with a created metalake
+
+### Namespace Hierarchy
+Gravitino follows a three-level hierarchy: **catalog → schema → table**. When
creating namespaces or tables:
+
+1. **Parent must exist:** Before creating `lance_catalog/schema`, ensure
`lance_catalog` catalog exists in Gravitino metalake.
+2. **Two-level limit:** You can create namespace `lance_catalog/schema`, but
**not** `lance_catalog/schema/sub_schema`.
+3. **Table placement:** Tables can only be created under
`lance_catalog/schema`, not at catalog level.
+
+**Example Hierarchy:**
+```
+metalake
+└── lance_catalog (catalog - create via REST)
+ └── schema (namespace - create via REST)
+ └── table01 (table - create via REST)
+```
+
+### Delimiter Convention
+
+The Lance REST API uses `$` as the default delimiter to separate namespace
levels in URIs. When making HTTP requests:
+
+- **URL Encoding Required**: `$` must be URL-encoded as `%24`
+- **Example**: `lance_catalog$schema$table01` becomes
`lance_catalog%24schema%24table01` in URLs
+
+**Common Delimiters:**
+```
+Namespace path: lance_catalog.schema.table01
+URI representation: lance_catalog$schema$table01
+URL encoded: lance_catalog%24schema%24table01
+```
+
+:::caution Important Limitations
+- Currently supports only **two levels of namespaces** before tables
+- Tables **cannot** be nested deeper than schema level
+- Parent catalog must be created in Gravitino before using Lance REST API
+- Metadata operations require Gravitino server to be available
+- Namespace deletion is recursive and irreversible
+
+## Examples
+
+The following examples demonstrate how to interact with Lance REST service
using different programming languages and tools.
+
+**Prerequisites:**
+- Gravitino server is running with Lance REST service enabled.
+- A metalake has been created in Gravitino.
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+# Create a catalog-level namespace
+# mode: "create" | "exist_ok" | "overwrite" for create namespace/table; mode:
"create" | "overwrite" for register table
+curl -X POST http://localhost:9101/lance/v1/namespace/lance_catalog/create \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "id": ["lance_catalog"],
+ "mode": "create"
+ }'
+
+# Create a schema namespace
+# Note: %24 is URL-encoded '$' character used as delimiter
+curl -X POST
http://localhost:9101/lance/v1/namespace/lance_catalog%24schema/create \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "id": ["lance_catalog", "schema"],
+ "mode": "create"
+ }'
+
+# Register an existing table
+curl -X POST
http://localhost:9101/lance/v1/table/lance_catalog%24schema%24table01/register \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "id": ["lance_catalog", "schema", "table01"],
+ "location": "/tmp/lance_catalog/schema/table01",
+ "mode": "CREATE"
+ }'
+
+# Create a new empty table
+curl -X POST
http://localhost:9101/lance/v1/table/lance_catalog%24schema%24table02/create-empty
\
+ -H 'Content-Type: application/json' \
+ -d '{
+ "id": ["lance_catalog", "schema", "table02"],
+ "location": "/tmp/lance_catalog/schema/table02",
+ "properties": { "description": "This is table02" }
+ }'
+
+# Create a table with schema, the schema is inferred from the Arrow IPC file
+curl -X POST \
+ "http://localhost:9101/lance/v1/table/lance_catalog%24schema%24table03/create" \
+ -H 'Content-Type: application/vnd.apache.arrow.stream' \
+ -H "x-lance-table-location: "/tmp/lance_catalog/schema/table04" \
Review Comment:
Incorrect quoting in the curl command header. The header value has
mismatched quotes:
```
-H "x-lance-table-location: "/tmp/lance_catalog/schema/table04"
```
Should be:
```
-H "x-lance-table-location: /tmp/lance_catalog/schema/table04"
```
The inner quotes around the path should be removed to avoid shell parsing
errors.
```suggestion
-H "x-lance-table-location: /tmp/lance_catalog/schema/table04" \
```
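The effect of the stray quote can be reproduced without running curl. The
sketch below uses Python's standard `shlex` module, which approximates POSIX
shell tokenization; a real shell would not fail immediately but would keep
reading the following lines as part of the open string. `tokenize` is a
hypothetical helper name.

```python
import shlex

# The header argument as written in the docs, and with the suggested fix.
broken = '-H "x-lance-table-location: "/tmp/lance_catalog/schema/table04"'
fixed = '-H "x-lance-table-location: /tmp/lance_catalog/schema/table04"'


def tokenize(fragment: str):
    """Split a command fragment like a POSIX shell.

    Returns the list of tokens, or None when a quote is left unclosed.
    """
    try:
        return shlex.split(fragment)
    except ValueError:
        # shlex raises ValueError when it reaches the end inside a quote.
        return None


print(tokenize(broken))
print(tokenize(fixed))
```

With the original text the third `"` opens a string that never closes, so
tokenization fails; with the fix the header is passed to curl as one argument.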
##########
docs/lakehouse-generic-lance-table.md:
##########
@@ -0,0 +1,335 @@
+
+
+---
+title: "Generic lakehouse catalog with Lance"
+slug: /lakehouse-generic-catalog-with-lance
+keywords:
+- lakehouse
+- lance
+- metadata
+- generic catalog
+- file system
+ license: "This software is licensed under the Apache License version 2."
Review Comment:
Incorrect YAML frontmatter formatting. The `license` field has incorrect
indentation - it should be at the same level as other frontmatter fields
(`title`, `slug`, `keywords`), not nested under `keywords`.
```suggestion
license: "This software is licensed under the Apache License version 2."
```
##########
docs/lakehouse-generic-lance-table.md:
##########
@@ -0,0 +1,335 @@
+
+
+---
+title: "Generic lakehouse catalog with Lance"
+slug: /lakehouse-generic-catalog-with-lance
+keywords:
+- lakehouse
+- lance
+- metadata
+- generic catalog
+- file system
+ license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+## Overview
+
+This document describes how to use Apache Gravitino to manage a generic
lakehouse catalog using Lance as the underlying table format.
+
+
+## Table Management
+
+### Supported Operations
+
+For Lance tables in a Generic Lakehouse Catalog, the following table
summarizes supported operations:
+
+| Operation | Support Status |
+|-----------|----------------|
+| List | ✅ Full |
+| Load | ✅ Full |
+| Alter | No support now |
+| Create | ✅ Full |
+| Register | ✅ Full |
+| Drop | ✅ Full |
+| Truncate | ✅ Full |
+
+:::note Feature Limitations
+- **Partitioning:** Not currently supported
+- **Sort Orders:** Not currently supported
+- **Distributions:** Not currently supported
+ :::
+
+### Data Type Mappings
+
+Lance uses Apache Arrow for table schemas. The following table shows type
mappings between Gravitino and Arrow:
+
+| Gravitino Type | Arrow Type |
+|----------------------------------|-----------------------------------------|
+| `Struct` | `Struct` |
+| `Map` | `Map` |
+| `List` | `Array` |
+| `Boolean` | `Boolean` |
+| `Byte` | `Int8` |
+| `Short` | `Int16` |
+| `Integer` | `Int32` |
+| `Long` | `Int64` |
+| `Float` | `Float` |
+| `Double` | `Double` |
+| `String` | `Utf8` |
+| `Binary` | `Binary` |
+| `Decimal(p, s)` | `Decimal(p, s)` (128-bit) |
+| `Date` | `Date` |
+| `Timestamp`/`Timestamp(6)` | `TimestampType withoutZone` |
+| `Timestamp(0)` | `TimestampType Second withoutZone` |
+| `Timestamp(3)` | `TimestampType Millisecond withoutZone` |
+| `Timestamp(9)` | `TimestampType Nanosecond withoutZone` |
+| `Timestamp_tz`/`Timestamp_tz(6)` | `TimestampType Microsecond withUtc` |
+| `Timestamp_tz(0)` | `TimestampType Second withUtc` |
+| `Timestamp_tz(3)` | `TimestampType Millisecond withUtc` |
+| `Timestamp_tz(9)` | `TimestampType Nanosecond withUtc` |
+| `Time`/`Time(9)` | `Time Nanosecond` |
+| `Null` | `Null` |
+| `Fixed(n)` | `Fixed-Size Binary(n)` |
+| `Interval_year` | `Interval(YearMonth)` |
+| `Interval_day` | `Duration(Microsecond)` |
+| `External(arrow_field_json_str)` | Any Arrow Field |
+
+### External Type Support
+
+For Arrow types not natively mapped in Gravitino, use the
`External(arrow_field_json_str)` type, which accepts a JSON string
representation of an Arrow `Field`.
+
+**Requirements:**
+- JSON must conform to Apache Arrow [Field
specification](https://github.com/apache/arrow-java/blob/ed81e5981a2bee40584b3a411ed755cb4cc5b91f/vector/src/main/java/org/apache/arrow/vector/types/pojo/Field.java#L80C1-L86C68)
+- `name` attribute must match column name exactly
+- `nullable` attribute must match column nullability
+- `children` array:
+ - Empty for primitive types
+ - Contains child field definitions for complex types (Struct, List)
+
+**Examples:**
+
+| Arrow Type | External Type Definition
|
+|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `Large Utf8` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largeutf8\"},\"children\":[]}")`
|
+| `Large Binary` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largebinary\"},\"children\":[]}")`
|
+| `Large List` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largelist\"},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",\"bitWidth\":32,\"isSigned\":true},\"children\":[]}]}")`
|
+| `Fixed-Size List` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"fixedsizelist\",\"listSize\":10},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",\"bitWidth\":32,\"isSigned\":true},\"children\":[]}]}")`
|
+
+### Table Properties
+
+Required and optional properties for tables in a Generic Lakehouse Catalog:
+
+| Property | Description
| Default | Required | Since
Version |
+|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|--------------|---------------|
+| `format` | Table format: `lance`, `iceberg`, etc. (currently
only `lance` is fully supported)
| (none) | Yes | 1.1.0
|
+| `location` | Storage path for table metadata and data, Lance
currently supports: S3, GCS, OSS, AZ, File, Memory and file-object-store.
| (none) | Conditional* |
1.1.0 |
+| `external` | Whether the data directory is an external location.
If it's `true`, dropping a table will only remove metadata in Gravitino and
will not delete the data directory, and purge table will delete both. For a
non-external table, dropping will drop both.
| false | No |
1.1.0 |
+| `lance.creation-mode` | Create mode: for create table, it can be `CREATE`,
`EXIST_OK` or `OVERWRITE`. and it should be `CREATE` and `OVERWRITE` for
registering tables
| `CREATE` | No |
1.1.0 |
+| `lance.register` | Whether it is a register table operation. This API
will not create data directory acutally and it's the user responsibility to
create and manage the data directory.
| false | No |
1.1.0 |
Review Comment:
Typo: "acutally" should be "actually".
```suggestion
| `lance.register` | Whether it is a register table operation. This API
will not actually create the data directory, and it's the user's responsibility
to create and manage the data directory.
| false | No |
1.1.0 |
```
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,390 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+ - Lance datasets
+ - REST API
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Overview
+
+The Lance REST service provides a RESTful interface for managing Lance
datasets through HTTP endpoints. Introduced in Gravitino version 1.1.0, this
service enables seamless interaction with Lance datasets for data operations
and metadata management.
+
+The service implements the [Lance REST API
specification](https://docs.lancedb.com/api-reference/introduction). For
detailed specification documentation, see the [official Lance REST
documentation](https://lance.org/format/namespace/rest/catalog-spec/).
+
+### What is Lance?
+
+[Lance](https://lance.org/format/) is a modern columnar data format designed
for AI/ML workloads. It provides:
+
+- **High-performance vector search**: Native support for similarity search on
high-dimensional embeddings
+- **Columnar storage**: Optimized for analytical queries and machine learning
pipelines
+- **Fast random access**: Efficient row-level operations unlike traditional
columnar formats
+- **Version control**: Built-in dataset versioning and time-travel capabilities
+- **Incremental updates**: Append and update data without full rewrites
+
+### Architecture
+
+The Lance REST service acts as a bridge between Lance datasets and
applications:
+
+```
+┌─────────────────┐
+│ Applications │
+│ (Python/Java) │
+└────────┬────────┘
+ │ HTTP/REST
+ ▼
+┌─────────────────┐
+│ Lance REST │◄──── Gravitino Metalake
+│ Service │ (Metadata Backend)
+└────────┬────────┘
+ │ File System Operations
+ ▼
+┌─────────────────┐
+│ Lance Datasets │
+│ (S3/GCS/Local) │
+└─────────────────┘
+```
+
+**Key Features:**
+- Full compliance with Lance REST API specification
+- Can run standalone or integrated with Gravitino server
+- Support for namespace and table management
+- Index creation and management capabilities (Index operations are not
supported in version 1.1.0)
+- Metadata stored in Gravitino for unified governance
+
+## Supported Operations
+
+The Lance REST service provides comprehensive support for namespace
management, table management, and index operations. The table below lists all
supported operations:
+
+| Operation | Description
| HTTP Method | Endpoint Pattern | Since Version |
+|-------------------|-------------------------------------------------------------------|-------------|-------------------------------------|---------------|
+| CreateNamespace | Create a new Lance namespace
| POST | `/lance/v1/namespace/{id}/create` | 1.1.0 |
+| ListNamespaces | List all namespaces under a parent namespace
| GET | `/lance/v1/namespace/{parent}/list` | 1.1.0 |
+| DescribeNamespace | Retrieve detailed information about a specific namespace
| POST | `/lance/v1/namespace/{id}/describe` | 1.1.0 |
+| DropNamespace | Delete a namespace
| POST | `/lance/v1/namespace/{id}/drop` | 1.1.0 |
+| NamespaceExists | Check whether a namespace exists
| POST | `/lance/v1/namespace/{id}/exists` | 1.1.0 |
+| ListTables | List all tables in a namespace
| GET | `/lance/v1/table/{namespace}/list` | 1.1.0 |
+| CreateTable | Create a new table in a namespace
| POST | `/lance/v1/table/{id}/create` | 1.1.0 |
+| DropTable | Delete a table including both metadata and data
| POST | `/lance/v1/table/{id}/drop` | 1.1.0 |
+| TableExists | Check whether a table exists
| POST | `/lance/v1/table/{id}/exists` | 1.1.0 |
+| RegisterTable | Register an existing Lance table to a namespace
| POST | `/lance/v1/table/{id}/register` | 1.1.0 |
+| DeregisterTable | Unregister a table from a namespace (metadata only, data
remains) | POST | `/lance/v1/table/{id}/deregister` | 1.1.0 |
+
+More details, please refer to the [Lance REST API
specification](https://lance.org/format/namespace/rest/catalog-spec/)
+
+### Operation Details
+
+Some operations have specific behaviors and modes. Below are important details
to consider:
+
+#### Namespace Operations
+
+**CreateNamespace** supports three modes:
+- `create`: Fails if namespace already exists
+- `exist_ok`: Succeeds even if namespace exists
+- `overwrite`: Replaces existing namespace
+
+**DropNamespace** behavior:
+- Recursively deletes all child namespaces and tables
+- Deletes both metadata and Lance data files
+- Operation is irreversible
+
+#### Table Operations
+
+**RegisterTable vs CreateTable**:
+- **RegisterTable**: Links existing Lance datasets into Gravitino catalog
without data movement
+- **CreateTable**: Creates new Lance table with schema and writes data files
+
+**DropTable vs DeregisterTable**:
+- **DropTable**: Permanently deletes metadata and data files from storage
+- **DeregisterTable**: Removes metadata from Gravitino but preserves Lance
data files
+
+
+## Deployment
+
+### Running with Gravitino Server
+
+To enable the Lance REST service within Gravitino server, configure the
following properties in your Gravitino configuration file:
+
+| Configuration Property | Description
| Default Value |
Required | Since Version |
+|-------------------------------------------|------------------------------------------------------------------------------|-------------------------|----------|---------------|
+| `gravitino.auxService.names` | Auxiliary services to run.
Include `lance-rest` to enable Lance REST service | iceberg-rest,lance-rest |
Yes | 0.2.0 |
+| `gravitino.lance-rest.classpath` | Classpath for Lance REST
service, relative to Gravitino home directory | lance-rest-server/libs |
Yes | 1.1.0 |
+| `gravitino.lance-rest.httpPort` | Port number for Lance REST
service | 9101 |
Yes | 1.1.0 |
+| `gravitino.lance-rest.host` | Hostname for Lance REST service
| 0.0.0.0 | Yes
| 1.1.0 |
+| `gravitino.lance-rest.namespace-backend` | Namespace metadata backend
(currently only `gravitino` is supported) | gravitino |
Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino-uri` | Gravitino server URI (required
when namespace-backend is `gravitino`) | http://localhost:8090 | Yes
| 1.1.0 |
+| `gravitino.lance-rest.gravitino-metalake` | Gravitino metalake name
(required when namespace-backend is `gravitino`) | (none)
| Yes | 1.1.0 |
+
+**Example Configuration:**
+
+```properties
+gravitino.auxService.names = lance-rest
+gravitino.lance-rest.httpPort = 9101
+gravitino.lance-rest.host = 0.0.0.0
+gravitino.lance-rest.namespace-backend = gravitino
+gravitino.lance-rest.gravitino.uri = http://localhost:8090
+gravitino.lance-rest.gravitino.metalake-name = my_metalake
+```
+
+### Running Standalone
+
+To run Lance REST service independently without Gravitino server:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+Configure the service by editing
`{GRAVITINO_HOME}/conf/gravitino-lance-rest-server.conf` or passing
command-line arguments:
+
+| Configuration Property | Description
| Default Value | Required | Since Version |
+|------------------------------------------------|-----------------------------|-----------------------|----------|---------------|
+| `gravitino.lance-rest.namespace-backend` | Namespace metadata backend
| gravitino | Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI
| http://localhost:8090 | Yes | 1.1.0 |
Review Comment:
Inconsistent configuration property name. The property is documented as both:
- `gravitino.lance-rest.gravitino.uri` (line 130, 147)
- `gravitino.lance-rest.gravitino-uri` (line 120, 176, and in conf file)
Based on the configuration template file (line 108), the correct property
name should be `gravitino.lance-rest.gravitino-uri` (with hyphen). Please
update lines 130 and 147 to use the correct property name.
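One way to catch this class of mismatch before review is a small script run over the Markdown source. The sketch below is only an illustration: the function name `find_inconsistent_keys` and the `gravitino.lance-rest.` prefix filter are assumptions made for this example, not part of the PR. It groups keys that differ only in `.` versus `-` separators, so it would flag `gravitino-uri` against `gravitino.uri`, but it would not catch a rename such as `gravitino.metalake-name` versus `gravitino-metalake`.

```python
import re
from collections import defaultdict

# Matches config keys under the `gravitino.lance-rest.` prefix (prefix is an
# assumption for this sketch). The key must end in a letter or digit, so a
# trailing sentence period or backtick is not captured.
KEY_PATTERN = re.compile(
    r"gravitino\.lance-rest\.[A-Za-z][A-Za-z0-9.\-]*[A-Za-z0-9]"
)


def find_inconsistent_keys(text: str) -> dict[str, set[str]]:
    """Return groups of keys that differ only in `.` vs `-` separators."""
    groups: dict[str, set[str]] = defaultdict(set)
    for key in KEY_PATTERN.findall(text):
        # Treat `-` and `.` as the same separator when grouping.
        canonical = key.replace("-", ".")
        groups[canonical].add(key)
    # Only groups with more than one spelling are suspicious.
    return {c: spellings for c, spellings in groups.items() if len(spellings) > 1}
```

Calling `find_inconsistent_keys(open("docs/lance-rest-service.md").read())` on the file from this PR should report the two `uri` spellings as one group; which spelling is correct still has to be decided against the configuration template.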
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,390 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+ - Lance datasets
+ - REST API
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Overview
+
+The Lance REST service provides a RESTful interface for managing Lance
datasets through HTTP endpoints. Introduced in Gravitino version 1.1.0, this
service enables seamless interaction with Lance datasets for data operations
and metadata management.
+
+The service implements the [Lance REST API
specification](https://docs.lancedb.com/api-reference/introduction). For
detailed specification documentation, see the [official Lance REST
documentation](https://lance.org/format/namespace/rest/catalog-spec/).
+
+### What is Lance?
+
+[Lance](https://lance.org/format/) is a modern columnar data format designed
for AI/ML workloads. It provides:
+
+- **High-performance vector search**: Native support for similarity search on
high-dimensional embeddings
+- **Columnar storage**: Optimized for analytical queries and machine learning
pipelines
+- **Fast random access**: Efficient row-level operations unlike traditional
columnar formats
+- **Version control**: Built-in dataset versioning and time-travel capabilities
+- **Incremental updates**: Append and update data without full rewrites
+
+### Architecture
+
+The Lance REST service acts as a bridge between Lance datasets and
applications:
+
+```
+┌─────────────────┐
+│ Applications │
+│ (Python/Java) │
+└────────┬────────┘
+ │ HTTP/REST
+ ▼
+┌─────────────────┐
+│ Lance REST │◄──── Gravitino Metalake
+│ Service │ (Metadata Backend)
+└────────┬────────┘
+ │ File System Operations
+ ▼
+┌─────────────────┐
+│ Lance Datasets │
+│ (S3/GCS/Local) │
+└─────────────────┘
+```
+
+**Key Features:**
+- Full compliance with Lance REST API specification
+- Can run standalone or integrated with Gravitino server
+- Support for namespace and table management
+- Index creation and management capabilities (Index operations are not
supported in version 1.1.0)
+- Metadata stored in Gravitino for unified governance
+
+## Supported Operations
+
+The Lance REST service provides comprehensive support for namespace
management, table management, and index operations. The table below lists all
supported operations:
+
+| Operation | Description
| HTTP Method | Endpoint Pattern | Since Version |
+|-------------------|-------------------------------------------------------------------|-------------|-------------------------------------|---------------|
+| CreateNamespace | Create a new Lance namespace
| POST | `/lance/v1/namespace/{id}/create` | 1.1.0 |
+| ListNamespaces | List all namespaces under a parent namespace
| GET | `/lance/v1/namespace/{parent}/list` | 1.1.0 |
+| DescribeNamespace | Retrieve detailed information about a specific namespace
| POST | `/lance/v1/namespace/{id}/describe` | 1.1.0 |
+| DropNamespace | Delete a namespace
| POST | `/lance/v1/namespace/{id}/drop` | 1.1.0 |
+| NamespaceExists | Check whether a namespace exists
| POST | `/lance/v1/namespace/{id}/exists` | 1.1.0 |
+| ListTables | List all tables in a namespace
| GET | `/lance/v1/table/{namespace}/list` | 1.1.0 |
+| CreateTable | Create a new table in a namespace
| POST | `/lance/v1/table/{id}/create` | 1.1.0 |
+| DropTable | Delete a table including both metadata and data
| POST | `/lance/v1/table/{id}/drop` | 1.1.0 |
+| TableExists | Check whether a table exists
| POST | `/lance/v1/table/{id}/exists` | 1.1.0 |
+| RegisterTable | Register an existing Lance table to a namespace
| POST | `/lance/v1/table/{id}/register` | 1.1.0 |
+| DeregisterTable | Unregister a table from a namespace (metadata only, data
remains) | POST | `/lance/v1/table/{id}/deregister` | 1.1.0 |
+
+More details, please refer to the [Lance REST API
specification](https://lance.org/format/namespace/rest/catalog-spec/)
+
+### Operation Details
+
+Some operations have specific behaviors and modes. Below are important details
to consider:
+
+#### Namespace Operations
+
+**CreateNamespace** supports three modes:
+- `create`: Fails if namespace already exists
+- `exist_ok`: Succeeds even if namespace exists
+- `overwrite`: Replaces existing namespace
+
+**DropNamespace** behavior:
+- Recursively deletes all child namespaces and tables
+- Deletes both metadata and Lance data files
+- Operation is irreversible
+
+#### Table Operations
+
+**RegisterTable vs CreateTable**:
+- **RegisterTable**: Links existing Lance datasets into Gravitino catalog
without data movement
+- **CreateTable**: Creates new Lance table with schema and writes data files
+
+**DropTable vs DeregisterTable**:
+- **DropTable**: Permanently deletes metadata and data files from storage
+- **DeregisterTable**: Removes metadata from Gravitino but preserves Lance
data files
+
+
+## Deployment
+
+### Running with Gravitino Server
+
+To enable the Lance REST service within Gravitino server, configure the
following properties in your Gravitino configuration file:
+
+| Configuration Property | Description
| Default Value |
Required | Since Version |
+|-------------------------------------------|------------------------------------------------------------------------------|-------------------------|----------|---------------|
+| `gravitino.auxService.names` | Auxiliary services to run.
Include `lance-rest` to enable Lance REST service | iceberg-rest,lance-rest |
Yes | 0.2.0 |
+| `gravitino.lance-rest.classpath` | Classpath for Lance REST
service, relative to Gravitino home directory | lance-rest-server/libs |
Yes | 1.1.0 |
+| `gravitino.lance-rest.httpPort` | Port number for Lance REST
service | 9101 |
Yes | 1.1.0 |
+| `gravitino.lance-rest.host` | Hostname for Lance REST service
| 0.0.0.0 | Yes
| 1.1.0 |
+| `gravitino.lance-rest.namespace-backend` | Namespace metadata backend
(currently only `gravitino` is supported) | gravitino |
Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino-uri` | Gravitino server URI (required
when namespace-backend is `gravitino`) | http://localhost:8090 | Yes
| 1.1.0 |
+| `gravitino.lance-rest.gravitino-metalake` | Gravitino metalake name
(required when namespace-backend is `gravitino`) | (none)
| Yes | 1.1.0 |
+
+**Example Configuration:**
+
+```properties
+gravitino.auxService.names = lance-rest
+gravitino.lance-rest.httpPort = 9101
+gravitino.lance-rest.host = 0.0.0.0
+gravitino.lance-rest.namespace-backend = gravitino
+gravitino.lance-rest.gravitino.uri = http://localhost:8090
+gravitino.lance-rest.gravitino.metalake-name = my_metalake
+```
+
+### Running Standalone
+
+To run Lance REST service independently without Gravitino server:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+Configure the service by editing
`{GRAVITINO_HOME}/conf/gravitino-lance-rest-server.conf` or passing
command-line arguments:
+
+| Configuration Property | Description
| Default Value | Required | Since Version |
+|------------------------------------------------|-----------------------------|-----------------------|----------|---------------|
+| `gravitino.lance-rest.namespace-backend` | Namespace metadata backend
| gravitino | Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI
| http://localhost:8090 | Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name
| (none) | Yes | 1.1.0 |
+| `gravitino.lance-rest.httpPort` | Service port number
| 9101 | No | 1.1.0 |
+| `gravitino.lance-rest.host` | Service hostname
| 0.0.0.0 | No | 1.1.0 |
+
+:::tip
+In most cases, you only need to configure
`gravitino.lance-rest.gravitino.metalake-name` and other properties can use
their default values.
Review Comment:
Inconsistent configuration property name. The property is documented as both:
- `gravitino.lance-rest.gravitino.metalake-name` (line 131, 148, 153)
- `gravitino.lance-rest.gravitino-metalake` (line 121, 175, and in conf file)
Based on the configuration template file change in this PR (line 110), the
correct property name should be `gravitino.lance-rest.gravitino-metalake` (with
hyphen, no dot before metalake). Please update lines 131, 148, and 153 to use
the correct property name.
```suggestion
| `gravitino.lance-rest.gravitino-metalake` | Gravitino metalake name
| (none) | Yes | 1.1.0 |
| `gravitino.lance-rest.httpPort` | Service port number
| 9101 | No | 1.1.0 |
| `gravitino.lance-rest.host` | Service hostname
| 0.0.0.0 | No | 1.1.0 |

:::tip
In most cases, you only need to configure
`gravitino.lance-rest.gravitino-metalake` and other properties can use their
default values.
```
##########
docs/manage-relational-metadata-using-gravitino.md:
##########
@@ -979,17 +982,18 @@ When defining a table column, you can specify a
[literal](./expression.md#litera
The following is a table of the column default value that Gravitino supports
for different catalogs:
-| Catalog provider | Supported default value |
-|---------------------|-------------------------|
-| `hive` | ✘ |
-| `lakehouse-iceberg` | ✘ |
-| `lakehouse-paimon` | ✘ |
-| `lakehouse-hudi` | ✘ |
-| `jdbc-mysql` | ✔ |
-| `jdbc-postgresql` | ✔ |
-| `jdbc-doris` | ✔ |
-| `jdbc-oceanbase` | ✔ |
-| `jdbc-starrocks` | ✔ |
+| Catalog provider | Supported default value |
+|----------------------|-------------------------|
+| `hive` | ✘ |
+| `lakehouse-iceberg` | ✘ |
+| `lakehouse-paimon` | ✘ |
+| `lakehouse-hudi` | ✘ |
+| `jdbc-mysql` | ✔ |
+| `jdbc-postgresql` | ✔ |
+| `jdbc-doris` | ✔ |
+| `jdbc-oceanbase` | ✔ |
+| `jdbc-starrocks` | ✔ |
+| `lakehouse-generic` | ✘ |
Review Comment:
Table formatting issue: Extra space before the pipe character. Should be `|`
not ` |` at the start of the line for consistent table formatting.
##########
docs/manage-relational-metadata-using-gravitino.md:
##########
@@ -506,6 +508,7 @@ Currently, Gravitino supports the following schema property:
| `jdbc-doris` | [Doris schema
property](./jdbc-doris-catalog.md#schema-properties) |
| `jdbc-oceanbase` | [OceanBase schema
property](./jdbc-oceanbase-catalog.md#schema-properties) |
| `jdbc-starrocks` | [StarRocks schema
property](./jdbc-starrocks-catalog.md#schema-properties) |
+| `lakehouse-genric` | [Lakehouse generic schema
property](./lakehouse-generic-catalog.md#schema-properties) |
Review Comment:
Typo in catalog provider name: "lakehouse-genric" should be
"lakehouse-generic" to match the correct spelling used elsewhere in the
documentation.
```suggestion
| `lakehouse-generic` | [Lakehouse generic schema
property](./lakehouse-generic-catalog.md#schema-properties) |
```
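Misspelled provider names like this one can be detected mechanically by comparing each name against the known list. The following is a minimal sketch, assuming the provider names listed in the default-value table of this same file are the complete set; `KNOWN_PROVIDERS` and `suggest_fix` are hypothetical names introduced for the example.

```python
import difflib

# Provider names as they appear in the default-value table of
# docs/manage-relational-metadata-using-gravitino.md in this PR.
# Treating this as the complete list is an assumption of the sketch.
KNOWN_PROVIDERS = [
    "hive",
    "lakehouse-iceberg",
    "lakehouse-paimon",
    "lakehouse-hudi",
    "jdbc-mysql",
    "jdbc-postgresql",
    "jdbc-doris",
    "jdbc-oceanbase",
    "jdbc-starrocks",
    "lakehouse-generic",
]


def suggest_fix(name: str) -> list[str]:
    """Return the closest known spelling for an unknown provider name.

    The result is empty when `name` is already a known provider, or when
    nothing in the list is similar enough.
    """
    if name in KNOWN_PROVIDERS:
        return []
    return difflib.get_close_matches(name, KNOWN_PROVIDERS, n=1)
```

For the row flagged here, `suggest_fix("lakehouse-genric")` returns `["lakehouse-generic"]`.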
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,390 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+ - Lance datasets
+ - REST API
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Overview
+
+The Lance REST service provides a RESTful interface for managing Lance
datasets through HTTP endpoints. Introduced in Gravitino version 1.1.0, this
service enables seamless interaction with Lance datasets for data operations
and metadata management.
+
+The service implements the [Lance REST API
specification](https://docs.lancedb.com/api-reference/introduction). For
detailed specification documentation, see the [official Lance REST
documentation](https://lance.org/format/namespace/rest/catalog-spec/).
+
+### What is Lance?
+
+[Lance](https://lance.org/format/) is a modern columnar data format designed
for AI/ML workloads. It provides:
+
+- **High-performance vector search**: Native support for similarity search on
high-dimensional embeddings
+- **Columnar storage**: Optimized for analytical queries and machine learning
pipelines
+- **Fast random access**: Efficient row-level operations unlike traditional
columnar formats
+- **Version control**: Built-in dataset versioning and time-travel capabilities
+- **Incremental updates**: Append and update data without full rewrites
+
+### Architecture
+
+The Lance REST service acts as a bridge between Lance datasets and
applications:
+
+```
+┌─────────────────┐
+│ Applications │
+│ (Python/Java) │
+└────────┬────────┘
+ │ HTTP/REST
+ ▼
+┌─────────────────┐
+│ Lance REST │◄──── Gravitino Metalake
+│ Service │ (Metadata Backend)
+└────────┬────────┘
+ │ File System Operations
+ ▼
+┌─────────────────┐
+│ Lance Datasets │
+│ (S3/GCS/Local) │
+└─────────────────┘
+```
+
+**Key Features:**
+- Full compliance with Lance REST API specification
+- Can run standalone or integrated with Gravitino server
+- Support for namespace and table management
+- Index creation and management capabilities (Index operations are not
supported in version 1.1.0)
+- Metadata stored in Gravitino for unified governance
+
+## Supported Operations
+
+The Lance REST service provides comprehensive support for namespace
management, table management, and index operations. The table below lists all
supported operations:
+
+| Operation | Description
| HTTP Method | Endpoint Pattern | Since Version |
+|-------------------|-------------------------------------------------------------------|-------------|-------------------------------------|---------------|
+| CreateNamespace | Create a new Lance namespace
| POST | `/lance/v1/namespace/{id}/create` | 1.1.0 |
+| ListNamespaces | List all namespaces under a parent namespace
| GET | `/lance/v1/namespace/{parent}/list` | 1.1.0 |
+| DescribeNamespace | Retrieve detailed information about a specific namespace
| POST | `/lance/v1/namespace/{id}/describe` | 1.1.0 |
+| DropNamespace | Delete a namespace
| POST | `/lance/v1/namespace/{id}/drop` | 1.1.0 |
+| NamespaceExists | Check whether a namespace exists
| POST | `/lance/v1/namespace/{id}/exists` | 1.1.0 |
+| ListTables | List all tables in a namespace
| GET | `/lance/v1/table/{namespace}/list` | 1.1.0 |
+| CreateTable | Create a new table in a namespace
| POST | `/lance/v1/table/{id}/create` | 1.1.0 |
+| DropTable | Delete a table including both metadata and data
| POST | `/lance/v1/table/{id}/drop` | 1.1.0 |
+| TableExists | Check whether a table exists
| POST | `/lance/v1/table/{id}/exists` | 1.1.0 |
+| RegisterTable | Register an existing Lance table to a namespace
| POST | `/lance/v1/table/{id}/register` | 1.1.0 |
+| DeregisterTable | Unregister a table from a namespace (metadata only, data
remains) | POST | `/lance/v1/table/{id}/deregister` | 1.1.0 |
+
+More details, please refer to the [Lance REST API
specification](https://lance.org/format/namespace/rest/catalog-spec/)
+
+### Operation Details
+
+Some operations have specific behaviors and modes. Below are important details
to consider:
+
+#### Namespace Operations
+
+**CreateNamespace** supports three modes:
+- `create`: Fails if namespace already exists
+- `exist_ok`: Succeeds even if namespace exists
+- `overwrite`: Replaces existing namespace
+
+**DropNamespace** behavior:
+- Recursively deletes all child namespaces and tables
+- Deletes both metadata and Lance data files
+- Operation is irreversible
+
+#### Table Operations
+
+**RegisterTable vs CreateTable**:
+- **RegisterTable**: Links existing Lance datasets into Gravitino catalog
without data movement
+- **CreateTable**: Creates new Lance table with schema and writes data files
+
+**DropTable vs DeregisterTable**:
+- **DropTable**: Permanently deletes metadata and data files from storage
+- **DeregisterTable**: Removes metadata from Gravitino but preserves Lance
data files
+
+
+## Deployment
+
+### Running with Gravitino Server
+
+To enable the Lance REST service within Gravitino server, configure the
following properties in your Gravitino configuration file:
+
+| Configuration Property | Description
| Default Value |
Required | Since Version |
+|-------------------------------------------|------------------------------------------------------------------------------|-------------------------|----------|---------------|
+| `gravitino.auxService.names` | Auxiliary services to run.
Include `lance-rest` to enable Lance REST service | iceberg-rest,lance-rest |
Yes | 0.2.0 |
+| `gravitino.lance-rest.classpath` | Classpath for Lance REST
service, relative to Gravitino home directory | lance-rest-server/libs |
Yes | 1.1.0 |
+| `gravitino.lance-rest.httpPort` | Port number for Lance REST
service | 9101 |
Yes | 1.1.0 |
+| `gravitino.lance-rest.host` | Hostname for Lance REST service
| 0.0.0.0 | Yes
| 1.1.0 |
+| `gravitino.lance-rest.namespace-backend` | Namespace metadata backend
(currently only `gravitino` is supported) | gravitino |
Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino-uri` | Gravitino server URI (required
when namespace-backend is `gravitino`) | http://localhost:8090 | Yes
| 1.1.0 |
+| `gravitino.lance-rest.gravitino-metalake` | Gravitino metalake name
(required when namespace-backend is `gravitino`) | (none)
| Yes | 1.1.0 |
+
+**Example Configuration:**
+
+```properties
+gravitino.auxService.names = lance-rest
+gravitino.lance-rest.httpPort = 9101
+gravitino.lance-rest.host = 0.0.0.0
+gravitino.lance-rest.namespace-backend = gravitino
+gravitino.lance-rest.gravitino.uri = http://localhost:8090
+gravitino.lance-rest.gravitino.metalake-name = my_metalake
+```
+
+### Running Standalone
+
+To run Lance REST service independently without Gravitino server:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+Configure the service by editing
`{GRAVITINO_HOME}/conf/gravitino-lance-rest-server.conf` or passing
command-line arguments:
+
+| Configuration Property | Description
| Default Value | Required | Since Version |
+|------------------------------------------------|-----------------------------|-----------------------|----------|---------------|
+| `gravitino.lance-rest.namespace-backend` | Namespace metadata backend
| gravitino | Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI
| http://localhost:8090 | Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name
| (none) | Yes | 1.1.0 |
+| `gravitino.lance-rest.httpPort` | Service port number
| 9101 | No | 1.1.0 |
+| `gravitino.lance-rest.host` | Service hostname
| 0.0.0.0 | No | 1.1.0 |
+
+:::tip
+In most cases, you only need to configure
`gravitino.lance-rest.gravitino.metalake-name` and other properties can use
their default values.
+:::
+
+
+### Running with Docker
+
+Launch Lance REST service using Docker:
+
+```shell
+docker run -d --name lance-rest-service -p 9101:9101 \
+ -e LANCE_REST_GRAVITINO_URI=http://gravitino-host:8090 \
+ -e LANCE_REST_GRAVITINO_METALAKE_NAME=your_metalake_name \
+ apache/gravitino-lance-rest:latest
+```
+
+Access the service at `http://localhost:9101`.
+
+**Environment Variables:**
+
+| Environment Variable | Configuration Property
| Required | Default Value | Since Version |
+|--------------------------------------|-------------------------------------------|----------|-------------------------|---------------|
+| `LANCE_REST_NAMESPACE_BACKEND` |
`gravitino.lance-rest.namespace-backend` | No | `gravitino`
| 1.1.0 |
+| `LANCE_REST_GRAVITINO_METALAKE_NAME` |
`gravitino.lance-rest.gravitino-metalake` | Yes | (none)
| 1.1.0 |
+| `LANCE_REST_GRAVITINO_URI` | `gravitino.lance-rest.gravitino-uri`
| No | `http://localhost:8090` | 1.1.0 |
+| `LANCE_REST_HOST` | `gravitino.lance-rest.host`
| No | `0.0.0.0` | 1.1.0 |
+| `LANCE_REST_PORT` | `gravitino.lance-rest.httpPort`
| No | `9101` | 1.1.0 |
+
+:::tip Configuration Tips
+- **Required:** Set `LANCE_REST_GRAVITINO_METALAKE_NAME` to your Gravitino
metalake name
+- **Conditional:** Update `LANCE_REST_GRAVITINO_URI` if Gravitino server is
not on `localhost`
+- **Optional:** Other variables can use default values unless you have
specific requirements
+
Review Comment:
Missing closing tag for the tip admonition block. The `:::tip Configuration
Tips` block (line 180) is not properly closed with `:::` before the next
section starts.
```suggestion
:::
```
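Unclosed admonitions are easy to miss in a long diff. The sketch below shows one way to find them in the raw Markdown file (not in the `+`-prefixed diff). It assumes Docusaurus-style admonitions, where a line starting with `:::` plus a name opens a block and a bare `:::` closes it, and it ignores fenced code blocks and nested admonitions with longer colon runs. The function name `find_unclosed_admonitions` is hypothetical.

```python
def find_unclosed_admonitions(markdown: str) -> list[int]:
    """Return 1-based line numbers of `:::name` openers with no closing `:::`."""
    open_stack: list[int] = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith(":::"):
            continue
        if stripped == ":::":
            # A bare `:::` closes the most recent open block, if any.
            if open_stack:
                open_stack.pop()
        else:
            # Anything else starting with `:::` opens a new block.
            open_stack.append(lineno)
    # Whatever is still on the stack was never closed.
    return open_stack
```

Run against `docs/lance-rest-service.md` as submitted, this should report the line of the `:::tip Configuration Tips` opener.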
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,390 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+ - Lance datasets
+ - REST API
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Overview
+
+The Lance REST service provides a RESTful interface for managing Lance
datasets through HTTP endpoints. Introduced in Gravitino version 1.1.0, this
service enables seamless interaction with Lance datasets for data operations
and metadata management.
+
+The service implements the [Lance REST API
specification](https://docs.lancedb.com/api-reference/introduction). For
detailed specification documentation, see the [official Lance REST
documentation](https://lance.org/format/namespace/rest/catalog-spec/).
+
+### What is Lance?
+
+[Lance](https://lance.org/format/) is a modern columnar data format designed
for AI/ML workloads. It provides:
+
+- **High-performance vector search**: Native support for similarity search on
high-dimensional embeddings
+- **Columnar storage**: Optimized for analytical queries and machine learning
pipelines
+- **Fast random access**: Efficient row-level operations unlike traditional
columnar formats
+- **Version control**: Built-in dataset versioning and time-travel capabilities
+- **Incremental updates**: Append and update data without full rewrites
+
+### Architecture
+
+The Lance REST service acts as a bridge between Lance datasets and
applications:
+
+```
+┌─────────────────┐
+│ Applications │
+│ (Python/Java) │
+└────────┬────────┘
+ │ HTTP/REST
+ ▼
+┌─────────────────┐
+│ Lance REST │◄──── Gravitino Metalake
+│ Service │ (Metadata Backend)
+└────────┬────────┘
+ │ File System Operations
+ ▼
+┌─────────────────┐
+│ Lance Datasets │
+│ (S3/GCS/Local) │
+└─────────────────┘
+```
+
+**Key Features:**
+- Full compliance with Lance REST API specification
+- Can run standalone or integrated with Gravitino server
+- Support for namespace and table management
+- Index creation and management capabilities (Index operations are not
supported in version 1.1.0)
+- Metadata stored in Gravitino for unified governance
+
+## Supported Operations
+
+The Lance REST service supports namespace management and table management. Index operations are not supported in version 1.1.0. The table below lists all supported operations:
+
+| Operation         | Description                                                       | HTTP Method | Endpoint Pattern                    | Since Version |
+|-------------------|-------------------------------------------------------------------|-------------|-------------------------------------|---------------|
+| CreateNamespace   | Create a new Lance namespace                                      | POST        | `/lance/v1/namespace/{id}/create`   | 1.1.0         |
+| ListNamespaces    | List all namespaces under a parent namespace                      | GET         | `/lance/v1/namespace/{parent}/list` | 1.1.0         |
+| DescribeNamespace | Retrieve detailed information about a specific namespace          | POST        | `/lance/v1/namespace/{id}/describe` | 1.1.0         |
+| DropNamespace     | Delete a namespace                                                | POST        | `/lance/v1/namespace/{id}/drop`     | 1.1.0         |
+| NamespaceExists   | Check whether a namespace exists                                  | POST        | `/lance/v1/namespace/{id}/exists`   | 1.1.0         |
+| ListTables        | List all tables in a namespace                                    | GET         | `/lance/v1/table/{namespace}/list`  | 1.1.0         |
+| CreateTable       | Create a new table in a namespace                                 | POST        | `/lance/v1/table/{id}/create`       | 1.1.0         |
+| DropTable         | Delete a table including both metadata and data                   | POST        | `/lance/v1/table/{id}/drop`         | 1.1.0         |
+| TableExists       | Check whether a table exists                                      | POST        | `/lance/v1/table/{id}/exists`       | 1.1.0         |
+| RegisterTable     | Register an existing Lance table to a namespace                   | POST        | `/lance/v1/table/{id}/register`     | 1.1.0         |
+| DeregisterTable   | Unregister a table from a namespace (metadata only, data remains) | POST        | `/lance/v1/table/{id}/deregister`   | 1.1.0         |
+
+For more details, refer to the [Lance REST API specification](https://lance.org/format/namespace/rest/catalog-spec/).
+
+### Operation Details
+
+Some operations have specific behaviors and modes. Below are important details
to consider:
+
+#### Namespace Operations
+
+**CreateNamespace** supports three modes:
+- `create`: Fails if namespace already exists
+- `exist_ok`: Succeeds even if namespace exists
+- `overwrite`: Replaces existing namespace
+
+**DropNamespace** behavior:
+- Recursively deletes all child namespaces and tables
+- Deletes both metadata and Lance data files
+- Operation is irreversible
+
+#### Table Operations
+
+**RegisterTable vs CreateTable**:
+- **RegisterTable**: Links existing Lance datasets into Gravitino catalog
without data movement
+- **CreateTable**: Creates new Lance table with schema and writes data files
+
+**DropTable vs DeregisterTable**:
+- **DropTable**: Permanently deletes metadata and data files from storage
+- **DeregisterTable**: Removes metadata from Gravitino but preserves Lance
data files
+
+
+## Deployment
+
+### Running with Gravitino Server
+
+To enable the Lance REST service within Gravitino server, configure the
following properties in your Gravitino configuration file:
+
+| Configuration Property                    | Description                                                                  | Default Value           | Required | Since Version |
+|-------------------------------------------|------------------------------------------------------------------------------|-------------------------|----------|---------------|
+| `gravitino.auxService.names`              | Auxiliary services to run. Include `lance-rest` to enable Lance REST service | iceberg-rest,lance-rest | Yes      | 0.2.0         |
+| `gravitino.lance-rest.classpath`          | Classpath for Lance REST service, relative to Gravitino home directory       | lance-rest-server/libs  | Yes      | 1.1.0         |
+| `gravitino.lance-rest.httpPort`           | Port number for Lance REST service                                           | 9101                    | Yes      | 1.1.0         |
+| `gravitino.lance-rest.host`               | Hostname for Lance REST service                                              | 0.0.0.0                 | Yes      | 1.1.0         |
+| `gravitino.lance-rest.namespace-backend`  | Namespace metadata backend (currently only `gravitino` is supported)         | gravitino               | Yes      | 1.1.0         |
+| `gravitino.lance-rest.gravitino-uri`      | Gravitino server URI (required when namespace-backend is `gravitino`)        | http://localhost:8090   | Yes      | 1.1.0         |
+| `gravitino.lance-rest.gravitino-metalake` | Gravitino metalake name (required when namespace-backend is `gravitino`)     | (none)                  | Yes      | 1.1.0         |
+
+**Example Configuration:**
+
+```properties
+gravitino.auxService.names = lance-rest
+gravitino.lance-rest.httpPort = 9101
+gravitino.lance-rest.host = 0.0.0.0
+gravitino.lance-rest.namespace-backend = gravitino
+gravitino.lance-rest.gravitino.uri = http://localhost:8090
+gravitino.lance-rest.gravitino.metalake-name = my_metalake
+```
+
+### Running Standalone
+
+To run Lance REST service independently without Gravitino server:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+Configure the service by editing
`{GRAVITINO_HOME}/conf/gravitino-lance-rest-server.conf` or passing
command-line arguments:
+
+| Configuration Property                         | Description                 | Default Value         | Required | Since Version |
+|------------------------------------------------|-----------------------------|-----------------------|----------|---------------|
+| `gravitino.lance-rest.namespace-backend`       | Namespace metadata backend  | gravitino             | Yes      | 1.1.0         |
+| `gravitino.lance-rest.gravitino.uri`           | Gravitino server URI        | http://localhost:8090 | Yes      | 1.1.0         |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name     | (none)                | Yes      | 1.1.0         |
+| `gravitino.lance-rest.httpPort`                | Service port number         | 9101                  | No       | 1.1.0         |
+| `gravitino.lance-rest.host`                    | Service hostname            | 0.0.0.0               | No       | 1.1.0         |
+
+:::tip
+In most cases, you only need to configure
`gravitino.lance-rest.gravitino.metalake-name` and other properties can use
their default values.
+:::
+
+
+### Running with Docker
+
+Launch Lance REST service using Docker:
+
+```shell
+docker run -d --name lance-rest-service -p 9101:9101 \
+ -e LANCE_REST_GRAVITINO_URI=http://gravitino-host:8090 \
+ -e LANCE_REST_GRAVITINO_METALAKE_NAME=your_metalake_name \
+ apache/gravitino-lance-rest:latest
+```
+
+Access the service at `http://localhost:9101`.
+
+**Environment Variables:**
+
+| Environment Variable                 | Configuration Property                    | Required | Default Value           | Since Version |
+|--------------------------------------|-------------------------------------------|----------|-------------------------|---------------|
+| `LANCE_REST_NAMESPACE_BACKEND`       | `gravitino.lance-rest.namespace-backend`  | No       | `gravitino`             | 1.1.0         |
+| `LANCE_REST_GRAVITINO_METALAKE_NAME` | `gravitino.lance-rest.gravitino-metalake` | Yes      | (none)                  | 1.1.0         |
+| `LANCE_REST_GRAVITINO_URI`           | `gravitino.lance-rest.gravitino-uri`      | No       | `http://localhost:8090` | 1.1.0         |
+| `LANCE_REST_HOST`                    | `gravitino.lance-rest.host`               | No       | `0.0.0.0`               | 1.1.0         |
+| `LANCE_REST_PORT`                    | `gravitino.lance-rest.httpPort`           | No       | `9101`                  | 1.1.0         |
+
+:::tip Configuration Tips
+- **Required:** Set `LANCE_REST_GRAVITINO_METALAKE_NAME` to your Gravitino
metalake name
+- **Conditional:** Update `LANCE_REST_GRAVITINO_URI` if Gravitino server is
not on `localhost`
+- **Optional:** Other variables can use default values unless you have specific requirements
+:::
+
+
+## Usage Guidelines
+
+When using Lance REST service with Gravitino backend, keep the following
considerations in mind:
+
+### Prerequisites
+- A running Gravitino server with a created metalake
+
+### Namespace Hierarchy
+Gravitino follows a three-level hierarchy: **catalog → schema → table**. When
creating namespaces or tables:
+
+1. **Parent must exist:** Before creating `lance_catalog/schema`, ensure
`lance_catalog` catalog exists in Gravitino metalake.
+2. **Two-level limit:** You can create namespace `lance_catalog/schema`, but
**not** `lance_catalog/schema/sub_schema`.
+3. **Table placement:** Tables can only be created under
`lance_catalog/schema`, not at catalog level.
+
+**Example Hierarchy:**
+```
+metalake
+└── lance_catalog (catalog - create via REST)
+ └── schema (namespace - create via REST)
+ └── table01 (table - create via REST)
+```
+
+### Delimiter Convention
+
+The Lance REST API uses `$` as the default delimiter to separate namespace
levels in URIs. When making HTTP requests:
+
+- **URL Encoding Required**: `$` must be URL-encoded as `%24`
+- **Example**: `lance_catalog$schema$table01` becomes
`lance_catalog%24schema%24table01` in URLs
+
+**Common Delimiters:**
+```
+Namespace path: lance_catalog.schema.table01
+URI representation: lance_catalog$schema$table01
+URL encoded: lance_catalog%24schema%24table01
+```
+
+:::caution Important Limitations
+- Currently supports only **two levels of namespaces** before tables
+- Tables **cannot** be nested deeper than schema level
+- Parent catalog must be created in Gravitino before using Lance REST API
+- Metadata operations require Gravitino server to be available
+- Namespace deletion is recursive and irreversible
+:::
+
+## Examples
+
+The following examples demonstrate how to interact with Lance REST service
using different programming languages and tools.
+
+**Prerequisites:**
+- Gravitino server is running with Lance REST service enabled.
+- A metalake has been created in Gravitino.
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+# Create a catalog-level namespace
+# mode: "create" | "exist_ok" | "overwrite" for create namespace/table; mode: "create" | "overwrite" for register table
+curl -X POST http://localhost:9101/lance/v1/namespace/lance_catalog/create \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "id": ["lance_catalog"],
+ "mode": "create"
+ }'
+
+# Create a schema namespace
+# Note: %24 is URL-encoded '$' character used as delimiter
+curl -X POST http://localhost:9101/lance/v1/namespace/lance_catalog%24schema/create \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "id": ["lance_catalog", "schema"],
+ "mode": "create"
+ }'
+
+# Register an existing table
+curl -X POST http://localhost:9101/lance/v1/table/lance_catalog%24schema%24table01/register \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "id": ["lance_catalog", "schema", "table01"],
+ "location": "/tmp/lance_catalog/schema/table01",
+ "mode": "CREATE"
+ }'
+
+# Create a new empty table
+curl -X POST http://localhost:9101/lance/v1/table/lance_catalog%24schema%24table02/create-empty \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "id": ["lance_catalog", "schema", "table02"],
+ "location": "/tmp/lance_catalog/schema/table02",
+ "properties": { "description": "This is table02" }
+ }'
+
+# Create a table with schema, the schema is inferred from the Arrow IPC file
+curl -X POST \
+  "http://localhost:9101/lance/v1/table/lance_catalog%24schema%24table03/create" \
+ -H 'Content-Type: application/vnd.apache.arrow.stream' \
+ -H "x-lance-table-location: "/tmp/lance_catalog/schema/table04" \
Review Comment:
Inconsistent table naming between the endpoint and the header. The endpoint
targets "table03" but the header value points to "table04":
- Comment: `# Create a table with schema, the schema is inferred from the
Arrow IPC file`
- Endpoint: `/table/lance_catalog%24schema%24table03/create`
- Header: `x-lance-table-location: "/tmp/lance_catalog/schema/table04"`
The location path should match the table name. Change "table04" to "table03"
in the `x-lance-table-location` header. The header value also has a stray
double quote before `/tmp`, which leaves the shell string unbalanced, so the
suggestion below removes it.
```suggestion
  -H "x-lance-table-location: /tmp/lance_catalog/schema/table03" \
```
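For context on the endpoint in this example: the `%24` segments come from URL-encoding the `$` delimiter described in the Delimiter Convention section of the same file. A minimal shell sketch of how such a path segment could be built is shown below. The helper name `encode_lance_id` is hypothetical (it is not part of Gravitino or Lance), and it assumes the namespace levels themselves contain no characters that need encoding.

```shell
# Hypothetical helper (an assumption for illustration, not a Gravitino or Lance tool):
# joins namespace levels with the URL-encoded '$' delimiter (%24).
# Assumes the levels themselves need no further URL encoding.
encode_lance_id() {
  joined=""
  for level in "$@"; do
    if [ -z "$joined" ]; then
      joined="$level"
    else
      joined="${joined}%24${level}"
    fi
  done
  printf '%s\n' "$joined"
}

# Build the path segment for catalog -> schema -> table.
encode_lance_id lance_catalog schema table03
```

The result can be dropped into the endpoint pattern, for example `/lance/v1/table/$(encode_lance_id lance_catalog schema table03)/create`, which keeps the table name in the URL and in the location header easy to compare.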
##########
docs/lakehouse-generic-lance-table.md:
##########
@@ -0,0 +1,335 @@
+
+
+---
+title: "Generic lakehouse catalog with Lance"
+slug: /lakehouse-generic-catalog-with-lance
+keywords:
+- lakehouse
+- lance
+- metadata
+- generic catalog
+- file system
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+## Overview
+
+This document describes how to use Apache Gravitino to manage a generic
lakehouse catalog using Lance as the underlying table format.
+
+
+## Table Management
+
+### Supported Operations
+
+For Lance tables in a Generic Lakehouse Catalog, the following table
summarizes supported operations:
+
+| Operation | Support Status |
+|-----------|----------------|
+| List | ✅ Full |
+| Load | ✅ Full |
+| Alter     | ❌ Not supported |
+| Create | ✅ Full |
+| Register | ✅ Full |
+| Drop | ✅ Full |
+| Truncate | ✅ Full |
+
+:::note Feature Limitations
+- **Partitioning:** Not currently supported
+- **Sort Orders:** Not currently supported
+- **Distributions:** Not currently supported
+ :::
+
+### Data Type Mappings
+
+Lance uses Apache Arrow for table schemas. The following table shows type
mappings between Gravitino and Arrow:
+
+| Gravitino Type | Arrow Type |
+|----------------------------------|-----------------------------------------|
+| `Struct` | `Struct` |
+| `Map` | `Map` |
+| `List` | `Array` |
+| `Boolean` | `Boolean` |
+| `Byte` | `Int8` |
+| `Short` | `Int16` |
+| `Integer` | `Int32` |
+| `Long` | `Int64` |
+| `Float` | `Float` |
+| `Double` | `Double` |
+| `String` | `Utf8` |
+| `Binary` | `Binary` |
+| `Decimal(p, s)` | `Decimal(p, s)` (128-bit) |
+| `Date` | `Date` |
+| `Timestamp`/`Timestamp(6)` | `TimestampType withoutZone` |
+| `Timestamp(0)` | `TimestampType Second withoutZone` |
+| `Timestamp(3)` | `TimestampType Millisecond withoutZone` |
+| `Timestamp(9)` | `TimestampType Nanosecond withoutZone` |
+| `Timestamp_tz`/`Timestamp_tz(6)` | `TimestampType Microsecond withUtc` |
+| `Timestamp_tz(0)` | `TimestampType Second withUtc` |
+| `Timestamp_tz(3)` | `TimestampType Millisecond withUtc` |
+| `Timestamp_tz(9)` | `TimestampType Nanosecond withUtc` |
+| `Time`/`Time(9)` | `Time Nanosecond` |
+| `Null` | `Null` |
+| `Fixed(n)` | `Fixed-Size Binary(n)` |
+| `Interval_year` | `Interval(YearMonth)` |
+| `Interval_day` | `Duration(Microsecond)` |
+| `External(arrow_field_json_str)` | Any Arrow Field |
+
+### External Type Support
+
+For Arrow types not natively mapped in Gravitino, use the
`External(arrow_field_json_str)` type, which accepts a JSON string
representation of an Arrow `Field`.
+
+**Requirements:**
+- JSON must conform to Apache Arrow [Field
specification](https://github.com/apache/arrow-java/blob/ed81e5981a2bee40584b3a411ed755cb4cc5b91f/vector/src/main/java/org/apache/arrow/vector/types/pojo/Field.java#L80C1-L86C68)
+- `name` attribute must match column name exactly
+- `nullable` attribute must match column nullability
+- `children` array:
+ - Empty for primitive types
+ - Contains child field definitions for complex types (Struct, List)
+
+**Examples:**
+
+| Arrow Type        | External Type Definition |
+|-------------------|--------------------------|
+| `Large Utf8`      | `External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largeutf8\"},\"children\":[]}")` |
+| `Large Binary`    | `External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largebinary\"},\"children\":[]}")` |
+| `Large List`      | `External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largelist\"},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",\"bitWidth\":32,\"isSigned\":true},\"children\":[]}]}")` |
+| `Fixed-Size List` | `External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"fixedsizelist\",\"listSize\":10},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",\"bitWidth\":32,\"isSigned\":true},\"children\":[]}]}")` |
+
+### Table Properties
+
+Required and optional properties for tables in a Generic Lakehouse Catalog:
+
+| Property              | Description | Default  | Required     | Since Version |
+|-----------------------|-------------|----------|--------------|---------------|
+| `format`              | Table format: `lance`, `iceberg`, etc. (currently only `lance` is fully supported) | (none)   | Yes          | 1.1.0         |
+| `location`            | Storage path for table metadata and data. Lance currently supports: S3, GCS, OSS, AZ, File, Memory and file-object-store. | (none)   | Conditional* | 1.1.0         |
+| `external`            | Whether the data directory is an external location. If `true`, dropping a table only removes metadata in Gravitino and does not delete the data directory, while purging the table deletes both. For a non-external table, dropping deletes both. | false    | No           | 1.1.0         |
+| `lance.creation-mode` | Creation mode. For creating a table, it can be `CREATE`, `EXIST_OK`, or `OVERWRITE`; for registering a table, it can be `CREATE` or `OVERWRITE`. | `CREATE` | No           | 1.1.0         |
+| `lance.register`      | Whether this is a register table operation. Registering does not actually create the data directory; it is the user's responsibility to create and manage it. | false    | No           | 1.1.0         |
+| `lance.storage.xxxx`  | Any additional storage-specific properties required by the Lance format (e.g., S3 credentials, HDFS configs). Replace `xxxx` with the actual property name. For example, use `lance.storage.aws_access_key_id` to set the S3 `aws_access_key_id` when using an S3 location. For details, see https://lancedb.com/docs/storage/integrations/ | (none)   | No           | 1.1.0         |
+
+
+**Location Requirement:** Must be specified at catalog, schema, or table
level. See [Location
Resolution](./lakehouse-generic-catalog.md#key-property-location).
+
+You may also set additional properties specific to your lakehouse format or
custom requirements.
+
+### Index Support
+
+Index capabilities vary by lakehouse format. The following table shows Lance
format support:
+
+| Index Type | Description                                                                    | Lance Support |
+|------------|--------------------------------------------------------------------------------|---------------|
+| SCALAR     | Optimizes searches on scalar data types (integers, floats, etc.)               | ✅            |
+| VECTOR     | Optimizes similarity searches in high-dimensional vector spaces                | ✅            |
+| BTREE      | Balanced tree for sorted data with logarithmic search/insert/delete complexity | ✅            |
+| INVERTED   | Full-text search optimization through term-to-location mapping                 | ✅            |
+| IVF_FLAT   | Vector search with inverted file and flat quantization                         | ✅            |
+| IVF_SQ     | Vector search with scalar quantization for memory efficiency                   | ✅            |
+| IVF_PQ     | Vector search with product quantization balancing accuracy and memory          | ✅            |
+
+:::caution Index Creation Limitation
+**Lance tables do not support index creation during table creation.** You must:
+1. Create the table first
+2. Then create indexes on the created table
+
+This is a fundamental limitation of Lance format, not Gravitino.
+:::
+
+For more details on table indexes, see [Table Partitioning, Distribution, Sort
Ordering, and
Indexes](./manage-relational-metadata-using-gravitino.md#table-partitioning-distribution-sort-ordering-and-indexes).
+
+### Table Operations
+
+Table operations follow standard relational catalog patterns. See [Table
Operations](./manage-relational-metadata-using-gravitino.md#table-operations)
for comprehensive documentation.
+
+The following sections provide examples and important details for working with
Lance tables.
+
+#### Creating a Lance Table
+
+<Tabs groupId='language' queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" -d '{
+ "name": "lance_table",
+ "comment": "Example Lance table",
+ "columns": [
+ {
+ "name": "id",
+ "type": "integer",
+ "comment": "Primary identifier",
+ "nullable": false,
+ "autoIncrement": true,
+ "defaultValue": {
+ "type": "literal",
+ "dataType": "integer",
+ "value": "-1"
+ }
+ }
+ ],
+ "properties": {
+ "format": "lance",
+ "location": "/tmp/lance_catalog/schema/lance_table1"
Review Comment:
Inconsistent table naming. The table name is "lance_table" (line 158) but
the location uses "lance_table1" (line 176). These should match for
consistency. Change the location to `/tmp/lance_catalog/schema/lance_table` to
match the table name.
```suggestion
"location": "/tmp/lance_catalog/schema/lance_table"
```
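Mismatches like this one could be caught mechanically when reviewing doc examples. The sketch below is only an illustration: `location_matches_table` is a hypothetical helper, and the rule that the last path segment of `location` equals the table name is a documentation-consistency convention assumed here, not a requirement stated by Gravitino.

```shell
# Hypothetical doc-lint helper (not part of Gravitino): succeeds when the last
# path segment of the location equals the table name.
location_matches_table() {
  table_name="$1"
  location="$2"
  [ "$(basename "$location")" = "$table_name" ]
}

# Values taken from this hunk: the name and the location disagree.
if location_matches_table "lance_table" "/tmp/lance_catalog/schema/lance_table1"; then
  echo "consistent"
else
  echo "mismatch"
fi
```

With the suggested location `/tmp/lance_catalog/schema/lance_table`, the same check succeeds.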
##########
docs/lakehouse-generic-lance-table.md:
##########
@@ -0,0 +1,335 @@
+
+
+---
+title: "Generic lakehouse catalog with Lance"
+slug: /lakehouse-generic-catalog-with-lance
+keywords:
+- lakehouse
+- lance
+- metadata
+- generic catalog
+- file system
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+## Overview
+
+This document describes how to use Apache Gravitino to manage a generic
lakehouse catalog using Lance as the underlying table format.
+
+
+## Table Management
+
+### Supported Operations
+
+For Lance tables in a Generic Lakehouse Catalog, the following table
summarizes supported operations:
+
+| Operation | Support Status |
+|-----------|----------------|
+| List | ✅ Full |
+| Load | ✅ Full |
+| Alter     | ❌ Not supported |
+| Create | ✅ Full |
+| Register | ✅ Full |
+| Drop | ✅ Full |
+| Truncate | ✅ Full |
+
+:::note Feature Limitations
+- **Partitioning:** Not currently supported
+- **Sort Orders:** Not currently supported
+- **Distributions:** Not currently supported
+ :::
+
+### Data Type Mappings
+
+Lance uses Apache Arrow for table schemas. The following table shows type
mappings between Gravitino and Arrow:
+
+| Gravitino Type | Arrow Type |
+|----------------------------------|-----------------------------------------|
+| `Struct` | `Struct` |
+| `Map` | `Map` |
+| `List` | `Array` |
+| `Boolean` | `Boolean` |
+| `Byte` | `Int8` |
+| `Short` | `Int16` |
+| `Integer` | `Int32` |
+| `Long` | `Int64` |
+| `Float` | `Float` |
+| `Double` | `Double` |
+| `String` | `Utf8` |
+| `Binary` | `Binary` |
+| `Decimal(p, s)` | `Decimal(p, s)` (128-bit) |
+| `Date` | `Date` |
+| `Timestamp`/`Timestamp(6)` | `TimestampType withoutZone` |
+| `Timestamp(0)` | `TimestampType Second withoutZone` |
+| `Timestamp(3)` | `TimestampType Millisecond withoutZone` |
+| `Timestamp(9)` | `TimestampType Nanosecond withoutZone` |
+| `Timestamp_tz`/`Timestamp_tz(6)` | `TimestampType Microsecond withUtc` |
+| `Timestamp_tz(0)` | `TimestampType Second withUtc` |
+| `Timestamp_tz(3)` | `TimestampType Millisecond withUtc` |
+| `Timestamp_tz(9)` | `TimestampType Nanosecond withUtc` |
+| `Time`/`Time(9)` | `Time Nanosecond` |
+| `Null` | `Null` |
+| `Fixed(n)` | `Fixed-Size Binary(n)` |
+| `Interval_year` | `Interval(YearMonth)` |
+| `Interval_day` | `Duration(Microsecond)` |
+| `External(arrow_field_json_str)` | Any Arrow Field |
+
+### External Type Support
+
+For Arrow types not natively mapped in Gravitino, use the
`External(arrow_field_json_str)` type, which accepts a JSON string
representation of an Arrow `Field`.
+
+**Requirements:**
+- JSON must conform to Apache Arrow [Field
specification](https://github.com/apache/arrow-java/blob/ed81e5981a2bee40584b3a411ed755cb4cc5b91f/vector/src/main/java/org/apache/arrow/vector/types/pojo/Field.java#L80C1-L86C68)
+- `name` attribute must match column name exactly
+- `nullable` attribute must match column nullability
+- `children` array:
+ - Empty for primitive types
+ - Contains child field definitions for complex types (Struct, List)
+
+**Examples:**
+
+| Arrow Type        | External Type Definition |
+|-------------------|--------------------------|
+| `Large Utf8`      | `External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largeutf8\"},\"children\":[]}")` |
+| `Large Binary`    | `External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largebinary\"},\"children\":[]}")` |
+| `Large List`      | `External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largelist\"},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",\"bitWidth\":32,\"isSigned\":true},\"children\":[]}]}")` |
+| `Fixed-Size List` | `External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"fixedsizelist\",\"listSize\":10},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",\"bitWidth\":32,\"isSigned\":true},\"children\":[]}]}")` |
+
+### Table Properties
+
+Required and optional properties for tables in a Generic Lakehouse Catalog:
+
+| Property | Description
| Default | Required | Since
Version |
+|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|--------------|---------------|
+| `format` | Table format: `lance`, `iceberg`, etc. (currently
only `lance` is fully supported)
| (none) | Yes | 1.1.0
|
+| `location` | Storage path for table metadata and data, Lance
currently supports: S3, GCS, OSS, AZ, File, Memory and file-object-store.
| (none) | Conditional* |
1.1.0 |
+| `external` | Whether the data directory is an external location.
If it's `true`, dropping a table will only remove metadata in Gravitino and
will not delete the data directory, and purge table will delete both. For a
non-external table, dropping will drop both.
| false | No |
1.1.0 |
+| `lance.creation-mode` | Create mode: for create table, it can be `CREATE`,
`EXIST_OK` or `OVERWRITE`. and it should be `CREATE` and `OVERWRITE` for
registering tables
| `CREATE` | No |
1.1.0 |
+| `lance.register` | Whether it is a register table operation. This API
will not create data directory acutally and it's the user responsibility to
create and manage the data directory.
| false | No |
1.1.0 |
Review Comment:
Grammatical error: "it's the user responsibility" should be "it's the user's
responsibility" (possessive form required). The same sentence also misspells
"actually" as "acutally".
```suggestion
| `lance.register`      | Whether it is a register table operation. This API will not actually create the data directory, and it's the user's responsibility to create and manage the data directory. | false    | No           | 1.1.0         |
```
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,390 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+ - Lance datasets
+ - REST API
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Overview
+
+The Lance REST service provides a RESTful interface for managing Lance
datasets through HTTP endpoints. Introduced in Gravitino version 1.1.0, this
service enables seamless interaction with Lance datasets for data operations
and metadata management.
+
+The service implements the [Lance REST API
specification](https://docs.lancedb.com/api-reference/introduction). For
detailed specification documentation, see the [official Lance REST
documentation](https://lance.org/format/namespace/rest/catalog-spec/).
+
+### What is Lance?
+
+[Lance](https://lance.org/format/) is a modern columnar data format designed
for AI/ML workloads. It provides:
+
+- **High-performance vector search**: Native support for similarity search on
high-dimensional embeddings
+- **Columnar storage**: Optimized for analytical queries and machine learning
pipelines
+- **Fast random access**: Efficient row-level operations unlike traditional
columnar formats
+- **Version control**: Built-in dataset versioning and time-travel capabilities
+- **Incremental updates**: Append and update data without full rewrites
+
+### Architecture
+
+The Lance REST service acts as a bridge between Lance datasets and
applications:
+
+```
+┌─────────────────┐
+│ Applications │
+│ (Python/Java) │
+└────────┬────────┘
+ │ HTTP/REST
+ ▼
+┌─────────────────┐
+│ Lance REST │◄──── Gravitino Metalake
+│ Service │ (Metadata Backend)
+└────────┬────────┘
+ │ File System Operations
+ ▼
+┌─────────────────┐
+│ Lance Datasets │
+│ (S3/GCS/Local) │
+└─────────────────┘
+```
+
+**Key Features:**
+- Full compliance with Lance REST API specification
+- Can run standalone or integrated with Gravitino server
+- Support for namespace and table management
+- Index creation and management capabilities (Index operations are not
supported in version 1.1.0)
+- Metadata stored in Gravitino for unified governance
+
+## Supported Operations
+
+The Lance REST service supports namespace management and table management. Index operations are not supported in version 1.1.0. The table below lists all supported operations:
+
+| Operation         | Description                                                       | HTTP Method | Endpoint Pattern                    | Since Version |
+|-------------------|-------------------------------------------------------------------|-------------|-------------------------------------|---------------|
+| CreateNamespace   | Create a new Lance namespace                                      | POST        | `/lance/v1/namespace/{id}/create`   | 1.1.0         |
+| ListNamespaces    | List all namespaces under a parent namespace                      | GET         | `/lance/v1/namespace/{parent}/list` | 1.1.0         |
+| DescribeNamespace | Retrieve detailed information about a specific namespace          | POST        | `/lance/v1/namespace/{id}/describe` | 1.1.0         |
+| DropNamespace     | Delete a namespace                                                | POST        | `/lance/v1/namespace/{id}/drop`     | 1.1.0         |
+| NamespaceExists   | Check whether a namespace exists                                  | POST        | `/lance/v1/namespace/{id}/exists`   | 1.1.0         |
+| ListTables        | List all tables in a namespace                                    | GET         | `/lance/v1/table/{namespace}/list`  | 1.1.0         |
+| CreateTable       | Create a new table in a namespace                                 | POST        | `/lance/v1/table/{id}/create`       | 1.1.0         |
+| DropTable         | Delete a table including both metadata and data                   | POST        | `/lance/v1/table/{id}/drop`         | 1.1.0         |
+| TableExists       | Check whether a table exists                                      | POST        | `/lance/v1/table/{id}/exists`       | 1.1.0         |
+| RegisterTable     | Register an existing Lance table to a namespace                   | POST        | `/lance/v1/table/{id}/register`     | 1.1.0         |
+| DeregisterTable   | Unregister a table from a namespace (metadata only, data remains) | POST        | `/lance/v1/table/{id}/deregister`   | 1.1.0         |
+
+For more details, refer to the [Lance REST API specification](https://lance.org/format/namespace/rest/catalog-spec/).
+
+### Operation Details
+
+Some operations have specific behaviors and modes. Below are important details
to consider:
+
+#### Namespace Operations
+
+**CreateNamespace** supports three modes:
+- `create`: Fails if namespace already exists
+- `exist_ok`: Succeeds even if namespace exists
+- `overwrite`: Replaces existing namespace
+
+**DropNamespace** behavior:
+- Recursively deletes all child namespaces and tables
+- Deletes both metadata and Lance data files
+- Operation is irreversible
+
+#### Table Operations
+
+**RegisterTable vs CreateTable**:
+- **RegisterTable**: Links existing Lance datasets into Gravitino catalog
without data movement
+- **CreateTable**: Creates new Lance table with schema and writes data files
+
+**DropTable vs DeregisterTable**:
+- **DropTable**: Permanently deletes metadata and data files from storage
+- **DeregisterTable**: Removes metadata from Gravitino but preserves Lance
data files
+
+
+## Deployment
+
+### Running with Gravitino Server
+
+To enable the Lance REST service within Gravitino server, configure the
following properties in your Gravitino configuration file:
+
+| Configuration Property                    | Description                                                                  | Default Value           | Required | Since Version |
+|-------------------------------------------|------------------------------------------------------------------------------|-------------------------|----------|---------------|
+| `gravitino.auxService.names`              | Auxiliary services to run. Include `lance-rest` to enable Lance REST service | iceberg-rest,lance-rest | Yes      | 0.2.0         |
+| `gravitino.lance-rest.classpath`          | Classpath for Lance REST service, relative to Gravitino home directory       | lance-rest-server/libs  | Yes      | 1.1.0         |
+| `gravitino.lance-rest.httpPort`           | Port number for Lance REST service                                           | 9101                    | Yes      | 1.1.0         |
+| `gravitino.lance-rest.host`               | Hostname for Lance REST service                                              | 0.0.0.0                 | Yes      | 1.1.0         |
+| `gravitino.lance-rest.namespace-backend`  | Namespace metadata backend (currently only `gravitino` is supported)         | gravitino               | Yes      | 1.1.0         |
+| `gravitino.lance-rest.gravitino-uri`      | Gravitino server URI (required when namespace-backend is `gravitino`)        | http://localhost:8090   | Yes      | 1.1.0         |
+| `gravitino.lance-rest.gravitino-metalake` | Gravitino metalake name (required when namespace-backend is `gravitino`)     | (none)                  | Yes      | 1.1.0         |
+
+**Example Configuration:**
+
+```properties
+gravitino.auxService.names = lance-rest
+gravitino.lance-rest.httpPort = 9101
+gravitino.lance-rest.host = 0.0.0.0
+gravitino.lance-rest.namespace-backend = gravitino
+gravitino.lance-rest.gravitino.uri = http://localhost:8090
+gravitino.lance-rest.gravitino.metalake-name = my_metalake
+```
+
+### Running Standalone
+
+To run the Lance REST service as a standalone process, separate from the Gravitino server:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+Configure the service by editing
`{GRAVITINO_HOME}/conf/gravitino-lance-rest-server.conf` or passing
command-line arguments:
+
+| Configuration Property | Description
| Default Value | Required | Since Version |
+|------------------------------------------------|-----------------------------|-----------------------|----------|---------------|
+| `gravitino.lance-rest.namespace-backend` | Namespace metadata backend
| gravitino | Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI
| http://localhost:8090 | Yes | 1.1.0 |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name
| (none) | Yes | 1.1.0 |
+| `gravitino.lance-rest.httpPort` | Service port number
| 9101 | No | 1.1.0 |
+| `gravitino.lance-rest.host` | Service hostname
| 0.0.0.0 | No | 1.1.0 |
+
+:::tip
+In most cases, you only need to set
`gravitino.lance-rest.gravitino.metalake-name`; the other properties can keep
their default values.
+:::
+
+
+### Running with Docker
+
+Launch Lance REST service using Docker:
+
+```shell
+docker run -d --name lance-rest-service -p 9101:9101 \
+ -e LANCE_REST_GRAVITINO_URI=http://gravitino-host:8090 \
+ -e LANCE_REST_GRAVITINO_METALAKE_NAME=your_metalake_name \
+ apache/gravitino-lance-rest:latest
+```
+
+Access the service at `http://localhost:9101`.
+
+**Environment Variables:**
+
+| Environment Variable | Configuration Property
| Required | Default Value | Since Version |
+|--------------------------------------|-------------------------------------------|----------|-------------------------|---------------|
+| `LANCE_REST_NAMESPACE_BACKEND` |
`gravitino.lance-rest.namespace-backend` | No | `gravitino`
| 1.1.0 |
+| `LANCE_REST_GRAVITINO_METALAKE_NAME` |
`gravitino.lance-rest.gravitino-metalake` | Yes | (none)
| 1.1.0 |
+| `LANCE_REST_GRAVITINO_URI` | `gravitino.lance-rest.gravitino-uri`
| No | `http://localhost:8090` | 1.1.0 |
+| `LANCE_REST_HOST` | `gravitino.lance-rest.host`
| No | `0.0.0.0` | 1.1.0 |
+| `LANCE_REST_PORT` | `gravitino.lance-rest.httpPort`
| No | `9101` | 1.1.0 |
+
+:::tip Configuration Tips
+- **Required:** Set `LANCE_REST_GRAVITINO_METALAKE_NAME` to your Gravitino
metalake name
+- **Conditional:** Update `LANCE_REST_GRAVITINO_URI` if Gravitino server is
not on `localhost`
+- **Optional:** Other variables can use default values unless you have
specific requirements
+:::
+
+
+## Usage Guidelines
+
+When using Lance REST service with Gravitino backend, keep the following
considerations in mind:
+
+### Prerequisites
+- A running Gravitino server with a created metalake
+
+### Namespace Hierarchy
+Gravitino follows a three-level hierarchy: **catalog → schema → table**. When
creating namespaces or tables:
+
+1. **Parent must exist:** Before creating `lance_catalog/schema`, ensure
`lance_catalog` catalog exists in Gravitino metalake.
+2. **Two-level limit:** You can create namespace `lance_catalog/schema`, but
**not** `lance_catalog/schema/sub_schema`.
+3. **Table placement:** Tables can only be created under
`lance_catalog/schema`, not at catalog level.
+
+**Example Hierarchy:**
+```
+metalake
+└── lance_catalog (catalog - create via REST)
+ └── schema (namespace - create via REST)
+ └── table01 (table - create via REST)
+```
+
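A client could check these hierarchy rules before sending a request. The sketch below is only an illustration of the three rules above: `check_namespace` and `check_table` are hypothetical helpers, not part of Gravitino or the Lance REST API, and they check only path depth (they do not verify that the parent exists).

```python
from typing import Sequence


def check_namespace(levels: Sequence[str]) -> None:
    """Accept `catalog` or `catalog/schema`; reject anything deeper."""
    if not 1 <= len(levels) <= 2 or not all(levels):
        raise ValueError(
            f"namespace must have 1 or 2 non-empty levels: {list(levels)}")


def check_table(levels: Sequence[str]) -> None:
    """A table must sit directly under `catalog/schema`."""
    if len(levels) != 3 or not all(levels):
        raise ValueError(
            f"table must be catalog/schema/table: {list(levels)}")


check_namespace("lance_catalog/schema".split("/"))      # accepted
check_table("lance_catalog/schema/table01".split("/"))  # accepted
try:
    check_namespace("lance_catalog/schema/sub_schema".split("/"))
except ValueError as err:
    print(f"rejected: {err}")
```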
+### Delimiter Convention
+
+The Lance REST API uses `$` as the default delimiter to separate namespace
levels in URIs. When making HTTP requests:
+
+- **URL Encoding Required**: `$` must be URL-encoded as `%24`
+- **Example**: `lance_catalog$schema$table01` becomes
`lance_catalog%24schema%24table01` in URLs
+
+**Common Delimiters:**
+```
+Namespace path: lance_catalog.schema.table01
+URI representation: lance_catalog$schema$table01
+URL encoded: lance_catalog%24schema%24table01
+```
+
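As a minimal sketch of the encoding rule above, the following Python snippet joins namespace levels with `$` and percent-encodes the result using the standard library's `urllib.parse.quote`. The helper name `encode_lance_id` is hypothetical and only illustrates the convention.

```python
from urllib.parse import quote

DELIMITER = "$"  # default Lance REST namespace delimiter, per the text above


def encode_lance_id(*levels: str) -> str:
    """Join namespace levels with the delimiter and percent-encode the result.

    Hypothetical helper for illustration only.
    """
    raw = DELIMITER.join(levels)
    # safe="" makes quote() encode every reserved character, including "$".
    return quote(raw, safe="")


encoded = encode_lance_id("lance_catalog", "schema", "table01")
print(encoded)  # lance_catalog%24schema%24table01
```

The encoded value can then be substituted for `{id}` in a path such as `/lance/v1/table/{id}/deregister`.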
+:::caution Important Limitations
+- Currently supports only **two levels of namespaces** before tables
+- Tables **cannot** be nested deeper than schema level
+- Parent catalog must be created in Gravitino before using Lance REST API
+- Metadata operations require Gravitino server to be available
+- Namespace deletion is recursive and irreversible
Review Comment:
Missing closing tag for the caution admonition block. The `:::caution
Important Limitations` block (line 222) is not properly closed with `:::`
before the next section starts.
```suggestion
- Namespace deletion is recursive and irreversible
:::
```
##########
docs/lakehouse-generic-lance-table.md:
##########
@@ -0,0 +1,335 @@
+
+
+---
+title: "Generic lakehouse catalog with Lance"
+slug: /lakehouse-generic-catalog-with-lance
+keywords:
+- lakehouse
+- lance
+- metadata
+- generic catalog
+- file system
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+## Overview
+
+This document describes how to use Apache Gravitino to manage a generic
lakehouse catalog using Lance as the underlying table format.
+
+
+## Table Management
+
+### Supported Operations
+
+For Lance tables in a Generic Lakehouse Catalog, the following table
summarizes supported operations:
+
+| Operation | Support Status |
+|-----------|----------------|
+| List | ✅ Full |
+| Load | ✅ Full |
+| Alter | Not supported |
+| Create | ✅ Full |
+| Register | ✅ Full |
+| Drop | ✅ Full |
+| Truncate | ✅ Full |
+
+:::note Feature Limitations
+- **Partitioning:** Not currently supported
+- **Sort Orders:** Not currently supported
+- **Distributions:** Not currently supported
+:::
+
+### Data Type Mappings
+
+Lance uses Apache Arrow for table schemas. The following table shows type
mappings between Gravitino and Arrow:
+
+| Gravitino Type | Arrow Type |
+|----------------------------------|-----------------------------------------|
+| `Struct` | `Struct` |
+| `Map` | `Map` |
+| `List` | `Array` |
+| `Boolean` | `Boolean` |
+| `Byte` | `Int8` |
+| `Short` | `Int16` |
+| `Integer` | `Int32` |
+| `Long` | `Int64` |
+| `Float` | `Float` |
+| `Double` | `Double` |
+| `String` | `Utf8` |
+| `Binary` | `Binary` |
+| `Decimal(p, s)` | `Decimal(p, s)` (128-bit) |
+| `Date` | `Date` |
+| `Timestamp`/`Timestamp(6)` | `TimestampType Microsecond withoutZone` |
+| `Timestamp(0)` | `TimestampType Second withoutZone` |
+| `Timestamp(3)` | `TimestampType Millisecond withoutZone` |
+| `Timestamp(9)` | `TimestampType Nanosecond withoutZone` |
+| `Timestamp_tz`/`Timestamp_tz(6)` | `TimestampType Microsecond withUtc` |
+| `Timestamp_tz(0)` | `TimestampType Second withUtc` |
+| `Timestamp_tz(3)` | `TimestampType Millisecond withUtc` |
+| `Timestamp_tz(9)` | `TimestampType Nanosecond withUtc` |
+| `Time`/`Time(9)` | `Time Nanosecond` |
+| `Null` | `Null` |
+| `Fixed(n)` | `Fixed-Size Binary(n)` |
+| `Interval_year` | `Interval(YearMonth)` |
+| `Interval_day` | `Duration(Microsecond)` |
+| `External(arrow_field_json_str)` | Any Arrow Field |
+
+### External Type Support
+
+For Arrow types not natively mapped in Gravitino, use the
`External(arrow_field_json_str)` type, which accepts a JSON string
representation of an Arrow `Field`.
+
+**Requirements:**
+- JSON must conform to Apache Arrow [Field
specification](https://github.com/apache/arrow-java/blob/ed81e5981a2bee40584b3a411ed755cb4cc5b91f/vector/src/main/java/org/apache/arrow/vector/types/pojo/Field.java#L80C1-L86C68)
+- `name` attribute must match column name exactly
+- `nullable` attribute must match column nullability
+- `children` array:
+ - Empty for primitive types
+ - Contains child field definitions for complex types (Struct, List)
+
+**Examples:**
+
+| Arrow Type | External Type Definition
|
+|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `Large Utf8` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largeutf8\"},\"children\":[]}")`
|
+| `Large Binary` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largebinary\"},\"children\":[]}")`
|
+| `Large List` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largelist\"},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",\"bitWidth\":32,\"isSigned\":true},\"children\":[]}]}")`
|
+| `Fixed-Size List` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"fixedsizelist\",\"listSize\":10},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",\"bitWidth\":32,\"isSigned\":true},\"children\":[]}]}")`
|
+
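Writing the escaped JSON by hand is error-prone, so one option is to build the field as a dictionary and serialize it. The sketch below uses only Python's standard `json` module; the helper names `arrow_field_json` and `external_type` are hypothetical, and the field layout simply mirrors the examples in the table above.

```python
import json
from typing import Any, Dict, List, Optional


def arrow_field_json(name: str, arrow_type: Dict[str, Any],
                     nullable: bool = True,
                     children: Optional[List[Dict[str, Any]]] = None) -> str:
    """Serialize an Arrow-style field description to compact JSON."""
    field = {
        "name": name,          # must match the column name exactly
        "nullable": nullable,  # must match the column nullability
        "type": arrow_type,
        "children": children or [],  # empty for primitive types
    }
    return json.dumps(field, separators=(",", ":"))


def external_type(field_json: str) -> str:
    """Wrap the JSON in External("..."), escaping the inner quotes."""
    return f"External({json.dumps(field_json)})"


large_utf8 = arrow_field_json("col_name", {"name": "largeutf8"})
print(large_utf8)
# {"name":"col_name","nullable":true,"type":{"name":"largeutf8"},"children":[]}
print(external_type(large_utf8))
```

For complex types, pass the child field dictionaries through `children`, as in the `Large List` and `Fixed-Size List` rows.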
+### Table Properties
+
+Required and optional properties for tables in a Generic Lakehouse Catalog:
+
+| Property | Description
| Default | Required | Since
Version |
+|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|--------------|---------------|
+| `format` | Table format: `lance`, `iceberg`, etc. (currently
only `lance` is fully supported)
| (none) | Yes | 1.1.0
|
+| `location` | Storage path for table metadata and data. Lance
currently supports S3, GCS, OSS, AZ, File, Memory, and file-object-store.
| (none) | Conditional* |
1.1.0 |
+| `external` | Whether the data directory is an external location.
If `true`, dropping the table removes only the metadata in Gravitino and
keeps the data directory, while purging the table deletes both. For a
non-external table, dropping deletes both metadata and data.
| false | No |
1.1.0 |
+| `lance.creation-mode` | Creation mode. When creating a table, it can be
`CREATE`, `EXIST_OK`, or `OVERWRITE`. When registering a table, it must be
`CREATE` or `OVERWRITE`.
| `CREATE` | No |
1.1.0 |
+| `lance.register` | Whether this is a register-table operation. Registering
does not actually create the data directory; it is the user's responsibility
to create and manage it.
| false | No |
1.1.0 |
+| `lance.storage.xxxx` | Any additional storage-specific properties required
by the Lance format (e.g., S3 credentials, HDFS configs). Replace `xxxx` with
the actual property name. For example, use
`lance.storage.aws_access_key_id` to set the S3 `aws_access_key_id` when using
an S3 location. For details, see
https://lancedb.com/docs/storage/integrations/ | (none) | No |
1.1.0 |
+
+
+**Location Requirement:** Must be specified at catalog, schema, or table
level. See [Location
Resolution](./lakehouse-generic-catalog.md#key-property-location).
+
+You may also set additional properties specific to your lakehouse format or
custom requirements.
+
+### Index Support
+
+Index capabilities vary by lakehouse format. The following table shows Lance
format support:
+
+| Index Type | Description
| Lance Support |
+|------------|----------------------------------------------------------------------------------------------|---------------|
+| SCALAR | Optimizes searches on scalar data types (integers, floats,
etc.) | ✅ |
+| VECTOR | Optimizes similarity searches in high-dimensional vector spaces
| ✅ |
+| BTREE | Balanced tree for sorted data with logarithmic
search/insert/delete complexity | ✅ |
+| INVERTED | Full-text search optimization through term-to-location mapping
| ✅ |
+| IVF_FLAT | Vector search with inverted file and flat quantization
| ✅ |
+| IVF_SQ | Vector search with scalar quantization for memory efficiency
| ✅ |
+| IVF_PQ | Vector search with product quantization balancing accuracy and
memory | ✅ |
+
+:::caution Index Creation Limitation
+**Lance tables do not support index creation during table creation.** You must:
+1. Create the table first
+2. Then create indexes on the created table
+
+This is a fundamental limitation of Lance format, not Gravitino.
+:::
+
+For more details on table indexes, see [Table Partitioning, Distribution, Sort
Ordering, and
Indexes](./manage-relational-metadata-using-gravitino.md#table-partitioning-distribution-sort-ordering-and-indexes).
+
+### Table Operations
+
+Table operations follow standard relational catalog patterns. See [Table
Operations](./manage-relational-metadata-using-gravitino.md#table-operations)
for comprehensive documentation.
+
+The following sections provide examples and important details for working with
Lance tables.
+
+#### Creating a Lance Table
+
+<Tabs groupId='language' queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+ -H "Content-Type: application/json" -d '{
+ "name": "lance_table",
+ "comment": "Example Lance table",
+ "columns": [
+ {
+ "name": "id",
+ "type": "integer",
+ "comment": "Primary identifier",
+ "nullable": false,
+ "autoIncrement": true,
+ "defaultValue": {
+ "type": "literal",
+ "dataType": "integer",
+ "value": "-1"
+ }
Review Comment:
Documentation inconsistency: The example shows using `autoIncrement: true`
and `defaultValue` (lines 166-171), but the catalog support tables indicate
that `lakehouse-generic` does not support default values (line 996) or
auto-increment (line 1014). Either remove these unsupported features from the
example or update the support tables if Lance does support them.
```suggestion
"nullable": false
```