bobbai00 commented on code in PR #4272:
URL: https://github.com/apache/texera/pull/4272#discussion_r3012082316
##########
common/workflow-core/src/main/scala/org/apache/texera/amber/core/storage/result/iceberg/IcebergTableWriter.scala:
##########
@@ -107,10 +106,12 @@ private[storage] class IcebergTableWriter[T](
private def flushBuffer(): Unit = {
if (buffer.nonEmpty) {
// Create a unique file path using the writer's identifier and the filename index
- val filepath = Paths.get(table.location()).resolve(s"${writerIdentifier}_${filenameIdx}")
+ val location = table.location()
+ val basePath = if (location.endsWith("/")) location else s"$location/"
Review Comment:
I think Paths.get().resolve() already handles a location that ends with
"/"; can you double-check? I would like to avoid checking string characters
by hand.
Also, on systems like Windows the path separator is \, not /, so the current
approach is not portable.
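A minimal sketch of the check being asked for, assuming a local filesystem-style location. The object name, the helper `dataFilePath`, and the sample paths are hypothetical and not part of the PR. It only shows that `Paths.get(...).resolve(...)` gives the same result with or without a trailing separator:

```scala
import java.nio.file.{Path, Paths}

object ResolveCheck {
  // Hypothetical helper: builds the data file path the way the removed line did.
  def dataFilePath(location: String, writerIdentifier: String, filenameIdx: Int): Path =
    Paths.get(location).resolve(s"${writerIdentifier}_${filenameIdx}")

  def main(args: Array[String]): Unit = {
    // Sample locations are made up for illustration.
    val withoutSlash = dataFilePath("/tmp/warehouse/table", "writer", 0)
    val withSlash = dataFilePath("/tmp/warehouse/table/", "writer", 0)

    // Path parsing drops the redundant trailing separator, so both paths are equal.
    assert(withoutSlash == withSlash)
    println(withSlash)
  }
}
```

This does not cover a location that is a URI with a scheme. Whether table.location() can return one is not stated in this thread.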
##########
amber/src/main/python/core/storage/iceberg/iceberg_utils.py:
##########
@@ -153,6 +153,44 @@ def create_postgres_catalog(
)
+def create_rest_catalog(
+ catalog_name: str,
+ warehouse_name: str,
+ rest_uri: str,
+ s3_endpoint: str,
+ s3_region: str,
+ s3_username: str,
+ s3_password: str,
+) -> Catalog:
+ """
+ Creates a REST catalog instance by connecting to a REST endpoint.
+ - Configures the catalog to interact with a REST endpoint.
+ - The warehouse_name parameter specifies the warehouse identifier (name for Lakekeeper).
+ - Configures S3FileIO for MinIO/S3 storage backend.
+ :param catalog_name: the name of the catalog.
+ :param warehouse_name: the warehouse identifier (name for Lakekeeper).
Review Comment:
Remove the "Lakekeeper" from the comment
##########
amber/src/main/python/core/storage/iceberg/iceberg_utils.py:
##########
@@ -153,6 +153,44 @@ def create_postgres_catalog(
)
+def create_rest_catalog(
+ catalog_name: str,
+ warehouse_name: str,
+ rest_uri: str,
+ s3_endpoint: str,
+ s3_region: str,
+ s3_username: str,
+ s3_password: str,
+) -> Catalog:
+ """
+ Creates a REST catalog instance by connecting to a REST endpoint.
+ - Configures the catalog to interact with a REST endpoint.
+ - The warehouse_name parameter specifies the warehouse identifier (name for Lakekeeper).
Review Comment:
Remove the "Lakekeeper" from the comment
##########
sql/texera_lakekeeper.sql:
##########
@@ -0,0 +1,21 @@
+-- Licensed to the Apache Software Foundation (ASF) under one
Review Comment:
This file can go to the second PR