Re: [I] Support Vended Credentials for Azure Data Lake Store [iceberg-python]

2024-09-09 Thread via GitHub
c-thiel commented on issue #1146: URL: https://github.com/apache/iceberg-python/issues/1146#issuecomment-2339797459 @sungwy in my comment as well as in my comment I am using "adls.sas-token" which is exactly what Java and Spark expect: https://github.com/apache/iceberg/blob/4873b4b7534d

Re: [PR] Core, Kafka, Spark: Use AssertJ instead of JUnit assertions [iceberg]

2024-09-09 Thread via GitHub
findepi commented on PR #11102: URL: https://github.com/apache/iceberg/pull/11102#issuecomment-2339774571 thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsu

Re: [PR] feat: support lower_bound&&upper_bound for parquet writer [iceberg-rust]

2024-09-09 Thread via GitHub
xxchan commented on code in PR #383: URL: https://github.com/apache/iceberg-rust/pull/383#discussion_r1751316492 ## crates/iceberg/src/arrow/schema.rs: ## @@ -35,6 +35,9 @@ use rust_decimal::prelude::ToPrimitive; use std::collections::HashMap; use std::sync::Arc; +/// When i

Re: [PR] Core, Kafka, Spark: Use AssertJ instead of JUnit assertions [iceberg]

2024-09-09 Thread via GitHub
nastra merged PR #11102: URL: https://github.com/apache/iceberg/pull/11102

Re: [PR] feat: Reassign field ids for schema [iceberg-rust]

2024-09-09 Thread via GitHub
Xuanwo commented on PR #615: URL: https://github.com/apache/iceberg-rust/pull/615#issuecomment-2339690333 cc @liurenjie1024 would you like to take a look too?

Re: [PR] [API + Avro] Add default value APIs and Avro implementation [iceberg]

2024-09-09 Thread via GitHub
manuzhang commented on code in PR #9502: URL: https://github.com/apache/iceberg/pull/9502#discussion_r1751206372 ## core/src/test/java/org/apache/iceberg/avro/TestReadDefaultValues.java: ## @@ -0,0 +1,237 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
amogh-jahagirdar commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1751149711 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs"

Re: [PR] Core: Parallelize manifest writing for many new files [iceberg]

2024-09-09 Thread via GitHub
karuppayya commented on code in PR #11086: URL: https://github.com/apache/iceberg/pull/11086#discussion_r1751128701 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -554,6 +562,84 @@ protected boolean cleanupAfterCommit() { return true; } + protec

Re: [PR] Core: Parallelize manifest writing for many new files [iceberg]

2024-09-09 Thread via GitHub
karuppayya commented on code in PR #11086: URL: https://github.com/apache/iceberg/pull/11086#discussion_r1751070837 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -554,6 +562,84 @@ protected boolean cleanupAfterCommit() { return true; } + protec

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1751106398 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs" +Pl

Re: [I] Adding RESTCatalog based Spark Smoke Test [iceberg]

2024-09-09 Thread via GitHub
haizhou-zhao commented on issue #11079: URL: https://github.com/apache/iceberg/issues/11079#issuecomment-2339395130 Found several issues while attempting to enable integration test based on REST Catalog for Spark. Listing them below as I troubleshoot them one by one.

[I] RESTSessionCatalog loadTable incorrect set last access time as metadata last update time [iceberg]

2024-09-09 Thread via GitHub
haizhou-zhao opened a new issue, #11103: URL: https://github.com/apache/iceberg/issues/11103 ### Apache Iceberg version 1.6.1 (latest release) ### Query engine Spark ### Please describe the bug 🐞 ## Background When using Spark (or likewise execution engin

Re: [PR] Core: Parallelize manifest writing for many new files [iceberg]

2024-09-09 Thread via GitHub
dramaticlly commented on code in PR #11086: URL: https://github.com/apache/iceberg/pull/11086#discussion_r1751080154 ## core/src/test/java/org/apache/iceberg/TestSnapshotProducer.java: ## @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Core: Add KLL Datasketch and Hive ColumnStatisticsObj as standard blo… [iceberg]

2024-09-09 Thread via GitHub
github-actions[bot] commented on PR #8202: URL: https://github.com/apache/iceberg/pull/8202#issuecomment-2339369354 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Support partial insert in merge into command [iceberg]

2024-09-09 Thread via GitHub
github-actions[bot] commented on issue #8199: URL: https://github.com/apache/iceberg/issues/8199#issuecomment-2339369339 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Request to add KLL Datasketch and hive ColumnStatisticsObj and as standard blob types to puffin file. [iceberg]

2024-09-09 Thread via GitHub
github-actions[bot] commented on issue #8198: URL: https://github.com/apache/iceberg/issues/8198#issuecomment-2339369316 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Core: check location for conflict before creating table [iceberg]

2024-09-09 Thread via GitHub
github-actions[bot] commented on PR #8194: URL: https://github.com/apache/iceberg/pull/8194#issuecomment-2339369273 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Iceberg Java Api - S3 Session Token - 403 Forbidden exception [iceberg]

2024-09-09 Thread via GitHub
github-actions[bot] commented on issue #8190: URL: https://github.com/apache/iceberg/issues/8190#issuecomment-2339369252 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] API: Convert Evaluator in expression to use a Comparator [iceberg]

2024-09-09 Thread via GitHub
github-actions[bot] closed pull request #7883: API: Convert Evaluator in expression to use a Comparator URL: https://github.com/apache/iceberg/pull/7883

Re: [PR] API: Convert Evaluator in expression to use a Comparator [iceberg]

2024-09-09 Thread via GitHub
github-actions[bot] commented on PR #7883: URL: https://github.com/apache/iceberg/pull/7883#issuecomment-2339368920 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Aliyun: Migrate tests to junit5 for aliyun client factory [iceberg]

2024-09-09 Thread via GitHub
github-actions[bot] commented on PR #7853: URL: https://github.com/apache/iceberg/pull/7853#issuecomment-2339368891 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Aliyun: Migrate tests to junit5 for aliyun client factory [iceberg]

2024-09-09 Thread via GitHub
github-actions[bot] closed pull request #7853: Aliyun: Migrate tests to junit5 for aliyun client factory URL: https://github.com/apache/iceberg/pull/7853

Re: [PR] Cache Manifest files [iceberg-python]

2024-09-09 Thread via GitHub
corleyma commented on code in PR #787: URL: https://github.com/apache/iceberg-python/pull/787#discussion_r1751060909 ## pyiceberg/table/snapshots.py: ## @@ -228,6 +229,13 @@ def __eq__(self, other: Any) -> bool: ) +@lru_cache Review Comment: is 128 big enough?

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
amogh-jahagirdar commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750965732 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs"

Re: [PR] Bump `duckdb` to version `1.1.0` [iceberg-python]

2024-09-09 Thread via GitHub
kevinjqliu commented on PR #1149: URL: https://github.com/apache/iceberg-python/pull/1149#issuecomment-2339300626 Updated the PR to only change duckdb from v1.0.0 to v1.1.0 in the poetry.lock file

[PR] Bump pydantic from 2.9.0 to 2.9.1 [iceberg-python]

2024-09-09 Thread via GitHub
dependabot[bot] opened a new pull request, #1154: URL: https://github.com/apache/iceberg-python/pull/1154 Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.9.0 to 2.9.1. Release notes Sourced from [pydantic's releases](https://github.com/pydantic/pydantic/releases).

[PR] Bump duckdb from 1.0.0 to 1.1.0 [iceberg-python]

2024-09-09 Thread via GitHub
dependabot[bot] opened a new pull request, #1152: URL: https://github.com/apache/iceberg-python/pull/1152 Bumps [duckdb](https://github.com/duckdb/duckdb) from 1.0.0 to 1.1.0. Release notes Sourced from [duckdb's releases](https://github.com/duckdb/duckdb/releases). DuckDB 1.1

Re: [I] Bump arrow to 53 [iceberg-rust]

2024-09-09 Thread via GitHub
sdd commented on issue #622: URL: https://github.com/apache/iceberg-rust/issues/622#issuecomment-2339158328 This is blocked on DataFusion. The most recently released version of DataFusion, v41.0.0, depends on arrow-* 52. Once https://github.com/apache/datafusion/pull/12032 is in a rel

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
flyrain commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750948304 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs" +Pl

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750946862 ## open-api/rest-catalog-open-api.yaml: ## @@ -2774,6 +3062,140 @@ components: additionalProperties: type: string +ScanTasks: + type:

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750944397 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,264 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/na

Re: [PR] Core: Parallelize manifest writing for many new files [iceberg]

2024-09-09 Thread via GitHub
amogh-jahagirdar commented on code in PR #11086: URL: https://github.com/apache/iceberg/pull/11086#discussion_r1750840473 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -554,6 +562,84 @@ protected boolean cleanupAfterCommit() { return true; } +

Re: [PR] Bump `duckdb` to version `1.1.0` [iceberg-python]

2024-09-09 Thread via GitHub
kevinjqliu commented on code in PR #1149: URL: https://github.com/apache/iceberg-python/pull/1149#discussion_r1750910730 ## pyproject.toml: ## @@ -64,7 +64,7 @@ zstandard = ">=0.13.0,<1.0.0" tenacity = ">=8.2.3,<10.0.0" pyarrow = { version = ">=14.0.0,<18.0.0", optional = true

Re: [PR] Bump `duckdb` to version `1.1.0` [iceberg-python]

2024-09-09 Thread via GitHub
sungwy commented on code in PR #1149: URL: https://github.com/apache/iceberg-python/pull/1149#discussion_r1750905022 ## pyproject.toml: ## @@ -64,7 +64,7 @@ zstandard = ">=0.13.0,<1.0.0" tenacity = ">=8.2.3,<10.0.0" pyarrow = { version = ">=14.0.0,<18.0.0", optional = true }

Re: [I] EPIC: Rust Based Compaction [iceberg-rust]

2024-09-09 Thread via GitHub
kevinjqliu commented on issue #624: URL: https://github.com/apache/iceberg-rust/issues/624#issuecomment-2339045843 Thanks for starting this! Linking the relevant issue from the pyiceberg side, [iceberg-python/#1092](https://github.com/apache/iceberg-python/issues/1092) Would be great to

Re: [I] Error: `table_type` missing from table parameters when loading table from Hive metastore [iceberg-python]

2024-09-09 Thread via GitHub
kevinjqliu commented on issue #1150: URL: https://github.com/apache/iceberg-python/issues/1150#issuecomment-2338993156 > Are only tables created by pyiceberg supported here? Anyone can create an iceberg table using HMS, which can be read by PyIceberg. In HMS, the assumption is that i

Re: [I] Modify SQLCatalog initialization so that classes are not always created and update how these classes are created to be more open to tother DB's [iceberg-python]

2024-09-09 Thread via GitHub
sungwy commented on issue #1148: URL: https://github.com/apache/iceberg-python/issues/1148#issuecomment-2338975789 I work in financial services as well, and actually have the same separation between DDL and DML. But I agree with your point in that it would be best to not generalize the pra

Re: [I] Error: `table_type` missing from table parameters when loading table from Hive metastore [iceberg-python]

2024-09-09 Thread via GitHub
edgarrmondragon commented on issue #1150: URL: https://github.com/apache/iceberg-python/issues/1150#issuecomment-2338970552 > Who created the table in this case? When PyIceberg creates the table, it injects the `table_type` property I suppose it was created by a third-party and not b

Re: [I] Modify SQLCatalog initialization so that classes are not always created and update how these classes are created to be more open to tother DB's [iceberg-python]

2024-09-09 Thread via GitHub
isc-patrick commented on issue #1148: URL: https://github.com/apache/iceberg-python/issues/1148#issuecomment-2338957758 I had not yet seen the change introducing load_catalog instead of using the SQLCatalog constructor, so just updated to v0.70 - I was away for couple weeks. My point

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
amogh-jahagirdar commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750815966 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs"

[PR] Impl rest catalog + table updates & requirements [iceberg-go]

2024-09-09 Thread via GitHub
jwtryg opened a new pull request, #146: URL: https://github.com/apache/iceberg-go/pull/146 Hi @zeroshade I think it's really cool that you are working on a golang-iceberg implementation, and I would like to contribute if I can. I have tried to finish the rest catalog implementatio

Re: [I] Modify SQLCatalog initialization so that classes are not always created and update how these classes are created to be more open to tother DB's [iceberg-python]

2024-09-09 Thread via GitHub
sungwy commented on issue #1148: URL: https://github.com/apache/iceberg-python/issues/1148#issuecomment-2338853251 Given that the catalog is instantiated through `load_catalog` entrypoint function, I think we'd actually need to introduce a catalog property as well, if we decide to introduc

Re: [PR] AWS: Add configuration and set better defaults for S3 retry behaviour [iceberg]

2024-09-09 Thread via GitHub
ookumuso commented on PR #11052: URL: https://github.com/apache/iceberg/pull/11052#issuecomment-2338851624 Removed all InputStream related changes in favor of #10433

Re: [I] Modify SQLCatalog initialization so that classes are not always created and update how these classes are created to be more open to tother DB's [iceberg-python]

2024-09-09 Thread via GitHub
isc-patrick commented on issue #1148: URL: https://github.com/apache/iceberg-python/issues/1148#issuecomment-2338829111 My thought was to pass an additional argument into init that allowed you to turn off creation of tables. My concern is that I don't think you will want this auto-create t

Re: [PR] Python: Add support for Python 3.12 [iceberg-python]

2024-09-09 Thread via GitHub
kevinjqliu closed pull request #35: Python: Add support for Python 3.12 URL: https://github.com/apache/iceberg-python/pull/35

Re: [PR] support python 3.12 [iceberg-python]

2024-09-09 Thread via GitHub
kevinjqliu commented on PR #254: URL: https://github.com/apache/iceberg-python/pull/254#issuecomment-2338806960 hey @MehulBatra, can i close this in favor of #1068?

[PR] Bump `duckdb` to version `1.1.0` [iceberg-python]

2024-09-09 Thread via GitHub
kevinjqliu opened a new pull request, #1149: URL: https://github.com/apache/iceberg-python/pull/1149 Duckdb version 1.1.0 added support for automatic retries when installing extensions (https://github.com/duckdb/duckdb/pull/13122). This will help resolve the intermittent CI issue observed

Re: [PR] fix: Invert `case_sensitive` logic in StructType [iceberg-python]

2024-09-09 Thread via GitHub
AnthonyLam commented on PR #1147: URL: https://github.com/apache/iceberg-python/pull/1147#issuecomment-2338752007 > Hello @AnthonyLam, thank you for your contribution. Could you add a test to ensure the expected behavior? For sure! I've added a test in `test_types.py`. Let me know if

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
dramaticlly commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750692281 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,264 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix

Re: [I] Iceberg Glue Concurrent Update can result in missing metadata_location [iceberg]

2024-09-09 Thread via GitHub
singhpk234 commented on issue #9411: URL: https://github.com/apache/iceberg/issues/9411#issuecomment-2338733735 @shaeqahmed are you seeing this log line too ? ``` LOG.warn( "Received unexpected failure when committing to {}, validating if commit ended up succeedi

Re: [I] Modify SQLCatalog initialization so that classes are not always created and update how these classes are created to be more open to tother DB's [iceberg-python]

2024-09-09 Thread via GitHub
sungwy commented on issue #1148: URL: https://github.com/apache/iceberg-python/issues/1148#issuecomment-2338726996 Hi @isc-patrick - thank you for raising this. These are great points. I'm wondering if we can keep the current approach of running `_ensure_table_exists` within the inst

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
flyrain commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750675084 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,264 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/na

Re: [PR] DRAFT - Issue 10275 - Reward support for nulls [iceberg]

2024-09-09 Thread via GitHub
slessard commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1750566040 ## arrow/src/test/java/org/apache/iceberg/arrow/vectorized/ArrowReaderTest.java: ## @@ -262,6 +264,111 @@ public void testReadColumnFilter2() throws Exception {

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
flyrain commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750651482 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs" +Pl

Re: [I] Data files which are still useful are mistakenly cleaned up when trying to expire a specified snapshot [iceberg]

2024-09-09 Thread via GitHub
hantangwangd closed issue #10982: Data files which are still useful are mistakenly cleaned up when trying to expire a specified snapshot URL: https://github.com/apache/iceberg/issues/10982

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
amogh-jahagirdar commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750583821 ## open-api/rest-catalog-open-api.yaml: ## @@ -541,6 +541,264 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{p

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
amogh-jahagirdar commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750572264 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs"

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-09 Thread via GitHub
danielcweeks commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1750551470 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs" +

Re: [PR] Core: Fix the behavior of IncrementalFileCleanup when expire a snapshot [iceberg]

2024-09-09 Thread via GitHub
amogh-jahagirdar merged PR #10983: URL: https://github.com/apache/iceberg/pull/10983

[I] Modify SQLCatalog initialization so that classes are not always created and update how these classes are created to be more open to tother DB's [iceberg-python]

2024-09-09 Thread via GitHub
isc-patrick opened a new issue, #1148: URL: https://github.com/apache/iceberg-python/issues/1148 ### Feature Request / Improvement I think there are 2 issues with the SQLCatalog constructor: https://github.com/apache/iceberg-python/blob/d587e6724685744918ecf192724437182ad01abf/

Re: [I] Kafka Connect: Record projection Index out of bounds error [iceberg]

2024-09-09 Thread via GitHub
bryanck commented on issue #11099: URL: https://github.com/apache/iceberg/issues/11099#issuecomment-2338355847 Thanks for reporting this, does this require a fix or was it something else?

Re: [PR] Kafka Connect: add option to force columns to lowercase [iceberg]

2024-09-09 Thread via GitHub
bryanck commented on code in PR #11100: URL: https://github.com/apache/iceberg/pull/11100#discussion_r1750375359 ## kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/IcebergSinkConfig.java: ## @@ -79,6 +79,8 @@ public class IcebergSinkConfig extends AbstractCo

Re: [PR] Spark 3.3, 3.4: Fix incorrect catalog loaded in TestCreateActions [iceberg]

2024-09-09 Thread via GitHub
nastra merged PR #11049: URL: https://github.com/apache/iceberg/pull/11049

Re: [PR] Kafka Connect: add option to force columns to lowercase [iceberg]

2024-09-09 Thread via GitHub
bryanck commented on PR #11100: URL: https://github.com/apache/iceberg/pull/11100#issuecomment-2338262871 Note, this addresses https://github.com/apache/iceberg/issues/11091

Re: [I] Spark configuration for amazon access key and secret key with glue catalog for apache Iceberg is not honoring [iceberg]

2024-09-09 Thread via GitHub
andythsu commented on issue #10078: URL: https://github.com/apache/iceberg/issues/10078#issuecomment-2338251099 @clamar14 @nastra I ended up using different jars and putting everything together instead of using `iceberg-aws-bundle` ``` def get_builder() -> SparkSession.Builder:

Re: [I] Spark Streaming Job with multiple queries MERGE INTO the same target table (Runtime file filtering is not possible) [iceberg]

2024-09-09 Thread via GitHub
RussellSpitzer commented on issue #11094: URL: https://github.com/apache/iceberg/issues/11094#issuecomment-2338247842 > @eric-maynard I have other examples where I do what you suggest. If I move this inside `start_streaming_query` the builder will simply return the existing spark session `g

Re: [PR] AWS: Add configuration and set better defaults for S3 retry behaviour [iceberg]

2024-09-09 Thread via GitHub
nastra commented on code in PR #11052: URL: https://github.com/apache/iceberg/pull/11052#discussion_r1750335281 ## aws/src/test/java/org/apache/iceberg/aws/s3/TestS3FileIOProperties.java: ## @@ -491,4 +493,17 @@ public void testApplyUserAgentConfigurations() { Mockito.verif

Re: [PR] Flink: Avoid metaspace memory leak by not registering ShutdownHook for ExecutorService in Flink [iceberg]

2024-09-09 Thread via GitHub
pvary commented on PR #11073: URL: https://github.com/apache/iceberg/pull/11073#issuecomment-2338174338 @fengjiajie: Does this mean, that you have submitting 1091 jobs in the span of 2 minutes? I see that the default timeout for the `MoreExecutors.getExitingExecutorService` is 120s ``

Re: [PR] Core: fix NPE with HadoopFileIO because FileIOParser doesn't serialize Hadoop configuration [iceberg]

2024-09-09 Thread via GitHub
pvary commented on code in PR #10926: URL: https://github.com/apache/iceberg/pull/10926#discussion_r1750267503 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopFileIO.java: ## @@ -63,7 +63,11 @@ public class HadoopFileIO implements HadoopConfigurable, DelegateFileIO {

Re: [PR] Core: Add support for `view-default` property in catalog [iceberg]

2024-09-09 Thread via GitHub
nastra commented on code in PR #11064: URL: https://github.com/apache/iceberg/pull/11064#discussion_r1750255921 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -107,6 +108,7 @@ public void basicCreateView() { assertThat(view.currentVersion().opera

Re: [PR] Core: Add support for `view-default` property in catalog [iceberg]

2024-09-09 Thread via GitHub
nastra commented on code in PR #11064: URL: https://github.com/apache/iceberg/pull/11064#discussion_r1750254414 ## open-api/src/testFixtures/java/org/apache/iceberg/rest/RCKUtils.java: ## @@ -85,7 +85,8 @@ static RESTCatalog initCatalogClient() { catalogProperties.putIfAbse

Re: [PR] scan: fix error when reading an empty table [iceberg-rust]

2024-09-09 Thread via GitHub
sdd commented on PR #608: URL: https://github.com/apache/iceberg-rust/pull/608#issuecomment-2338064061 Just to clarify, not having any snapshots is not necessarily the same as not having any data. If there is no current snapshot then there can't be any data, but someone could delete all dat

Re: [PR] Remove Hive 2 [iceberg]

2024-09-09 Thread via GitHub
nastra commented on code in PR #10996: URL: https://github.com/apache/iceberg/pull/10996#discussion_r1750200212 ## mr/src/main/java/org/apache/iceberg/mr/hive/serde/objectinspector/IcebergObjectInspector.java: ## @@ -27,33 +27,23 @@ import org.apache.hadoop.hive.serde2.typeinfo

Re: [PR] Remove Hive 2 [iceberg]

2024-09-09 Thread via GitHub
nastra commented on code in PR #10996: URL: https://github.com/apache/iceberg/pull/10996#discussion_r1750198331 ## mr/src/main/java/org/apache/iceberg/mr/hive/serde/objectinspector/IcebergObjectInspector.java: ## @@ -27,33 +27,23 @@ import org.apache.hadoop.hive.serde2.typeinfo

Re: [PR] TableMetadataBuilder [iceberg-rust]

2024-09-09 Thread via GitHub
Xuanwo commented on PR #587: URL: https://github.com/apache/iceberg-rust/pull/587#issuecomment-2338031867 I have reviewed most PRs that I am confident can be merged. The only one left is https://github.com/apache/iceberg-rust/pull/615, for which I need more input.

Re: [PR] Feat: Normalize TableMetadata [iceberg-rust]

2024-09-09 Thread via GitHub
Xuanwo merged PR #611: URL: https://github.com/apache/iceberg-rust/pull/611

Re: [I] Library public api isolation and import decoupling [iceberg-python]

2024-09-09 Thread via GitHub
ndrluis commented on issue #499: URL: https://github.com/apache/iceberg-python/issues/499#issuecomment-2338019734 I'll close this issue because we added the mypy linter, which solves our problem with coupling, and we have #1099 to solve the public API definition.

Re: [I] Library public api isolation and import decoupling [iceberg-python]

2024-09-09 Thread via GitHub
ndrluis closed issue #499: Library public api isolation and import decoupling URL: https://github.com/apache/iceberg-python/issues/499

Re: [PR] Spark 3.3, 3.4: Fix incorrect catalog loaded in TestCreateActions [iceberg]

2024-09-09 Thread via GitHub
nastra commented on code in PR #11049: URL: https://github.com/apache/iceberg/pull/11049#discussion_r1750181394 ## spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestCreateActions.java: ## @@ -728,6 +733,8 @@ public void testStructOfThreeLevelLists() throws Exce

Re: [I] Case sensitivity is not respected when using IcebergGenerics.ScanBuilder [iceberg]

2024-09-09 Thread via GitHub
mderoy closed issue #8178: Case sensitivity is not respected when using IcebergGenerics.ScanBuilder URL: https://github.com/apache/iceberg/issues/8178

Re: [PR] Feat: Normalize TableMetadata [iceberg-rust]

2024-09-09 Thread via GitHub
c-thiel commented on PR #611: URL: https://github.com/apache/iceberg-rust/pull/611#issuecomment-2337827094 > Thank you very much for working on this. I have a few style suggestions, but please feel free to disregard them if they don't suit you. Thanks @Xuanwo, they all make sense! Add

Re: [PR] TableMetadataBuilder [iceberg-rust]

2024-09-09 Thread via GitHub
c-thiel commented on PR #587: URL: https://github.com/apache/iceberg-rust/pull/587#issuecomment-2337769487 @liurenjie1024 thanks for the feedback! > > My first point of the opening statement: Do we re-write our SortOrder and add the schema to PartitionSpec so that we can match on name

Re: [PR] DRAFT - Issue 10275 - Reward support for nulls [iceberg]

2024-09-09 Thread via GitHub
nastra commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1749998444 ## arrow/src/test/java/org/apache/iceberg/arrow/vectorized/GenericArrowVectorAccessorFactoryTest.java: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] DRAFT - Issue 10275 - Reward support for nulls [iceberg]

2024-09-09 Thread via GitHub
nastra commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1749997916 ## arrow/src/test/java/org/apache/iceberg/arrow/vectorized/GenericArrowVectorAccessorFactoryTest.java: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] DRAFT - Issue 10275 - Reward support for nulls [iceberg]

2024-09-09 Thread via GitHub
nastra commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1749992612 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorHolder.java: ## @@ -140,14 +141,16 @@ public static class ConstantVectorHolder extends VectorHolder {

Re: [PR] feat: Reassign field ids for schema [iceberg-rust]

2024-09-09 Thread via GitHub
c-thiel commented on code in PR #615: URL: https://github.com/apache/iceberg-rust/pull/615#discussion_r1749940779 ## crates/iceberg/src/spec/schema.rs: ## @@ -86,6 +87,16 @@ impl SchemaBuilder { self } +/// Reassign all field-ids (nested) on build. +/// I

[PR] Core: Add Catalog Transactions API [iceberg]

2024-09-09 Thread via GitHub
nastra opened a new pull request, #6948: URL: https://github.com/apache/iceberg/pull/6948 I have written up a design doc, which is available [here](https://docs.google.com/document/d/1UxXifU8iqP_byaW4E2RuKZx1nobxmAvc5urVcWas1B8/edit?usp=sharing) I think eventually we'd want to split t