Re: [PR] Tests: Unify the test catalog setting [iceberg-python]

2024-04-15 Thread via GitHub
frankliee commented on code in PR #609: URL: https://github.com/apache/iceberg-python/pull/609#discussion_r1566818851 ## tests/conftest.py: ## @@ -2144,3 +2144,31 @@ def arrow_table_with_only_nulls(pa_schema: "pa.Schema") -> "pa.Table": import pyarrow as pa return p

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
nastra commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566814614 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestDelete.java: ## @@ -501,6 +503,38 @@ public void testDeleteNonExistingRecords() {

Re: [PR] chore(deps): Update volo-thrift requirement from 0.9.2 to 0.10.0 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 commented on PR #326: URL: https://github.com/apache/iceberg-rust/pull/326#issuecomment-2058338661 I think we may need to upgrade this manually since it contains api change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] chore(deps): Update volo-thrift requirement from 0.9.2 to 0.10.0 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 closed pull request #326: chore(deps): Update volo-thrift requirement from 0.9.2 to 0.10.0 URL: https://github.com/apache/iceberg-rust/pull/326 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-15 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1566583155 ## core/src/test/java/org/apache/iceberg/rest/responses/TestListNamespacesResponse.java: ## @@ -34,7 +34,7 @@ public class TestListNamespacesResponse extends RequestRe

Re: [PR] Add Partitions Metadata Table [iceberg-python]

2024-04-15 Thread via GitHub
HonahX commented on PR #603: URL: https://github.com/apache/iceberg-python/pull/603#issuecomment-2058317984 Merged! Thanks @syun64 for working on this and thanks @Fokko for reviewing! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] Add Partitions Metadata Table [iceberg-python]

2024-04-15 Thread via GitHub
HonahX merged PR #603: URL: https://github.com/apache/iceberg-python/pull/603 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] chore(deps): Bump apache/skywalking-eyes from 0.5.0 to 0.6.0 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 commented on PR #328: URL: https://github.com/apache/iceberg-rust/pull/328#issuecomment-2058314718 Thanks @dependabot for working on this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] chore(deps): Update volo-thrift requirement from 0.9.2 to 0.10.0 [iceberg-rust]

2024-04-15 Thread via GitHub
dependabot[bot] commented on PR #326: URL: https://github.com/apache/iceberg-rust/pull/326#issuecomment-2058313835 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version,

Re: [PR] chore(deps): Bump apache/skywalking-eyes from 0.5.0 to 0.6.0 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 merged PR #328: URL: https://github.com/apache/iceberg-rust/pull/328 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ic

Re: [PR] Tests: Unify the test catalog setting [iceberg-python]

2024-04-15 Thread via GitHub
HonahX commented on code in PR #609: URL: https://github.com/apache/iceberg-python/pull/609#discussion_r1566772517 ## tests/conftest.py: ## @@ -2144,3 +2144,31 @@ def arrow_table_with_only_nulls(pa_schema: "pa.Schema") -> "pa.Table": import pyarrow as pa return pa.T

Re: [PR] chore(deps): Update volo-thrift requirement from 0.9.2 to 0.10.0 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 closed pull request #326: chore(deps): Update volo-thrift requirement from 0.9.2 to 0.10.0 URL: https://github.com/apache/iceberg-rust/pull/326 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

Re: [PR] chore(deps): Bump peaceiris/actions-gh-pages from 3.9.3 to 4.0.0 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 commented on PR #333: URL: https://github.com/apache/iceberg-rust/pull/333#issuecomment-2058312965 Thanks @dependabot for working on this, and @Fokko @Xuanwo for review. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] chore(deps): Bump peaceiris/actions-gh-pages from 3.9.3 to 4.0.0 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 merged PR #333: URL: https://github.com/apache/iceberg-rust/pull/333 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ic

Re: [PR] chore(deps): Bump peaceiris/actions-mdbook from 1 to 2 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 commented on PR #332: URL: https://github.com/apache/iceberg-rust/pull/332#issuecomment-2058311565 Thanks to @Xuanwo @Fokko for the review, and @dependabot for the work. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] chore(deps): Bump peaceiris/actions-mdbook from 1 to 2 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 merged PR #332: URL: https://github.com/apache/iceberg-rust/pull/332 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ic

Re: [PR] chore(deps): Update pilota requirement from 0.10.0 to 0.11.0 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 commented on PR #327: URL: https://github.com/apache/iceberg-rust/pull/327#issuecomment-2058310299 Thanks @Fokko @Xuanwo for the review, and @dependabot for working on this -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

Re: [PR] chore(deps): Update pilota requirement from 0.10.0 to 0.11.0 [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 merged PR #327: URL: https://github.com/apache/iceberg-rust/pull/327 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ic

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566753916 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/VectorizedSparkParquetReaders.java: ## @@ -51,22 +53,43 @@ public class VectorizedSparkParq

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566753486 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnarBatchReader.java: ## @@ -0,0 +1,303 @@ +/* + * Licensed to the Ap

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566752772 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnReader.java: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache So

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566752634 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnReader.java: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache So

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566753162 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnarBatchReader.java: ## @@ -0,0 +1,303 @@ +/* + * Licensed to the Ap

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-04-15 Thread via GitHub
pvary commented on code in PR #9464: URL: https://github.com/apache/iceberg/pull/9464#discussion_r1566738743 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/source/split/SerializerHelper.java: ## @@ -0,0 +1,186 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Core: Fix JDBC Catalog table commit when migrating from schema V0 to V1 [iceberg]

2024-04-15 Thread via GitHub
nastra merged PR #10111: URL: https://github.com/apache/iceberg/pull/10111 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [I] [JDBC Catalog] Table commit fails if iceberg_type field is NULL [iceberg]

2024-04-15 Thread via GitHub
nastra closed issue #10046: [JDBC Catalog] Table commit fails if iceberg_type field is NULL URL: https://github.com/apache/iceberg/issues/10046 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] Flink: Adds support for Flink 1.19 version [iceberg]

2024-04-15 Thread via GitHub
manuzhang commented on PR #10112: URL: https://github.com/apache/iceberg/pull/10112#issuecomment-2058257988 Why do we remove Flink 1.16 in this PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-15 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1566713074 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -278,14 +286,26 @@ public void setConf(Object newConf) { @Override public List listTa

Re: [I] Create iceberg table from existsing parquet files with slightly different schemas (schemas merge is possible). [iceberg-python]

2024-04-15 Thread via GitHub
kevinjqliu commented on issue #601: URL: https://github.com/apache/iceberg-python/issues/601#issuecomment-2058220348 Looks like your schema is nested, which makes things more complicated. It's pretty difficult to deal with merging nested schemas. I'm not sure if there's an out-of-the-box so

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566695337 ## core/src/main/java/org/apache/iceberg/BaseRowDelta.java: ## @@ -43,6 +43,10 @@ protected BaseRowDelta self() { @Override protected String operation() {

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566673808 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnReader.java: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
manuzhang commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566680285 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -167,6 +167,10 @@ private TableProperties() {} "write.parquet.bloom-filter-max-bytes";

Re: [I] Update Roadmap / Close old Issues [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 commented on issue #330: URL: https://github.com/apache/iceberg-rust/issues/330#issuecomment-2058158721 @marvinlanhenke Sorry for the late reply. Yeah, I think we should update the roadmap to reflect the latest updates. You're welcome to submit a PR, thanks! -- This is an automated message fro

Re: [I] flink:FlinkSink support dynamically changed schema [iceberg]

2024-04-15 Thread via GitHub
Ruees commented on issue #4190: URL: https://github.com/apache/iceberg/issues/4190#issuecomment-2058150468 > @leichangqing You can refer to the last two commits of my branch https://github.com/lintingbin2009/iceberg/tree/flink-sink-dynamically-change. We have put this part of the code in ou

Re: [PR] [0.6.x] Backport #607 and #434 [iceberg-python]

2024-04-15 Thread via GitHub
HonahX merged PR #608: URL: https://github.com/apache/iceberg-python/pull/608 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-15 Thread via GitHub
liurenjie1024 commented on code in PR #320: URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1566618135 ## crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs: ## @@ -0,0 +1,317 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[PR] [0.6.x] Backport #607 [iceberg-python]

2024-04-15 Thread via GitHub
HonahX opened a new pull request, #608: URL: https://github.com/apache/iceberg-python/pull/608 Backport #607 No merge conflict, just open a PR for CI to run -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566594959 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -167,6 +167,10 @@ private TableProperties() {} "write.parquet.bloom-filter-max-bytes";

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-15 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1566583155 ## core/src/test/java/org/apache/iceberg/rest/responses/TestListNamespacesResponse.java: ## @@ -34,7 +34,7 @@ public class TestListNamespacesResponse extends RequestRe

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
manuzhang commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566577675 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -167,6 +167,10 @@ private TableProperties() {} "write.parquet.bloom-filter-max-bytes";

Re: [PR] Spark 3.5: Parallelize reading files in snapshot and migrate procedures [iceberg]

2024-04-15 Thread via GitHub
manuzhang commented on code in PR #10037: URL: https://github.com/apache/iceberg/pull/10037#discussion_r1566576095 ## data/src/main/java/org/apache/iceberg/data/MigrationService.java: ## @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [I] Cannot Drop Table Created with HiveIcebergStorageHandler Enabled but Metadata.json is Missing [iceberg]

2024-04-15 Thread via GitHub
github-actions[bot] commented on issue #2554: URL: https://github.com/apache/iceberg/issues/2554#issuecomment-2058016053 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Spark filters do not work on int96 timestamp columns [iceberg]

2024-04-15 Thread via GitHub
github-actions[bot] commented on issue #2553: URL: https://github.com/apache/iceberg/issues/2553#issuecomment-2058016033 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Improve backward compatibility tests for spec changes introduced in all table versions [iceberg]

2024-04-15 Thread via GitHub
github-actions[bot] commented on issue #2542: URL: https://github.com/apache/iceberg/issues/2542#issuecomment-2058015981 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Flink : add filters to project for flink IcebergTableSource [iceberg]

2024-04-15 Thread via GitHub
github-actions[bot] commented on issue #2537: URL: https://github.com/apache/iceberg/issues/2537#issuecomment-2058015957 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Reduce errorprone warnings for Iceberg build [iceberg]

2024-04-15 Thread via GitHub
github-actions[bot] commented on issue #2545: URL: https://github.com/apache/iceberg/issues/2545#issuecomment-2058016005 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Flink : Data skew when we use RewriteDataFilesAction of flink to do rewrite [iceberg]

2024-04-15 Thread via GitHub
github-actions[bot] commented on issue #2536: URL: https://github.com/apache/iceberg/issues/2536#issuecomment-2058015929 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Flink: add vectorized read for flink [iceberg]

2024-04-15 Thread via GitHub
github-actions[bot] commented on issue #2534: URL: https://github.com/apache/iceberg/issues/2534#issuecomment-2058015906 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Spark Dynamic Partition Pruning [iceberg]

2024-04-15 Thread via GitHub
github-actions[bot] commented on issue #2527: URL: https://github.com/apache/iceberg/issues/2527#issuecomment-2058015881 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Hive: turn off the stats gathering when iceberg.hive.keep.stats is false [iceberg]

2024-04-15 Thread via GitHub
stargrey102 commented on PR #10148: URL: https://github.com/apache/iceberg/pull/10148#issuecomment-2057981934 Some Hive tests failed because a new table property was added: https://github.com/apache/iceberg/pull/10148/commits/27d3c8a8f72e08cd65d4a226e7daf362527b5f7c Would you mind helping

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-15 Thread via GitHub
danielcweeks commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1566524213 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -278,14 +286,26 @@ public void setConf(Object newConf) { @Override public List l

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-04-15 Thread via GitHub
stevenzwu commented on code in PR #9464: URL: https://github.com/apache/iceberg/pull/9464#discussion_r1566450101 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/source/split/IcebergSourceSplit.java: ## @@ -157,21 +165,56 @@ byte[] serializeV2() throws IOException {

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
amogh-jahagirdar commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566455670 ## core/src/main/java/org/apache/iceberg/BaseRowDelta.java: ## @@ -43,6 +43,10 @@ protected BaseRowDelta self() { @Override protected String operatio

Re: [PR] Rename hive to hadoop [iceberg-go]

2024-04-15 Thread via GitHub
thorfour closed pull request #71: Rename hive to hadoop URL: https://github.com/apache/iceberg-go/pull/71 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail

Re: [PR] Spark 3.5: Check table existence to determine which catalog for drop table [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10128: URL: https://github.com/apache/iceberg/pull/10128#discussion_r1566449690 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java: ## @@ -275,18 +275,20 @@ public Table alterTable(Identifier ident, TableChang

Re: [PR] Core: Add ManifestWrite benchmark [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on PR #8637: URL: https://github.com/apache/iceberg/pull/8637#issuecomment-2057818943 There was a style violation: ``` Error: eckstyle] [ERROR] /home/runner/work/iceberg/iceberg/core/src/jmh/java/org/apache/iceberg/ManifestWriteBenchmark.java:89:9: Variable '

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566422127 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -476,7 +493,10 @@ static Context dataContext(Map config) { int bloomFilterMaxBytes

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566441905 ## docs/docs/configuration.md: ## @@ -49,8 +49,9 @@ Iceberg tables support table properties to configure table behavior, like the de | write.parquet.dict-size-byt

Re: [PR] Spark 3.5: Add max allowed failed commits to RewriteDataFiles when partial progress is enabled [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on PR #9611: URL: https://github.com/apache/iceberg/pull/9611#issuecomment-2057810296 Thanks, @manuzhang! Thanks for reviewing, @nastra @RussellSpitzer! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566440289 ## docs/docs/configuration.md: ## @@ -49,8 +49,9 @@ Iceberg tables support table properties to configure table behavior, like the de | write.parquet.dict-size-byt

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-15 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1558712520 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -278,14 +286,26 @@ public void setConf(Object newConf) { @Override public List listTa

Re: [PR] Spark 3.5: Add max allowed failed commits to RewriteDataFiles when partial progress is enabled [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi merged PR #9611: URL: https://github.com/apache/iceberg/pull/9611 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566422127 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -476,7 +493,10 @@ static Context dataContext(Map config) { int bloomFilterMaxBytes

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566275123 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestDelete.java: ## @@ -501,6 +503,38 @@ public void testDeleteNonExistingRecords

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566415714 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -167,6 +167,10 @@ private TableProperties() {} "write.parquet.bloom-filter-max-bytes";

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566414171 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -476,7 +493,10 @@ static Context dataContext(Map config) { int bloomFilterMaxByte

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566403848 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -167,6 +167,10 @@ private TableProperties() {} "write.parquet.bloom-filter-max-bytes";

Re: [PR] [Bug Fix] HiveCatalog's _commit_table need to refresh and update the metadata in a transaction [iceberg-python]

2024-04-15 Thread via GitHub
Fokko commented on PR #607: URL: https://github.com/apache/iceberg-python/pull/607#issuecomment-2057767366 @HonahX Thanks for fixing this. I think we should backport this to 0.6.1 as well 👍 -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [I] Concern about possible consistency issue in HiveCatalog's _commit_table [iceberg-python]

2024-04-15 Thread via GitHub
Fokko closed issue #588: Concern about possible consistency issue in HiveCatalog's _commit_table URL: https://github.com/apache/iceberg-python/issues/588 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] [Bug Fix] HiveCatalog's _commit_table need to refresh and update the metadata in a transaction [iceberg-python]

2024-04-15 Thread via GitHub
Fokko merged PR #607: URL: https://github.com/apache/iceberg-python/pull/607 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566403848 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -167,6 +167,10 @@ private TableProperties() {} "write.parquet.bloom-filter-max-bytes";

Re: [PR] Core: Allow manifest file cache to be configurable [iceberg]

2024-04-15 Thread via GitHub
tdcmeehan commented on code in PR #10118: URL: https://github.com/apache/iceberg/pull/10118#discussion_r1566401938 ## core/src/main/java/org/apache/iceberg/io/ContentCache.java: ## @@ -18,274 +18,35 @@ */ package org.apache.iceberg.io; -import com.github.benmanes.caffeine.c

Re: [PR] Core: Allow manifest file cache to be configurable [iceberg]

2024-04-15 Thread via GitHub
tdcmeehan commented on code in PR #10118: URL: https://github.com/apache/iceberg/pull/10118#discussion_r1566400397 ## core/src/main/java/org/apache/iceberg/io/ContentCacheManager.java: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10149: URL: https://github.com/apache/iceberg/pull/10149#discussion_r1566395480 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -167,6 +167,10 @@ private TableProperties() {} "write.parquet.bloom-filter-max-bytes";

Re: [PR] Core: Fix JDBC Catalog table commit when migrating from schema V0 to V1 [iceberg]

2024-04-15 Thread via GitHub
amogh-jahagirdar commented on PR #10111: URL: https://github.com/apache/iceberg/pull/10111#issuecomment-2057754827 Looks like Flink CI timed out when running actions/cache before even running the actual tests. I'm going to retrigger and merge. -- This is an automated message from the Apac

Re: [PR] Add bloom filter fpp config [iceberg]

2024-04-15 Thread via GitHub
huaxingao commented on PR #10149: URL: https://github.com/apache/iceberg/pull/10149#issuecomment-2057752708 To test the change, I stepped into the [code](https://github.com/apache/parquet-mr/blob/parquet-1.13.x/parquet-column/src/main/java/org/apache/parquet/column/impl/ColumnWriterBase.java

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566392268 ## core/src/main/java/org/apache/iceberg/BaseRowDelta.java: ## @@ -43,6 +43,10 @@ protected BaseRowDelta self() { @Override protected String operation() {

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566392971 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestDelete.java: ## @@ -501,6 +503,32 @@ public void testDeleteNonExistingRecords

Re: [I] Unable to load an iceberg table from aws glue catalog [iceberg-python]

2024-04-15 Thread via GitHub
geruh commented on issue #515: URL: https://github.com/apache/iceberg-python/issues/515#issuecomment-2057718236 Interesting. Can you run `aws sts get-caller-identity` in the terminal to ensure the right identity is being used? You can also explicitly set the S3FileIO by passing in the

Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-15 Thread via GitHub
sdd commented on code in PR #320: URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1566362815 ## crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs: ## @@ -0,0 +1,317 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribut

Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-15 Thread via GitHub
sdd commented on code in PR #320: URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1566349390 ## crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs: ## @@ -0,0 +1,317 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribut

Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-15 Thread via GitHub
sdd commented on code in PR #320: URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1566333861 ## crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs: ## @@ -0,0 +1,317 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribut

Re: [PR] Spec: Clarify time travel implementation in Iceberg [iceberg]

2024-04-15 Thread via GitHub
emkornfield commented on PR #8982: URL: https://github.com/apache/iceberg/pull/8982#issuecomment-2057613376 @Fokko @aokolnychyi just wanted to ping again to see if you have bandwidth to take another look? -- This is an automated message from the Apache Git Service. To respond to the messa

Re: [PR] Spec: Clarify which columns can be used for equality delete files. [iceberg]

2024-04-15 Thread via GitHub
emkornfield commented on PR #8981: URL: https://github.com/apache/iceberg/pull/8981#issuecomment-2057612264 @Fokko just wanted to see if you have bandwidth to take another look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
amogh-jahagirdar commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566301394 ## core/src/main/java/org/apache/iceberg/BaseRowDelta.java: ## @@ -43,6 +43,10 @@ protected BaseRowDelta self() { @Override protected String operatio

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
amogh-jahagirdar commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566285705 ## core/src/main/java/org/apache/iceberg/BaseRowDelta.java: ## @@ -43,6 +43,10 @@ protected BaseRowDelta self() { @Override protected String operatio

Re: [PR] Spark 3.5: Parallelize reading files in snapshot and migrate procedures [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10037: URL: https://github.com/apache/iceberg/pull/10037#discussion_r1566291691 ## data/src/main/java/org/apache/iceberg/data/MigrationService.java: ## @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Spark 3.5: Parallelize reading files in snapshot and migrate procedures [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10037: URL: https://github.com/apache/iceberg/pull/10037#discussion_r1566289821 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java: ## @@ -237,4 +237,11 @@ protected ExecutorService executorService(int t

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566284577 ## spark/v3.4/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestDelete.java: ## @@ -502,6 +505,38 @@ public void testDeleteNonExistingRecords

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566282540 ## core/src/main/java/org/apache/iceberg/BaseRowDelta.java: ## @@ -43,6 +43,10 @@ protected BaseRowDelta self() { @Override protected String operation() {

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566275123 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestDelete.java: ## @@ -501,6 +503,38 @@ public void testDeleteNonExistingRecords

Re: [PR] Hive: turn off the stats gathering when iceberg.hive.keep.stats is false [iceberg]

2024-04-15 Thread via GitHub
pvary commented on PR #10148: URL: https://github.com/apache/iceberg/pull/10148#issuecomment-2057554051 @deniskuzZ: what would be the effect of this change to the Hive integration? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

Re: [I] flink autoscaler: how set write-parallelism ? [iceberg]

2024-04-15 Thread via GitHub
pvary commented on issue #10147: URL: https://github.com/apache/iceberg/issues/10147#issuecomment-2057528514 @sannaroby: can you share the Sink code? What distribution mode do you use? Maybe we need a rebalance step before the writer? -- This is an automated message from the Apache Git

Re: [PR] Core, Spark: Use 'delete' if RowDelta only has delete files [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #10123: URL: https://github.com/apache/iceberg/pull/10123#discussion_r1566226747 ## core/src/main/java/org/apache/iceberg/BaseRowDelta.java: ## @@ -43,6 +43,10 @@ protected BaseRowDelta self() { @Override protected String operation() {
