Re: [I] DeltaLakeToIcebergMigration Performance [iceberg]

2023-12-21 Thread via GitHub
nk1506 commented on issue #9246: URL: https://github.com/apache/iceberg/issues/9246#issuecomment-1867357185 I think performance should not be proportional to the number of parquet files. Do you have more information about the metadata files, such as how many checkpoint parquet files there are? -- This is

Re: [I] doc: Update README with roadmap and feature status. [iceberg-rust]

2023-12-21 Thread via GitHub
liurenjie1024 commented on issue #133: URL: https://github.com/apache/iceberg-rust/issues/133#issuecomment-1867355311 I'll work on this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2023-12-21 Thread via GitHub
pvary commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1434808593 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2023-12-21 Thread via GitHub
pvary commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1434802745 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Spark 3.5: Migrate tests to JUnit5 in data directory [iceberg]

2023-12-21 Thread via GitHub
nastra merged PR #9341: URL: https://github.com/apache/iceberg/pull/9341

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2023-12-21 Thread via GitHub
pvary commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1434798523 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Spark 3:5 Migrate tests to JUnit5 in source directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on PR #9342: URL: https://github.com/apache/iceberg/pull/9342#issuecomment-1867295814 Since this PR depends on #9341, I added a patch of files changed in #9341. Commit [4f88900](https://github.com/apache/iceberg/pull/9342/commits/4f8890089b77a17987811e14e8659a43fac465

Re: [PR] Deliver key metadata to parquet encryption [iceberg]

2023-12-21 Thread via GitHub
ggershinsky commented on code in PR #9359: URL: https://github.com/apache/iceberg/pull/9359#discussion_r1434752900 ## core/src/main/java/org/apache/iceberg/avro/Avro.java: ## @@ -91,6 +92,13 @@ public static WriteBuilder write(OutputFile file) { return new WriteBuilder(file

Re: [PR] Fix spark AddFilesProcedure log tip [iceberg]

2023-12-21 Thread via GitHub
nk1506 commented on code in PR #9357: URL: https://github.com/apache/iceberg/pull/9357#discussion_r1434732305 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/AddFilesProcedure.java: ## @@ -196,7 +196,7 @@ private void importFileTable( importPartition

Re: [PR] feat: Expression system. [iceberg-rust]

2023-12-21 Thread via GitHub
liurenjie1024 commented on code in PR #132: URL: https://github.com/apache/iceberg-rust/pull/132#discussion_r1434689405 ## crates/iceberg/src/expr/bound.rs: ## @@ -0,0 +1,41 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] feat: Expression system. [iceberg-rust]

2023-12-21 Thread via GitHub
liurenjie1024 commented on code in PR #132: URL: https://github.com/apache/iceberg-rust/pull/132#discussion_r1434689231 ## crates/iceberg/src/expr/mod.rs: ## @@ -0,0 +1,49 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] feat: Expression system. [iceberg-rust]

2023-12-21 Thread via GitHub
liurenjie1024 commented on code in PR #132: URL: https://github.com/apache/iceberg-rust/pull/132#discussion_r1434685440 ## crates/iceberg/src/expr/bound.rs: ## @@ -0,0 +1,41 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] feat: Expression system. [iceberg-rust]

2023-12-21 Thread via GitHub
liurenjie1024 commented on code in PR #132: URL: https://github.com/apache/iceberg-rust/pull/132#discussion_r1434685237 ## crates/iceberg/src/expr/mod.rs: ## @@ -0,0 +1,49 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] feat: Expression system. [iceberg-rust]

2023-12-21 Thread via GitHub
liurenjie1024 commented on code in PR #132: URL: https://github.com/apache/iceberg-rust/pull/132#discussion_r1434681128 ## crates/iceberg/src/expr/bound.rs: ## @@ -0,0 +1,41 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] feat: Add website layout [iceberg-rust]

2023-12-21 Thread via GitHub
liurenjie1024 commented on PR #130: URL: https://github.com/apache/iceberg-rust/pull/130#issuecomment-1867168892 > I'm open to switching to MkDocs or any other framework that Iceberg uses. I agree that using the same tools makes it easier to maintain the docs, especially since Iceberg has s

[I] doc: Update README with roadmap and feature status. [iceberg-rust]

2023-12-21 Thread via GitHub
liurenjie1024 opened a new issue, #133: URL: https://github.com/apache/iceberg-rust/issues/133 Inspired by [iceberg-go](https://github.com/apache/iceberg-go), I feel that we may need a roadmap/status table in `README.md` to make the status clearer.

[I] When using the Flink upsert mode, the speed of reading data from the iceberg table is very slow. [iceberg]

2023-12-21 Thread via GitHub
13535048320 opened a new issue, #9363: URL: https://github.com/apache/iceberg/issues/9363 ### Query engine Write: Flink Read: Trino ### Question When using the Flink upsert mode, the speed of reading data from the iceberg table is very slow, it takes 1 minute to query

Re: [PR] Apply Name mapping [iceberg-python]

2023-12-21 Thread via GitHub
HonahX commented on code in PR #219: URL: https://github.com/apache/iceberg-python/pull/219#discussion_r1434661513 ## pyiceberg/io/pyarrow.py: ## @@ -698,77 +708,147 @@ def before_field(self, field: pa.Field) -> None: def after_field(self, field: pa.Field) -> None:

Re: [PR] Glue catalog commit table [iceberg-python]

2023-12-21 Thread via GitHub
HonahX commented on code in PR #140: URL: https://github.com/apache/iceberg-python/pull/140#discussion_r1434657980 ## pyiceberg/catalog/__init__.py: ## @@ -74,6 +75,8 @@ LOCATION = "location" EXTERNAL_TABLE = "EXTERNAL_TABLE" +TABLE_METADATA_FILE_NAME_REGEX = re.compile(r"""

Re: [PR] feat: Add website layout [iceberg-rust]

2023-12-21 Thread via GitHub
Xuanwo commented on PR #130: URL: https://github.com/apache/iceberg-rust/pull/130#issuecomment-1867122015 Hi, @bitsondatadev, thanks a lot for your effort in maintaining the docs. I can imagine that coordinating such a huge amount of work is challenging. I'm open to switching to MkDocs or

Re: [I] iceberg HiveCatalog insert exception of GSS initiate failed [iceberg]

2023-12-21 Thread via GitHub
ma311199 commented on issue #3127: URL: https://github.com/apache/iceberg/issues/3127#issuecomment-1867117385 I am using CDH Hive 2.1.1 + Iceberg 0.13.1 + Kerberos. May I ask for a solution to this problem? > 023-12-11 10:38:12,167 ERROR [CommitterEvent Processor #1] org.apache.thrift.transport.TSas

Re: [I] How do I know which partition has delete files and the count? [iceberg]

2023-12-21 Thread via GitHub
github-actions[bot] closed issue #6995: How do I know which partition has delete files and the count? URL: https://github.com/apache/iceberg/issues/6995

Re: [I] How do I know which partition has delete files and the count? [iceberg]

2023-12-21 Thread via GitHub
github-actions[bot] commented on issue #6995: URL: https://github.com/apache/iceberg/issues/6995#issuecomment-1867073277 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2023-12-21 Thread via GitHub
wypoon commented on PR #9233: URL: https://github.com/apache/iceberg/pull/9233#issuecomment-1867061154 > I ran `TestSPJWithBucketing` on Spark 3.4 and I do see that `testMergeSPJwithCondition` passes and `testMergeSPJwithoutCondition` does not (on Spark 3.5 both fail). I believe that the re

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2023-12-21 Thread via GitHub
wypoon commented on PR #9233: URL: https://github.com/apache/iceberg/pull/9233#issuecomment-1867048740 I ran `TestSPJWithBucketing` on Spark 3.4 and I do see that `testMergeSPJwithCondition` passes and `testMergeSPJwithoutCondition` does not (on Spark 3.5 both fail). I believe that the reas

Re: [PR] Flink: Watermark read options [iceberg]

2023-12-21 Thread via GitHub
stevenzwu commented on code in PR #9346: URL: https://github.com/apache/iceberg/pull/9346#discussion_r1434494178 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java: ## @@ -489,25 +483,27 @@ public IcebergSource build() { } conte

Re: [I] Deleting a column from an iceberg table breaks schema in AWS Glue catalog [iceberg]

2023-12-21 Thread via GitHub
00Fede commented on issue #6340: URL: https://github.com/apache/iceberg/issues/6340#issuecomment-1867021274 This is still happening. We have to delete the dropped column directly from AWS Glue Data Catalog to be able to run queries in Athena or Quicksight.

[PR] Build: Bump fastavro from 1.9.1 to 1.9.2 [iceberg-python]

2023-12-21 Thread via GitHub
dependabot[bot] opened a new pull request, #236: URL: https://github.com/apache/iceberg-python/pull/236 Bumps [fastavro](https://github.com/fastavro/fastavro) from 1.9.1 to 1.9.2. Changelog Sourced from https://github.com/fastavro/fastavro/blob/master/ChangeLog fastavro's changel

Re: [PR] Spark: Add support for reading Iceberg views [iceberg]

2023-12-21 Thread via GitHub
amogh-jahagirdar commented on code in PR #9340: URL: https://github.com/apache/iceberg/pull/9340#discussion_r1434470861 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkView.java: ## @@ -0,0 +1,148 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Build: Bump ray from 2.7.1 to 2.8.1 [iceberg-python]

2023-12-21 Thread via GitHub
dependabot[bot] commented on PR #174: URL: https://github.com/apache/iceberg-python/pull/174#issuecomment-1867007495 Superseded by #235.

Re: [PR] Build: Bump ray from 2.7.1 to 2.8.1 [iceberg-python]

2023-12-21 Thread via GitHub
dependabot[bot] closed pull request #174: Build: Bump ray from 2.7.1 to 2.8.1 URL: https://github.com/apache/iceberg-python/pull/174

[PR] Build: Bump ray from 2.7.1 to 2.9.0 [iceberg-python]

2023-12-21 Thread via GitHub
dependabot[bot] opened a new pull request, #235: URL: https://github.com/apache/iceberg-python/pull/235 Bumps [ray](https://github.com/ray-project/ray) from 2.7.1 to 2.9.0. Release notes Sourced from https://github.com/ray-project/ray/releases ray's releases. Ray-2.9.0 R

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2023-12-21 Thread via GitHub
stevenzwu commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1434342789 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Core: Fix missing delete files from transaction [iceberg]

2023-12-21 Thread via GitHub
Fokko commented on code in PR #9354: URL: https://github.com/apache/iceberg/pull/9354#discussion_r1434520886 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -892,15 +892,18 @@ private void cleanUncommittedAppends(Set committed) { } } -

Re: [PR] Docs: Nit-fix the parameter of set_current_snapshot procedure in the example [iceberg]

2023-12-21 Thread via GitHub
amogh-jahagirdar merged PR #9360: URL: https://github.com/apache/iceberg/pull/9360

Re: [I] Schema IDs Re-Order? [iceberg-python]

2023-12-21 Thread via GitHub
sebpretzer commented on issue #229: URL: https://github.com/apache/iceberg-python/issues/229#issuecomment-1866915656 Ah, that makes sense. I assume the re-assigning is happening [here](https://github.com/apache/iceberg-python/blob/03caee20570cd7eabd2a5a9ee7341a154b818ca1/pyiceberg/table/meta

Re: [PR] Core: Add param to limit manifest parallel reader queue size [iceberg]

2023-12-21 Thread via GitHub
findepi commented on PR #7844: URL: https://github.com/apache/iceberg/pull/7844#issuecomment-1866905125 > The problem is that planning uses a shared threadpool. Using a blocking queue would cause tasks to stall, which would then tie up the threads in the shared pool and cause all planni

Re: [PR] Spark: Add support for reading Iceberg views [iceberg]

2023-12-21 Thread via GitHub
amogh-jahagirdar commented on code in PR #9340: URL: https://github.com/apache/iceberg/pull/9340#discussion_r1434452595 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkView.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Deliver key metadata to parquet encryption [iceberg]

2023-12-21 Thread via GitHub
rdblue commented on code in PR #9359: URL: https://github.com/apache/iceberg/pull/9359#discussion_r1434450493 ## core/src/main/java/org/apache/iceberg/avro/Avro.java: ## @@ -91,6 +92,13 @@ public static WriteBuilder write(OutputFile file) { return new WriteBuilder(file);

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2023-12-21 Thread via GitHub
stevenzwu commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1434415826 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/sink/shuffle/TestMapRangePartitioner.java: ## @@ -0,0 +1,511 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] Deliver key metadata to parquet encryption [iceberg]

2023-12-21 Thread via GitHub
rdblue commented on code in PR #9359: URL: https://github.com/apache/iceberg/pull/9359#discussion_r1434433368 ## core/src/main/java/org/apache/iceberg/avro/Avro.java: ## @@ -91,6 +92,13 @@ public static WriteBuilder write(OutputFile file) { return new WriteBuilder(file);

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2023-12-21 Thread via GitHub
stevenzwu commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1434407274 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2023-12-21 Thread via GitHub
stevenzwu commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1434407787 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/sink/shuffle/TestMapRangePartitioner.java: ## @@ -0,0 +1,511 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2023-12-21 Thread via GitHub
wypoon commented on PR #9233: URL: https://github.com/apache/iceberg/pull/9233#issuecomment-1866790718 @tmnd1991 are you saying that TestSPJWithBucketing is supposed to fail? I thought that the idea is to write a test that fails without the change in this PR but **passes** with it.

Re: [PR] Deliver key metadata to parquet encryption [iceberg]

2023-12-21 Thread via GitHub
ggershinsky commented on PR #9359: URL: https://github.com/apache/iceberg/pull/9359#issuecomment-1866785085 The last round of comments in #6762 is addressed here.

Re: [PR] feat: Add website layout [iceberg-rust]

2023-12-21 Thread via GitHub
bitsondatadev commented on PR #130: URL: https://github.com/apache/iceberg-rust/pull/130#issuecomment-1866709687 Okay, so to avoid any delays or concern here, I don't want you all to stop writing documentation because you're not sure how this will fall in line with the other docs.

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2023-12-21 Thread via GitHub
stevenzwu commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1434338290 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434336535 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWrites.java: ## @@ -63,29 +70,21 @@ import org.apache.spark.sql.SaveMode; import org.apa

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434323504 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWrites.java: ## @@ -63,29 +70,21 @@ import org.apache.spark.sql.SaveMode; import org.apa

[PR] Rest Catalog Support for a Separate OAuth Server URI [iceberg-python]

2023-12-21 Thread via GitHub
syun64 opened a new pull request, #233: URL: https://github.com/apache/iceberg-python/pull/233 Closes: https://github.com/apache/iceberg-python/issues/230 This PR introduces support for **Separation of Roles** in OAuth authorization when using the REST Catalog, by allowing the user to

Re: [PR] feat: Add website layout [iceberg-rust]

2023-12-21 Thread via GitHub
bitsondatadev commented on PR #130: URL: https://github.com/apache/iceberg-rust/pull/130#issuecomment-1866632260 Update: After meeting with Fokko, who showed me the [Arrow site documentation](https://arrow.apache.org/docs/), I am not totally opposed to the idea despite my

Re: [PR] Core: Fix missing delete files from transaction [iceberg]

2023-12-21 Thread via GitHub
rdblue commented on PR #9354: URL: https://github.com/apache/iceberg/pull/9354#issuecomment-1866627868 Thanks for getting this in @nastra and @Fokko!

Re: [PR] Flink: Watermark read options [iceberg]

2023-12-21 Thread via GitHub
rodmeneses commented on code in PR #9346: URL: https://github.com/apache/iceberg/pull/9346#discussion_r1434279736 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/FlinkConfigOptions.java: ## @@ -94,7 +94,7 @@ private FlinkConfigOptions() {} public static final Conf

Re: [PR] WIP: Action [iceberg]

2023-12-21 Thread via GitHub
ajantha-bhat closed pull request #9361: WIP: Action URL: https://github.com/apache/iceberg/pull/9361

Re: [PR] Add Description on Using a Separate Authorization Server [iceberg]

2023-12-21 Thread via GitHub
danielcweeks merged PR #8998: URL: https://github.com/apache/iceberg/pull/8998

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434257298 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWrites.java: ## @@ -186,21 +185,24 @@ private void writeAndValidateWithLocations(Tab

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434250422 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWrites.java: ## @@ -63,29 +70,21 @@ import org.apache.spark.sql.SaveMode; import o

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434243638 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWrites.java: ## @@ -186,21 +185,24 @@ private void writeAndValidateWithLocations(Table tab

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434242259 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWrites.java: ## @@ -63,29 +70,21 @@ import org.apache.spark.sql.SaveMode; import org.apa

[PR] Docs: Nit-fix the parameter of set_current_snapshot procedure in the example [iceberg]

2023-12-21 Thread via GitHub
tomtongue opened a new pull request, #9360: URL: https://github.com/apache/iceberg/pull/9360 The documentation example gives the parameter of `set_current_snapshot` as `tag`, but the correct parameter is `ref`, as the source shows. https://github.com/apache/

Re: [PR] feat: Add website layout [iceberg-rust]

2023-12-21 Thread via GitHub
bitsondatadev commented on PR #130: URL: https://github.com/apache/iceberg-rust/pull/130#issuecomment-1866527007 > mdbook is widely used in rust community to build a docs site While I can respect this, it is larger than just a Rust project, and we've had [ongoing](https://lists.apache

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434221658 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWrites.java: ## @@ -343,7 +343,7 @@ public void testNullableWithSparkSqlOption() thr

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434218017 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWrites.java: ## @@ -343,7 +343,7 @@ public void testNullableWithSparkSqlOption() throws I

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434205224 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestAvroScan.java: ## @@ -68,7 +68,7 @@ public static void stopSpark() { @Override protected vo

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434201795 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestAvroScan.java: ## @@ -68,7 +68,7 @@ public static void stopSpark() { @Override protec

Re: [PR] Spark: Add support for reading Iceberg views [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9340: URL: https://github.com/apache/iceberg/pull/9340#discussion_r1434201832 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkCatalog.java: ## @@ -31,8 +31,10 @@ import org.apache.spark.sql.connector.catalog.SupportsNam

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434197493 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestParquetScan.java: ## @@ -143,7 +143,7 @@ public void testEmptyTableProjection() throws IOException

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434196954 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWrites.java: ## @@ -272,20 +274,22 @@ private Dataset createDataset(Iterable records, Sch

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434196320 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestDataFrameWrites.java: ## @@ -144,10 +143,10 @@ protected void writeAndValidate(Schema schema) throw

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434195782 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestAvroScan.java: ## @@ -68,7 +68,7 @@ public static void stopSpark() { @Override protected vo

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434187605 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkOrcReadMetadataColumns.java: ## @@ -95,24 +99,20 @@ public class TestSparkOrcReadMetadataC

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434181849 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkOrcReadMetadataColumns.java: ## @@ -95,24 +99,20 @@ public class TestSparkOrcReadMetadataColumns

Re: [PR] Apply Name mapping [iceberg-python]

2023-12-21 Thread via GitHub
syun64 commented on PR #219: URL: https://github.com/apache/iceberg-python/pull/219#issuecomment-1866426126 > Thanks @syun64 for the great work! Thank you for the detailed review, @HonahX! I've taken most of your suggestions, and left a response to the one regarding field_type - pleas

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434159383 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkOrcReadMetadataColumns.java: ## @@ -95,24 +99,20 @@ public class TestSparkOrcReadMetadataC

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434164015 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkOrcReadMetadataColumns.java: ## @@ -95,24 +99,20 @@ public class TestSparkOrcReadMetadataC

Re: [PR] Spark Migration to JUnit5 AssertJ - spark/data directory [iceberg]

2023-12-21 Thread via GitHub
nastra commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434150438 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkOrcReadMetadataColumns.java: ## @@ -95,24 +99,20 @@ public class TestSparkOrcReadMetadataColumns

Re: [I] Deleting a column from an iceberg table breaks schema in AWS Glue catalog [iceberg]

2023-12-21 Thread via GitHub
qoqajr commented on issue #6340: URL: https://github.com/apache/iceberg/issues/6340#issuecomment-1866356484 Hello, any updates on this? This bug is still present! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] Deliver key metadata to parquet encryption [iceberg]

2023-12-21 Thread via GitHub
ggershinsky commented on PR #6762: URL: https://github.com/apache/iceberg/pull/6762#issuecomment-1866291510 Moved to main branch base via #9359 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] Deliver key metadata to parquet encryption [iceberg]

2023-12-21 Thread via GitHub
ggershinsky closed pull request #6762: Deliver key metadata to parquet encryption URL: https://github.com/apache/iceberg/pull/6762 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[PR] Deliver key metadata to parquet encryption [iceberg]

2023-12-21 Thread via GitHub
ggershinsky opened a new pull request, #9359: URL: https://github.com/apache/iceberg/pull/9359 Moving #6762 to main branch base -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [PR] Spark Migration to JUnit5 AssertJ: non-parameterized, spark/source directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on PR #9342: URL: https://github.com/apache/iceberg/pull/9342#issuecomment-1866286222 After testing, I realized that this PR depends a lot on the changes I introduced in #9341. So we need to prioritise merging #9341 first. -- This is an automated message from the Ap

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2023-12-21 Thread via GitHub
tmnd1991 commented on code in PR #9233: URL: https://github.com/apache/iceberg/pull/9233#discussion_r1434085055 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestSPJWithBucketing.java: ## @@ -0,0 +1,219 @@ +/* + * Licensed to the Apache Softwar

Re: [PR] feat: Expression system. [iceberg-rust]

2023-12-21 Thread via GitHub
Fokko commented on code in PR #132: URL: https://github.com/apache/iceberg-rust/pull/132#discussion_r1434045494 ## crates/iceberg/src/expr/bound.rs: ## @@ -0,0 +1,41 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. Se

Re: [PR] Apply Name mapping [iceberg-python]

2023-12-21 Thread via GitHub
syun64 commented on code in PR #219: URL: https://github.com/apache/iceberg-python/pull/219#discussion_r1434067495 ## pyiceberg/io/pyarrow.py: ## @@ -698,77 +708,147 @@ def before_field(self, field: pa.Field) -> None: def after_field(self, field: pa.Field) -> None:

Re: [PR] Spark Migration to JUnit5 AssertJ - non-parameterized, spark/data directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on code in PR #9341: URL: https://github.com/apache/iceberg/pull/9341#discussion_r1434062826 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkOrcReadMetadataColumns.java: ## @@ -100,7 +99,7 @@ public static Object[] parameters() {

Re: [PR] Spark Migration to JUnit5 AssertJ - non-parameterized, spark/data directory [iceberg]

2023-12-21 Thread via GitHub
chinmay-bhat commented on PR #9341: URL: https://github.com/apache/iceberg/pull/9341#issuecomment-1866220019 Initially, I assumed I could separate the parameterized and non-parameterized files. But the CI errors included parameterized files that inherit from `spark/data/AvroDataTest`. So

Re: [PR] feat: Add website layout [iceberg-rust]

2023-12-21 Thread via GitHub
Fokko commented on PR #130: URL: https://github.com/apache/iceberg-rust/pull/130#issuecomment-1866188024 @bitsondatadev WDYT? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-12-21 Thread via GitHub
szehon-ho commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1433431468 ## api/src/main/java/org/apache/iceberg/types/TypeUtil.java: ## @@ -452,6 +454,68 @@ private static void checkSchemaCompatibility( } } + /** + * Estimate

Re: [PR] WIP: first pass at `UnboundTransform` [iceberg-python]

2023-12-21 Thread via GitHub
Fokko commented on PR #209: URL: https://github.com/apache/iceberg-python/pull/209#issuecomment-1866170081 Hey @jayceslesar thanks for working on this, and I think you're already halfway there. The most important part is to make sure that we convert `as date` to a `DayTransform`. -- This

Re: [PR] API: Fix equals and hashCode in CharSequenceSet [iceberg]

2023-12-21 Thread via GitHub
ajantha-bhat commented on PR #9245: URL: https://github.com/apache/iceberg/pull/9245#issuecomment-1866169340 I think this PR has produced Error Prone warnings. https://github.com/apache/iceberg/assets/5889404/93566564-18a2-4ad0-8f17-f3ffaab1a537 -- This is an automated messa

Re: [PR] WIP: first pass at `UnboundTransform` [iceberg-python]

2023-12-21 Thread via GitHub
Fokko commented on code in PR #209: URL: https://github.com/apache/iceberg-python/pull/209#discussion_r1434014214 ## tests/expressions/test_parser.py: ## @@ -199,3 +200,8 @@ def test_with_function() -> None: parser.parse("foo = 1 and lower(bar) = '2'") assert "Ex

Re: [PR] WIP: first pass at `UnboundTransform` [iceberg-python]

2023-12-21 Thread via GitHub
Fokko commented on code in PR #209: URL: https://github.com/apache/iceberg-python/pull/209#discussion_r1434012843 ## pyiceberg/transforms.py: ## @@ -821,3 +824,34 @@ class BoundTransform(BoundTerm[L]): def __init__(self, term: BoundTerm[L], transform: Transform[L, Any]):
