Re: [PR] iceberg-parquet: Switch tests to JUnit5 + AssertJ-style assertions [iceberg]

2023-12-01 Thread via GitHub
lisirrx commented on code in PR #9161: URL: https://github.com/apache/iceberg/pull/9161#discussion_r1412733984 ## parquet/src/test/java/org/apache/iceberg/parquet/TestDictionaryRowGroupFilter.java: ## @@ -79,16 +53,39 @@ import

Re: [PR] Core: REST HttpClient connections config [iceberg]

2023-12-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #9195: URL: https://github.com/apache/iceberg/pull/9195#discussion_r1412727587 ## core/src/main/java/org/apache/iceberg/rest/HTTPClient.java: ## @@ -72,6 +74,10 @@ public class HTTPClient implements RESTClient { static final String

Re: [PR] Spark: Bump Spark minor versions for 3.3 and 3.4 [iceberg]

2023-12-01 Thread via GitHub
ajantha-bhat commented on PR #9187: URL: https://github.com/apache/iceberg/pull/9187#issuecomment-1836986136 > org.apache.spark.sql.AnalysisException: Cannot write incompatible data to table '`spark_catalog`.`default`.`source_table`': - Cannot safely cast 'id': string to int.

Re: [PR] relativePath [wip] [iceberg]

2023-12-01 Thread via GitHub
ajantha-bhat commented on PR #8260: URL: https://github.com/apache/iceberg/pull/8260#issuecomment-1836983928 Are we still working on it? Looks like there is some interest from the community also. -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] Fix StructCopy [iceberg]

2023-12-01 Thread via GitHub
aokolnychyi commented on PR #6894: URL: https://github.com/apache/iceberg/pull/6894#issuecomment-1836967378 @RussellSpitzer

Re: [PR] API, Spark: Add fastForwardOrCreate API and integrate that with Spark fast forward procedure [iceberg]

2023-12-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #9196: URL: https://github.com/apache/iceberg/pull/9196#discussion_r1412683806 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -73,19 +71,16 @@ public StructType outputType() {

Re: [PR] API, Spark: Add fastForwardOrCreate API and integrate that with Spark fast forward procedure [iceberg]

2023-12-01 Thread via GitHub
amogh-jahagirdar commented on code in PR #9196: URL: https://github.com/apache/iceberg/pull/9196#discussion_r1412683256 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -73,19 +71,16 @@ public StructType outputType() {

[PR] API, Spark: Add fastForwardOrCreate API and integrate that with Spark fast forward procedure [iceberg]

2023-12-01 Thread via GitHub
amogh-jahagirdar opened a new pull request, #9196: URL: https://github.com/apache/iceberg/pull/9196 Fixes #8849 This change adds a `fastForwardOrCreate` API which will perform a fast forward of the `from` branch if it exists; otherwise `from` will be created and pointing towards

Re: [I] Question about iceberg partition table [iceberg]

2023-12-01 Thread via GitHub
github-actions[bot] commented on issue #7406: URL: https://github.com/apache/iceberg/issues/7406#issuecomment-1836946114 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-12-01 Thread via GitHub
RussellSpitzer commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1412674377 ## core/src/main/java/org/apache/iceberg/SystemConfigs.java: ## @@ -43,14 +43,14 @@ private SystemConfigs() {} Integer::parseUnsignedInt); /** -

Re: [PR] Core: Avro writers use BlockingBinaryEncoder to enable array/map size calculations. [iceberg]

2023-12-01 Thread via GitHub
aokolnychyi commented on PR #8625: URL: https://github.com/apache/iceberg/pull/8625#issuecomment-1836930161 @rustyconover @Fokko, I was wondering whether there were any updates. It would be great to have this in.

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-12-01 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1412667139 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -500,4 +501,21 @@ public static Snapshot latestSnapshot(TableMetadata metadata, String

Re: [I] Create table should take in sort order/ distribution mode [iceberg]

2023-12-01 Thread via GitHub
maytasm commented on issue #8179: URL: https://github.com/apache/iceberg/issues/8179#issuecomment-1836918587 Any update on this feature request / development? Thanks

Re: [I] Should we support sort order during table creation SQL ? [iceberg]

2023-12-01 Thread via GitHub
maytasm commented on issue #3547: URL: https://github.com/apache/iceberg/issues/3547#issuecomment-1836917796 A little late to the conversation but another case to think about is CREATE TABLE ... LIKE ... i.e. `create table foo like bar` where the table bar has both

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-12-01 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1412657382 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -500,4 +501,21 @@ public static Snapshot latestSnapshot(TableMetadata metadata, String

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-12-01 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1412657113 ## core/src/main/java/org/apache/iceberg/SystemConfigs.java: ## @@ -43,14 +43,14 @@ private SystemConfigs() {} Integer::parseUnsignedInt); /** -

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-12-01 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1412656524 ## api/src/main/java/org/apache/iceberg/types/TypeUtil.java: ## @@ -452,6 +454,59 @@ private static void checkSchemaCompatibility( } } + public static

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-12-01 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1412655796 ## core/src/main/java/org/apache/iceberg/deletes/BitmapPositionDeleteIndex.java: ## @@ -27,6 +27,15 @@ class BitmapPositionDeleteIndex implements

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-12-01 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1412655501 ## core/src/main/java/org/apache/iceberg/deletes/EmptyPositionDeleteIndex.java: ## @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[PR] Core: REST HttpClient connections config [iceberg]

2023-12-01 Thread via GitHub
danielcweeks opened a new pull request, #9195: URL: https://github.com/apache/iceberg/pull/9195 The default config for http connections is only 25 connections and 5 per host. Since the REST client is typically talking to a single host, the number of connections allowed is relatively small

[I] BUG: Bug: partition name stored in partition data in data file contains special character [iceberg-python]

2023-12-01 Thread via GitHub
puchengy opened a new issue, #175: URL: https://github.com/apache/iceberg-python/issues/175 ### Apache Iceberg version 0.5.0 (latest release) ### Please describe the bug  an example to demonstrate the issue

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-12-01 Thread via GitHub
RussellSpitzer commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1412642976 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -500,4 +501,21 @@ public static Snapshot latestSnapshot(TableMetadata metadata, String

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-12-01 Thread via GitHub
RussellSpitzer commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1412637270 ## core/src/main/java/org/apache/iceberg/deletes/EmptyPositionDeleteIndex.java: ## @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Spark: Bump Spark minor versions for 3.3 and 3.4 [iceberg]

2023-12-01 Thread via GitHub
RussellSpitzer commented on PR #9187: URL: https://github.com/apache/iceberg/pull/9187#issuecomment-1836871316 ```org.apache.iceberg.spark.extensions.TestAddFilesProcedure > invalidDataImport[catalogName = spark_catalog, implementation = org.apache.iceberg.spark.SparkSessionCatalog, config

Re: [PR] Build: Bump ray from 2.7.1 to 2.8.0 [iceberg-python]

2023-12-01 Thread via GitHub
dependabot[bot] closed pull request #129: Build: Bump ray from 2.7.1 to 2.8.0 URL: https://github.com/apache/iceberg-python/pull/129

Re: [PR] Build: Bump ray from 2.7.1 to 2.8.0 [iceberg-python]

2023-12-01 Thread via GitHub
dependabot[bot] commented on PR #129: URL: https://github.com/apache/iceberg-python/pull/129#issuecomment-1836850722 Superseded by #174.

[PR] Build: Bump ray from 2.7.1 to 2.8.1 [iceberg-python]

2023-12-01 Thread via GitHub
dependabot[bot] opened a new pull request, #174: URL: https://github.com/apache/iceberg-python/pull/174 Bumps [ray](https://github.com/ray-project/ray) from 2.7.1 to 2.8.1. Release notes: sourced from [ray's releases](https://github.com/ray-project/ray/releases). Ray-2.8.0

[PR] Core: Add PartitionMap [iceberg]

2023-12-01 Thread via GitHub
aokolnychyi opened a new pull request, #9194: URL: https://github.com/apache/iceberg/pull/9194 This PR adds `PartitionMap`, a map that uses a pair of spec ID and partition tuple as keys. It is similar to `PartitionSet`. The class will simplify places like `DeleteFileIndex` that uses
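The idea described in this PR summary, a map keyed by a pair of spec ID and partition tuple, can be illustrated with a minimal sketch. This is not the `PartitionMap` from PR #9194: the names `PartitionMapSketch`, `SpecPartitionMap`, and `Key` are hypothetical, and the partition tuple is assumed to be a plain `List<Object>` rather than an Iceberg struct.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionMapSketch {
  // Hypothetical composite key: the partition spec ID plus the partition tuple values.
  // Record equality compares both components, so equal tuples under different specs stay distinct.
  record Key(int specId, List<Object> partition) {}

  // Hypothetical stand-in that only illustrates the keying scheme; not Iceberg's class.
  public static final class SpecPartitionMap<V> {
    private final Map<Key, V> entries = new HashMap<>();

    public V put(int specId, List<Object> partition, V value) {
      return entries.put(key(specId, partition), value);
    }

    public V get(int specId, List<Object> partition) {
      return entries.get(key(specId, partition));
    }

    public int size() {
      return entries.size();
    }

    // Copy the tuple so later changes to the caller's list cannot alter the key's hash.
    private static Key key(int specId, List<Object> partition) {
      return new Key(specId, Collections.unmodifiableList(new ArrayList<>(partition)));
    }
  }

  public static void main(String[] args) {
    SpecPartitionMap<String> files = new SpecPartitionMap<>();
    files.put(0, List.of("2023-12-01"), "delete-file-a");
    files.put(1, List.of("2023-12-01"), "delete-file-b");

    System.out.println(files.get(0, List.of("2023-12-01"))); // prints delete-file-a
    System.out.println(files.size()); // prints 2
  }
}
```

The same tuple under two spec IDs yields two entries, which is why the spec ID has to be part of the key.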

Re: [I] Parquet file overwritten by spark streaming job in subsequent execution with same spark streaming checkpoint location [iceberg]

2023-12-01 Thread via GitHub
amitmittal5 commented on issue #9172: URL: https://github.com/apache/iceberg/issues/9172#issuecomment-1836474036 I corrected the typo in the original post so that the data source is Kafka. It now states "reads the data from Kafka and write to ADLS gen 2 in iceberg table"

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-01 Thread via GitHub
ajantha-bhat commented on PR #8909: URL: https://github.com/apache/iceberg/pull/8909#issuecomment-1836303934 Addressed new comments other than (https://github.com/apache/iceberg/pull/8909#discussion_r1410813198). Feel free to push the changes on top of this PR.

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-01 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1412244983 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieViewOperations.java: ## @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-01 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1412241028 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieTableOperations.java: ## @@ -132,74 +130,42 @@ protected void doRefresh() { @Override protected

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-01 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1412239066 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieViewOperations.java: ## @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-01 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1412237675 ## nessie/src/test/java/org/apache/iceberg/nessie/BaseTestIceberg.java: ## @@ -180,6 +188,33 @@ protected Table createTable(TableIdentifier tableIdentifier,

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-01 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1412236654 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java: ## @@ -347,4 +327,73 @@ private TableIdentifier identifierWithoutTableReference(

[I] Flink Rewrite Files Action OOM [iceberg]

2023-12-01 Thread via GitHub
bhupixb opened a new issue, #9193: URL: https://github.com/apache/iceberg/issues/9193 ### Apache Iceberg version 1.4.1 ### Query engine Flink ### Please describe the bug # Background: We are using the Flink Iceberg sinks to write data to an Iceberg

Re: [PR] Support usage of Separate OIDC Authorization Server URI [iceberg]

2023-12-01 Thread via GitHub
syun64 commented on PR #8976: URL: https://github.com/apache/iceberg/pull/8976#issuecomment-1835901741 Leaving a comment to keep the PR active - @nastra @danielcweeks

[I] IN clause on system function is not pushed down [iceberg]

2023-12-01 Thread via GitHub
tmnd1991 opened a new issue, #9191: URL: https://github.com/apache/iceberg/issues/9191 ### Apache Iceberg version main (development) ### Query engine Spark ### Please describe the bug  The following query: ``` SELECT * FROM %s WHERE