Re: [PR] Build: Bump mkdocstrings-python from 1.8.0 to 1.10.0 [iceberg-python]

2024-05-01 Thread via GitHub
Fokko merged PR #690: URL: https://github.com/apache/iceberg-python/pull/690 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [I] Type error in naming a protected attribute in class PyArrowFile(OutputFile, InputFile), breaking readability from HDFS using PyArrow [iceberg-python]

2024-05-01 Thread via GitHub
HonahX closed issue #654: Type error in naming a protected attribute in class PyArrowFile(OutputFile, InputFile), breaking readability from HDFS using PyArrow URL: https://github.com/apache/iceberg-python/issues/654

Re: [PR] Changed class variable names in pyarrow.py [iceberg-python]

2024-05-01 Thread via GitHub
HonahX merged PR #686: URL: https://github.com/apache/iceberg-python/pull/686

Re: [PR] Implement table_exists method with try-catch for NoSuchTableError [iceberg-python]

2024-05-01 Thread via GitHub
HonahX commented on code in PR #678: URL: https://github.com/apache/iceberg-python/pull/678#discussion_r1587074970 ## tests/catalog/integration_test_dynamodb.py: ## @@ -262,3 +262,8 @@ def test_update_namespace_properties(test_catalog: Catalog, database_name: str)

Re: [PR] Build: Bump getdaft from 0.2.21 to 0.2.23 [iceberg-python]

2024-05-01 Thread via GitHub
Fokko merged PR #689: URL: https://github.com/apache/iceberg-python/pull/689

Re: [I] Is the "Emitting watermarks" new feature can't be used in flink sql? [iceberg]

2024-05-01 Thread via GitHub
pvary commented on issue #10219: URL: https://github.com/apache/iceberg/issues/10219#issuecomment-2089602930 Maybe we could just implement the interface with the `IcebergTableSource`. We either prevent setting a watermark strategy which is not `noWatermarks` (cleaner approach, but might

Re: [PR] MR: iceberg storage handler should set common projection pruning config [iceberg]

2024-05-01 Thread via GitHub
pvary commented on code in PR #10188: URL: https://github.com/apache/iceberg/pull/10188#discussion_r1587062170 ## mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java: ## @@ -111,8 +111,15 @@ public void configureTableJobProperties(TableDesc tableDesc,

Re: [I] Add runtime module to enable concurrent load of manifest files. [iceberg-rust]

2024-05-01 Thread via GitHub
marvinlanhenke commented on issue #124: URL: https://github.com/apache/iceberg-rust/issues/124#issuecomment-2089542654 In order to verify my understanding and possibly kick off a design discussion, we could follow the approach of `sqlx`: - have a `runtime.rs` - to define a

Re: [PR] refactor: cache partition_schema in `fn plan_files()` [iceberg-rust]

2024-05-01 Thread via GitHub
liurenjie1024 merged PR #362: URL: https://github.com/apache/iceberg-rust/pull/362

Re: [I] enhancement: caching of partition_schemas in `fn plan_files()` [iceberg-rust]

2024-05-01 Thread via GitHub
liurenjie1024 closed issue #361: enhancement: caching of partition_schemas in `fn plan_files()` URL: https://github.com/apache/iceberg-rust/issues/361

Re: [PR] Basic Integration with Datafusion [iceberg-rust]

2024-05-01 Thread via GitHub
liurenjie1024 commented on PR #324: URL: https://github.com/apache/iceberg-rust/pull/324#issuecomment-2089469920 Thanks @marvinlanhenke for this pr, and @Fokko @Xuanwo @viirya @simonvandel @tshauck for review!

Re: [PR] Basic Integration with Datafusion [iceberg-rust]

2024-05-01 Thread via GitHub
liurenjie1024 merged PR #324: URL: https://github.com/apache/iceberg-rust/pull/324

Re: [PR] [WIP] POC of runtime module [iceberg-rust]

2024-05-01 Thread via GitHub
liurenjie1024 commented on PR #233: URL: https://github.com/apache/iceberg-rust/pull/233#issuecomment-2089454899 cc @odysa Is this pr ready for review?

Re: [PR] Refactor: Extract `partition_filters` from `ManifestEvaluator` [iceberg-rust]

2024-05-01 Thread via GitHub
liurenjie1024 commented on code in PR #360: URL: https://github.com/apache/iceberg-rust/pull/360#discussion_r1587013567 ## crates/iceberg/src/scan.rs: ## @@ -169,55 +177,66 @@ pub struct TableScan { filter: Option>, } -/// A stream of [`FileScanTask`]. -pub type

Re: [PR] Refactor: Extract `partition_filters` from `ManifestEvaluator` [iceberg-rust]

2024-05-01 Thread via GitHub
liurenjie1024 commented on code in PR #360: URL: https://github.com/apache/iceberg-rust/pull/360#discussion_r1587013047 ## crates/iceberg/src/scan.rs: ## @@ -99,7 +107,7 @@ impl<'a> TableScanBuilder<'a> { } /// Build the table scan. -pub fn build(self) ->

Re: [PR] Basic Integration with Datafusion [iceberg-rust]

2024-05-01 Thread via GitHub
liurenjie1024 commented on PR #324: URL: https://github.com/apache/iceberg-rust/pull/324#issuecomment-2089449018 > I haven't checked now, but from memory I believe py-iceberg uses pyspark to setup tables with actual data for proper integration testing? I think so. I think datafusion

Re: [I] Geospatial Support [iceberg]

2024-05-01 Thread via GitHub
jiayuasu commented on issue #10260: URL: https://github.com/apache/iceberg/issues/10260#issuecomment-2089326700 Looking forward to the feedback from Iceberg community!

Re: [I] Geospatial Support [iceberg]

2024-05-01 Thread via GitHub
szehon-ho commented on issue #10260: URL: https://github.com/apache/iceberg/issues/10260#issuecomment-2089324826 Note: thanks to @jiayuasu and @Kontinuation from Wherobots for invaluable domain specific advice and POC support from Havasu Iceberg-fork and Geolake, and also @badbye and other

[I] Geospatial Support [iceberg]

2024-05-01 Thread via GitHub
szehon-ho opened a new issue, #10260: URL: https://github.com/apache/iceberg/issues/10260 ### Proposed Change (This is an abridged version of the proposal document) Big data open source projects have been leveraged for storage and analysis of geospatial data for a long time,

Re: [PR] Parquet: page skipping using filtered row groups for non-vectorized read [iceberg]

2024-05-01 Thread via GitHub
wypoon commented on PR #10228: URL: https://github.com/apache/iceberg/pull/10228#issuecomment-2089311010 @zhongyujiang I would be happy to make you a co-author, but it was not easy to pull in commits from your PR directly. If you like, you can open a PR against my branch (even a dummy

Re: [PR] Build: Bump ray from 2.9.2 to 2.12.0 [iceberg-python]

2024-05-01 Thread via GitHub
dependabot[bot] closed pull request #672: Build: Bump ray from 2.9.2 to 2.12.0 URL: https://github.com/apache/iceberg-python/pull/672

Re: [PR] Build: Bump ray from 2.9.2 to 2.12.0 [iceberg-python]

2024-05-01 Thread via GitHub
dependabot[bot] commented on PR #672: URL: https://github.com/apache/iceberg-python/pull/672#issuecomment-2089242154 Superseded by #691.

[PR] Build: Bump ray from 2.9.2 to 2.20.0 [iceberg-python]

2024-05-01 Thread via GitHub
dependabot[bot] opened a new pull request, #691: URL: https://github.com/apache/iceberg-python/pull/691 Bumps [ray](https://github.com/ray-project/ray) from 2.9.2 to 2.20.0. Release notes: sourced from ray's releases (https://github.com/ray-project/ray/releases). Ray-2.20.0

[PR] Build: Bump mkdocstrings-python from 1.8.0 to 1.10.0 [iceberg-python]

2024-05-01 Thread via GitHub
dependabot[bot] opened a new pull request, #690: URL: https://github.com/apache/iceberg-python/pull/690 Bumps [mkdocstrings-python](https://github.com/mkdocstrings/python) from 1.8.0 to 1.10.0. Release notes Sourced from

[PR] Build: Bump getdaft from 0.2.21 to 0.2.23 [iceberg-python]

2024-05-01 Thread via GitHub
dependabot[bot] opened a new pull request, #689: URL: https://github.com/apache/iceberg-python/pull/689 Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.2.21 to 0.2.23. Release notes: sourced from getdaft's releases (https://github.com/Eventual-Inc/Daft/releases).

[PR] Build: Bump coverage from 7.4.4 to 7.5.0 [iceberg-python]

2024-05-01 Thread via GitHub
dependabot[bot] opened a new pull request, #688: URL: https://github.com/apache/iceberg-python/pull/688 Bumps [coverage](https://github.com/nedbat/coveragepy) from 7.4.4 to 7.5.0. Changelog: sourced from https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst, coverage's

Re: [I] REST Catalog to support custom-catalog name like HMS/Glue [iceberg]

2024-05-01 Thread via GitHub
flyrain commented on issue #10205: URL: https://github.com/apache/iceberg/issues/10205#issuecomment-2089231268 The multipart namespace in the REST spec can support a use case like `catalog.db.table`.
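
As a rough illustration of the multipart-namespace idea in the comment above (not taken from the thread): an identifier such as `catalog.db.table` can be split into a namespace tuple and a table name. The helper names `split_identifier` and `encode_namespace` are hypothetical, and joining namespace parts with the unit separator (0x1F) before URL-encoding is an assumption about the REST spec that should be checked against the spec itself.

```python
from typing import Tuple
from urllib.parse import quote


def split_identifier(identifier: str) -> Tuple[Tuple[str, ...], str]:
    """Split 'a.b.table' into (('a', 'b'), 'table'). Hypothetical helper."""
    parts = identifier.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least namespace.table, got {identifier!r}")
    return tuple(parts[:-1]), parts[-1]


def encode_namespace(namespace: Tuple[str, ...]) -> str:
    """Join namespace parts with the unit separator and percent-encode them.

    Assumption: the REST spec separates multipart namespace levels with 0x1F.
    """
    return quote("\x1f".join(namespace), safe="")


namespace, table = split_identifier("catalog.db.table")
print(namespace, table)
print(encode_namespace(namespace))
```

Under this reading, the leading `catalog` part is simply the first level of the namespace rather than a separate catalog-name concept.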

Re: [PR] Accounting writer [iceberg-go]

2024-05-01 Thread via GitHub
thorfour closed pull request #72: Accounting writer URL: https://github.com/apache/iceberg-go/pull/72

Re: [I] Flink: Not Writing [iceberg]

2024-05-01 Thread via GitHub
parrik commented on issue #8916: URL: https://github.com/apache/iceberg/issues/8916#issuecomment-2089011646 @a8356555 have a similar issue - how did you unblock yourself?

Re: [PR] Flink: Backport #10200 to v1.19 and v1.17 [iceberg]

2024-05-01 Thread via GitHub
pvary commented on PR #10259: URL: https://github.com/apache/iceberg/pull/10259#issuecomment-2088912008 Thanks for the review @stevenzwu!

Re: [PR] Flink: Backport #10200 to v1.19 and v1.17 [iceberg]

2024-05-01 Thread via GitHub
pvary merged PR #10259: URL: https://github.com/apache/iceberg/pull/10259

Re: [I] Support for snowflake catalog in apache iceberg [iceberg-python]

2024-05-01 Thread via GitHub
prabodh1194 commented on issue #685: URL: https://github.com/apache/iceberg-python/issues/685#issuecomment-207256 raised PR here -- https://github.com/apache/iceberg-python/pull/687

[PR] define snowflake catalog [iceberg-python]

2024-05-01 Thread via GitHub
prabodh1194 opened a new pull request, #687: URL: https://github.com/apache/iceberg-python/pull/687 - defined reading Iceberg tables using the snowflake catalog. - snowflake catalog is pretty much read only, so adding primarily read only ops. - refer snowflake iceberg sdk read guide:

Re: [PR] Implement table_exists method with try-catch for NoSuchTableError [iceberg-python]

2024-05-01 Thread via GitHub
MehulBatra commented on code in PR #678: URL: https://github.com/apache/iceberg-python/pull/678#discussion_r1586577083 ## pyiceberg/catalog/__init__.py: ## @@ -394,6 +394,11 @@ def table_exists(self, identifier: Union[str, Identifier]) -> bool: Returns:

Re: [I] Spark: Dropping partition column from old partition table corrupts entire table [iceberg]

2024-05-01 Thread via GitHub
EXPEbdodla commented on issue #10234: URL: https://github.com/apache/iceberg/issues/10234#issuecomment-2088839092 Once a partition column is dropped or replaced in the partition spec, it should behave like DROP COLUMN. Is that the right assumption?

Re: [PR] Implement table_exists method with try-catch for NoSuchTableError [iceberg-python]

2024-05-01 Thread via GitHub
kevinjqliu commented on code in PR #678: URL: https://github.com/apache/iceberg-python/pull/678#discussion_r1586520844 ## pyiceberg/catalog/__init__.py: ## @@ -394,6 +394,11 @@ def table_exists(self, identifier: Union[str, Identifier]) -> bool: Returns:
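
The pattern named in the PR title (a `table_exists` method built on a try/except around `NoSuchTableError`) can be sketched roughly as follows. This is not the code under review: `InMemoryCatalog` and the locally defined `NoSuchTableError` are hypothetical stand-ins so the example runs without PyIceberg installed, and string identifiers are used for simplicity.

```python
from typing import Any, Dict


class NoSuchTableError(Exception):
    """Stand-in for the catalog's 'table not found' error (hypothetical)."""


class InMemoryCatalog:
    """Toy catalog used only to illustrate the pattern (hypothetical)."""

    def __init__(self) -> None:
        self._tables: Dict[str, Any] = {}

    def create_table(self, identifier: str, table: Any) -> None:
        self._tables[identifier] = table

    def load_table(self, identifier: str) -> Any:
        try:
            return self._tables[identifier]
        except KeyError:
            raise NoSuchTableError(f"Table does not exist: {identifier}") from None

    def table_exists(self, identifier: str) -> bool:
        # Reuse load_table and translate the "not found" error into False.
        try:
            self.load_table(identifier)
            return True
        except NoSuchTableError:
            return False


catalog = InMemoryCatalog()
catalog.create_table("db.events", object())
print(catalog.table_exists("db.events"), catalog.table_exists("db.missing"))
```

One trade-off of this approach is that existence checks pay the full cost of loading the table; a catalog with a cheaper lookup could override the method.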

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-05-01 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1586515021 ## open-api/rest-catalog-open-api.yaml: ## @@ -537,6 +537,110 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' +

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-05-01 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1586505918 ## open-api/rest-catalog-open-api.yaml: ## @@ -2106,6 +2210,32 @@ components: items: $ref: '#/components/schemas/PartitionStatisticsFile' +

Re: [PR] Core: add a new task-type field to task JSON serialization. add data task JSON serialization imp. [iceberg]

2024-05-01 Thread via GitHub
stevenzwu commented on code in PR #9728: URL: https://github.com/apache/iceberg/pull/9728#discussion_r1586466491 ## core/src/test/java/org/apache/iceberg/TestDataTaskParser.java: ## @@ -0,0 +1,249 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Hive: Remove deprecated method [iceberg]

2024-05-01 Thread via GitHub
Fokko merged PR #10257: URL: https://github.com/apache/iceberg/pull/10257

Re: [PR] Hive: Remove deprecated method [iceberg]

2024-05-01 Thread via GitHub
Fokko commented on PR #10257: URL: https://github.com/apache/iceberg/pull/10257#issuecomment-2088609690 Thanks for the quick review @pvary

Re: [I] Spark: Dropping partition column from old partition table corrupts entire table [iceberg]

2024-05-01 Thread via GitHub
manuzhang commented on issue #10234: URL: https://github.com/apache/iceberg/issues/10234#issuecomment-2088597313 I don't think drop column should be allowed when you still have data with the column as partition.

Re: [PR] Spark 3.5: Only traverse ancestors of current snapshot when building changelog scan [iceberg]

2024-05-01 Thread via GitHub
manuzhang commented on PR #10252: URL: https://github.com/apache/iceberg/pull/10252#issuecomment-2088587509 @flyrain @aokolnychyi please help review

Re: [I] Iceberg / Spark writing to s3 warehouse : Unable to load region from any of the providers in the chain software [iceberg]

2024-05-01 Thread via GitHub
UFMurphy commented on issue #7570: URL: https://github.com/apache/iceberg/issues/7570#issuecomment-2088572698 Hi dramatically, In your {SPARK_HOME}/conf directory, you should see a file called spark-defaults.conf. At the bottom of that file, just add your environment variables

Re: [I] byte and short types in spark no longer auto coerce to int32 [iceberg]

2024-05-01 Thread via GitHub
jkolash commented on issue #10225: URL: https://github.com/apache/iceberg/issues/10225#issuecomment-2088423502 Just wanted to make sure you were aware reproducing is pretty simple ``` Author: jkolash Date: Thu Apr 25 19:23:22 2024 -0400 Failing test for issue

Re: [I] Type error in naming a protected attribute in class PyArrowFile(OutputFile, InputFile), breaking readability from HDFS using PyArrow [iceberg-python]

2024-05-01 Thread via GitHub
SebastianoMeneghin commented on issue #654: URL: https://github.com/apache/iceberg-python/issues/654#issuecomment-2088215801 Here you can find the PR! https://github.com/apache/iceberg-python/pull/686

[PR] Changed class variable names in pyarrow.py [iceberg-python]

2024-05-01 Thread via GitHub
SebastianoMeneghin opened a new pull request, #686: URL: https://github.com/apache/iceberg-python/pull/686 Renamed the class variable in PyArrowFile from `_fs` to `_filesystem`, so that the `__init__()` method assigns the FileSystem object to the class, and not to the single instance of the

Re: [I] appending to a table with Decimal > 32767 results in `int too big to convert` [iceberg-python]

2024-05-01 Thread via GitHub
bigluck commented on issue #669: URL: https://github.com/apache/iceberg-python/issues/669#issuecomment-2088085023 Thanks @Fokko

Re: [I] Add runtime module to enable concurrent load of manifest files. [iceberg-rust]

2024-05-01 Thread via GitHub
liurenjie1024 commented on issue #124: URL: https://github.com/apache/iceberg-rust/issues/124#issuecomment-2088084228 It's already tracked here: https://github.com/apache/iceberg-rust/issues/123

Re: [PR] Build: Bump pydantic from 2.7.0 to 2.7.1 [iceberg-python]

2024-05-01 Thread via GitHub
Fokko merged PR #680: URL: https://github.com/apache/iceberg-python/pull/680

Re: [PR] Build: Bump pyarrow from 15.0.2 to 16.0.0 [iceberg-python]

2024-05-01 Thread via GitHub
Fokko merged PR #681: URL: https://github.com/apache/iceberg-python/pull/681

Re: [I] appending to a table with Decimal > 32767 results in `int too big to convert` [iceberg-python]

2024-05-01 Thread via GitHub
Fokko commented on issue #669: URL: https://github.com/apache/iceberg-python/issues/669#issuecomment-2088044658 A fix has been merged, thanks for reporting this @bigluck and @vtk9

Re: [PR] Build: Bump Poetry to 1.8.2 [iceberg-python]

2024-05-01 Thread via GitHub
HonahX merged PR #676: URL: https://github.com/apache/iceberg-python/pull/676

Re: [PR] Build: Bump moto from 5.0.5 to 5.0.6 [iceberg-python]

2024-05-01 Thread via GitHub
HonahX merged PR #679: URL: https://github.com/apache/iceberg-python/pull/679

Re: [PR] Support CreateTableTransaction for Hive and SQL Catalog [iceberg-python]

2024-05-01 Thread via GitHub
HonahX closed pull request #611: Support CreateTableTransaction for Hive and SQL Catalog URL: https://github.com/apache/iceberg-python/pull/611

Re: [PR] Support CreateTableTransaction for Hive and SQL Catalog [iceberg-python]

2024-05-01 Thread via GitHub
HonahX commented on PR #611: URL: https://github.com/apache/iceberg-python/pull/611#issuecomment-2088039618 Separate this into #683 and #684

[PR] Support CreateTableTransaction for SqlCatalog [iceberg-python]

2024-05-01 Thread via GitHub
HonahX opened a new pull request, #684: URL: https://github.com/apache/iceberg-python/pull/684 Follow-up of https://github.com/apache/iceberg-python/pull/498

[PR] Support CreateTableTransaction for HiveCatalog [iceberg-python]

2024-05-01 Thread via GitHub
HonahX opened a new pull request, #683: URL: https://github.com/apache/iceberg-python/pull/683 Follow-up of #498

Re: [I] appending to a table with Decimal > 32767 results in `int too big to convert` [iceberg-python]

2024-05-01 Thread via GitHub
HonahX closed issue #669: appending to a table with Decimal > 32767 results in `int too big to convert` URL: https://github.com/apache/iceberg-python/issues/669

Re: [PR] Bug: Take signed bit into account [iceberg-python]

2024-05-01 Thread via GitHub
HonahX merged PR #677: URL: https://github.com/apache/iceberg-python/pull/677
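
For context on the "signed bit" wording and the `int too big to convert` message in issue #669, here is a small sketch of how that class of error can arise in plain Python. It is not the merged fix: the function names are hypothetical, and the byte-count formulas are assumptions chosen only to show why 32767 fits in two signed bytes while 32768 does not.

```python
def bytes_required_unsigned_style(value: int) -> int:
    """Byte count that ignores the sign bit (illustrative, buggy for signed use)."""
    return max(1, (value.bit_length() + 7) // 8)


def bytes_required_signed(value: int) -> int:
    """Byte count that always leaves room for the sign bit.

    May over-allocate by one byte for some negative values (e.g. -128),
    which is harmless for a big-endian two's-complement encoding.
    """
    return (value.bit_length() + 8) // 8


def to_signed_bytes(value: int) -> bytes:
    """Encode an integer as big-endian two's-complement bytes."""
    return value.to_bytes(bytes_required_signed(value), byteorder="big", signed=True)


# 32768 needs 16 value bits plus a sign bit, so two signed bytes are not enough.
try:
    (32768).to_bytes(bytes_required_unsigned_style(32768), byteorder="big", signed=True)
except OverflowError as exc:
    print("naive sizing fails:", exc)

print(len(to_signed_bytes(32767)), len(to_signed_bytes(32768)))
```

Decoding with `int.from_bytes(data, byteorder="big", signed=True)` recovers the original value for either width.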