Re: [PR] fix: complete miss attribute for map && list in avro schema [iceberg-rust]

2024-07-31 Thread via GitHub
Fokko commented on PR #411: URL: https://github.com/apache/iceberg-rust/pull/411#issuecomment-2262171294 @ZENOTME can you rebase the PR? :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

Re: [PR] Encryption integration and test [iceberg]

2024-07-31 Thread via GitHub
ggershinsky commented on code in PR #5544: URL: https://github.com/apache/iceberg/pull/5544#discussion_r1699502474 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -106,18 +112,40 @@ public static String translateToIcebergProp(String hmsPr

Re: [PR] Encryption integration and test [iceberg]

2024-07-31 Thread via GitHub
ggershinsky commented on code in PR #5544: URL: https://github.com/apache/iceberg/pull/5544#discussion_r1699473244 ## api/src/main/java/org/apache/iceberg/encryption/EncryptingFileIO.java: ## @@ -109,14 +111,19 @@ public InputFile newInputFile(ManifestFile manifest) { }

Re: [PR] Spec: add variant type [iceberg]

2024-07-31 Thread via GitHub
flyrain commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1699257350 ## format/variant-shredding-spec.md: ## @@ -0,0 +1,264 @@ +--- +title: "View Spec" +--- + + +# Shredding Overview + +The Spark Variant type is designed to store and p

Re: [PR] Spec: Clarify identity partition edge cases. [iceberg]

2024-07-31 Thread via GitHub
emkornfield commented on PR #10835: URL: https://github.com/apache/iceberg/pull/10835#issuecomment-2262121685 CC @rdblue @RussellSpitzer an alternative (or perhaps cleanup) would be to move column projection to be a subject of scan-planning. I think this would flow more nicely since all of

[PR] Spec: Clarify identity partition edge cases. [iceberg]

2024-07-31 Thread via GitHub
emkornfield opened a new pull request, #10835: URL: https://github.com/apache/iceberg/pull/10835 Discussion on mailing list: https://lists.apache.org/thread/hss83r1605r8932b94xv9y2wfb9o0yns.

Re: [I] Investigation about tracing, logging, and metrics support. [iceberg-rust]

2024-07-31 Thread via GitHub
Xuanwo commented on issue #482: URL: https://github.com/apache/iceberg-rust/issues/482#issuecomment-2262103009 > It sounds like `tracing` is the preferred option for tracing and logging. I'm happy to raise a PR to add this if we are all in agreement. I'm fine with this. Also cc @liure

Re: [I] Geospatial Support [iceberg]

2024-07-31 Thread via GitHub
jiayuasu commented on issue #10260: URL: https://github.com/apache/iceberg/issues/10260#issuecomment-2262098914 Thank you all for the great discussion. We will focus on the Parquet geometry proposal for now, then come back to the Iceberg one. As I already commented in the Parquet Geom

Re: [PR] Flink: infer source parallelism for FLIP-27 source in batch execution mode [iceberg]

2024-07-31 Thread via GitHub
stevenzwu commented on PR #10832: URL: https://github.com/apache/iceberg/pull/10832#issuecomment-2262084012 TestIcebergSpeculativeExecutionSupport hangs after this change. need to investigate.

Re: [PR] Ensure that RestCatalog passes user config to FileIO [iceberg-rust]

2024-07-31 Thread via GitHub
liurenjie1024 commented on code in PR #476: URL: https://github.com/apache/iceberg-rust/pull/476#discussion_r1699321914 ## crates/catalog/rest/src/catalog.rs: ## @@ -504,8 +504,15 @@ impl Catalog for RestCatalog { .query::(request) .await?; +l

Re: [PR] Core: Refactor ZOrderByteUtils [iceberg]

2024-07-31 Thread via GitHub
ajantha-bhat commented on PR #10624: URL: https://github.com/apache/iceberg/pull/10624#issuecomment-2262073216 Failed due to flaky test: https://github.com/apache/iceberg/issues/10599

Re: [PR] Core: Refactor ZOrderByteUtils [iceberg]

2024-07-31 Thread via GitHub
ajantha-bhat closed pull request #10624: Core: Refactor ZOrderByteUtils URL: https://github.com/apache/iceberg/pull/10624

Re: [PR] Docs: Clarify wording on releases [iceberg]

2024-07-31 Thread via GitHub
emkornfield commented on code in PR #10806: URL: https://github.com/apache/iceberg/pull/10806#discussion_r1699442754 ## site/docs/how-to-release.md: ## @@ -76,17 +70,33 @@ For more information, see the Gradle [signing documentation](https://docs.gradle The release should be ex

[PR] Docs: Add Databend docs url to sidebar [iceberg]

2024-07-31 Thread via GitHub
PsiACE opened a new pull request, #10834: URL: https://github.com/apache/iceberg/pull/10834 Add links to the official Databend Iceberg tables support in the sidebar.

Re: [PR] Spec: Deprecate the file system table scheme. [iceberg]

2024-07-31 Thread via GitHub
manuzhang commented on PR #10833: URL: https://github.com/apache/iceberg/pull/10833#issuecomment-2261960418 We can hide whitespace in review https://github.com/apache/iceberg/pull/10833/files?diff=split&w=1.

Re: [I] How to create an iceberg table under a custom catalog instead of creating it under hive, using HiveCatalog [iceberg]

2024-07-31 Thread via GitHub
zhengsg commented on issue #10786: URL: https://github.com/apache/iceberg/issues/10786#issuecomment-2261796570 > https://iceberg.apache.org/spark-quickstart/ has an example that uses Spark + REST catalog Rest catalog needs a server for the rest request. If there is no rest server for

Re: [I] A logo for iceberg rust! [iceberg-rust]

2024-07-31 Thread via GitHub
liurenjie1024 closed issue #216: A logo for iceberg rust! URL: https://github.com/apache/iceberg-rust/issues/216

Re: [I] Improve iceberg site documentation. [iceberg]

2024-07-31 Thread via GitHub
RocMarshal closed issue #3534: Improve iceberg site documentation. URL: https://github.com/apache/iceberg/issues/3534

Re: [PR] Spec: add variant type [iceberg]

2024-07-31 Thread via GitHub
flyrain commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1699255366 ## format/spec.md: ## @@ -164,6 +164,8 @@ A **`list`** is a collection of values with some element type. The element field A **`map`** is a collection of key-valu

Re: [I] Configure timestamp downcast programmatically [iceberg-python]

2024-07-31 Thread via GitHub
sungwy commented on issue #960: URL: https://github.com/apache/iceberg-python/issues/960#issuecomment-2261726107 Hi @devinrsmith thank you again for raising this suggestion. I'm working on a similar property driven feature in https://github.com/apache/iceberg-python/pull/986 where I'm propo

Re: [PR] Treat warning as error in CI/Dev [iceberg-python]

2024-07-31 Thread via GitHub
sungwy merged PR #973: URL: https://github.com/apache/iceberg-python/pull/973

Re: [PR] Flink: infer source parallelism for FLIP-27 source in batch execution mode [iceberg]

2024-07-31 Thread via GitHub
stevenzwu commented on code in PR #10832: URL: https://github.com/apache/iceberg/pull/10832#discussion_r1699246487 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java: ## @@ -545,5 +579,66 @@ public IcebergSource build() { table,

Re: [PR] Use 'strtobool' instead of comparing with a string. [iceberg-python]

2024-07-31 Thread via GitHub
sungwy commented on code in PR #988: URL: https://github.com/apache/iceberg-python/pull/988#discussion_r1699244665 ## pyiceberg/expressions/literals.py: ## @@ -588,7 +589,7 @@ def _(self, type_var: DecimalType) -> Literal[Decimal]: def _(self, type_var: BooleanType) -> Lite

Re: [PR] Flink: refactor sink tests to reduce the number of combinations with parameterized tests [iceberg]

2024-07-31 Thread via GitHub
stevenzwu commented on code in PR #10777: URL: https://github.com/apache/iceberg/pull/10777#discussion_r1698727737 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/SqlBase.java: ## @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

2024-07-31 Thread via GitHub
sl255051 commented on code in PR #10678: URL: https://github.com/apache/iceberg/pull/10678#discussion_r1699234156 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -427,13 +429,21 @@ private void checkForRedundantPartitions(PartitionField field) { dedupFi

Re: [PR] Spec: Deprecate the file system table scheme. [iceberg]

2024-07-31 Thread via GitHub
dimas-b commented on code in PR #10833: URL: https://github.com/apache/iceberg/pull/10833#discussion_r1699236415 ## format/spec.md: ## @@ -1393,4 +1395,4 @@ This section covers topics not required by the specification but recommendations Iceberg supports two types of histories

Re: [PR] Flink: infer source parallelism for FLIP-27 source in batch execution mode [iceberg]

2024-07-31 Thread via GitHub
stevenzwu commented on code in PR #10832: URL: https://github.com/apache/iceberg/pull/10832#discussion_r1699222716 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java: ## @@ -481,6 +530,8 @@ public IcebergSource build() { } } +

Re: [I] Does iceberg plan to implement SQL to optimize the layout? [iceberg]

2024-07-31 Thread via GitHub
github-actions[bot] commented on issue #3537: URL: https://github.com/apache/iceberg/issues/3537#issuecomment-2261696185 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Iceberg does not support drop namespace cascade [iceberg]

2024-07-31 Thread via GitHub
github-actions[bot] commented on issue #3541: URL: https://github.com/apache/iceberg/issues/3541#issuecomment-2261696213 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Improve iceberg site documentation. [iceberg]

2024-07-31 Thread via GitHub
github-actions[bot] commented on issue #3534: URL: https://github.com/apache/iceberg/issues/3534#issuecomment-2261696158 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Spec: Deprecate the file system table scheme. [iceberg]

2024-07-31 Thread via GitHub
rdblue commented on code in PR #10833: URL: https://github.com/apache/iceberg/pull/10833#discussion_r1699235451 ## format/spec.md: ## @@ -1393,4 +1395,4 @@ This section covers topics not required by the specification but recommendations Iceberg supports two types of histories

Re: [PR] Spark: vectorized/non-vectorized read compound constant type exception (#3139) [iceberg]

2024-07-31 Thread via GitHub
github-actions[bot] commented on PR #3186: URL: https://github.com/apache/iceberg/pull/3186#issuecomment-2261695684 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spark: vectorized/non-vectorized read compound constant type exception (#3139) [iceberg]

2024-07-31 Thread via GitHub
github-actions[bot] closed pull request #3186: Spark: vectorized/non-vectorized read compound constant type exception (#3139) URL: https://github.com/apache/iceberg/pull/3186

Re: [PR] Data: Add equality delete file cache [iceberg]

2024-07-31 Thread via GitHub
github-actions[bot] closed pull request #3174: Data: Add equality delete file cache URL: https://github.com/apache/iceberg/pull/3174

Re: [PR] Data: Add equality delete file cache [iceberg]

2024-07-31 Thread via GitHub
github-actions[bot] commented on PR #3174: URL: https://github.com/apache/iceberg/pull/3174#issuecomment-2261695654 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

[PR] Spec: Deprecate the file system table scheme. [iceberg]

2024-07-31 Thread via GitHub
rdblue opened a new pull request, #10833: URL: https://github.com/apache/iceberg/pull/10833 In the [discuss thread on the dev list](https://lists.apache.org/thread/oohcjfp1vpo005h2r0f6gfpsp6op0qps), there was agreement to deprecate the "File System Tables" scheme for committing metadata. T

Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

2024-07-31 Thread via GitHub
rdblue commented on code in PR #10678: URL: https://github.com/apache/iceberg/pull/10678#discussion_r1699225863 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -427,13 +429,21 @@ private void checkForRedundantPartitions(PartitionField field) { dedupFiel

[PR] Flink: infer source parallelism for FLIP-27 source in batch execution mode [iceberg]

2024-07-31 Thread via GitHub
stevenzwu opened a new pull request, #10832: URL: https://github.com/apache/iceberg/pull/10832 (no comment)

Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

2024-07-31 Thread via GitHub
sl255051 commented on code in PR #10678: URL: https://github.com/apache/iceberg/pull/10678#discussion_r1699209144 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -427,13 +429,21 @@ private void checkForRedundantPartitions(PartitionField field) { dedupFi

Re: [PR] Spec: add variant type [iceberg]

2024-07-31 Thread via GitHub
aihuaxu commented on PR #10831: URL: https://github.com/apache/iceberg/pull/10831#issuecomment-2261661113 cc @rdblue, @RussellSpitzer and @flyrain

Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

2024-07-31 Thread via GitHub
sl255051 commented on code in PR #10678: URL: https://github.com/apache/iceberg/pull/10678#discussion_r1699210022 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -427,13 +429,21 @@ private void checkForRedundantPartitions(PartitionField field) { dedupFi

Re: [PR] SQL Catalog [iceberg-rust]

2024-07-31 Thread via GitHub
sdd commented on code in PR #503: URL: https://github.com/apache/iceberg-rust/pull/503#discussion_r1699208009 ## crates/catalog/sql/src/catalog.rs: ## @@ -0,0 +1,842 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. Se

[PR] Spec: add variant type [iceberg]

2024-07-31 Thread via GitHub
aihuaxu opened a new pull request, #10831: URL: https://github.com/apache/iceberg/pull/10831 Spec: add variant type Proposal: https://docs.google.com/document/d/1QjhpG_SVNPZh3anFcpicMQx90ebwjL7rmzFYfUP89Iw/edit This is to layout the spec for variant type. As we discussed, we a

Re: [PR] feat: improve compatibility of S3 test minio connection [iceberg-rust]

2024-07-31 Thread via GitHub
sdd closed pull request #470: feat: improve compatibility of S3 test minio connection URL: https://github.com/apache/iceberg-rust/pull/470

Re: [PR] feat: improve compatibility of S3 test minio connection [iceberg-rust]

2024-07-31 Thread via GitHub
sdd commented on PR #470: URL: https://github.com/apache/iceberg-rust/pull/470#issuecomment-2261638592 Closing this, no longer relevant

Re: [PR] API: Update Conversions.fromPartitionString API to handle timestamps [iceberg]

2024-07-31 Thread via GitHub
rdblue commented on PR #10820: URL: https://github.com/apache/iceberg/pull/10820#issuecomment-2261590336 I think there's a [better solution](https://github.com/apache/iceberg/pull/10724#issuecomment-2261582583) that I wrote up on the other issue. I don't think it is a good idea to add capa

Re: [PR] Core: Add DataFiles builder API to enable users to specify their own custom conversion logic for string partition values [iceberg]

2024-07-31 Thread via GitHub
rdblue commented on PR #10724: URL: https://github.com/apache/iceberg/pull/10724#issuecomment-2261582583 I just talked offline with @amogh-jahagirdar about a better way to do this. Systems that store information in strings should have ways to parse those strings to recover the data, and I t

Re: [PR] Spark Action to Analyze table [iceberg]

2024-07-31 Thread via GitHub
karuppayya commented on code in PR #10288: URL: https://github.com/apache/iceberg/pull/10288#discussion_r1699176783 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/ComputeTableStatsSparkAction.java: ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software

[PR] Bump deptry from 0.17.0 to 0.18.0 [iceberg-python]

2024-07-31 Thread via GitHub
dependabot[bot] opened a new pull request, #990: URL: https://github.com/apache/iceberg-python/pull/990 Bumps [deptry](https://github.com/fpgmaas/deptry) from 0.17.0 to 0.18.0. Release notes Sourced from [deptry's releases](https://github.com/fpgmaas/deptry/releases). 0.18.0

[PR] Bump getdaft from 0.2.31 to 0.2.32 [iceberg-python]

2024-07-31 Thread via GitHub
dependabot[bot] opened a new pull request, #989: URL: https://github.com/apache/iceberg-python/pull/989 Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.2.31 to 0.2.32. Release notes Sourced from [getdaft's releases](https://github.com/Eventual-Inc/Daft/releases).

Re: [I] Investigation about tracing, logging, and metrics support. [iceberg-rust]

2024-07-31 Thread via GitHub
sdd commented on issue #482: URL: https://github.com/apache/iceberg-rust/issues/482#issuecomment-2261509816 It sounds like `tracing` is the preferred option for tracing and logging. I'm happy to raise a PR to add this if we are all in agreement. Regarding telemetry, does anyone have o

Re: [PR] API: Define RepairManifests action interface [iceberg]

2024-07-31 Thread via GitHub
amogh-jahagirdar commented on code in PR #10784: URL: https://github.com/apache/iceberg/pull/10784#discussion_r1699116602 ## api/src/main/java/org/apache/iceberg/actions/RepairManifests.java: ## @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Concurrent data file fetching and parallel RecordBatch processing [iceberg-rust]

2024-07-31 Thread via GitHub
sdd commented on PR #515: URL: https://github.com/apache/iceberg-rust/pull/515#issuecomment-2261484551 If I run this directly against locally hosted Minio, cutting out the HAProxy container in the stack (that is being used to introduce latency and bandwidth constraints to simulate real-worl

Re: [PR] DOC: Strawman proposal for PR merging [iceberg]

2024-07-31 Thread via GitHub
emkornfield commented on code in PR #10780: URL: https://github.com/apache/iceberg/pull/10780#discussion_r1699089811 ## site/docs/contribute.md: ## @@ -45,6 +45,16 @@ The Iceberg community prefers to receive contributions as [Github pull requests] * If a PR is related to an is

Re: [I] A move after a rename fails [iceberg]

2024-07-31 Thread via GitHub
RussellSpitzer commented on issue #10830: URL: https://github.com/apache/iceberg/issues/10830#issuecomment-2261433397 I doubt we intend it to work that way, but I'm not sure it's worth fixing?

Re: [I] A move after a rename fails [iceberg]

2024-07-31 Thread via GitHub
Fokko commented on issue #10830: URL: https://github.com/apache/iceberg/issues/10830#issuecomment-2261424723 Yes, this works: ```java @Test public void testMoveAfterRename() { Schema schema = new Schema( required(1, "b", Types.IntegerType.get(

Re: [I] Scan Iceberg table sorted on partition key without sort order [iceberg-python]

2024-07-31 Thread via GitHub
kevinjqliu commented on issue #966: URL: https://github.com/apache/iceberg-python/issues/966#issuecomment-2261421425 @BTheunissen happy to help! When you get a working solution, would you mind posting back here for future reference? Even pseudo-code would be helpful!

Re: [I] A move after a rename fails [iceberg]

2024-07-31 Thread via GitHub
RussellSpitzer commented on issue #10830: URL: https://github.com/apache/iceberg/issues/10830#issuecomment-2261411157 Can you move then rename? Just wondering how this is actually set up.

Re: [I] Scan Iceberg table sorted on partition key without sort order [iceberg-python]

2024-07-31 Thread via GitHub
BTheunissen closed issue #966: Scan Iceberg table sorted on partition key without sort order URL: https://github.com/apache/iceberg-python/issues/966

Re: [I] Add overwrite support for `add_files` [iceberg-python]

2024-07-31 Thread via GitHub
enkidulan closed issue #809: Add overwrite support for `add_files` URL: https://github.com/apache/iceberg-python/issues/809

Re: [I] Add overwrite support for `add_files` [iceberg-python]

2024-07-31 Thread via GitHub
enkidulan commented on issue #809: URL: https://github.com/apache/iceberg-python/issues/809#issuecomment-2261399987 Closing this issue as the recent changes provide overwrite logic for `add_files` method, more on the topic is here https://github.com/apache/iceberg-python/pull/810#issuec

Re: [PR] Adding `add_files_overwrite` method [iceberg-python]

2024-07-31 Thread via GitHub
enkidulan closed pull request #810: Adding `add_files_overwrite` method URL: https://github.com/apache/iceberg-python/pull/810

Re: [PR] Adding `add_files_overwrite` method [iceberg-python]

2024-07-31 Thread via GitHub
enkidulan commented on PR #810: URL: https://github.com/apache/iceberg-python/pull/810#issuecomment-2261396569 Thanks for getting back to me @sungwy . I agree with your point of view and respect the decision. I started working on this feature before #569 was introduced, and back then `add_f

Re: [I] Scan Iceberg table sorted on partition key without sort order [iceberg-python]

2024-07-31 Thread via GitHub
BTheunissen commented on issue #966: URL: https://github.com/apache/iceberg-python/issues/966#issuecomment-2261395609 >Sort based on partition field can be done without reading all the data into memory since we can just work off the table metadata. Awesome this is just what I was look

[PR] Concurrent data file fetching and parallel RecordBatch processing [iceberg-rust]

2024-07-31 Thread via GitHub
sdd opened a new pull request, #515: URL: https://github.com/apache/iceberg-rust/pull/515 NB: (This PR builds on top of https://github.com/apache/iceberg-rust/pull/373 and https://github.com/apache/iceberg-rust/pull/512 and includes their commits, so should be rebased on main once they are

Re: [PR] DOCS: Add more post release notes [iceberg-python]

2024-07-31 Thread via GitHub
sungwy commented on PR #983: URL: https://github.com/apache/iceberg-python/pull/983#issuecomment-2261352451 > wdyt about running the Github action to update the pyiceberg website so that the docs changes are reflected? Otherwise, this will be updated on the next release Already did!

Re: [I] Scan Iceberg table sorted on partition key without sort order [iceberg-python]

2024-07-31 Thread via GitHub
kevinjqliu commented on issue #966: URL: https://github.com/apache/iceberg-python/issues/966#issuecomment-2261342573 Taking a stab at this, > support reading a PyArrow Batch Reader Looking at the code for [`to_arrow_batch_reader` ](https://github.com/apache/iceberg-python/blob/

Re: [PR] Core: Adds Basic Classes for Iceberg Table Version 3 [iceberg]

2024-07-31 Thread via GitHub
RussellSpitzer commented on code in PR #10760: URL: https://github.com/apache/iceberg/pull/10760#discussion_r1699026456 ## core/src/main/java/org/apache/iceberg/ManifestLists.java: ## @@ -66,6 +66,9 @@ static ManifestListWriter write( case 2: return new ManifestL

Re: [PR] Kafka Connect: Include third party licenses and notices in distribution [iceberg]

2024-07-31 Thread via GitHub
bryanck commented on PR #10829: URL: https://github.com/apache/iceberg/pull/10829#issuecomment-2261315905 @rdblue @danielcweeks in case you want to take a look, Let me know if you want a different format, I can easily regenerate the files

[PR] Kafka Connect: Include third party licenses and notices in distribution [iceberg]

2024-07-31 Thread via GitHub
bryanck opened a new pull request, #10829: URL: https://github.com/apache/iceberg/pull/10829 This PR includes third party licenses and notices with the Kafka Connect sink distribution archives. The third party licenses and notices were generated using the Gradle License Report [plugin](htt

Re: [PR] Use 'strtobool' instead of comparing with a string. [iceberg-python]

2024-07-31 Thread via GitHub
sungwy commented on PR #988: URL: https://github.com/apache/iceberg-python/pull/988#issuecomment-2261291991 Yes that sounds like a good idea - the name does sound a bit redundant leaving it as it is: `from pyiceberg.utils.properties import PropertyUtil` - it's quite the tongue twiste

Re: [I] [documentation] library version upgrade fails `test_version_format` [iceberg-python]

2024-07-31 Thread via GitHub
kevinjqliu commented on issue #949: URL: https://github.com/apache/iceberg-python/issues/949#issuecomment-2261284445 sure @laksh-krishna-sharma , assigned to you. Please let me know if you have any questions

Re: [PR] Use 'strtobool' instead of comparing with a string. [iceberg-python]

2024-07-31 Thread via GitHub
ndrluis commented on PR #988: URL: https://github.com/apache/iceberg-python/pull/988#issuecomment-2261270963 Yes, I think that makes sense. What do you think about extracting the methods from the `PropertyUtil` class into functions in the `pyiceberg/utils/properties.py` module?

Re: [PR] Adding `add_files_overwrite` method [iceberg-python]

2024-07-31 Thread via GitHub
sungwy commented on PR #810: URL: https://github.com/apache/iceberg-python/pull/810#issuecomment-2261266264 Hi @enkidulan - thank you very much for putting in the time to write up this PR. I'm very appreciative of the work and the level of interest you have in the new API `add_files`

Re: [PR] Use 'strtobool' instead of comparing with a string. [iceberg-python]

2024-07-31 Thread via GitHub
sungwy commented on PR #988: URL: https://github.com/apache/iceberg-python/pull/988#issuecomment-2261252104 Thank you for cleaning this up @ndrluis I think it would also make sense to move `PropertyUtil` out of `pyiceberg/table/__init__.py` to a new file, like `pyiceberg/utils/prope

Re: [PR] DOCS: Add more post release notes [iceberg-python]

2024-07-31 Thread via GitHub
kevinjqliu commented on PR #983: URL: https://github.com/apache/iceberg-python/pull/983#issuecomment-2261249606 wdyt about running the Github action to update the pyiceberg website so that the docs changes are reflected? Otherwise, this will be updated on the next release

Re: [PR] Exclude Python 3.9.7 due to import error in catalog module [iceberg-python]

2024-07-31 Thread via GitHub
ndrluis commented on PR #526: URL: https://github.com/apache/iceberg-python/pull/526#issuecomment-2261231628 @sungwy Done!

Re: [PR] DOCS: Add more post release notes [iceberg-python]

2024-07-31 Thread via GitHub
sungwy merged PR #983: URL: https://github.com/apache/iceberg-python/pull/983

Re: [PR] DOCS: Add more post release notes [iceberg-python]

2024-07-31 Thread via GitHub
sungwy commented on PR #983: URL: https://github.com/apache/iceberg-python/pull/983#issuecomment-2261208582 > I added a few notes to the "Github Release Notes" part as I am going through the steps on my own forked repo Thanks for these nits, and for taking the time to walk through the

Re: [PR] Exclude Python 3.9.7 due to import error in catalog module [iceberg-python]

2024-07-31 Thread via GitHub
sungwy commented on PR #526: URL: https://github.com/apache/iceberg-python/pull/526#issuecomment-2261194102 This sounds like a great idea @ndrluis - thank you for identifying this issue and putting in the fix. Could we ask for you to rebase your PR against the current main, and run `poetry

Re: [PR] Pyarrow IO property for configuring large v small types on read [iceberg-python]

2024-07-31 Thread via GitHub
sungwy commented on PR #986: URL: https://github.com/apache/iceberg-python/pull/986#issuecomment-2261182063 Once approved/merged, I'd like to bring this up on the discussion thread to add this item to the 0.7.1 patch release as well. It's a small feature, and it would help alleviate the me

Re: [PR] Flink: Maintenance - TriggerManager [iceberg]

2024-07-31 Thread via GitHub
stevenzwu commented on code in PR #10484: URL: https://github.com/apache/iceberg/pull/10484#discussion_r1698852703 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TagBasedLockFactory.java: ## @@ -0,0 +1,152 @@ +/* + * Licensed to the Apache Softw

[I] The ColumnarToRow Spark optimization is not applied when using nested fields from an Iceberg table [iceberg]

2024-07-31 Thread via GitHub
cccs-jc opened a new issue, #10828: URL: https://github.com/apache/iceberg/issues/10828 ### Feature Request / Improvement Prior to the introduction of the ColumnarToRow in Spark 3.0.0, columnar data was converted into Spark's internal rows using generated code that copies data fr

Re: [I] Equality delete lost after compact data files [iceberg]

2024-07-31 Thread via GitHub
szehon-ho commented on issue #10312: URL: https://github.com/apache/iceberg/issues/10312#issuecomment-2261085249 yea I think @RussellSpitzer is right, we should rely on validation error to prevent this scenario here, ie T1 should not be able to commit successfully. i need to understa

Re: [PR] Support for Flink's SpeculativeExecution in batch execution mode - Backport of PR #10548 [iceberg]

2024-07-31 Thread via GitHub
venkata91 commented on PR #10776: URL: https://github.com/apache/iceberg/pull/10776#issuecomment-2261028480 > @pvary Btw, I merged the `main` branch changes to my local branch in order to trigger the tests again. Ideally, if it is transient issue, hopefully this should solve it. @pva

Re: [PR] DOCS: Add more post release notes [iceberg-python]

2024-07-31 Thread via GitHub
HonahX commented on code in PR #983: URL: https://github.com/apache/iceberg-python/pull/983#discussion_r1698873629 ## mkdocs/docs/how-to-release.md: ## @@ -208,6 +208,15 @@ svn add /tmp/iceberg-dist-release/ svn ci -m "PyIceberg " /tmp/iceberg-dist-release/ Review Comment:

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-07-31 Thread via GitHub
ajantha-bhat commented on PR #10176: URL: https://github.com/apache/iceberg/pull/10176#issuecomment-2261017015 Thanks @RussellSpitzer, @aokolnychyi and @lirui-apache for the review feedback. I have also added strong test validations today. I have addressed or replied to each comm

Re: [PR] SQL Catalog [iceberg-rust]

2024-07-31 Thread via GitHub
callum-ryan commented on code in PR #503: URL: https://github.com/apache/iceberg-rust/pull/503#discussion_r1698860073 ## crates/catalog/sql/src/catalog.rs: ## @@ -0,0 +1,836 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] SQL Catalog [iceberg-rust]

2024-07-31 Thread via GitHub
callum-ryan commented on code in PR #503: URL: https://github.com/apache/iceberg-rust/pull/503#discussion_r1698859503 ## crates/catalog/sql/src/catalog.rs: ## @@ -0,0 +1,836 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] DOCS: Add more post release notes [iceberg-python]

2024-07-31 Thread via GitHub
kevinjqliu commented on code in PR #983: URL: https://github.com/apache/iceberg-python/pull/983#discussion_r1698846962 ## mkdocs/docs/how-to-release.md: ## @@ -243,3 +252,17 @@ Make sure to create a PR to update the [GitHub issues template](https://github.c ## Update the integ

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-07-31 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1698841407 ## api/src/main/java/org/apache/iceberg/expressions/Literals.java: ## @@ -299,6 +300,8 @@ public Literal to(Type type) { return (Literal) new TimeLiteral(valu

Re: [I] Scan does not work as expected [iceberg-rust]

2024-07-31 Thread via GitHub
sdd commented on issue #495: URL: https://github.com/apache/iceberg-rust/issues/495#issuecomment-2260978153 > Correctly writing data into iceberg is not supported yet, so we need external systems such as spark to ingest data. Putting pre-generated parquet files may be an approach, but that r
