Re: [PR] Allow writing `pa.Table` that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on code in PR #921: URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1676751929 ## pyiceberg/io/pyarrow.py: ## @@ -2079,36 +2083,63 @@ def _check_schema_compatible(table_schema: Schema, other_schema: pa.Schema, down Raises:

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
BsoBird commented on code in PR #10688: URL: https://github.com/apache/iceberg/pull/10688#discussion_r1676751334 ## build.gradle: ## @@ -358,6 +358,7 @@ project(':iceberg-core') { implementation libs.jackson.databind implementation libs.caffeine implementation

[I] The unit test for class TestFlinkIcebergSink cannot be executed [iceberg]

2024-07-12 Thread via GitHub
dzzxjl opened a new issue, #10694: URL: https://github.com/apache/iceberg/issues/10694 ### Query engine Flink ### Question The unit test for class TestFlinkIcebergSink cannot be executed. In IDEA, the default unit test command is: `:iceberg-flink:iceberg-flink-1.17:test

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
BsoBird commented on PR #10688: URL: https://github.com/apache/iceberg/pull/10688#issuecomment-2226738178 @danielcweeks @rdblue I know what you mean, Sir. 1.About abandoning the catalog implementation that relies on lockManager. I think this is too radical. I agree that it would

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
BsoBird commented on code in PR #10688: URL: https://github.com/apache/iceberg/pull/10688#discussion_r1676690715 ## build.gradle: ## @@ -358,6 +358,7 @@ project(':iceberg-core') { implementation libs.jackson.databind implementation libs.caffeine implementation

Re: [PR] Standardize AWS credential names [iceberg-python]

2024-07-12 Thread via GitHub
jayceslesar commented on PR #922: URL: https://github.com/apache/iceberg-python/pull/922#issuecomment-2226714861 went to generate the mkdocs and spawned https://github.com/apache/iceberg-python/issues/923 but I think the approach looks good. The only way to make less repeatable would be to

[I] Move mkdocs action/workflow into `docs` group [iceberg-python]

2024-07-12 Thread via GitHub
jayceslesar opened a new issue, #923: URL: https://github.com/apache/iceberg-python/issues/923 ### Feature Request / Improvement I was trying to set up mkdocs on this repo and found the existing setup a little unintuitive -- in most cases I use mkdocs as a `docs` extra in whatever

Re: [PR] Standardize AWS credential names [iceberg-python]

2024-07-12 Thread via GitHub
HonahX commented on code in PR #922: URL: https://github.com/apache/iceberg-python/pull/922#discussion_r1676596155 ## pyiceberg/io/__init__.py: ## @@ -46,6 +48,10 @@ logger = logging.getLogger(__name__) +AWS_REGION = "client.region" Review Comment: I chose `client.`

Re: [I] [Bug] Load the proper AWS credential for glue/dynamodb catalog [iceberg-python]

2024-07-12 Thread via GitHub
HonahX commented on issue #892: URL: https://github.com/apache/iceberg-python/issues/892#issuecomment-2226614416 Hi @kevinjqliu @jayceslesar. Thanks for the issue and valuable discussion. I would like to give a try with my proposal in

Re: [I] How to move Iceberg table from one location to another [iceberg]

2024-07-12 Thread via GitHub
anuragmantri commented on issue #3142: URL: https://github.com/apache/iceberg/issues/3142#issuecomment-2226580307 There is now a PR in review which rewrites metadata with new location and also does some other checks. Please take a look at the PR for some ideas

[PR] Standardize AWS credential names [iceberg-python]

2024-07-12 Thread via GitHub
HonahX opened a new pull request, #922: URL: https://github.com/apache/iceberg-python/pull/922 There have been many discussions and concerns over the current behavior of loading AWS credentials for the glue/dynamo catalog: #892, #515, #570 This PR tries to standardize the property names of

Re: [PR] Encryption integration and test [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #5544: URL: https://github.com/apache/iceberg/pull/5544#discussion_r1676567998 ## core/src/main/java/org/apache/iceberg/TableMetadataParser.java: ## @@ -274,10 +291,12 @@ public static TableMetadata read(FileIO io, String path) { } public

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676567628 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -155,7 +213,11 @@ static Snapshot fromJson(JsonNode node) { operation,

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676566942 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -269,7 +323,11 @@ public Snapshot apply() { operation(), summary(base),

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676566569 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -257,10 +273,48 @@ public Snapshot apply() { .run(index -> manifestFiles[index] =

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676566153 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -257,10 +273,48 @@ public Snapshot apply() { .run(index -> manifestFiles[index] =

Re: [PR] Allow writing `pa.Table` that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on code in PR #921: URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1676566490 ## pyiceberg/io/pyarrow.py: ## @@ -2079,36 +2083,63 @@ def _check_schema_compatible(table_schema: Schema, other_schema: pa.Schema, down Raises:

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676563921 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -257,10 +273,48 @@ public Snapshot apply() { .run(index -> manifestFiles[index] =

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676563382 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -237,10 +244,19 @@ public Snapshot apply() { OutputFile manifestList = manifestListPath();

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676561882 ## core/src/main/java/org/apache/iceberg/encryption/EncryptionUtil.java: ## @@ -71,30 +70,35 @@ public static KeyManagementClient createKmsClient(Map catalogPro }

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676561247 ## core/src/main/java/org/apache/iceberg/BaseSnapshot.java: ## @@ -143,7 +201,24 @@ private void cacheManifests(FileIO fileIO) { if (allManifests == null) {

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676560690 ## api/src/main/java/org/apache/iceberg/ManifestListFile.java: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Spark 3.3/3.4: support read of partition metadata column when table is over 1k [iceberg]

2024-07-12 Thread via GitHub
szehon-ho commented on PR #10641: URL: https://github.com/apache/iceberg/pull/10641#issuecomment-2226523211 Merged, thanks @dramaticlly -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] Spark 3.3/3.4: support read of partition metadata column when table is over 1k [iceberg]

2024-07-12 Thread via GitHub
szehon-ho merged PR #10641: URL: https://github.com/apache/iceberg/pull/10641

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676555890 ## core/src/main/java/org/apache/iceberg/BaseSnapshot.java: ## @@ -62,14 +70,56 @@ class BaseSnapshot implements Snapshot { Map summary, Integer schemaId,

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676552264 ## core/src/main/java/org/apache/iceberg/encryption/StandardEncryptionManager.java: ## @@ -92,13 +94,45 @@ public ByteBuffer wrapKey(ByteBuffer secretKey) { public

Re: [PR] Allow writing `pa.Table` that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
HonahX commented on code in PR #921: URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1676549791 ## pyiceberg/io/pyarrow.py: ## @@ -2079,36 +2083,63 @@ def _check_schema_compatible(table_schema: Schema, other_schema: pa.Schema, down Raises:

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676550818 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -172,6 +234,7 @@ static Snapshot fromJson(JsonNode node) { } } + // Tests only Review

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676546177 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -147,6 +169,42 @@ static Snapshot fromJson(JsonNode node) { if (node.has(MANIFEST_LIST)) {

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676544528 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -147,6 +169,42 @@ static Snapshot fromJson(JsonNode node) { if (node.has(MANIFEST_LIST)) {

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676543810 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -147,6 +169,42 @@ static Snapshot fromJson(JsonNode node) { if (node.has(MANIFEST_LIST)) {

Re: [PR] Spark 3.3/3.4: support read of partition metadata column when table is over 1k [iceberg]

2024-07-12 Thread via GitHub
dramaticlly commented on PR #10641: URL: https://github.com/apache/iceberg/pull/10641#issuecomment-2226487110 @szehon-ho

Re: [PR] [Docs] Add examples for DataFrame branch writes [iceberg]

2024-07-12 Thread via GitHub
anuragmantri commented on code in PR #10644: URL: https://github.com/apache/iceberg/pull/10644#discussion_r1676517338 ## docs/docs/spark-writes.md: ## @@ -228,6 +232,24 @@ SET spark.wap.branch = audit-branch INSERT INTO prod.db.table VALUES (3, 'c'); ``` +### Via DataFrames

Re: [PR] Encryption integration and test [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #5544: URL: https://github.com/apache/iceberg/pull/5544#discussion_r1676515597 ## core/src/main/java/org/apache/iceberg/TableMetadataParser.java: ## @@ -123,6 +127,7 @@ public static void internalWrite( TableMetadata metadata, OutputFile

Re: [PR] Encryption integration and test [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #5544: URL: https://github.com/apache/iceberg/pull/5544#discussion_r1676514642 ## api/src/main/java/org/apache/iceberg/encryption/EncryptingFileIO.java: ## @@ -109,14 +111,19 @@ public InputFile newInputFile(ManifestFile manifest) { } }

Re: [PR] Encryption integration and test [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #5544: URL: https://github.com/apache/iceberg/pull/5544#discussion_r1676512526 ## .palantir/revapi.yml: ## @@ -1018,6 +1018,17 @@ acceptedBreaks: old: "method void

Re: [PR] Encryption integration and test [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #5544: URL: https://github.com/apache/iceberg/pull/5544#discussion_r1676510489 ## .palantir/revapi.yml: ## @@ -1018,6 +1018,17 @@ acceptedBreaks: old: "method void

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676510282 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -106,6 +124,10 @@ public static String toJson(Snapshot snapshot, boolean pretty) { } static

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676501189 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -172,6 +234,7 @@ static Snapshot fromJson(JsonNode node) { } } + // Tests only Review

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676495226 ## core/src/main/java/org/apache/iceberg/BaseSnapshot.java: ## @@ -143,7 +201,24 @@ private void cacheManifests(FileIO fileIO) { if (allManifests == null) {

Re: [PR] [Docs] Add examples for DataFrame branch writes [iceberg]

2024-07-12 Thread via GitHub
szehon-ho commented on code in PR #10644: URL: https://github.com/apache/iceberg/pull/10644#discussion_r1676490679 ## docs/docs/spark-writes.md: ## @@ -228,6 +232,24 @@ SET spark.wap.branch = audit-branch INSERT INTO prod.db.table VALUES (3, 'c'); ``` +### Via DataFrames +

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676486826 ## core/src/main/java/org/apache/iceberg/encryption/EncryptionUtil.java: ## @@ -71,30 +70,35 @@ public static KeyManagementClient createKmsClient(Map catalogPro }

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676486225 ## core/src/main/java/org/apache/iceberg/encryption/EncryptionUtil.java: ## @@ -71,30 +70,35 @@ public static KeyManagementClient createKmsClient(Map catalogPro }

Re: [PR] Allow writing `pa.Table` that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on code in PR #921: URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1676463963 ## pyiceberg/io/pyarrow.py: ## @@ -1450,14 +1451,17 @@ def field_partner(self, partner_struct: Optional[pa.Array], field_id: int, _: st except

Re: [PR] Allow writing `pa.Table` that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on code in PR #921: URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1676465583 ## pyiceberg/io/pyarrow.py: ## @@ -2079,36 +2083,63 @@ def _check_schema_compatible(table_schema: Schema, other_schema: pa.Schema, down Raises:

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676478216 ## core/src/main/java/org/apache/iceberg/BaseSnapshot.java: ## @@ -62,14 +70,56 @@ class BaseSnapshot implements Snapshot { Map summary, Integer schemaId,

Re: [PR] Repair manifest action [iceberg]

2024-07-12 Thread via GitHub
danielcweeks commented on PR #10445: URL: https://github.com/apache/iceberg/pull/10445#issuecomment-2226375695 > It looks similar to my attempt in #2608 @szehon-ho would you have any issue if we proceed with this PR? I think there's overlap between the two, but this one addresses

Re: [PR] [Docs] Add examples for DataFrame branch writes [iceberg]

2024-07-12 Thread via GitHub
anuragmantri commented on code in PR #10644: URL: https://github.com/apache/iceberg/pull/10644#discussion_r1676475608 ## docs/docs/spark-writes.md: ## @@ -228,6 +232,24 @@ SET spark.wap.branch = audit-branch INSERT INTO prod.db.table VALUES (3, 'c'); ``` +### Via DataFrames

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676470278 ## api/src/main/java/org/apache/iceberg/ManifestListFile.java: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676469711 ## api/src/main/java/org/apache/iceberg/ManifestListFile.java: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676467771 ## api/src/main/java/org/apache/iceberg/ManifestListFile.java: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Allow writing dataframes that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on code in PR #921: URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1676467035 ## tests/integration/test_writes/test_writes.py: ## @@ -964,18 +964,38 @@ def test_sanitize_character_partitioned(catalog: Catalog) -> None: assert

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676465401 ## api/src/main/java/org/apache/iceberg/ManifestListFile.java: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Allow writing dataframes that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on code in PR #921: URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1676465583 ## pyiceberg/io/pyarrow.py: ## @@ -2079,36 +2083,63 @@ def _check_schema_compatible(table_schema: Schema, other_schema: pa.Schema, down Raises:

Re: [PR] Allow writing dataframes that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on code in PR #921: URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1676463963 ## pyiceberg/io/pyarrow.py: ## @@ -1450,14 +1451,17 @@ def field_partner(self, partner_struct: Optional[pa.Array], field_id: int, _: st except

Re: [PR] Manifest list encryption [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1676462895 ## api/src/main/java/org/apache/iceberg/Snapshot.java: ## @@ -162,6 +162,16 @@ default Iterable removedDeleteFiles(FileIO io) { */ String

[PR] Update _check_compatible_schema to support subset of schema [iceberg-python]

2024-07-12 Thread via GitHub
syun64 opened a new pull request, #921: URL: https://github.com/apache/iceberg-python/pull/921 (no comment)

Re: [PR] Deprecate to_requested_schema [iceberg-python]

2024-07-12 Thread via GitHub
HonahX merged PR #918: URL: https://github.com/apache/iceberg-python/pull/918

Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-07-12 Thread via GitHub
sumedhsakdeo commented on PR #10336: URL: https://github.com/apache/iceberg/pull/10336#issuecomment-2226308996 Thank you so much @szehon-ho for your contribution to spark side. OPTIONS is indeed the right way to achieve this functionality. I was wondering if UPDATE and DELETE support is

Re: [I] glue.endpoint config implementation? [iceberg-python]

2024-07-12 Thread via GitHub
HonahX closed issue #414: glue.endpoint config implementation? URL: https://github.com/apache/iceberg-python/issues/414

Re: [PR] Glue endpoint config variable, continue #530 [iceberg-python]

2024-07-12 Thread via GitHub
HonahX merged PR #920: URL: https://github.com/apache/iceberg-python/pull/920

Re: [I] [Bug] Load the proper AWS credential for glue/dynamodb catalog [iceberg-python]

2024-07-12 Thread via GitHub
jayceslesar commented on issue #892: URL: https://github.com/apache/iceberg-python/issues/892#issuecomment-2226257388 Maybe it's fine to just break compatibility leading up to 1.0.0? What does the Java Iceberg implementation do when looking for S3 credentials?

Re: [PR] [Docs] Add examples for DataFrame branch writes [iceberg]

2024-07-12 Thread via GitHub
anuragmantri commented on code in PR #10644: URL: https://github.com/apache/iceberg/pull/10644#discussion_r1676383706 ## docs/docs/spark-writes.md: ## @@ -332,6 +332,30 @@ The writer must enable the `mergeSchema` option. ```scala

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
grantatspothero commented on code in PR #10523: URL: https://github.com/apache/iceberg/pull/10523#discussion_r1676380801 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -192,6 +192,11 @@ protected void cleanUncommitted(Set committed) { } } +

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
grantatspothero commented on code in PR #10523: URL: https://github.com/apache/iceberg/pull/10523#discussion_r1676375211 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -198,6 +198,14 @@ protected void cleanUncommitted(Set committed) { } } +

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-12 Thread via GitHub
HonahX commented on PR #910: URL: https://github.com/apache/iceberg-python/pull/910#issuecomment-2226229012 Merged! Thanks for the great work from @syun64 and reviews from @Fokko @kevinjqliu

Re: [I] Cannot cast a datetime type with a timezone into a timestampz type. [iceberg-python]

2024-07-12 Thread via GitHub
HonahX closed issue #863: Cannot cast a datetime type with a timezone into a timestampz type. URL: https://github.com/apache/iceberg-python/issues/863

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-12 Thread via GitHub
HonahX merged PR #910: URL: https://github.com/apache/iceberg-python/pull/910

Re: [I] Chnage the description in Table metadata spec about the cardinality/mapping between snapshot and puffin [iceberg]

2024-07-12 Thread via GitHub
karuppayya commented on issue #10693: URL: https://github.com/apache/iceberg/issues/10693#issuecomment-2226179436 cc: @findepi

Re: [PR] Allow writing dataframes that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
syun64 commented on code in PR #829: URL: https://github.com/apache/iceberg-python/pull/829#discussion_r1676337956 ## pyiceberg/table/__init__.py: ## @@ -484,10 +484,6 @@ def append(self, df: pa.Table, snapshot_properties: Dict[str, str] = EMPTY_DICT)

Re: [I] provide night build [iceberg-python]

2024-07-12 Thread via GitHub
kevinjqliu commented on issue #734: URL: https://github.com/apache/iceberg-python/issues/734#issuecomment-2226162839 Closing in favor of #872

Re: [I] provide night build [iceberg-python]

2024-07-12 Thread via GitHub
kevinjqliu closed issue #734: provide night build URL: https://github.com/apache/iceberg-python/issues/734

Re: [PR] Allow writing dataframes that are either a subset of table schema or in arbitrary order [iceberg-python]

2024-07-12 Thread via GitHub
kevinjqliu commented on code in PR #829: URL: https://github.com/apache/iceberg-python/pull/829#discussion_r1676323868 ## pyiceberg/table/__init__.py: ## @@ -484,10 +484,6 @@ def append(self, df: pa.Table, snapshot_properties: Dict[str, str] = EMPTY_DICT)

Re: [PR] Core: Prevent dropping column which is referenced by active partition… [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on PR #10352: URL: https://github.com/apache/iceberg/pull/10352#issuecomment-2226155639 Sorry about the delay on this, got busy and forgot I had this open! I've seen more related issue reports to this, so I'm going to prioritize it.

Re: [I] How to move Iceberg table from one location to another [iceberg]

2024-07-12 Thread via GitHub
namangoel31 commented on issue #3142: URL: https://github.com/apache/iceberg/issues/3142#issuecomment-2226154660 @cccs-jc, how do you determine the schema for writing to avro? I'm not able to get anything useful.

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
danielcweeks commented on PR #10688: URL: https://github.com/apache/iceberg/pull/10688#issuecomment-2226150038 I don't think we should be adding new `LockManager` implementations. The general discussion has been that we want to deprecate catalog implementations that rely on external

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on code in PR #10523: URL: https://github.com/apache/iceberg/pull/10523#discussion_r1676308188 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -198,6 +198,14 @@ protected void cleanUncommitted(Set committed) { } } +

Re: [PR] Core:support redis and http lock-manager [iceberg]

2024-07-12 Thread via GitHub
rdblue commented on code in PR #10688: URL: https://github.com/apache/iceberg/pull/10688#discussion_r1676303612 ## build.gradle: ## @@ -358,6 +358,7 @@ project(':iceberg-core') { implementation libs.jackson.databind implementation libs.caffeine implementation

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on code in PR #10523: URL: https://github.com/apache/iceberg/pull/10523#discussion_r1676297839 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -198,6 +198,14 @@ protected void cleanUncommitted(Set committed) { } } +

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on PR #10523: URL: https://github.com/apache/iceberg/pull/10523#issuecomment-2226118154 Thanks @grantatspothero the overall approach makes sense and this time it is closely dependent on the internal state of `FastAppend` which combined with the new tests should

Re: [PR] Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on code in PR #10523: URL: https://github.com/apache/iceberg/pull/10523#discussion_r1676113425 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -565,6 +570,10 @@ protected boolean canInheritSnapshotId() { return

Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

2024-07-12 Thread via GitHub
dramaticlly commented on PR #10678: URL: https://github.com/apache/iceberg/pull/10678#issuecomment-2226008338 > > @sl255051 appreciate you are taking the stub for the PR. > > But I am wondering why do you think column name case insensitivity is the right behavior when building

Re: [PR] Core: Prevent dropping column which is referenced by active partition… [iceberg]

2024-07-12 Thread via GitHub
amogh-jahagirdar commented on code in PR #10352: URL: https://github.com/apache/iceberg/pull/10352#discussion_r1676213273 ## core/src/main/java/org/apache/iceberg/SchemaUpdate.java: ## @@ -533,6 +537,34 @@ private static Schema applyChanges( } } +Map>

Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

2024-07-12 Thread via GitHub
sl255051 commented on PR #10678: URL: https://github.com/apache/iceberg/pull/10678#issuecomment-2225993288 > @sl255051 appreciate you are taking the stub for the PR. > > But I am wondering why do you think column name case insensitivity is the right behavior when building

Re: [PR] Docs: Update defaults for distribution mode [iceberg]

2024-07-12 Thread via GitHub
szehon-ho commented on code in PR #10575: URL: https://github.com/apache/iceberg/pull/10575#discussion_r1676206823 ## docs/docs/configuration.md: ## @@ -67,7 +67,7 @@ Iceberg tables support table properties to configure table behavior, like the de |

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-12 Thread via GitHub
findepi commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1676177423 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -175,7 +181,25 @@ public Statistics estimateStatistics() { protected

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1676163120 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -192,4 +209,65 @@ public synchronized T next() { return queue.poll(); } } +

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1676170877 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -88,7 +91,18 @@ private ParallelIterator( @Override public void close() {

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225908686 @stevenzwu thanks for your comments! > Curious if you have done any performance testing. echo to another comment. wondering if the default queue size of 10K would affect the

Re: [PR] Core: Limit memory used by ParallelIterable [iceberg]

2024-07-12 Thread via GitHub
findepi commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225902783 > > Can't the caller set a lower limit then, by calling the new constructor overload? > > Yes, that's possible but then you already have to inherit quite a few classes to
