[GitHub] [iceberg] ajantha-bhat commented on issue #4565: Failure in expire snapshots action due to exception creating partition summary rows

2022-04-19 Thread GitBox
ajantha-bhat commented on issue #4565: URL: https://github.com/apache/iceberg/issues/4565#issuecomment-1103535683 @nreich : Thanks for verifying. I think we can close this issue as the fix is already present in the master branch. -- This is an automated message from the Apache Git

[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #4577: Fixes read metadata table failed due to illegal character

2022-04-19 Thread GitBox
ConeyLiu commented on code in PR #4577: URL: https://github.com/apache/iceberg/pull/4577#discussion_r853767529 ## core/src/main/java/org/apache/iceberg/avro/BuildAvroProjection.java: ## @@ -106,10 +106,16 @@ public Schema record(Schema record, List names, Iterable s

[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #4577: Fixes read metadata table failed due to illegal character

2022-04-19 Thread GitBox
ConeyLiu commented on code in PR #4577: URL: https://github.com/apache/iceberg/pull/4577#discussion_r852555002 ## core/src/main/java/org/apache/iceberg/avro/BuildAvroProjection.java: ## @@ -106,10 +106,16 @@ public Schema record(Schema record, List names, Iterable s

[GitHub] [iceberg] nastra commented on a diff in pull request #4491: Nessie: Extract Catalog client code to NessieClient for Trino Consumption

2022-04-19 Thread GitBox
nastra commented on code in PR #4491: URL: https://github.com/apache/iceberg/pull/4491#discussion_r853753845 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -0,0 +1,334 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [iceberg] stevenzwu commented on issue #3941: [Feature Request] Support for change data capture

2022-04-19 Thread GitBox
stevenzwu commented on issue #3941: URL: https://github.com/apache/iceberg/issues/3941#issuecomment-1103471729 @aokolnychyi great write-up on the algorithm. I think the per-snapshot algorithm is a good starting point that should cover a lot of use cases. > Build a predicate to use wh

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4580: API: Introduce a new IncrementalTableScan interface

2022-04-19 Thread GitBox
stevenzwu commented on code in PR #4580: URL: https://github.com/apache/iceberg/pull/4580#discussion_r853609392 ## api/src/main/java/org/apache/iceberg/IncrementalTableScan.java: ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more c

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4580: API: Introduce a new IncrementalTableScan interface

2022-04-19 Thread GitBox
stevenzwu commented on code in PR #4580: URL: https://github.com/apache/iceberg/pull/4580#discussion_r853717200 ## api/src/main/java/org/apache/iceberg/Scan.java: ## @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor lice

[GitHub] [iceberg] wypoon commented on pull request #4395: Spark: Add custom metric for number of file splits read by a SparkScan

2022-04-19 Thread GitBox
wypoon commented on PR #4395: URL: https://github.com/apache/iceberg/pull/4395#issuecomment-1103448541 @kbendick do you have any further feedback on this? @RussellSpitzer I have implemented your suggestion for a custom metric for how many delete rows are applied in a scan. I decided to op

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4539: Change Data Capture(CDC)[Draft]

2022-04-19 Thread GitBox
stevenzwu commented on code in PR #4539: URL: https://github.com/apache/iceberg/pull/4539#discussion_r853706941 ## api/src/main/java/org/apache/iceberg/actions/Cdc.java: ## @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributo

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4539: Change Data Capture(CDC)[Draft]

2022-04-19 Thread GitBox
stevenzwu commented on code in PR #4539: URL: https://github.com/apache/iceberg/pull/4539#discussion_r853706211 ## api/src/main/java/org/apache/iceberg/actions/ActionsProvider.java: ## @@ -74,4 +74,11 @@ default ExpireSnapshots expireSnapshots(Table table) { default DeleteRea

[GitHub] [iceberg] hililiwei commented on a diff in pull request #3991: Flink: Support nested projection

2022-04-19 Thread GitBox
hililiwei commented on code in PR #3991: URL: https://github.com/apache/iceberg/pull/3991#discussion_r853696393 ## flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/data/RowDataProjection.java: ## @@ -60,11 +76,37 @@ public static RowDataProjection create(Schema schema,

[GitHub] [iceberg] wuwenchi closed issue #4562: flink-sql doesn't support alter table properties which has primary key

2022-04-19 Thread GitBox
wuwenchi closed issue #4562: flink-sql doesn't support alter table properties which has primary key URL: https://github.com/apache/iceberg/issues/4562 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [iceberg] samredai commented on a diff in pull request #4536: Docs: add Cloudera native docs section

2022-04-19 Thread GitBox
samredai commented on code in PR #4536: URL: https://github.com/apache/iceberg/pull/4536#discussion_r853666482 ## docs/cloudera/_index.md: ## @@ -0,0 +1,23 @@ +--- +title: "Cloudera" +bookIconImage: ../img/cloudera-logo.png +bookFlatSection: true +weight: 415 +bookExternalUrlNew

[GitHub] [iceberg] hililiwei commented on a diff in pull request #3991: Flink: Support nested projection

2022-04-19 Thread GitBox
hililiwei commented on code in PR #3991: URL: https://github.com/apache/iceberg/pull/3991#discussion_r849480029 ## flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/data/RowDataProjection.java: ## @@ -63,16 +109,48 @@ public static RowDataProjection create(RowType rowType

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #4585: Core : Strip trailing slash from tableLocation in LocationProvider

2022-04-19 Thread GitBox
singhpk234 commented on code in PR #4585: URL: https://github.com/apache/iceberg/pull/4585#discussion_r853417779 ## core/src/main/java/org/apache/iceberg/LocationProviders.java: ## @@ -160,7 +161,7 @@ private static String pathContext(String tableLocation) { } } - pri

[GitHub] [iceberg] dramaticlly commented on a diff in pull request #4590: Python: Add PartitionField

2022-04-19 Thread GitBox
dramaticlly commented on code in PR #4590: URL: https://github.com/apache/iceberg/pull/4590#discussion_r853648670 ## python/src/iceberg/partitioning.py: ## @@ -0,0 +1,57 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

[GitHub] [iceberg-docs] samredai opened a new pull request, #73: Add hint boxes and codetabs to landing-page

2022-04-19 Thread GitBox
samredai opened a new pull request, #73: URL: https://github.com/apache/iceberg-docs/pull/73 This adds some shortcodes to the landing-page theme that allow us to include hint-boxes and codetabs (code blocks in a tabbed container for multiple languages). Here's an example of what this looks

[GitHub] [iceberg] dramaticlly commented on a diff in pull request #4590: Python: Add PartitionField

2022-04-19 Thread GitBox
dramaticlly commented on code in PR #4590: URL: https://github.com/apache/iceberg/pull/4590#discussion_r853645219 ## python/src/iceberg/partitioning.py: ## @@ -0,0 +1,57 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

[GitHub] [iceberg] felixYyu commented on a diff in pull request #4584: System Printing: Removing System.out and System.err

2022-04-19 Thread GitBox
felixYyu commented on code in PR #4584: URL: https://github.com/apache/iceberg/pull/4584#discussion_r853641957 ## spark/v2.4/spark/src/test/java/org/apache/iceberg/spark/data/TestParquetAvroReader.java: ## @@ -116,8 +120,8 @@ public void testStructSchema() throws IOException {

[GitHub] [iceberg] felixYyu commented on a diff in pull request #4584: System Printing: Removing System.out and System.err

2022-04-19 Thread GitBox
felixYyu commented on code in PR #4584: URL: https://github.com/apache/iceberg/pull/4584#discussion_r853640863 ## api/src/test/java/org/apache/iceberg/types/TestReadabilityChecks.java: ## @@ -362,7 +365,7 @@ public void testStructWriteReordering() { List errors = CheckCompa

[GitHub] [iceberg] samredai commented on a diff in pull request #4590: Python: Add PartitionField

2022-04-19 Thread GitBox
samredai commented on code in PR #4590: URL: https://github.com/apache/iceberg/pull/4590#discussion_r853612744 ## python/src/iceberg/partitioning.py: ## @@ -0,0 +1,57 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See

[GitHub] [iceberg] dramaticlly opened a new pull request, #4590: Python: Add PartitionField

2022-04-19 Thread GitBox
dramaticlly opened a new pull request, #4590: URL: https://github.com/apache/iceberg/pull/4590 This is the first step of reintroducing https://github.com/apache/iceberg/issues/3228; it adds `PartitionField` in `iceberg/partitioning.py` ``` from iceberg.partitioning import PartitionField

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #3983: Spark: Spark3 ZOrder Rewrite Strategy

2022-04-19 Thread GitBox
szehon-ho commented on code in PR #3983: URL: https://github.com/apache/iceberg/pull/3983#discussion_r853581777 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/actions/SparkZOrderStrategy.java: ## @@ -0,0 +1,219 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [iceberg] stevenzwu commented on pull request #4580: API: Introduce a new IncrementalTableScan interface

2022-04-19 Thread GitBox
stevenzwu commented on PR #4580: URL: https://github.com/apache/iceberg/pull/4580#issuecomment-1103245098 @flyrain This is just a starting point. I am sure the current `IncrementalScan` interface is NOT good for the CDC read today, which needs more complex planning control. That was also pa

[GitHub] [iceberg] flyrain commented on a diff in pull request #4539: Change Data Capture(CDC)[Draft]

2022-04-19 Thread GitBox
flyrain commented on code in PR #4539: URL: https://github.com/apache/iceberg/pull/4539#discussion_r853511419 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/actions/BaseCdcSparkAction.java: ## @@ -0,0 +1,262 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #4560: Core: Fix Partitions table for evolved partition specs

2022-04-19 Thread GitBox
RussellSpitzer commented on code in PR #4560: URL: https://github.com/apache/iceberg/pull/4560#discussion_r853523997 ## spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/source/TestMetadataTablesWithPartitionEvolution.java: ## @@ -261,6 +262,118 @@ public void testEntriesM

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #4560: Core: Fix Partitions table for evolved partition specs

2022-04-19 Thread GitBox
RussellSpitzer commented on code in PR #4560: URL: https://github.com/apache/iceberg/pull/4560#discussion_r853522539 ## core/src/main/java/org/apache/iceberg/PartitionsTable.java: ## @@ -93,16 +93,36 @@ private static StaticDataTask.Row convertPartition(Partition partition) {

[GitHub] [iceberg] flyrain commented on a diff in pull request #4539: Change Data Capture(CDC)[Draft]

2022-04-19 Thread GitBox
flyrain commented on code in PR #4539: URL: https://github.com/apache/iceberg/pull/4539#discussion_r853514236 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/actions/BaseCdcSparkAction.java: ## @@ -0,0 +1,262 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [iceberg] flyrain commented on a diff in pull request #4539: Change Data Capture(CDC)[Draft]

2022-04-19 Thread GitBox
flyrain commented on code in PR #4539: URL: https://github.com/apache/iceberg/pull/4539#discussion_r853507529 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/actions/BaseCdcSparkAction.java: ## @@ -0,0 +1,262 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [iceberg] flyrain commented on a diff in pull request #4539: Change Data Capture(CDC)[Draft]

2022-04-19 Thread GitBox
flyrain commented on code in PR #4539: URL: https://github.com/apache/iceberg/pull/4539#discussion_r853498509 ## core/src/main/java/org/apache/iceberg/BaseFileScanTask.java: ## @@ -47,6 +47,10 @@ public BaseFileScanTask(DataFile file, DeleteFile[] deletes, String schemaString

[GitHub] [iceberg] flyrain commented on a diff in pull request #4580: API: Introduce a new IncrementalTableScan interface

2022-04-19 Thread GitBox
flyrain commented on code in PR #4580: URL: https://github.com/apache/iceberg/pull/4580#discussion_r853475107 ## api/src/main/java/org/apache/iceberg/IncrementalTableScan.java: ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more con

[GitHub] [iceberg] flyrain commented on pull request #4580: API: Introduce a new IncrementalTableScan interface

2022-04-19 Thread GitBox
flyrain commented on PR #4580: URL: https://github.com/apache/iceberg/pull/4580#issuecomment-1103136490 Thanks @stevenzwu for the PR. I’m OK with the change, but I doubt if CDC can use the interface IncrementalTableScan. Basically CDC requires much finer control of planning, check my CD

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #4575: Core: Add context headers to RESTCatalog

2022-04-19 Thread GitBox
RussellSpitzer commented on code in PR #4575: URL: https://github.com/apache/iceberg/pull/4575#discussion_r853442456 ## core/src/main/java/org/apache/iceberg/catalog/BaseSessionCatalog.java: ## @@ -0,0 +1,170 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4580: API: Introduce a new IncrementalTableScan interface

2022-04-19 Thread GitBox
stevenzwu commented on code in PR #4580: URL: https://github.com/apache/iceberg/pull/4580#discussion_r853407912 ## api/src/main/java/org/apache/iceberg/Scan.java: ## @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor licen

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4580: API: Introduce a new IncrementalTableScan interface

2022-04-19 Thread GitBox
stevenzwu commented on code in PR #4580: URL: https://github.com/apache/iceberg/pull/4580#discussion_r853407290 ## api/src/main/java/org/apache/iceberg/Scan.java: ## @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor lice

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4580: API: Introduce a new IncrementalTableScan interface

2022-04-19 Thread GitBox
stevenzwu commented on code in PR #4580: URL: https://github.com/apache/iceberg/pull/4580#discussion_r853406060 ## api/src/main/java/org/apache/iceberg/TableScan.java: ## @@ -148,7 +89,9 @@ default TableScan select(String... columns) { * @return a table scan which can read a

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4580: API: Introduce a new IncrementalTableScan interface

2022-04-19 Thread GitBox
stevenzwu commented on code in PR #4580: URL: https://github.com/apache/iceberg/pull/4580#discussion_r853404534 ## api/src/main/java/org/apache/iceberg/IncrementalTableScan.java: ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more c

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4580: API: Introduce a new IncrementalTableScan interface

2022-04-19 Thread GitBox
stevenzwu commented on code in PR #4580: URL: https://github.com/apache/iceberg/pull/4580#discussion_r853404336 ## api/src/main/java/org/apache/iceberg/IncrementalTableScan.java: ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more c

[GitHub] [iceberg] rdblue commented on a diff in pull request #4585: Core : Strip trailing slash from tableLocation in LocationProvider

2022-04-19 Thread GitBox
rdblue commented on code in PR #4585: URL: https://github.com/apache/iceberg/pull/4585#discussion_r853404115 ## core/src/main/java/org/apache/iceberg/LocationProviders.java: ## @@ -160,7 +161,7 @@ private static String pathContext(String tableLocation) { } } - private

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #4585: Core : Strip trailing slash from tableLocation in LocationProvider

2022-04-19 Thread GitBox
singhpk234 commented on code in PR #4585: URL: https://github.com/apache/iceberg/pull/4585#discussion_r853393283 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopCatalog.java: ## @@ -98,11 +98,8 @@ public HadoopCatalog() { @Override public void initialize(String nam

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #4585: Core : Strip trailing slash from tableLocation in LocationProvider

2022-04-19 Thread GitBox
singhpk234 commented on code in PR #4585: URL: https://github.com/apache/iceberg/pull/4585#discussion_r853390995 ## core/src/main/java/org/apache/iceberg/CatalogUtil.java: ## @@ -341,4 +341,15 @@ public static void configureHadoopConf(Object maybeConfigurable, Object conf) {

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #4585: Core : Strip trailing slash from tableLocation in LocationProvider

2022-04-19 Thread GitBox
singhpk234 commented on code in PR #4585: URL: https://github.com/apache/iceberg/pull/4585#discussion_r853390812 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopCatalog.java: ## @@ -98,11 +98,8 @@ public HadoopCatalog() { @Override public void initialize(String nam

[GitHub] [iceberg] kbendick commented on issue #4346: Make DeleteOrphanFiles in Spark reliable

2022-04-19 Thread GitBox
kbendick commented on issue #4346: URL: https://github.com/apache/iceberg/issues/4346#issuecomment-1102969266 > @aokolnychyi, that plan sounds great to me. I think that covers all the cases we need to. Yeah this all sounds good to me as well. The nuances / different configurati

[GitHub] [iceberg] puchengy opened a new issue, #4589: Python library feature proposal: list_partitions api

2022-04-19 Thread GitBox
puchengy opened a new issue, #4589: URL: https://github.com/apache/iceberg/issues/4589 I am proposing a list_partitions feature for a given table in the Python library. Similar to https://github.com/apache/iceberg/issues/3843, during the migration from Hive to Iceberg, we see usage

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #4575: Core: Add context headers to RESTCatalog

2022-04-19 Thread GitBox
jackye1995 commented on code in PR #4575: URL: https://github.com/apache/iceberg/pull/4575#discussion_r853347091 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -0,0 +1,479 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [iceberg] wypoon commented on a diff in pull request #4588: Spark: Add custom metric for number of deletes applied by a SparkScan

2022-04-19 Thread GitBox
wypoon commented on code in PR #4588: URL: https://github.com/apache/iceberg/pull/4588#discussion_r853347186 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnVectorWithFilter.java: ## @@ -97,9 +97,18 @@ public byte[] getBinary(int rowId) { re

[GitHub] [iceberg] RussellSpitzer commented on pull request #4491: Nessie: Extract Catalog client code to NessieClient for Trino Consumption

2022-04-19 Thread GitBox
RussellSpitzer commented on PR #4491: URL: https://github.com/apache/iceberg/pull/4491#issuecomment-1102930672 @nastra Merged! Thanks for the PR and I hope to see that Nessie Trino Support soon! -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [iceberg] RussellSpitzer merged pull request #4491: Nessie: Extract Catalog client code to NessieClient for Trino Consumption

2022-04-19 Thread GitBox
RussellSpitzer merged PR #4491: URL: https://github.com/apache/iceberg/pull/4491 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceb

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #4578: Core: Update Remove Snapshots procedure for branching and tagging

2022-04-19 Thread GitBox
amogh-jahagirdar commented on code in PR #4578: URL: https://github.com/apache/iceberg/pull/4578#discussion_r853341526 ## core/src/main/java/org/apache/iceberg/RemoveSnapshots.java: ## @@ -163,21 +180,86 @@ private TableMetadata internalApply() { this.base = ops.refresh();

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #4575: Core: Add context headers to RESTCatalog

2022-04-19 Thread GitBox
jackye1995 commented on code in PR #4575: URL: https://github.com/apache/iceberg/pull/4575#discussion_r853339764 ## api/src/main/java/org/apache/iceberg/catalog/SessionCatalog.java: ## @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #4578: Core: Update Remove Snapshots procedure for branching and tagging

2022-04-19 Thread GitBox
amogh-jahagirdar commented on code in PR #4578: URL: https://github.com/apache/iceberg/pull/4578#discussion_r853338493 ## core/src/main/java/org/apache/iceberg/RemoveSnapshots.java: ## @@ -163,21 +180,86 @@ private TableMetadata internalApply() { this.base = ops.refresh();

[GitHub] [iceberg] flyrain commented on a diff in pull request #4391: Docs : add s3 access-point documentation

2022-04-19 Thread GitBox
flyrain commented on code in PR #4391: URL: https://github.com/apache/iceberg/pull/4391#discussion_r853327699 ## docs/integrations/aws.md: ## @@ -435,6 +435,25 @@ For the above example, the objects in S3 will be saved with tags: `my_key1=my_va For more details on tag restric

[GitHub] [iceberg] wypoon commented on a diff in pull request #4588: Spark: Add custom metric for number of deletes applied by a SparkScan

2022-04-19 Thread GitBox
wypoon commented on code in PR #4588: URL: https://github.com/apache/iceberg/pull/4588#discussion_r853320291 ## core/src/main/java/org/apache/iceberg/deletes/DeleteCounter.java: ## @@ -0,0 +1,33 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more co

[GitHub] [iceberg] wypoon opened a new pull request, #4588: Spark: Add custom metric for number of deletes applied by a SparkScan

2022-04-19 Thread GitBox
wypoon opened a new pull request, #4588: URL: https://github.com/apache/iceberg/pull/4588 This is an extension of #4395. Here we add a custom metric for the number of delete rows that have been applied in a scan of a format v2 table. Tested manually by creating a format v2 table us

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #4560: Core: Fix Partitions table for evolved partition specs

2022-04-19 Thread GitBox
szehon-ho commented on code in PR #4560: URL: https://github.com/apache/iceberg/pull/4560#discussion_r853294340 ## core/src/main/java/org/apache/iceberg/PartitionsTable.java: ## @@ -93,16 +93,36 @@ private static StaticDataTask.Row convertPartition(Partition partition) { r

[GitHub] [iceberg] nreich commented on issue #4565: Failure in expire snapshots action due to exception creating partition summary rows

2022-04-19 Thread GitBox
nreich commented on issue #4565: URL: https://github.com/apache/iceberg/issues/4565#issuecomment-1102877437 I have been able to use the latest master to successfully conduct snapshot expiration on the table and can confirm that this issue will be resolved when #3411 is released. -- This

[GitHub] [iceberg] nastra commented on a diff in pull request #4491: Nessie: simplify code in Nessie catalog

2022-04-19 Thread GitBox
nastra commented on code in PR #4491: URL: https://github.com/apache/iceberg/pull/4491#discussion_r853271375 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNessieTable.java: ## @@ -323,22 +321,35 @@ public void testExistingTableUpdate() { } @Test - public void te

[GitHub] [iceberg] rizaon commented on a diff in pull request #4518: core: Provide mechanism to cache manifest file content

2022-04-19 Thread GitBox
rizaon commented on code in PR #4518: URL: https://github.com/apache/iceberg/pull/4518#discussion_r852991634 ## core/src/main/java/org/apache/iceberg/hadoop/CachingHadoopTables.java: ## @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #4491: Nessie: simplify code in Nessie catalog

2022-04-19 Thread GitBox
RussellSpitzer commented on code in PR #4491: URL: https://github.com/apache/iceberg/pull/4491#discussion_r853268516 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNessieTable.java: ## @@ -323,22 +321,35 @@ public void testExistingTableUpdate() { } @Test - public

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #4491: Nessie: simplify code in Nessie catalog

2022-04-19 Thread GitBox
RussellSpitzer commented on code in PR #4491: URL: https://github.com/apache/iceberg/pull/4491#discussion_r853261544 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -0,0 +1,334 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] [iceberg] szlta commented on pull request #4560: Core: Fix Partitions table for evolved partition specs

2022-04-19 Thread GitBox
szlta commented on PR #4560: URL: https://github.com/apache/iceberg/pull/4560#issuecomment-1102843550 Thanks for catching this @szehon-ho, this change looks good to me, just added a nit comment. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [iceberg] szlta commented on a diff in pull request #4560: Core: Fix Partitions table for evolved partition specs

2022-04-19 Thread GitBox
szlta commented on code in PR #4560: URL: https://github.com/apache/iceberg/pull/4560#discussion_r853254883 ## core/src/main/java/org/apache/iceberg/PartitionsTable.java: ## @@ -93,16 +93,36 @@ private static StaticDataTask.Row convertPartition(Partition partition) { retur

[GitHub] [iceberg] rdblue commented on a diff in pull request #4575: Core: Add context headers to RESTCatalog

2022-04-19 Thread GitBox
rdblue commented on code in PR #4575: URL: https://github.com/apache/iceberg/pull/4575#discussion_r853253104 ## api/src/main/java/org/apache/iceberg/catalog/SessionCatalog.java: ## @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more c

[GitHub] [iceberg] danielcweeks commented on a diff in pull request #4575: Core: Add context headers to RESTCatalog

2022-04-19 Thread GitBox
danielcweeks commented on code in PR #4575: URL: https://github.com/apache/iceberg/pull/4575#discussion_r853249916 ## api/src/main/java/org/apache/iceberg/catalog/SessionCatalog.java: ## @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [iceberg] davidjdiebold closed issue #4587: Cannot create a table with pyspark

2022-04-19 Thread GitBox
davidjdiebold closed issue #4587: Cannot create a table with pyspark URL: https://github.com/apache/iceberg/issues/4587 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsub

[GitHub] [iceberg] davidjdiebold commented on issue #4587: Cannot create a table with pyspark

2022-04-19 Thread GitBox
davidjdiebold commented on issue #4587: URL: https://github.com/apache/iceberg/issues/4587#issuecomment-1102830037 Found the error, I was not downloading the jar correctly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

[GitHub] [iceberg] rdblue commented on a diff in pull request #4584: logger: SLF4J logger instead System.out and System.err

2022-04-19 Thread GitBox
rdblue commented on code in PR #4584: URL: https://github.com/apache/iceberg/pull/4584#discussion_r853244577 ## spark/v2.4/spark/src/test/java/org/apache/iceberg/spark/data/TestParquetAvroReader.java: ## @@ -116,8 +120,8 @@ public void testStructSchema() throws IOException {

[GitHub] [iceberg] danielcweeks commented on a diff in pull request #4575: Core: Add context headers to RESTCatalog

2022-04-19 Thread GitBox
danielcweeks commented on code in PR #4575: URL: https://github.com/apache/iceberg/pull/4575#discussion_r853244316 ## api/src/main/java/org/apache/iceberg/catalog/SessionCatalog.java: ## @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [iceberg] rdblue commented on a diff in pull request #4584: logger: SLF4J logger instead System.out and System.err

2022-04-19 Thread GitBox
rdblue commented on code in PR #4584: URL: https://github.com/apache/iceberg/pull/4584#discussion_r853243723 ## api/src/test/java/org/apache/iceberg/types/TestReadabilityChecks.java: ## @@ -362,7 +365,7 @@ public void testStructWriteReordering() { List errors = CheckCompati

[GitHub] [iceberg] rdblue commented on pull request #4585: Core : Strip trailing slash from tableLocation in LocationProvider

2022-04-19 Thread GitBox
rdblue commented on PR #4585: URL: https://github.com/apache/iceberg/pull/4585#issuecomment-1102823657 Thanks, @singhpk234! Looks mostly good, but I think we should use just one `stripTrailingSlash` method. -- This is an automated message from the Apache Git Service. To respond to the mes

[GitHub] [iceberg] rdblue commented on a diff in pull request #4585: Core : Strip trailing slash from tableLocation in LocationProvider

2022-04-19 Thread GitBox
rdblue commented on code in PR #4585: URL: https://github.com/apache/iceberg/pull/4585#discussion_r853239809 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopCatalog.java: ## @@ -98,11 +98,8 @@ public HadoopCatalog() { @Override public void initialize(String name, M

[GitHub] [iceberg] davidjdiebold opened a new issue, #4587: Cannot create a table with pyspark

2022-04-19 Thread GitBox
davidjdiebold opened a new issue, #4587: URL: https://github.com/apache/iceberg/issues/4587 Hello, I'm unable to create a table with iceberg using pyspark on a colab notebook, using spark 3.2. I have been relying on this tutorial https://iceberg.apache.org/docs/latest/getting-sta

[GitHub] [iceberg] rdblue commented on a diff in pull request #4585: Core : Strip trailing slash from tableLocation in LocationProvider

2022-04-19 Thread GitBox
rdblue commented on code in PR #4585: URL: https://github.com/apache/iceberg/pull/4585#discussion_r853239540 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopCatalog.java: ## @@ -98,11 +98,8 @@ public HadoopCatalog() { @Override public void initialize(String name, M

[GitHub] [iceberg] rdblue commented on a diff in pull request #4585: Core : Strip trailing slash from tableLocation in LocationProvider

2022-04-19 Thread GitBox
rdblue commented on code in PR #4585: URL: https://github.com/apache/iceberg/pull/4585#discussion_r853238932 ## core/src/main/java/org/apache/iceberg/LocationProviders.java: ## @@ -51,7 +52,7 @@ public static LocationProvider locationsFor(String location, Map impl,

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #4575: Core: Add context headers to RESTCatalog

2022-04-19 Thread GitBox
RussellSpitzer commented on code in PR #4575: URL: https://github.com/apache/iceberg/pull/4575#discussion_r853238699 ## api/src/main/java/org/apache/iceberg/catalog/SessionCatalog.java: ## @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

[GitHub] [iceberg] rdblue commented on a diff in pull request #4585: Core : Strip trailing slash from tableLocation in LocationProvider

2022-04-19 Thread GitBox
rdblue commented on code in PR #4585: URL: https://github.com/apache/iceberg/pull/4585#discussion_r853238064 ## core/src/main/java/org/apache/iceberg/CatalogUtil.java: ## @@ -341,4 +341,15 @@ public static void configureHadoopConf(Object maybeConfigurable, Object conf) {

[GitHub] [iceberg] rdblue closed pull request #3614: Flink: Fix ALTER for tables with primary key fields

2022-04-19 Thread GitBox
rdblue closed pull request #3614: Flink: Fix ALTER for tables with primary key fields URL: https://github.com/apache/iceberg/pull/3614 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [iceberg] rdblue commented on pull request #3614: Flink: Fix ALTER for tables with primary key fields

2022-04-19 Thread GitBox
rdblue commented on PR #3614: URL: https://github.com/apache/iceberg/pull/3614#issuecomment-1102819242 This was fixed in #4561 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [iceberg] rdblue commented on pull request #4561: Flink: support modify table properties with primary key.

2022-04-19 Thread GitBox
rdblue commented on PR #4561: URL: https://github.com/apache/iceberg/pull/4561#issuecomment-1102818720 @wuwenchi, thanks for clarifying how the test was written! Since this was mostly a copy of an existing test, I can see why it looks so similar to the one in the other PR. -- This is

[GitHub] [iceberg] puchengy commented on issue #4583: [Feature Proposal] support setting custom metadata location prefix from catalog level

2022-04-19 Thread GitBox
puchengy commented on issue #4583: URL: https://github.com/apache/iceberg/issues/4583#issuecomment-1102809317 @singhpk234 Thanks for sharing. Yes, I think this PR will help my use case to a certain extent, but it cannot fully fulfill my goal. -- This is an automated message f

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #3983: Spark: Spark3 ZOrder Rewrite Strategy

2022-04-19 Thread GitBox
RussellSpitzer commented on code in PR #3983: URL: https://github.com/apache/iceberg/pull/3983#discussion_r853207010 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/actions/SparkZOrderStrategy.java: ## @@ -0,0 +1,219 @@ +/* + * Licensed to the Apache Software Foundati

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #3983: Spark: Spark3 ZOrder Rewrite Strategy

2022-04-19 Thread GitBox
RussellSpitzer commented on code in PR #3983: URL: https://github.com/apache/iceberg/pull/3983#discussion_r853205055 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java: ## @@ -299,6 +301,28 @@ public void write(int repetitionLevel, byte[] by

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #3983: Spark: Spark3 ZOrder Rewrite Strategy

2022-04-19 Thread GitBox
RussellSpitzer commented on code in PR #3983: URL: https://github.com/apache/iceberg/pull/3983#discussion_r853202949 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/actions/SparkZOrderStrategy.java: ## @@ -0,0 +1,219 @@ +/* + * Licensed to the Apache Software Foundati

[GitHub] [iceberg] rajarshisarkar opened a new pull request, #4586: Spark: Add Spark procedures for snapshot branching and tagging

2022-04-19 Thread GitBox
rajarshisarkar opened a new pull request, #4586: URL: https://github.com/apache/iceberg/pull/4586 This PR adds the Spark procedures for snapshot branching and tagging. I have added the `CreateTagProcedure` for the initial round of review. I will add the other procedures if this looks

[GitHub] [iceberg] rajarshisarkar commented on a diff in pull request #4578: Core: Update Remove Snapshots procedure for branching and tagging

2022-04-19 Thread GitBox
rajarshisarkar commented on code in PR #4578: URL: https://github.com/apache/iceberg/pull/4578#discussion_r853019270 ## core/src/main/java/org/apache/iceberg/RemoveSnapshots.java: ## @@ -163,21 +180,86 @@ private TableMetadata internalApply() { this.base = ops.refresh();

[GitHub] [iceberg] marton-bod commented on a diff in pull request #4536: Docs: add Cloudera native docs section

2022-04-19 Thread GitBox
marton-bod commented on code in PR #4536: URL: https://github.com/apache/iceberg/pull/4536#discussion_r852700922 ## docs/cloudera/_index.md: ## @@ -0,0 +1,23 @@ +--- +title: "Cloudera" +bookIconImage: ../img/cloudera-logo.png +bookFlatSection: true +weight: 415 +bookExternalUrlN

[GitHub] [iceberg] openinx commented on pull request #4553: Add Support For Flink 1.15

2022-04-19 Thread GitBox
openinx commented on PR #4553: URL: https://github.com/apache/iceberg/pull/4553#issuecomment-1102581835 @kbendick I think we've just merged a few Iceberg PRs for the old Flink 1.14. Would you mind updating this PR to include those latest changes for Flink 1.15? -- This is an automated me

[GitHub] [iceberg] openinx merged pull request #4561: Flink: support modify table properties with primary key.

2022-04-19 Thread GitBox
openinx merged PR #4561: URL: https://github.com/apache/iceberg/pull/4561 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

[GitHub] [iceberg] openinx commented on pull request #4561: Flink: support modify table properties with primary key.

2022-04-19 Thread GitBox
openinx commented on PR #4561: URL: https://github.com/apache/iceberg/pull/4561#issuecomment-1102578232 I think it's reasonable to add the new unit test based on the original unit tests, so I'm okay with skipping the credit for the old PR. Thanks, all, for reviewing! -- This is an automate
