[GitHub] [incubator-iceberg] rdblue closed issue #787: Iceberg ORC should push down predicates to the ORC reader

2020-05-26 Thread GitBox
rdblue closed issue #787: URL: https://github.com/apache/incubator-iceberg/issues/787 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #199: ORC metrics

2020-05-26 Thread GitBox
rdblue commented on a change in pull request #199: URL: https://github.com/apache/incubator-iceberg/pull/199#discussion_r430545525 ## File path: orc/src/main/java/org/apache/iceberg/orc/avro/GenericDataOrcWriter.java ## @@ -0,0 +1,526 @@ +/* + * Licensed to the Apache Software

[GitHub] [incubator-iceberg] rdblue opened a new pull request #1064: Add DeleteFile and manifest reader and writer for deletes

2020-05-26 Thread GitBox
rdblue opened a new pull request #1064: URL: https://github.com/apache/incubator-iceberg/pull/1064 This adds a new interface, DeleteFile, and implementations of ManifestReader and ManifestWriter for deletes.

[GitHub] [incubator-iceberg] rdblue commented on issue #1053: Does iceberg already support spark 3.0?

2020-05-26 Thread GitBox
rdblue commented on issue #1053: URL: https://github.com/apache/incubator-iceberg/issues/1053#issuecomment-633651372 Yes, the Spark 3 branch supports both SQL DDL and DML.

[GitHub] [incubator-iceberg] mehtaashish23 opened a new issue #1065: Unable to read Iceberg table with hadoop conf passed in as options

2020-05-26 Thread GitBox
mehtaashish23 opened a new issue #1065: URL: https://github.com/apache/incubator-iceberg/issues/1065 It seems we are reading Hadoop configuration from the current active SparkSession [here], which might not have creds available on it, in an environment, when we have multiple SparkSession (

[GitHub] [incubator-iceberg] rdblue commented on pull request #935: [WIP] Internal relocated version of Guava

2020-05-26 Thread GitBox
rdblue commented on pull request #935: URL: https://github.com/apache/incubator-iceberg/pull/935#issuecomment-634133454

[GitHub] [incubator-iceberg] edgarRd commented on pull request #199: ORC metrics

2020-05-26 Thread GitBox
edgarRd commented on pull request #199: URL: https://github.com/apache/incubator-iceberg/pull/199#issuecomment-634361689 I've addressed some comments, but still need to rebase for #989 and #1070 so that'd simplify this PR. T

[GitHub] [incubator-iceberg] rdblue commented on pull request #1067: Use Nebula version plugins

2020-05-26 Thread GitBox
rdblue commented on pull request #1067: URL: https://github.com/apache/incubator-iceberg/pull/1067#issuecomment-634281405 @jerryshao, you may be interested in this change because it allows multiple Spark versions in the build. @massdosage, this should help fix your dependency issues as we

[GitHub] [incubator-iceberg] mehtaashish23 commented on issue #1065: Unable to read Iceberg table with hadoop conf passed in as options

2020-05-26 Thread GitBox
mehtaashish23 commented on issue #1065: URL: https://github.com/apache/incubator-iceberg/issues/1065#issuecomment-634228498 PR https://github.com/apache/incubator-iceberg/pull/1066

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #1046: ISSUE-189: Add support for union record type

2020-05-26 Thread GitBox
rdsr commented on a change in pull request #1046: URL: https://github.com/apache/incubator-iceberg/pull/1046#discussion_r430588847 ## File path: api/src/main/java/org/apache/iceberg/types/Types.java ## @@ -526,25 +526,25 @@ public static StructType of(List fields) { retu

[GitHub] [incubator-iceberg] shardulm94 commented on pull request #1069: ORC: Disable predicate pushdown for timestamp type

2020-05-26 Thread GitBox
shardulm94 commented on pull request #1069: URL: https://github.com/apache/incubator-iceberg/pull/1069#issuecomment-634327409 cc @rdblue @chenjunjiedada

[GitHub] [incubator-iceberg] rdblue commented on pull request #1063: Remove wrapper from GenericManifestEntry

2020-05-26 Thread GitBox
rdblue commented on pull request #1063: URL: https://github.com/apache/incubator-iceberg/pull/1063#issuecomment-633741047 I'm closing this because it is part of #1064.

[GitHub] [incubator-iceberg] rdblue opened a new pull request #1063: Remove wrapper from GenericManifestEntry

2020-05-26 Thread GitBox
rdblue opened a new pull request #1063: URL: https://github.com/apache/incubator-iceberg/pull/1063 This wrapper is no longer needed because the v1 and v2 writers use IndexedManifestEntry and IndexedDataFile to write.

[GitHub] [incubator-iceberg] chenjunjiedada commented on a change in pull request #974: Add unit tests for sequence number

2020-05-26 Thread GitBox
chenjunjiedada commented on a change in pull request #974: URL: https://github.com/apache/incubator-iceberg/pull/974#discussion_r430136902 ## File path: core/src/test/java/org/apache/iceberg/TestSequenceNumberForV2Table.java ## @@ -0,0 +1,299 @@ +/* + * Licensed to the Apache

[GitHub] [incubator-iceberg] rdblue merged pull request #1069: ORC: Disable predicate pushdown for timestamp type

2020-05-26 Thread GitBox
rdblue merged pull request #1069: URL: https://github.com/apache/incubator-iceberg/pull/1069

[GitHub] [incubator-iceberg] rdblue merged pull request #1013: Wrap sub-class of PositionAccessor into WrappedPositionAccessor inste…

2020-05-26 Thread GitBox
rdblue merged pull request #1013: URL: https://github.com/apache/incubator-iceberg/pull/1013

[GitHub] [incubator-iceberg] rdblue opened a new pull request #1062: Replace references for incubator-iceberg with iceberg repository.

2020-05-26 Thread GitBox
rdblue opened a new pull request #1062: URL: https://github.com/apache/incubator-iceberg/pull/1062 This PR replaces references to the current incubator-iceberg repository with references to the new name. This should be deployed after the rename is completed.

[GitHub] [incubator-iceberg] rdblue closed pull request #1063: Remove wrapper from GenericManifestEntry

2020-05-26 Thread GitBox
rdblue closed pull request #1063: URL: https://github.com/apache/incubator-iceberg/pull/1063

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #1070: Update metrics tests to use Iceberg generics

2020-05-26 Thread GitBox
rdsr commented on a change in pull request #1070: URL: https://github.com/apache/incubator-iceberg/pull/1070#discussion_r430790685 ## File path: core/src/test/java/org/apache/iceberg/TestMetrics.java ## @@ -95,59 +88,57 @@ optional(8, "dateCol", DateType.get()), r

[GitHub] [incubator-iceberg] rdblue closed pull request #935: [WIP] Internal relocated version of Guava

2020-05-26 Thread GitBox
rdblue closed pull request #935: URL: https://github.com/apache/incubator-iceberg/pull/935

[GitHub] [incubator-iceberg] chenjunjiedada edited a comment on pull request #983: Convert date and timestamp values in generics

2020-05-26 Thread GitBox
chenjunjiedada edited a comment on pull request #983: URL: https://github.com/apache/incubator-iceberg/pull/983#issuecomment-633808288

[GitHub] [incubator-iceberg] edgarRd commented on a change in pull request #1070: Update metrics tests to use Iceberg generics

2020-05-26 Thread GitBox
edgarRd commented on a change in pull request #1070: URL: https://github.com/apache/incubator-iceberg/pull/1070#discussion_r430790459 ## File path: core/src/test/java/org/apache/iceberg/TestMetrics.java ## @@ -95,59 +88,57 @@ optional(8, "dateCol", DateType.get()),

[GitHub] [incubator-iceberg] chenjunjiedada commented on pull request #983: Convert date and timestamp values in generics

2020-05-26 Thread GitBox
chenjunjiedada commented on pull request #983: URL: https://github.com/apache/incubator-iceberg/pull/983#issuecomment-633770057

[GitHub] [incubator-iceberg] rdblue commented on pull request #1068: Add shaded Guava module

2020-05-26 Thread GitBox
rdblue commented on pull request #1068: URL: https://github.com/apache/incubator-iceberg/pull/1068#issuecomment-634295422 I tested this locally, so I didn't wait for CI to complete.

[GitHub] [incubator-iceberg] rdblue opened a new pull request #1070: Update metrics tests to use Iceberg generics

2020-05-26 Thread GitBox
rdblue opened a new pull request #1070: URL: https://github.com/apache/incubator-iceberg/pull/1070 This updates `TestMetrics` to use Iceberg generic records for its test cases. This is to avoid creating an Avro writer for ORC just to make the tests work. Using Iceberg generics for t

[GitHub] [incubator-iceberg] rdblue opened a new pull request #1068: Add shaded Guava module

2020-05-26 Thread GitBox
rdblue opened a new pull request #1068: URL: https://github.com/apache/incubator-iceberg/pull/1068 This fixes #935. The purpose of this PR is to merge master into that branch and merge, because there are so many commit conflicts with the import changes. Co-authored-by: awoodhead

[GitHub] [incubator-iceberg] rdblue commented on pull request #1013: Wrap sub-class of PositionAccessor into WrappedPositionAccessor inste…

2020-05-26 Thread GitBox
rdblue commented on pull request #1013: URL: https://github.com/apache/incubator-iceberg/pull/1013#issuecomment-633663562 Since this didn't get picked up by CI, I ran `./gradlew check` locally and everything looks good so I'll merge. Thanks for fixing this, @waterlx!

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #1060: AppenderMetrics

2020-05-26 Thread GitBox
rdblue commented on a change in pull request #1060: URL: https://github.com/apache/incubator-iceberg/pull/1060#discussion_r430732666 ## File path: data/src/main/java/org/apache/iceberg/MetricsAppender.java ## @@ -0,0 +1,509 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [incubator-iceberg] shardulm94 commented on a change in pull request #199: ORC metrics

2020-05-26 Thread GitBox
shardulm94 commented on a change in pull request #199: URL: https://github.com/apache/incubator-iceberg/pull/199#discussion_r430737899 ## File path: orc/src/main/java/org/apache/iceberg/orc/OrcMetrics.java ## @@ -19,51 +19,211 @@ package org.apache.iceberg.orc; +import com

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #199: ORC metrics

2020-05-26 Thread GitBox
rdsr commented on a change in pull request #199: URL: https://github.com/apache/incubator-iceberg/pull/199#discussion_r430106417 ## File path: core/src/test/java/org/apache/iceberg/TestMetrics.java ## @@ -69,22 +59,22 @@ public abstract class TestMetrics { public static f

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #974: Add unit tests for sequence number

2020-05-26 Thread GitBox
rdblue commented on a change in pull request #974: URL: https://github.com/apache/incubator-iceberg/pull/974#discussion_r430026545 ## File path: core/src/test/java/org/apache/iceberg/TestSequenceNumberForV2Table.java ## @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache Software

[GitHub] [incubator-iceberg] mehtaashish23 opened a new pull request #1066: Issue-1065: Unable to read Iceberg table with hadoop spark options

2020-05-26 Thread GitBox
mehtaashish23 opened a new pull request #1066: URL: https://github.com/apache/incubator-iceberg/pull/1066 Copies method `mergeIcebergHadoopConfs` from IcebergSource and used it to extract Hadoop config sent as options.

[GitHub] [incubator-iceberg] sudssf commented on a change in pull request #1046: ISSUE-189: Add support for union record type

2020-05-26 Thread GitBox
sudssf commented on a change in pull request #1046: URL: https://github.com/apache/incubator-iceberg/pull/1046#discussion_r430121047 ## File path: api/src/main/java/org/apache/iceberg/types/Types.java ## @@ -526,25 +526,25 @@ public static StructType of(List fields) { re

[GitHub] [incubator-iceberg] shardulm94 opened a new pull request #1069: ORC: Disable predicate pushdown for timestamp type

2020-05-26 Thread GitBox
shardulm94 opened a new pull request #1069: URL: https://github.com/apache/incubator-iceberg/pull/1069 In https://github.com/apache/incubator-iceberg/pull/983#issuecomment-633808288 we discovered issues with ORC predicate pushdown for timestamp types, where timestamps less than epoch were

[GitHub] [incubator-iceberg] rdblue merged pull request #973: Push down Iceberg expressions to the ORC reader

2020-05-26 Thread GitBox
rdblue merged pull request #973: URL: https://github.com/apache/incubator-iceberg/pull/973

[GitHub] [incubator-iceberg] rdblue merged pull request #1066: Issue-1065: Unable to read Iceberg table with hadoop spark options

2020-05-26 Thread GitBox
rdblue merged pull request #1066: URL: https://github.com/apache/incubator-iceberg/pull/1066

[GitHub] [incubator-iceberg] mehtaashish23 commented on pull request #1066: Issue-1065: Unable to read Iceberg table with hadoop spark options

2020-05-26 Thread GitBox
mehtaashish23 commented on pull request #1066: URL: https://github.com/apache/incubator-iceberg/pull/1066#issuecomment-634163120

[GitHub] [incubator-iceberg] rdblue edited a comment on pull request #1063: Remove wrapper from GenericManifestEntry

2020-05-26 Thread GitBox
rdblue edited a comment on pull request #1063: URL: https://github.com/apache/incubator-iceberg/pull/1063#issuecomment-633741047 I'm closing this because it was needed for #1064 and isn't worth merging separately.

[GitHub] [incubator-iceberg] waterlx closed issue #994: WrappedPositionAccessor is not generated but Position2Accessor

2020-05-26 Thread GitBox
waterlx closed issue #994: URL: https://github.com/apache/incubator-iceberg/issues/994

[GitHub] [incubator-iceberg] rdblue commented on pull request #974: Add unit tests for sequence number

2020-05-26 Thread GitBox
rdblue commented on pull request #974: URL: https://github.com/apache/incubator-iceberg/pull/974#issuecomment-633659294 @chenjunjiedada, it looks like the changes from #1038 are included in the diff for this one. Could you rebase to remove them? This may merge cleanly because the changes a

[GitHub] [incubator-iceberg] rdblue commented on pull request #983: Convert date and timestamp values in generics

2020-05-26 Thread GitBox
rdblue commented on pull request #983: URL: https://github.com/apache/incubator-iceberg/pull/983#issuecomment-633650626

[GitHub] [incubator-iceberg] rdblue commented on pull request #1060: AppenderMetrics

2020-05-26 Thread GitBox
rdblue commented on pull request #1060: URL: https://github.com/apache/incubator-iceberg/pull/1060#issuecomment-633647739

[GitHub] [incubator-iceberg] shardulm94 commented on a change in pull request #1069: ORC: Disable predicate pushdown for timestamp type

2020-05-26 Thread GitBox
shardulm94 commented on a change in pull request #1069: URL: https://github.com/apache/incubator-iceberg/pull/1069#discussion_r430767382 ## File path: orc/src/main/java/org/apache/iceberg/orc/ExpressionToSearchArgument.java ## @@ -254,8 +255,10 @@ public Action or(Action leftC

[GitHub] [incubator-iceberg] rdblue merged pull request #983: Convert date and timestamp values in generics

2020-05-26 Thread GitBox
rdblue merged pull request #983: URL: https://github.com/apache/incubator-iceberg/pull/983

[GitHub] [incubator-iceberg] rdblue commented on pull request #973: Push down Iceberg expressions to the ORC reader

2020-05-26 Thread GitBox
rdblue commented on pull request #973: URL: https://github.com/apache/incubator-iceberg/pull/973#issuecomment-633657876 Merged. Nice work, @shardulm94! Thank you for working on this, it's a great feature to have for ORC users.

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #1069: ORC: Disable predicate pushdown for timestamp type

2020-05-26 Thread GitBox
rdblue commented on a change in pull request #1069: URL: https://github.com/apache/incubator-iceberg/pull/1069#discussion_r430764489 ## File path: orc/src/main/java/org/apache/iceberg/orc/ExpressionToSearchArgument.java ## @@ -254,8 +255,10 @@ public Action or(Action leftChild

[GitHub] [incubator-iceberg] rdblue commented on pull request #922: [Part 1] Add partition spec evolution

2020-05-26 Thread GitBox
rdblue commented on pull request #922: URL: https://github.com/apache/incubator-iceberg/pull/922#issuecomment-633655191 @jun-he, thanks for working on this. I don't think that the API here is the right one. I think that the change-based API is the only one we need, so it doesn't make

[GitHub] [incubator-iceberg] rdblue opened a new pull request #1067: Use Nebula version plugins

2020-05-26 Thread GitBox
rdblue opened a new pull request #1067: URL: https://github.com/apache/incubator-iceberg/pull/1067 This replaces `gradle-consistent-versions` with `nebula.dependency-recommender` and `nebula.dependency-lock`. The purpose of this change is to enable having separate Spark 2.x and Spark 3.x m

[GitHub] [incubator-iceberg] shardulm94 commented on issue #1057: Adding an attribute in ORC TypeDescription causes failures.

2020-05-26 Thread GitBox
shardulm94 commented on issue #1057: URL: https://github.com/apache/incubator-iceberg/issues/1057#issuecomment-634373952 @rdsr This should have been fixed already by https://issues.apache.org/jira/browse/ORC-556. Are you sure your local copy was using ORC >= 1.6.3 when you tested this?

[GitHub] [incubator-iceberg] rdblue merged pull request #1068: Add shaded Guava module

2020-05-26 Thread GitBox
rdblue merged pull request #1068: URL: https://github.com/apache/incubator-iceberg/pull/1068

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #922: [Part 1] Add partition spec evolution

2020-05-26 Thread GitBox
rdblue commented on a change in pull request #922: URL: https://github.com/apache/incubator-iceberg/pull/922#discussion_r430019241 ## File path: api/src/test/java/org/apache/iceberg/TestPartitionSpecValidation.java ## @@ -241,11 +241,25 @@ public void testAddPartitionFieldsWit

[GitHub] [incubator-iceberg] shardulm94 opened a new pull request #1071: ORC: In BuildOrcProjection field should be optional if any parent is optional

2020-05-26 Thread GitBox
shardulm94 opened a new pull request #1071: URL: https://github.com/apache/incubator-iceberg/pull/1071 Fixes #961

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #1046: ISSUE-189: Add support for union record type

2020-05-26 Thread GitBox
rdblue commented on a change in pull request #1046: URL: https://github.com/apache/incubator-iceberg/pull/1046#discussion_r430012232 ## File path: api/src/main/java/org/apache/iceberg/types/Types.java ## @@ -526,25 +526,25 @@ public static StructType of(List fields) { re

[GitHub] [incubator-iceberg] waterlx commented on issue #994: WrappedPositionAccessor is not generated but Position2Accessor

2020-05-26 Thread GitBox
waterlx commented on issue #994: URL: https://github.com/apache/incubator-iceberg/issues/994#issuecomment-633775757 I am about to close this issue as PR #1013 is merged into master

[GitHub] [incubator-iceberg] edgarRd commented on a change in pull request #199: ORC metrics

2020-05-26 Thread GitBox
edgarRd commented on a change in pull request #199: URL: https://github.com/apache/incubator-iceberg/pull/199#discussion_r430619496 ## File path: orc/src/main/java/org/apache/iceberg/orc/avro/GenericDataOrcWriter.java ## @@ -0,0 +1,526 @@ +/* + * Licensed to the Apache Softwar

[GitHub] [incubator-iceberg] shardulm94 commented on pull request #983: Convert date and timestamp values in generics

2020-05-26 Thread GitBox
shardulm94 commented on pull request #983: URL: https://github.com/apache/incubator-iceberg/pull/983#issuecomment-634305850 @rdblue Ack, on it.

[GitHub] [incubator-iceberg] rdblue closed issue #972: Filtering records with timestamp doesn't work in IcebergGenerics

2020-05-26 Thread GitBox
rdblue closed issue #972: URL: https://github.com/apache/incubator-iceberg/issues/972

[GitHub] [incubator-iceberg] rdblue commented on pull request #1066: Issue-1065: Unable to read Iceberg table with hadoop spark options

2020-05-26 Thread GitBox
rdblue commented on pull request #1066: URL: https://github.com/apache/incubator-iceberg/pull/1066#issuecomment-634154922

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #983: Convert date and timestamp values in generics

2020-05-26 Thread GitBox
rdblue commented on a change in pull request #983: URL: https://github.com/apache/incubator-iceberg/pull/983#discussion_r430016878 ## File path: data/src/main/java/org/apache/iceberg/data/TableScanIterable.java ## @@ -161,7 +161,10 @@ public boolean hasNext() { if

[GitHub] [incubator-iceberg] rdblue edited a comment on pull request #1066: Issue-1065: Unable to read Iceberg table with hadoop spark options

2020-05-26 Thread GitBox
rdblue edited a comment on pull request #1066: URL: https://github.com/apache/incubator-iceberg/pull/1066#issuecomment-634165685 Okay, so it sounds like the FileIO for the table is created correctly, it is just the locality code that is a problem? Is this throwing an exception? (If so, ca

[GitHub] [incubator-iceberg] chenjunjiedada closed pull request #983: Convert date and timestamp values in generics

2020-05-26 Thread GitBox
chenjunjiedada closed pull request #983: URL: https://github.com/apache/incubator-iceberg/pull/983

[GitHub] [incubator-iceberg] rdblue commented on issue #1061: Add DataFile rewrite support to compact small data files

2020-05-26 Thread GitBox
rdblue commented on issue #1061: URL: https://github.com/apache/incubator-iceberg/issues/1061#issuecomment-633650881 Sounds great! I'd love to see more actions to maintain table data and metadata.