I could take a stab at fixing this, since your previous explanation spells out the expected behavior very clearly.
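
A minimal sketch of the failure mode as described in the thread, using only the JDK rather than Iceberg or Guava classes. `Field`, `indexUnique`, and `indexGrouped` are hypothetical names for illustration, not Iceberg API. The assumption is that the lookup by source id builds a map with one entry per source column, which cannot hold several transforms of the same column. The JDK's `Collectors.toMap` stands in for Guava's `ImmutableMap` here and throws `IllegalStateException` rather than `IllegalArgumentException` on a duplicate key.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PartitionFieldIndexSketch {
  // Hypothetical stand-in for a partition field: source column id plus transform name.
  static final class Field {
    final int sourceId;
    final String transform;

    Field(int sourceId, String transform) {
      this.sourceId = sourceId;
      this.transform = transform;
    }

    int sourceId() {
      return sourceId;
    }
  }

  // One-entry-per-source index: throws when two fields share a source id.
  static Map<Integer, Field> indexUnique(List<Field> fields) {
    return fields.stream().collect(Collectors.toMap(Field::sourceId, f -> f));
  }

  // Grouped index: keeps every field defined on the same source id.
  static Map<Integer, List<Field>> indexGrouped(List<Field> fields) {
    return fields.stream().collect(Collectors.groupingBy(Field::sourceId));
  }

  public static void main(String[] args) {
    // Four transforms on the same source column (id 1), as in the spec from the thread.
    List<Field> fields = List.of(
        new Field(1, "year"), new Field(1, "month"),
        new Field(1, "day"), new Field(1, "hour"));

    try {
      indexUnique(fields);
    } catch (IllegalStateException e) {
      System.out.println("unique index failed: " + e.getMessage());
    }

    // The grouped index holds all four fields under source id 1.
    System.out.println(indexGrouped(fields).get(1).size());
  }
}
```

Whether a grouped lookup is the right fix for Iceberg is a separate question; the sketch only shows why a unique-key map rejects this spec.
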

On Thu, May 30, 2019 at 10:31 PM Ryan Blue <rb...@netflix.com.invalid> wrote:

> Yeah, this is a bug. You should be able to define multiple partition
> functions on the same field. But we do want to check that multiple time
> partitions are not used because they are redundant. I'll open a PR. Thanks
> for pointing this out!
>
> On Tue, May 28, 2019 at 4:15 AM Anton Okolnychyi
> <aokolnyc...@apple.com.invalid> wrote:
>
>> Hm, this is actually a good question.
>>
>> My understanding is that we shouldn't explicitly define partitioning by
>> year/month/day/hour on the same column. Instead, we should be fine with
>> hour only. Iceberg produces ordinals for time-based partition functions. As
>> far as I remember, Ryan was planning to submit a PR in order to prohibit
>> multiple partition functions.
>>
>> I believe in the above case you are trying to create one partition spec
>> with multiple partition functions on the same field.
>>
>> Keep in mind that if you partition by hour only, the directory structure
>> won’t contain year/month/day folders. If you are to have that directory
>> structure, you need to have actual columns for year/month/day in your
>> dataset and use identity partition function.
>>
>> Thanks,
>> Anton
>>
>> > On 28 May 2019, at 09:27, filip <filip....@gmail.com> wrote:
>> >
>> > A while back I bumped into what seems to be an inconsistency in the
>> > partition spec API, or maybe it's just an implementation bug.
>> > Attempting to define multiple partition functions on the same schema
>> > field, I found that while the API allows multiple partition fields to be
>> > defined for the same source field, internally this conflicts with the
>> > assumption that there is only one partition field per source field.
>> >
>> > Given this partition spec:
>> >
>> > PartitionSpec spec = PartitionSpec.builderFor(schema)
>> >     .withSpecId(0)
>> >     .year("timestamp")
>> >     .month("timestamp")
>> >     .day("timestamp")
>> >     .hour("timestamp")
>> >     .build();
>> >
>> > Trying to validate partition pruning with code similar to:
>> >
>> > UnboundPredicate<Object> match = Expressions.equal("timestamp",
>> >     Literal.of("2019-01-11T00:00:00.000000").to(TimestampType.withoutZone()).value());
>> > Assert.assertTrue(
>> >     new InclusiveManifestEvaluator(spec, match).eval(table.currentSnapshot().manifests().get(0)));
>> >
>> > I get an unexpected Google Collections exception:
>> >
>> > java.lang.IllegalArgumentException: Multiple entries with same key: 1=org.apache.iceberg.PartitionField@da8cdda7 and 1=org.apache.iceberg.PartitionField@e5c6fddb
>> >
>> >   at com.google.common.collect.ImmutableMap.conflictException(ImmutableMap.java:215)
>> >   at com.google.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:209)
>> >   at com.google.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:147)
>> >   at com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:110)
>> >   at com.google.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:393)
>> >   at org.apache.iceberg.PartitionSpec.lazyFieldsBySourceId(PartitionSpec.java:232)
>> >   at org.apache.iceberg.PartitionSpec.getFieldBySourceId(PartitionSpec.java:95)
>> >   at org.apache.iceberg.expressions.Projections$InclusiveProjection.predicate(Projections.java:208)
>> >   at org.apache.iceberg.expressions.Projections$InclusiveProjection.predicate(Projections.java:200)
>> >   at org.apache.iceberg.expressions.Projections$BaseProjectionEvaluator.predicate(Projections.java:185)
>> >   at org.apache.iceberg.expressions.Projections$BaseProjectionEvaluator.predicate(Projections.java:136)
>> >   at org.apache.iceberg.expressions.ExpressionVisitors.visit(ExpressionVisitors.java:152)
>> >   at org.apache.iceberg.expressions.Projections$BaseProjectionEvaluator.project(Projections.java:152)
>> >   at org.apache.iceberg.expressions.InclusiveManifestEvaluator.<init>(InclusiveManifestEvaluator.java:63)
>> >   at org.apache.iceberg.expressions.InclusiveManifestEvaluator.<init>(InclusiveManifestEvaluator.java:56)
>> >   at org.apache.iceberg.TestScansAndSchemaEvolution.testMultiPartitionPerFieldTransform(TestScansAndSchemaEvolution.java:177)
>> >
>> > I was wondering if this issue is tracked so maybe I could help out.
>> >
>> > Thanks,
>> > /Filip
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

--
Filip Bocse