I could take a stab at fixing this, since your previous explanation
spells out the expected behavior very clearly.
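
To make sure I have understood the intended rule, here is a minimal
standalone sketch of the check: several partition functions on the same
source field are fine, but more than one time-based function on it is
rejected as redundant. The Transform enum, the Field class and the method
name are hypothetical stand-ins I made up for illustration, not the real
Iceberg classes, and the actual fix may well look different.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionCheckSketch {
  // Hypothetical stand-in for partition transforms; not the Iceberg API.
  enum Transform {
    IDENTITY(false), BUCKET(false), TRUNCATE(false),
    YEAR(true), MONTH(true), DAY(true), HOUR(true);

    final boolean timeBased;

    Transform(boolean timeBased) {
      this.timeBased = timeBased;
    }
  }

  // Hypothetical stand-in for a partition field: source column id plus transform.
  static final class Field {
    final int sourceId;
    final Transform transform;

    Field(int sourceId, Transform transform) {
      this.sourceId = sourceId;
      this.transform = transform;
    }
  }

  // Throws if any source column carries more than one time-based transform.
  // Non-time transforms (identity, bucket, truncate) are never rejected here.
  static void checkNoRedundantTimePartitions(List<Field> fields) {
    Map<Integer, Transform> timeTransformBySource = new HashMap<>();
    for (Field field : fields) {
      if (!field.transform.timeBased) {
        continue;
      }
      Transform existing =
          timeTransformBySource.putIfAbsent(field.sourceId, field.transform);
      if (existing != null) {
        throw new IllegalArgumentException(String.format(
            "Cannot add redundant time partition %s on source id %d: already partitioned by %s",
            field.transform, field.sourceId, existing));
      }
    }
  }

  public static void main(String[] args) {
    // bucket + hour on the same column: accepted.
    checkNoRedundantTimePartitions(Arrays.asList(
        new Field(1, Transform.BUCKET), new Field(1, Transform.HOUR)));

    // year + hour on the same column: rejected as redundant.
    try {
      checkNoRedundantTimePartitions(Arrays.asList(
          new Field(1, Transform.YEAR), new Field(1, Transform.HOUR)));
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

With a check like this at build time, the spec from my original mail
(year/month/day/hour on "timestamp") would fail with a clear message
instead of the ImmutableMap error at evaluation time.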

On Thu, May 30, 2019 at 10:31 PM Ryan Blue <rb...@netflix.com.invalid>
wrote:

> Yeah, this is a bug. You should be able to define multiple partition
> functions on the same field. But we do want to check that multiple time
> partitions are not used because they are redundant. I'll open a PR. Thanks
> for pointing this out!
>
> On Tue, May 28, 2019 at 4:15 AM Anton Okolnychyi
> <aokolnyc...@apple.com.invalid> wrote:
>
>> Hm, this is actually a good question.
>>
>> My understanding is that we shouldn't explicitly define partitioning by
>> year/month/day/hour on the same column. Instead, we should be fine with
>> hour only. Iceberg produces ordinals for time-based partition functions. As
>> far as I remember, Ryan was planning to submit a PR in order to prohibit
>> multiple partition functions.
>>
>> I believe in the above case you are trying to create one partition spec
>> with multiple partition functions on the same field.
>>
>> Keep in mind that if you partition by hour only, the directory structure
>> won’t contain year/month/day folders. If you want that directory
>> structure, you need to have actual columns for year/month/day in your
>> dataset and use the identity partition function.
>>
>> Thanks,
>> Anton
>>
>>
>> > On 28 May 2019, at 09:27, filip <filip....@gmail.com> wrote:
>> >
>> >
>> > A while back I bumped into what seems to be an inconsistency in the
>> > partition spec API, or maybe it's just an implementation bug.
>> > While attempting to define multiple partition specs on the same schema
>> > field, I found that the API allows multiple partition specs for the
>> > same field, but internally this conflicts with the assumption that
>> > there is only one partition spec per field.
>> >
>> > Given this partition spec:
>> >
>> > PartitionSpec spec = PartitionSpec.builderFor(schema)
>> >             .withSpecId(0)
>> >             .year("timestamp")
>> >             .month("timestamp")
>> >             .day("timestamp")
>> >             .hour("timestamp")
>> >             .build();
>> >
>> > Trying to validate partition pruning with code similar to:
>> >
>> > UnboundPredicate<Object> match = Expressions.equal("timestamp",
>> >     Literal.of("2019-01-11T00:00:00.000000").to(TimestampType.withoutZone()).value());
>> > Assert.assertTrue(
>> >     new InclusiveManifestEvaluator(spec, match)
>> >         .eval(table.currentSnapshot().manifests().get(0)));
>> >
>> > I get an unexpected Guava (com.google.common.collect) exception:
>> >
>> > java.lang.IllegalArgumentException: Multiple entries with same key: 1=org.apache.iceberg.PartitionField@da8cdda7 and 1=org.apache.iceberg.PartitionField@e5c6fddb
>> >     at com.google.common.collect.ImmutableMap.conflictException(ImmutableMap.java:215)
>> >     at com.google.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:209)
>> >     at com.google.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:147)
>> >     at com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:110)
>> >     at com.google.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:393)
>> >     at org.apache.iceberg.PartitionSpec.lazyFieldsBySourceId(PartitionSpec.java:232)
>> >     at org.apache.iceberg.PartitionSpec.getFieldBySourceId(PartitionSpec.java:95)
>> >     at org.apache.iceberg.expressions.Projections$InclusiveProjection.predicate(Projections.java:208)
>> >     at org.apache.iceberg.expressions.Projections$InclusiveProjection.predicate(Projections.java:200)
>> >     at org.apache.iceberg.expressions.Projections$BaseProjectionEvaluator.predicate(Projections.java:185)
>> >     at org.apache.iceberg.expressions.Projections$BaseProjectionEvaluator.predicate(Projections.java:136)
>> >     at org.apache.iceberg.expressions.ExpressionVisitors.visit(ExpressionVisitors.java:152)
>> >     at org.apache.iceberg.expressions.Projections$BaseProjectionEvaluator.project(Projections.java:152)
>> >     at org.apache.iceberg.expressions.InclusiveManifestEvaluator.<init>(InclusiveManifestEvaluator.java:63)
>> >     at org.apache.iceberg.expressions.InclusiveManifestEvaluator.<init>(InclusiveManifestEvaluator.java:56)
>> >     at org.apache.iceberg.TestScansAndSchemaEvolution.testMultiPartitionPerFieldTransform(TestScansAndSchemaEvolution.java:177)
>> >
>> >
>> > I was wondering if this issue is tracked so maybe I could help out.
>> >
>> > Thanks,
>> > /Filip
>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

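P.S. Regarding the stack trace quoted above: the exception is raised when
an ImmutableMap keyed by source id is built in lazyFieldsBySourceId and two
partition fields share the same source id. One possible direction, as a
rough sketch only, is a one-to-many index from source id to partition
fields. The snippet below uses plain JDK collections and a made-up Field
class (both are my assumptions, not the actual Iceberg code), and callers
such as the projection code would have to handle a list of fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FieldsBySourceIdSketch {
  // Hypothetical stand-in for a partition field; not the Iceberg class.
  static final class Field {
    final int sourceId;
    final String name;

    Field(int sourceId, String name) {
      this.sourceId = sourceId;
      this.name = name;
    }
  }

  // One-to-many index: each source column id maps to every partition
  // field derived from it, so duplicate source ids are not a conflict.
  static Map<Integer, List<Field>> fieldsBySourceId(List<Field> fields) {
    return fields.stream().collect(Collectors.groupingBy(f -> f.sourceId));
  }

  public static void main(String[] args) {
    List<Field> fields = Arrays.asList(
        new Field(1, "timestamp_year"),
        new Field(1, "timestamp_month"),
        new Field(1, "timestamp_day"),
        new Field(1, "timestamp_hour"));

    Map<Integer, List<Field>> index = fieldsBySourceId(fields);
    for (Field field : index.get(1)) {
      System.out.println(field.name);
    }
  }
}
```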

-- 
Filip Bocse
