@Paul Rogers It may be the undefined case when the file is generated with drill.version = 1.9-SNAPSHOT. It is more easy to determine corrupted date with this flag and there is no need to wait the end of release to merge these changes.
@Jinfeng NI It looks like you are right. With consistent mode (isDateCorrect = true) all tests are passed. So I am going to open a jira ticket for it with next changes https://github.com/vdiravka/drill/commit/ff8d5c7d601915f760d1b0e96187303410cac5d3 Thanks. Kind regards Vitalii 2016-10-25 18:36 GMT+00:00 Jinfeng Ni <[email protected]>: > I'm not sure if I fully understand your answers. The bottom line is > quite simple: given a set of parquet files, the ParquetTableMeta > instance constructed in Drill should have identical value for > "isDateCorrect", whether it comes from parquet footer, or parquet > metadata cache, or whether there is partition pruning or not. However, > the code shows that this flag is not in consistent mode across > different cases. > > > > On Tue, Oct 25, 2016 at 11:24 AM, Vitalii Diravka > <[email protected]> wrote: > > Hi Jinfeng, > > > > 1.If the parquet files are generated with Drill after Drill-4203 these > > files have "isDateCorrect = true" property. > > Drill serializes this property from metadata now. When we set this > property > > in the first constructor we will hide the value from metadata. > > IsDateCorrect will be false only if this value equals to the false (no > case > > for it now) or absent in parquet metadata footer. > > > > > > 2. I'm not sure the reason to change isDateCorrect metadata property when > > the user disable dates correction. > > If you have some use case it would be great if you provide it. > > > > 3. Maybe you are right regarding to when Parquet metadata is cloned. > > Here I added the property in the same manner as Jason's new property > > "drillVersion. So need it a separate unit test? > > > > > > Kind regards > > Vitalii > > > > 2016-10-25 16:23 GMT+00:00 Jinfeng Ni <[email protected]>: > > > >> Forgot to copy the link to the code. > >> > >> [1] https://github.com/apache/drill/blob/master/exec/java- > >> exec/src/main/java/org/apache/drill/exec/store/parquet/ > >> Metadata.java#L950-L955 > >> > >> On Tue, Oct 25, 2016 at 9:16 AM, Jinfeng Ni <[email protected]> wrote: > >> > @Jason, @Vitalli, > >> > > >> > Any thoughts on this question, since both you worked on fix of > >> DRILL-4203? > >> > > >> > Looking through the code, there is a third case [1], where this flag > >> > is set to false when Parquet metadata is cloned (after partition > >> > pruning, etc). That means, for the 2nd case where the flag is set to > >> > true, if there is pruning happening, the new parquet metadata will see > >> > the flag is flipped to false. This does not make sense to me. > >> > > >> > > >> > > >> > On Mon, Oct 24, 2016 at 3:10 PM, Jinfeng Ni <[email protected]> wrote: > >> >> Hello All, > >> >> > >> >> DRILL-4203 addressed the date field issue. In the fix, it introduced > >> >> a new field in ParquetTableMetadata_v2 : isDateCorrect. I have some > >> >> difficulty in understanding the meaning of this field. > >> >> > >> >> According to [1], this field is set to false, when Drill gets parquet > >> >> metadata from parquet footer. This field is set to true in code > flow > >> >> of [2] and [3], when Drill gets parquet metadata from meta data > cache. > >> >> > >> >> Questions I have: > >> >> 1. If the parquet files are generated with Drill after DRILL-4203, > >> >> Drill still thinks date field is NOT correct (isDateCorrect = false)? > >> >> 2. Why does this filed have nothing to do with "autoCorrection" flag > >> >> [4]? If someone turns off autoCorrection, will it have impact on > this > >> >> "isDateCorrect" flag ? > >> >> > >> >> Thanks in advance for any input, > >> >> > >> >> Jinfeng > >> >> > >> >> > >> >> [1] https://github.com/apache/drill/blob/master/exec/java- > >> exec/src/main/java/org/apache/drill/exec/store/parquet/ > Metadata.java#L932 > >> >> [2] https://github.com/apache/drill/blob/master/exec/java- > >> exec/src/main/java/org/apache/drill/exec/store/parquet/ > Metadata.java#L936 > >> >> [3] https://github.com/apache/drill/blob/master/exec/java- > >> exec/src/main/java/org/apache/drill/exec/store/parquet/ > Metadata.java#L187 > >> >> [4] https://github.com/apache/drill/blob/master/exec/java- > >> exec/src/main/java/org/apache/drill/exec/store/parquet/ > >> Metadata.java#L354-L355 > >> >
