I have opened a PR <https://github.com/apache/drill/pull/644> for the DRILL-4980 <https://issues.apache.org/jira/browse/DRILL-4980>.
@Jinfeng I removed is.date.correct flag from the writer. And I left the checking of it in detectCorruptDates() method for the reason if somebody has already generated the parquet files with this property. @Jason I have added the drill version checking as you suggested to do. @Paul I have added parquet-writer.version property to parquet metadata footer. If you have other thoughts about variables names, please let me know I will try to respond as fast as possible. Kind regards Vitalii 2016-10-31 7:07 GMT+02:00 Paul Rogers <prog...@maprtech.com>: > Choosing a good property name should solve the confusion issue. Perhaps > drill.writer as the name. > > The writer version is not needed if we feel that we’ll never again change > the writer or introduce bugs. Since that is hard to predict, adding a > writer version is very low cost insurance. Indeed, including a writer > version is a very common technique. The question we should answer is why we > don’t need to use this technique here… > > One can imagine that, as we evolve from 1.x to the 2.1 (or later) format, > we may do so in fits and starts, perhaps based on community contributions. > A version will help us know what capabilities were supported in the writer > that wrote a particular file. > > Thanks, > > - Paul > > > On Oct 28, 2016, at 3:03 PM, Jason Altekruse <ja...@dremio.com> wrote: > > > > The only worry I have about declaring a writer version is possible > > confusion with the Parquet format version itself. The format is already > > defined through version 2.1 or something like that, but we are currently > > only writing files based on the 1.x version of the format. > > > > My preferred solution to this problem would be to just make point > releases > > for problems like this (like in this case we could have made a 1.8.1 > > release, and then all of the 1.8.0-SNAPSHOT would all known to be bad and > > everything after would be 1.8.1-SNAPSHOT and could have been known to be > > correct). > > > > I'm open to to hearing other opinions on this, I just generally feel like > > these bugs should be rare, and fixing them should be done with a lot of > > care (and in this case I missed a few things). I don't think it would be > > crazy to say that we should only merge these kinds of patches if we are > > willing to say the fix is ready for a release. > > > > Jason Altekruse > > Software Engineer at Dremio > > Apache Drill Committer > > > > On Fri, Oct 28, 2016 at 2:52 PM, Vitalii Diravka < > vitalii.dira...@gmail.com> > > wrote: > > > >> Jinfeng, > >> > >> isDateCorrect will be false in the code when isDateCorrect property is > >> absent in the parquet metadata. > >> > >> Anyway I am going to implement the mentioned approach with the > >> parquet-writer.version instead of isDateCorrect property. > >> > >