[
https://issues.apache.org/jira/browse/DRILL-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Rogers resolved DRILL-5833.
--------------------------------
Resolution: Fixed
> Parquet reader fails with assertion error for Decimal9, Decimal18 types
> -----------------------------------------------------------------------
>
> Key: DRILL-5833
> URL: https://issues.apache.org/jira/browse/DRILL-5833
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.10.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.13.0
>
>
> The {{TestParquetWriter.testDecimal()}} test recently failed. As it turns
> out, this test never ran properly against the "old" Parquet reader.
> Because {{store.parquet.use_new_reader}} was left at whatever value a
> previous test had set, the test would sometimes run against the "new"
> reader (and succeed) and sometimes against the "old" reader (and fail).
> Once the test was forced to run against the "old" reader, it fails deep in
> the Parquet record reader in
> {{DrillParquetGroupConverter.getConverterForType()}}.
> The code attempts to create a Decimal9 writer by calling
> {{SingleMapWriter.decimal9(String name)}}. However, this method only looks
> up an existing writer; it does not create one:
> {code}
> public Decimal9Writer decimal9(String name) {
>   // returns existing writer
>   final FieldWriter writer = fields.get(name.toLowerCase());
>   assert writer != null;
>   return writer;
> }
> {code}
> And, indeed, the assertion is triggered.
> As it turns out, the code for Decimal28 shows the proper solution:
> {code}
> mapWriter.decimal28Sparse(name, metadata.getScale(), metadata.getPrecision())
> {code}
> That is, pass the scale and precision to the form of the method that
> actually creates the writer:
> {code}
> public Decimal9Writer decimal9(String name, int scale, int precision) {
> {code}
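The two overloads behave quite differently, which is the heart of the bug. The following self-contained sketch models that behavior; the class and method names ({{WriterRegistry}}, {{Decimal9Writer}}) are illustrative stand-ins, not Drill's actual vector-writer classes:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the lookup-vs-create overload pair described above.
// Not Drill's real implementation; just the pattern the bug report relies on.
class WriterRegistry {
    static class Decimal9Writer {
        final int scale, precision;
        Decimal9Writer(int scale, int precision) {
            this.scale = scale;
            this.precision = precision;
        }
    }

    private final Map<String, Decimal9Writer> fields = new HashMap<>();

    // One-arg form: only RETURNS an existing writer; asserts if the field
    // was never created. This is the call that fails in getConverterForType().
    Decimal9Writer decimal9(String name) {
        Decimal9Writer writer = fields.get(name.toLowerCase());
        assert writer != null;
        return writer;
    }

    // Three-arg form: CREATES the writer on first use, the way
    // decimal28Sparse(name, scale, precision) does in the Decimal28 path.
    Decimal9Writer decimal9(String name, int scale, int precision) {
        return fields.computeIfAbsent(name.toLowerCase(),
                k -> new Decimal9Writer(scale, precision));
    }
}
```

Calling the three-arg form first (as the fix does for Decimal9 and Decimal18) populates the field map, after which the one-arg lookup succeeds.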
> Applying the same pattern for the Parquet Decimal9 and Decimal18 types
> allows the above test to get past the asserts. Given this error, it is clear
> that this test never actually ran against the "old" reader, and so the error
> in the Parquet reader was never detected.
> It also turns out that the test itself is wrong, reversing the validation and
> test queries:
> {code}
> public void runTestAndValidate(String selection, String validationSelection,
>     String inputTable, String outputFile) throws Exception {
>   try {
>     deleteTableIfExists(outputFile);
>     ...
>     // Query reads from the input (JSON) table
>     String query = String.format("SELECT %s FROM %s", selection, inputTable);
>     String create = "CREATE TABLE " + outputFile + " AS " + query;
>     // Validate query reads from the output (Parquet) table
>     String validateQuery = String.format("SELECT %s FROM " + outputFile,
>         validationSelection);
>     test(create);
>     testBuilder()
>         .unOrdered()
>         .sqlQuery(query)                 // Query under test is the input query
>         .sqlBaselineQuery(validateQuery) // Baseline query is the output query
>         .go();
> {code}
> Given this reversal, when the test fails it is the Parquet data that is
> being treated as the baseline, not the expected results.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)