[jira] [Commented] (DRILL-5833) Parquet reader fails with assertion error for Decimal9, Decimal18 types

Volodymyr Vysotskyi (JIRA) Wed, 10 Jan 2018 01:50:26 -0800

    [ 
https://issues.apache.org/jira/browse/DRILL-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16319973#comment-16319973
 ]


Volodymyr Vysotskyi commented on DRILL-5833:
--------------------------------------------

[~Paul.Rogers] since this issue was fixed in the scope of DRILL-5832, may we 
resolve this Jira?

> Parquet reader fails with assertion error for Decimal9, Decimal18 types
> -----------------------------------------------------------------------
>
>                 Key: DRILL-5833
>                 URL: https://issues.apache.org/jira/browse/DRILL-5833
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.13.0
>
>
> The {{TestParquetWriter.testDecimal()}} test recently failed. As it turns 
> out, this test never ran properly before against the "old" Parquet reader. 
> Because the {{store.parquet.use_new_reader}} was left at a previous value, 
> sometimes the test would run against the "new" reader (and succeed) or 
> against the "old" reader (and fail.)
> Once the test was forced to run against the "old" reader, it fails deep in 
> the Parquet record reader in 
> {{DrillParquetGroupConverter.getConverterForType()}}.
> The code attempts to create a Decimal9 writer by calling 
> {{SingleMapWriter.decimal9(String name)}} to create the writer. However, the 
> code in this method says:
> {code}
>   public Decimal9Writer decimal9(String name) {
>     // returns existing writer
>     final FieldWriter writer = fields.get(name.toLowerCase());
>     assert writer != null;
>     return writer;
>   }
> {code}
> And, indeed, the assertion is triggered.
> As it turns out, the code for Decimal28 shows the proper solution:
> {code}
> mapWriter.decimal28Sparse(name, metadata.getScale(), metadata.getPrecision())
> {code}
> That is, pass the scale and precision to this form of the method which 
> actually creates the writer:
> {code}
>   public Decimal9Writer decimal9(String name, int scale, int precision) {
> {code}
> Applying the same pattern to for the Parquet Decimal9 and Decimal18 types 
> allows the above test to get past the asserts. Given this error, it is clear 
> that this test could never have run, and so the error in the Parquet reader 
> was never detected.
> It also turns out that the test itself is wrong, reversing the validation and 
> test queries:
> {code}
>   public void runTestAndValidate(String selection, String 
> validationSelection, String inputTable, String outputFile) throws Exception {
>     try {
>       deleteTableIfExists(outputFile);
>       ...
>       // Query reads from the input (JSON) table
>       String query = String.format("SELECT %s FROM %s", selection, 
> inputTable);
>       String create = "CREATE TABLE " + outputFile + " AS " + query;
>       // validate query reads from the output (Parquet) table
>       String validateQuery = String.format("SELECT %s FROM " + outputFile, 
> validationSelection);
>       test(create);
>       testBuilder()
>           .unOrdered()
>           .sqlQuery(query) // Query under test is input query
>           .sqlBaselineQuery(validateQuery) // Baseline query is output query
>           .go();
> {code}
> Given this, it is the Parquet data that is wrong, not the baseline.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (DRILL-5833) Parquet reader fails with assertion error for Decimal9, Decimal18 types

Reply via email to