[ 
https://issues.apache.org/jira/browse/DRILL-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490872#comment-16490872
 ] 

ASF GitHub Bot commented on DRILL-6353:
---------------------------------------

arina-ielchiieva commented on a change in pull request #1259: DRILL-6353: 
Upgrade Parquet MR dependencies
URL: https://github.com/apache/drill/pull/1259#discussion_r190929789
 
 

 ##########
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetMetadataCache.java
 ##########
 @@ -737,6 +738,7 @@ public void testBooleanPartitionPruning() throws Exception {
     }
   }
 
+  @Ignore
 
 Review comment:
   I have investigated why these tests fail. Take
   `testIntervalDayPartitionPruning` as an example.
   The test first creates a partitioned table using Drill. Since the table is
   created at runtime, the new parquet library is used. The created table
   contains 4 files, one of which contains only nulls. For this all-null file,
   the statistics for every type except binary are
   `num_nulls: 3, min/max not defined`. For the binary type they are
   `no stats for this column`. For binary columns without nulls, statistics are
   written correctly. I did not check the mixed case (but I think it should be
   fine). In the previous parquet version, statistics were written correctly.
   This may be a bug in parquet, or it may be in the Drill writer.
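
To make the three statistics states described above concrete, here is a minimal sketch of how a reader could tell them apart. All names here (`ColumnStats`, `Kind`, `classify`) are hypothetical and are not Drill or Parquet classes. It also assumes the all-null file holds 3 rows, which the `num_nulls: 3` output suggests but does not prove.

```java
public class NullStatsCheck {
    /** Hypothetical summary of one column chunk's statistics (not a Drill or Parquet type). */
    static final class ColumnStats {
        final long numNulls;      // null count reported by the footer
        final boolean hasMinMax;  // false when min/max are "not defined"

        ColumnStats(long numNulls, boolean hasMinMax) {
            this.numNulls = numNulls;
            this.hasMinMax = hasMinMax;
        }
    }

    enum Kind { NO_STATS, ALL_NULLS, MIN_MAX_PRESENT, UNKNOWN }

    /** Classifies a column chunk; a null stats object stands for "no stats for this column". */
    static Kind classify(ColumnStats stats, long rowCount) {
        if (stats == null) {
            return Kind.NO_STATS;
        }
        if (stats.hasMinMax) {
            return Kind.MIN_MAX_PRESENT;
        }
        // min/max are missing: this is only meaningful if every row is null
        return stats.numNulls == rowCount ? Kind.ALL_NULLS : Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify(new ColumnStats(3, false), 3)); // non-binary column, all nulls
        System.out.println(classify(null, 3));                      // binary column, all nulls
        System.out.println(classify(new ColumnStats(0, true), 3));  // column without nulls
    }
}
```

Under these assumptions, a pruning check would have to treat `NO_STATS` and `UNKNOWN` as "cannot prune", which may be why the binary all-null case behaves differently from the other types.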
   
   Another problem is with the metadata file. We do write metadata for binary
   columns into it successfully. Example:
   ```
     "columnTypeInfo" : {
       "`col_intrvl_day`" : {
         "name" : [ "col_intrvl_day" ],
         "primitiveType" : "FIXED_LEN_BYTE_ARRAY",
         "originalType" : "INTERVAL",
         "precision" : 0,
         "scale" : 0,
         "repetitionLevel" : 0,
         "definitionLevel" : 1
       },
           "name" : [ "col_intrvl_day" ],
           "minValue" : "AAAAABoAAACQ4KEB",
           "maxValue" : "AAAAABoAAACQ4KEB",
           "nulls" : 0
   ```
   But when reading it back from the file, we get empty strings. This one looks
   like a Drill bug.
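
As a sanity check on the written value, the `minValue` string from the example can be decoded by hand. The sketch below assumes the string is the Base64 encoding of the 12-byte Parquet INTERVAL layout (three little-endian 32-bit integers: months, days, milliseconds). The class name `IntervalStatDecoder` is made up for illustration, and the fields are read as signed ints for simplicity.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

public class IntervalStatDecoder {
    /** Decodes a Base64 string into {months, days, millis}, assuming the 12-byte INTERVAL layout. */
    static int[] decode(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        if (raw.length != 12) {
            throw new IllegalArgumentException("expected 12 bytes, got " + raw.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        int months = buf.getInt();
        int days = buf.getInt();
        int millis = buf.getInt();
        return new int[] { months, days, millis };
    }

    public static void main(String[] args) {
        int[] v = decode("AAAAABoAAACQ4KEB");
        // prints: months=0 days=26 millis=27386000
        System.out.println("months=" + v[0] + " days=" + v[1] + " millis=" + v[2]);
    }
}
```

If that layout assumption holds, the stored value is 26 days plus 27386000 ms (7 h 36 min 26 s), so the write side looks plausible and the empty strings are more likely introduced on the read path.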
   
   @vrozov I also noticed that
   `ParquetFileReader.readFooter(conf, path, NO_FILTER);` is deprecated. If you
   get a chance, please replace it.



> Upgrade Parquet MR dependencies
> -------------------------------
>
>                 Key: DRILL-6353
>                 URL: https://issues.apache.org/jira/browse/DRILL-6353
>             Project: Apache Drill
>          Issue Type: Task
>            Reporter: Vlad Rozov
>            Assignee: Vlad Rozov
>            Priority: Major
>             Fix For: 1.14.0
>
>
> Upgrade from a custom build {{1.8.1-drill-r0}} to Apache release {{1.10.0}}.



