[
https://issues.apache.org/jira/browse/DRILL-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995219#comment-13995219
]
Jason Altekruse commented on DRILL-649:
---------------------------------------
Sorry I didn't get back to you last week. We will still be doing vector copies
for non-dictionary encoded values. One frustrating thing is that much of the
reason we run into these problems so often is that we are testing on tiny
files, where every column has a small enough set of values that it is more
efficiently stored as a dictionary. I don't know the common use cases, but I
think that with 1 GB row groups it will be fairly rare for an int or long
column to be limited to a dictionary of 50,000 values.
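
To illustrate why tiny test files almost always end up dictionary encoded,
here is a minimal sketch. It is not Drill or Parquet writer code: the function
name choose_encoding and the max_dictionary_entries cap are assumptions made
for illustration, with the cap simply mirroring the 50,000 figure above. Real
writers may limit the dictionary by size in bytes rather than by entry count.

```python
def choose_encoding(values, max_dictionary_entries=50_000):
    """Hypothetical helper: dictionary-encode a column unless it has too
    many distinct values, in which case fall back to plain values."""
    values = list(values)
    dictionary = {}  # value -> index, in first-seen order
    indices = []
    for v in values:
        idx = dictionary.get(v)
        if idx is None:
            if len(dictionary) >= max_dictionary_entries:
                # Too many distinct values: give up on the dictionary.
                return ("plain", values)
            idx = len(dictionary)
            dictionary[v] = idx
        indices.append(idx)
    return ("dictionary", list(dictionary), indices)


# A tiny test file: few distinct values, so the dictionary path is taken.
small_column = [7, 7, 3, 7, 3]
# A column with more distinct values than the (artificially low) cap.
wide_column = range(10)

small_result = choose_encoding(small_column)
wide_result = choose_encoding(wide_column, max_dictionary_entries=4)
```

Under these assumptions, a column in a small file stays on the dictionary path,
while a column with more distinct values than the cap falls back to plain
values, which is the case the vector copies cover.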
> Unable to read dictionary encoded parquet file generated from impala or avro
> ----------------------------------------------------------------------------
>
> Key: DRILL-649
> URL: https://issues.apache.org/jira/browse/DRILL-649
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Steven Phillips
> Assignee: Jason Altekruse
> Attachments: nation.parquet
>
>
> Support for dictionary encoding was recently added, but it looks like some
> dictionary encoded files are still unreadable by Drill. For example, the
> parquet file created from an Avro file attached to DRILL-389 still fails.
> I also created a simple parquet file from Impala, which also fails to read.
> I will attach the file.