yordan-pavlov commented on issue #171:
URL: https://github.com/apache/arrow-rs/issues/171#issuecomment-991960434


@tustvold I should have read the blog post you linked earlier
(https://arrow.apache.org/blog/2019/09/05/faster-strings-cpp-parquet/) before
commenting. It appears that the C++ implementation of the Arrow Parquet reader
converts plain-encoded fallback pages into dictionary-encoded form, similar to
the latest approach you described:
> When decoding a ColumnChunk, we first append the dictionary values and
> indices into an Arrow DictionaryBuilder, and when we encounter the “fall back”
> portion we use a hash table to convert those values to dictionary-encoded form.
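
To make the quoted approach concrete, here is a minimal sketch of the idea in Rust. It is not the arrow-rs or Arrow C++ API: `DictBuilder`, `append_dictionary_page`, `append_plain_page`, `intern`, and `demo` are hypothetical names, pages are modelled as plain string slices, and nulls and page decoding are ignored. It only shows how dictionary pages and plain-encoded fallback pages can feed one dictionary through a hash table.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dictionary builder (not the arrow-rs type).
#[derive(Default)]
struct DictBuilder {
    values: Vec<String>,            // distinct values, in first-seen order
    keys: Vec<usize>,               // one index into `values` per row
    lookup: HashMap<String, usize>, // value -> key, shared by both page kinds
}

impl DictBuilder {
    /// Return the key for `v`, adding it to the dictionary if it is new.
    fn intern(&mut self, v: &str) -> usize {
        if let Some(&k) = self.lookup.get(v) {
            return k;
        }
        let k = self.values.len();
        self.values.push(v.to_owned());
        self.lookup.insert(v.to_owned(), k);
        k
    }

    /// Dictionary-encoded page: map the page's dictionary into the builder's
    /// dictionary once, then translate each index through that mapping.
    fn append_dictionary_page(&mut self, dict: &[&str], indices: &[usize]) {
        let remap: Vec<usize> = dict.iter().map(|v| self.intern(v)).collect();
        self.keys.extend(indices.iter().map(|&i| remap[i]));
    }

    /// Plain-encoded fallback page: hash every value to find or assign a key.
    fn append_plain_page(&mut self, page: &[&str]) {
        for v in page {
            let k = self.intern(v);
            self.keys.push(k);
        }
    }
}

/// Decode one dictionary page followed by one plain fallback page.
fn demo() -> DictBuilder {
    let mut b = DictBuilder::default();
    b.append_dictionary_page(&["a", "b"], &[0, 1, 1, 0]);
    b.append_plain_page(&["b", "c", "a"]);
    b
}
```

In this sketch the fallback page reuses the keys for "a" and "b" and only adds "c" to the dictionary, so the output stays dictionary-encoded across the whole column chunk. The cost is one hash lookup per value in the plain-encoded pages.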

