[
https://issues.apache.org/jira/browse/PARQUET-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313163#comment-17313163
]
ASF GitHub Bot commented on PARQUET-1982:
-----------------------------------------
fschmalzel commented on pull request #871:
URL: https://github.com/apache/parquet-mr/pull/871#issuecomment-811886333
First off, sorry for the late answer.
We use parquet-data to display graphs. We needed the optimization to go back
pages when a user scrolls to the left or zooms out. I don't have any concrete
numbers at the moment, but it was a lot more performant: it went from
practically unusably slow to reasonably fast, comparable to our legacy file
format. I plan to collect more precise numbers for our use case soon.
> Allow random access to row groups in ParquetFileReader
> ------------------------------------------------------
>
> Key: PARQUET-1982
> URL: https://issues.apache.org/jira/browse/PARQUET-1982
> Project: Parquet
> Issue Type: New Feature
> Components: parquet-mr
> Reporter: Felix Schmalzel
> Priority: Minor
> Labels: parquetReader, random-access
>
> The SeekableInputStream in use and all other components of the
> ParquetFileReader already support random access, and row groups should be
> independent of each other.
> This would allow reusing the opened InputStream when you want to go back a
> row group. It also makes accessing specific row groups a lot easier.
> I've already developed a patch that would enable this functionality. I will
> link the merge request in the next few days.
> Is there a related ticket that I have overlooked?
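
To illustrate the idea in the description, here is a minimal sketch, assuming a file made of independent blocks whose offsets and lengths are known up front (roughly the role the footer metadata plays for row groups). Nothing below is parquet-mr API: `RandomAccessBlockReader`, `readBlock` and `writeAndOpen` are hypothetical names, and `java.io.RandomAccessFile` merely stands in for a seekable stream. It only shows that one open handle can serve reads in any order, including going back to an earlier block.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a reader with random block access; not parquet-mr code. */
final class RandomAccessBlockReader implements AutoCloseable {
    private final RandomAccessFile file; // opened once, reused for every read
    private final long[] offsets;        // start offset of each block
    private final int[] lengths;         // byte length of each block

    RandomAccessBlockReader(Path path, long[] offsets, int[] lengths) throws IOException {
        this.file = new RandomAccessFile(path.toFile(), "r");
        this.offsets = offsets.clone();
        this.lengths = lengths.clone();
    }

    int blockCount() {
        return offsets.length;
    }

    /** Reads block {@code index} by seeking on the already opened file. */
    byte[] readBlock(int index) throws IOException {
        if (index < 0 || index >= offsets.length) {
            throw new IndexOutOfBoundsException("no such block: " + index);
        }
        byte[] buffer = new byte[lengths[index]];
        file.seek(offsets[index]); // forward or backward, no reopen needed
        file.readFully(buffer);
        return buffer;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    /** Test helper: writes the blocks back to back and opens a reader over them. */
    static RandomAccessBlockReader writeAndOpen(Path path, String... blocks) throws IOException {
        long[] offsets = new long[blocks.length];
        int[] lengths = new int[blocks.length];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < blocks.length; i++) {
            byte[] bytes = blocks[i].getBytes(StandardCharsets.UTF_8);
            offsets[i] = out.size();
            lengths[i] = bytes.length;
            out.write(bytes);
        }
        Files.write(path, out.toByteArray());
        return new RandomAccessBlockReader(path, offsets, lengths);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("blocks", ".bin");
        try (RandomAccessBlockReader reader = writeAndOpen(tmp, "group-0", "group-1", "group-2")) {
            // Jump ahead, then go back, using the same open file handle.
            System.out.println(new String(reader.readBlock(2), StandardCharsets.UTF_8));
            System.out.println(new String(reader.readBlock(0), StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A sequential-only reader would have to reopen the file and skip forward again to serve the second read; with an index of offsets, going back costs a single seek.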