steveloughran commented on PR #3559:
URL: https://github.com/apache/parquet-java/pull/3559#issuecomment-4480988160

   So for all cluster filesystems with vector IO, it's fine as is. HDFS doesn't 
support it, and for the cloud stores it's all ranged reads straight into 
allocated buffers.
   
   I think maybe in Hadoop we should just drop the attempt to be clever by 
merging ranges, and just do the parallel reads. For cluster filesystems, work 
with Owen O'Malley and Claude to do the right thing here. 
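
A rough sketch of what "just do the parallel reads" could look like: every requested range gets its own buffer and its own positioned read, with no coalescing of neighbouring ranges. This is not Hadoop's implementation. `ParallelRangeReadSketch`, `Range` and `readAll` are hypothetical names, and a plain NIO `FileChannel` stands in for a store's ranged read.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.IntFunction;

final class ParallelRangeReadSketch {

    /** Hypothetical range type: absolute file offset plus length in bytes. */
    static final class Range {
        final long offset;
        final int length;

        Range(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Submits one independent read per range; ranges are never merged. */
    static List<CompletableFuture<ByteBuffer>> readAll(
            FileChannel channel,
            List<Range> ranges,
            IntFunction<ByteBuffer> allocate,
            Executor executor) {
        List<CompletableFuture<ByteBuffer>> futures = new ArrayList<>(ranges.size());
        for (Range r : ranges) {
            futures.add(CompletableFuture.supplyAsync(
                    () -> readFully(channel, r, allocate), executor));
        }
        return futures;
    }

    /** Reads exactly r.length bytes starting at r.offset into a fresh buffer. */
    private static ByteBuffer readFully(
            FileChannel channel, Range r, IntFunction<ByteBuffer> allocate) {
        // Assumes the allocator returns a buffer at position 0
        // with capacity >= r.length.
        ByteBuffer dst = allocate.apply(r.length);
        dst.limit(r.length);
        long pos = r.offset;
        try {
            while (dst.hasRemaining()) {
                // Positional read: does not move the channel's own position,
                // so concurrent reads on the same channel do not interfere.
                int n = channel.read(dst, pos);
                if (n < 0) {
                    throw new EOFException("range extends past end of file");
                }
                pos += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        dst.flip();
        return dst;
    }
}
```

The trade-off is one request per range: adjacent ranges cost two round trips instead of one, but each result lands directly in a buffer of exactly the requested size.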
   
   What would be the perceived penalty of reading the whole file block into one 
allocated buffer, copying the requested pieces into two separate buffers, and 
then releasing the larger one? (A release function is now passed down, after 
all.) 
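
To make the cost concrete, here is a minimal sketch of that read-copy-release pattern, assuming the block has already been read into a single buffer. `BlockSliceSketch`, `Range` and `copyOut` are hypothetical names; the `allocate` and `release` callbacks stand in for whatever allocator and release function the caller passes down.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

final class BlockSliceSketch {

    /** Hypothetical range type: offset and length relative to the block start. */
    static final class Range {
        final int offset;
        final int length;

        Range(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Copies each requested range out of the large block buffer into its own
     * freshly allocated buffer, then hands the block back via release.
     */
    static List<ByteBuffer> copyOut(
            ByteBuffer block,
            List<Range> ranges,
            IntFunction<ByteBuffer> allocate,
            Consumer<ByteBuffer> release) {
        List<ByteBuffer> results = new ArrayList<>(ranges.size());
        try {
            for (Range r : ranges) {
                ByteBuffer dst = allocate.apply(r.length);
                // Work on a duplicate so the block's own position and limit
                // stay untouched.
                ByteBuffer view = block.duplicate();
                view.limit(r.offset + r.length);
                view.position(r.offset);
                dst.put(view);  // the extra copy this approach pays for
                dst.flip();
                results.add(dst);
            }
        } finally {
            // The large buffer is released even if a copy fails part-way.
            release.accept(block);
        }
        return results;
    }
}
```

In this sketch the penalty is one extra memory copy per requested range, plus holding the whole block in memory until the copies finish.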


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

