westonpace commented on pull request #11588:
URL: https://github.com/apache/arrow/pull/11588#issuecomment-961333636


   For context it is probably worth pointing out that @niyue recently added 
#11486 which gets around the classic "IPC reader reads the entire file even if 
you only want a few columns" issue.
   
   However, I agree with @pitrou.  It sounds like you are not only limiting 
which columns you access but are also accessing very few rows.  In that case 
the problem is likely that the record batch file reader loads the entire 
array via `ArrayLoader => GetBuffer => ReadBuffer => 
RandomAccessFile::ReadAt(entire-buffer-range)`
   
   And, in `MemoryMappedFile::ReadAt` we call 
`::arrow::internal::MemoryAdviseWillNeed` on the entire range accessed.
   
   In that case, the solution is what Antoine suggested.  We should provide an 
option in `MemoryMappedFile` to prevent calls to madvise.
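
   To make the idea concrete, here is a rough POSIX-only sketch of what such 
an opt-out could look like.  This is not Arrow code: `SketchMappedFile`, 
`SketchReadOptions`, and `advise_will_need` are hypothetical names I made up 
for illustration, and the real `MemoryMappedFile` may need a different shape.  
It only shows the general technique of making the `madvise(MADV_WILLNEED)` 
call on the accessed range optional.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical options struct (an assumption, not part of the Arrow API).
struct SketchReadOptions {
  // true mimics the eager behaviour described above; false skips madvise.
  bool advise_will_need = true;
};

// Hypothetical read-only memory-mapped file used only for this sketch.
class SketchMappedFile {
 public:
  SketchMappedFile(const std::string& path, SketchReadOptions options)
      : options_(options) {
    fd_ = ::open(path.c_str(), O_RDONLY);
    if (fd_ < 0) throw std::runtime_error("open failed: " + path);
    struct stat st;
    if (::fstat(fd_, &st) != 0 || st.st_size <= 0) {
      ::close(fd_);
      throw std::runtime_error("fstat failed or empty file: " + path);
    }
    size_ = static_cast<std::size_t>(st.st_size);
    void* addr = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd_, 0);
    if (addr == MAP_FAILED) {
      ::close(fd_);
      throw std::runtime_error("mmap failed: " + path);
    }
    data_ = static_cast<std::uint8_t*>(addr);
  }

  ~SketchMappedFile() {
    ::munmap(data_, size_);
    ::close(fd_);
  }

  SketchMappedFile(const SketchMappedFile&) = delete;
  SketchMappedFile& operator=(const SketchMappedFile&) = delete;

  // Zero-copy view of [offset, offset + nbytes). Pages are only hinted to
  // the kernel when advise_will_need is set; otherwise they fault in lazily.
  const std::uint8_t* ReadAt(std::size_t offset, std::size_t nbytes) {
    assert(data_ != nullptr);
    if (offset > size_ || nbytes > size_ - offset) {
      throw std::out_of_range("read past end of mapping");
    }
    if (options_.advise_will_need && nbytes > 0) {
      // madvise needs a page-aligned start address, so round down.
      const std::size_t page = static_cast<std::size_t>(::sysconf(_SC_PAGESIZE));
      const std::size_t aligned = offset - (offset % page);
      // The hint is advisory, so its return value is deliberately ignored.
      ::madvise(data_ + aligned, nbytes + (offset - aligned), MADV_WILLNEED);
      ++advise_calls_;
    }
    return data_ + offset;
  }

  // Convenience helper that copies the requested range out of the mapping.
  std::vector<std::uint8_t> ReadCopy(std::size_t offset, std::size_t nbytes) {
    const std::uint8_t* p = ReadAt(offset, nbytes);
    return std::vector<std::uint8_t>(p, p + nbytes);
  }

  std::size_t advise_calls() const { return advise_calls_; }

 private:
  SketchReadOptions options_;
  int fd_ = -1;
  std::uint8_t* data_ = nullptr;
  std::size_t size_ = 0;
  std::size_t advise_calls_ = 0;
};
```

   With `advise_will_need = false`, a caller that touches only a few rows of 
a large buffer would pay page faults only for the pages it actually reads, 
instead of asking the kernel to prefetch the whole buffer range.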



