I have a very large CSV file (nearly 13 million records) stored in Azure Storage and read via the Azure Storage plugin. The drillbit configuration has a modest 4GB heap size. Is there an effective way to select all the records from the file without running out of resources in Drill?
A plain SELECT * … is too big. SELECT * with OFFSET and LIMIT sounds like the right approach, but OFFSET still has to scan through all the skipped records, and this seems to hit the same memory issues even with a small LIMIT once the offset is large enough. Would it help to switch from CSV to a different format? Or to move the file to a different storage mechanism? Or something else?
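
For illustration, a paged query of the kind described would have roughly this shape. This is only a sketch: the storage plugin name azure, the file path, and the page and offset sizes are placeholders, not the actual configuration.

```sql
-- Hypothetical names: `azure` stands in for the Azure Storage plugin,
-- and the file path is made up for this example.
-- Fetch one page of 1,000 rows after skipping the first 12,000,000.
SELECT *
FROM azure.`data/records.csv`
LIMIT 1000 OFFSET 12000000;
```

Each later page would raise the OFFSET by the page size, so the number of rows read before the first result comes back grows with every page.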