I'm trying to use gdal.FileFromMemBuffer to do some in-memory processing, but I 
ran into what seems to be a 2 GB limit.

If I create a TIF on disk that is just below 2 GB, things work fine:
import gdal
drv = gdal.GetDriverByName("GTiff")
ds = drv.Create("45000.tif", 45000, 45000, 1, gdal.GDT_Byte)
ds = None
with open("45000.tif", "rb") as f:
    membuf = f.read()
gdal.FileFromMemBuffer("/vsimem/45000.tif", membuf)
ds = gdal.Open("/vsimem/45000.tif")
print(ds.RasterXSize)  # prints 45000

If I repeat this process with a file that is just over 2 GB:
import gdal
drv = gdal.GetDriverByName("GTiff")
ds = drv.Create("48000.tif", 48000, 48000, 1, gdal.GDT_Byte)
ds = None
with open("48000.tif", "rb") as f:
    membuf = f.read()
gdal.FileFromMemBuffer("/vsimem/48000.tif", membuf)

On OSX, I get this error:
python2.7(30843,0x7fffa80063c0) malloc: *** 
mach_vm_map(size=18446744071718969344) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug

On Linux, I get no error, but a subsequent ds = gdal.Open("/vsimem/48000.tif")
fails with a "no such file or directory" error.

I found this SWIG wrapper function in swig/include/cpl.i:
void wrapper_VSIFileFromMemBuffer( const char* utf8_path, int nBytes, const GByte *pabyData)
{
    GByte* pabyDataDup = (GByte*)VSIMalloc(nBytes);
    if (pabyDataDup == NULL)
        return;
    memcpy(pabyDataDup, pabyData, nBytes);
    VSIFCloseL(VSIFileFromMemBuffer(utf8_path, (GByte*) pabyDataDup, nBytes, TRUE));
}

It seems like the "int nBytes" parameter is the problem, as it is passed to
VSIMalloc, which takes a size_t. The int type is signed and 32-bit, so it
can't hold more than 2^31 - 1 bytes (just under 2 GB). The length probably
wraps around to a negative value, which, when converted to a 64-bit size_t,
becomes the huge size in the OSX error message.
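
The arithmetic behind that guess can be checked with a short sketch. It assumes
the usual two's-complement wraparound when the buffer length is stored in a
32-bit int and a 64-bit size_t; the helper names as_int32 and as_size_t64 are
made up for illustration and are not part of GDAL.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit int can hold

def as_int32(n):
    """Wrap n to signed 32-bit two's complement, as a C int would store it."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n > INT32_MAX else n

def as_size_t64(n):
    """Reinterpret a (possibly negative) integer as an unsigned 64-bit size_t."""
    return n & 0xFFFFFFFFFFFFFFFF

small = 45000 * 45000  # pixel bytes of the working file, below INT32_MAX
large = 48000 * 48000  # pixel bytes of the failing file, above INT32_MAX

# Work backwards from the size reported in the OSX malloc error.
reported = 18446744071718969344
wrapped = reported - 2**64   # the negative int that VSIMalloc received
file_size = wrapped + 2**32  # the original buffer length before wrapping

print(small <= INT32_MAX, large <= INT32_MAX)
print(wrapped, file_size, file_size - large)
```

Under these assumptions the recovered buffer length is 2304385024 bytes, which
is the 48000 x 48000 pixel bytes plus roughly 385 KB that would be consistent
with TIFF header and strip tables, so the wraparound explanation fits the
reported number.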

Also, is there any plan to expose the boolean that controls whether it takes 
ownership of the passed-in buffer? As it is now, calling this function requires 
2x the memory because of the malloc and memcpy. Maybe the ownership of the 
buffer is too tricky when dealing with multiple languages and reference 
counting...
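
In the meantime, one possible workaround is to stream the file into /vsimem/
in pieces instead of handing over a single large buffer. The sketch below is
untested against a 2 GB file and assumes that the Python bindings expose
gdal.VSIFOpenL, gdal.VSIFWriteL and gdal.VSIFCloseL with the argument order
shown; copy_to_vsimem, iter_chunks and the 64 MiB chunk size are hypothetical
choices for illustration.

```python
import io

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB per write, far below the 32-bit int limit

def iter_chunks(fileobj, chunk_size=CHUNK_SIZE):
    """Yield successive blocks of at most chunk_size bytes from fileobj."""
    while True:
        block = fileobj.read(chunk_size)
        if not block:
            break
        yield block

def copy_to_vsimem(src_path, vsimem_path, chunk_size=CHUNK_SIZE):
    """Hypothetical helper: copy a file on disk into /vsimem/ chunk by chunk."""
    # Imported here so that iter_chunks can be used without GDAL installed.
    from osgeo import gdal
    dst = gdal.VSIFOpenL(vsimem_path, "wb")
    if dst is None:
        raise IOError("could not open %s for writing" % vsimem_path)
    try:
        with open(src_path, "rb") as src:
            for block in iter_chunks(src, chunk_size):
                # Assumed argument order: data, item size, item count, handle.
                gdal.VSIFWriteL(block, 1, len(block), dst)
    finally:
        gdal.VSIFCloseL(dst)

# Example call (requires GDAL and the file from the report above):
# copy_to_vsimem("48000.tif", "/vsimem/48000.tif")
```

Because only one chunk is held in Python at a time, this should also avoid
keeping both the original buffer and the malloc'd copy in memory at once,
although the /vsimem/ file itself still occupies the full size.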

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev