I ran this in a VM with much less memory, and it immediately failed with a MemoryError:
Traceback (most recent call last):
  File "testavro.py", line 31, in <module>
    for r in reader:
  File "/usr/local/lib/python2.7/dist-packages/avro/datafile.py", line 362, in next
    datum = self.datum_reader.read(self.datum_decoder)
  File "/usr/local/lib/python2.7/dist-packages/avro/io.py", line 445, in read
    return self.read_data(self.writers_schema, self.readers_schema, decoder)
  File "/usr/local/lib/python2.7/dist-packages/avro/io.py", line 490, in read_data
    return self.read_record(writers_schema, readers_schema, decoder)
  File "/usr/local/lib/python2.7/dist-packages/avro/io.py", line 690, in read_record
    field_val = self.read_data(field.type, readers_field.type, decoder)
  File "/usr/local/lib/python2.7/dist-packages/avro/io.py", line 484, in read_data
    return self.read_array(writers_schema, readers_schema, decoder)
  File "/usr/local/lib/python2.7/dist-packages/avro/io.py", line 582, in read_array
    for i in range(block_count):
MemoryError

On Tue, Oct 27, 2015 at 1:36 PM, web user <webuser1...@gmail.com> wrote:

> Hi,
>
> I'm doing the following:
>
> from avro.datafile import DataFileReader
> from avro.datafile import DataFileWriter
> from avro.io import DatumReader
> from avro.io import DatumWriter
>
> def OpenAvroFileToRead(avro_filename):
>     return DataFileReader(open(avro_filename, 'r'), DatumReader())
>
> with OpenAvroFileToRead(avro_filename) as reader:
>     for r in reader:
>         ....
>
> I have an Avro file which is only 500 bytes. I think there is a data
> structure in there which is null or empty.
>
> I put in print statements before and after "for r in reader". On that
> instruction it consumes about 400 GB of memory before I have to kill the
> process.
>
> That is 400 GB! I have 1 TB on my server. I have tried this with 1.6.1,
> 1.7.1 and 1.7.7 and get the same behavior on all three versions.
>
> Any ideas on what is causing this?
>
> Regards,
>
> WU
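The last frame of the traceback looks like the interesting one: read_array() has just read the array's block count from the stream and is doing "for i in range(block_count)". If that count decodes to garbage (which can happen if the file is truncated or corrupted), Python 2's range() tries to materialize a list with that many entries before the loop body ever runs, which would explain an allocation far larger than the 500-byte file itself. As an illustration (my own sketch, not the avro library's code), this is roughly how Avro's zig-zag varint encoding can turn a handful of bytes into an enormous count:

    # Sketch of zig-zag varint decoding for Avro longs (illustration only,
    # not the library's implementation). Written Python 2 style to match
    # the traceback above; it also runs unchanged on Python 3.
    def decode_long(data):
        """Decode one zig-zag varint-encoded long from a byte string."""
        n = 0
        shift = 0
        for ch in data:
            b = ord(ch)
            n |= (b & 0x7f) << shift   # accumulate 7 payload bits per byte
            if not (b & 0x80):         # high bit clear: last byte of the varint
                break
            shift += 7
        return (n >> 1) ^ -(n & 1)     # undo the zig-zag to get a signed value

    # Five bytes are enough to decode to a ~2-billion-element block count:
    print(decode_long('\xfe\xff\xff\xff\x0f'))   # 2147483647
    # On Python 2, range(2147483647) then tries to build the whole list up
    # front, which would account for a MemoryError (or hundreds of GB of
    # allocation) long before any real data is read.

So it might be worth checking whether the 500-byte file was actually produced by DataFileWriter and wasn't truncated or corrupted in transit; a mangled block count inside it would produce exactly this kind of blow-up.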