Hi

I have Avro files with the schema id:Long, content:Binary.

The binary column holds large images, up to 2 GB each.

I'd like to get a subset of rows with "where id in (...)".
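In pseudocode, the job looks roughly like this (a sketch only, not a standalone script: the path and ids are placeholders, and I'm assuming the external spark-avro data source for 2.4 is on the classpath):

```python
# Sketch: assumes an existing SparkSession `spark` and spark-avro on the classpath.
df = spark.read.format("avro").load("/path/to/avro/files")  # schema: id long, content binary
subset = df.where(df["id"].isin([1, 2, 3]))                 # placeholder ids
subset.collect()                                            # fails with OOM even when no ids match
```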

Sadly, I get memory errors even when the matching subset is empty. It looks
like the reader keeps the binary data in memory until the heap fills up or
the container is killed by YARN.

Any idea how to tune the memory management to avoid these memory
problems?
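In case it's relevant, the submit settings look roughly like this (all sizes and the script name are placeholders, not my real values):

```shell
# Placeholder spark-submit invocation on YARN; sizes are illustrative only.
spark-submit \
  --master yarn \
  --conf spark.executor.memory=8g \
  --conf spark.executor.memoryOverhead=4g \
  --conf spark.driver.maxResultSize=2g \
  my_job.py
```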

Thanks

-- spark 2.4.3

-- 
nicolas

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
