Hi

I have Avro files with the schema id:Long, content:Binary.

The binary column holds large images, up to 2 GB each.

I'd like to get a subset of rows with "where id in (...)".
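In pseudocode, the job looks roughly like this (a sketch only, not a standalone script: the path and ids are placeholders, and I'm assuming the external spark-avro data source for 2.4 is on the classpath):

```python
# Sketch: assumes an existing SparkSession `spark` and spark-avro on the classpath.
df = spark.read.format("avro").load("/path/to/avro/files")  # schema: id long, content binary
subset = df.where(df["id"].isin([1, 2, 3]))                 # placeholder ids
subset.collect()                                            # fails with OOM even when no ids match
```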

Sadly, I get memory errors even when the matching subset is empty. It looks
like the reader keeps the binary data in memory until the heap fills up or
the container is killed by YARN.

Any idea how to tune the memory management to avoid these memory
problems?
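In case it's relevant, the submit settings look roughly like this (all sizes and the script name are placeholders, not my real values):

```shell
# Placeholder spark-submit invocation on YARN; sizes are illustrative only.
spark-submit \
  --master yarn \
  --conf spark.executor.memory=8g \
  --conf spark.executor.memoryOverhead=4g \
  --conf spark.driver.maxResultSize=2g \
  my_job.py
```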

Thanks

-- spark 2.4.3

-- 
nicolas

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
