Hi,

I have Avro files with the schema id: Long, content: Binary. The binary values are large images, up to 2 GB each. I would like to select a subset of rows with "where id in (...)". Unfortunately I get memory errors even when the matching subset is empty. It looks like the reader buffers the binary data until either the heap is exhausted or the container is killed by YARN. Any idea how to tune the memory management to avoid these problems? Thanks.

-- spark 2.4.3
-- nicolas

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
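For reference, a minimal sketch of the kind of job I mean (the file path, app name, and id list below are illustrative placeholders, not my real values):

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch; path and ids are placeholders.
val spark = SparkSession.builder().appName("avro-subset").getOrCreate()

// On Spark 2.4.x the "avro" format comes from the external spark-avro package.
val df = spark.read
  .format("avro")
  .load("/data/images.avro")

// Filter by id; even when no id matches, the job runs out of memory,
// which suggests the 2 GB binary values are being materialized anyway.
val subset = df.filter(df("id").isin(1L, 2L, 3L))
subset.count()
```

One variation I can describe: selecting only the `id` column first (`df.select("id")`) does not hit the problem, which is why I suspect the binary column is read eagerly.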