On Tue, Jul 23, 2019 at 05:10:19PM +0000, Mario Amatucci wrote:
> https://spark.apache.org/docs/2.2.0/configuration.html#memory-management
Thanks for the pointer. However, I have tried almost every configuration, and the behavior suggests that Spark keeps the data in memory instead of releasing it.

On Tue, Jul 23, 2019 at 05:10:19PM +0000, Mario Amatucci wrote:
> https://spark.apache.org/docs/2.2.0/configuration.html#memory-management
>
> MARIO AMATUCCI
> Senior Software Engineer
>
> Office: +48 12 881 10 05 x 31463
> Email: mario_amatu...@epam.com
> Gdansk, Poland
> epam.com
>
> -----Original Message-----
> From: Nicolas Paris <nicolas.pa...@riseup.net>
> Sent: Tuesday, July 23, 2019 6:56 PM
> To: user@spark.apache.org
> Subject: Avro large binary read memory problem
>
> Hi
>
> I have Avro files with the schema id:Long, content:Binary.
>
> The binaries are large images, up to 2 GB in size.
>
> I'd like to get a subset of rows "where id in (...)".
>
> Sadly, I get memory errors even when the subset is empty. It looks like
> the reader keeps the binary data in memory until the heap is exhausted or
> the container is killed by YARN.
>
> Any idea how to tune the memory management to avoid these memory errors?
>
> Thanks
>
> -- spark 2.4.3
>
> --
> nicolas

--
nicolas

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
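For readers following the thread: since each record here can carry a binary of up to 2 GB, the container kills reported above often point at off-heap usage rather than the on-heap execution/storage split that the linked configuration page mostly covers. Below is a sketch of submit-time settings that affect this; the values are illustrative guesses, not tested recommendations, and `job.py` is a hypothetical application name.

```shell
# Sketch only -- values must be tuned per cluster, not copied as-is.
#
# spark.executor.memoryOverhead : off-heap headroom on top of the executor
#   heap; YARN kills the container when heap + overhead exceeds the request,
#   so very large record buffers usually need this raised.
# spark.memory.fraction : shrinks the share of heap reserved for Spark's
#   execution/storage regions, leaving more room for deserialized user
#   records such as the 2 GB binaries.
spark-submit \
  --executor-memory 8g \
  --conf spark.executor.memoryOverhead=4g \
  --conf spark.memory.fraction=0.4 \
  job.py
```

Whether these help depends on where the allocation actually happens; if the Avro reader materializes every `content` value regardless of the filter, no memory split will avoid the problem, only more total memory per task or fewer records per partition.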