[ https://issues.apache.org/jira/browse/SPARK-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen closed SPARK-21140.
-----------------------------

This should not be reopened, given the state of the discussion. This is an open-ended question for the mailing list.

> Reduce collect high memory requirements
> ---------------------------------------
>
> Key: SPARK-21140
> URL: https://issues.apache.org/jira/browse/SPARK-21140
> Project: Spark
> Issue Type: Improvement
> Components: Input/Output
> Affects Versions: 2.1.1
> Environment: Linux Debian 8 using hadoop 2.7.2.
> Reporter: michael procopio
>
> I wrote a very simple Scala application which used flatMap to create an RDD
> containing a 512 MB partition of 256-byte arrays. Experimentally, I
> determined that spark.executor.memory had to be set to 3 GB in order to
> collect the data. This seems extremely high.
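
For readers without access to the reporter's program, the scenario described in the report could be approximated with a sketch like the one below. This is a hypothetical reconstruction, not the code attached to the issue: the object name `CollectRepro`, the helper `elementCount`, and the choice of a single seed element are all assumptions made for illustration. It assumes Spark is on the classpath and that the master and `spark.executor.memory` are supplied by `spark-submit`.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical reconstruction of the scenario in the report;
// not the reporter's actual program.
object CollectRepro {
  val ElementBytes: Int = 256                    // size of each byte array, per the report
  val PartitionBytes: Long = 512L * 1024 * 1024  // 512 MB of payload in one partition

  // Number of fixed-size arrays needed to reach the target payload size.
  def elementCount(totalBytes: Long, elementBytes: Int): Int =
    (totalBytes / elementBytes).toInt

  def main(args: Array[String]): Unit = {
    // Master URL and executor memory are expected to come from spark-submit.
    val spark = SparkSession.builder().appName("CollectRepro").getOrCreate()
    val sc = spark.sparkContext

    // Local copies so the closure below captures plain values.
    val n = elementCount(PartitionBytes, ElementBytes)
    val size = ElementBytes

    // One seed element in a single partition; flatMap expands it
    // lazily on the executor into n byte arrays of `size` bytes each.
    val rdd = sc.parallelize(Seq(0), numSlices = 1)
      .flatMap(_ => Iterator.fill(n)(new Array[Byte](size)))

    // collect() brings the entire partition back to the driver.
    val result = rdd.collect()
    println(s"collected ${result.length} arrays")

    spark.stop()
  }
}
```

Submitting this with different `--executor-memory` values (and a driver heap large enough to hold the collected result) would be one way to check the threshold the reporter observed. The figure may differ by Spark version, serializer, and JVM settings.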