[ https://issues.apache.org/jira/browse/SPARK-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen updated SPARK-9399:
------------------------------
    Description:
There are a few minor optimizations in PythonRDD that may avoid garbage creation or Scala overhead:
- Replace a foreach() loop with a while() loop.
- Return nulls instead of Options to avoid calling `Option.apply()` and `Option.isDefined` once per read() call.
- Call .size() instead of .length(), thereby avoiding an implicit Java -> Scala collections conversion.

  was:
There are a few minor optimizations in PythonRDD that may avoid garbage creation or Scala overhead:
- Replace a foreach() loop with a while() loop.
- Return nulls instead of Options to avoid allocating an Option once per read() call.
- Call .size() instead of .length(), thereby avoiding an implicit Java -> Scala collections conversion.

> Assorted micro-optimizations in PythonRDD
> -----------------------------------------
>
>                 Key: SPARK-9399
>                 URL: https://issues.apache.org/jira/browse/SPARK-9399
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>            Priority: Minor
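
As a rough illustration of the three changes listed in the description, here is a minimal Scala sketch. It is not the actual PythonRDD code: the object `MicroOptSketch`, the two reader classes, and every method name below are hypothetical, chosen only to show each pattern next to its replacement. Whether any of these is measurably faster depends on the JVM and the workload, so treat it as a sketch of the idea rather than a benchmark.

```scala
import java.util.{ArrayList => JArrayList, List => JList}

// Hypothetical names throughout; none of this is taken from PythonRDD itself.
object MicroOptSketch {

  // 1a. foreach(): the loop body is a closure invoked once per element.
  def sumForeach(xs: Array[Int]): Long = {
    var total = 0L
    xs.foreach { x => total += x }
    total
  }

  // 1b. while(): plain index loop, no closure object involved.
  def sumWhile(xs: Array[Int]): Long = {
    var total = 0L
    var i = 0
    while (i < xs.length) {
      total += xs(i)
      i += 1
    }
    total
  }

  // 2a. read() wraps its result with Option.apply() on every call.
  final class OptionReader(items: Array[String]) {
    private var pos = 0
    def read(): Option[String] = {
      val next = if (pos < items.length) items(pos) else null
      pos += 1
      Option(next) // Option.apply(): Some(next), or None when next is null
    }
  }

  // 2b. read() returns the raw reference; null signals "no more items".
  final class NullReader(items: Array[String]) {
    private var pos = 0
    def read(): String = {
      val next = if (pos < items.length) items(pos) else null
      pos += 1
      next
    }
  }

  // The caller checks Option.isDefined once per read() call.
  def drainOption(r: OptionReader): Int = {
    var n = 0
    var cur = r.read()
    while (cur.isDefined) { n += 1; cur = r.read() }
    n
  }

  // The caller does a plain null check instead.
  def drainNull(r: NullReader): Int = {
    var n = 0
    var cur = r.read()
    while (cur != null) { n += 1; cur = r.read() }
    n
  }

  // 3. size() is a method of java.util.List itself. Calling .length on a
  //    Java list only compiles when a Java -> Scala collection conversion
  //    is in scope, and that conversion wraps the list first.
  def javaListSize(xs: JList[String]): Int = xs.size()
}
```

The null-returning variant gives up the type-level safety of Option, so it only makes sense on a hot path where the caller checks for null immediately, as `drainNull` does.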