In sampleArray(), there is a loop: for (i <- 0 until ARRAY_SAMPLE_SIZE) {
ARRAY_SAMPLE_SIZE is a constant (100), so it is not clear how the amount of
computation in sampleArray() could be reduced.

Which Spark release are you using?

Thanks

On Sun, Jan 17, 2016 at 6:22 AM, 张峻 <julian_do...@me.com> wrote:

> Dear All
>
> I used JProfiler to profile my Spark application,
> and I found that more than 70% of the CPU is used by the
> org.apache.spark.util.SizeEstimator class.
>
> The call tree is as below.
>
> java.lang.Thread.run
> --scala.collection.immutable.Range.foreach$mVc$sp
> ----org.apache.spark.util.SizeEstimator$$anonfun$sampleArray$1.apply$mcVI$sp
> ------scala.collection.immutable.List.foreach
> --------org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply
> --scala.collection.immutable.List.foreach
> ----org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply
>
> My code does not show up in these two biggest branches of the call tree.
>
> I want to know what would cause Spark to spend so much time in
> “Range.foreach” or “.List.foreach”.
> Can anyone give me some tips?
>
> BR
>
> Julian Zhang
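To illustrate why a loop bounded by ARRAY_SAMPLE_SIZE does a fixed amount of work per call, here is a rough sketch of sampling-based size estimation. It is not Spark's actual code: the object SamplingSketch, the helper estimateBySampling and the sizeOf callback are hypothetical names made up for this example, and only the value 100 comes from the thread.

```scala
import scala.util.Random

object SamplingSketch {
  // Mirrors the constant value (100) mentioned in this thread.
  val ArraySampleSize = 100

  // Hypothetical helper, not part of Spark: estimates the total size of
  // `items` by sizing a fixed number of randomly chosen elements and
  // scaling the sampled total up to the full length.
  def estimateBySampling[T](items: IndexedSeq[T], sizeOf: T => Long, rand: Random): Long = {
    if (items.length <= ArraySampleSize) {
      // Small input: size every element exactly.
      items.iterator.map(sizeOf).sum
    } else {
      var sampled = 0L
      // Fixed number of iterations, independent of items.length.
      for (_ <- 0 until ArraySampleSize) {
        sampled += sizeOf(items(rand.nextInt(items.length)))
      }
      // Extrapolate from the sample to the whole collection.
      sampled * items.length / ArraySampleSize
    }
  }
}
```

In a sketch like this the iteration count never grows with the input, so the total cost would come from how expensive each sizeOf call is (the visitSingleObject branch in the profile above) and from how often the estimate itself is invoked.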