In sampleArray(), there is a loop:
    for (i <- 0 until ARRAY_SAMPLE_SIZE) {

ARRAY_SAMPLE_SIZE is a constant (100).

Since the iteration count is fixed, it is not clear how the amount of computation in sampleArray() itself can be reduced.
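To illustrate why the loop count is bounded, here is a minimal sketch of sampling-based size estimation. It is not Spark's actual code: the names SampleSketch, ArraySampleSize, estimateArraySize and the sizeOf parameter are all hypothetical, and the scaling step is an assumption about how such an estimate would typically be extrapolated.

```scala
import scala.util.Random

object SampleSketch {
  // Hypothetical constant, mirroring the value of 100 mentioned above.
  val ArraySampleSize = 100

  // sizeOf is a stand-in (assumption) for whatever estimates one element's size.
  def estimateArraySize[T](array: Array[T], sizeOf: T => Long, rand: Random): Long = {
    if (array.length <= ArraySampleSize) {
      // Small arrays: measure every element exactly.
      array.iterator.map(sizeOf).sum
    } else {
      var sampled = 0L
      // The loop always runs ArraySampleSize times, regardless of array length.
      for (_ <- 0 until ArraySampleSize) {
        sampled += sizeOf(array(rand.nextInt(array.length)))
      }
      // Scale the sampled total up to the full array length.
      sampled * array.length / ArraySampleSize
    }
  }
}
```

In a sketch like this, the loop itself does at most 100 iterations per call, so any large cost would have to come from how expensive sizeOf is per element or from how often the estimate is requested.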

Which Spark release are you using?

Thanks

On Sun, Jan 17, 2016 at 6:22 AM, 张峻 <julian_do...@me.com> wrote:

> Dear All
>
> I used JProfiler to profile my Spark application
> and found that more than 70% of the CPU time is spent in the
> org.apache.spark.util.SizeEstimator class.
>
> The call tree is as below.
>
> java.lang.Thread.run
> --scala.collection.immutable.Range.foreach$mVc$sp
> ----org.apache.spark.util.SizeEstimator$$anonfun$sampleArray$1.apply$mcVI$sp
> ------scala.collection.immutable.List.foreach
> --------org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply
> --scala.collection.immutable.List.foreach
> ----org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply
>
> My code does not appear in these two biggest branches of the call tree.
>
> I want to know what causes Spark to spend so much time in
> "Range.foreach" or "List.foreach".
> Can anyone give me some tips?
>
> BR
>
> Julian Zhang
