Re: How to tune my Spark application.

Ted Yu Sun, 17 Jan 2016 08:09:06 -0800

The 'List.foreach' frames most likely come from iterating over the pointerFields shown below:

  private class ClassInfo(
    val shellSize: Long,
    val pointerFields: List[Field]) {}
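
To illustrate why a walk over pointerFields shows up as List.foreach, here is a minimal sketch of a reflective object-graph walk. It is not Spark's implementation: SketchClassInfo, classInfo, countReachable and Node are hypothetical names, and the sketch only counts reachable objects instead of estimating bytes.

```scala
import java.lang.reflect.{Field, Modifier}
import java.util.IdentityHashMap
import scala.collection.mutable
import scala.util.Try

object PointerWalkSketch {
  // Hypothetical, simplified stand-in for the ClassInfo quoted above.
  final class SketchClassInfo(val pointerFields: List[Field])

  // Example type used only to demonstrate the walk.
  final class Node(val label: String, var next: Node)

  private val cache = mutable.Map.empty[Class[_], SketchClassInfo]

  // Gather the non-static reference fields of cls and its superclasses, once per class.
  def classInfo(cls: Class[_]): SketchClassInfo = cache.getOrElseUpdate(cls, {
    var fields = List.empty[Field]
    var c: Class[_] = cls
    while (c != null) {
      c.getDeclaredFields.foreach { f =>
        val isPointer = !Modifier.isStatic(f.getModifiers) && !f.getType.isPrimitive
        // Skip fields the JVM refuses to open (for example encapsulated JDK internals).
        if (isPointer && Try(f.setAccessible(true)).isSuccess) fields ::= f
      }
      c = c.getSuperclass
    }
    new SketchClassInfo(fields)
  })

  // Count the distinct objects reachable from root by following reference fields.
  def countReachable(root: AnyRef): Int = {
    val seen = new IdentityHashMap[AnyRef, AnyRef]()
    val stack = mutable.ArrayBuffer[AnyRef](root)
    while (stack.nonEmpty) {
      val obj = stack.remove(stack.length - 1)
      if (obj != null && seen.put(obj, obj) == null) {
        val cls = obj.getClass
        // Assumption of this sketch: arrays, Classes and ClassLoaders are not descended into.
        val opaque = cls.isArray || obj.isInstanceOf[Class[_]] || obj.isInstanceOf[ClassLoader]
        // This List.foreach runs once per visited object, so it dominates on deep graphs.
        if (!opaque) classInfo(cls).pointerFields.foreach(f => stack += f.get(obj))
      }
    }
    seen.size
  }
}
```

In this sketch every visited object triggers one pass over its class's pointerFields, so records holding many nested objects make the walk proportionally more expensive.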


FYI

On Sun, Jan 17, 2016 at 7:15 AM, 张峻 <julian_do...@me.com> wrote:

> Dear Ted
>
> My Spark release is 1.5.2
>
> BR
>
> Julian Zhang
>
> On Jan 17, 2016, at 23:10, Ted Yu <yuzhih...@gmail.com> wrote:
>
> In sampleArray(), there is a loop:
>     for (i <- 0 until ARRAY_SAMPLE_SIZE) {
>
> ARRAY_SAMPLE_SIZE is a constant (100).
>
> Not clear how the amount of computation in sampleArray() can be reduced.
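
A rough sketch of the sampling idea behind a fixed-size loop like the one quoted above, to show why the iteration count is already bounded. This is an illustration under assumptions, not Spark's code: the estimate function, the sizeOf callback, the fixed seed and the small-array cutoff are all hypothetical; only ARRAY_SAMPLE_SIZE = 100 comes from this thread.

```scala
import scala.collection.mutable
import scala.util.Random

object SampleArraySketch {
  val ARRAY_SAMPLE_SIZE = 100 // the constant quoted in this thread

  // Estimate the total size of arr by measuring ARRAY_SAMPLE_SIZE distinct random
  // elements and scaling up. sizeOf is a hypothetical per-element size function.
  def estimate[T](arr: Array[T], sizeOf: T => Long, rand: Random = new Random(42)): Long = {
    if (arr.length <= ARRAY_SAMPLE_SIZE) {
      // Assumed cutoff: small arrays are measured exactly.
      arr.iterator.map(sizeOf).sum
    } else {
      val drawn = mutable.HashSet.empty[Int]
      var sampled = 0L
      for (_ <- 0 until ARRAY_SAMPLE_SIZE) {
        // Redraw until we hit an index that has not been sampled yet.
        var idx = rand.nextInt(arr.length)
        while (!drawn.add(idx)) idx = rand.nextInt(arr.length)
        sampled += sizeOf(arr(idx))
      }
      // Scale the sampled total up to the full array length.
      (sampled.toDouble * arr.length / ARRAY_SAMPLE_SIZE).toLong
    }
  }
}
```

The loop itself runs a fixed number of times, so in a sketch like this the cost is in sizeOf: each sampled element can be a large object graph of its own.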
>
> Which Spark release are you using?
>
> Thanks
>
> On Sun, Jan 17, 2016 at 6:22 AM, 张峻 <julian_do...@me.com> wrote:
>
>> Dear All
>>
>> I used JProfiler to profile my Spark application
>> and found that more than 70% of the CPU time is spent in the
>> org.apache.spark.util.SizeEstimator class.
>>
>> The call tree is shown below.
>>
>> java.lang.Thread.run
>> --scala.collection.immutable.Range.foreach$mVc$sp
>> ----org.apache.spark.util.SizeEstimator$$anonfun$sampleArray$1.apply$mcVI$sp
>> ------scala.collection.immutable.List.foreach
>> --------org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply
>> --scala.collection.immutable.List.foreach
>> ----org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply
>>
>> My own code does not appear in these two biggest branches of the call tree.
>>
>> What could cause Spark to spend so much time in
>> “Range.foreach” or “List.foreach”?
>> Can anyone give me some tips?
>>
>> BR
>>
>> Julian Zhang
