Dear Ted,

My Spark release is 1.5.2

BR

Julian Zhang 

> On Jan 17, 2016, at 23:10, Ted Yu <yuzhih...@gmail.com> wrote:
> 
> In sampleArray(), there is a loop:
>     for (i <- 0 until ARRAY_SAMPLE_SIZE) {
> 
> ARRAY_SAMPLE_SIZE is a constant (100).
> 
> Not clear how the amount of computation in sampleArray() can be reduced.
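> 
> To illustrate, the sampling idea is roughly the following (a rough sketch
> only, not the actual Spark source; sizeOf here is a stand-in for the
> per-object walk that visitSingleObject performs):
> 
>     import scala.util.Random
> 
>     val ARRAY_SAMPLE_SIZE = 100
> 
>     // Estimate a large array's size by sampling a fixed number of
>     // elements and scaling the sampled total up to the full length.
>     def estimateArraySize(arr: Array[AnyRef], sizeOf: AnyRef => Long): Long = {
>       if (arr.length <= ARRAY_SAMPLE_SIZE) {
>         arr.map(sizeOf).sum
>       } else {
>         val rand = new Random(42)
>         var sampled = 0L
>         for (i <- 0 until ARRAY_SAMPLE_SIZE) {
>           sampled += sizeOf(arr(rand.nextInt(arr.length)))
>         }
>         sampled * arr.length / ARRAY_SAMPLE_SIZE
>       }
>     }
> 
> The loop itself is bounded by the sample size; the time goes into
> estimating each sampled element, which can itself involve walking the
> objects that element references.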
> 
> Which Spark release are you using?
> 
> Thanks
> 
>> On Sun, Jan 17, 2016 at 6:22 AM, 张峻 <julian_do...@me.com> wrote:
>> Dear All
>> 
>> I used JProfiler to profile my Spark application, and found that more
>> than 70% of the CPU time is spent in the
>> org.apache.spark.util.SizeEstimator class.
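>> 
>> (For reference, the estimator's cost can be reproduced outside the job by
>> calling it directly; this is a minimal sketch, and the sample data below
>> is made up:)
>> 
>>     import org.apache.spark.util.SizeEstimator
>> 
>>     // Hypothetical data, roughly shaped like the cached records.
>>     val data = Array.fill(100000)(List("a", "bb", "ccc"))
>> 
>>     val start = System.nanoTime()
>>     val bytes = SizeEstimator.estimate(data)  // developer API
>>     println("estimated " + bytes + " bytes in " +
>>       (System.nanoTime() - start) / 1e6 + " ms")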
>> 
>> The call tree is as below.
>> 
>> java.lang.Thread.run
>> --scala.collection.immutable.Range.foreach$mVc$sp
>> ----org.apache.spark.util.SizeEstimator$$anonfun$sampleArray$1.apply$mcVI$sp
>> ------scala.collection.immutable.List.foreach
>> --------org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply
>> --scala.collection.immutable.List.foreach
>> ----org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply
>> 
>> My own code does not appear in either of these two largest branches of the call tree.
>> 
>> I want to know what causes Spark to spend so much time in
>> “Range.foreach” and “List.foreach”.
>> Can anyone give me some tips?
>> 
>> BR
>> 
>> Julian Zhang
>> 
> 
