Re: How to tunning my spark application.

2016-01-17 Thread Ted Yu
In sampleArray(), there is a loop:
for (i <- 0 until ARRAY_SAMPLE_SIZE) {
ARRAY_SAMPLE_SIZE is a constant (100).
Not clear how the amount of computation in sampleArray() can be reduced.
Which Spark release are you using?
Thanks
On Sun, Jan 17, 2016 at 6:22 AM, 张峻
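To illustrate why a fixed sample size bounds the work per array, here is a minimal sketch of sampling-based size estimation. It is not Spark's actual sampleArray() code: the object name SampleArraySketch, the elementSize cost function, and the seeded Random are all hypothetical stand-ins. Only the constant ARRAY_SAMPLE_SIZE = 100 and the loop shape come from the message above.

```scala
import scala.util.Random

object SampleArraySketch {
  // Mirrors the constant mentioned in the thread.
  val ARRAY_SAMPLE_SIZE = 100

  // Hypothetical per-element cost, standing in for a real recursive size estimate.
  def elementSize(x: AnyRef): Long = x match {
    case s: String => 40L + 2L * s.length
    case _         => 16L
  }

  // Estimate the total size of an array by measuring at most
  // ARRAY_SAMPLE_SIZE randomly chosen elements and scaling the result.
  def estimateArraySize(arr: Array[AnyRef], rand: Random = new Random(42)): Long = {
    if (arr.length <= ARRAY_SAMPLE_SIZE) {
      // Small arrays are measured exactly.
      arr.map(elementSize).sum
    } else {
      var sampled = 0L
      for (i <- 0 until ARRAY_SAMPLE_SIZE) {
        sampled += elementSize(arr(rand.nextInt(arr.length)))
      }
      // Extrapolate from the sample to the full array length.
      (sampled.toDouble / ARRAY_SAMPLE_SIZE * arr.length).toLong
    }
  }
}
```

Under these assumptions the loop runs at most 100 times per array, so the cost of one estimate is bounded. If estimation dominates a profile, the number of estimate calls is therefore a more likely lever than the loop itself.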

Re: How to tunning my spark application.

2016-01-17 Thread 张峻
Dear Ted,
My Spark release is 1.5.2.
BR
Julian Zhang
> On Jan 17, 2016, at 23:10, Ted Yu wrote:
>
> In sampleArray(), there is a loop:
> for (i <- 0 until ARRAY_SAMPLE_SIZE) {
>
> ARRAY_SAMPLE_SIZE is a constant (100).
>
> Not clear how the amount of computation in

How to tunning my spark application.

2016-01-17 Thread 张峻
Dear All,
I used JProfiler to profile my Spark application and found that more than 70% of CPU time is spent in the org.apache.spark.util.SizeEstimator class. The call tree is as below.
java.lang.Thread.run
--scala.collection.immutable.Range.foreach$mVc$sp

Re: How to tunning my spark application.

2016-01-17 Thread Ted Yu
For 'List.foreach', it is likely for the pointerFields shown below:
private class ClassInfo(
  val shellSize: Long,
  val pointerFields: List[Field]) {}
FYI
On Sun, Jan 17, 2016 at 7:15 AM, 张峻 wrote:
> Dear Ted
>
> My Spark release is 1.5.2
>
> BR
>
> Julian Zhang
>
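As a rough illustration of how a pointerFields list could lead to List.foreach showing up in a profile, here is a minimal sketch that collects the reference-typed fields of a class by reflection and then walks them with foreach. It is an assumption-laden sketch, not Spark's SizeEstimator code: PointerFieldsSketch, Node, pointerFieldsOf, and referencedObjects are hypothetical names introduced only for this example.

```scala
import java.lang.reflect.{Field, Modifier}

object PointerFieldsSketch {
  // Hypothetical example class: two reference fields and one primitive field.
  final class Node(val name: String, val count: Int, val next: Node)

  // Collect the non-static, non-primitive fields of cls and its superclasses.
  def pointerFieldsOf(cls: Class[_]): List[Field] = {
    if (cls == null || cls == classOf[Object]) Nil
    else {
      val own = cls.getDeclaredFields.toList.filter { f =>
        !Modifier.isStatic(f.getModifiers) && !f.getType.isPrimitive
      }
      pointerFieldsOf(cls.getSuperclass) ++ own
    }
  }

  // Follow every pointer field of obj and return the non-null targets.
  // A size estimator would visit each of these objects in turn.
  def referencedObjects(obj: AnyRef): List[AnyRef] = {
    val out = List.newBuilder[AnyRef]
    pointerFieldsOf(obj.getClass).foreach { f =>
      f.setAccessible(true)
      val v = f.get(obj)
      if (v != null) out += v
    }
    out.result()
  }
}
```

For example, referencedObjects(new PointerFieldsSketch.Node("a", 1, null)) would include the string "a" but not the primitive count. Every visited object repeats this foreach over its own pointer fields, which is one plausible way such a call accumulates CPU time on deep object graphs.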