Reply: Re: Re: Re: How to decrease the time of storing block in memory

2015-06-10 Thread luohui20001
Thanks Akhil, thanks for your idea. I had tried using Spark to do what the shell did; however, it was not as fast as I expected, and not very easy.




 

Thanks & Best regards!
San.Luo

----- Original Message -----
From: Akhil Das ak...@sigmoidanalytics.com
To: 罗辉 luohui20...@sina.com
Cc: user user@spark.apache.org
Subject: Re: Re: Re: How to decrease the time of storing block in memory
Date: 2015-06-09 18:05


Re: Re: Re: How to decrease the time of storing block in memory

2015-06-09 Thread Akhil Das
Hi 罗辉

I think you're interpreting the logs wrong.

Your program actually runs from this point (everything before it is just startup and connection work):

15/06/08 16:14:22 INFO broadcast.TorrentBroadcast: Started reading
broadcast variable 0
15/06/08 16:14:23 INFO storage.MemoryStore: ensureFreeSpace(1561) called
with curMem=0, maxMem=370503843
15/06/08 16:14:23 INFO storage.MemoryStore: Block broadcast_0_piece0 stored
as bytes in memory (estimated size 1561.0 B, free 353.3 MB)
15/06/08 16:14:23 INFO storage.BlockManagerMaster: Updated info of block
broadcast_0_piece0
15/06/08 16:14:23 INFO broadcast.TorrentBroadcast: Reading broadcast
variable 0 took 967 ms
15/06/08 16:14:23 INFO storage.MemoryStore: ensureFreeSpace(2168) called
with curMem=1561, maxMem=370503843
15/06/08 16:14:23 INFO storage.MemoryStore: Block broadcast_0 stored as
values in memory (estimated size 2.1 KB, free 353.3 MB)

=> At this point it has already stored the broadcast piece in memory, and it starts your Task 0

15/06/08 16:14:42 INFO executor.Executor: Finished task 0.0 in stage 0.0
(TID 0). 693 bytes result sent to driver

=> It took 19s to finish your Task 0, and Task 1 starts from this point

15/06/08 16:14:42 INFO executor.CoarseGrainedExecutorBackend: Got assigned
task 1
15/06/08 16:14:42 INFO executor.Executor: Running task 1.0 in stage 0.0
(TID 1)
15/06/08 16:14:56 INFO executor.Executor: Finished task 1.0 in stage 0.0
(TID 1). 693 bytes result sent to driver
15/06/08 16:14:56 INFO executor.CoarseGrainedExecutorBackend: Driver
commanded a shutdown


Now, to speed things up you need to obtain parallelism (at least 2-3 times the number of cores you have); right now your sort.sh could be running on a single core. Instead of triggering an external command, you could try to do the operation within Spark itself; that way you can always control the parallelism. Hope it helps.
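
As a concrete illustration of that suggestion, here is a minimal sketch of sorting inside Spark with an explicit partition count. The input path, the tab-separated line format, and the numeric sort key are assumptions made up for the example, not details from this thread:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SortInSpark {
  def main(args: Array[String]) {
    // local[4] is only for illustration; on the cluster this would stay
    // spark://slave3:7077 as in the original job
    val conf = new SparkConf().setAppName("SortInSpark").setMaster("local[4]")
    val sc = new SparkContext(conf)

    // Hypothetical input file; each line is assumed to start with a numeric
    // position followed by a tab
    val lines = sc.textFile("/opt/data/shellcompare/db/chr1.txt")

    // sortBy takes an explicit numPartitions, so the sort is spread over many
    // tasks (aim for 2-3x the total core count) instead of one sort.sh process
    val sorted = lines.sortBy(line => line.split("\t")(0).toLong,
      ascending = true, numPartitions = 12)

    sorted.saveAsTextFile("/opt/data/shellcompare/sorted/chr1")
    sc.stop()
  }
}
```

Unlike the pipe approach, the shuffle behind sortBy runs one task per partition, so the level of parallelism stays under your control.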



Thanks
Best Regards

On Tue, Jun 9, 2015 at 3:00 PM, luohui20...@sina.com wrote:

Reply: Re: Re: How to decrease the time of storing block in memory

2015-06-09 Thread luohui20001
hi akhil
Not exactly; the task took 54s to finish, starting at 16:14:02 and ending at 16:14:56.
Within those 54s, it took 19s to store the value in memory, from 16:14:23 to 16:14:42. I think this is the most time-consuming part of this task, and also unreasonable. You may check the log attached to the previous mail.
And here is my code:

import org.apache.spark._

object GeneCompare3 {
  def main(args: Array[String]) {
    // i: piece number, j: user number
    val i = args(0).toInt
    val j = args(1)
    val conf = new SparkConf()
      .setAppName("CompareGenePiece " + i + " of User " + j)
      .setMaster("spark://slave3:7077")
      .set("spark.executor.memory", "2g")
    val sc = new SparkContext(conf)
    println("start to compare gene")
    val runmodifyshell2 = List("run", "sort.sh")
    val runmodifyshellRDD2 = sc.makeRDD(runmodifyshell2)
    val pipeModify2 = runmodifyshellRDD2.pipe("sh /opt/sh/bin/sort.sh " +
      "/opt/data/shellcompare/db/chr" + i + ".txt " +
      "/opt/data/shellcompare/data/user" + j + "/pgs/sample/samplechr" + i + ".txt " +
      "/opt/data/shellcompare/data/user" + j + "/pgs/intermediateResult/result" + i + ".txt 600")
    pipeModify2.collect()
    sc.stop()
  }
}
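
If the external sort.sh must stay, one way to get parallelism anyway is to submit all chromosome pieces in a single job, one element per partition, so each script invocation can run on a different core. This rewrite is a sketch of that idea, not code from the thread; it assumes 22 pieces and uses scala.sys.process instead of RDD.pipe (whose stdin feed sort.sh does not use here):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.sys.process._

object GeneCompareAll {
  def main(args: Array[String]) {
    val j = args(0) // user number
    val conf = new SparkConf().setAppName("CompareGene all pieces of User " + j)
      .setMaster("spark://slave3:7077").set("spark.executor.memory", "2g")
    val sc = new SparkContext(conf)

    // One chromosome piece per partition (22 assumed), so every sort.sh
    // invocation can be scheduled on a different core/executor
    val pieces = sc.makeRDD(1 to 22, 22)

    val exitCodes = pieces.map { i =>
      // Run the external script for piece i on the executor; '!' returns
      // the script's exit code
      Seq("sh", "/opt/sh/bin/sort.sh",
        "/opt/data/shellcompare/db/chr" + i + ".txt",
        "/opt/data/shellcompare/data/user" + j + "/pgs/sample/samplechr" + i + ".txt",
        "/opt/data/shellcompare/data/user" + j + "/pgs/intermediateResult/result" + i + ".txt",
        "600").!
    }
    exitCodes.collect()
    sc.stop()
  }
}
```

This assumes the per-piece files are reachable from every worker (e.g. on a shared mount), since any executor may process any piece.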



 

Thanks & Best regards!
San.Luo

----- Original Message -----
From: Akhil Das ak...@sigmoidanalytics.com
To: 罗辉 luohui20...@sina.com
Cc: user user@spark.apache.org
Subject: Re: Re: How to decrease the time of storing block in memory
Date: 2015-06-09 16:51


Reply: Re: How to decrease the time of storing block in memory

2015-06-09 Thread luohui20001
Only 1 minor GC, 0.07s. 




 

Thanks & Best regards!
San.Luo

----- Original Message -----
From: Akhil Das ak...@sigmoidanalytics.com
To: 罗辉 luohui20...@sina.com
Cc: user user@spark.apache.org
Subject: Re: How to decrease the time of storing block in memory
Date: 2015-06-09 15:02


Re: Re: How to decrease the time of storing block in memory

2015-06-09 Thread Akhil Das
Is it that task taking 19s? It won't simply be taking 19s to store 2KB of data into memory; there could be other operations happening too (the transformations that you are doing). It would be good if you could paste the code snippet that you are running, to get a better understanding.

Thanks
Best Regards

On Tue, Jun 9, 2015 at 2:09 PM, luohui20...@sina.com wrote:


Re: How to decrease the time of storing block in memory

2015-06-09 Thread Akhil Das
Maybe you should check in your driver UI and see if there's any GC time involved, etc.

Thanks
Best Regards
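
As a side note (standard Spark tuning practice, not something stated in this thread), GC activity can also be printed into each executor's log by passing JVM flags at submit time. The class and jar names below are taken from the log and code in this thread; the trailing arguments (piece and user number) are examples:

```shell
# Emit per-collection GC details into each executor's stdout log
spark-submit \
  --class GeneCompare3 \
  --master spark://slave3:7077 \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  ShellCompare.jar 1 1
```

The per-task "GC Time" column on the stage page of the web UI shows the same information without resubmitting.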

On Mon, Jun 8, 2015 at 5:45 PM, luohui20...@sina.com wrote:


How to decrease the time of storing block in memory

2015-06-08 Thread luohui20001
hi there
  I am trying to decrease my app's running time on the worker node. I checked the log and found the most time-consuming part is below:

15/06/08 16:14:23 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 2.1 KB, free 353.3 MB)
15/06/08 16:14:42 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 693 bytes result sent to driver

I don't know why it needs 19s to store 2.1KB of data to memory. Is there any tuning method?

The attachment is the full log; here it is:

15/06/08 16:14:02 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/06/08 16:14:07 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
15/06/08 16:14:10 INFO spark.SecurityManager: Changing view acls to: root
15/06/08 16:14:10 INFO spark.SecurityManager: Changing modify acls to: root
15/06/08 16:14:10 INFO spark.SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(root); users with 
modify permissions: Set(root)
15/06/08 16:14:14 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/06/08 16:14:14 INFO Remoting: Starting remoting
15/06/08 16:14:15 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://driverPropsFetcher@slave5:54684]
15/06/08 16:14:15 INFO util.Utils: Successfully started service 
'driverPropsFetcher' on port 54684.
15/06/08 16:14:16 INFO spark.SecurityManager: Changing view acls to: root
15/06/08 16:14:16 INFO spark.SecurityManager: Changing modify acls to: root
15/06/08 16:14:16 INFO spark.SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(root); users with 
modify permissions: Set(root)
15/06/08 16:14:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Shutting down remote daemon.
15/06/08 16:14:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote 
daemon shut down; proceeding with flushing remote transports.
15/06/08 16:14:16 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/06/08 16:14:16 INFO Remoting: Starting remoting
15/06/08 16:14:17 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Remoting shut down.
15/06/08 16:14:17 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://sparkExecutor@slave5:49169]
15/06/08 16:14:17 INFO util.Utils: Successfully started service 'sparkExecutor' 
on port 49169.
15/06/08 16:14:17 INFO util.AkkaUtils: Connecting to MapOutputTracker: 
akka.tcp://sparkDriver@slave5:58630/user/MapOutputTracker
15/06/08 16:14:17 INFO util.AkkaUtils: Connecting to BlockManagerMaster: 
akka.tcp://sparkDriver@slave5:58630/user/BlockManagerMaster
15/06/08 16:14:17 INFO storage.DiskBlockManager: Created local directory at 
/tmp/spark-548b4618-4aba-4b63-9467-381fbfea8d5b/spark-83737dd6-46b0-47ee-82a5-5afee46bdbf5/spark-0fe9d8ba-2910-44a2-bf4f-80d179f5d58b/blockmgr-b4884e7a-2527-447a-9fc5-1823d923c2f1
15/06/08 16:14:17 INFO storage.MemoryStore: MemoryStore started with capacity 
353.3 MB
15/06/08 16:14:17 INFO util.AkkaUtils: Connecting to OutputCommitCoordinator: 
akka.tcp://sparkDriver@slave5:58630/user/OutputCommitCoordinator
15/06/08 16:14:17 INFO executor.CoarseGrainedExecutorBackend: Connecting to 
driver: akka.tcp://sparkDriver@slave5:58630/user/CoarseGrainedScheduler
15/06/08 16:14:18 INFO worker.WorkerWatcher: Connecting to worker 
akka.tcp://sparkWorker@slave5:48926/user/Worker
15/06/08 16:14:18 INFO executor.CoarseGrainedExecutorBackend: Successfully 
registered with driver
15/06/08 16:14:18 INFO executor.Executor: Starting executor ID 0 on host slave5
15/06/08 16:14:18 INFO worker.WorkerWatcher: Successfully connected to 
akka.tcp://sparkWorker@slave5:48926/user/Worker
15/06/08 16:14:21 WARN internal.ThreadLocalRandom: Failed to generate a seed 
from SecureRandom within 3 seconds. Not enough entrophy?
15/06/08 16:14:21 INFO netty.NettyBlockTransferService: Server created on 53449
15/06/08 16:14:21 INFO storage.BlockManagerMaster: Trying to register 
BlockManager
15/06/08 16:14:21 INFO storage.BlockManagerMaster: Registered BlockManager
15/06/08 16:14:21 INFO util.AkkaUtils: Connecting to HeartbeatReceiver: 
akka.tcp://sparkDriver@slave5:58630/user/HeartbeatReceiver
15/06/08 16:14:21 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 0
15/06/08 16:14:21 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
15/06/08 16:14:21 INFO executor.Executor: Fetching 
http://192.168.100.11:50648/jars/ShellCompare.jar with timestamp 1433751228699
15/06/08 16:14:21 INFO util.Utils: Fetching 
http://192.168.100.11:50648/jars/ShellCompare.jar to 
/tmp/spark-548b4618-4aba-4b63-9467-381fbfea8d5b/spark-83737dd6-46b0-47ee-82a5-5afee46bdbf5/spark-99aba156-5803-41b8-9df1-3e36305f43c3/fetchFileTemp8512004298922421624.tmp
15/06/08 16:14:21 INFO util.Utils: Copying