Try setting tez.task.launch.cmd-opts = '-Xprof ...rest of your java opts....' instead.

-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s is very costly and time consuming, whereas -Xprof is very lightweight.

http://stackoverflow.com/questions/32083547/explanation-of-java-xprof-output/32087981#32087981 has information on interpreting the results.
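
As a rough sketch of what I mean (REST_OPTS and TEZ_OPTS are just illustrative variable names, and the two JVM flags are copied from your command as placeholders; keep whichever of your other options you still need, minus the -agentlib:hprof entry):

```shell
# Hypothetical helper variables, not part of Tez itself.
# REST_OPTS: the JVM options you already pass, without -agentlib:hprof.
REST_OPTS="-XX:+UseSerialGC -Djava.net.preferIPv4Stack=true"

# Prepend -Xprof so the task JVMs run the lightweight built-in profiler.
TEZ_OPTS="-Xprof ${REST_OPTS}"

# Print the -D argument to add to the job's command line.
echo "-Dtez.task.launch.cmd-opts=\"${TEZ_OPTS}\""
```

The printed argument would take the place of the -Dtez.task.launch.cluster-default.cmd-opts=... setting in your job invocation.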
On Wed, Jan 18, 2017 at 4:09 PM, Piyush Narang <[email protected]> wrote:

> hi folks,
>
> I had a couple of Cascading3 on Tez jobs that seemed to be running slower
> on Tez as compared to Hadoop. Wanted to try and get some hprof profiles to
> see what the jobs are spending time on, so thought I'd try and get a hprof
> profile. I tried running the job with:
>
> -Dmapreduce.task.profile=true \
> -Dtez.task.launch.cluster-default.cmd-opts="-XX:+UseSerialGC
> -Djava.net.preferIPv4Stack=true -XX:ReservedCodeCacheSize=128M
> -XX:MaxMetaspaceSize=256M -XX:CompressedClassSpaceSize=256M
> -XX:CICompilerCount=2 -XX:HeapDumpPath=<LOG_DIR>/heapdump-@[email protected]
> -XX:ErrorFile=<LOG_DIR>/hs_err_pid-@[email protected]
> -agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=<LOG_DIR>/profile-@[email protected]"
>
> Now when I kick off the Tez job, it seems to spend very long (upwards of
> an hour) stuck at 0%. The tasks don't seem to proceed beyond this:
>
> 2017-01-18 23:53:26,261 [INFO] [TezChild] |tez.FlowProcessor|: flow node id:
> E08C07BFB10141D8B6D7211E5AF172E4, all 1 inputs ready in: 00:00:00.002
>
> Tried capturing some jstacks and the top of the stack seems to hold some hprof
> related frames.
>
> Has anyone been able to profile their Tez jobs with hprof? Are there any other
> settings I'm missing?
>
> Same set of options(replace tez.task.launch.cluster-default.cmd-opts with
> mapreduce.task.profile.params) seem to work fine in case of Hadoop. I end up
> getting the hprof profiles there.
>
> Thanks,
> --
> - Piyush
