[ 
https://issues.apache.org/jira/browse/TEZ-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-1566:
----------------------------------
    Attachment: TEZ-1566.1.patch

Post some analysis done on local cluster

Package, Size of logs, number of INFO log lines
======================================
(org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler:,1300035426,3252238)
(org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler:,858108834,2962423)
(org.apache.tez.runtime.library.common.sort.impl.IFile:,305599064,1376503)
(org.apache.tez.runtime.library.common.shuffle.impl.Fetcher:,270034851,736752)
(org.apache.tez.runtime.library.common.shuffle.impl.MergeManager:,112706088,395205)
(org.apache.tez.runtime.library.shuffle.common.HttpConnection:,104086919,293698)
(org.apache.tez.runtime.LogicalIOProcessorRuntimeTask:,17632363,78517)
(org.apache.tez.dag.history.HistoryEventHandler:,10456251,8040)


analyzed a job log which ran at 10 TB scale (application_1411511669099_0978):
Time taken: 2799.673 seconds
(org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler:,4220211264,12618536)
(org.apache.tez.runtime.library.common.shuffle.impl.Fetcher:,2708017450,7370119)
(org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler:,1714279150,3975012)
(org.apache.tez.runtime.library.common.shuffle.impl.MergeManager:,1096099941,3733091)
(org.apache.tez.runtime.library.common.sort.impl.IFile:,838530582,3631986)
(org.apache.tez.runtime.library.shuffle.common.HttpConnection:,646022486,1537542)
(org.apache.tez.runtime.LogicalIOProcessorRuntimeTask:,32617293,141620)
(org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter:,31282080,209802)
(org.apache.tez.dag.history.HistoryEventHandler:,21952366,18959)


Post fix (same job @ 10 TB scale): application_1411511669099_0980
Time taken: 2245.886 seconds
(org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler:,3185060504,9118344)
(org.apache.tez.runtime.library.common.shuffle.impl.Fetcher:,2697984459,7443895)
(org.apache.tez.runtime.library.common.shuffle.impl.MergeManager:,1107250913,3770807)
(org.apache.tez.runtime.library.shuffle.common.HttpConnection:,505282991,1046448)
(org.apache.tez.runtime.LogicalIOProcessorRuntimeTask:,32956738,143419)
(org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter:,26897012,172731)
(org.apache.tez.dag.history.HistoryEventHandler:,21827554,18819)

Job runtime difference is around 25% and overall log size is also reduced by 
around 30-33%.  Job runtime difference could be mainly attributed by moving 
logs in ShuffleEventInputHandler, ShuffleScheduler to debug level (attributing 
to faster processing of events instead of waiting for syncs of log4j). 

Attaching first version of the patch.  [~sseth] - Can you please review? import 
changes were mainly due to auto-import rearrange of intellj.

> Reduce log verbosity
> --------------------
>
>                 Key: TEZ-1566
>                 URL: https://issues.apache.org/jira/browse/TEZ-1566
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-1566.1.patch
>
>
> We generate 40+GB of logs for a simple 20K X 1K sleep job, at INFO level.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to