[ 
https://issues.apache.org/jira/browse/TEZ-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2775:
----------------------------------
    Attachment: TEZ-2775.3.txt

- Added logProgress() to ShuffleEventHandler. It is called from 
UnorderedKVInput.close(), ShuffleEventHandlerImpl.handleEvent and 
ShuffleInputEventHandlerOrderedGrouped.handleEvent. By default it prints once 
every 50 calls, and once more before closing.
- Added a logProgress() call in ShuffleScheduler's close() as well, so that 
after an abrupt failure the log still shows how many were copied.
- Retained the logs in MergeManager.closeInMemoryFile.
- Left the "FetcherOrderedGrouped for decomp len" message unchanged at DEBUG 
level, since it would otherwise occupy a lot of log space. It is useful for 
debugging corner cases where multiple attempts are downloaded over the same 
connection, but that is not a common scenario, so DEBUG is sufficient.
- No changes to ShuffleUtils:logIndividualFetchComplete() (need to change the 
perf tool later).
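
For readers unfamiliar with the pattern, the periodic logging described above 
can be sketched roughly as follows. This is not the code from the patch: the 
class name ProgressLogger, its methods and the message text are hypothetical, 
and the interval of 50 is taken from the comment above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper, not Tez code: logs progress once every N events. */
public class ProgressLogger {
    private final int interval;           // e.g. 50, as mentioned in the comment
    private final Consumer<String> sink;  // destination for log lines (assumed)
    private long eventsHandled = 0;

    public ProgressLogger(int interval, Consumer<String> sink) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
        this.sink = sink;
    }

    /** Call once per handled event; logs only on every interval-th call. */
    public void onEvent() {
        eventsHandled++;
        if (eventsHandled % interval == 0) {
            logProgress();
        }
    }

    /** Call when closing; always logs so the final count is not lost. */
    public void close() {
        logProgress();
    }

    private void logProgress() {
        sink.accept("events handled so far: " + eventsHandled);
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        ProgressLogger progress = new ProgressLogger(50, lines::add);
        for (int i = 0; i < 120; i++) {
            progress.onEvent();
        }
        progress.close();
        lines.forEach(System.out::println);
    }
}
```

Logging unconditionally in close() is what keeps the final count visible even 
when the total is not a multiple of the interval.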


Without patch: Query_75 @ 10 TB scale (with hive patch for l4j in SMB)
application_1439860407967_1259: 418,061,352 (compressed) : 6,375,796,132 (uncompressed)

With .1 patch:
application_1439860407967_1260: 142,492,745 (compressed) : 1,365,908,923 (uncompressed)

With .3 patch:
application_1439860407967_1280: 219,133,295 (compressed) : 2,410,892,797 (uncompressed)
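
To compare the runs, the relative reduction can be computed from the compressed 
figures above. The sketch below assumes the numbers are log volumes in the same 
unit for every run (the unit is not stated in this message); the class and 
method names are made up for illustration.

```java
/** Hypothetical helper for comparing the figures quoted above. */
public class LogVolume {
    /** Percentage by which 'after' is smaller than 'before'. */
    static double reductionPercent(long before, long after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        long withoutPatch = 418_061_352L; // compressed, without patch
        long patch1 = 142_492_745L;       // compressed, with .1 patch
        long patch3 = 219_133_295L;       // compressed, with .3 patch

        System.out.printf(".1 patch reduction: %.1f%%%n",
                reductionPercent(withoutPatch, patch1));
        System.out.printf(".3 patch reduction: %.1f%%%n",
                reductionPercent(withoutPatch, patch3));
    }
}
```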


> Reduce logging in runtime components
> ------------------------------------
>
>                 Key: TEZ-2775
>                 URL: https://issues.apache.org/jira/browse/TEZ-2775
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>         Attachments: TEZ-2775.1.txt, TEZ-2775.3.txt
>
>
> Specifically Shuffle, which logs a lot for each event being processed and 
> data being fetched.
> Also PipelinedShuffle is fairly noisy - some of the information from here 
> could be consolidated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
