[ 
https://issues.apache.org/jira/browse/TEZ-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519864#comment-14519864
 ] 

Hitesh Shah commented on TEZ-2378:
----------------------------------

bq. These are normal exceptions when fetcher is unable to get the data from 
local system

I don't understand. Why is the fetcher failing to read off the local 
filesystem? I would assume that should be a cause for concern. It may be a 
distraction when you know that something else is wrong, but it is a problem 
regardless if it is happening in the first place.

Also, if the local fetch fails, do we error out, given that falling back to 
the HTTP fetch would likely hit the same error?

bq. 2015-04-28 05:41:45,487 WARN [Fetcher [Map_5] #15] shuffle.Fetcher: Failed 
to shuffle output of InputAttemptIdentifier [inputIdentifier=InputIdentifier 
[inputIndex=81], attemptNumber=0, 
pathComponent=attempt_1429683757595_0485_1_03_000081_0_10003, 
fetchTypeInfo=FINAL_MERGE_ENABLED, spillEventId=-1] from 
cn047-10.l42scl.hortonworks.com(local fetch)
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find 
output/attempt_1429683757595_0485_1_03_000081_0_10003/file.out.index in any of 
the configured local directories

Also, I am assuming that we are not making calls to the local fetcher when the 
data is remote or if the output size is 0? So, the above error should only be 
seen when the map output somehow disappeared off the local disk?
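
To make the assumption above concrete, here is a rough sketch of the behaviour 
being discussed: attempt a local fetch only when the producer ran on the same 
host and the output is non-empty, and record a failure in a counter while 
logging it at debug level. All class, method, and counter names are 
hypothetical; this is not the actual Tez Fetcher code, and java.util.logging 
stands in for whatever logger the project uses.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LocalFetchSketch {
    private static final Logger LOG =
        Logger.getLogger(LocalFetchSketch.class.getName());

    // Hypothetical counter (not an existing Tez counter) that tracks how
    // often a local fetch failed, so the signal survives the quieter logging.
    static final AtomicLong localFetchFailures = new AtomicLong();

    // Assumption under discussion: a local fetch is attempted only when the
    // producer ran on this host and the output is non-empty.
    static boolean shouldTryLocalFetch(String producerHost, String localHost,
                                       long outputSize) {
        return outputSize > 0 && producerHost.equals(localHost);
    }

    // Count every failure, but emit the stack trace only at debug (FINE)
    // level to keep the log size down.
    static void recordLocalFetchFailure(String pathComponent, Exception cause) {
        localFetchFailures.incrementAndGet();
        if (LOG.isLoggable(Level.FINE)) {
            LOG.log(Level.FINE, "Local fetch failed for " + pathComponent, cause);
        }
    }

    public static void main(String[] args) {
        // Host names and sizes below are made up for illustration.
        System.out.println(shouldTryLocalFetch("nodeA", "nodeA", 1024L));
        System.out.println(shouldTryLocalFetch("nodeA", "nodeB", 1024L));
        System.out.println(shouldTryLocalFetch("nodeA", "nodeA", 0L));
        recordLocalFetchFailure("attempt_x",
            new java.io.IOException("index file not found"));
        System.out.println(localFetchFailures.get());
    }
}
```

If the assumption holds, a failure recorded here means the output was expected 
on local disk and is missing, which is why a counter seems worth keeping even 
if the log level is lowered.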



> In case Fetcher (unordered) fails to do local fetch, log in debug mode to 
> reduce log size
> -----------------------------------------------------------------------------------------
>
>                 Key: TEZ-2378
>                 URL: https://issues.apache.org/jira/browse/TEZ-2378
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>
> The following can be logged at DEBUG level instead of WARN level. Maybe 
> counters can be added later to track the number of times the local fetch 
> failed.
> {noformat}
> 2015-04-28 05:41:45,487 WARN [Fetcher [Map_5] #15] shuffle.Fetcher: Failed to 
> shuffle output of InputAttemptIdentifier [inputIdentifier=InputIdentifier 
> [inputIndex=81], attemptNumber=0, 
> pathComponent=attempt_1429683757595_0485_1_03_000081_0_10003, 
> fetchTypeInfo=FINAL_MERGE_ENABLED, spillEventId=-1] from 
> cn047-10.l42scl.hortonworks.com(local fetch)
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find 
> output/attempt_1429683757595_0485_1_03_000081_0_10003/file.out.index in any 
> of the configured local directories
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:449)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:164)
>         at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.getShuffleInputFileName(Fetcher.java:612)
>         at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.getTezIndexRecord(Fetcher.java:592)
>         at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.doLocalDiskFetch(Fetcher.java:537)
>         at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.doSharedFetch(Fetcher.java:353)
>         at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:192)
>         at 
> org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:72)
>         at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
