Sergey created PIG-3416:
---------------------------

             Summary: Looks like MultiStore stucks
                 Key: PIG-3416
                 URL: https://issues.apache.org/jira/browse/PIG-3416
             Project: Pig
          Issue Type: Bug
          Components: data
    Affects Versions: 0.11
            Reporter: Sergey


Hi, I've met strange problem. Maybe it's related to data. BUt I'm not sure. I'm 
working with derivative in avro format so all "bad data" should be caught on 
early stages.
My pig script worked to 2 days each hour (invoked using oozie coordinator).
Now it stucks. It always have one reducer which shows progess = 67.55%
I see in TT log that it does merge, sort, then starts reduce.

I do use custom UDF in my pig script.
I've added counters trying to debug the situation.My UDF works with bags.
Counter says that UDF worked fine because "Reduce input groups" = "invocation 
times of UDF".

I even see counters of output:
{code}
Map-Reduce Framework
Combine input records   0
Combine output records  0
Reduce input groups     31 019
Reduce shuffle bytes    65 071 957
Reduce input records    96 727
Reduce output records   0
Spilled Records 0
CPU time spent (ms)     48 870
Physical memory (bytes) snapshot        643 358 720
Virtual memory (bytes) snapshot 3 821 920 256
Total committed heap usage (bytes)      1 057 357 824

MarkEndPointsForCurrentHour
callTimes       31 019
totalExecutionTime      13 690

MultiStoreCounters
Output records in _0_08 26 862
Output records in _1_09 7 383
{code}
Counters say that all data passed through my UDF and even some output has been 
written.
But reducer (always only 1 of 54 total reducers) stucks for 1 hour an then 
killed by JT because of timeout. All other 53 reducers finished in 7 minutes.

How can I debug MultiStore?


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to