Sebastian Nagel created NUTCH-3059:
--------------------------------------

             Summary: Generator: selector job does not count reduce output records
                 Key: NUTCH-3059
                 URL: https://issues.apache.org/jira/browse/NUTCH-3059
             Project: Nutch
          Issue Type: Bug
          Components: generator
    Affects Versions: 1.20
            Reporter: Sebastian Nagel
             Fix For: 1.21


The selector step (job) of the Generator does not count its reduce output 
records; the "Reduce output records" counter shows 0:
{noformat}
2024-06-05 13:57:09,299 INFO o.a.n.c.Generator [main] Generator: starting

2024-06-05 13:57:09,299 INFO o.a.n.c.Generator [main] Generator: selecting best-scoring urls due for fetch.
...
         Map-Reduce Framework
                Map input records=6
                Map output records=6
                ...
                Combine input records=0
                Combine output records=0
                Reduce input groups=1
                Reduce shuffle bytes=594
                Reduce input records=6
                Reduce output records=0
                Spilled Records=12
                ...
{noformat}
This is not a major issue, but the cause should be investigated. The other 
counters appear to work properly, and the partitioner job does report its 
reduce output records. The issue occurs in both local and distributed mode.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
