2010/10/26 Johannes.Lichtenberger <[email protected]>:
> On 10/26/2010 07:39 AM, Paweł Łoziński wrote:
>> Hi,
>>
>> the framework doesn't give you the first/last information about reduce
>> job you perform in your reducer. Just as the mapper doesn't give you
>> information whether the (key, value) pair passed to map function is
>> first/last for a given key. However you can workaround this by adding
>> special values to your data, e.g. <page><id>0</id>... and
>> <page><id>Long.MAX_VALUE</id>.... When you encounter those in your
>> reducer, you know you are at the beginning/end of your data and you
>> can emit <root> and </root>.
>
> This wouldn't work, since it might as well be possible that the last
> value isn't Long.MAX_VALUE.
>
The idea is to choose a special value such that every real value in your data is definitely smaller. For 64-bit numerical values this would be Long.MAX_VALUE; generally speaking, it is the last value in the possible range of values (or better: the last value + 1). Since a reducer receives its keys in sorted order, you can then be sure it will process the special value last and emit </root> to your output. Of course, if you have multiple reducers, the closing tag will appear in the output of only one of them.
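
To make the idea concrete, here is a minimal sketch in plain Java with no Hadoop dependency. Everything in it is an assumption for illustration: the class name SentinelDemo, the method reduceAll, and the use of a TreeMap to stand in for the sorted order in which a single reducer would see its keys. It also assumes that real page ids lie strictly between 0 and Long.MAX_VALUE, so the two sentinels can never collide with real data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SentinelDemo {
    // Hypothetical sentinel keys: assumed to sort before and after every real id.
    static final long START_KEY = 0L;
    static final long END_KEY = Long.MAX_VALUE;

    // Stand-in for a single reducer: walks the keys in ascending order and
    // emits the opening/closing tag when it meets a sentinel key.
    static List<String> reduceAll(TreeMap<Long, String> sortedPages) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String> entry : sortedPages.entrySet()) {
            long id = entry.getKey();
            if (id == START_KEY) {
                out.add("<root>");
            } else if (id == END_KEY) {
                out.add("</root>");
            } else {
                out.add("<page><id>" + id + "</id>" + entry.getValue() + "</page>");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> pages = new TreeMap<>();
        // Insertion order does not matter; the TreeMap sorts by key.
        pages.put(42L, "b");
        pages.put(END_KEY, "");
        pages.put(7L, "a");
        pages.put(START_KEY, "");
        reduceAll(pages).forEach(System.out::println);
    }
}
```

With a single reducer, the sentinels end up as the first and last lines of the output. With several reducers, each sentinel reaches only the reducer its key is partitioned to, which is why the closing tag shows up in just one output file.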
