Bug in DeleteDuplicates.java ?
This function throws IOException. Why?

    public long getPos() throws IOException {
      return (doc*INDEX_LENGTH)/maxDoc;
    }

It should be throwing ArithmeticException.

What happens when maxDoc is zero?

Gal
Re: Bug in DeleteDuplicates.java ?
Gal Nitzan wrote:
> This function throws IOException. Why?
>
>     public long getPos() throws IOException {
>       return (doc*INDEX_LENGTH)/maxDoc;
>     }
>
> It should be throwing ArithmeticException.

The IOException is required by the API of RecordReader.

> What happens when maxDoc is zero?

Ka-boom! ;-) You're right, this should be wrapped in an IOException and rethrown.

-- 
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
Re: Bug in DeleteDuplicates.java ?
Andrzej Bialecki wrote:
> Gal Nitzan wrote:
>> This function throws IOException. Why?
>>
>>     public long getPos() throws IOException {
>>       return (doc*INDEX_LENGTH)/maxDoc;
>>     }
>>
>> It should be throwing ArithmeticException.
>
> The IOException is required by the API of RecordReader.
>
>> What happens when maxDoc is zero?
>
> Ka-boom! ;-) You're right, this should be wrapped in an IOException and rethrown.

No, it should really just be fixed to not cause an ArithmeticException.

This is called to report progress. In this case the input "file" for the map is a Lucene index whose documents we iterate through. To simplify the construction of input splits (without opening each index), a constant "length" is used for each "file". So we have to scale the document numbers to give progress in this range. The problem is that progress may be reported even when there are no documents in the index. So the call is valid and no exception should be thrown.

Doug
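
For illustration, the guard described above might look roughly like the following. This is only a sketch, not the committed change: the class name ProgressSketch and the static helper scaledPos are hypothetical, the value chosen for INDEX_LENGTH is an assumption, and the real getPos() would keep its "throws IOException" clause because RecordReader requires it.

```java
/** Hypothetical stand-in for the reader discussed in this thread; not the actual Nutch class. */
class ProgressSketch {
    // Assumed constant "length" given to every index "file"; the real value may differ.
    static final long INDEX_LENGTH = Integer.MAX_VALUE;

    /**
     * Scales the current document number into the range [0, INDEX_LENGTH].
     * An empty index (maxDoc == 0) reports position 0 instead of dividing by zero.
     */
    static long scaledPos(int doc, int maxDoc) {
        if (maxDoc == 0) {
            return 0L; // no documents: progress is reported, but nothing has been consumed
        }
        // INDEX_LENGTH is a long, so the multiplication is carried out in 64 bits.
        return (doc * INDEX_LENGTH) / maxDoc;
    }
}
```

With this guard, a progress call against an empty index returns 0 rather than throwing, and non-empty indexes are scaled exactly as before.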