On Sep 5, 2010, at 2:59 PM, Sean Owen wrote:

> Having an interesting issue fixing up a technical point in the class
> SequenceFileVectorIterable. Its Iterator is wrong in that hasNext()
> advances the iteration and next() doesn't. There's a way to fix this
> easily: the Iterator just needs to always read one item ahead to know
> whether a next one exists.
>
> However doing this the straightforward way, the Iterator doesn't know
> the current key/value it's on -- always the next one. This is an issue
> since in the one usage of this class, in VectorDumper, the current key
> is accessed for printing.
>
> The Iterator could just store the last key/value it saw. However, the
> key can be an arbitrary Writable. To do this correctly, the Writable
> would have to be Cloneable and be clone()-ed, which is not guaranteed
> and maybe undesirable.
>
> 1. We can remove the option in VectorDumper to print keys to fix this,
> since that's the only thing that wants to read the current key. How
> bad is that?
> 2. We can iterate directly over the SequenceFile in VectorDumper to
> get desired behavior. OK? Then actually SequenceFileVectorIterable
> goes away.

For some reason, I recall the SFVI being useful for other things, but I'm failing to recall them at the moment.
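
For reference, the read-one-ahead approach described in the quoted message could be sketched roughly as below. This is only a sketch under assumptions, not the actual SequenceFileVectorIterable code: PairSource is a hypothetical stand-in for a reader that fills caller-supplied objects (in the style of SequenceFile.Reader.next(key, value)), StringBuilder stands in for a Writable, and the Pair and LookAheadIterator names are made up for illustration. It allocates a fresh key/value for every record, so the pair handed back by next() is never overwritten by the read-ahead and nothing has to be clone()-ed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

class LookAheadSketch {

    /** Hypothetical reader: fills the supplied key/value, returns false at end of input. */
    interface PairSource<K, V> {
        boolean next(K key, V value);
    }

    /** Simple holder so the caller gets the key together with the value. */
    static final class Pair<K, V> {
        final K key;
        final V value;

        Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Iterator that always holds one record in reserve. */
    static final class LookAheadIterator<K, V> implements Iterator<Pair<K, V>> {
        private final PairSource<K, V> source;
        private final Supplier<K> newKey;
        private final Supplier<V> newValue;
        private Pair<K, V> buffered; // null once the source is exhausted

        LookAheadIterator(PairSource<K, V> source, Supplier<K> newKey, Supplier<V> newValue) {
            this.source = source;
            this.newKey = newKey;
            this.newValue = newValue;
            advance(); // read the first record up front
        }

        /** Reads the next record into freshly allocated objects (assumption: allocation is acceptable). */
        private void advance() {
            K key = newKey.get();
            V value = newValue.get();
            buffered = source.next(key, value) ? new Pair<>(key, value) : null;
        }

        @Override
        public boolean hasNext() {
            return buffered != null; // no side effects, safe to call repeatedly
        }

        @Override
        public Pair<K, V> next() {
            if (buffered == null) {
                throw new NoSuchElementException();
            }
            Pair<K, V> current = buffered;
            advance();
            return current; // still intact, since advance() used new objects
        }
    }

    /** In-memory source used only for the demonstration. */
    static PairSource<StringBuilder, StringBuilder> demoSource(List<String[]> rows) {
        Iterator<String[]> it = rows.iterator();
        return (key, value) -> {
            if (!it.hasNext()) {
                return false;
            }
            String[] row = it.next();
            key.append(row[0]);
            value.append(row[1]);
            return true;
        };
    }

    /** Mimics a dumper that prints the current key next to the current value. */
    static List<String> dump(List<String[]> rows) {
        Iterator<Pair<StringBuilder, StringBuilder>> it =
            new LookAheadIterator<StringBuilder, StringBuilder>(
                demoSource(rows), StringBuilder::new, StringBuilder::new);
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            it.hasNext(); // an extra call must not skip a record
            Pair<StringBuilder, StringBuilder> pair = it.next();
            out.add(pair.key + "=" + pair.value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"doc1", "[1.0, 2.0]"});
        rows.add(new String[] {"doc2", "[3.0]"});
        for (String line : dump(rows)) {
            System.out.println(line);
        }
    }
}
```

The trade-off is one key and one value allocation per record instead of reusing a single pair. If that is considered too expensive, iterating over the SequenceFile directly in VectorDumper (option 2 in the quoted message) sidesteps the question.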