On Sep 5, 2010, at 2:59 PM, Sean Owen wrote:

> Having an interesting issue fixing up a technical point in the class
> SequenceFileVectorIterable. Its Iterator is wrong in that hasNext()
> advances the iteration and next() doesn't. There's a way to fix this
> easily: the Iterator just needs to always read one item ahead to know
> whether a next one exists.
> 
> However, doing this the straightforward way, the Iterator doesn't know
> the current key/value it's on -- always the next one. This is an issue
> since in the one usage of this class, in VectorDumper, the current key
> is accessed for printing.
> 
> The Iterator could just store the last key/value it saw. However, the
> key can be an arbitrary Writable. To do this correctly, the Writable
> would have to be Cloneable and be clone()-ed, which is not guaranteed
> and maybe undesirable.
> 
> 1. We can remove the option in VectorDumper to print keys to fix this,
> since that's the only thing that wants to read the current key. How
> bad is that?
> 2. We can iterate directly over the SequenceFile in VectorDumper to
> get desired behavior. OK? Then actually SequenceFileVectorIterable
> goes away.

For some reason, I recall SequenceFileVectorIterable (SFVI) being useful for
other things as well, but I can't remember what they were at the moment.
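
For reference, the read-ahead fix described in the quoted message could be
sketched roughly as below. This is only an illustration, not the actual
Mahout code: ReadAheadIterator and its Supplier-based source (which returns
null when exhausted) are hypothetical names made up for the example. Note
that it hands back the buffered object itself, so it only behaves correctly
when the source returns a fresh object on each read. With a reader that
reuses a single Writable instance, the "current" item would be overwritten
by the read-ahead, which is exactly the problem raised above.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/**
 * Hypothetical read-ahead wrapper: hasNext() never advances,
 * next() returns the buffered item and then reads one item ahead.
 */
final class ReadAheadIterator<T> implements Iterator<T> {

    // Assumption: the source returns null once it is exhausted.
    private final Supplier<T> source;
    private T buffered;
    private boolean done;

    ReadAheadIterator(Supplier<T> source) {
        this.source = source;
        advance(); // prime the buffer so hasNext() can answer immediately
    }

    /** Reads the next item from the source into the buffer. */
    private void advance() {
        buffered = source.get();
        if (buffered == null) {
            done = true;
        }
    }

    @Override
    public boolean hasNext() {
        // Safe to call any number of times; it only inspects the buffer.
        return !done;
    }

    @Override
    public T next() {
        if (done) {
            throw new NoSuchElementException();
        }
        T current = buffered;
        advance(); // the iterator is now positioned on the item after 'current'
        return current;
    }
}
```

A caller that needs the key of the current item would have to get it from the
object returned by next() (for example a key/value pair), since the iterator's
internal state already points at the following item.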
