On Mon, Dec 13, 2010 at 10:08 PM, Owen O'Malley <[email protected]> wrote:
> On Dec 7, 2010, at 2:37 PM, Roy T. Fielding wrote:
>
>>> The proposal is to change the extension mechanism incompatibly with
>>> unclear benefits,
>>
>> Good, these are technical reasons. The benefits can be cleared by docs.
>> By incompatible, I assume you mean forward-compatibility of old versions
>> of Hadoop reading newer files. Can we fix that by having the new
>> implementation use the old file format by default until it is configured
>> to use one of the new interfaces for writing?
>
> There are two goals here. The first is to extend the serialization plugin
> interface. The current patch does things completely compatibly including a
> shim that will use the previous plugins to satisfy the new API. The benefits
> are also clear. Avro serialization is possible when it wasn't previously. It
> also provides a wide range of opportunities that weren't previously
> possible.
>
> The file format was changed as a demonstration that the serialization
> interface was useful and complete. The file change is also backwards
> compatible and will automatically read old versions of the file. Old
> versions of the code will complain with an error message if they are given a
> new version. This is exactly the pattern we have used in the past.
>
> So, no there are no technical issues with the patch as it stands.

One of the technical issues is the fact that this precludes users from using
PB (or thrift or avro) in their jobs if the version required conflicts with
what Hadoop proper has on the classpath. We've already seen these kinds of
conflicts with other libraries in the wild and I would like to minimize this
possibility in the future. Was there something in the patch that addressed
this (I may have missed it; only did a cursory scan through)?

Jumping back to the "non-technical" issue, I really think it would help to
develop a course of action for resolution similar to what I suggested
earlier. It doesn't need to be specifically what I suggested, but I do think
that consensus building and conflict resolution are in the best interest of
the community. I feel like we could debate what people said, did, meant, or
the specifics of this issue for a long time.

Thanks and regards.
-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com
