Hi, I'm fine with defaulting to Avro 1.5.4 -- as I understand it, this doesn't get in the way of people pulling in a newer version of Avro in their own poms, so I don't see it as a problem.
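
For anyone who wants to try that, here is a rough sketch of such an override in a user's own pom. It assumes the standard Maven coordinates for Avro, and the 1.7.0 version is only an example picked from the versions mentioned in this thread. A dependency declared directly in a project takes precedence over the version pulled in transitively, so this should be enough, but I have not tested it against the release branch.

```xml
<!-- Hypothetical fragment for a user's pom.xml: declare Avro directly so
     this version wins over the one Crunch brings in transitively. -->
<dependencies>
  <dependency>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro</artifactId>
    <!-- Example only; any version the user has verified could go here. -->
    <version>1.7.0</version>
  </dependency>
</dependencies>
```

Running `mvn dependency:tree` afterwards is a quick way to check which Avro version actually ends up on the classpath.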

I personally think that we need to resolve CRUNCH-23 before we do a release -- however, I think that there are still some compatibility issues with the patch in its current state, and I can't commit to looking into it much in the coming days. I also don't want to really delay the release.

What I'd like to propose is that we just set the sort to use a single reducer for now -- this way the total order sorting will work, but just be less efficient. I think that having a slow sort gives a better impression than having a broken sort, especially considering that sort is a base operation that other operations might be built on top of, so if we have a broken sort it could result in some very hard-to-find issues for users.

Other than that, I'm all for a release, and very excited about reaching that milestone!

Hope everyone is enjoying their vacation (and actually not reading this until they're back from vacation).

- Gabriel


On Wednesday 22 August 2012 at 08:50, Josh Wills wrote:

> Hey all,
>
> I just committed CRUNCH-16, which was the last of our open issues that I
> wanted to resolve before our first release. Although I look forward to the
> total ordering sort in CRUNCH-23 and refactoring the planner in CRUNCH-34,
> I feel fine holding off on them until the next release. If any of you feel
> differently or have any other features/bug fixes that you would like to get
> in, now would be a good time to discuss them and give an ETA on their
> arrival.
>
> Following Matthias' release proposal, we should create a release branch and
> do the final preparations for the release against it. In my mind, that
> consists of removing the SNAPSHOT labels from the POMs and, more
> importantly, deciding on the Avro versions that will be supported in 0.3.0.
> For most of the release, we've been working against 1.6.2 or 1.7.0. But
> Hadoop 2.0.0-alpha runs against Avro 1.5.x, which is not API compatible
> with either, and has certain limitations w/respect to mixing specific and
> reflection-based schemas that can cause problems in certain use cases.
> Fortunately, Gabriel's changes as part of CRUNCH-16 dynamically check the
> functionality that the version of Avro that is being used supports and
> adapts our handling of them accordingly.
>
> There is no reason that we couldn't run with Avro 1.7.x on Hadoop 1.0.3 and
> run on Avro 1.5.x on Hadoop 2.0.0-alpha, but it feels a little odd to have
> users go backwards in terms of the capabilities of the API when they move
> to a later version of Hadoop. Therefore, I think that the release branch
> for Crunch 0.3.0 should default to using Avro 1.5.4 for both 1.0.3 and
> 2.0.0-alpha, in order to minimize the surprise that a new user would
> encounter in working with the release. We can of course have documentation
> on the Wiki explaining the issue and notifying users of how to upgrade
> their Avro version by changing the pom, as we verified that CRUNCH-16 will
> work on Avro 1.5.x, 1.6.x, and 1.7.x.
>
> I am on vacation tomorrow through Sunday and will be out of phone/email/IM
> contact for that entire time. (I'm really looking forward to the downtime.)
>
> It feels good to be close to a release. I'll check back in with the list on
> Sunday evening to see where everyone is at.
>
> J
