I was reviewing a video from Hadoop Summit 2011 [1] in which Arun Murthy mentioned that MRv2 is moving towards Protocol Buffers as the wire format. That seems contrary to the Avro presentation Doug Cutting gave at Hadoop World '09 [2]. I haven't kept up with the JIRA for MRv2, but is there a disagreement among contributors as to which format will be the de facto standard going forward and, if so, what are the biggest points of contention? I bring this up only because I am trying to integrate a serialization framework into our best practices and, while I am currently working towards Avro, this disconnect has caused me some concern.
Matt

[1] http://www.youtube.com/watch?v=2FpO7w6X41I
[2] http://www.cloudera.com/videos/hw09_next_steps_for_hadoop