I have been very excited to give Pig 0.9 a try against our Cloudera CDH3U0 Hadoop cluster. To make it work, I need to point Pig at the Cloudera Hadoop libraries. I tried rebuilding Pig and ran into the behavior described in PIG-2055. My other option seemed to be to set the PIG_HADOOP_VERSION environment variable and add the Cloudera hadoop-core jar to the Pig lib directory, but this did not work. However, if I change bin/pig so that the Hadoop jar is placed ahead of the Pig jar in the classpath, it works as I would expect (i.e., the Cloudera hadoop-core jar is included before pig-core.jar and Pig runs correctly).
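To illustrate the ordering I mean, here is a minimal sketch. It is not the actual bin/pig code; the paths and variable names (HADOOP_JAR, PIG_JAR) are hypothetical and only show the Hadoop jar being listed before the Pig jar, so that the JVM searches it first.

```shell
#!/bin/sh
# Hypothetical install locations, for illustration only; adjust to the real paths.
HADOOP_JAR="/usr/lib/hadoop/hadoop-core.jar"
PIG_JAR="/opt/pig/pig-core.jar"

# The JVM searches classpath entries left to right, so listing the
# Hadoop jar first means its classes are found before any in the Pig jar.
CLASSPATH="$HADOOP_JAR:$PIG_JAR"
export CLASSPATH

echo "$CLASSPATH"
```

With the order reversed (Pig jar first), any class present in both jars would be loaded from the Pig jar instead.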
My question is: does this approach make sense? If so, is it worthwhile to log a JIRA, or am I just misunderstanding how to specify an alternate Hadoop distribution? bin/pig in trunk does look different, although it also seems to put pig-core.jar before the Hadoop jars in the classpath. Thanks
