Re: hadoop15 & hadoop14 both in lib

Andrzej Bialecki Sat, 16 Feb 2008 01:57:40 -0800

Alan Gates wrote:

A few answers to your questions.
The hadoopX.jar files in pig's lib directory are not the standard hadoop jars. They differ in two ways. First, we recreate a hadoop jar that rolls in all the jars needed to compile with hadoop. This is somewhere around 15 jars. Second, we have a small hack we add for historical reasons. We need to resolve both of those issues. Once we do we can use stock hadoop jars instead of carrying along our own.

If you want to keep hadoop-related jars separate from other jars, you could put them all together in a lib/hadoop subdir. Re-packaging jars is confusing: you lose the versioning information of the dependent jars, and some jars may also depend on specific values in their MANIFEST, which repackaging may have dropped.
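
As a rough illustration of the MANIFEST point, here is a minimal sketch of how version information is read from a manifest and how it silently disappears when a merged jar keeps only a generic manifest. The class name ManifestVersionCheck and the sample manifest text are made up for this example and are not taken from Pig, Hadoop or Nutch; only the java.util.jar classes are real.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper, for illustration only.
public class ManifestVersionCheck {

    // Parses manifest text and returns its Implementation-Version,
    // or null when the attribute is absent.
    static String versionOf(String manifestText) {
        try {
            Manifest manifest = new Manifest(
                new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes()
                           .getValue(Attributes.Name.IMPLEMENTATION_VERSION);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Manifest as a stock dependency jar might ship it (sample values).
        String stock = "Manifest-Version: 1.0\n"
                     + "Implementation-Title: some-dependency\n"
                     + "Implementation-Version: 1.2\n";

        // Manifest of a merged jar: a jar has a single META-INF/MANIFEST.MF,
        // so the per-dependency attributes have nowhere to go.
        String repackaged = "Manifest-Version: 1.0\n"
                          + "Created-By: repackaging step\n";

        System.out.println("stock:      " + versionOf(stock));
        System.out.println("repackaged: " + versionOf(repackaged));
    }
}
```

Code that looks up its own version this way gets null after repackaging, which is one reason to keep the original jars side by side in lib/hadoop.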

Regarding the hack: we had similar problems in Nutch. If changes are required to core Hadoop, perhaps it's better to submit them to Hadoop for inclusion. If they are a temporary hack, perhaps a facade class is a better approach. In some cases in Nutch we had to use a patched library anyway, which was then clearly marked as such, and diffs from the stock version were available in JIRA.
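
To make the facade idea concrete, here is a minimal sketch. The thread does not say what Pig's hack actually does, so everything below is hypothetical: StockPathResolver stands in for an unmodified library class, and the backslash workaround is an invented placeholder for whatever temporary fix is needed. The point is only that the fix lives in the application's own class, so the library jar stays stock.

```java
// Stand-in for a class shipped in a stock, unmodified library jar (hypothetical).
final class StockPathResolver {
    String resolve(String path) {
        // Assume the stock behaviour handles only forward-slash paths.
        return "/data/" + path;
    }
}

// Facade: application code calls this class instead of the library class.
// When the library gains the needed behaviour, only this class changes.
public final class PathResolverFacade {
    private final StockPathResolver delegate = new StockPathResolver();

    public String resolve(String path) {
        // Temporary workaround kept outside the library (invented example):
        // normalize backslashes before delegating to the stock code.
        String normalized = path.replace('\\', '/');
        return delegate.resolve(normalized);
    }

    public static void main(String[] args) {
        System.out.println(new PathResolverFacade().resolve("logs\\2008"));
    }
}
```

Removing the workaround later means deleting one line in the facade rather than rebuilding a patched jar.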


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
