Hi all,

I think the problem lies deeper and should be solved at a more fundamental level: OSGi (http://www.osgi.org)
This has the *classloader*, *modularity* and *distribution* management maturity that IMHO Hadoop clearly needs (from what I know, albeit circa 1.9). It's 10 years old, not headline app-server-tastic, nor flavour of the month http://java.dzone.com/articles/osgi-feast-or-famine - but that's the point: this is proven tech. And it'll eventually be present at the lowest levels of Java, e.g. Project Jigsaw: http://openjdk.java.net/projects/jigsaw/ which is a precursor to completing JSR-291 <http://markmail.org/message/5fjx7pzq6kmwagch>

A number of people have tried to introduce OSGi to Hadoop, but it seems their efforts *may* have been ignored by those in the meritocratic circle of power. This is a real shame; perhaps Jira voting is the way to draw attention to this? You could vote on this ticket, which is over 2 years old: https://issues.apache.org/jira/browse/MAPREDUCE-243

OSGi can easily solve the classloading and lifecycle management issues in Hadoop, and brings a lot more besides. Can someone please explain to me the rationale for continuing to ignore such an obvious and elegant solution?

Best regards,
Caspar
<http://techdistrict.kirkk.com/2010/02/26/osgi-devcon-slides/>

On 14 April 2010 19:19, Cooper, Chris <[email protected]> wrote:

> Scott,
>
> I think the direction of your comments in https://issues.apache.org/jira/browse/MAPREDUCE-1700 is spot on. You should be looking at J2EE container class loader hierarchies. I've attached a couple of good links that cover this approach.
>
> http://www.ibm.com/developerworks/websphere/library/techarticles/0112_deboer/deboer.html
> http://www.objectsource.com/j2eechapters/Ch21-ClassLoaders_and_J2EE.htm
>
> I'm sure Mike and I would both be willing to work with you to contribute a solution if you're interested.
>
> Best regards,
>
> CC
>
> -----Original Message-----
> From: Scott Carey [mailto:[email protected]]
> Sent: Wednesday, April 14, 2010 1:08 PM
> To: [email protected]
> Subject: Re: Custom Class Loader for Hadoop M/R jobs?
>
> My long term suggestions are in https://issues.apache.org/jira/browse/MAPREDUCE-1700. The framework definitely needs to handle this and not place the burden on users, IMO. But that won't help you in the short term.
>
> Whether removing or replacing a Hadoop jar is an acceptable option to you (or others) in the short term is up to you. Obviously, it's not a great long term solution, but if you (or someone else) have to make it work ASAP, it might be the only option. In our case, we package our own rpm and have a few custom patches to Hadoop, so removing one jar is a trivial thing to do in the short / medium term.
>
> -Scott
>
> On Apr 14, 2010, at 10:33 AM, Segel, Mike wrote:
>
> > Scott,
> >
> > While that may work for a quick fix, it's not a good long term solution. You then run into a problem where you upgrade your Hadoop release and the removed jar is restored, or, if you replaced the jar, your replacement may get overwritten.
> >
> > In this specific instance, the Jackson libraries are not that important and they can be replaced.
> > But that doesn't mean that this issue won't come up again with something you can't easily pop out and replace.
> >
> > This is why I'm looking at custom class loading and trying to understand what can be accomplished with the methods in the Configuration class.
> >
> > Thx
> >
> > -Mike
> >
> >
> > -----Original Message-----
> > From: Scott Carey [mailto:[email protected]]
> > Sent: Wednesday, April 14, 2010 12:02 PM
> > To: [email protected]
> > Subject: Re: Custom Class Loader for Hadoop M/R jobs?
> >
> > Depending on what the dependency is, you might be able to just remove it from Hadoop's lib directory on your cluster.
> >
> > For me, Hadoop's later versions have jackson-1.0.1 in the lib directory, and that breaks usage of Avro in a M/R job among other things. However, the feature that uses this library is unimportant to me (configuration dump in JSON format), so I just removed the jar.
> >
> > -Scott
> >
> > On Apr 14, 2010, at 6:39 AM, Segel, Mike wrote:
> >
> >> Hi,
> >>
> >> Ok, here's a bit of a bizarre issue...
> >>
> >> How do you handle class collisions between Hadoop and your m/r job which calls other 3rd party classes?
> >>
> >> An example: Hadoop has an older version of an open source jar in its /lib directory. You're interfacing with a 3rd party OS tool that uses a later release of the same jar.
> >>
> >> You can modify the classpath, and that might work. But the better way is to create a Custom Class Loader. (Non-trivial)
> >>
> >> Looking at the Configuration class, it looks like there are a couple of methods that deal with loading a class into the configuration so that the m/r jobs can have access to them on each node.
> >>
> >> Is this the correct intended use, or am I missing something?
> >> Has anyone done something like this?
> >>
> >> Thx
> >>
> >> -Mike
> >>
> >> Michael Segel
> >> Architect, R&D
> >> NAVTEQ
> >> 425 West Randolph Street
> >> Chicago, IL 60606
> >> (T) +1 312-780-3432
> >> (C) +1 312-952-8175
> >> www.navteq.com<http://www.navteq.com/>
> >>
> >>
> >>
> >> The information contained in this communication may be CONFIDENTIAL and is intended only for the use of the recipient(s) named above. If you are not the intended recipient, you are hereby notified that any dissemination, distribution, or copying of this communication, or any of its contents, is strictly prohibited. If you have received this communication in error, please notify the sender and delete/destroy the original message and any copy of it from your computer or paper files.
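
For anyone following the custom class loader idea from the original question, here is a minimal sketch of a "child-first" (parent-last) loader of the kind J2EE containers use. This is only an illustration and not how Hadoop itself loads job classes: the class name `ChildFirstClassLoader` is hypothetical, and it assumes the newer third-party jar is passed in as a URL. It looks in its own jars first and falls back to the parent (where Hadoop's bundled, older jar would live) only when the class is not found locally.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Hypothetical child-first class loader (a sketch, not Hadoop's mechanism).
 * Classes found in the given URLs win over classes visible to the parent.
 */
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse a class this loader has already defined.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.") || name.startsWith("javax.")) {
                    // Core platform classes must always come from the parent chain.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Child first: search our own jars before asking the parent.
                        c = findClass(name);
                    } catch (ClassNotFoundException e) {
                        // Not in our jars: fall back to normal parent delegation.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

A job would construct this with the URL of the newer jar and the framework's loader as parent, then load the conflicting classes through it. A real version would probably also need to force framework packages (for example `org.apache.hadoop.*`) to the parent, so that types shared between the job and the framework are not loaded twice.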
