[ https://issues.apache.org/jira/browse/PIG-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413532#comment-13413532 ]
Raghu Angadi commented on PIG-2815:
-----------------------------------

An example:

{noformat}
register elephant-bird.jar; -- for working with Thrift objects. -- (1)
register T_One.jar; -- (2)
-- ThriftPigLoader takes the name of a Thrift class that corresponds to the input.
a = load '/logs/T_One' using ThriftPigLoader('thrift.gen.T_One'); -- (3)
register second.jar; -- (4)
b = load '/logs/T_two' using ThriftPigLoader('thrift.gen.T_two'); -- (5) -- FAIL!
{noformat}

* (1): a new classloader cl_A is created with the root classloader as the parent.
* (2): cl_B is created with root as the parent.
* (3): {{ThriftPigLoader.class}} is instantiated with cl_B and cached.
* (4): cl_C is created with root as the parent.
* (5): {{thrift.gen.T_two.class}} is instantiated with cl_C, but {{ThriftPigLoader.class}} from cl_B is reused by Pig.

So all the Thrift classes seen by ThriftPigLoader are entirely different from all the Thrift classes seen by {{thrift.gen.T_two}}, since cl_B is not a parent of cl_C. That can lead to a number of issues, and it does.

> class loader management in PigContext
> -------------------------------------
>
>                 Key: PIG-2815
>                 URL: https://issues.apache.org/jira/browse/PIG-2815
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Raghu Angadi
>             Fix For: 0.11
>
>
> The way {{PigContext.classloader}} and resolveClassName() are managed can
> lead to strange class loading issues, especially when not all {{register}}
> statements are at the top (example in the first comment).
> Two factors contribute to this, sometimes only one of them and sometimes
> both together:
> # a new classloader (CL) is created after registering each jar.
> ** but the new CL's parent is the root CL rather than the previous CL,
> effectively throwing the previous CL away.
> # resolveClassName() caches classes based on just the name
> ** A class is not defined by name alone.
> Classes loaded by two different unrelated CLs are different objects even if
> both extract the class from the same physical jar file.
> ** because of (1), the cached class is not necessarily the same as the class
> that would be loaded based on the 'current' CL
> Having different class objects for the same class has many subtle side
> effects, e.g. there would be two instances of static variables.
> I think both should be fixed, though fixing one of them might be good
> enough in many cases. I will add a patch.
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
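
The point in the description that a class is not defined by name alone can be reproduced outside Pig. The following is a minimal Java sketch, not Pig's actual code: {{ClassIdentityDemo}}, {{Payload}}, {{IsolatedLoader}}, {{newLoader()}} and this {{resolve()}} are hypothetical names made up for illustration. It assumes Java 9 or later and that the compiled class file is readable as a resource through the demo's own classloader. Two unrelated loaders define the same class from the same bytes, and a name-only cache then keeps returning the class from the first loader.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

public class ClassIdentityDemo {

    /** Hypothetical stand-in for a class shipped in a registered jar. */
    public static class Payload {
        public static int counter = 0;
    }

    /** Loader with no application parent; it defines Payload itself from raw bytes. */
    static class IsolatedLoader extends ClassLoader {
        private final byte[] payloadBytes;

        IsolatedLoader(byte[] payloadBytes) {
            super(null); // bootstrap parent only, like the "root" parent in the issue
            this.payloadBytes = payloadBytes;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            if (!name.equals(Payload.class.getName())) {
                throw new ClassNotFoundException(name);
            }
            return defineClass(name, payloadBytes, 0, payloadBytes.length);
        }
    }

    /** Reads the compiled bytes of Payload through the loader that already loaded it. */
    static byte[] payloadBytes() throws IOException {
        String path = Payload.class.getName().replace('.', '/') + ".class";
        try (InputStream in = ClassIdentityDemo.class.getClassLoader().getResourceAsStream(path)) {
            if (in == null) {
                throw new IOException("class file not found: " + path);
            }
            return in.readAllBytes();
        }
    }

    /** Creates a fresh loader that is unrelated to any loader created before it. */
    public static ClassLoader newLoader() throws IOException {
        return new IsolatedLoader(payloadBytes());
    }

    /** Name-only cache: a simplified stand-in for the caching the issue describes. */
    static final Map<String, Class<?>> CACHE = new HashMap<>();

    public static Class<?> resolve(String name, ClassLoader current) throws ClassNotFoundException {
        Class<?> cached = CACHE.get(name);
        if (cached == null) {
            cached = current.loadClass(name);
            CACHE.put(name, cached);
        }
        return cached; // the 'current' loader is ignored on a cache hit
    }

    public static void main(String[] args) throws Exception {
        String name = Payload.class.getName();
        ClassLoader clB = newLoader();
        ClassLoader clC = newLoader();

        Class<?> fromB = clB.loadClass(name);
        Class<?> fromC = clC.loadClass(name);
        System.out.println("same name:  " + fromB.getName().equals(fromC.getName()));
        System.out.println("same class: " + (fromB == fromC));

        // Each Class object carries its own copy of the static field.
        fromB.getField("counter").setInt(null, 42);
        System.out.println("counter via cl_C: " + fromC.getField("counter").getInt(null));

        // The name-only cache keeps handing out cl_B's class even when cl_C is current.
        resolve(name, clB);
        Class<?> second = resolve(name, clC);
        System.out.println("cache returned cl_B's class: " + (second.getClassLoader() == clB));
    }
}
```

Keying the cache on the (loader, name) pair, or chaining each new loader to the previous one, would each remove one of the two factors; which of these the eventual patch takes is not stated here.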