UDFContext.getUDFProperties does not handle collisions in hashcode of udf classname (+ arg hashcodes)
-----------------------------------------------------------------------------------------------------

Key: PIG-1821
URL: https://issues.apache.org/jira/browse/PIG-1821
Project: Pig
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Fix For: 0.9.0

In the code below, if generateKey() returns the same value for two UDFs, those UDFs end up sharing the same Properties object.

{code}
private HashMap<Integer, Properties> udfConfs = new HashMap<Integer, Properties>();

public Properties getUDFProperties(Class c) {
    Integer k = generateKey(c);
    Properties p = udfConfs.get(k);
    if (p == null) {
        p = new Properties();
        udfConfs.put(k, p);
    }
    return p;
}

private int generateKey(Class c) {
    return c.getName().hashCode();
}

public Properties getUDFProperties(Class c, String[] args) {
    Integer k = generateKey(c, args);
    Properties p = udfConfs.get(k);
    if (p == null) {
        p = new Properties();
        udfConfs.put(k, p);
    }
    return p;
}

private int generateKey(Class c, String[] args) {
    int hc = c.getName().hashCode();
    for (int i = 0; i < args.length; i++) {
        hc <<= 1;
        hc ^= args[i].hashCode();
    }
    return hc;
}
{code}

To prevent this, a new class (say X) that holds the class name and args should be created, and HashMap<X, Properties> should be used instead of HashMap<Integer, Properties>. Provided X implements equals() and hashCode() over those fields, HashMap will then deal with the collisions.
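
For illustration, the class X suggested in the description could look roughly like the sketch below. This is not the committed fix: the names UdfPropertiesSketch and Key are hypothetical, and treating null args the same as an empty array is an assumption. That assumption mirrors the existing int keys, where generateKey(c) and generateKey(c, new String[0]) produce the same value.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for the relevant part of UDFContext; not Pig's actual code.
class UdfPropertiesSketch {

    // Hypothetical key type (the "X" of the description). Equality is decided on
    // the class name and args themselves, so equal hash codes no longer matter.
    static final class Key {
        private final String className;
        private final String[] args;

        Key(Class<?> c, String[] args) {
            this.className = c.getName();
            // Assumption: null args means "no args", matching the old int keys,
            // where generateKey(c) equals generateKey(c, new String[0]).
            this.args = (args == null) ? new String[0] : args.clone();
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return className.equals(other.className) && Arrays.equals(args, other.args);
        }

        @Override
        public int hashCode() {
            return 31 * className.hashCode() + Arrays.hashCode(args);
        }
    }

    private final Map<Key, Properties> udfConfs = new HashMap<>();

    public Properties getUDFProperties(Class<?> c) {
        return getUDFProperties(c, null);
    }

    public Properties getUDFProperties(Class<?> c, String[] args) {
        Key k = new Key(c, args);
        Properties p = udfConfs.get(k);
        if (p == null) {
            p = new Properties();
            udfConfs.put(k, p);
        }
        return p;
    }

    public static void main(String[] argv) {
        UdfPropertiesSketch ctx = new UdfPropertiesSketch();
        // "Aa" and "BB" have the same String.hashCode(), so the old int keys collide.
        Properties a = ctx.getUDFProperties(String.class, new String[] {"Aa"});
        Properties b = ctx.getUDFProperties(String.class, new String[] {"BB"});
        System.out.println("distinct Properties objects: " + (a != b));
    }
}
```

The two keys in main() still hash to the same bucket, but HashMap falls back to Key.equals() to tell them apart, so each UDF gets its own Properties object.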