Hey all, I have one file, A, with a 'day' column like "2011/3/2" and another, B, with a 'timestamp' column like "2011/3/2 12:32". I want to join the records on these two fields (along with tracking_id). I do something like this:
    A_and_B = JOIN A by (tracking_id, day) LEFT OUTER, B by (tracking_id, STRSPLIT(timestamp, ' ', 1).$0)

where you can see I am projecting out the first element of the tuple returned by STRSPLIT. When I run this I get an error of the form:

    org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN
    ERROR 2042: Error in new logical plan. Try -Dpig.usenewlogicalplan=false.

When I put -Dpig.usenewlogicalplan=false on the command line before the "-x local", the join appears to work. Yay.

I am happy that things seem to be working, though I would appreciate some feedback from those in the know as to why that setting fixes this and whether there is a more canonical way of doing this join.

thanks,
daniel
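
For comparison, one arrangement that might avoid putting an expression inside the join key is to compute the day in a FOREACH first and then join on plain fields. The following is an untested sketch, not a confirmed fix for ERROR 2042. The LOAD paths and schemas are hypothetical, and B_with_day is a name made up for illustration. It also swaps in REGEX_EXTRACT for STRSPLIT, because STRSPLIT is documented to follow Java's split semantics, where a limit of 1 applies the pattern zero times and so should return the whole timestamp rather than just the date.

```pig
-- Hypothetical inputs: file names and schemas are assumptions.
A = LOAD 'A.tsv' AS (tracking_id:chararray, day:chararray);
B = LOAD 'B.tsv' AS (tracking_id:chararray, timestamp:chararray);

-- Derive the day before the join so the join keys are plain fields.
-- The regex captures everything up to the first space, e.g.
-- "2011/3/2 12:32" gives "2011/3/2". A timestamp with no space yields null.
B_with_day = FOREACH B GENERATE
    tracking_id,
    timestamp,
    REGEX_EXTRACT(timestamp, '^([^ ]+) .*$', 1) AS day;

-- Same left outer join as above, now keyed on two ordinary columns per side.
A_and_B = JOIN A BY (tracking_id, day) LEFT OUTER,
               B_with_day BY (tracking_id, day);
```

If STRSPLIT is preferred, a limit of 2 (one split, at most two pieces) would presumably be the value to use in place of 1.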