Wrong stats shown when there are multiple loads but same file names
-------------------------------------------------------------------

                 Key: PIG-1779
                 URL: https://issues.apache.org/jira/browse/PIG-1779
             Project: Pig
          Issue Type: Bug
          Components: tools
    Affects Versions: 0.8.0
            Reporter: Vivek Padmanabhan


In Pig 0.8, the stats show wrong information whenever a script has multiple loads whose paths end in the same file name.

a) Problem 1

Sample script:

A = LOAD 'myfolder/tryme' AS (f1);
B = LOAD 'myfolder/anotherfolder/tryme' AS (f2);
C = JOIN A BY f1, B BY f2;
DUMP C;

Here I have 10 records for A and 3 records for B, but Pig says:

Successfully read 6 records from: "<nn>/myfolder/anotherfolder/tryme"
Successfully read 6 records from: "<nn>myfolder/tryme"

b) Problem 2

A = LOAD 'myfolder/tryme' AS (f1);
B = LOAD 'myfolder/an1111otherfolder/tryme' AS (f2);
C = JOIN A BY f1, B BY f2;
DUMP C;

Here there is no folder named an1111otherfolder, while "myfolder/tryme" exists, so only the load for B should fail. But Pig reports a failure for both inputs:

Failed to read data from "<nn>/myfolder/an1111otherfolder/tryme"
Failed to read data from "<nn>/myfolder/tryme"
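
To try Problem 1 without the reporter's data, the two inputs could be generated along the following lines. This is only a sketch: the helper names (`write_records`, `make_inputs`, `count_lines`) and the record values (`k0`, `k1`, ...) are made up for illustration and are not taken from the report. Only the relative paths and the record counts (10 for A, 3 for B) come from the description above.

```python
import os
import tempfile


def write_records(path, count, prefix="k"):
    """Write `count` single-field records, one per line, to `path`."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        for i in range(count):
            # Record values are hypothetical; the report does not show the data.
            fh.write(f"{prefix}{i}\n")


def make_inputs(root):
    """Create the two same-named inputs from Problem 1 under `root`."""
    a_path = os.path.join(root, "myfolder", "tryme")
    b_path = os.path.join(root, "myfolder", "anotherfolder", "tryme")
    write_records(a_path, 10)  # relation A: 10 records
    write_records(b_path, 3)   # relation B: 3 records
    return a_path, b_path


def count_lines(path):
    """Return the number of records (lines) in `path`."""
    with open(path) as fh:
        return sum(1 for _ in fh)


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    a_path, b_path = make_inputs(root)
    print(a_path, count_lines(a_path))
    print(b_path, count_lines(b_path))
```

Assuming Pig is run in local mode (`pig -x local`) from the generated root directory, the sample script's relative paths should resolve to these files, and the "Successfully read ... records" lines can then be compared against the known counts of 10 and 3.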