Wrong stats shown when there are multiple loads but same file names
-------------------------------------------------------------------

                 Key: PIG-1779
                 URL: https://issues.apache.org/jira/browse/PIG-1779
             Project: Pig
          Issue Type: Bug
          Components: tools
    Affects Versions: 0.8.0
            Reporter: Vivek Padmanabhan


In Pig 0.8, the stats show wrong information whenever I have multiple
loads whose file names are the same but whose folders differ.

a) Problem 1
Sample script:
A = LOAD 'myfolder/tryme' AS (f1);
B = LOAD 'myfolder/anotherfolder/tryme' AS (f2);
C = JOIN A BY f1, B BY f2;
DUMP C;

Here I have 10 records for A and 3 records for B, but Pig says:
Successfully read 6 records from: "<nn>/myfolder/anotherfolder/tryme"
Successfully read 6 records from: "<nn>myfolder/tryme"
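
To make Problem 1 easier to reproduce, the following is a minimal sketch that
creates the two input files with the record counts given above (10 for A,
3 for B). The helper names (write_records, make_inputs) and the record values
are assumptions for illustration; the report does not say what the files
contain.

```python
import os


def write_records(path, count):
    """Write `count` single-field records, one per line, to `path`."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        for i in range(count):
            f.write(f"{i}\n")


def make_inputs(root="."):
    """Create the two inputs from the sample script under `root`."""
    # Same file name ('tryme') in two different folders, as in the report.
    a = os.path.join(root, "myfolder", "tryme")
    b = os.path.join(root, "myfolder", "anotherfolder", "tryme")
    write_records(a, 10)  # 10 records for A (values are made up)
    write_records(b, 3)   # 3 records for B (values are made up)
    return a, b


if __name__ == "__main__":
    make_inputs()
```

With these files in the working directory, the sample script can be run with
"pig -x local". Whether the wrong counts also appear in local mode is not
stated in this report.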

b) Problem 2
A = LOAD 'myfolder/tryme' AS (f1);
B = LOAD 'myfolder/an1111otherfolder/tryme' AS (f2);
C = JOIN A BY f1, B BY f2;
DUMP C;

Here there is no folder named an1111otherfolder, while "myfolder/tryme" exists.
But Pig says:
Failed to read data from "<nn>/myfolder/an1111otherfolder/tryme"
Failed to read data from "<nn>/myfolder/tryme"
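
In both problems the two loads report the same result even though their inputs
differ. This report does not identify the cause. One guess, shown below purely
as an illustration and not as Pig's actual code, is that per-input statistics
are keyed by the file name alone, so two loads ending in "tryme" share one
entry. Both key functions are hypothetical.

```python
import posixpath
from collections import defaultdict


def counter_key_by_basename(path):
    # Hypothetical keying scheme: only the last path component is used.
    return posixpath.basename(path)


def counter_key_by_index(path, index):
    # Hypothetical alternative: also include the position of the LOAD.
    return f"{index}_{posixpath.basename(path)}"


paths = ["myfolder/tryme", "myfolder/an1111otherfolder/tryme"]

# Group the input paths by the key each scheme would assign to them.
by_name = defaultdict(list)
for p in paths:
    by_name[counter_key_by_basename(p)].append(p)

by_index = defaultdict(list)
for i, p in enumerate(paths):
    by_index[counter_key_by_index(p, i)].append(p)

print(dict(by_name))   # one key shared by both paths
print(dict(by_index))  # one key per path
```

Under the first scheme both paths map to a single key, which would match the
symptom that both loads show the same count or the same failure.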

