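Your LOAD statement for 'word' uses '\t' as the delimiter, but the file is
comma-delimited. With a tab delimiter, each whole line ends up in the first
field (word), so no value of B.word ever matches A.t and the join is empty.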
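
A minimal sketch of the corrected load, assuming the whole 'word' file is
comma-separated like the sample quoted below. The relation and field names are
taken from your script; only the PigStorage argument changes:

```
-- Assumption: './word' is comma-separated throughout, as in the quoted sample.
B = LOAD './word' USING PigStorage(',')
    AS (word: chararray, proID: chararray, proScore: chararray);

-- The join itself stays the same.
C = JOIN A BY t, B BY word;
```

A quick "DUMP B;" after the LOAD should show three separate fields per row
rather than one long string followed by empty fields.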

> On Jan 19, 2016, at 18:10, Chloe Huang <chloe.hu...@ussuning.com> wrote:
> 
> Hi Guys,
> 
> I am using Pig for data processing, but the JOIN does not seem to work in
> my case.
> 
> The Pig script is as follows:
> 
> A = LOAD './q' USING PigStorage(',') AS (ori_query: chararray, t: chararray, w: chararray);
> 
> B = LOAD './word' USING PigStorage('\t') AS (word: chararray, proID: chararray, proScore: chararray);
> 
> C = JOIN A by t, B by word;
> --DUMP C;
> 
> STORE C INTO 'join_out';
> 
> First I load my test file 'q' into A, and then load my test file 'word'
> into B.
> With "JOIN A by t, B by word", I expect an inner join of A's field 't'
> with B's field 'word'. In my test data, A.t and B.word share many common
> values.
> But I get nothing in my result C, and the output file is also empty.
> 
> Here is a small piece of 'q':         (The document 'q' is attached)
> dark shoes for lady,dark,3.234
> dark shoes for lady,shoes,2.261
> dark shoes for lady,for,1.223
> dark shoes for lady,lady,2.345
> casual male shoes,casual,3.478
> casual male shoes,male,2.675
> casual male shoes,shoes,4.265
> casual sporty,casual,2.678
> 
> 
> Here is a small piece of 'word'    (The document 'word' is attached)
> for,104365130,0.588235294118
> male,104365130, 0.588235294118
> 35,104365130,0.588235294118
> ar,104365132,0.588235294118
> cow,104365132,0.652521008403
> mm,104365132,0.588235294118
> 45109,104365135,0.588235294118
> medium,104365135,0.588235294118
> casual,104365135,0.588235294118
> fur,104365135,0.652521008403
> lady,104365135,0.652521008403
> shoes,104365135,0.6
> st,104366010,0.533333333333
> ad,104366010,0.533333333333
> ray,104366010,0.597619047619
> chic,104366010,0.533333333333
> d,104394306,0.519480519481
> dark,104394306,0.519480519481
> comf,104394306,0.574358568261
> casual,104394306,0.574358568261
> sporty,104394306,0.574358568261
> PEACEPRINCESS,104394306,0.0
> shoes,104394889,1.15914601533
> 
> 
> A.t and B.word are both defined as chararray. I have included my test files
> 'q' and 'word' in the attachment.
> Does anyone have an idea why the JOIN is not working here?
> 
> 
> Many Thanks,
> Chloe H
