Thank you for your reply on this. Your recent query is working perfectly.
Yes, we don't have primary key here in this case. Are there some other ways of achieving the same output when we don't have a primary key? On Wed, Apr 27, 2016 at 12:11 PM, Markovitz, Dudu <dmarkov...@paypal.com> wrote: > In order to achieve the LEFT JOIN I’m generating a primary key to table A > using ROW_NUMBER. > > If you already have a primary key, you can use it and skip the WITH clause. > > > > Dudu > > > > *with* a_rn *as* (*select* row_number () over () *as* rn,* *from* a) > > > > *select* * > > > > *from* a_rn *as* a > > > > *left* *join* (*select* a.rn > > ,b.* > > > > *from* a_rn *as* a > > > > *cross* *join* b > > > > *where* a.*type* = b.*type* > > > > *and* a.code like *case* b.code > *when* 'All' *then* '%' *else* b.code *end* > > *and* a.country like *case* b.country > *when* 'All' *then* '%' *else* b.country *end* > > *and* a.flag like *case* b.flag > *when* 'All' *then* '%' *else* b.flag *end* > > ) > > *as* b > > > > *on* b.rn = > > a.rn > > *;* > > > > 1 101 USA OAS Y > 1 101 All OAS All John > > 2 142 US OAS X > 2 142 All OAS All John > > 3 101 US Retail X > 3 All All Retail All John > > 4 142 USA MO X NULL > NULL NULL NULL NULL NULL > > 5 192 US Mod X > 5 192 All Mod All Sean > > 6 101 USA Sal X NULL > NULL NULL NULL NULL NULL > > 7 101 USA Win X NULL > NULL NULL NULL NULL NULL > > 8 101 USA Feed X NULL > NULL NULL NULL NULL NULL > > 9 142 CAN Feed X NULL > NULL NULL NULL NULL NULL > > 10 101 USA OCP X NULL > NULL NULL NULL NULL NULL > > 11 101 USA STORE X NULL NULL > NULL NULL NULL NULL > > 12 129 USA Tax Y NULL > NULL NULL NULL NULL NULL > > 13 119 USA Tax Y NULL > NULL NULL NULL NULL NULL > > 14 101 USA Pus Y NULL > NULL NULL NULL NULL NULL > > 15 142 CN Pus Y > 15 142 All Pus Y Smith > > 16 142 CA Pus Y > 16 142 All Pus Y Smith > > 17 142 US Pus S NULL > NULL NULL NULL NULL NULL > > 18 101 CN Tax Y > 18 101 All Tax Y Smith > > 19 101 CA Tax Y > 19 101 All Tax Y Smith > > 20 101 USA Tax Y 20 > 101 All Tax Y Smith > > 21 101 US Tax X > 21 101 All Tax X John > > > > *From:* Kishore A [mailto:kishore.atmak...@gmail.com] > *Sent:* Wednesday, April 27, 2016 9:15 PM > *To:* user@hive.apache.org > *Subject:* RE: Question on Implementing CASE in Hive Join > > > > I feel the mail at your end has got a different format. > > Remove * and - in the data what you see...That is my final data....You can > correlate with B > > 2. I need A LeftJoin B > > On Apr 27, 2016 11:41 PM, "Markovitz, Dudu" <dmarkov...@paypal.com> wrote: > > Huston, we have a problem J > > > > 1. > > Data > > > > Here are the tables with tabs replaced by ‘*’ and spaces replaced by ‘-’. > > A is a mismatch of tabs and spaces and B contains only (a very strange > number of) spaces. > > > > So you’re using TRIM with all A columns, but how exactly have you defined > table B? > > > > > > *A* > > 101*---US*------Tax*----X > > 101*---USA*------Tax*----Y > > 101*---CA*------Tax*----Y > > 101*---CN*------Tax*----Y > > 142*---US*------Pus*----S > > 142*---CA*------Pus*----Y > > 142*---CN*------Pus*----Y > > 101*---USA*------Pus*----Y > > 119*---USA*------Tax*----Y > > 129*---USA*------Tax*----Y > > 101*---USA*------STORE*----X > > 101*---USA*------OCP*----X > > 142*---CAN*------Feed*----X > > 101*---USA*------Feed*----X > > 101*---USA*------Win*----X > > 101*---USA*------Sal*----X > > 192*---US*------Mod*----X > > 142*---USA*------MO*----X > > 101*---US*------Retail*X > > 142*---US*------OAS*----X > > 101*---USA*------OAS*----Y > > > > *B* > > All--------All-----------Retail------All----John > > All-------US----------OM----------X------John > > ALL-----US---------OM----------Y------Smith > > 101-----All----------OAS---------All----John > > 142-----All----------OAS---------All----John > > 192-----All----------Mod----------All----Sean > > 101-----All----------Tax-----------X------John > > 101-----All---------Tax------------Y------Smith > > 142-----US---------Pus----------X------John > > 142-----All----------Pus----------Y------Smith-- > > > > > > 2. > > Requirement > > > > “A left join B” guarantees that all the records from A appear in the > result set and that the number of rows in the result set is *at least* > the number of rows of A. > > You should take into consideration that a record in A might have several > matches from B. > > > > Also – > > In your original question you’ve talked about “B LEFT JOIN A” and now it’s > “A LEFT JOIN B”. > > > > 3. > > Scenario 1 > > > > I don’t want to start a lecture at that point, but ‘ON‘ and ‘WHERE’ have a > completely different meaning and use. > > You’re conditions in the WHERE clause turned the LEFT JOIN to INNER JOIN. > > > > 4. > > Scenario 2 > > > > I’ve suggested ‘LIKE’ not ‘=’ (but it won’t work because of Hive > limitations that require equality conditions for JOIN). > > > > 5. > > Scenario 3 > > > > CROSS JOIN does not use ON (Hive lets you do that but it not an SQL > standard and it’s actually an INNER JOIN). > > > > 6. > > CASE > > > > CASE is defined by ANSI/ISO and works in Hive the same way it works in HQL > the same way it works in any other SQL dialect. > > > > In conclusion - > > The solution I gave you in the previous mails works as expected. > > You should check your data and your tables’ definitions. > > > > Just tell me if you want “B LEFT JOIN A” or “A LEFT JOIN B” and I’ll send > you the exact code. > > > > Dudu > > > > *From:* Kishore A [mailto:kishore.atmak...@gmail.com] > *Sent:* Wednesday, April 27, 2016 6:30 PM > *To:* user@hive.apache.org > *Subject:* Re: Question on Implementing CASE in Hive Join > > > > Hello Dudu, > > > > Thank you for previous reply. I tried all the scenarios but nothing is > working out. I need this condition with with minimal steps like by using > case when. > > > > Please find the below > > > > > > TableA > > Code Country Type Flag > > 101 US Tax X > > 101 USA Tax Y > > 101 CA Tax Y > > 101 CN Tax Y > > 142 US Pus S > > 142 CA Pus Y > > 142 CN Pus Y > > 101 USA Pus Y > > 119 USA Tax Y > > 129 USA Tax Y > > 101 USA STORE X > > 101 USA OCP X > > 142 CAN Feed X > > 101 USA Feed X > > 101 USA Win X > > 101 USA Sal X > > 192 US Mod X > > 142 USA MO X > > 101 US Retail X > > 142 US OAS X > > 101 USA OAS Y > > > > TableB > > Code Country Type Flag Value > > All All Retail All John > > All US OM X John > > ALL US OM Y Smith > > 101 All OAS All John > > 142 All OAS All John > > 192 All Mod All Sean > > 101 All Tax X John > > 101 All Tax Y Smith > > 142 US Pus X John > > 142 All Pus Y Smith > > > > > > This is how my data in reality looks like. *Type *column is the fixed > column here > > > > Requirement : TableA LeftJoin TableB , stamp the matching *Value *from > TableB. Total result should have the number of rows as TableA > > > > Scenario 1: > > > > Select * FROM TableA a > > Left Join TableB b ON > > LOWER(TRIM(a.Type)) = LOWER(TRIM(b.Type)) > > WHERE LOWER(TRIM(a.Code)) = (CASE WHEN LOWER(TRIM(b.Code)) = > LOWER('All') THEN LOWER(TRIM(a.Code)) ELSE LOWER(TRIM(b.Code)) END) > > AND LOWER(TRIM(a.country)) = (CASE WHEN LOWER(TRIM(b.country)) = > LOWER('All') THEN LOWER(TRIM(a.country)) ELSE LOWER(TRIM(b.country)) END) > > AND LOWER(TRIM(a.flag)) = (CASE WHEN LOWER(TRIM(b.flag)) = > LOWER('All') THEN LOWER(TRIM(a.flag)) ELSE LOWER(TRIM(b.flag)) END); > > > > on the output the number of rows should be same as the left table, but as > I am apply WHERE data is getting filtered out and is giving only > > matching records > > > > > > > > Scenario 2: Suggested by you > > > > Select * FROM TableA a > > Left Join TableB b ON > > LOWER(TRIM(a.Type)) = LOWER(TRIM(b.Type)) > > AND LOWER(TRIM(a.Code)) = (CASE WHEN LOWER(TRIM(b.Code)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.Code)) END) > > AND LOWER(TRIM(a.country)) = (CASE WHEN LOWER(TRIM(b.country)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.country)) END) > > AND LOWER(TRIM(a.flag)) = (CASE WHEN LOWER(TRIM(b.flag)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.flag)) END); > > > > As per the left join this is giving me the number of rows just as the left > side table but I am not having the matching records as per the > > right table. I have NULLs on the right side. > > > > Scenario 3: Cross Join > > > > Select * FROM TableA a > > CROSS Join TableB b ON > > LOWER(TRIM(a.Type)) = LOWER(TRIM(b.Type)) > > AND LOWER(TRIM(a.Code)) = (CASE WHEN LOWER(TRIM(b.Code)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.Code)) END) > > AND LOWER(TRIM(a.country)) = (CASE WHEN LOWER(TRIM(b.country)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.country)) END) > > AND LOWER(TRIM(a.flag)) = (CASE WHEN LOWER(TRIM(b.flag)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.flag)) END); > > > > Giving same result as Scenario 2 - All the left columns with NULLs on right > > > > I don't know if we still need to fine tune the query as Hive is not > supporting CASE WHEN just as in SQL. Your thoughts are more helpful to me. > > > > Kishore > > > > > > On Wed, Apr 20, 2016 at 12:04 PM, Markovitz, Dudu <dmarkov...@paypal.com> > wrote: > > The second version works as expected (after fixing a typo in the word > ‘indicator’). > > If you don’t get any results you should check your data (maybe the fields > contains trailing spaces or control characters etc.). > > > > If you’re willing to replace the ‘OUTER’ with ‘INNER’, there’s another > option - > > > > *select* * > > > > *from* b > > > > *cross* *join* a > > > > *where* a.*type* = b.*type* > > *and* a.code like *case* b.code *when* 'ALL' *then* '%' > *else* b.code *end* > > *and* a.indicator like *case* b.indicator *when* 'ALL' *then* '%' > *else* b.indicator *end* > > *;* > > > > Dudu > > > > > > *From:* Kishore A [mailto:kishore.atmak...@gmail.com] > *Sent:* Wednesday, April 20, 2016 5:04 PM > > > *To:* user@hive.apache.org > *Subject:* Re: Question on Implementing CASE in Hive Join > > > > Hi Dudu, > > > > Thank you for sending queries around this. > > > > I have run these queries and below are the observations > > > > 1. It did return the same error as before" SemanticException [Error > 10017]: Line 4:4 Both left and right aliases encountered in JOIN 'code'" > > > > 2. Query execution is successful but not retrieving any results out of it. > > > > I am clueless and not able to proceed to next step until this is resolved. > Do you have any other suggestions please? > > > > Kishore > > > > On Tue, Apr 19, 2016 at 6:08 AM, Markovitz, Dudu <dmarkov...@paypal.com> > wrote: > > Please try the following two options. > > Option 2 might be better, performance wise (depending of the data volume > and characteristics). > > > > P.s. > > I didn’t understand the explanation about the LEFT JOIN > > > > > > Dudu > > > > 1. > > > > *select* b.code > > ,b.*value* > > > > *from* b > > > > *left* *join* a > > > > *on* a.*type* = b.*type* > > *and* a.code like *case* b.code *when* > 'ALL' *then* '%' *else* b.code *end* > > *and* a.indicator like *case* b.indicatior *when* > 'ALL' *then* '%' *else* b.indicatior *end* > > *;* > > > > > > > > 2. > > > > *select* b.code > > ,b.*value* > > > > *from* b > > > > *left* *join* a > > > > *on* a.*type* = b.*type* > > *and* a.code = b.code > > *and* a.indicator = b.indicatior > > > > *where* b.code != 'ALL' > > *and* b.indicatior != 'ALL' > > > > *union* *all* > > > > *select* b.code > > ,b.*value* > > > > *from* b > > > > *left* *join* a > > > > *on* a.*type* = b.*type* > > *and* a.indicator = b.indicatior > > > > *where* b.code = 'ALL' > > *and* b.indicatior != 'ALL' > > > > *union* *all* > > > > *select* b.code > > ,b.*value* > > > > *from* b > > > > *left* *join* a > > > > *on* a.*type* = b.*type* > > *and* a.code = b.code > > > > *where* b.code != 'ALL' > > *and* b.indicatior = 'ALL' > > > > *union* *all* > > > > *select* b.code > > ,b.*value* > > > > *from* b > > > > *left* *join* a > > > > *on* a.*type* = b.*type* > > > > *where* b.code = 'ALL' > > *and* b.indicatior = 'ALL' > > *;* > > > > > > *From:* Kishore A [mailto:kishore.atmak...@gmail.com] > *Sent:* Tuesday, April 19, 2016 3:51 PM > *To:* user@hive.apache.org > *Subject:* Re: Question on Implementing CASE in Hive Join > > > > Hi Dudu, > > > > Actually we use both fields from left and right tables, I mentioned right > table just for my convenience to check whether ALL from right table can be > pulled as per join condition match. > > > > One more reason why we use left join is we should not have extra columns > after join. > > > > Kishore > > > > > > > > On Tue, Apr 19, 2016 at 5:46 AM, Markovitz, Dudu <dmarkov...@paypal.com> > wrote: > > Before dealing with the technical aspect, can you please explain what is > the point of using LEFT JOIN without selecting any field from table A? > > > > Thanks > > > > Dudu > > > > *From:* Kishore A [mailto:kishore.atmak...@gmail.com] > *Sent:* Tuesday, April 19, 2016 2:29 PM > *To:* user@hive.apache.org > *Subject:* Question on Implementing CASE in Hive Join > > > > Hi, > > > > I have a scenario to implement to cases in Hive Joins. I need to implement > case on the value on which join condition to be applied. > > > > Table A > > Code// Type// Indicator// Value// > > A 1 XYZ John > > B 1 PQR Smith > > C 2 XYZ John > > C 2 PQR Smith > > D 3 PQR Smith > > E 3 XYZ Smith > > F 4 MNO Smith > > G 3 MNO Smith > > ... > >