Another way I can think of approaching this:

1) Ignore all the -1 rows and create a tmp table.
2) I see there are a couple of timestamps (StartTime and EndTime).
3) Order the table by timestamp.
4) From this tmp table, create another tmp table with the columns
   FK, MinStartTime, MaxEndTime, Location.
5) Join the tmp table from step 4 with your raw data, and add a WHERE
   clause on the min and max times.
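The steps above can be sketched roughly as follows. This is only an illustration, not something that has been run on Hive: it uses Python's sqlite3 module as a stand-in so the query can be checked locally. The table name raw_data, the alias w, and the columns min_start and max_end are made up for the example. It also assumes the times can be read as 24-hour timestamps (12.30 taken as 00:30) and that every -1 row falls inside the start/end window of a valid row with the same FK, which holds for the sample data.

```python
import sqlite3

# Sample rows from the thread: (record, fk, start_time, end_time, location).
# Assumption: times rewritten as sortable 24-hour strings.
ROWS = [
    (1, 110, "2011-01-01 00:30", "2011-01-01 06:10", 456),
    (2, 110, "2011-01-01 03:40", "2011-01-01 04:00", -1),
    (3, 110, "2011-01-02 01:00", "2011-01-02 08:00", 891),
    (4, 110, "2011-01-02 05:00", "2011-01-02 06:00", -1),
    (5, 110, "2011-01-02 06:10", "2011-01-02 06:30", -1),
]

# Steps 1 and 4: the subquery keeps only valid rows and builds one
# (fk, location, min_start, max_end) window per location.
# Step 5: -1 rows are equi-joined on fk, and the time range check sits
# in the WHERE clause rather than in the ON clause.
FILL_SQL = """
SELECT r.record, r.fk, r.start_time, r.end_time, r.location
FROM raw_data r
WHERE r.location != -1
UNION ALL
SELECT r.record, r.fk, r.start_time, r.end_time, w.location
FROM raw_data r
JOIN (
    SELECT fk, location,
           MIN(start_time) AS min_start,
           MAX(end_time)   AS max_end
    FROM raw_data
    WHERE location != -1
    GROUP BY fk, location
) w ON r.fk = w.fk
WHERE r.location = -1
  AND r.start_time >= w.min_start
  AND r.end_time <= w.max_end
"""


def fill_locations(rows):
    """Load rows into an in-memory table and return them with -1 filled."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_data (record INTEGER, fk INTEGER, "
        "start_time TEXT, end_time TEXT, location INTEGER)"
    )
    conn.executemany("INSERT INTO raw_data VALUES (?, ?, ?, ?, ?)", rows)
    result = sorted(conn.execute(FILL_SQL).fetchall())
    conn.close()
    return result


filled = fill_locations(ROWS)
for row in filled:
    print(row)
```

Two caveats with this sketch: a -1 row that falls outside every window is dropped by the inner join, and overlapping windows for the same FK would produce duplicate rows, so it only fits data shaped like the example.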
I hope this is not confusing.

On Mon, Sep 15, 2014 at 6:25 PM, Viral Parikh <viral.j.par...@gmail.com> wrote:

> thanks!
>
> is there any other way than writing python UDF etc.
>
> any way i can leverage hive joins to get this working?
>
> On Mon, Sep 15, 2014 at 6:56 AM, Sreenath <sreenaths1...@gmail.com> wrote:
>
>> How about writing a python UDF that takes input line by line,
>> saves the previous line's location, and replaces the current
>> location with it if the location turns out to be '-1'?
>>
>> On 15 September 2014 17:01, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>
>>> have you taken a look at lag and lead functions ?
>>>
>>> On Mon, Sep 15, 2014 at 4:46 PM, Viral Parikh <viral.j.par...@gmail.com>
>>> wrote:
>>>
>>>> To Whomsoever It May Concern,
>>>>
>>>> I posted this question last week but still haven't heard from anyone;
>>>> I'd appreciate any reply.
>>>>
>>>> I've got a table that contains a LocationId field. In some cases, where
>>>> a record shares the same foreign key, the LocationId might come through as
>>>> -1.
>>>>
>>>> What I want to do in my select query is, in the case of this
>>>> happening, use the previous location.
>>>>
>>>> Example data:
>>>>
>>>> Record  FK   StartTime         EndTime          Location
>>>> 1       110  2011/01/01 12.30  2011/01/01 6.10  456
>>>> 2       110  2011/01/01 3.40   2011/01/01 4.00  -1
>>>> 3       110  2011/01/02 1.00   2011/01/02 8.00  891
>>>> 4       110  2011/01/02 5.00   2011/01/02 6.00  -1
>>>> 5       110  2011/01/02 6.10   2011/01/02 6.30  -1
>>>>
>>>> The -1 should come out as 456 for record 2, and 891 for records 4 and 5.
>>>>
>>>> Can someone help me do this with Hive syntax?
>>>>
>>>> I can do it using SQL syntax (as below), but since Hive doesn't support
>>>> correlated subqueries in select clauses, I am unable to get it working there.
>>>>
>>>> SELECT T1.record,
>>>>        T1.fk,
>>>>        T1.start_time,
>>>>        T1.end_time,
>>>>        CASE WHEN T1.location != -1 THEN Location
>>>>             ELSE
>>>>             (
>>>>                 SELECT TOP (1)
>>>>                        T2.location
>>>>                 FROM #temp1 AS T2
>>>>                 WHERE T2.record < T1.record
>>>>                   AND T2.fk = T1.fk
>>>>                   AND T2.location != -1
>>>>                 ORDER BY T2.Record DESC
>>>>             )
>>>>        END
>>>> FROM #temp1 AS T1
>>>>
>>>> Thank you for your help in advance!
>>>>
>>>
>>> --
>>> Nitin Pawar
>>>
>>
>> --
>> Sreenath S Kamath
>> Bangalore
>> Ph No:+91-9590989106
>>
>

--
Nitin Pawar