Other way I can think at this is ..

1) ignore all -1 and create a tmp table
2) I see there are couple of time stamps
3) Oder the table by timestamp
4) from this tmp tabel create anothe tmp table which says FK MinStartTime
MaxEndTime Location
5) Now this tmp table from step 4 join with ur raw data and put where
clause with min and max times

I hope this is not confusing

On Mon, Sep 15, 2014 at 6:25 PM, Viral Parikh <viral.j.par...@gmail.com>
wrote:

> thanks!
>
> is there any other way than writing python UDF etc.
>
> any way i can leverage hive joins to get this working?
>
> On Mon, Sep 15, 2014 at 6:56 AM, Sreenath <sreenaths1...@gmail.com> wrote:
>
>> How about writing a python UDF that takes input line by line
>> and it saves the previous lines location and can replace it with that
>> if location turns out to be '-1'
>>
>> On 15 September 2014 17:01, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>
>>> have you taken a look at lag and lead functions ?
>>>
>>> On Mon, Sep 15, 2014 at 4:46 PM, Viral Parikh <viral.j.par...@gmail.com>
>>> wrote:
>>>
>>>> To Whomsoever It May Concern,
>>>>
>>>> I posted this question last week but still haven't heard from anyone;
>>>> I'd appreciate any reply.
>>>>
>>>> I've got a table that contains a LocationId field. In some cases, where
>>>> a record shares the same foreign key, the LocationId might come through as
>>>> -1.
>>>>
>>>> What I want to do is in my select query is in the case of this
>>>> happening, the previous location.
>>>>
>>>> Example data:
>>>>
>>>> Record  FK     StartTime               EndTime          Location1       
>>>> 110  2011/01/01 12.30        2011/01/01 6.10      4562       110  
>>>> 2011/01/01 3.40         2011/01/01 4.00       -13       110  2011/01/02 
>>>> 1.00         2011/01/02 8.00      8914       110  2011/01/02 5.00         
>>>> 2011/01/02 6.00       -15       110  2011/01/02 6.10         2011/01/02 
>>>> 6.30       -1
>>>>
>>>> The -1 should come out as 456 for record 2, and 891 for record 4 and 5
>>>>
>>>> Can someone help me do this with Hive syntax?
>>>>
>>>> I can do it using SQL syntax (as below) but since Hive doesnt support
>>>> correlated subqueries in select clauses and so I am unable to get it.
>>>>
>>>> SELECT  T1.record,
>>>>         T1.fk,
>>>>         T1.start_time,
>>>>         T1.end_time,
>>>>         CASE WHEN T1.location != -1 THEN Location
>>>>         ELSE
>>>>             (
>>>>             SELECT  TOP (1)
>>>>                     T2.location
>>>>             FROM    #temp1 AS T2
>>>>             WHERE   T2.record < T1.record
>>>>             AND     T2.fk = T1.fk
>>>>             AND     T2.location != -1
>>>>             ORDER   BY T2.Record DESC
>>>>             )
>>>>         ENDFROM    #temp1 AS T1
>>>>
>>>> Thank you for your help in advance!
>>>>
>>>
>>>
>>>
>>> --
>>> Nitin Pawar
>>>
>>
>>
>>
>> --
>> Sreenath S Kamath
>> Bangalore
>> Ph No:+91-9590989106
>>
>
>


-- 
Nitin Pawar

Reply via email to