Re: Need urgent help in hive query

2013-07-02 Thread dyuti a
Sure, will check that, Robin.


-Thanks


On Sat, Jun 29, 2013 at 1:11 AM, Robin Morris  wrote:

>  It probably won't make a difference to performance — the subqueries will
> be executed in separate map-reduce jobs, and the output put on disk anyway.
>
>
>  First, get it to work; second, optimize.
>
>  Robin
>
>   From: dyuti a 
> Reply-To: "user@hive.apache.org" 
> Date: Friday, June 28, 2013 12:05 PM
> To: "user@hive.apache.org" 
> Subject: Re: Need urgent help in hive query
>
>   Hi Robin,
> Thanks for your reply. Hope this will help us, but still looking into the
> performance side too. If there is no other option, then we have to go by this.
>
> Dti
>
>


Re: Need urgent help in hive query

2013-06-28 Thread Robin Morris
It probably won't make a difference to performance — the subqueries will be 
executed in separate map-reduce jobs, and the output put on disk anyway.

First, get it to work; second, optimize.

Robin

From: dyuti a <hadoop.hiv...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Friday, June 28, 2013 12:05 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: Need urgent help in hive query

Hi Robin,
Thanks for your reply. Hope this will help us, but still looking into the
performance side too. If there is no other option, then we have to go by this.

Dti










Re: Need urgent help in hive query

2013-06-28 Thread dyuti a
Hi Robin,
Thanks for your reply. Hope this will help us, but still looking into the
performance side too. If there is no other option, then we have to go by this.

Dti


On Sat, Jun 29, 2013 at 12:03 AM, Robin Morris  wrote:

>  Split up the query – put results of the subquery into a table, run the
> final query on that, then drop the temporary table.
>
>  Robin
>
>   From: dyuti a 
> Reply-To: "user@hive.apache.org" 
> Date: Friday, June 28, 2013 10:54 AM
> To: "user@hive.apache.org" , Michael Malak <
> michaelma...@yahoo.com>
> Subject: Re: Fwd: Need urgent help in hive query
>
>   Hi Michael,
> Thanks for your help. Are there any other possible options apart from this?


Re: Need urgent help in hive query

2013-06-28 Thread Robin Morris
Split up the query – put results of the subquery into a table, run the final 
query on that, then drop the temporary table.

Robin
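Robin's suggestion can be sketched in HiveQL roughly as follows. This is a minimal sketch, not the full query from the thread: the intermediate table name caud_tmp is made up for illustration, and the column lists are abbreviated.

```sql
-- Step 1: materialize the subquery ("der") into a temporary table via CTAS.
CREATE TABLE caud_tmp AS
SELECT udp.cc, udp.no, udp.ber, udp.tkn   -- ...plus the derived CASE/CAST columns...
FROM STS_GNCAUDP udp;

-- Step 2: run the final query against the materialized result.
INSERT INTO TABLE CAUD
SELECT der.cc, der.no, der.ber, der.tkn   -- ...remaining columns...
FROM caud_tmp der;

-- Step 3: drop the temporary table once the load succeeds.
DROP TABLE caud_tmp;
```

Each step runs as its own job, so this trades a little extra HDFS I/O for a query Hive can actually plan.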

From: dyuti a <hadoop.hiv...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Friday, June 28, 2013 10:54 AM
To: "user@hive.apache.org" <user@hive.apache.org>, Michael Malak <michaelma...@yahoo.com>
Subject: Re: Fwd: Need urgent help in hive query

Hi Michael,
Thanks for your help. Are there any other possible options apart from this?









Re: Fwd: Need urgent help in hive query

2013-06-28 Thread dyuti a
Hi Michael,
Thanks for your help. Are there any other possible options apart from this?


On Fri, Jun 28, 2013 at 10:33 PM, Michael Malak wrote:

> Just copy and paste the whole long expressions to their second occurrences.


Re: Fwd: Need urgent help in hive query

2013-06-28 Thread Michael Malak
Just copy and paste the whole long expressions to their second occurrences.
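Michael's suggestion works because Hive (unlike Teradata) does not let one SELECT-list expression reference an alias defined earlier in the same SELECT. A minimal sketch, using simplified stand-in names (some_table, tme_in, tme_out) rather than the full query from the thread, and mirroring the thread's own Teradata-style syntax:

```sql
-- Not accepted by Hive: TME_OUT1 references the alias TME_OUT
--   SELECT CASE WHEN t.tme_out <> 0 THEN t.tme_out ELSE t.tme_in END AS TME_OUT,
--          TRIM(CAST(TME_OUT AS CHAR(6))) AS TME_OUT1
--   FROM some_table t;

-- Instead, repeat the full defining expression wherever the alias was used:
SELECT CASE WHEN t.tme_out <> 0 THEN t.tme_out ELSE t.tme_in END AS TME_OUT,
       TRIM(CAST(CASE WHEN t.tme_out <> 0 THEN t.tme_out
                      ELSE t.tme_in END AS CHAR(6))) AS TME_OUT1
FROM some_table t;
```

The query gets longer, but every expression is self-contained, so Hive can evaluate each output column independently.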



 From: dyuti a 
To: user@hive.apache.org 
Sent: Friday, June 28, 2013 10:58 AM
Subject: Fwd: Need urgent help in hive query
 



Fwd: Need urgent help in hive query

2013-06-28 Thread dyuti a
Hi Experts,
I'm trying the SQL query below in Hive, which does not support referencing a
column alias from subsequent columns, as shown in the query. Is there any
other way to rewrite it without using the aliases? Any help is really
appreciated.

INSERT INTO CAUD
   (
 pst_cc
  pst_no
  pst_ber
  pst_tkn
  pst_dtm
  pst_nbr
  pst_cde
  pst_dte
)
SELECT der.cc
  der.no
  der.ber
  der.tkn
  der.dtm
  der.nbr
  der.cde
  der.dte

FROM (SELECT udp.cc
   udp.no
   udp.ber
   udp.tkn
   ,CASE WHEN udp.SYSTEM_DTE>1600 AND udp.SYSTEM_DTE<1 THEN
udp.SYSTEM_DTE
WHEN udp.DTE_OUT>1600 AND udp.DTE_OUT<1 THEN
udp.DTE_OUT
WHEN udp.DTE_IN>1600 AND udp.DTE_IN<1 THEN
udp.DTE_IN
ELSE '1231'
END  AS DTE_OUT
   ,CASE WHEN udp.TME_OUT <> 0 THEN udp.TME_OUT
WHEN udp.TME_IN <> 0 THEN udp.TME_IN
ELSE 0
END AS TME_OUT
,TRIM(CAST(TME_OUT  AS CHAR(6))) AS TME_OUT1
,CAST(CAST(SUBSTR(TRIM(DTE_OUT),1,8)  AS CHAR(8)) AS DATE FORMAT
'mmdd')  AS DTE_OUT_O
,CASE WHEN TME_OUT>9 THEN CAST(TME_OUT1 AS CHAR(6))
WHEN TME_OUT> AND TME_OUT<=9 THEN CAST('0'||TME_OUT1
AS CHAR(6))
WHEN TME_OUT>999 AND TME_OUT<= THEN CAST('00'||TME_OUT1 AS
CHAR(6))
WHEN TME_OUT>99 AND TME_OUT<=999 THEN CAST('000'||TME_OUT1 AS
CHAR(6))
WHEN TME_OUT>9 AND TME_OUT<=99 THEN CAST(''||TME_OUT1 AS
CHAR(6))
WHEN TME_OUT>0 AND TME_OUT<=9 THEN CAST('0'||TME_OUT1 AS
CHAR(6))
WHEN TME_OUT=0 THEN '00'
END AS TME_OUT2

,SUBSTR(TRIM(TME_OUT2),1,2)||':'||SUBSTR(TRIM(TME_OUT2),3,2)||':'||SUBSTR(TRIM(TME_OUT2),5,2)
AS
TME_OUT_O
, CAST( DTE_OUT_O||' '||TME_OUT_O AS TIMESTAMP FORMAT
'MMDD:HH: MI:SS') AS DTM
,udp.nbr  AS  nbr

   FROM   STS_GNCAUDP   udp

INNER JOIN LOAD_LOG LZ_LL  ON udp.LOG_KEY=LZ_LL.LOG_KEY
INNER JOIN ESA_LOAD_LOG ESA_LL  ON
LZ_LL.ESA_LOAD_LOG_KEY=ESA_LL.LOG_KEY
AND ESA_LL.PBLSH_IND='$PBLSH_IND'
AND ESA_LL.LOAD_END_DTM ='$HIGH_DATE_TIME'
AND ESA_LL.SOR_CD= '$CLM_SOR_CD'
AND ESA_LL.SUBJ_AREA_NM= '$SUBJ_AREA_NM'
AND ESA_LL.WORK_FLOW_NM= '$WORK_FLOW_NM'
QUALIFY ROW_NUMBER()  OVER  (PARTITION BY udp.cc,udp.pst_no,
udp.cde,udp.nbr,udp.dte,udp.LOG_KEY
ORDER BY DTM DESC)=1) AS der
;



Thanks in advance!
Dti


Re: Help in hive query

2012-10-31 Thread dyuti a
Hey John,
I did the same using a join of two tables, but your suggestion helped me a
lot in many ways. It works great!

Thanks a ton!

Regards,
dti

On Tue, Oct 30, 2012 at 7:27 PM, John Meagher wrote:

> The WHERE part in the approvals can be moved up to be an IF in the
> SELECT...
>
>
> SELECT client_id,receive_dd,receive_hh,
>   receive_hh+1,
>   COUNT(1) AS transaction_count,
>   SUM( IF ( response=00, 1, 0) ) AS approval_count,
>   SUM( IF ( response=00, 1, 0) ) / COUNT(1) * 100 AS percent
> FROM sale_test
> group by fdc_client_id,receive_dd,receive_hh,receive_hh+1;
>
>
>


Re: Help in hive query

2012-10-30 Thread John Meagher
The WHERE part in the approvals can be moved up to be an IF in the SELECT...


SELECT client_id,receive_dd,receive_hh,
  receive_hh+1,
  COUNT(1) AS transaction_count,
  SUM( IF ( response=00, 1, 0) ) AS approval_count,
  SUM( IF ( response=00, 1, 0) ) / COUNT(1) * 100 AS percent
FROM sale_test
group by fdc_client_id,receive_dd,receive_hh,receive_hh+1;



On Tue, Oct 30, 2012 at 1:51 AM, dyuti a  wrote:
> Hi All,
>
> I want to perform (No.of .approvals in an hour/No.of transactions in that
> hour)*100.
>
> //COUNT(1) AS cnt  gives total transactions in an hour
> SELECT client_id,receive_dd,receive_hh,receive_hh+1,COUNT(1) AS cnt FROM
> sale_test group by fdc_client_id,receive_dd,receive_hh,receive_hh+1;
>
> GETREGREOS  23  16  17  5969
> GETREGREOS  23  21  22  2602
> GETREGREOS  24  3   4   114
>
> //Approved transactions where response=00
> SELECT client_id,receive_dd,receive_hh,receive_hh+1,COUNT(1) AS cnt FROM
> sale_test where response=00 group by
> client_id,receive_dd,receive_hh,receive_hh+1;
> GETREGREOS  23  16  17  5775
> GETREGREOS  23  21  22  2515
> GETREGREOS  24  3   4   103
>
>
> I want to perform 100 * (5775/5969) , 100 * (2515/2602) , 100 * (103/114)
> like the same for all other clients for each hour i.e., (No.of .approvals in
> an hour/No.of transactions in that hour)*100.
>
> Please help me out as how to achieve this in hive.
>
>
> Thanks & Regards,
> dti


RE: NEED HELP in Hive Query

2012-10-14 Thread yogesh dhari

Thanks John :-),

I got it now in Pig also :-).

A = load '/File/00_0' using PigStorage('\u0001')
    as (id, name, date, url, hit:int);

B = group A by (id, name, date, url);

C = foreach B generate flatten(A.id), flatten(A.name), flatten(A.date),
    flatten(A.url), SUM(A.hit);

D = distinct C;

Dump D;

Thanks & Regards
Yogesh Kumar Dhari

From: j...@omernik.com
Date: Sun, 14 Oct 2012 12:29:23 -0500
Subject: Re: NEED HELP in Hive Query
To: user@hive.apache.org

select NAME, DATE, URL, SUM(HITCOUNT) as HITCOUNT
from yourtable group by NAME, DATE, URL

That's the HIVE answer. Not sure the PIG answer.






Re: NEED HELP in Hive Query

2012-10-14 Thread John Omernik
select NAME, DATE, URL, SUM(HITCOUNT) as HITCOUNT from yourtable group by
NAME, DATE, URL

That's the HIVE answer. Not sure the PIG answer.






NEED HELP in Hive Query

2012-10-14 Thread yogesh dhari

Hi all, 

I have this file. I want to perform this operation in HIVE & PIG:

NAME            DATE        URL                                                HITCOUNT
timesascent.in  2008-08-27  http://timesascent.in/index.aspx?page=tparchives   15
timesascent.in  2008-08-27  http://timesascent.in/index.aspx?page=article&sectid=1&contentid=200812182008121814134447219270b26  20
timesascent.in  2008-08-27  http://timesascent.in/                             37
timesascent.in  2008-08-27  http://timesascent.in/section/39/Job%20Wise        14
timesascent.in  2008-08-27  http://timesascent.in/article/7/2011062120110621171709769aacc537/Work-environment--Employee-productivity.html  20
timesascent.in  2008-08-27  http://timesascent.in/                             17
timesascent.in  2008-08-27  http://timesascent.in/section/2/Interviews         15
timesascent.in  2008-08-27  http://timesascent.in/                             17
timesascent.in  2008-08-27  http://timesascent.in/                             27
timesascent.in  2008-08-27  http://timesascent.in/                             37
timesascent.in  2008-08-27  http://timesascent.in/                             27
timesascent.in  2008-08-27  http://www.timesascent.in/                         16
timesascent.in  2008-08-27  http://timesascent.in/section/2/Interviews         14
timesascent.in  2008-08-27  http://timesascent.in/                             14
timesascent.in  2008-08-27  http://timesascent.in/                             22


I want to add all HITCOUNT values for the same NAME, DATE & URL,

like

timesascent.in  2008-08-27  http://timesascent.in/  (addition of all
hitcount under the same name, date, url: 37+17+17+27+)

Please suggest a method to perform this query.


Thanks & Regards
Yogesh Kumar



  

Re: Help in hive query

2012-10-10 Thread Manu A
Thanks Jan. It worked!


Regards,
Manu

On Wed, Oct 10, 2012 at 12:00 PM, Jan Dolinár  wrote:

> Hi Manu,
>
> I believe the last "group by q2.auth_count" is wrong, because it
> causes computing average only across lines with same value of
> q2.auth_count, which is of course equal to its value.
>
> Best regards,
> J. Dolinar
>


Re: Help in hive query

2012-10-09 Thread Jan Dolinár
Hi Manu,

I believe the last "group by q2.auth_count" is wrong, because it
causes computing average only across lines with same value of
q2.auth_count, which is of course equal to its value.

Best regards,
J. Dolinar
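In other words, the outer aggregates should run over all rows of q2, so the final GROUP BY is simply removed. A sketch of the corrected query from the thread, under that assumption:

```sql
select avg(q2.auth_count), stddev_pop(q2.auth_count)
from (
    select q1.TEXT_CCYY, count(q1.response) as auth_count
    from (
        select * from Sale1
        where TEXT_DD=7 AND TEXT_HH=15 AND response=00
    ) q1
    group by q1.TEXT_CCYY, q1.response
) q2;    -- no trailing "group by q2.auth_count"
```

With the trailing GROUP BY removed, avg() and stddev_pop() aggregate across every per-TEXT_CCYY count instead of across single-row groups.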



Help in hive query

2012-10-09 Thread Manu A
Hi All,
The result of the query below is 194965.0  0.0, but 194965 is the result of
the inner query's count(q1.response). It looks like the outer query
[select avg(q2.auth_count), stddev_pop(q2.auth_count)] didn't work at all.


 //Query
select avg(q2.auth_count), stddev_pop(q2.auth_count)
from (
select q1.TEXT_CCYY ,count(q1.response) as auth_count
from(
   select * from Sale1 where TEXT_DD=7 AND TEXT_HH=15  AND
response=00)q1
group by q1.TEXT_CCYY,q1.response)q2
group by q2.auth_count;


Please help me: is there anything I have to change in the query?


Thanks & Regards,
Manu