;2012/09/18 00:00:00'*
* AND SojTimestampToDate(event.event_timestamp) <= '2012/09/18 02:00:00'*
Can anyone shed some light on whether I am doing this right or not?
*Raihan Jamal*
Just to add here: *SojTimestampToDate* will return data only in this
format: *2012/02/29 17:01:43*
*Raihan Jamal*
On Wed, Oct 3, 2012 at 4:46 PM, Raihan Jamal wrote:
> This is still not working as in the XML file the *final* property has
> been set as true so that means I cannot overr
s for this job 2070929 exceeds the
configured limit 20*
Any other suggestions on what I should do to overcome this problem? Maybe a
change in the query could avoid it?
*Raihan Jamal*
On Wed, Oct 3, 2012 at 2:59 PM, Chalcy Raja
wrote:
> Hi Raihan,
>
What if I do it like below? Will this work?
*
set mapred.jobtracker.maxtasks.per.job=-1*
*Raihan Jamal*
On Wed, Oct 3, 2012 at 2:59 PM, Chalcy Raja
wrote:
> Hi Raihan,
>
>
> You can set it in hive prompt like below,
>
> set mapred.jobtracker.max
changes manually from the Hive prompt? Any
suggestions?
*Raihan Jamal*
On Wed, Oct 3, 2012 at 2:19 PM, Raihan Jamal wrote:
> Can anyone help me out here? What does the below error mean? And this is
> the query I am using:
>
> *SELECT cguid,*
> * event_item,*
>
;) >= unix_timestamp('2012/09/18 00:00:00', 'yyyy/MM/dd HH:mm:ss')*
* AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd
HH:mm:ss') <= unix_timestamp('2012/09/18 02:00:00', 'yyyy/MM/dd HH:mm:ss')*
* ) n ON m.cguid = n.ch
error means? Can anyone help me out here?
*Raihan Jamal*
That basically means your data was not in the correct format when you moved
or copied the data to HDFS. So there is one file which is corrupted; you
can find the file name in your error logs.
*Raihan Jamal*
On Tue, Aug 28, 2012 at 7:23 AM, Kiwon Lee wrote:
> Hi
>
> I have
Let me try that. Thanks for the help.
*Raihan Jamal*
On Wed, Aug 15, 2012 at 5:10 PM, hadoop hive wrote:
> Hey Jamal,
> You can use a bash shell script combined with the hive query; in the shell
> script you can check the exit status.
> E.g :
> #!/bin/bash
> hive -e "
other
HiveQL queries.
*Raihan Jamal*
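The suggestion above, wrapping a hive query in a shell script and branching on the exit status, might be sketched like this; the helper name and the sample query are illustrative, not from the thread:

```shell
#!/bin/bash
# Sketch: run a command (e.g. hive -e "...") and branch on its exit status.
# run_and_check is a hypothetical helper name; 'hive' is assumed on PATH.
run_and_check() {
  "$@"
  local status=$?
  if [ $status -ne 0 ]; then
    echo "command failed with status $status" >&2
    return $status
  fi
  echo "command succeeded"
}

# Usage (assumed query): run_and_check hive -e "SELECT count(*) FROM t1;"
```

The wrapper propagates the failing status, so a calling script can itself abort or retry on a failed query.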
I think you can use LIMIT here:
Limit indicates the number of rows to be returned. The rows returned are
chosen at random. The following query returns 5 rows from t1 at random.
SELECT * FROM t1 LIMIT 5
http://karmasphere.com/hive-queries-on-table-data
*Raihan Jamal*
On Tue, Aug 14, 2012
Is there any difference between count(*) and count(1) in Hive? And which
one should we use in general, and why?
Given that I am on Hive 0.6 version.
*Raihan Jamal*
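For what it's worth, COUNT(1) counts the non-NULL literal 1 once per row and COUNT(*) counts rows directly, so on any Hive version that accepts both they return the same number. A quick side-by-side check (table name t1 is illustrative) would be:

```sql
-- Both expressions count every row; the two results should be identical.
SELECT COUNT(*), COUNT(1) FROM t1;
```

If an old release only accepts one form, COUNT(1) was reportedly the form shown in early Hive documentation.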
Thanks Jan for the suggestion.
*Raihan Jamal*
On Tue, Aug 7, 2012 at 10:01 PM, Jan Dolinár wrote:
> The shell will interpret the query in your command as SELECT
> ... explode(split(timestamps, *#*)) ... if you run it the way you wrote
> it, i.e. without the quotation. The way ar
Let me try that, and I will update this thread.
*Raihan Jamal*
On Tue, Aug 7, 2012 at 11:39 AM, Techy Teck wrote:
> Then that means I don't need to create that user-defined function, right?
>
>
>
> On Tue, Aug 7, 2012 at 11:32 AM, Jan Dolinár wrote:
>
>> Hi
. And I
don't know why they say this, so that is the reason I was doing it
this way.
Any suggestions to make this work would be appreciated.
*Raihan Jamal*
On Tue, Aug 7, 2012 at 11:11 AM, Vijay wrote:
> Given the implementation of the UDF, I don't think hive would
Yes, it supports the -e option, but in your query, what is *date*?
hive -e "CREATE TEMPORARY FUNCTION yesterdaydate
AS 'com.example.hive.udf.YesterdayDate';
SELECT * FROM REALTIME where dt=$(date -d -1day +%Y%m%d) LIMIT 10;"
*Raihan Jamal*
On Tue, Aug 7, 2012 at 11:18 AM
is: how to get yesterday's date, which I can use
for the date partition. I cannot use hiveconf here, as I am working with Hive
0.6.
*Raihan Jamal*
On Tue, Aug 7, 2012 at 10:37 AM, Jan Dolinár wrote:
> I'm afraid that the query
>
> SELECT * FROM REALTIME where dt= yesterday
DEPENDENCIES:
Stage-0 is a root stage
STAGE PLANS:
Stage: Stage-0
Fetch Operator
limit: 5
Time taken: 12.126 seconds
*Raihan Jamal*
On Tue, Aug 7, 2012 at 10:56 AM, Jan Dolinár wrote:
> Oops, sorry I made a copy&paste mistake :) The annotation should read
806’ LIMIT 10;
So that means it will look for data only in the corresponding dt partition
*(20120806)*, right, as the above table is partitioned on the dt column? And
it will not scan the whole table, right?
*Raihan Jamal*
On Mon, Aug 6, 2012 at 10:56 PM, Jan Dolinár wrote:
> Hi Jamal,
>
ng is wrong the way I am doing it for sure?
*Raihan Jamal*
On Mon, Aug 6, 2012 at 10:56 PM, Jan Dolinár wrote:
> Hi Jamal,
>
> Check if the function really returns what it should and that your data are
> really in yyyyMMdd format. You can do this with a simple query like this:
>
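The quoted example query is cut off above; a minimal sanity check in that spirit, with the function and table names assumed from earlier in this thread, might be:

```sql
-- Print the UDF's output next to the partition column for a few rows,
-- to confirm both really use the same format (names are assumptions).
SELECT yesterdaydate(), dt FROM REALTIME LIMIT 5;
```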
Yup, Thanks it worked.
*Raihan Jamal*
On Fri, Jul 20, 2012 at 1:40 PM, Bejoy KS wrote:
> Raihan
>
> To see the failed task logs in hadoop, the easiest approach is
> drilling down the jobtracker web UI.
>
> Go to the job url (which you'll get in the beginning
I tried opening the below URL, and nothing opened; I got "page cannot be
displayed". Why is that?
*Raihan Jamal*
On Fri, Jul 20, 2012 at 12:39 PM, Sriram Krishnan wrote:
> What version of Hadoop and Hive are you using? We have seen errors like
> this in the past – and you can ac
After setting this in Hive:
hive> SET hive.exec.show.job.failure.debug.info=false;
will I see the logs on my console itself? Or do I need to go somewhere else
to see the actual logs and what is causing the problem?
*Raihan Jamal*
On Fri, Jul 20, 2012 at 12:28 PM, kulkarni.swar...@gmail.
at
org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
*
*... 15 more*
*Ended Job = job_201207172005_14407 with exception
'java.lang.RuntimeException(Error while reading from task log url)'*
*Raihan Jamal*
Will something like this work in Hive?
*ON ((UNIX_TIMESTAMP(testingtable1.created_time) - (prod_and_ts.timestamps
/ 1000)) / 60* 1000 <= 15 minutes)*
*Raihan Jamal*
On Wed, Jul 18, 2012 at 4:48 PM, Raihan Jamal wrote:
> This is the CREATED_TIME *`2012-07-17 00:00:22`* and this
(UNIX_TIMESTAMP(testingtable1.created_time) - (prod_and_ts.timestamps /
1000) = 15 minutes)
How can I do the above: if the difference between the timestamps is within
15 minutes, then the rows should be matched by the above `ON clause`?
*Raihan Jamal*
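One way to express the 15-minute window: older Hive only accepts equality predicates in ON, so the range test goes in WHERE instead. Table and column names are taken from the thread; the equality join key is an assumption, and timestamps is in milliseconds, hence the division by 1000:

```sql
SELECT t1.*, p.*
FROM testingtable1 t1
JOIN prod_and_ts p
  ON (t1.some_key = p.some_key)   -- hypothetical equality join key
WHERE abs(unix_timestamp(t1.created_time)
          - cast(p.timestamps / 1000 AS BIGINT)) <= 15 * 60;
```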
And CREATED_TIME is string data type.
*Raihan Jamal*
On Wed, Jul 18, 2012 at 2:48 PM, Raihan Jamal wrote:
> This is the CREATED_TIME *2009-12-14 10:15:54*
> How can I get only the date part from the above created_time, just like
> below?
>
> *2009-12-14*
>
This is the CREATED_TIME *2009-12-14 10:15:54*
How can I get only the date part from the above created_time, just like
below?
*2009-12-14*
Any suggestions will be appreciated.
*Raihan Jamal*
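Hive's built-in to_date() returns just the date portion of a timestamp string. A sketch (the table and column names are illustrative, and older Hive requires a FROM clause):

```sql
-- to_date('2009-12-14 10:15:54') yields '2009-12-14'.
SELECT to_date(created_time) FROM your_table LIMIT 1;
```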
*to_date(from_unixtime(cast(prod_and_ts.timestamps /1000 as BIGINT)))*
So this should work? I am currently running it to see the output.
*Raihan Jamal*
On Wed, Jul 18, 2012 at 1:16 PM, Raihan Jamal wrote:
> Can you show me exact syntax, how to do this? It will be of great help to
&
Can you show me the exact syntax for how to do this? It would be of great
help to me. Thanks.
*Raihan Jamal*
On Wed, Jul 18, 2012 at 1:14 PM, Paul Mackles wrote:
> That timestamp is in milliseconds but the hive date functions expect
> seconds. Try dividing by 1000 first.
>
> From:
in an
email.
*Raihan Jamal*
On Tue, Jul 17, 2012 at 11:30 PM, Vinod Singh wrote:
> hive -e "SELECT count(*) from pds_table" > a.txt
>
> Thanks,
> Vinod
>
>
> On Wed, Jul 18, 2012 at 10:58 AM, Raihan Jamal wrote:
>
>> I am new to Unix Shel
> a.txt;
How can I do this from a shell script, send the output to a txt file, and
then send that txt file as an attachment in an email?
*Raihan Jamal*
unix_timestamp(tt1.created_time) = tt2.timestamps)
Any suggestions will be appreciated.
*Raihan Jamal*
Sending it again, as I haven't got any reply on this. Any personal
experience will be appreciated.
*Raihan Jamal*
On Mon, Jul 9, 2012 at 3:37 PM, Raihan Jamal wrote:
> *Problem Statement:-*
>
> I need to compare two tables Table1 and Table2 and they both store same
> th
That makes sense to me. So that means whenever I run any HiveQL query, the
outputs are only displayed on the console; they are not stored anywhere.
And if we want to store them, then as Kulkarni suggested, we need to do
that ourselves. Right?
*Raihan Jamal*
On Thu, Jul 12, 2012 at 12:07 PM, kulkarni
?
*Raihan Jamal*
On Thu, Jul 12, 2012 at 11:56 AM, Roberto Sanabria
wrote:
> Or you can output to a table and store it there.
>
>
> On Thu, Jul 12, 2012 at 2:53 PM, VanHuy Pham wrote:
>
>> The output can be printed out on terminal when you run it, or can be
>> stored if yo
any time limit on that, meaning after some particular
time it will be deleted?
*Raihan Jamal*
Yup this works. Thanks for the help.
*Raihan Jamal*
On Tue, Jul 10, 2012 at 4:37 PM, Vijay wrote:
> In that case, wouldn't this work:
>
> SELECT buyer_id, item_id, rank(buyer_id), created_time
> FROM (
> SELECT buyer_id, item_id, created_time
> FROM testing
stamps);
I always get this error:
*FAILED: Error in semantic analysis: line 13:6 Invalid Table Alias or
Column Reference prod_and_ts*
*Raihan Jamal*
Thanks Vijay, yes it worked. Can you also take a look at one of my other
posts, subject title *TOP 10*?
*Raihan Jamal*
On Tue, Jul 10, 2012 at 1:41 PM, Vijay wrote:
> to_date(from_unixtime(cast(timestamps as int)))
>
> On Tue, Jul 10, 2012 at 1:33 PM, Raihan Jamal
> wrote:
>
I need only the date, not the hours and seconds, so that is the reason I
was using to_date; and from_unixtime() takes an int as parameter, while
timestamps is a string in this case.
*Raihan Jamal*
On Tue, Jul 10, 2012 at 1:28 PM, Vijay wrote:
> You need to use from_unixtime()
>
> On Tu
testingtable2 lateral view
explode(purchased_item) exploded_table as prod_and_ts) A;
This is the output I always get:
*1004941621 NULL*
*1005268799 NULL*
*1061569397 NULL*
*1005542471 NULL*
*Raihan Jamal*
tingtable1
DISTRIBUTE BY buyer_id, item_id
SORT BY buyer_id, item_id, created_time desc
) a
WHERE rk < 10
ORDER BY buyer_id, created_time, rk;
*Raihan Jamal*
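Pulling the pieces of this thread together, the top-10-per-buyer pattern looks roughly like this. rank(buyer_id) is the custom UDF discussed here (Hive 0.6 has no built-in rank); it needs each buyer's rows delivered to one reducer in sorted order, hence DISTRIBUTE BY / SORT BY, and the alias rk must be assigned in the inner query to be referenced outside. A sketch with names from the thread:

```sql
SELECT buyer_id, item_id, created_time
FROM (
  SELECT buyer_id, item_id, created_time, rank(buyer_id) AS rk
  FROM testingtable1
  DISTRIBUTE BY buyer_id               -- keep each buyer on one reducer
  SORT BY buyer_id, created_time DESC  -- newest rows first per buyer
) a
WHERE rk < 10                          -- the thread's predicate: ~10 newest per buyer
ORDER BY buyer_id, created_time, rk;
```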
On Tue, Jul 10, 2012 at 12:16 AM, Jasper Knulst
wrote:
> Hi Raihan,
>
> You should use 'rank(buyer_id)' in
I am trying that solution. Currently I am running my query to see what
result I am getting back with UDF.
*Raihan Jamal*
On Tue, Jul 10, 2012 at 12:13 AM, Nitin Pawar wrote:
> i thought you managed to solve this with rank??
>
>
> On Tue, Jul 10, 2012 at 12:38 PM, Raihan
The problem with that approach is that with LIMIT 10 after desc, it will
return only 10 rows irrespective of BUYER_ID. But I need the 10 latest rows
for each BUYER_ID specifically.
*Raihan Jamal*
On Tue, Jul 10, 2012 at 12:03 AM, Abhishek Tiwari <
abhishektiwari.bt...@gmail.
desc
) a
WHERE rank < 10
ORDER BY buyer_id, created_time, rank;
What changes do I need to make?
*Raihan Jamal*
On Mon, Jul 9, 2012 at 11:52 PM, Nitin Pawar wrote:
> try rk in upper select statement as well
>
>
> On Tue, Jul 10, 2012 at 12:12 PM, Raihan Jamal wrote:
>
>&g
ons?
*Raihan Jamal*
On Mon, Jul 9, 2012 at 10:51 PM, Vijay wrote:
> hive has no built-in rank function. you'd need to use a user-defined
> function (UDF) to simulate it. there are a few custom implementations
> on the net that you can leverage.
>
> On Mon, Jul 9, 2012 at 10:40 PM
2012-07-09 06:54:37
*Raihan Jamal*
On Mon, Jul 9, 2012 at 7:56 PM, Andes wrote:
> hello, you can use "desc" and "limit 10" to filter the top 10.
>
> 2012-07-10
> --
> Best Regards
> Andes
>
Yup, that worked for me. I figured it out after reading the docs: INNER
JOIN means JOIN in HiveQL.
*Raihan Jamal*
On Mon, Jul 9, 2012 at 2:48 PM, Roberto Sanabria wrote:
> Did you try just using "join" instead of "inner join"?
>
>
> On Mon, Jul 9, 2012
g ) in
subquery source`*
*Raihan Jamal*