Re: Multi-group-by select always scans entire table

2012-05-28 Thread Jan Dolinár
On Fri, May 25, 2012 at 12:03 PM, Jan Dolinár wrote: > > -- see what happens when you try to perform multi-group-by query on one of > the partitions > EXPLAIN EXTENDED > FROM partition_test > LATERAL VIEW explode(col1) tmp AS exp_col1 > INSERT OVERWRITE DIRECTORY '/test/1' > SELECT exp_col1 >

building a (hive) client in perl

2012-05-28 Thread Stephen Sprague
Hi Guys, Apologies if this is the wrong group - i'm thinking dev might be better - but here goes. After some fumbling around and googling around i've managed to build a perl hive client and have connected and executed some queries and have even gotten results back. Woo-hoo! However, naturally there'

How to output into a binary table

2012-05-28 Thread Edward Capriolo
I am trying to find the example binary_output_format.q. create table abinary (num1 int, num2 int) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' with SERDEPROPERTIES ( 'serialization.last.column.takes.rest'='true' ) STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextI

Re: Need help with simple subquery

2012-05-28 Thread Igor Tatarinov
Try replacing the comma with JOIN igor decide.com On Mon, May 28, 2012 at 6:48 AM, shan s wrote: > I need help with a simple subquery. Given below data, I need counts and > percentage counts per category. (Re-phrasing my earlier question ) > With the code below I get an error: *FAILED: Parse Er

Re: odd header behavior?

2012-05-28 Thread Stephen Sprague
Hi Ed, No doubt about it. I definitely don't like taking unless i'm giving something too. I'm going to fish around in the dev world of hive and see what i can do. Cheers, Stephen Sprague On Mon, May 28, 2012 at 11:42 AM, Edward Capriolo wrote: > The thin client is a newer feature. Some things

Re: odd header behavior?

2012-05-28 Thread Edward Capriolo
The thin client is a newer feature. Some things like cli.print.header send output directly to the stream and the thrift client unfortunately only returns results over the thrift. It is a fair criticism, and sometimes annoying that output and error might go to a different place than the result data.

Re: odd header behavior?

2012-05-28 Thread Stephen Sprague
nah. no difference. okay. hive certainly shows immense potential but i think i just have to face the facts - it's very immature at this time. On Fri, May 25, 2012 at 10:59 PM, Nitin Pawar wrote: > try putting them in a file and execute as -f queryfile > > > On Sat, May 26, 2012 at 7:51 AM, Steph

RE: FW: Filtering on TIMESTAMP data type

2012-05-28 Thread Debarshi Basak
yeah...check it out https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions or search google for date function for hive Debarshi Basak Tata Consultancy Services Mailto: debarshi.ba...@tcs.com Website: http://www.tcs.com

RE: FW: Filtering on TIMESTAMP data type

2012-05-28 Thread Debarshi Basak
I have read that hive has something for time..i have to check..i am not sure Debarshi Basak Tata Consultancy Services Mailto: debarshi.ba...@tcs.com Website: http://www.tcs.com Experience certainty. IT Services Business Solutions Outsourcing

RE: FW: Filtering on TIMESTAMP data type

2012-05-28 Thread Ladda, Anand
Debarshi Didn't quite follow your first comment. I get the write-your-own UDF part but was wondering how others have been transitioning from STRING dates to TIMESTAMP dates and getting filtering, partition pruning, etc to work with constants -Anand From: Debarshi Basak [mailto:debarshi.ba...@tcs

Need help with simple subquery

2012-05-28 Thread shan s
I need help with a simple subquery. Given below data, I need counts and percentage counts per category. (Re-phrasing my earlier question ) With the code below I get an error: *FAILED: Parse Error:* line 6:50 *mismatched input ','* *expecting EOF near 'a'* Looking at the documentation the syntax it

Re: Job Scheduling in Hadoop-Hive

2012-05-28 Thread Florin Diaconeasa
Spring batch with a basic tasklet for querying the hive db should be of help. :) On 26.05.2012, at 17:48, Ronak Bhatt wrote: Hello - For those users whose setup is somewhat production, what do you use for job scheduling and dependency management? *thanks, ronak*

Re: subquery in select

2012-05-28 Thread wd
I haven't run it. I think the SQL I wrote should work in any db On Mon, May 28, 2012 at 4:54 PM, shan s wrote: > That was a typo in the email, but it still errors after the typo is > corrected. > Did you try to run it? > > Here is the entire script after creating the file, gt.txt from d

Re: subquery in select

2012-05-28 Thread shan s
That was a typo in the email, but it still errors after the typo is corrected. Did you try to run it? Here is the entire script after creating the file, gt.txt from data below CREATE EXTERNAL TABLE IF NOT EXISTS gt (id INT, category STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STOR

index updates

2012-05-28 Thread Franc Carter
Hi, I've been googling and haven't been able to find out if indexes get updated automatically when additional data is inserted? Can someone point me to some good docs on the current status of indexing in hive? Thanks -- *Franc Carter* | Systems architect | Sirca Ltd franc.car...@sirca.org

Re: subquery in select

2012-05-28 Thread wd
group by category On Mon, May 28, 2012 at 2:20 PM, shan s wrote: > (select category, count(*) as count from gt group by cat) a,