Issue while quering Hive

2013-09-16 Thread Garg, Rinku
Hi All, I have set up Hadoop and Hive and am trying to load a gzip file into the Hadoop cluster. The files are loaded successfully and can be viewed in the web UI. While executing a SELECT query, I get the error mentioned below. ERROR org.apache.hadoop.security.UserGroupInformation:

Duplicate rows when using group by in subquery

2013-09-16 Thread Mikael Öhman
Hello. This is basically the same question I posted on stackoverflow: http://stackoverflow.com/questions/18812390/hive-subquery-and-group-by/18818115?noredirect=1#18818115 I know the query is a bit noisy. But this query also demonstrates the error: select a.symbol from (select symbol,

Re: Issue while quering Hive

2013-09-16 Thread Nitin Pawar
Look at the error message: Caused by: java.io.IOException: hdfs://localhost:54310/user/hive/warehouse/cpj_tbl/cpj.csv.gz not a SequenceFile. Did you create the table with SequenceFile?

RE: Issue while quering Hive

2013-09-16 Thread Garg, Rinku
Hi Nitin, Yes, I created the table with SequenceFile. Thanks & Regards, Rinku Garg

User accounts to execute hive queries

2013-09-16 Thread shouvanik.haldar
Hi, Can you please tell me if it's possible to execute Hive queries as different users? Can we create read-only access for Hive? Please help. Thanks, Shouvanik

RE: Issue while quering Hive

2013-09-16 Thread Garg, Rinku
Thanks Nitin, That way it worked. But in that case Hadoop will not be able to split my file into chunks/blocks and run multiple maps in parallel. This can cause under-utilization of my cluster's 'mapping' power. Is that true? Thanks & Regards, Rinku Garg

Re: Issue while quering Hive

2013-09-16 Thread Nitin Pawar
As per my understanding, Hadoop 1.x does not give you any help with processing compressed files in parallel (at least this was the case a few months back). bzip2 splitting and the like were added in Hadoop 2.x, as per my understanding.

RE: 回复: hive 0.11 auto convert join bug report

2013-09-16 Thread Sun, Rui
Hi Amit, You can see the description of HIVE-5256 for a more detailed explanation. Both table aliases and names (if no alias) may run into this issue. This issue happened to be covered by the XML serialization/deserialization of the MapredWork containing the join operator (HashMap

RE: User accounts to execute hive queries

2013-09-16 Thread shouvanik.haldar
Hi Nitin, Users want to execute Hive queries under their own user names. Is that possible, or do they have to be logged in as the hive user? Thanks, Shouvanik

Re: User accounts to execute hive queries

2013-09-16 Thread Nitin Pawar
You will need to tell us a few more things. Do you want it secured? Do you distinguish users into different categories by what a particular user can or cannot do? What kind of security do you have on HDFS? It is definitely possible for users to run queries under their own usernames, but then you have to take

Re: Issue while quering Hive

2013-09-16 Thread Sanjay Subramanian
With regard to splitting and compression, there are really 2 options as of now. If you are using Sequence Files, then Snappy. If you are using TXT files, then LZO is great (you have to jump through a few minor hoops to get LZO to work, and I can provide guidance on that). Please don't use GZ (not splittable) / or
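
A minimal Python sketch of why a gzip file is described as "not splittable" in this thread. It is not Hadoop or Hive code; the helper name readable_from and the sample rows are hypothetical. It only shows that a gzip decoder can start at the beginning of the stream but not at an arbitrary offset, which is the position a second parallel reader would be in.

```python
import gzip
import zlib


def readable_from(blob: bytes, offset: int) -> bool:
    """Return True if a gzip decoder can start reading at `offset`."""
    try:
        gzip.decompress(blob[offset:])
        return True
    except (OSError, EOFError, zlib.error):
        # A reader dropped into the middle of the stream finds no gzip
        # header and has no decompressor state, so decoding fails.
        return False


# Hypothetical stand-in for a CSV file loaded into the warehouse.
rows = "".join(f"{i},symbol{i % 7},{i * 3}\n" for i in range(5000)).encode()
blob = gzip.compress(rows)

print("from start: ", readable_from(blob, 0))
print("from middle:", readable_from(blob, len(blob) // 2))
```

Under these assumptions only the reader that starts at offset 0 succeeds, so the whole file ends up being handled by a single reader.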

Re: Duplicate rows when using group by in subquery

2013-09-16 Thread Yin Huai
Hello Mikael, It seems your case is related to the bug reported in https://issues.apache.org/jira/browse/HIVE-5149. Basically, when Hive uses a single MapReduce job to evaluate your query, c.Symbol and c.catid are used to partition the data, and thus, rows with the same value of c.Symbol are not

Re: Interesting claims that seem untrue

2013-09-16 Thread Carter Shanklin
Ed, If nothing else I'm glad it was interesting enough to generate some discussion. These sorts of stats are always subjects of a lot of controversy. I have seen a lot of these sorts of charts float around in confidential slide decks and I think it's good to have them out in the open where anyone

FAILED: ParseException line 12:0 mismatched input 'Comment' expecting Identifier near ',' in column specification

2013-09-16 Thread Artem Ervits
Hello all, I'm trying to create a table with a column called Comment and I get the following exception: FAILED: ParseException line 12:0 mismatched input 'Comment' expecting Identifier near ',' in column specification. I am running the hive-common-0.10.0.23.jar version of Hive. Is that a bug or a

Generic UDFs and named parameters

2013-09-16 Thread Roberto Congiu
Hey guys, I wrote a generic UDF that takes a variable number of arrays and returns an array of structs built with the input arrays, something like: structify(array1,array2,array3) - returns an array of structs structarray where structarray[0] = [{ array1[0],array2[0],array3[0]}, {array1[1],
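
A rough Python sketch of the element-wise behaviour described for structify, for illustration only. It is not the poster's UDF and does not use the Hive GenericUDF API. The field names col0, col1, ... and the equal-length check are assumptions, since the message is cut off before it says how fields are named or how uneven inputs are handled.

```python
from typing import Any, Dict, List, Sequence


def structify(*arrays: Sequence[Any]) -> List[Dict[str, Any]]:
    """Zip N equal-length arrays into a list of struct-like dicts."""
    if not arrays:
        return []
    length = len(arrays[0])
    # Assumption: inputs must line up element by element.
    if any(len(a) != length for a in arrays):
        raise ValueError("all input arrays must have the same length")
    # Placeholder field names (hypothetical): col0, col1, ...
    names = [f"col{i}" for i in range(len(arrays))]
    return [dict(zip(names, values)) for values in zip(*arrays)]


result = structify([1, 2], ["a", "b"], [True, False])
print(result)
```

Here result[0] holds the first element of every input array and result[1] the second, mirroring the array1[0], array2[0], array3[0] grouping in the message.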