impala like formatting of tables inside Hive cli

2016-03-10 Thread Awhan Patnaik
Is there a setting that will yield nicely formatted tables, as in Impala? I am attaching an example of what I mean.

count(*) not allowed in order by

2016-03-07 Thread Awhan Patnaik
I have to take the first 25 IDs ranked by count(*), but the following is not allowed in Hive: select id from T order by count(*) desc limit 25; It yields a "Not yet supported place for UDAF count" error. The way around it is the following: select id, count(*) as cnt from T group by id order by cnt de
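A minimal sketch of the workaround described above, assuming the truncated query ends with "desc limit 25" like the rejected one. The table T, the column id and the alias cnt all come from the post; the layout is only illustrative.

```sql
-- Aggregate per id first, then order by the alias instead of by count(*) itself.
select id, count(*) as cnt
from T
group by id
order by cnt desc
limit 25;
```

If only the IDs are wanted, this query can be used as a subquery and the outer select can project id alone.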

divide column by its sum

2016-02-22 Thread Awhan Patnaik
select c, sum(c) from foo group by c; obviously does not work. There are 2 ways that I could find. 1) Find out the sum in a separate query and hard-code the sum in a new query. This obviously performs 2 passes over the table. 2) select a, a/sum(a) over() from foo group by a; how many passes over
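A hedged sketch of the second approach, written without the group by so that each row is divided by the sum over all rows. The table foo and column a are from the post; the alias fraction is an illustrative name, not part of the original.

```sql
-- The empty over () makes sum(a) a total across every row of foo,
-- so each row's value is divided by the same column sum.
select a, a / sum(a) over () as fraction
from foo;
```

Note that if group by a is kept, as in the post, the window is evaluated over the grouped rows, so the denominator becomes the sum of the distinct values of a rather than the sum of the whole column.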

Re: Need help :Does anybody has HDP cluster on EC2?

2016-02-15 Thread Awhan Patnaik
Add the FQDN-to-IP mapping to the /etc/hosts file (if using Linux). On Mon, Feb 15, 2016 at 1:55 PM, Divya Gehlot wrote: > Hi, > I have hadoop cluster set up in EC2. > I am unable to view application logs in Web UI as its taking internal IP > Like below : > http://ip-xxx-xx-xx-xxx.ap-sout

split string into constituent chars

2016-02-06 Thread Awhan Patnaik
split("abc", "") results in ["a", "b", "c", ""]. What is that 4th component in the result and how do I avoid getting that in the output? hive --version Hive 1.1.0-cdh5.4.5 Subversion file:///data/jenkins/workspace/generic-package-ubuntu64-14-04/CDH5.4.5-Packaging-Hive-2015-08-12_13-54-14/hive-1.1.
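A likely explanation, not confirmed in this snippet: split treats its second argument as a regular expression, an empty pattern also matches at the very end of the string, and trailing empty strings are kept. Under that assumption, one possible sketch is to split only at positions that have a character on both sides. The lookaround pattern below is an illustrative suggestion, not something from the thread.

```sql
-- (?<=.) requires a character before the split point and (?=.) one after it,
-- so no split happens at the start or at the end of the string.
select split('abc', '(?<=.)(?=.)');
```

Because "." does not match line terminators by default, strings containing newlines may need a different pattern.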

apply function to every column without explicitly writing column names

2016-01-27 Thread Awhan Patnaik
I would like to figure out the number of nulls in each column of a table. Is there a less verbose way so that I don't have to write all the column names? select count(x) for x in all_column_names from table foo; <-- fictional instead of select count(col1), count(col2) from table foo;
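One point worth spelling out: count(col) counts the non-null values, so the number of nulls is the row count minus that. A minimal sketch, using the names col1, col2 and foo from the post (the _nulls aliases are illustrative); it still lists every column, so it does not answer the question of avoiding the column names.

```sql
-- count(*) counts all rows; count(col) skips nulls.
-- The difference is the number of nulls in that column.
select
  count(*) - count(col1) as col1_nulls,
  count(*) - count(col2) as col2_nulls
from foo;
```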

Re: query execution time in hive

2016-01-07 Thread Awhan Patnaik

increase number of reducers

2015-12-16 Thread Awhan Patnaik
3-node cluster with 15 GB of RAM per node. Two tables: L has approximately 1 million rows, U has 100 million. They both have latitude and longitude columns. I want to find the count of rows in U that are within a 10-mile radius of each row in L. I have indexed the latitude and longitude colu

make hive startup silent

2015-12-16 Thread Awhan Patnaik
When I launch hive from the command line it prints lots of settings information onto the screen, for example LS_COLORS values, HADOOP_CLASSPATH values, many environment variables, etc. How do I prevent this printing? Mind you, I am not talking about the printing of the map and reduce progress which

Re: how to search the archive

2015-12-07 Thread Awhan Patnaik
On Mon, Dec 7, 2015 at 3:22 AM, Lefty Leverenz wrote: > I've been hoping someone else would answer the question about searching > the archives, but here's what I know: > >1. The Apache archives linked from Hive's mailing lists > page don't seem t

Re: how to search the archive

2015-12-06 Thread Awhan Patnaik
On Fri, Dec 4, 2015 at 8:41 PM, Takahiko Saito wrote: > Could a table be an external table? > > > Yes. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ExternalTables

Re: how to search the archive

2015-12-06 Thread Awhan Patnaik
On Fri, Dec 4, 2015 at 7:26 PM, Timothy Garza < timothy.ga...@collinsongroup.com> wrote: > > > > Q. What does that have to do with the text in the Subject line in your > email? > > > > > Yes sorry about that. That question is not related to the subject line of the mail. In any case could somebody

how to search the archive

2015-12-04 Thread Awhan Patnaik
Hey all! I have two questions: 1) How do I search the entire mailing list archive? 2) Sometimes I find that managed tables are not removed from HDFS even after I drop them from the Hive shell. After a "drop table foo", foo does not show up in a "show tables" listing; however, that table is present