How to manage huge partitioned table with 1000+ columns in Hive

2019-09-30 Thread Saurabh Santhosh
Hi, I am facing the following problem while trying to store/use a huge partitioned table with 1000+ columns in Hive. I would like to know how to solve this problem using either Hive or any other store. Requirement: 1) There is a table with around 1000+ columns which is partitioned by date. 2) Ev

Random failure in HIVE tez engine

2019-03-28 Thread Saurabh Mishra
ucceed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 Regards Saurabh Mishra

Unsigned Data Type Support

2015-07-23 Thread saurabh
. Regards, Saurabh

Hive With tez

2015-07-05 Thread saurabh
vast topic and cannot be described, however some quick pointers will be helpful. I am currently working on: Query vectorization and CBO with ORC tables. Thanks, Saurabh

Date Functions in Hive

2015-06-23 Thread saurabh
ase let me know if any more information is required on the same. Thanks, Saurabh

Re: Which [open-souce] SQL engine atop Hadoop?

2015-02-02 Thread Saurabh B
This is not open source but we are using Vertica and it works very nicely for us. There is a 1TB community edition but above that it costs money. It has really advanced SQL (analytical functions, etc), works like an RDBMS, has R/Java/C++ SDK and scales nicely. There is a similar option of Redshift

Executing Hive Queries in Parallel

2014-04-21 Thread saurabh
Hi, I need some input on executing Hive queries in parallel. I tried doing this using the CLI (by opening multiple SSH connections) and executed 4 HQLs; it was observed that the queries are getting executed sequentially. All four queries got submitted; however, while the first one was in execution mod

Re: Converting from textfile to sequencefile using Hive

2013-09-30 Thread Saurabh B
ividual text documents), but it does > get through all the mechanics of exactly what you state you want. > > The meetup page also has links to video, if the slides don't give enough > context. > > HTH > > [1]: http://www.meetup.com/Data-Science-MD/events/111081282/

Re: Converting from textfile to sequencefile using Hive

2013-09-30 Thread Saurabh B
Hi Nitin, No offense taken. Thank you for your response. Part of this is also trying to find the right tool for the job. I am doing queries to determine the cuts of tweets that I want, then doing some modest normalization (through a python script) and then I want to create sequenceFiles from that

Converting from textfile to sequencefile using Hive

2013-09-30 Thread Saurabh B
Hi, I have a lot of tweets saved as text. I created an external table on top of it to access it as textfile. I need to convert these to sequencefiles with each tweet as its own record. To do this, I created another table as a sequencefile table like so - CREATE EXTERNAL TABLE tweetseq( tweet ST

Converting from textfile to sequencefile using Hive

2013-09-30 Thread Saurabh Bhatnagar (Business Intelligence)
Hi, I have a lot of tweets saved as text. I created an external table on top of it to access it as textfile. I need to convert these to sequencefiles with each tweet as its own record. To do this, I created another table as a sequencefile table like so - CREATE EXTERNAL TABLE tweetseq( tweet ST

Converting from textfile to sequencefile using Hive

2013-09-29 Thread Saurabh Bhatnagar (Business Intelligence)
Hi, I have a lot of tweets saved as text. I created an external table on top of it to access it as textfile. I need to convert these to sequencefiles with each tweet as its own record. To do this, I created another table as a sequencefile table like so - CREATE EXTERNAL TABLE tweetseq( tweet ST

Re: Semantics of Rank.

2013-07-26 Thread saurabh
Hi all, Below are some observations based on the ongoing rank function discussion. 1. I executed the queries mentioned below and only the query with "rank" (lowercase) executed successfully; the rest threw the exception "FAILED: SemanticException Failed to breakup Windowing invocations into Gro

Re: Question regarding external table and csv in NFS

2013-07-17 Thread Saurabh M
/h/tpc-h-impala/data/supplier.tbl'; I assume that "supplier.tbl" is a directory and the csv file is present in the same. Let me know if it worked! Thanks, Saurabh On Thu, Jul 18, 2013 at 1:55 AM, Mainak Ghosh wrote: > Hello, > > I have just started using Hive and I w

RE: Connecting to Hive from R through JDBC

2013-05-08 Thread Saurabh S
ce.com To: user@hive.apache.org Subject: Re: Connecting to Hive from R through JDBC Date: Wed, 8 May 2013 00:27:35 + Hi Saurabh The usual suspect looks like hive-server service is not running on server where hive is installed….The hive-server service needs to be installed and started….It

Connecting to Hive from R through JDBC

2013-05-07 Thread Saurabh S
river through following command:drv <- JDBC('org.apache.hadoop.hive.jdbc.HiveDriver', 'C:/Users/Saurabh/Documents/RWork/hive-jdbc-0.9.0-cdh4.1.2.jar') But when I try to make the connection using the following command:conn <- dbConnect(drv, 'jdbc:hi

Give Custom Username to Hive Output Files

2012-12-11 Thread Saurabh Mishra
Hi, When I try to insert some data into a Hive table mapped to a specific location in HDFS, the file which gets created has user information as 'hive' and permissions as '755', i.e. 'rwxr-xr-x'. Is there any way to change this so that I can give my own username or at least the user from where i h

Hive UDAF Limitation on Internally used Collections

2012-10-20 Thread Saurabh Mishra
insensitively * * @author Saurabh */ public class UDAFCaseInsensitiveDistinctMerge extends UDAF { /** * Default Separator Defined and used unless overriden. */ private static final String DEFAULT_SEPARATOR = ";"; /** * Nested Class to Store the Updated Set of Unique E

RE: Hive Query Unable to distribute load evenly in reducers

2012-10-18 Thread Saurabh Mishra
; then this configuration i am already using, but to no avail...:( Date: Tue, 16 Oct 2012 14:17:47 +0900 Subject: Re: Hive Query Unable to distribute load evenly in reducers From: navis@nexr.com To: user@hive.apache.org How about using MapJoin? 2012/10/16 Saurabh Mishra no there is

RE: Hive Query Unable to distribute load evenly in reducers

2012-10-15 Thread Saurabh Mishra
@hive.apache.org How about using MapJoin? 2012/10/16 Saurabh Mishra no there is apparently no heavy skewing. also another stats i wanted to point was, following is approximate table contents in this 4 table join query : tableA : 170 million (actual number, + i am also exploding these records, so the

RE: Hive Query Unable to distribute load evenly in reducers

2012-10-15 Thread Saurabh Mishra
Query Unable to distribute load evenly in reducers > From: philip.j.trom...@gmail.com > To: user@hive.apache.org > > Is your data heavily skewed towards certain values of a.x etc? > > On 15 October 2012 15:23, Saurabh Mishra > wrote: > > The queries are simple joins, somet

RE: Hive Query Unable to distribute load evenly in reducers

2012-10-15 Thread Saurabh Mishra
> To: user@hive.apache.org > > And your queries were? > > On Mon, Oct 15, 2012 at 8:09 PM, Saurabh Mishra > wrote: > > Hi, > > I am firing some hive queries joining tables containing up to 30 million > > records each. Since the load on the reducers is very significa

Hive Query Unable to distribute load evenly in reducers

2012-10-15 Thread Saurabh Mishra
any way to overcome this load distribution disparity. Any help in this regard will be highly appreciated. Sincerely, Saurabh Mishra

Custom UDF in Python?

2012-06-05 Thread Saurabh S
Is it possible to write Hive UDFs in Python? I googled but didn't find anything. I would be happy with RTFM replies if you can give link to the manual.

RE: 'set cli header' throws null pointer exception

2012-06-01 Thread Saurabh S
l pointer exception > To: user@hive.apache.org > > Which version of Hive are you running? > > On Fri, Jun 1, 2012 at 3:49 PM, Saurabh S wrote: > > > > Well it seems that simply moving the set header statement after the 'c

RE: 'set cli header' throws null pointer exception

2012-06-01 Thread Saurabh S
ver.java:490) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.util.RunJar.main(RunJar.java:197) > -- > > Any idea what's going on? > > Regards, > Saurabh >

'set cli header' throws null pointer exception

2012-06-01 Thread Saurabh S
ssorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:197) -- Any idea what's going on? Regards, Saurabh

Hive equivalent of group_concat() ?

2012-05-11 Thread Saurabh S
As far as I understand, there is no equivalent of MySQL group_concat() in Hive. This stackoverflow question is from Sept 2010: http://stackoverflow.com/questions/3703740/combine-multiple-rows-into-one-space-separated-string Does anyone know any other method to create a delimited list from

RE: Passing date as hive configuration variable

2012-05-10 Thread Saurabh S
Whew, thanks everyone! I think wrapping quotes around that did it. Nicole, I was going to attempt that as a last resort. But the actual query is much longer and it would be extremely undesirable to do so. Regards, Saurabh > From: nicole

Passing date as hive configuration variable

2012-05-10 Thread Saurabh S
nd "select ${hiveconf:ref_date} from dummytable limit 1" produces "1999". I noticed that there is an option to "set hive.variable.substitute=false;", but in that case, Hive throws the following error: FAILED: Parse Error: line 3:7 cannot recognize input near '$

Get current date in hive

2012-04-25 Thread Saurabh S
Hi, How do I get the current date in Hive? Specifically, I’m looking for the equivalent of following SQL where clause: where LOCAL_DT >= current date - 3 day I tried using where local_dt >= date_sub(to_date(unix_timestamp()), 3) but this method seems to be many times slower than

Hive equivalent of row_number()

2012-04-12 Thread Saurabh S
I have a table with three columns, A, B, and Score, where A and B are some items, and Score is some kind of affinity between A and B. There are N items each of A and B, so the total number of rows in the table is N^2. Is there a way to fetch "top 5 items in B" for each item in A

RE: Help in aggregating comma separated values

2012-03-28 Thread Saurabh S
after the ‘3’ but before the tab? Matt Tucker From: Saurabh S [mailto:saurab...@live.com] Sent: Wednesday, March 28, 2012 2:45 PM To: user@hive.apache.org Subject: RE: Help in aggregating comma separated values Thanks for the reply, Matt. This is exactly what I'm looking for. I'll l

RE: Help in aggregating comma separated values

2012-03-28 Thread Saurabh S
ues, ",")) values_tbl as value > GROUP BY id, value > > > > Matt Tucker > > -Original Message- > From: Saurabh S [mailto:saurab...@live.com] > Sent: Wednesday, March 28, 2012 2:21 PM > To: user@hive.apache.org > Subject: Help in aggregating comma separate

Help in aggregating comma separated values

2012-03-28 Thread Saurabh S
stion rather than one specific to Hive, but I'm at a roadblock here. Thanks, Saurabh

Length of an array

2012-03-21 Thread Saurabh S
How do I get the length of an array in Hive? Specifically, I'm looking at the following problem: I'm splitting a column using the split() function and a pattern. However, the resulting array can have variable number of entries and I want to handle each case separately.

RE: Accessing elements from array returned by split() function

2012-03-01 Thread Saurabh S
ip.j.trom...@gmail.com > To: user@hive.apache.org > > I guess that split(...)[1] is giving you what's inbetween the 1st and > 2nd '/' character, which is nothing. Try split(...)[2]. > > Phil. > > On 1 March 2012 21:19, Saurabh S wrote: > > Hello, > > >

Accessing elements from array returned by split() function

2012-03-01 Thread Saurabh S
reason, that function on my database is running extremely slow. First time posting to this list. If there is anything wrong, please let me know. Regards, Saurabh

RE: Error in running Hive with Postgresql as metastore DB

2012-01-10 Thread Saurabh Bajaj
metastore DB. Thanks! Saurabh Bajaj | Senior Business Analyst | +91 9986588089 | www.mu-sigma.com | From: Saurabh Bajaj Sent: Tuesday, January 10, 2012 2:44 PM To: 'user@hive.apache.org' Subject: Error in running Hive with Postgresql as metastore DB Hi

Error in running Hive with Postgresql as metastore DB

2012-01-10 Thread Saurabh Bajaj
y this error would be occurring. Thanks in advance! Saurabh Bajaj +91 9986588089 This email message may contain proprietary, private and confidential information. The information transmitted is intended only for the person(s) or entities to whi