Re: How to find hive version using hive editor in hue ?

2016-02-18 Thread Bennie Schut
Not directly, but indirectly by running: set system:sun.java.command; That will likely give you the jar name, which includes the version. On 18/02/16 08:12, Abhishek Dubey wrote: Thanks in advance.. *Warm Regards,* *Abhishek Dubey*

Re: Strict mode and joins

2015-10-19 Thread Bennie Schut
Hi Edward, That's possibly due to using unix_timestamp (although the error message seems misleading if that proves true). It's technically correct that it shouldn't be flagged as deterministic, because every time you call it you'll get a different answer as time progresses. However reality is I ju

Re: HiveServer2 OOM

2015-10-12 Thread Bennie Schut
In my experience, having looked at way too many heap dumps from hiveserver2, it always ends up being a seriously over-partitioned table and a user who decided to do a full table scan, basically requesting all partitions. This often is by accident, for example when using unix_timestamp to convert date

Re: PL/SQL to HiveQL translation

2013-07-25 Thread Bennie Schut
Hi Jerome, Yes it looks like you could stop using GET_SEMAINE and directly join "calendrier_hebdo" with "calendrier" for example. For "FCALC_IDJOUR" you will have to make a udf so I hope you have some java skills :) The "calendrier" tables suggest you have a star schema with a calendar tabl

Re: which approach is better

2013-07-18 Thread Bennie Schut
The best way to restore is from a backup. We use distcp to keep this scalable : http://hadoop.apache.org/docs/r1.2.0/distcp2.html The data we feed to hdfs also gets pushed to this backup and the metadatabase from hive also gets pushed here. So this combination works well for us (had to use it on

Re: Moving hive from one server to another

2013-07-03 Thread Bennie Schut
Unfortunately the IP is stored with each partition in the metadatabase. I once did an update on the metadata for our server to replace all old IPs with new IPs. It's not pretty but that actually works. On 28-6-2013 06:29, Manickam P wrote: Hi, What are the steps one should follow to move

RE: Hive Problems Reading Avro+Snappy Data

2013-04-08 Thread Bennie Schut
Just so you know there is still at least one bug using avro+compression like snappy: https://issues.apache.org/jira/browse/HIVE-3308 There's a simple one line patch but unfortunately it's not committed yet. From: Thomas, Matthew [mailto:mtho...@verisign.com] Sent: Monday, April 08, 2013 1:59 PM

hive & starschemas.

2013-04-02 Thread Bennie Schut
Hi all, I've been using hive with snappy and avro combined for a little while now compared to our older star schema setup with hive and wanted to share this experience with other hive users: http://tech.ebuddy.com/2013/03/28/from-star-schema-to-complete-denormalization/ I realize there is more t

RE: Getting Slow Query Performance!

2013-03-12 Thread Bennie Schut
Well it's probably worth knowing that 30G is really hitting rock bottom when you talk about big data. Hadoop is linearly scalable so probably going to 3 or 4 similar machines could get you below the mysql time but it's hardly a fair comparison. Setting it up I would suggest reading the hadoop docs:

RE: Getting Slow Query Performance!

2013-03-12 Thread Bennie Schut
Generally a single hadoop machine will perform worse than a single mysql machine. People normally use hadoop when they have so much data it won't really fit on a single machine and it would require specialized hardware (stuff like SANs) to run. 30GB of data really isn't that much and 2GB of ram

RE: Accessing sub column in hive

2013-03-08 Thread Bennie Schut
Perhaps worth posting the error. Some might know what the error means. Also a bit unrelated to hive but please do yourself a favor and don't use float to store monetary values like salary. You will get rounding issues at some point in time when you do arithmetic on them. Considering you are usin
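The rounding issue mentioned in this reply can be sketched outside Hive. The following is a minimal illustration in Python, not Hive code: Python's `float` is a binary double, so the effect is of the same kind as with binary floating-point columns, and the payment amounts are made up for the example.

```python
from decimal import Decimal

# Hypothetical amounts: ten payments of 0.10 each.
payments_float = [0.10] * 10
payments_decimal = [Decimal("0.10")] * 10

# Binary floating point cannot represent 0.10 exactly, so the error accumulates.
total_float = sum(payments_float)

# Decimal arithmetic keeps the amounts exact.
total_decimal = sum(payments_decimal)

print(total_float == 1.0)                # False
print(total_decimal == Decimal("1.00"))  # True
```

Storing amounts in an exact representation (a decimal type, or integer cents) avoids this class of error.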

RE: Re:RE: Problem with Hive JDBC server

2013-02-07 Thread Bennie Schut
: Gabor Makrai [mailto:makrai.l...@gmail.com] Sent: Wednesday, February 06, 2013 12:45 PM To: 王锋; Bennie Schut Cc: user@hive.apache.org Subject: Re: Re:RE: Problem with Hive JDBC server Hi guys, Bad news for me. I checked out and compiled the Hive trunk and got the same problem. I attached to

RE: Hive JDBC driver query statement timeout.

2013-02-06 Thread Bennie Schut
Normally that would be stmt.setQueryTimeout, however that call isn't implemented yet. So to answer: no, there isn't. public void setQueryTimeout(int seconds) throws SQLException { throw new SQLException("Method not supported"); } You might find a parameter called "hive.stats.jdbc.timeout" b

Out Of Memory on localmode.

2013-02-05 Thread Bennie Schut
Hi, Just in case anyone else ever runs into this. Lately our cluster kept on killing itself with an OOM message in the kernel log. It took me a while to realize why this happened since no single process was causing it. I traced it back to a few queries running concurrently on a really small dat

RE: Problem with Hive JDBC server

2013-02-04 Thread Bennie Schut
13 at 11:53 AM, Bennie Schut <bsc...@ebuddy.com> wrote: Since it's small can you post the code? From: Gabor Makrai [mailto:makrai.l...@gmail.com] Sent: Monday, February 04, 2013 11:45 AM To: user@hive.apache.org

RE: Problem with Hive JDBC server

2013-02-04 Thread Bennie Schut
Since it's small can you post the code? From: Gabor Makrai [mailto:makrai.l...@gmail.com] Sent: Monday, February 04, 2013 11:45 AM To: user@hive.apache.org Subject: Problem with Hive JDBC server Hi guys, I'm writing you because I experienced a very strange problem which probably affects all Hiv

RE: Loading a Hive table simultaneously from 2 different sources

2013-01-24 Thread Bennie Schut
The benefit of using the partitioned approach is really nicely described in the O'Reilly book "Programming Hive". (Thanks for writing it, Edward.) For me the ability to drop a single partition if there's any doubt about the quality of the data of just one job is a large benefit. From: Edward Caprio

RE: Effecient partitions usage in join

2012-11-23 Thread Bennie Schut
cely: 2012-11-23 but you can obviously pick your own pattern. From: Dima Datsenko [mailto:di...@microsoft.com] Sent: Thursday, November 22, 2012 4:07 PM To: Bennie Schut; user@hive.apache.org Subject: RE: Effecient partitions usage in join Hi Benny, The udf solution sounds like a plan. Much better tha

RE: Effecient partitions usage in join

2012-11-22 Thread Bennie Schut
Unfortunately at the moment partition pruning is a bit limited in hive. When hive creates the query plan it decides what partitions to use. So if you put a hardcoded list of partition_id items in the where clause it will know what to do. In the case of a join (or a subquery) it would have to run t

RE: Show job progress when using JDBC to run HIVE query

2012-09-17 Thread Bennie Schut
The jdbc driver uses thrift so if thrift can't then jdbc can't. This can be surprisingly difficult to do. Hive can split a query into x hadoop jobs and some will run in parallel and some will run in sequence. I've used oracle in the past (10 and 11) and I could also never find out how long a lar

RE: Loading data into data_dim table

2012-07-25 Thread Bennie Schut
Hi Prabhu, Be careful when going into the direction of calendar dimensions. While strictly speaking this is a cleaner dwh design you will for sure run into issues you might not expect. Consider this is probably what you would want to do (roughly) to query a day: select count(*) from fact f j

Re: hive runs slowly

2011-10-24 Thread Bennie Schut
"inner join" is simply translated to "join"; they are the same thing (HIVE-2191). I'm guessing he means removing the join from the where part of the query and using the "select a,b from a join b on (a.id=b.id)" syntax. On 10/22/2011 05:05 AM, john smith wrote: You mean select a,b from a inner join
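The three ways of writing the join discussed in this thread can be compared side by side. This is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for Hive so it runs without a cluster; the tables `a` and `b`, their columns, and the sample rows are assumptions for illustration. It only shows that the forms return the same rows, not how Hive plans them.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT);
    CREATE TABLE b (id INTEGER, city TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO b VALUES (1, 'p'), (3, 'q'), (4, 'r');
""")

def rows(sql):
    """Run a query and return its rows in a stable order."""
    return sorted(conn.execute(sql).fetchall())

# Explicit join syntax, with and without the INNER keyword.
join_on = rows("SELECT a.name, b.city FROM a JOIN b ON (a.id = b.id)")
inner_join_on = rows("SELECT a.name, b.city FROM a INNER JOIN b ON (a.id = b.id)")

# Join condition placed in the WHERE clause instead.
join_in_where = rows("SELECT a.name, b.city FROM a, b WHERE a.id = b.id")

print(join_on == inner_join_on == join_in_where)  # True
```

Only ids 1 and 3 exist in both tables, so each form returns the same two rows.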

Re: Organizing a Hive Meetup for Hadoop World NYC

2011-10-12 Thread Bennie Schut
I'll be at hadoop world. Is the hive meetup still happening? On 08/29/2011 10:03 PM, Carl Steinbach wrote: Hi Ed, This is a one-time event targeted at Hadoop World attendees, though others are welcome to attend as well. Thanks. Carl On Mon, Aug 29, 2011 at 12:09 PM, Edward Capriolo mailto:e

Re: hive zookeeper locks.

2011-09-08 Thread Bennie Schut
Somewhere lower in my config file I set an incorrect LockManager so now it works :) On 09/07/2011 04:02 PM, Bennie Schut wrote: I've been trying to play with locks in hive using zookeeper but can't find documentation on how to configure it. I now have: hive.supports.concur

hive zookeeper locks.

2011-09-07 Thread Bennie Schut
I've been trying to play with locks in hive using zookeeper but can't find documentation on how to configure it. I now have: hive.supports.concurrency = true, hive.zookeeper.quorum = localhost. But I keep getting errors like this: 11/09/07 15:47:57 ERROR exec.DDLTask: FAILED: Error in metadata:

Re: Trouble creating indexes with psql metastore

2011-06-23 Thread Bennie Schut
I have a similar problem with a trunk build and a mysql metastore. Doing: alter table IDXS modify column DEFERRED_REBUILD boolean not null; Doesn't seem to fix it. Perhaps because mysql converts the boolean into a "tinyint(1)"? Is there an easy way to make it fail with an error instead of getti

Re: Hive connecting to squirrel on windows

2011-05-17 Thread Bennie Schut
If it's 0.7 and "IOException: The system cannot find the path specified" then you ran into HIVE-2054. It seems Carl backported it to 0.7.1 so try that. If it's something else please post the error. On 05/17/2011 04:56 AM, Raghunath, Ranjith wrote: I have followed the document outlining how to

Re: hive hbase handler metadata NullPointerException

2011-03-29 Thread Bennie Schut
0.1, hadoop-0.20.2). Any help? -amit -------- *From:* Bennie Schut *To:* "user@hive.apache.org" *Sent:* Wed, 9 March, 2011 4:39:49 AM *Subject:* hive hbase handler metadata NullPointerException Hi All, I was trying out

Re: Hive & MS SQL Server

2011-03-24 Thread Bennie Schut
while freeing us guys to concentrate on better things like Hadoop & Hive :-) I assumed with the DB just being a metadata store that the database wouldn’t be an issue but we're struggling a bit :-( On 24 March 2011 15:23, Bennie Schut <bsc...@ebuddy.com

Re: Hive & MS SQL Server

2011-03-24 Thread Bennie Schut
Sorry to go a bit off-topic, but how do you get into a situation where sqlserver 2005 becomes a requirement for a hive internal meta store? I doubt many of the developers of hive will have access to this database so I don't expect a lot of response on this. But hopefully someone can prove me

IOException on hadoop 0.20.2 with trunk.

2011-03-24 Thread Bennie Schut
So far I'm not able to reproduce this on our dev environment (only when going live) but when trying trunk I get errors like attached. I'm considering making a jira out of it but I'm not sure what query is causing this. java.io.IOException: Call to batiatus-int.ebuddy.com/10.10.0.5:9000 failed o

hive hbase handler metadata NullPointerException

2011-03-09 Thread Bennie Schut
Hi All, I was trying out hbase 0.89.20100924 with hive trunk with hadoop 0.20.2. When I'm running a simple insert I get this: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93) at org.apache.hado

Re: Trouble using mysql metastore

2011-03-03 Thread Bennie Schut
he response ! I had CLASSPATH set to include /usr/share/java/mysql.jar ... in addition, I just copied the mysql.jar to the lib directory of hive. I still get the same bug. Any other ideas? Thanks, -Ajo On Wed, Mar 2, 2011 at 7:01 AM, Bennie Schut mailto:b

Re: Trouble using mysql metastore

2011-03-02 Thread Bennie Schut
Usually this is caused by not having the mysql jdbc driver on the classpath (it's not included in hive by default). Just put the mysql jdbc driver in the hive folder under "lib/" On 03/02/2011 03:15 PM, Ajo Fod wrote: I've checked the mysql connection with a separate java file with the same string

Re: OutOfMemory errors on joining 2 large tables.

2011-02-23 Thread Bennie Schut
might help. Sent from my iPhone On Feb 22, 2011, at 12:46 AM, Bennie Schut wrote: I've just set the "hive.exec.reducers.bytes.per.reducer" to as low as 100k which caused this job to run with 999 reducers. I still have 5 tasks failing with an outofmemory. We have jvm reuse set to

Re: OutOfMemory errors on joining 2 large tables.

2011-02-22 Thread Bennie Schut
s problem: set mapred.job.reuse.jvm.num.tasks = 1; It's still puzzling me how it can run out of memory. It seems like some of the reducers get an unequally large share of the work. On 02/18/2011 10:53 AM, Bennie Schut wrote: When we try to join two large tables some of the reducers stop with an OutOfMemo

OutOfMemory errors on joining 2 large tables.

2011-02-18 Thread Bennie Schut
When we try to join two large tables some of the reducers stop with an OutOfMemory exception. Error: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508) at org.apache.hadoop.mapred.ReduceTask$Re

Re: what char represents NULL value in hive?

2011-02-10 Thread Bennie Schut
At least on trunk it seems that on external tables (perhaps also TextFile?) this works for integer values but not for string values. For a string it will then return as an empty string which you then have to find with " where field = '' " but I would prefer to use " where field is null ". Not sure if
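The difference between the two predicates quoted in this reply can be shown on a small table. This is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for Hive; the table `t`, the column `field`, and the rows are assumptions for illustration. It shows only the standard SQL distinction between an empty string and NULL, not how Hive reads external tables.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, field TEXT);
    INSERT INTO t VALUES (1, 'abc'), (2, ''), (3, NULL);
""")

# An empty string is a value and matches an equality test.
empty_ids = [r[0] for r in conn.execute("SELECT id FROM t WHERE field = ''")]

# NULL never matches an equality test and needs IS NULL.
null_ids = [r[0] for r in conn.execute("SELECT id FROM t WHERE field IS NULL")]

print(empty_ids)  # [2]
print(null_ids)   # [3]
```

If a missing string is loaded as an empty string rather than NULL, only the first predicate will find it, which is the inconvenience described above.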

Re: Too many open files

2011-01-07 Thread Bennie Schut
@gmail.com>> wrote the following: Seems like this works for me too! That probably saved me a bunch of hours tracing this down through hive and hadoop. Do you know what the side effect of setting this to false would be? Thanks! Terje On Fri, Jan 7, 2011 at 4:39 PM, Bennie Schut

RE: Too many open files

2011-01-06 Thread Bennie Schut
In the past I ran into a similar problem which was actually caused by a bug in hadoop. Someone was nice enough to come up with a workaround for this. Perhaps you are running into a similar problem. I also had this problem when calling lots of "load file" commands. After adding this to the hive-s

Filtering is supported only on partition keys of type string

2010-11-08 Thread Bennie Schut
Hi, I just recently updated to trunk, was lagging a few months behind. Now I'm getting errors like: "Filtering is supported only on partition keys of type string" It seems some type checking was added on org.apache.hadoop.hive.metastore.parser.ExpressionTree.java:161 which makes sure partition

RE: NOT IN query

2010-11-04 Thread Bennie Schut
You can use a left outer join, which works in all databases: select a.value from tablea a left outer join tableb b on (b.value = a.value) where b.value is null; Databases are generally pretty good at doing joins so this usually performs well. From: איל (Eyal)
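The left outer join rewrite from this reply can be tried on sample data. This is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for Hive; the table names follow the reply, while the sample rows are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (value TEXT);
    CREATE TABLE tableb (value TEXT);
    INSERT INTO tablea VALUES ('a'), ('b'), ('c');
    INSERT INTO tableb VALUES ('b');
""")

# Rows of tablea with no match in tableb: unmatched rows come back from the
# outer join with b.value set to NULL, and the WHERE clause keeps only those.
anti_join = conn.execute("""
    SELECT a.value
    FROM tablea a
    LEFT OUTER JOIN tableb b ON (b.value = a.value)
    WHERE b.value IS NULL
    ORDER BY a.value
""").fetchall()

# The NOT IN form it replaces, for comparison.
not_in = conn.execute("""
    SELECT value FROM tablea
    WHERE value NOT IN (SELECT value FROM tableb)
    ORDER BY value
""").fetchall()

print(anti_join)  # [('a',), ('c',)]
print(not_in)     # [('a',), ('c',)]
```

The two forms agree here because tableb contains no NULLs. If the subquery can return NULL, standard SQL NOT IN returns no rows at all, while the join form still returns the unmatched rows.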