Not directly, but indirectly by doing:
set system:sun.java.command;
That will likely give you the jar name, which includes the version.
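For example, in the Hive CLI (the jar path in this output is hypothetical; yours will differ):
hive> set system:sun.java.command;
system:sun.java.command=org.apache.hadoop.util.RunJar /usr/lib/hive/lib/hive-cli-0.13.1.jar org.apache.hadoop.hive.cli.CliDriver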
On 18/02/16 08:12, Abhishek Dubey wrote:
Thanks in advance.
Warm Regards,
Abhishek Dubey
Hi Edward,
That's possibly due to using unix_timestamp (although the error message
seems misleading if that proves true). It's technically correct that it
shouldn't be flagged as deterministic, because every time you call it
you'll get a different answer as time progresses. However, the reality is I
ju
In my experience, having looked at way too many heap dumps from
HiveServer2, it always ends up being a seriously over-partitioned table
and a user who decided to do a full table scan, basically requesting all
partitions. This is often by accident, for example when using
unix_timestamp to convert dates, as in the sketch below.
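A hedged sketch of the difference (table and column names are made up):
-- likely blocks pruning: the value isn't a constant when the plan is built
select count(*) from events
where partition_date = from_unixtime(unix_timestamp(), 'yyyy-MM-dd');
-- prunes to a single partition: the literal is known at plan time
select count(*) from events
where partition_date = '2016-02-18';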
Hi Jerome,
Yes, it looks like you could stop using GET_SEMAINE and directly join
"calendrier_hebdo" with "calendrier", for example. For "FCALC_IDJOUR" you
will have to write a UDF, so I hope you have some Java skills :)
The "calendrier" tables suggest you have a star schema with a calendar
tabl
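A minimal sketch of such a UDF, assuming FCALC_IDJOUR maps a 'yyyy-MM-dd' date to a numeric day id (the class name and the logic are guesses, since the real function isn't shown in this thread):

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class FCalcIdJour extends UDF {
  public IntWritable evaluate(Text date) {
    if (date == null) return null;
    // e.g. "2013-06-28" -> 20130628
    return new IntWritable(Integer.parseInt(date.toString().replace("-", "")));
  }
}

You would then register it with something like: add jar fcalc-idjour.jar; create temporary function fcalc_idjour as 'FCalcIdJour';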
The best way to restore is from a backup. We use distcp to keep this
scalable: http://hadoop.apache.org/docs/r1.2.0/distcp2.html
The data we feed to HDFS also gets pushed to this backup, and the
metadata database from Hive also gets pushed there. So this combination works
well for us (had to use it on
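Roughly like this (hostnames and paths are placeholders):
hadoop distcp hdfs://prod-nn:8020/user/hive/warehouse hdfs://backup-nn:8020/backup/hive/warehouse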
Unfortunately the IP is stored with each partition in the metadata database.
I once did an update on the metadata for our server to replace all old
IPs with new IPs. It's not pretty, but it actually works.
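A hedged sketch of that kind of update on a MySQL metastore (SDS.LOCATION holds the location URIs; the IPs here are placeholders, and you should back up the metastore before touching it):
update SDS
set LOCATION = replace(LOCATION, 'hdfs://old-ip:9000', 'hdfs://new-ip:9000')
where LOCATION like 'hdfs://old-ip:9000%';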
On 28-6-2013 06:29, Manickam P wrote:
Hi,
What are the steps one should follow to move
Just so you know, there is still at least one bug when using Avro plus compression like
snappy:
https://issues.apache.org/jira/browse/HIVE-3308
There's a simple one-line patch, but unfortunately it hasn't been committed yet.
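For reference, this combination is typically switched on with something like the following (hedged; the exact properties honored depend on your Hive and Avro versions):
set hive.exec.compress.output=true;
set avro.output.codec=snappy;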
From: Thomas, Matthew [mailto:mtho...@verisign.com]
Sent: Monday, April 08, 2013 1:59 PM
Hi all,
I've been using Hive with snappy and Avro combined for a little while now,
compared to our older star schema setup with Hive, and wanted to share this
experience with other Hive users:
http://tech.ebuddy.com/2013/03/28/from-star-schema-to-complete-denormalization/
I realize there is more t
Well, it's probably worth knowing that 30G is really hitting rock bottom when you
talk about big data. Hadoop is linearly scalable, so going to 3 or 4
similar machines could probably get you below the MySQL time, but it's hardly a fair
comparison.
For setting it up I would suggest reading the Hadoop docs:
Generally a single Hadoop machine will perform worse than a single MySQL
machine. People normally use Hadoop when they have so much data it won't really
fit on a single machine and would require specialized hardware (stuff like
SANs) to run.
30GB of data really isn't that much, and 2GB of RAM
Perhaps it's worth posting the error. Someone might know what the error means.
Also, a bit unrelated to Hive, but please do yourself a favor and don't use float
to store monetary values like salary. You will get rounding issues at some
point when you do arithmetic on them. Considering you are usin
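A hedged illustration (table and column names made up; parameterized decimal needs a newer Hive, older versions only take plain decimal):
create table employees (
  name string,
  salary decimal(12,2)  -- exact scale; float would accumulate rounding errors
);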
From: Gabor Makrai [mailto:makrai.l...@gmail.com]
Sent: Wednesday, February 06, 2013 12:45 PM
To: 王锋; Bennie Schut
Cc: user@hive.apache.org
Subject: Re: Re:RE: Problem with Hive JDBC server
Hi guys,
Bad news for me. I checked out and compiled the Hive trunk and got the same
problem.
I attached to
Normally that would be stmt.setQueryTimeout; however, that call isn't implemented
yet. So to answer: no, there isn't.
public void setQueryTimeout(int seconds) throws SQLException {
  throw new SQLException("Method not supported");
}
You might find a parameter called "hive.stats.jdbc.timeout" b
Hi,
Just in case anyone else ever runs into this.
Lately our cluster kept killing itself with an OOM message in the kernel
log. It took me a while to realize why this happened, since no single process
was causing it.
I traced it back to a few queries running concurrently on a really small
dat
13 at 11:53 AM, Bennie Schut
<bsc...@ebuddy.com> wrote:
Since it's small, can you post the code?
From: Gabor Makrai [mailto:makrai.l...@gmail.com]
Sent: Monday, February 04, 2013 11:45 AM
To: user@hive.apache.org
Subject: Problem with Hive JDBC server
Hi guys,
I'm writing to you because I experienced a very strange problem which probably
affects all Hiv
The benefit of using the partitioned approach is really nicely described in the
O'Reilly book "Programming Hive". (Thanks for writing it, Edward.)
For me, the ability to drop a single partition if there's any doubt about the
quality of the data of just one job is a large benefit, e.g. the sketch below.
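For example (hypothetical table and partition column):
alter table fact_events drop partition (load_date='2012-11-20');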
From: Edward Caprio
but you can obviously pick your own pattern.
From: Dima Datsenko [mailto:di...@microsoft.com]
Sent: Thursday, November 22, 2012 4:07 PM
To: Bennie Schut; user@hive.apache.org
Subject: RE: Effecient partitions usage in join
Hi Benny,
The udf solution sounds like a plan. Much better tha
Unfortunately, at the moment partition pruning is a bit limited in Hive. When
Hive creates the query plan it decides which partitions to use. So if you put a
hardcoded list of partition_id items in the where clause it will know what to
do. In the case of a join (or a subquery) it would have to run t
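A hedged sketch of the difference (names made up):
-- pruned: the partition list is hardcoded, so the planner knows it
select * from sales where partition_id in (201210, 201211);
-- not pruned on Hive of this era: the matching values only arrive at run time
select s.*
from sales s
join wanted_partitions w on (s.partition_id = w.partition_id);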
The JDBC driver uses Thrift, so if Thrift can't, then JDBC can't.
This can be surprisingly difficult to do. Hive can split a query into x Hadoop
jobs, and some will run in parallel while some will run in sequence.
I've used Oracle in the past (10 and 11) and I could also never find out how
long a lar
Hi Prabhu,
Be careful when going in the direction of calendar dimensions. While strictly
speaking this is a cleaner DWH design, you will for sure run into issues you
might not expect. Consider that this is roughly what you would want to do
to query a day:
select count(*)
from fact f
j
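The query is cut off in the archive; a hedged reconstruction of the shape being described (table and column names are assumptions) is:
select count(*)
from fact f
join calendar c on (f.date_id = c.date_id)
where c.calendar_date = '2013-06-28';
The catch is that the fact partitions can't be pruned at plan time, because the matching date_id values are only known once the join runs. Filtering the fact table's own partition column directly avoids that:
select count(*) from fact f where f.date_id = 20130628;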
"inner join" is simply translated to "join" they are the same thing
(HIVE-2191)
I'm guessing he means removing the join from the where part of the query
and using the "select a,b from a join b on (a.id=b.id)" syntax.
On 10/22/2011 05:05 AM, john smith wrote:
You mean select a,b from a inner join
I'll be at hadoop world. Is the hive meetup still happening?
On 08/29/2011 10:03 PM, Carl Steinbach wrote:
Hi Ed,
This is a one-time event targeted at Hadoop World attendees, though
others are welcome to attend as well.
Thanks.
Carl
On Mon, Aug 29, 2011 at 12:09 PM, Edward Capriolo wrote:
Somewhere lower in my config file I had set an incorrect LockManager; after
fixing that it now works :)
On 09/07/2011 04:02 PM, Bennie Schut wrote:
I've been trying to play with locks in hive using zookeeper but can't
find documentation on how to configure it. I now have:
hive.supports.concur
I've been trying to play with locks in hive using zookeeper but can't
find documentation on how to configure it. I now have:
<property>
  <name>hive.supports.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.zookeeper.quorum</name>
  <value>localhost</value>
</property>
But I keep getting errors like this:
11/09/07 15:47:57 ERROR exec.DDLTask: FAILED: Error in metadata:
I have a similar problem with a trunk build and a MySQL metastore.
Doing: alter table IDXS modify column DEFERRED_REBUILD boolean not null;
doesn't seem to fix it. Perhaps because MySQL converts the boolean into
a "tinyint(1)"?
Is there an easy way to make it fail with an error instead of getti
If it's 0.7 and "IOException: The system cannot find the path specified",
then you ran into HIVE-2054. It seems Carl backported it to 0.7.1, so try
that.
If it's something else please post the error.
On 05/17/2011 04:56 AM, Raghunath, Ranjith wrote:
I have followed the document outlining how to
0.1, hadoop-0.20.2).
Any help?
-amit
--------
From: Bennie Schut
To: "user@hive.apache.org"
Sent: Wed, 9 March, 2011 4:39:49 AM
Subject: hive hbase handler metadata NullPointerException
Hi All,
I was trying out
while freeing us guys to
concentrate on better things like Hadoop & Hive :-) I assumed, with the DB just
being a metadata store, that the database wouldn't be an issue, but we're
struggling a bit :-(
On 24 March 2011 15:23, Bennie Schut
<bsc...@ebuddy.com> wrote:
Sorry to go a bit off-topic, but how do you get into a situation where
SQL Server 2005 becomes a requirement for a Hive internal metastore?
I doubt many of the developers of Hive will have access to this database,
so I don't expect a lot of response on this. But hopefully someone can
prove me
So far I'm not able to reproduce this on our dev environment (only when
going live), but when trying trunk I get errors like the attached.
I'm considering making a JIRA out of it, but I'm not sure what query is
causing this.
java.io.IOException: Call to batiatus-int.ebuddy.com/10.10.0.5:9000 failed o
Hi All,
I was trying out HBase 0.89.20100924 with Hive trunk and Hadoop 0.20.2.
When I run a simple insert I get this:
java.lang.RuntimeException: Error in configuring object
at
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at
org.apache.hado
he response!
I had CLASSPATH set to include
/usr/share/java/mysql.jar
... in addition, I just copied the mysql.jar to the lib directory
of hive.
I still get the same bug.
Any other ideas?
Thanks,
-Ajo
On Wed, Mar 2, 2011 at 7:01 AM, Bennie Schut wrote:
Usually this is caused by not having the MySQL JDBC driver on the
classpath (it's not included in Hive by default).
Just put the MySQL JDBC driver in the Hive folder under "lib/", e.g.:
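Something like this (the connector jar name is a placeholder for whatever version you use):
cp mysql-connector-java-bin.jar $HIVE_HOME/lib/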
On 03/02/2011 03:15 PM, Ajo Fod wrote:
I've checked the mysql connection with a separate java file with the
same string
might help.
Sent from my iPhone
On Feb 22, 2011, at 12:46 AM, Bennie Schut wrote:
I've just set "hive.exec.reducers.bytes.per.reducer" to as low as 100k,
which caused this job to run with 999 reducers. I still have 5 tasks failing
with an OutOfMemory.
We have JVM reuse set to
s problem:
set mapred.job.reuse.jvm.num.tasks = 1;
It still puzzles me how it can run out of memory. It seems like some
of the reducers get a disproportionately large share of the work.
On 02/18/2011 10:53 AM, Bennie Schut wrote:
When we try to join two large tables some of the reducers stop with an
OutOfMemo
When we try to join two large tables some of the reducers stop with an
OutOfMemory exception.
Error: java.lang.OutOfMemoryError: Java heap space
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
at
org.apache.hadoop.mapred.ReduceTask$Re
At least on trunk, it seems that on external tables (perhaps also TextFile?)
this works for integer values but not for string values. A string will
then come back as an empty string, which you then have to find with
" where field = '' ", but I would prefer to use " where field is null ".
Not sure if
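A hedged workaround sketch (table and column names made up): either tell the SerDe that the empty string means NULL, or match both representations explicitly:
alter table ext_table set tblproperties ('serialization.null.format'='');
select * from ext_table where field is null or field = '';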
@gmail.com> wrote the following:
Seems like this works for me too!
That probably saved me a bunch of hours tracing this down through Hive and
Hadoop.
Do you know what the side effect of setting this to false would be?
Thanks!
Terje
On Fri, Jan 7, 2011 at 4:39 PM, Bennie Schut
In the past I ran into a similar problem which was actually caused by a bug in
Hadoop. Someone was nice enough to come up with a workaround for it. Perhaps
you are running into a similar problem. I also had this problem when calling
lots of "load file" commands. After adding this to the hive-s
Hi,
I just recently updated to trunk; I was lagging a few months behind. Now I'm
getting errors like: "Filtering is supported only on partition keys of type
string".
It seems some type checking was added in
org.apache.hadoop.hive.metastore.parser.ExpressionTree.java:161, which makes
sure partition
You can use a left outer join which works in all databases.
select a.value
from tablea a
left outer join tableb b on (b.value = a.value)
where b.value is null;
Databases are generally pretty good at doing joins, so this usually performs
well.
From: איל (Eyal)