Feb 18, 2010 at 3:17 PM
Subject: Re: Hive Server Leaking File Descriptors?
To: "hive-user@hadoop.apache.org"
On 18 Feb 2010, at 20:29, "Zheng Shao" wrote:
>> I've tried to look into it a bit more and it seems to happen on
>> "load data inpath"
>> Given that release 0.5.0 is much wanted right now, I don't think we
>> want to wait purely for 0.5.0 since the ultimate fix should come from
>> Hadoop.
>> We will definitely get HIVE-1181 for branch 0.5.
>>
>> Zheng
>>
This has made my day.
Thanks for working through this Bennie and Zheng.
Andy.
From: Bennie Schut [bsc...@ebuddy.com]
Sent: 19 February 2010 07:47
To: hive-user@hadoop.apache.org
Subject: Re: Hive Server Leaking File Descriptors?
That's some great
-- Forwarded message --
From: Andy Kent
Date: Thu, Feb 18, 2010 at 3:17 PM
Subject: Re: Hive Server Leaking File Descriptors?
To: "hive-user@hadoop.apache.org"
On 18 Feb 2010, at 20:29, "Zheng Shao" wrote:
>> I've tried to look into it a bit more and it seems to happen on
>> "load data inpath"
This is in line with what we have been seeing, as we do around 200 load
data statements per day and leak approximately the same number of file
descriptors.
I have included the output from lsof. Unfortunately, I only restarted the
server a few minutes ago, so it doesn't really illustrate the problem. I'll try
to post back with the same command tomorrow so that we can diff between the two.
Thanks, Andy.
java 16908 hadoop cwd DIR
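The before/after comparison Andy proposes could be sketched roughly like this (a sketch only: the PID, snapshot paths, and the wait between snapshots are placeholders, with `$$` standing in for the real Hive server process id):

```shell
# Snapshot the open files of the Hive server process, wait, snapshot again,
# then compare to see which descriptors accumulated. PID is a placeholder.
PID=$$   # replace with the Hive server's actual process id
lsof -p "$PID" | awk '{print $NF}' | sort > /tmp/fds_before.txt
# ... after a day of "load data inpath" statements, take a second snapshot:
lsof -p "$PID" | awk '{print $NF}' | sort > /tmp/fds_after.txt
# Lines present only in the second snapshot are the suspects:
comm -13 /tmp/fds_before.txt /tmp/fds_after.txt | uniq -c | sort -rn | head
```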
Can you go to that box, sudo as root, and do "lsof | grep 12345" where
12345 is the process id of the hive server?
We should be able to see the names of the files that are open.
Zheng
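As a sketch of that inspection (the PID is a placeholder; `/proc` gives the same count without needing lsof installed, though inspecting another user's process still requires root):

```shell
PID=$$   # placeholder: substitute the Hive server's process id
# Total open descriptors via /proc (equivalent to counting lsof lines):
ls /proc/"$PID"/fd | wc -l
# Names of the open files, grouped so a leaking path or socket stands out:
lsof -p "$PID" | awk '{print $NF}' | sort | uniq -c | sort -rn | head
```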
On Mon, Feb 15, 2010 at 7:42 AM, Andy Kent wrote:
Nope, no luck so far.
We have upped the number of file descriptors and are having to restart hive
every week or so :(
Any other suggestions would be greatly appreciated.
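For reference, checking and raising the descriptor limit Andy mentions looks roughly like this (the 8192 value is only illustrative; persistent limits usually belong in /etc/security/limits.conf rather than the shell):

```shell
# Soft and hard limits on open files for processes started from this shell:
ulimit -Sn
ulimit -Hn
# Raise the soft limit for the shell that launches the Hive server
# (cannot exceed the hard limit; the value below is just an example):
ulimit -n 8192
```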
On 15 Feb 2010, at 14:09, Bennie Schut wrote:
Did this help? I'm running into a similar problem: slowly leaking
connections to 50010, and after a Hive restart all is OK again.
Andy Kent wrote:
I can try and give it a go. I'm not convinced though, as we are working
with CSV files and don't touch sequence files at all at the moment.
We are using the Cloudera Ubuntu packages for Hadoop 0.20.1+133 and Hive 0.4.0.
On 25 Jan 2010, at 15:30, Jay Booth wrote:
Actually, we had an issue with this, it was a bug in SequenceFile where if
there were problems opening a file, it would leave a filehandle open and
never close it.
Here's the patch -- It's already fixed in 0.21/trunk, if I get some time
this week I'll submit it against 0.20.2 -- could you apply th
Yeah, I'd guess that this is a Hive issue, although it could be a
combination... maybe if you're doing queries and then closing your thrift
connection before reading all results, Hive doesn't know what to do and
leaves the connection open? Once the west coast folks wake up, they might
have a bette
On 25 Jan 2010, at 13:59, Jay Booth wrote:
> That's the datanode port.. if I had to guess, Hive's connecting to DFS
> directly for some reason (maybe for "select *" queries?) and not finishing
> their reads or closing the connections after.
Thanks for the response.
That's what I was suspecti
That's the datanode port.. if I had to guess, Hive's connecting to DFS
directly for some reason (maybe for "select *" queries?) and not finishing
their reads or closing the connections after.
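One way to test that theory is to watch established connections from the Hive box to the datanode port; a rough sketch (50010 is HDFS's default dfs.datanode.address port; the `ss` filter syntax is from iproute2):

```shell
# Count established TCP connections to datanode port 50010; if this number
# climbs with each "select *" and never falls, unclosed DFS reads are likely.
ss -Htn state established '( dport = :50010 )' | wc -l
```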
On Mon, Jan 25, 2010 at 6:13 AM, Andy Kent wrote:
We use the hive thrift server and ruby client to submit queries to hive. But we
have noticed that the number of open file descriptors climbs steadily on the
machine running hive.
On a cluster of 21 nodes with hive running for around 5 days and processing
around 200 queries per day we see aroun