Re: parse_url 0.5.0 Regression?

2010-03-09 Thread Andy Kent
On 9 Mar 2010, at 15:36, 김영우 wrote: > SELECT parse_url(url, 'QUERY', 'q') as query FROM table Thanks, this works so I assume it's the wiki that is incorrect... http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#String_Functions I don't seem to be able to edit the page so should I file a Jir
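For readers comparing the two call forms in this thread, here is a minimal Python sketch of what the three-argument form appears to do: take the query string of the URL, then pick out a single key. This is only an illustration of the semantics described above, not Hive's implementation. The function name `parse_url_sketch` and the example URL are made up, and returning `None` for the 'QUERY:q' form simply mirrors the NULL reported for 0.5.0.

```python
from urllib.parse import urlsplit


def parse_url_sketch(url, part, key=None):
    """Illustrative stand-in for parse_url(url, part[, key]); not Hive code."""
    pieces = urlsplit(url)
    if part == "HOST":
        return pieces.hostname
    if part == "QUERY":
        query = pieces.query or None
        if key is None or query is None:
            return query
        # Walk the key=value pairs and return the first value for `key`.
        for pair in query.split("&"):
            name, sep, value = pair.partition("=")
            if sep and name == key:
                return value
        return None
    # Any other part name (including the old 'QUERY:q' spelling) is not
    # handled in this sketch, so it yields None, like the NULL seen in 0.5.0.
    return None


url = "http://example.com/search?q=hive&page=2"  # hypothetical URL
print(parse_url_sketch(url, "QUERY", "q"))
print(parse_url_sketch(url, "QUERY:q"))
```

The first call picks out the value of `q`; the second falls through to `None`, which is the behaviour the original question describes.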

parse_url 0.5.0 Regression?

2010-03-09 Thread Andy Kent
We have a query like: SELECT parse_url(url, 'QUERY:q') as query FROM table. I'm sure this used to work in Hive 0.4.1, and the docs suggest that 'QUERY:key' should extract the relevant key from a query string, but it always seems to return NULL in 0.5.0. Is this a regression? Or maybe the parse_ur

IN() Operator

2010-02-19 Thread Andy Kent
I couldn't find anything on the wiki, so I thought I would try here. Does Hive have an IN() operator similar to the one in MySQL? If not, is there an alternative way of testing for inclusion? Thanks, Andy.
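The thread listing does not show an answer, so the following is only one generic workaround, not a statement about what any Hive version supports. If IN() is unavailable, an inclusion test can be written as a chain of equality comparisons joined with OR. The sketch below builds such a predicate on the client side. The helper name `or_chain` and the column and values are hypothetical, and it deliberately refuses values containing quotes or backslashes rather than guessing at escaping rules.

```python
def or_chain(column, values):
    """Build "(col = 'a' OR col = 'b')" as a stand-in for col IN ('a', 'b')."""
    if not values:
        raise ValueError("need at least one value to test against")
    terms = []
    for value in values:
        text = str(value)
        # Keep the sketch safe: reject anything that would need escaping.
        if "'" in text or "\\" in text:
            raise ValueError("value needs escaping: %r" % text)
        terms.append("%s = '%s'" % (column, text))
    return "(" + " OR ".join(terms) + ")"


# Hypothetical column and values, for illustration only.
where = or_chain("country", ["uk", "us"])
print("SELECT * FROM table WHERE " + where)
```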

Thrift Server Error Messages

2010-02-19 Thread Andy Kent
When executing commands on the Hive command line, it gives really useful output if you have syntax errors in your query. When using the Thrift interface, I seem to only be able to get errors like 'Error code: 11'. Is there a way to get at the human-friendly error messages via the Thrift interface?

RE: Hive Server Leaking File Descriptors?

2010-02-19 Thread Andy Kent
the ultimate fix should come from > Hadoop. > We will definitely get HIVE-1181 for branch 0.5. > > Zheng > > ------ Forwarded message -- > From: Andy Kent > Date: Thu, Feb 18, 2010 at 3:17 PM > Subject: Re: Hive Server Leaking File Descriptors? > To: "

Re: Hive Server Leaking File Descriptors?

2010-02-18 Thread Andy Kent
On 18 Feb 2010, at 20:29, "Zheng Shao" wrote: >> I've tried to look into it a bit more and it seems to happen on >> "load data >> inpath" This is in line with what we have been seeing, as we do around 200 load data statements per day and leak approximately the same number of file descriptors. Is

Re: Hive Server Leaking File Descriptors?

2010-02-15 Thread Andy Kent
I have included the output from lsof. Unfortunately, though, I only restarted the server a few minutes ago, so it doesn't really illustrate the problem. I'll try to post back with the same command tomorrow so that we can diff between the two. Thanks, Andy. java 16908 hadoop cwd DIR
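One way to diff two lsof snapshots as suggested above is to count connections per remote port in each and report which counts grew. The sketch below is a rough helper along those lines, not something from the thread. It assumes the snapshots were taken with numeric output (for example `lsof -n -P -p <pid>`), so remote endpoints appear as `->address:port`. The function names and the sample lines in the usage example are invented for illustration.

```python
import re
from collections import Counter

# Matches the remote side of a connection, e.g. "->10.0.0.2:50010".
REMOTE_PORT = re.compile(r"->\S+:(\d+)")


def count_remote_ports(lsof_output):
    """Count lines per remote port in one lsof snapshot (numeric output assumed)."""
    counts = Counter()
    for line in lsof_output.splitlines():
        match = REMOTE_PORT.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def growth(before, after):
    """Return {port: increase} for ports with more connections in `after`."""
    old = count_remote_ports(before)
    new = count_remote_ports(after)
    return {port: new[port] - old[port] for port in new if new[port] > old[port]}


# Invented sample lines; real lsof output has more columns and rows.
monday = "java 1 hadoop 10u IPv6 0t0 TCP 10.0.0.1:40001->10.0.0.2:50010 (CLOSE_WAIT)\n"
tuesday = monday + "java 1 hadoop 11u IPv6 0t0 TCP 10.0.0.1:40002->10.0.0.3:50010 (CLOSE_WAIT)\n"
print(growth(monday, tuesday))
```

A steadily growing count for port 50010, the datanode port mentioned elsewhere in this thread, would be consistent with the leak being discussed.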

Re: Hive Server Leaking File Descriptors?

2010-02-15 Thread Andy Kent
leaking > connections to 50010 and after a hive restart all is ok again. > > Andy Kent wrote: >> I can give try and give it a go. I'm not convinced though as we are working >> with CSV files and don't touch sequence files at all at the moment. >> >> We are usin

Re: Hive Server Leaking File Descriptors?

2010-01-25 Thread Andy Kent
f you're doing queries and then closing your thrift > connection before reading all results, Hive doesn't know what to do and > leaves the connection open? Once the west coast folks wake up, they might > have a better answer for you than I do. > > > On Mon, Jan 25, 2

Re: Hive Server Leaking File Descriptors?

2010-01-25 Thread Andy Kent
On 25 Jan 2010, at 13:59, Jay Booth wrote: > That's the datanode port.. if I had to guess, Hive's connecting to DFS > directly for some reason (maybe for "select *" queries?) and not finishing > their reads or closing the connections after. Thanks for the response. That's what I was suspecti

Hive Server Leaking File Descriptors?

2010-01-25 Thread Andy Kent
We use the Hive Thrift server and a Ruby client to submit queries to Hive, but we have noticed that the number of open file descriptors climbs steadily on the machine running Hive. On a cluster of 21 nodes with Hive running for around 5 days and processing around 200 queries per day we see aroun