Thanks for the counting solution!

Zheng, I've uploaded the S3 log parser to HIVE-693.

Among other things, I also noticed a Hive bug today: when using Hive in
server mode (via Python) to import 400 different partitions one after
another, the datanode started reporting "too many open files" errors in
its logs. Analysis showed that Hive is not closing its connections to the
datanode at all when doing loads like this:

LOAD DATA LOCAL INPATH '/home/neith/xshairlogs//mapped-2009-04-24.gz'
OVERWRITE INTO TABLE shairlogs PARTITION (pdate='2009-04-24')


[I don't know if this also happens in CLI mode - typing 300 commands like
that by hand is a bit too tedious just to check whether it can be reproduced :]
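
For reference, this is roughly what the loading loop looks like on the Python
side - just a simplified sketch (the Thrift module path depends on how the Hive
Python bindings were generated, and the dates/paths/port are only illustrative):

#!/usr/bin/env python
# Sketch of the import loop against the Hive server (Thrift) - not the exact script.
import datetime
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hive_service import ThriftHive   # module path may differ between Hive builds

# Connect to the Hive server (default port 10000).
transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 10000))
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = ThriftHive.Client(protocol)
transport.open()

# Issue one LOAD DATA per daily partition, going back ~400 days.
day = datetime.date(2009, 4, 24)
for i in range(400):
    pdate = (day - datetime.timedelta(days=i)).isoformat()
    client.execute(
        "LOAD DATA LOCAL INPATH '/home/neith/xshairlogs/mapped-%s.gz' "
        "OVERWRITE INTO TABLE shairlogs PARTITION (pdate='%s')" % (pdate, pdate))

transport.close()

Each iteration only issues a single LOAD DATA statement over the same Thrift
connection, yet after a few hundred of them the datanode starts logging the
"too many open files" errors.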


bye
andraz


From: Zheng Shao <zsh...@gmail.com>
Subject: Re: counting different regexes in a single pass
Date: Mon, 27 Jul 2009 20:46:24 GMT
Hi Andraz,

I just opened a JIRA for the AWS S3 log format.
Can you attach a patch file to https://issues.apache.org/jira/browse/HIVE-693?

For your question, I think the approach suggested by David Lerman
should work fine.

-- 
Andraz Tori, CTO
Zemanta Ltd, New York, London, Ljubljana
www.zemanta.com
mail: and...@zemanta.com
tel: +386 41 515 767
twitter: andraz, skype: minmax_test


