namenode null pointer

2012-02-18 Thread Ben Cuthbert
All, sometimes when I start up my Hadoop I get the following error: 12/02/17 10:29:56 INFO namenode.NameNode: STARTUP_MSG: / STARTUP_MSG: Starting NameNode STARTUP_MSG: host = iMac.local/192.168.0.191 STARTUP_MSG: args = [] STARTUP_MSG:

better partitioning strategy in hive

2012-02-18 Thread rk vishu
Hello All, We have a Hive table partitioned by date and hour (330 columns). We have 5 years' worth of data for the table. Each hourly partition has around 800 MB, so there are 43,800 partitions in total, with one file per partition. When we run select count(*) from table, Hive is taking forever to submit the

Re: better partitioning strategy in hive

2012-02-18 Thread rk vishu
Hello All, We have a Hive table partitioned by date and hour (330 columns). We have 5 years' worth of data for the table. Each hourly partition has around 800 MB, so there are 43,800 partitions in total, with one file per partition. When we run select count(*) from table, Hive is taking forever to submit

Where is source code for the Hadoop Eclipse plugin?

2012-02-18 Thread Andy Doddington
Subject says it all really :-)

Re: Addendum to Hypertable vs. HBase Performance Test (w/ mslab enabled)

2012-02-18 Thread Doug Judd
Hi Edward, In the 1/2 trillion record use case that I referred to, data is streaming in from a realtime feed and needs to be online immediately, so bulk loading is not an option. In the 41 billion and 167 billion record insert tests, HBase consistently failed. We tried everything we could think

Re: Processing small xml files

2012-02-18 Thread Mohit Anchlia
On Fri, Feb 17, 2012 at 11:37 PM, Srinivas Surasani vas...@gmail.com wrote: Hi Mohit, You can use Pig for processing XML files. PiggyBank has a built-in load function to load the XML files. Also, you can specify pig.maxCombinedSplitSize and pig.splitCombination for efficient processing. I

Re: Hadoop install

2012-02-18 Thread Keith Wiley
I always use the Cloudera packages, CDH3 I think it's called... but it isn't the latest by a long shot. It's still .20. I think Hadoop is nearly at .23, although I'm not proficient in those kinds of details. I mentioned Cloudera's distribution because it falls into place pretty smoothly. For

Re: Hadoop install

2012-02-18 Thread Mohit Anchlia
Thanks. Do I have to do something special to get the Mahout XML input format and Pig with the new release of Hadoop? On Sat, Feb 18, 2012 at 6:42 AM, Tom Deutsch tdeut...@us.ibm.com wrote: Mohit - one place to start is here: http://hadoop.apache.org/common/releases.html#Download The release

Hadoop Oppurtunity

2012-02-18 Thread maheswaran
Dear Group Members, We have openings in Hadoop/Big Data for candidates with 5 to 19 years of experience, from Senior Developer to heading the Hadoop Practice across the globe, with the top IT company in India. Work location may be anywhere in India or globally. Thanks & Regards, Maheswaran A | Executive

Re: Hadoop Oppurtunity

2012-02-18 Thread larry
Hi: We are looking for someone to help install and support hadoop clusters. We are in Southern California. Thanks, Larry Lesser PSSC Labs (949) 380-7288 Tel. la...@pssclabs.com 20432 North Sea Circle Lake Forest, CA 92630

Re: Hadoop Oppurtunity

2012-02-18 Thread real great..
Could we actually create a separate mailing list for Hadoop related jobs? On Sun, Feb 19, 2012 at 11:40 AM, larry la...@pssclabs.com wrote: Hi: We are looking for someone to help install and support hadoop clusters. We are in Southern California. Thanks, Larry Lesser PSSC Labs (949)