Re: Skip Reduce Phase

2009-02-25 Thread Jothi Padmanabhan
Sorry, this mail was intended for somebody else. Please disregard. On 2/25/09 2:33 PM, Jothi Padmanabhan joth...@yahoo-inc.com wrote: Just to clarify -- setting test.build.data on the command line to point to some arbitrary directory in /tmp should work: ant -Dtestcase=TestMapReduceLocal
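
A sketch of the full invocation being described, assuming the build's test-core target and an arbitrary /tmp directory (both are guesses, since the message is truncated):

    ant -Dtestcase=TestMapReduceLocal -Dtest.build.data=/tmp/hadoop-test-data test-core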

Re: FAILED_UNCLEAN?

2009-02-25 Thread Nathan Marz
This is on Hadoop 0.19.1. The first time I saw it happen, the job was hung: 5 map tasks were running, but looking at each task, there was only the FAILED_UNCLEAN task attempt and no other task attempts. I ran it again, the job failed immediately, and some of the tasks had

Re: [ANNOUNCE] Hadoop release 0.19.1 available

2009-02-25 Thread Aviad sela
Nigel, The SVN tag http://svn.apache.org/repos/asf/core/tags/release-0.19.1 also includes the branch folder branch-0.19, which doubles the size of the project unnecessarily. I need to use this release to apply patch HADOOP-4546 (

running hadoop in openvz vps

2009-02-25 Thread arulP
[r...@openvz2 ~]# cd /root/hadoop-0.18.2/ [r...@openvz2 hadoop-0.18.2]# bin/hadoop namenode -format Error occurred during initialization of VM Could not reserve enough space for object heap Could not create the Java virtual machine. How can I resolve this problem? I tried changing
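
One plausible fix for a memory-constrained VPS: shrink the daemon heap in conf/hadoop-env.sh before retrying the format. A sketch (the 256 MB figure is only an illustration):

    # conf/hadoop-env.sh -- HADOOP_HEAPSIZE is a plain number of MB (default 1000)
    export HADOOP_HEAPSIZE=256
    bin/hadoop namenode -format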

Re: why print this error when using MultipleOutputFormat?

2009-02-25 Thread Rasit OZDAS
Qiang, I can't find which one right now, but there is a JIRA issue about MultipleTextOutputFormat (especially when reducers = 0). If you have no reducers, you can try having one or two; then you can see if your problem is related to this one. Cheers, Rasit 2009/2/25 ma qiang maqiang1...@gmail.com
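
A quick way to test Rasit's suggestion from the command line, assuming the job is launched through ToolRunner so generic options are parsed (the jar and class names are hypothetical):

    bin/hadoop jar myjob.jar MyJob -D mapred.reduce.tasks=1 input output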

Re: Using Hadoop for near real-time processing of log data

2009-02-25 Thread Mikhail Yakshin
Hi, Is anyone using Hadoop for more of a near/almost real-time processing of log data for their systems to aggregate stats, etc? We do, although near real-time is a pretty relative term and your mileage may vary. For example, startups / shutdowns of Hadoop jobs are pretty expensive and it could

Re: OutOfMemory error processing large amounts of gz files

2009-02-25 Thread Tom White
Do you experience the problem with and without native compression? Set hadoop.native.lib to false to disable native compression. Cheers, Tom On Tue, Feb 24, 2009 at 9:40 PM, Gordon Mohr goj...@archive.org wrote: If you're doing a lot of gzip compression/decompression, you *might* be hitting
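
A sketch of Tom's suggestion as a per-job override, again assuming a ToolRunner-based job (jar and class names are hypothetical); the same property can instead be set in hadoop-site.xml:

    bin/hadoop jar myjob.jar MyJob -D hadoop.native.lib=false input output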

Re: Using Hadoop for near real-time processing of log data

2009-02-25 Thread Edward Capriolo
On Wed, Feb 25, 2009 at 1:13 PM, Mikhail Yakshin greycat.na@gmail.com wrote: Hi, Is anyone using Hadoop for more of a near/almost real-time processing of log data for their systems to aggregate stats, etc? We do, although near real-time is a pretty relative term and your mileage may

Re: Using Hadoop for near real-time processing of log data

2009-02-25 Thread Mikhail Yakshin
On Wed, Feb 25, 2009 at 10:09 PM, Edward Capriolo wrote: Is anyone using Hadoop for more of a near/almost real-time processing of log data for their systems to aggregate stats, etc? We do, although near real-time is a pretty relative term and your mileage may vary. For example, startups / shutdowns

Re: Using Hadoop for near real-time processing of log data

2009-02-25 Thread Edward Capriolo
Yeah, but what's the point of using Hadoop then? I.e., have we lost all the parallelism? Some jobs do not need it. For example, I am working with the Hive subproject. If I have a table that is less than my block size, having a large number of mappers or reducers is counterproductive. Hadoop will
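
For instance, in the Hive CLI the reducer count can be pinned before a query over a small table (a sketch; the table name is hypothetical):

    hive> set mapred.reduce.tasks=1;
    hive> SELECT COUNT(1) FROM small_table;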

Re: OutOfMemory error processing large amounts of gz files

2009-02-25 Thread bzheng
Thanks for the suggestions. I tried the hadoop.native.lib setting (both in the job config and in hadoop-site.xml plus a restart) and the problem is still there. I finally got the exception with some stack trace, and here it is: 2009-02-25 12:24:18,312 INFO org.apache.hadoop.mapred.TaskTracker:

Re: Could not reserve enough space for heap in JVM

2009-02-25 Thread Anum Ali
The solution given by Matei Zaharia won't work if you are using Eclipse 3.3.0, because this is a bug that was resolved in a later version, Eclipse 3.4 (Ganymede). Better to upgrade your Eclipse version. On 2/26/09, Matei Zaharia ma...@cloudera.com wrote: These

Re: Could not reserve enough space for heap in JVM

2009-02-25 Thread Arijit Mukherjee
I was getting similar errors too while running the mapreduce samples. I fiddled with hadoop-env.sh (where HADOOP_HEAPSIZE is specified) and the hadoop-site.xml files, and rectified it after some trial and error. But I would like to know if there is a rule of thumb for this. Right now, I've a core
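
For reference, the two knobs being fiddled with here, with illustrative values rather than a rule of thumb:

    # conf/hadoop-env.sh -- heap for the Hadoop daemons, in MB
    export HADOOP_HEAPSIZE=1000
    # the per-task child JVMs are sized separately, via mapred.child.java.opts
    # in hadoop-site.xml, e.g. a value of -Xmx512m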

Re: Could not reserve enough space for heap in JVM

2009-02-25 Thread Nick Cen
I have a question related to the HADOOP_HEAPSIZE variable. My machine's memory size is 16 GB, but when I set HADOOP_HEAPSIZE to 4 GB, it threw the exception referred to in this thread. How can I make full use of my memory? Thanks. 2009/2/26 Arijit Mukherjee ariji...@gmail.com I was getting similar errors
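
One common trip-up that may be what happened here (an assumption): HADOOP_HEAPSIZE takes a plain number of megabytes, not a string like 4GB, since bin/hadoop appends an "m" to build the -Xmx flag; also, a 32-bit JVM usually cannot reserve much more than 2-3 GB of contiguous heap:

    # conf/hadoop-env.sh -- 4 GB expressed in MB; needs a 64-bit JVM to reserve
    export HADOOP_HEAPSIZE=4096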

Re: Could not reserve enough space for heap in JVM

2009-02-25 Thread souravm
Is your machine 32-bit or 64-bit? - Original Message - From: Nick Cen cenyo...@gmail.com To: core-user@hadoop.apache.org Sent: Wed Feb 25 21:10:00 2009 Subject: Re: Could not reserve enough space for heap in JVM I have a question related to the HADOOP_HEAPSIZE

Re: Orange Labs is hosting an event about recommendation engines - March 3rd

2009-02-25 Thread Adam Rose
Hi Jeremy - I'm interested in attending this event. I'm the CTO at TubeMogul (www.tubemogul.com). We distribute video online and do in-player video analytics tracking using Hadoop. We're considering potential business avenues involving recommendation engines at some point. I haven't been