Re: Newbie: error=24, Too many open files

2008-11-23 Thread tim robertson
Thank you Jeremy. I am on Mac (10.5.5) and it is 256 by default. I will change this and rerun before running on the cluster. Thanks again, Tim On Mon, Nov 24, 2008 at 8:38 AM, Jeremy Chow <[EMAIL PROTECTED]> wrote: > There is a limit on the number of files each process can open in unix/linux. The >

Re: Newbie: error=24, Too many open files

2008-11-23 Thread Jeremy Chow
There is a limit on the number of files each process can open in unix/linux. The default on Linux is 1024; you can use ulimit -n <number> to change this limit and ulimit -n to show it. Regards, Jeremy -- My research interests are distributed systems, parallel computing and b
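Jeremy's two commands can be combined into a quick sketch; the value 4096 below is illustrative, and the -S flag restricts the change to the soft limit (the hard limit, shown by ulimit -Hn, caps how far a non-root user can raise it):

```shell
# Show the current per-process open-file limit
# (Tim's Mac default was 256; Linux commonly defaults to 1024).
ulimit -n

# Raise the soft limit for this shell and any jobs launched from it.
ulimit -S -n 4096
ulimit -n
```

On Mac OS X the hard limit itself may need raising separately before a value this large is accepted.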

Re: How to integrate hadoop framework with web application

2008-11-23 Thread Alexander Aristov
Hi You may want to take a look at the Nutch project - a Hadoop-based search engine. It has a web application with Hadoop integration. As far as I remember, you should add the Hadoop libs and configuration files to the classpath and init Hadoop on startup. Alexander 2008/11/24 柳松 <[EMAIL PROTECTED]> > Dear

Re: Quickstart Docs

2008-11-23 Thread Arun C Murthy
On Nov 23, 2008, at 6:09 AM, Tim Williams wrote: The Quickstart[1] suggests the minimum Java version is 1.5.x, but I was only successful getting the examples running after using 1.6. Thanks, --tim [1] - http://hadoop.apache.org/core/docs/current/quickstart.html Thanks for pointing this

Re: Newbie: error=24, Too many open files

2008-11-23 Thread Amareshwari Sriramadasu
tim robertson wrote: Hi all, I am running an MR job that scans 130M records and then tries to group them into around 64,000 files. The Map does the grouping of the record by determining the key, and then I use a MultipleTextOutputFormat to write the file based on the key: @Override

Newbie: error=24, Too many open files

2008-11-23 Thread tim robertson
Hi all, I am running an MR job that scans 130M records and then tries to group them into around 64,000 files. The Map does the grouping of the record by determining the key, and then I use a MultipleTextOutputFormat to write the file based on the key: @Override protected String g

Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:

2008-11-23 Thread Saju K K
This is in reference to the sample application in the JavaWorld article http://www.javaworld.com/javaworld/jw-09-2008/jw-09-hadoop.html?page=5 bin/hadoop dfs -mkdir /opt/www/hadoop/hadoop-0.18.2/words bin/hadoop dfs -put word1 /opt/www/hadoop/hadoop-0.18.2/words bin/hadoop dfs -put word2 /opt/www/hado

Re: How to integrate hadoop framework with web application

2008-11-23 Thread 柳松
Dear 晋光峰: Glad to see another Chinese name here. It sounds possible, but could you give us a little more detail? Best Regards. On 2008-11-24 09:41:15, "晋光峰" <[EMAIL PROTECTED]> wrote: > Dear all, > Does anyone know how to integrate Hadoop into web applications? I want to > start up a Hadoop job by

Re: Hadoop+log4j

2008-11-23 Thread Scott Whitecross
Thanks Brian. So you have had luck w/ log4j? I haven't tried local mode; I will try it tonight and see how it goes for quick debugging. More so, I wanted to be able to easily log and watch events on a cluster, rather than digging through all the Hadoop logging levels. I've also read tha

Re: Hadoop+log4j

2008-11-23 Thread Brian Bockelman
Hey Scott, I see nothing wrong offhand; have you tried to run in "local" mode? It'd be quicker to debug logging problems that way, as any bad misconfigurations (I think) should get printed out to stderr. Brian On Nov 23, 2008, at 9:01 PM, Scott Whitecross wrote: Thanks Brian. I've playe

Re: Hadoop+log4j

2008-11-23 Thread Scott Whitecross
Thanks Brian. I've played w/ the log4j.properties a bit and haven't had any luck. Can you share how you've set up log4j? I am probably missing the obvious, but here is what I set up: log4j.logger.com.mycompany.hadoop=DEBUG,DX,console log4j.appender.DX=org.apache.log4j.DailyRollingFileAppend
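Scott's appender definition is truncated above; a complete DailyRollingFileAppender stanza of the shape he describes (the file path and layout pattern here are illustrative, not from the original mail) would look like:

```properties
# Route com.mycompany.hadoop messages to a daily-rolling file and the console.
log4j.logger.com.mycompany.hadoop=DEBUG,DX,console
log4j.appender.DX=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DX.File=/var/log/myjob/hadoop-job.log
log4j.appender.DX.DatePattern='.'yyyy-MM-dd
log4j.appender.DX.layout=org.apache.log4j.PatternLayout
log4j.appender.DX.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
```

Note the file path is resolved on each task node, so on a cluster every TaskTracker writes its own copy locally, which is why Brian suggests a central syslog server instead.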

Re: Hadoop+log4j

2008-11-23 Thread Brian Bockelman
Hey Scott, Have you tried configuring things from $HADOOP_HOME/conf/log4j.properties ? I'd just use my own logger and set up a separate syslog server. It's not an extremely elaborate setup (certainly, would quickly become a headache on a large cluster...), but it should be pretty easy to

Re: Hadoop+log4j

2008-11-23 Thread Scott Whitecross
I've looked around for a while, but it seems there isn't a way to log from Hadoop without going through the logs/userlogs/ and the 'attempt' directories? That would mean that for logging I'm restricted to writing to System.out and System.err, then collecting via scripts? Thanks. On

How to integrate hadoop framework with web application

2008-11-23 Thread 晋光峰
Dear all, Does anyone know how to integrate Hadoop into web applications? I want to start up a Hadoop job from a Java Servlet (in a web server servlet container), then get the result and send it back to the browser. Is this possible? How to connect the web server with the Hadoop framework? Please giv
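One simple route (a sketch, not the only option) is to shell out to bin/hadoop from the web tier and stream its output back. The class and method names below are hypothetical, the servlet wiring is omitted, and main() uses echo as a stand-in command so the sketch runs without a Hadoop install:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Sketch: launch a Hadoop job from a web application by running the
// bin/hadoop CLI as a child process and capturing its merged output.
// In a servlet, runJob() would be called from doPost() and the output
// written to the HTTP response instead of System.out.
public class JobLauncher {

    // Runs a command, echoes its output, and returns the exit code
    // (-1 if the process could not be started).
    static int runJob(String... command) {
        try {
            ProcessBuilder pb = new ProcessBuilder(command);
            pb.redirectErrorStream(true); // fold stderr into stdout
            Process p = pb.start();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream()))) {
                String line;
                while ((line = r.readLine()) != null) {
                    System.out.println(line);
                }
            }
            return p.waitFor();
        } catch (Exception e) {
            e.printStackTrace();
            return -1;
        }
    }

    public static void main(String[] args) {
        // A real invocation might look like (paths are illustrative):
        //   runJob("bin/hadoop", "jar", "myjob.jar", "MyJob", "in", "out");
        int exit = runJob("echo", "job finished");
        System.out.println("exit=" + exit);
    }
}
```

The alternative Alexander mentions in his reply, putting the Hadoop libs on the webapp classpath and submitting via the client API in-process, avoids the fork but ties the servlet container to the Hadoop configuration.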

Fwd: Hadoop Installation

2008-11-23 Thread Mithila Nagendra
Hey guys!! Any ideas on this one? I'm still stuck with this! I tried dropping the jar files into the lib. It still doesn't work.. The following is how the lib looks after the new files were put in: [EMAIL PROTECTED] hadoop-0.17.2.1]$ cd bin [EMAIL PROTECTED] bin]$ ls hadoop hadoop-daem

Quickstart Docs

2008-11-23 Thread Tim Williams
The Quickstart[1] suggests the minimum Java version is 1.5.x, but I was only successful getting the examples running after using 1.6. Thanks, --tim [1] - http://hadoop.apache.org/core/docs/current/quickstart.html

Re: Newbie: multiple output files

2008-11-23 Thread tim robertson
Hi Jeremy, Thank you very much! Exactly what I was looking for. Cheers, Tim On Sun, Nov 23, 2008 at 2:21 PM, Jeremy Chow <[EMAIL PROTECTED]> wrote: > Hi Tim, > > You can write a class inheriting from org.apache.hadoop.mapred.lib. > MultipleOutputFormat. Override method generateFileNameForKeyValue

Re: Newbie: multiple output files

2008-11-23 Thread Jeremy Chow
Hi Tim, You can write a class inheriting from org.apache.hadoop.mapred.lib.MultipleOutputFormat. Override the method generateFileNameForKeyValue() like this: @Override protected String generateFileNameForKeyValue(K key, V value, String name) { return name + "_" + v
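Jeremy's snippet is cut off above. As a runnable illustration of the naming rule only (the Hadoop subclassing is omitted, and the class and helper names here are hypothetical, not from the original mail), the per-key filename logic can be sketched standalone:

```java
// Standalone sketch of the logic inside generateFileNameForKeyValue().
// In an actual job this would live in a subclass of
// org.apache.hadoop.mapred.lib.MultipleTextOutputFormat; the Hadoop
// types are left out so the naming rule can run on its own.
public class KeyFileNaming {

    // Hypothetical mirror of generateFileNameForKeyValue(K key, V value,
    // String name): append the key to the default leaf name so all records
    // sharing a key land in the same per-key file.
    static String fileNameFor(String key, String leafName) {
        return leafName + "_" + key;
    }

    public static void main(String[] args) {
        // Records keyed "uk" in the first map task's output would go to
        // a file named from "part-00000" plus the key.
        System.out.println(fileNameFor("uk", "part-00000"));
    }
}
```

Note this is what ties the thread to Tim's "too many open files" report: with ~64,000 distinct keys the output format holds one writer open per generated file name, so the process file-descriptor limit has to be raised accordingly.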

Newbie: multiple output files

2008-11-23 Thread tim robertson
Hi, Can someone please point me at the best way to create multiple output files based on the key output from the Map? So I end up with no reduction, but a file per key output in the mapping phase, ideally with the key as the file name. Many thanks, Tim