Re: combiner stats

2008-11-18 Thread Paco NATHAN
Thank you, Devaraj - that explanation helps a lot. Is the following reasonable to say? The "Combine input records" count shown in the Map phase column of the report measures how many times records have passed through the Combiner during merges of intermediate spills. Therefore, it may be
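The point above can be illustrated with a small plain-Java simulation (not Hadoop code, just a hypothetical sketch of the counting semantics): the combiner may see a record once per spill and again when spills are merged, so "Combine input records" can exceed the number of map output records.

```java
import java.util.*;

// Hypothetical simulation of the "Combine input records" counter: records
// pass through the combiner once per spill and again during spill merges.
public class CombineCounter {
    static long combineInputRecords = 0;

    // Combiner: sums counts per key, counting every input record it sees.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> out = new HashMap<>();
        for (Map.Entry<String, Integer> r : records) {
            combineInputRecords++;
            out.merge(r.getKey(), r.getValue(), Integer::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two spills of map output, each combined once before hitting disk.
        List<Map.Entry<String, Integer>> spill1 = List.of(
            Map.entry("a", 1), Map.entry("a", 1), Map.entry("b", 1));
        List<Map.Entry<String, Integer>> spill2 = List.of(
            Map.entry("a", 1), Map.entry("b", 1));
        Map<String, Integer> c1 = combine(spill1); // 3 combine inputs
        Map<String, Integer> c2 = combine(spill2); // 2 combine inputs

        // Merging the two spills runs the combiner again over their output.
        List<Map.Entry<String, Integer>> merged = new ArrayList<>();
        merged.addAll(c1.entrySet());
        merged.addAll(c2.entrySet());
        combine(merged); // 4 more combine inputs

        // 5 map output records, but 9 combine input records.
        System.out.println(combineInputRecords);
    }
}
```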

0.18.2 release compiled with java 6 ?

2008-11-18 Thread Johannes Zillmann
Dear Hadoop Developers, I work on a Java 5 project, and during the upgrade from 0.18.1 to 0.18.2 this error appears: ... x.java:[11,-1] cannot access org.apache.hadoop.streaming.JarBuilder bad class file: /Users/jz/.m2/repository/org/apache/hadoop/hadoop-

Re: tasktracker startup Time

2008-11-18 Thread Steve Loughran
Bhupesh Bansal wrote: Hey folks, I restarted my cluster after some node failures and saw a couple of tasktrackers not come up (they finally did after about 20 minutes). In the logs below, compare the blue timestamp to the red timestamp. I was just curious what we do while starting a tasktracker that

Re: 0.18.2 release compiled with java 6 ?

2008-11-18 Thread Alex Loddengaard
Or just run `ant jar` from $HADOOP_HOME and grab the jar (postfixed with -dev) in $HADOOP_HOME/build. Alex On Tue, Nov 18, 2008 at 6:30 AM, 柳松 [EMAIL PROTECTED] wrote: You can also rebuild the jar by compiling all the sources in the 'src' folder with your working jdk.

Re: tasktracker startup Time

2008-11-18 Thread Bhupesh Bansal
Thanks Steve, I will try kill -QUIT and report back. Best Bhupesh On 11/18/08 5:45 AM, Steve Loughran [EMAIL PROTECTED] wrote: Bhupesh Bansal wrote: Hey folks, I restarted my cluster after some node failures and saw a couple of tasktrackers not come up (they finally did after about 20

Performing a Lookup in Multiple MapFiles?

2008-11-18 Thread Dan Benjamin
I've got a Hadoop process that creates as its output a MapFile. Using one reducer this is very slow (as the map is large), but with 150 (on a cluster of 80 nodes) it runs quickly. The problem is that it produces 150 output files as well. In a subsequent process I need to perform lookups on

ipc problems after upgrading to hadoop 0.18

2008-11-18 Thread Johannes Zillmann
Hi there, I have a custom client-server setup based on Hadoop's IPC. After upgrading to Hadoop 0.18.1 (from 0.17) I get the following exception: ... Caused by: java.io.IOException: Call failed on local exception at org.apache.hadoop.ipc.Client.call(Client.java:718) at

Re: Performing a Lookup in Multiple MapFiles?

2008-11-18 Thread lohit
Hi Dan, You could do one of a few things to get around this. 1. In a subsequent step you could merge all your MapFile outputs into one file. This works if your MapFile output is small. 2. Else, you can use the same partition function that Hadoop used to find the partition ID. The partition ID can tell you
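Option 2 above can be sketched as follows, assuming the job used Hadoop's default HashPartitioner (whose formula is `(key.hashCode() & Integer.MAX_VALUE) % numPartitions`); the helper name and `part-NNNNN` file naming follow Hadoop's output convention, but treat this as an illustrative sketch rather than the definitive lookup code:

```java
// Recompute the partition ID for a key to decide which of the 150
// MapFiles (part-00000 ... part-00149) to open for the lookup.
public class MapFileRouter {
    // Mirrors the default HashPartitioner.getPartition() computation.
    static int getPartition(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Hypothetical helper: name of the output file that should hold this key.
    static String partFileFor(String key, int numPartitions) {
        return String.format("part-%05d", getPartition(key, numPartitions));
    }

    public static void main(String[] args) {
        System.out.println(partFileFor("some-key", 150));
    }
}
```

With this, a lookup only has to open one of the 150 MapFiles instead of scanning all of them.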

FileSystem.append and FSDataOutputStream.seek

2008-11-18 Thread Wasim Bari
Hello, Does anyone know when the Hadoop team plans to implement FileSystem.append(Path) and something seekable with FSDataOutputStream (meaning seek capability)? On which forum can we ask for the inclusion of such functionality? Thanks, Wasim

Re: ipc problems after upgrading to hadoop 0.18

2008-11-18 Thread Johannes Zillmann
OK, it was correct that there is an exception (this happened in an ipc-server-not-reachable test). I just missed the fact that a different exception is now thrown compared to the previous version. Previous versions threw exceptions like ConnectException or SocketTimeoutException. Now
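A common way to adapt client code to this kind of change is to inspect the wrapped cause instead of the outer exception type. This is a minimal sketch, assuming the new IOException ("Call failed on local exception") carries the original ConnectException/SocketTimeoutException as its cause:

```java
import java.io.IOException;
import java.net.ConnectException;

// Sketch: unwrap the cause of the 0.18-style IOException to recover the
// connection-level failure the pre-0.18 client used to throw directly.
public class IpcErrorHandling {
    static boolean isConnectFailure(IOException e) {
        return e.getCause() instanceof ConnectException;
    }

    public static void main(String[] args) {
        // Simulate the 0.18 behavior: wrap a ConnectException in an IOException.
        IOException wrapped = new IOException("Call failed on local exception");
        wrapped.initCause(new ConnectException("Connection refused"));
        System.out.println(isConnectFailure(wrapped)); // prints "true"
    }
}
```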

RE: Cannot run program bash: java.io.IOException: error=12, Cannot allocate memory

2008-11-18 Thread Xavier Stevens
I'm still seeing this problem on a cluster using Hadoop 0.18.2. I tried dropping the max number of map tasks per node from 8 to 7. I still get the error although it's less frequent. But I don't get the error at all when using Hadoop 0.17.2. Anyone have any suggestions? -Xavier -Original

Re: Cannot run program bash: java.io.IOException: error=12, Cannot allocate memory

2008-11-18 Thread Brian Bockelman
Hey Xavier, Don't forget, the Linux kernel reserves the memory; current heap space is disregarded. How much heap space does your data node and tasktracker get? (PS: overcommit ratio is disregarded if overcommit_memory=2). You also have to remember that there is some overhead from the

Re: What do you do with task logs?

2008-11-18 Thread Edward Capriolo
We just set up a log4j server. This takes the logs off the cluster. Plus you get all the benefits of log4j: http://timarcher.com/?q=node/10
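For reference, shipping logs to a central log4j server is typically configured with a SocketAppender. A hypothetical log4j.properties fragment (the host and port here are placeholders, not from the original message):

```properties
# Ship logs off the cluster to a central log4j SocketServer.
log4j.rootLogger=INFO, remote
log4j.appender.remote=org.apache.log4j.net.SocketAppender
log4j.appender.remote.RemoteHost=loghost.example.com
log4j.appender.remote.Port=4560
log4j.appender.remote.ReconnectionDelay=10000
```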

Re: What do you do with task logs?

2008-11-18 Thread Alex Loddengaard
You could take a look at Chukwa, which essentially collects and drops your logs to HDFS: http://wiki.apache.org/hadoop/Chukwa The last time I tried to play with Chukwa, it wasn't in a state to be played with yet. If that's still the case, then you can use Scribe to collect all of your logs in a

RE: Cannot run program bash: java.io.IOException: error=12, Cannot allocate memory

2008-11-18 Thread Xavier Stevens
1) It doesn't look like I'm out of memory but it is coming really close. 2) overcommit_memory is set to 2, overcommit_ratio = 100 As for the JVM, I am using Java 1.6. **Note of Interest**: The virtual memory I see allocated in top for each task is more than what I am specifying in the hadoop

RE: Hadoop User Group (Bay Area) Nov 19th

2008-11-18 Thread Ajay Anand
Please note that the room for this has been changed to Yahoo! Mission College Building 2, Training Rooms 5 & 6. Thanks Ajay -Original Message- From: Ajay Anand Sent: Friday, November 14, 2008 4:47 PM To: 'core-user@hadoop.apache.org'; '[EMAIL PROTECTED]'; '[EMAIL PROTECTED]'; '[EMAIL

Re: ipc problems after upgrading to hadoop 0.18

2008-11-18 Thread Raghu Angadi
Johannes Zillmann wrote: OK, it was correct that there is an exception (this happened in an ipc-server-not-reachable test). I just missed the fact that a different exception is now thrown compared to the previous version. Previous versions threw exceptions like ConnectException or

RE: Cannot run program bash: java.io.IOException: error=12, Cannot allocate memory

2008-11-18 Thread Koji Noguchi
We had a similar issue before with Secondary Namenode failing with 2008-10-09 02:00:58,288 ERROR org.apache.hadoop.dfs.NameNode.Secondary: java.io.IOException: javax.security.auth.login.LoginException: Login failed: Cannot run program whoami: java.io.IOException: error=12, Cannot allocate

Re: Cannot run program bash: java.io.IOException: error=12, Cannot allocate memory

2008-11-18 Thread Brian Bockelman
Hey Koji, Possibly won't work here (but possibly will!). When overcommit_memory is turned off, Java locks its VM memory into non-swap (this request is additionally ignored when overcommit_memory is turned on...). The problem occurs when spawning a bash process and not a JVM, so there's
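The failing code path being discussed is the JVM spawning an external process: fork() must be able to commit a copy of the JVM's virtual address space before the child exec()s, so under strict overcommit (vm.overcommit_memory=2) a large heap can make even a tiny spawn fail with error=12. A minimal, self-contained sketch of that spawn path (the command here is just `echo`, standing in for the `bash` invocation in the error):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Minimal sketch of the code path that fails: ProcessBuilder/Runtime.exec
// fork()s the JVM before exec()ing the child, and under strict overcommit
// the kernel must be able to commit a full copy of the JVM's address
// space, even though the child is a tiny program.
public class SpawnDemo {
    static String run(String... cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line = r.readLine();
            p.waitFor();
            return line;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("echo", "hello"));
    }
}
```

This is why the error is sensitive to heap size and task count per node rather than to what the spawned program itself needs.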

Re: Cannot run program bash: java.io.IOException: error=12, Cannot allocate memory

2008-11-18 Thread Edward J. Yoon
Hmm. In my experience, it often occurs on a commodity PC cluster. A small PiEstimator job also throws this error on a PC cluster. But I don't get the error at all when using Hadoop 0.17.2. Yes, I was wondering about this. :) On Wed, Nov 19, 2008 at 7:32 AM, Xavier Stevens [EMAIL PROTECTED] wrote: I'm

Re: Web Proxy to Access DataNodes

2008-11-18 Thread Karl Anderson
On 13-Nov-08, at 8:44 PM, David Ritch wrote: On Thu, Nov 13, 2008 at 7:32 PM, Alex Loddengaard [EMAIL PROTECTED] wrote: You could also have your developers setup a SOCKS proxy with the -D option to ssh. Then have them install FoxyProxy. hadoop-ec2 has a utility to make this easy:

Re: A QQ group for Chinese Hadoop learners

2008-11-18 Thread 朱韬
the group number is so sexy!

Re: Re: A QQ group for Chinese Hadoop learners

2008-11-18 Thread bacoo
what do you mean? sexy? bacoo 2008-11-19 From: 朱韬 Sent: 2008-11-19 10:36:56 To: core-user Cc: Subject: Re: A QQ group for Chinese Hadoop learners the group number is so sexy!