Re: Hadoop Installation

2008-11-25 Thread Steve Loughran
Mithila Nagendra wrote: Hey Steve I deleted whatever I needed to.. still no luck.. You said that the classpath might be messed up.. Is there some way I can reset it? For the root user? What path do I set it to? Let's start with: what kind of machine is this? Windows or Linux? If Linux, which ...

Datanode log for errors

2008-11-25 Thread Taeho Kang
Hi, I have encountered some IOExceptions in the Datanode while some intermediate/temporary map-reduce data is written to HDFS. 2008-11-25 18:27:08,070 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_-460494523413678075 received exception java.io.IOException: Block blk_-460494523413678075 is valid ...

Re: Getting Reduce Output Bytes

2008-11-25 Thread Sharad Agarwal
Is there an easy way to get Reduce Output Bytes? Reduce output bytes are not available directly, but they can perhaps be inferred from the file system read/write byte counters.
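Sharad's suggestion can be wired up from the job driver by reading the job counters after completion. A minimal sketch using the old mapred API - the counter group and names here are assumptions that vary between Hadoop versions, so dump the full counter set first to confirm them:

    import org.apache.hadoop.mapred.Counters;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RunningJob;

    public class ReduceOutputBytes {
      public static void report(JobConf conf) throws Exception {
        // Run the job and fetch its counters once it has completed.
        RunningJob job = JobClient.runJob(conf);
        Counters counters = job.getCounters();

        // Assumed group/counter names (0.18/0.19-era file system counters);
        // print counters.toString() to see what your version actually exposes.
        long hdfsBytesWritten = counters.getGroup("FileSystemCounters")
                                        .getCounter("HDFS_BYTES_WRITTEN");
        System.out.println("Approx. reduce output bytes: " + hdfsBytesWritten);
      }
    }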

java.lang.OutOfMemoryError: Direct buffer memory

2008-11-25 Thread tim robertson
Hi all, I am doing a very simple Map that determines an integer value to assign to an input (1-64000). The reduction does nothing, but I then use this output formatter to put the data in a file per key. public class CellBasedOutputFormat extends MultipleTextOutputFormat { @Override ...
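Tim's class definition is cut off above. For context, a MultipleTextOutputFormat subclass that routes each record to a file named after its key typically looks like the sketch below - a guess at the shape of the code, not the poster's actual implementation:

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

    // Hypothetical reconstruction: one output file per cell id (the key).
    public class CellBasedOutputFormat extends MultipleTextOutputFormat<IntWritable, Text> {
      @Override
      protected String generateFileNameForKeyValue(IntWritable key, Text value, String name) {
        // Route the record to a file named after the cell id instead of part-NNNNN.
        return "cell_" + key.get();
      }
    }

One likely culprit for the direct-buffer OOM is the number of simultaneously open output files, since each open HDFS output stream holds its own buffers.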

Re: Getting Reduce Output Bytes

2008-11-25 Thread Paco NATHAN
Hi Lohit, Our team collects those kinds of measurements using this patch: https://issues.apache.org/jira/browse/HADOOP-4559 Some example Java code in the comments shows how to access the data, which is serialized as JSON. It looks like the "red_hdfs_bytes_written" value would give you that. ...

Re: Block placement in HDFS

2008-11-25 Thread Dhruba Borthakur
Hi Dennis, There were some discussions on this topic earlier: http://issues.apache.org/jira/browse/HADOOP-3799 Do you have any specific use-case for this feature? thanks, dhruba On Mon, Nov 24, 2008 at 10:22 PM, Owen O'Malley <[EMAIL PROTECTED]> wrote: > > On Nov 24, 2008, at 8:44 PM, Mahadev

Hadoop complex calculations

2008-11-25 Thread Chris Quach
Hi, I'm testing Hadoop to see if we could use it for complex calculations alongside the 'standard' implementation. I've set up a grid with 10 nodes, and when I run the RandomTextWriter example only 2 nodes are used as mappers, even though I specified 10 mappers. The other nodes are used for storage ...
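One thing to check in the job setup: the requested number of map tasks is only a hint, and the actual count is decided by how many input splits the InputFormat produces (the bundled examples also have their own properties controlling map count). A generic sketch with the old mapred API:

    import org.apache.hadoop.mapred.JobConf;

    public class MapCountHint {
      public static void configure(JobConf conf) {
        // Only a hint: the real number of map tasks equals the number of
        // input splits, so a small input can still yield just a couple of maps.
        conf.setNumMapTasks(10);

        // The number of reduce tasks, by contrast, is honoured exactly.
        conf.setNumReduceTasks(10);
      }
    }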

Re: Getting Reduce Output Bytes

2008-11-25 Thread Lohit
Thanks Sharad and Paco. Lohit On Nov 25, 2008, at 5:34 AM, "Paco NATHAN" <[EMAIL PROTECTED]> wrote: Hi Lohit, Our team collects those kinds of measurements using this patch: https://issues.apache.org/jira/browse/HADOOP-4559 Some example Java code in the comments shows how to access the data ...

Re: Hadoop Installation

2008-11-25 Thread Mithila Nagendra
Hey Steve, The version is: Linux enpc3740.eas.asu.edu 2.6.9-67.0.20.EL #1 Wed Jun 18 12:23:46 EDT 2008 i686 i686 i386 GNU/Linux; this is what I got when I used the command uname -a. On Tue, Nov 25, 2008 at 1:50 PM, Steve Loughran <[EMAIL PROTECTED]> wrote: > Mithila Nagendra wrote: > >> Hey Steve >

Re: Hadoop Installation

2008-11-25 Thread Steve Loughran
Mithila Nagendra wrote: Hey Steve, The version is: Linux enpc3740.eas.asu.edu 2.6.9-67.0.20.EL #1 Wed Jun 18 12:23:46 EDT 2008 i686 i686 i386 GNU/Linux; this is what I got when I used the command uname -a. On Tue, Nov 25, 2008 at 1:50 PM, Steve Loughran <[EMAIL PROTECTED]> wrote: Mithila Nagendra ...

Question about ChainMapper and ChainReducer

2008-11-25 Thread Tarandeep Singh
Hi, I would like to know how ChainMapper and ChainReducer save IO. The docs say the output of the first mapper becomes the input of the second, and so on. Does this mean the output of the first map is *not* written to HDFS, and a second map process is started that operates on the data generated by the first ...
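For reference, ChainMapper/ChainReducer (new in 0.19) run all of the chained mappers inside a single map task, so intermediate records are handed from one mapper to the next in memory rather than being written out between stages; only the final stage's output reaches the job's output. A minimal sketch with the old mapred API - the two mappers and the reducer are hypothetical placeholders:

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.hadoop.mapred.lib.ChainMapper;
    import org.apache.hadoop.mapred.lib.ChainReducer;

    public class ChainExample {

      // First stage: lower-case each input line (hypothetical example logic).
      public static class LowerCaseMap extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable key, Text value,
            OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
          out.collect(new Text(value.toString().toLowerCase()), value);
        }
      }

      // Second stage: consumes the first mapper's output directly, in memory.
      public static class TrimMap extends MapReduceBase
          implements Mapper<Text, Text, Text, Text> {
        public void map(Text key, Text value,
            OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
          out.collect(key, new Text(value.toString().trim()));
        }
      }

      // Reducer: only its output is written to the job's output path.
      public static class IdentityReduce extends MapReduceBase
          implements Reducer<Text, Text, Text, Text> {
        public void reduce(Text key, Iterator<Text> values,
            OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
          while (values.hasNext()) {
            out.collect(key, values.next());
          }
        }
      }

      public static void configureChain(JobConf job) {
        // Each addMapper() appends a stage that consumes the previous stage's
        // output within the same map task, so nothing between stages hits disk
        // or HDFS; byValue=true copies records between stages so a stage may
        // safely reuse its output objects.
        ChainMapper.addMapper(job, LowerCaseMap.class, LongWritable.class, Text.class,
            Text.class, Text.class, true, new JobConf(false));
        ChainMapper.addMapper(job, TrimMap.class, Text.class, Text.class,
            Text.class, Text.class, true, new JobConf(false));
        ChainReducer.setReducer(job, IdentityReduce.class, Text.class, Text.class,
            Text.class, Text.class, true, new JobConf(false));
      }
    }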

"Lookup" HashMap available within the Map

2008-11-25 Thread tim robertson
Hi all, If I want to have an in-memory "lookup" HashMap that is available in my Map class, where is the best place to initialise it, please? I have a shapefile with polygons, and I wish to create the polygon objects in memory on each node's JVM and have the map able to pull back the objects by ...

Re: "Lookup" HashMap available within the Map

2008-11-25 Thread Alex Loddengaard
You should use the DistributedCache: <http://www.cloudera.com/blog/2008/11/14/sending-files-to-remote-task-nodes-with-hadoop-mapreduce/> and <http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#DistributedCache> Hope this helps! Alex On Tue, Nov 25, 2008 at 11:09 AM, tim robertson ...
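A minimal sketch of registering the shapefile with the DistributedCache at job-submission time (the HDFS path is hypothetical); each task node then receives a local copy before its tasks start:

    import java.net.URI;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.mapred.JobConf;

    public class ShapefileJobSetup {
      public static void configure(JobConf conf) throws Exception {
        // Hypothetical path: the shapefile must already be in HDFS.
        DistributedCache.addCacheFile(new URI("/user/tim/shapes/regions.shp"), conf);
      }
    }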

Re: "Lookup" HashMap available within the Map

2008-11-25 Thread tim robertson
Hi, Thanks Alex - this will allow me to share the shapefile, but I need to read it, parse it and store the objects in the index "one time only per job per JVM". Is Mapper.configure() the best place to do this? E.g. will it only be called once per job? Thanks Tim On Tue, Nov 25, 2008 at 8:1

Re: "Lookup" HashMap available within the Map

2008-11-25 Thread Doug Cutting
tim robertson wrote: Thanks Alex - this will allow me to share the shapefile, but I need to read it, parse it and store the objects in the index "one time only per job per JVM". Is Mapper.configure() the best place to do this? E.g. will it only be called once per job? In 0.19, with HADOOP-

Re: "Lookup" HashMap available within the Map

2008-11-25 Thread tim robertson
Hi Doug, Thanks - it is not so much that I want to run in a single JVM - I do want a bunch of machines doing the work; it is just that I want them all to have this in-memory lookup index, configured once per job. Is there some hook somewhere that lets me trigger a read from the distributed cache, or ...

Re: Block placement in HDFS

2008-11-25 Thread Pete Wyckoff
FYI - Owen is referring to: https://issues.apache.org/jira/browse/HADOOP-2559 On 11/24/08 10:22 PM, "Owen O'Malley" <[EMAIL PROTECTED]> wrote: On Nov 24, 2008, at 8:44 PM, Mahadev Konar wrote: > Hi Dennis, > I don't think that is possible to do. No, it is not possible. > The block placement ...

Re: "Lookup" HashMap available within the Map

2008-11-25 Thread Chris K Wensel
Hey Tim, The .configure() method is what you are looking for, I believe. It is called once per task, which in the default case is once per JVM. Note that jobs are broken into parallel tasks, and each task handles a portion of the input data. So you may create your map 100 times, because there are 100 ...
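Putting Alex's and Chris's suggestions together: a sketch of a mapper that loads the cached file once per task in configure() and keeps the lookup structure in a field. The tab-separated parsing here is a placeholder for the real shapefile/polygon index:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class LookupMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      // Built once per task (once per JVM in the default, non-reused-JVM case).
      private final Map<String, String> lookup = new HashMap<String, String>();

      @Override
      public void configure(JobConf conf) {
        try {
          // Local copies of files registered with DistributedCache.addCacheFile().
          Path[] cached = DistributedCache.getLocalCacheFiles(conf);
          if (cached != null && cached.length > 0) {
            // Placeholder parsing: a real job would parse the shapefile and
            // build a polygon index here instead of reading key/value lines.
            BufferedReader in = new BufferedReader(new FileReader(cached[0].toString()));
            String line;
            while ((line = in.readLine()) != null) {
              String[] parts = line.split("\t", 2);
              if (parts.length == 2) {
                lookup.put(parts[0], parts[1]);
              }
            }
            in.close();
          }
        } catch (IOException e) {
          throw new RuntimeException("Failed to load lookup data", e);
        }
      }

      public void map(LongWritable key, Text value,
          OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
        String looked = lookup.get(value.toString());
        out.collect(value, new Text(looked == null ? "unknown" : looked));
      }
    }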

Re: "Lookup" HashMap available within the Map

2008-11-25 Thread tim robertson
Thanks Chris, I have a different test running, then will implement that. Might give Cascading a shot for what I am doing. Cheers Tim On Tue, Nov 25, 2008 at 9:24 PM, Chris K Wensel <[EMAIL PROTECTED]> wrote: > Hey Tim > > The .configure() method is what you are looking for, I believe. > ...

Re: "Lookup" HashMap available within the Map

2008-11-25 Thread Chris K Wensel
Cool. If you need a hand with Cascading stuff, feel free to ping me on the mailing list or #cascading IRC - lots of other friendly folk there already. ckw On Nov 25, 2008, at 12:35 PM, tim robertson wrote: Thanks Chris, I have a different test running, then will implement that. Might give Cascading ...

Problems running TestDFSIO to a non-default directory

2008-11-25 Thread Joel Welling
Hi Konstantin (et al.), A while ago you gave me the following trick to run TestDFSIO against an output directory other than the default - just use -Dtest.build.data=/output/dir to pass the new directory to the executable. I recall this working, but it is failing now under 0.18.1, and looking at it I ...

64 bit namenode and secondary namenode & 32 bit datanode

2008-11-25 Thread Sagar Naik
I am trying to migrate from a 32-bit JVM to a 64-bit JVM for the namenode only. *Setup*: NN - 64-bit; secondary namenode (instance 1) - 64-bit; secondary namenode (instance 2) - 32-bit; datanode - 32-bit. From the mailing list I deduced that NN 64-bit and datanode 32-bit ...

Re: 64 bit namenode and secondary namenode & 32 bit datanode

2008-11-25 Thread Allen Wittenauer
On 11/25/08 3:58 PM, "Sagar Naik" <[EMAIL PROTECTED]> wrote: > I am trying to migrate from 32 bit jvm and 64 bit for namenode only. > *setup* > NN - 64 bit > Secondary namenode (instance 1) - 64 bit > Secondary namenode (instance 2) - 32 bit > datanode- 32

Re: 64 bit namenode and secondary namenode & 32 bit datanode

2008-11-25 Thread lohit
I might be wrong, but my assumption is that running the SN in either 64- or 32-bit mode shouldn't matter. But I am curious how two instances of the secondary namenode are set up - will both of them talk to the same NN and run in parallel? What are the advantages here? Wondering if there are chances of image corruption. ...

Re: 64 bit namenode and secondary namenode & 32 bit datanode

2008-11-25 Thread Sagar Naik
lohit wrote: I might be wrong, but my assumption is that running the SN in either 64- or 32-bit mode shouldn't matter. But I am curious how two instances of the secondary namenode are set up - will both of them talk to the same NN and run in parallel? What are the advantages here? I just have multiple entries in the masters file ...

Re: 64 bit namenode and secondary namenode & 32 bit datanode

2008-11-25 Thread lohit
Well, if I think about it, image corruption might not happen, since each checkpoint initiation would have a unique number. I was just wondering what would happen in this case. Consider this scenario: Time 1 <-- SN1 asks NN for the image and edits to merge; Time 2 <-- SN2 asks NN for the image and edits to merge; Time 2 ...

"Filesystem closed" errors

2008-11-25 Thread Bryan Duxbury
I have an app that runs for a long time with no problems, but when I signal it to shut down, I get errors like this: java.io.IOException: Filesystem closed at org.apache.hadoop.dfs.DFSClient.checkOpen(DFSClient.java:196) at org.apache.hadoop.dfs.DFSClient.rename(DFSClient.java:5

Re: Block placement in HDFS

2008-11-25 Thread Hyunsik Choi
Hi All, I am trying to divide some data into partitions explicitly (like the regions of HBase). I wonder whether the following approach is the best method. For example, assuming a block size of 64MB, is the file portion corresponding to 0-63MB allocated to the first block? I have three questions as follows: Is the ...

HDFS directory listing from the Java API?

2008-11-25 Thread Shane Butler
Hi all, Can someone please guide me on how to get a directory listing of files on HDFS using the Java API (0.19.0)? Regards, Shane

Re: HDFS directory listing from the Java API?

2008-11-25 Thread lohit
You can see how 'hadoop dfs -ls' is implemented in FsShell::ls(Path src, boolean recursive, boolean printHeader) in FsShell.java. Thanks, Lohit - Original Message From: Shane Butler <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] Sent: Tuesday, November 25, 2008 8:04:48 PM Subject: HDFS directory ...
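For a purely programmatic route, FileSystem.listStatus() returns the same information FsShell prints. A minimal sketch (the directory path is hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListHdfsDir {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);  // picks up fs.default.name from the config

        // Hypothetical directory; listStatus() returns one entry per child.
        FileStatus[] entries = fs.listStatus(new Path("/user/shane/input"));
        for (FileStatus entry : entries) {
          System.out.println((entry.isDir() ? "d " : "- ")
              + entry.getLen() + "\t" + entry.getPath());
        }
      }
    }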

Re: "Filesystem closed" errors

2008-11-25 Thread David B. Ritch
Do you have speculative execution enabled? I've seen error messages like this caused by speculative execution. David Bryan Duxbury wrote: > I have an app that runs for a long time with no problems, but when I > signal it to shut down, I get errors like this: > > java.io.IOException: Filesystem closed ...

Re: "Filesystem closed" errors

2008-11-25 Thread Hong Tang
Does your code ever call fs.close()? If so, https://issues.apache.org/jira/browse/HADOOP-4655 might be relevant to your problem. On Nov 25, 2008, at 9:07 PM, David B. Ritch wrote: Do you have speculative execution enabled? I've seen error messages like this caused by speculative execution.
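The usual mechanism behind this error is that FileSystem.get() returns a JVM-wide cached instance, so a shutdown hook (or any other component) calling close() invalidates the handle every other thread is still using - which is what HADOOP-4655 discusses. A small sketch of the effect, assuming the default FileSystem caching of this era and hypothetical paths:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SharedFsClose {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Both calls return the *same* cached FileSystem instance for this URI/user.
        FileSystem fs1 = FileSystem.get(conf);
        FileSystem fs2 = FileSystem.get(conf);

        fs1.close();  // e.g. a shutdown hook closing "its" filesystem

        // Any later use of the shared handle now fails with
        // "java.io.IOException: Filesystem closed".
        fs2.rename(new Path("/tmp/a"), new Path("/tmp/b"));
      }
    }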

How to retrieve rack ID of a datanode

2008-11-25 Thread Ramya R
Hi all, I want to retrieve the rack ID of every datanode. How can I do this? I tried using getNetworkLocation() in org.apache.hadoop.hdfs.protocol.DatanodeInfo, but I am getting /default-rack as the output for all datanodes. Any advice? Thanks in advance, Ramya

Re: How to retrieve rack ID of a datanode

2008-11-25 Thread lohit
/default-rack is set when a datanode has not set its rack ID. It is up to the datanode to tell the namenode which rack it belongs to. Is your datanode doing that explicitly? -Lohit - Original Message From: Ramya R <[EMAIL PROTECTED]> To: core-user@hadoop.apache.org Sent: Tuesday, November 25, 2008 ...

Re: How to retrieve rack ID of a datanode

2008-11-25 Thread Amar Kamat
Ramya R wrote: Hi all, I want to retrieve the rack ID of every datanode. How can I do this? I tried using getNetworkLocation() in org.apache.hadoop.hdfs.protocol.DatanodeInfo, but I am getting /default-rack as the output for all datanodes. Have you set up the cluster to be rack-aware? At least ...

Re: 64 bit namenode and secondary namenode & 32 bit datanode

2008-11-25 Thread Dhruba Borthakur
The design is such that running multiple secondary namenodes should not corrupt the image (modulo any bugs). Are you seeing image corruption when this happens? You can run all or any daemons in 32-bit mode or 64-bit mode; you can mix and match. If you have many millions of files, then you might want ...

RE: How to retrieve rack ID of a datanode

2008-11-25 Thread Ramya R
Hi Lohit, I have not set the datanode to tell the namenode which rack it belongs to. Can you please tell me how to do it? Is it using setNetworkLocation()? My intention is to kill the datanodes in a given rack, so it would be useful even if I just obtain the subnet each datanode belongs to. Thanks, Ramya

Switching to HBase from HDFS

2008-11-25 Thread Shimi K
I have a system which uses HDFS to store files on multiple nodes. On each HDFS node machine I have another application which reads the local files. Until now my system worked only with files; HDFS seemed like the right solution and everything worked fine. Now I need to save additional information ...

Re: How to retrieve rack ID of a datanode

2008-11-25 Thread Yi-Kai Tsai
Hi Ramya, Set up topology.script.file.name in your hadoop-site.xml and the script. Check http://hadoop.apache.org/core/docs/current/cluster_setup.html, the Hadoop Rack Awareness section. Hi Lohit, I have not set the datanode to tell the namenode which rack it belongs to. Can you please tell me how ...
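Once the namenode side is rack-aware (topology.script.file.name set and the datanodes re-registered), the rack strings can be read back programmatically. A sketch assuming the 0.19 package layout and that the default filesystem is HDFS:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hdfs.DistributedFileSystem;
    import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

    public class PrintRackIds {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Works only when the default filesystem is HDFS.
        DistributedFileSystem dfs = (DistributedFileSystem) fs;

        // One DatanodeInfo per datanode, as reported by the namenode.
        for (DatanodeInfo node : dfs.getDataNodeStats()) {
          // Prints "/default-rack" until the cluster is configured to be rack-aware.
          System.out.println(node.getName() + " -> " + node.getNetworkLocation());
        }
      }
    }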

Re: Switching to HBase from HDFS

2008-11-25 Thread Yi-Kai Tsai
Hi Shimi, HBase (or BigTable) is a sparse, distributed, persistent multidimensional sorted map; Jim R. Wilson has an excellent article for understanding it: http://jimbojw.com/wiki/index.php?title=Understanding_HBase_and_BigTable I have a system which uses HDFS to store files on multiple nodes ...

How do we get an old version of Hadoop

2008-11-25 Thread Rashid Ahmad
Dear Friends, How do we get an old version of Hadoop? -- Regards, Rashid Ahmad

how can I decommission nodes on-the-fly?

2008-11-25 Thread Jeremy Chow
Hi list, I added a property dfs.hosts.exclude to my conf/hadoop-site.xml, then refreshed my cluster with the command bin/hadoop dfsadmin -refreshNodes. It shut down only the DataNode process, but not the TaskTracker process on each slave specified in the ...

Re: how can I decommission nodes on-the-fly?

2008-11-25 Thread Amareshwari Sriramadasu
Jeremy Chow wrote: Hi list, I added a property dfs.hosts.exclude to my conf/hadoop-site.xml, then refreshed my cluster with the command bin/hadoop dfsadmin -refreshNodes. It shut down only the DataNode process, but not the TaskTracker process on each s...