Re: Hadoop Cluster Multi-datacenter
Yes! SSH connectivity is required.

On Tue, Jun 7, 2011 at 10:37 AM, wrote:
> Hello,
>
> I wanted to know if anyone has any tips or tutorials on how to install a
> hadoop cluster on multiple datacenters
>
> Do you need ssh connectivity between the nodes across these data centers?
>
> Thanks in advance for any guidance you can provide.
>
> __
> The information transmitted, including any attachments, is intended only
> for the person or entity to which it is addressed and may contain
> confidential and/or privileged material. Any review, retransmission,
> dissemination or other use of, or taking of any action in reliance upon,
> this information by persons or entities other than the intended recipient
> is prohibited, and all liability arising therefrom is disclaimed. If you
> received this in error, please contact the sender and delete the material
> from any computer. PricewaterhouseCoopers LLP is a Delaware limited
> liability partnership. This communication may come from
> PricewaterhouseCoopers LLP or one of its subsidiaries.

--
Thanks,
Shah
Re: Why inter-rack communication in mapreduce slow?
On 06/06/2011 02:40 PM, John Armstrong wrote:
> On Mon, 06 Jun 2011 09:34:56 -0400, wrote:
>> Yeah, that's a good point.
>
> In fact, it almost makes me wonder if an ideal setup is not only to have
> each of the main control daemons on their own nodes, but to put THOSE
> nodes on their own rack and keep all the data elsewhere.

I'd give them a 10Gbps connection to the main network fabric, as with any ingress/egress nodes whose aim in life is to get data into and out of the cluster. There's a lot to be said for fast nodes within the datacentre that don't host datanodes, as that way their writes get scattered everywhere, which is what you need when loading data into HDFS.

You don't need separate racks for this, just more complicated wiring.

-steve

(disclaimer: my network knowledge generally stops at Connection Refused and No Route to Host messages)
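For context on the rack placement being discussed: Hadoop learns which rack each node sits on from an admin-supplied topology script (configured via `topology.script.file.name` in 0.20-era releases), and that mapping drives both block placement and how much traffic crosses racks. A minimal sketch of such a script; the subnet-to-rack mapping below is an invented example, not from this thread:

```shell
# Hadoop invokes the topology script with one or more IPs/hostnames as
# arguments and expects one rack path per argument on stdout.
resolve_rack() {
  case "$1" in
    10.1.1.*) echo "/rack1" ;;
    10.1.2.*) echo "/rack2" ;;
    *)        echo "/default-rack" ;;
  esac
}

for host in "$@"; do
  resolve_rack "$host"
done
```

Nodes that fall through to `/default-rack` are all treated as rack-local to each other, which is why an unconfigured cluster never sees the inter-rack placement behaviour at all.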
NameNode is starting with exceptions whenever its trying to start datanodes
Hello. My NameNode starts with the following exceptions and goes into safe mode every time it tries to start the datanodes. Why? I deleted all the files in HDFS and ran it again!

2011-06-07 15:02:19,467 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ub13/162.192.100.53
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20-append-r1056497
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append-r1056491; compiled by 'stack' on Fri Jan 7 20:43:30 UTC 2011
2011-06-07 15:02:19,637 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=54310
2011-06-07 15:02:19,645 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: ub13/162.192.100.53:54310
2011-06-07 15:02:19,651 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-06-07 15:02:19,653 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-06-07 15:02:19,991 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop
2011-06-07 15:02:19,992 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2011-06-07 15:02:19,992 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-06-07 15:02:20,034 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-06-07 15:02:20,036 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-06-07 15:02:20,276 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 56
2011-06-07 15:02:20,310 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
2011-06-07 15:02:20,310 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 5718 loaded in 0 seconds.
2011-06-07 15:02:20,320 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Invalid opcode, reached end of edit log Number of transactions found 7
2011-06-07 15:02:20,321 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /usr/local/hadoop/hadoop-datastore/hadoop-hadoop/dfs/name/current/edits of size 1049092 edits # 7 loaded in 0 seconds.
2011-06-07 15:02:20,337 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 5718 saved in 0 seconds.
2011-06-07 15:02:20,784 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 5718 saved in 0 seconds.
2011-06-07 15:02:21,227 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 1482 msecs
2011-06-07 15:02:21,242 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode ON. The ratio of reported blocks 0. has not reached the threshold 0.9990. Safe mode will be turned off automatically.
2011-06-07 15:02:26,941 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-06-07 15:02:27,031 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
2011-06-07 15:02:27,033 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
2011-06-07 15:02:27,033 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50070
2011-06-07 15:02:27,033 INFO org.mortbay.log: jetty-6.1.14
2011-06-07 15:02:27,537 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50070
2011-06-07 15:02:27,538 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: 0.0.0.0:50070
2011-06-07 15:02:27,549 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-06-07 15:02:27,559 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54310: starting
2011-06-07 15:02:27,565 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54310: starting
2011-06-07 15:02:27,573 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54310: starting
2011-06-07 15:02:27,585 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54310: starting
2011-06-07 15:02:27,597 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54310: starting
2011-06-07 15:02:27,613 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 54310: starting
2011-06-07 15:02:27,621 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 54310: starting
2011-06-07 15:02:27,632 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 54310: starting
2011-06-07 15:02:27,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 54310: starting
2011-06-07 15:02:27,63
Re: Hadoop Cluster Multi-datacenter
On 06/07/2011 06:07 AM, sanjeev.ta...@us.pwc.com wrote:
> Hello,
>
> I wanted to know if anyone has any tips or tutorials on how to install a
> hadoop cluster on multiple datacenters

Nobody has come out and said they've built a single HDFS filesystem from multiple sites, primarily because the inter-site bandwidth/latency will be awful and there isn't any support for this in the topology model of Hadoop (there are some placeholders, though). You could set up an HDFS filesystem in each datacentre, and use symbolic links (or the forthcoming federation) to pull data in. There's no reason why you can't start up a job in Datacentre-1 that begins by reading some of its data from DC-2, after which all the work will be datacentre-local.

> Do you need ssh connectivity between the nodes across these data centers?

Depends on how you deploy Hadoop. You only need SSH if you use the built-in tooling; if you use large-scale cluster management tools then it's a non-issue.
Re: NameNode is starting with exceptions whenever its trying to start datanodes
On 06/07/2011 10:50 AM, praveenesh kumar wrote:

The logs say

> The ratio of reported blocks 0.9091 has not reached the threshold 0.9990.
> Safe mode will be turned off automatically.

Not enough datanodes reported in, or they are missing data.
Re: NameNode is starting with exceptions whenever its trying to start datanodes
But I don't have any data on my HDFS. I had some data before, but now I have deleted all the files from HDFS. I don't know why the datanodes are taking so long to start; I guess it's because of this exception that startup takes more time.

On Tue, Jun 7, 2011 at 3:34 PM, Steve Loughran wrote:
> On 06/07/2011 10:50 AM, praveenesh kumar wrote:
>> The logs say
>>
>>> The ratio of reported blocks 0.9091 has not reached the threshold 0.9990.
>>> Safe mode will be turned off automatically.
>
> not enough datanodes reported in, or they are missing data
Re: Running Back to Back Map-reduce jobs
Harsh J wrote:
> Yes, I believe Oozie does have Pipes and Streaming action helpers as well.

On Thu, Jun 2, 2011 at 5:05 PM, Adarsh Sharma wrote:
> Ok, is it valid for running jobs through Hadoop Pipes too?
>
> Thanks
>
> Harsh J wrote:
>> Oozie's workflow feature may be exactly what you're looking for. It can
>> also do much more than just chain jobs. Check out additional features at:
>> http://yahoo.github.com/oozie/
>
> On Thu, Jun 2, 2011 at 4:48 PM, Adarsh Sharma wrote:
>> After following the below points, I am confused about the examples used
>> in the documentation:
>> http://yahoo.github.com/oozie/releases/3.0.0/WorkflowFunctionalSpec.html#a3.2.2.3_Pipes
>>
>> What I want to achieve is to terminate a job under my control, i.e. if I
>> want to run another map-reduce job after the completion of one, it runs
>> and then terminates after my code executes. I struggled to find a simple
>> example that proves this concept. In the Oozie documentation, they are
>> just setting parameters and using them. For e.g., a simple Hadoop Pipes
>> job is executed by:
>>
>>   int main(int argc, char *argv[]) {
>>     return HadoopPipes::runTask(HadoopPipes::TemplateFactory());
>>   }
>>
>> Now if I want to run another job after this on the reduced data in HDFS,
>> how could this be possible? Do I need to add some code?
>>
>> Thanks
>>
>> Dear all,
>> I ran several map-reduce jobs in a Hadoop cluster of 4 nodes. This time I
>> want a map-reduce job to be run again after another one. For e.g., to
>> clarify my point, suppose a wordcount is run on the gutenberg file in
>> HDFS, and after completion:
>>
>> 11/06/02 15:14:35 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
>> 11/06/02 15:14:35 INFO mapred.FileInputFormat: Total input paths to process : 3
>> 11/06/02 15:14:36 INFO mapred.JobClient: Running job: job_201106021143_0030
>> 11/06/02 15:14:37 INFO mapred.JobClient: map 0% reduce 0%
>> 11/06/02 15:14:50 INFO mapred.JobClient: map 33% reduce 0%
>> 11/06/02 15:14:59 INFO mapred.JobClient: map 66% reduce 11%
>> 11/06/02 15:15:08 INFO mapred.JobClient: map 100% reduce 22%
>> 11/06/02 15:15:17 INFO mapred.JobClient: map 100% reduce 100%
>> 11/06/02 15:15:25 INFO mapred.JobClient: Job complete: job_201106021143_0030
>> 11/06/02 15:15:25 INFO mapred.JobClient: Counters: 18
>>
>> again a map-reduce job is started on the output or the original data, say:
>>
>> 11/06/02 15:14:36 INFO mapred.JobClient: Running job: job_201106021143_0030
>> 11/06/02 15:14:37 INFO mapred.JobClient: map 0% reduce 0%
>> 11/06/02 15:14:50 INFO mapred.JobClient: map 33% reduce 0%
>>
>> Is this possible, or are there any parameters to achieve it? Please guide.
>> Thanks
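For reference, the chaining Oozie provides is declared in the workflow definition itself: each action names the action to run on success, so the second job only starts after the first completes. A minimal sketch; the action names and the omitted job configuration details here are illustrative assumptions, not taken from this thread:

```xml
<workflow-app name="two-step-wf" xmlns="uri:oozie:workflow:0.1">
  <start to="first-job"/>
  <action name="first-job">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- job configuration for the first pass goes here -->
    </map-reduce>
    <!-- run the second job only after the first completes successfully -->
    <ok to="second-job"/>
    <error to="fail"/>
  </action>
  <action name="second-job">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- this pass reads the output directory of first-job -->
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Workflow failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

A Pipes job slots into the same structure via the `<pipes>` element inside the map-reduce action, per the functional spec linked above.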
RE: Hadoop Cluster Multi-datacenter
PWC now getting into Hadoop? Interesting.

Sanjeev, the simple short answer is that you don't create a cluster that spans data centers. Bad design. You build two clusters, one per data center.

> To: common-user@hadoop.apache.org
> Subject: Hadoop Cluster Multi-datacenter
> From: sanjeev.ta...@us.pwc.com
> Date: Mon, 6 Jun 2011 22:07:51 -0700
>
> Hello,
>
> I wanted to know if anyone has any tips or tutorials on how to install a
> hadoop cluster on multiple datacenters
>
> Do you need ssh connectivity between the nodes across these data centers?
>
> Thanks in advance for any guidance you can provide.
RE: Why inter-rack communication in mapreduce slow?
We've looked at the network problem for the past two years. 10GbE is now a reality; even a year ago prices were still at a premium. Because you now have 10GbE and you have 2TB drives at a price sweet spot, you really need to sit down and think out your cluster design. You need to look at things in terms of power usage, density, cost and vendor relationships... there's really more to this problem.

If you're looking for a simple answer: if you run 1GbE top-of-rack switches, buy a reliable switch that not only has better uplinks but allows you to bond ports to create a fatter pipe. Arista and Blade Networks (now part of IBM) are producing some interesting switches and the prices aren't too bad. (Plus they claim to play nice with Cisco switches...)

If you're looking to build out a very large cluster... you really need to take a holistic approach.

HTH
-Mike

> Date: Tue, 7 Jun 2011 10:47:25 +0100
> From: ste...@apache.org
> To: common-user@hadoop.apache.org
> Subject: Re: Why inter-rack communication in mapreduce slow?
>
> On 06/06/2011 02:40 PM, John Armstrong wrote:
>> On Mon, 06 Jun 2011 09:34:56 -0400, wrote:
>>> Yeah, that's a good point.
>>
>> In fact, it almost makes me wonder if an ideal setup is not only to have
>> each of the main control daemons on their own nodes, but to put THOSE
>> nodes on their own rack and keep all the data elsewhere.
>
> I'd give them a 10Gbps connection to the main network fabric, as with any
> ingress/egress nodes whose aim in life is to get data into and out of
> the cluster. There's a lot to be said for fast nodes within the
> datacentre but not hosting datanodes, as that way their writes get
> scattered everywhere, which is what you need when loading data into HDFS.
>
> You don't need separate racks for this, just more complicated wiring.
>
> -steve
>
> (disclaimer, my network knowledge generally stops at Connection Refused
> and No Route to Host messages)
Re: DistributedCache
John,

Not 100% clear on what you meant. Are you saying I should put the file into my HDFS cluster, or should I use DistributedCache? If you suggest the latter, could you address my original question?

Thanks for your help!
Pony

On Mon, Jun 6, 2011 at 6:27 PM, John Armstrong wrote:
> On Mon, 06 Jun 2011 16:14:14 -0500, Shi Yu wrote:
>> I still don't understand: in a cluster you have a shared directory
>> visible to all the nodes, right? Just put the configuration file in that
>> directory and load it in all the mappers; isn't that simple? So I still
>> don't understand why bother with DistributedCache; the only reason might
>> be that the shared directory is costly for the network and usually has a
>> storage limit.
>
> That's exactly the problem the DistributedCache is designed for. It
> guarantees that you only need to copy the file to any given local
> filesystem once. Using the way you suggest, if there are a hundred mappers
> on a given node they'd all need to make a local copy of the file instead
> of just making one local copy and moving it around locally from then on.
Re: Hadoop Cluster Multi-datacenter
On Jun 7, 2011, at 12:07 AM, sanjeev.ta...@us.pwc.com wrote:
> Hello,
>
> I wanted to know if anyone has any tips or tutorials on how to install a
> hadoop cluster on multiple datacenters

Generally, this is a bad idea. Why?

1) Inter-datacenter bandwidth is expensive compared to cluster bandwidth.
2) This extra topological constraint is not currently well-modeled in the Hadoop architecture. This means that you will likely find assumptions in the software that are not true in the inter-datacenter case.
3) None of the biggest users currently do this. Until you plan on putting serious money into the game, follow what is well-established to work.

I would note that, in my other life, I work with a batch-oriented distributed computing system called Condor (http://www.cs.wisc.edu/condor/). Condor is designed to naturally span the globe (I've seen it spanning around 50 clusters). However, it is batch-job oriented, not data oriented. If you had to wedge your problem to fit into the MapReduce paradigm, this might be a good alternative.

> Do you need ssh connectivity between the nodes across these data centers?

Definitely not. SSH is only used in the wrapper scripts to start the HDFS daemons. It's a usability crutch for smaller clusters that don't have proper management. If your ops folks don't have a better way to manage what is running on your cluster, fire them.

Brian
Re: DistributedCache
On Tue, 7 Jun 2011 09:41:21 -0300, "Juan P." wrote:
> Not 100% clear on what you meant. You are saying I should put the file
> into my HDFS cluster or should I use DistributedCache? If you suggest the
> latter, could you address my original question?

I mean that you can certainly get away with putting information into a known place on HDFS and loading it in each mapper or reducer, but that may become very inefficient as your problem scales up. Mostly I was responding to Shi Yu's question about why the DC is even worth using at all.

As to your question, here's how I do it, which I think I basically lifted from an example in The Definitive Guide. There may be better ways, though.

In my setup, I put files into the DC by getting Path objects (which should be able to reference either HDFS or local filesystem files, though I always have my files on HDFS to start) and using

  DistributedCache.addCacheFile(path.toUri(), conf);

Then within my mapper or reducer I retrieve all the cached files with

  Path[] cacheFiles = DistributedCache.getLocalCacheFiles(conf);

IIRC, this is what you were doing. The problem is this gets all the cached files, although they are now in a working directory on the local filesystem. Luckily, I know the filename of the file I want, so I iterate:

  for (Path cachePath : cacheFiles) {
    if (cachePath.getName().equals(cachedFilename)) {
      return cachePath;
    }
  }

Then I've got the path to the local filesystem copy of the file I want in hand and I can do whatever I want with it.

hth
error -2 (No such file or directory) when mounting fuse-dfs
Hello everyone:

I have installed hadoop 0.20 (single node) on an Ubuntu 11.04 64-bit machine. I have been successful in compiling fuse-dfs according to the MountableHDFS wiki. The fuse version that I'm using is the one bundled with Natty. When I run ./fuse_dfs_wrapper.sh dfs://localhost:9000 ../mnt -d (from HADOOP_HOME/src/contrib/fuse-dfs/src/) this is what comes up:

port=9000,server=localhost
fuse-dfs didn't recognize //mnt/,-2    (apparently just a warning; it shouldn't matter)
fuse-dfs ignoring option -d            (apparently just a warning; it shouldn't matter)

Then a bunch of stuff like this:

unique: 2, opcode: GETATTR (3), nodeid: 1, insize: 56
getattr /
unique: 3, opcode: GETATTR (3), nodeid: 1, insize: 56
getattr /

which seems correct. Also some of this:

unique: 11, opcode: LOOKUP (1), nodeid: 1, insize: 52
LOOKUP /autorun.inf
getattr /autorun.inf
unique: 11, error: -2 (No such file or directory), outsize: 16

which it seems to overcome somehow, because it doesn't get stuck or exit. Until this one:

unique: 147, opcode: LOOKUP (1), nodeid: 1, insize: 52
LOOKUP /autorun.inf
getattr /autorun.inf
unique: 147, error: -2 (No such file or directory), outsize: 16

It just gets stuck and I have to Ctrl+C. I haven't been able to go any further. The aspect of the half-mounted directory is:

-rw-r--r--  1 * * 13366 2010-02-19 08:55 ***.txt
drwxr-xr-x  3 * *  4096 2011-06-07 15:32 logs
d?????????  ? ? ?     ?                ? mnt
-rw-r--r--  1 * *   101 2010-02-19 08:55 ***.txt
-rw-r--r--  1 * *  1366 2010-02-19 08:55 ***.txt

To try again, I have to unmount first (umount /.../mnt). If not, it says that the transport endpoint is not connected, which seems obvious because the operation didn't finish successfully.

I have done some research on the web and haven't found anything along the same lines except for:

http://www.mail-archive.com/common-user@hadoop.apache.org/msg02351.html

I did what's suggested there:

fuse_dfs -oserver=127.0.0.1 -oport=9000 /dfs -oallow_other -ordbuffer=131072

But I got the same result. Any ideas? Did it happen to somebody else?

Thank you in advance.

Elena.

--
View this message in context: http://old.nabble.com/error--2-%28No-such-file-or-directory%29-when-mounting-fuse-dfs-tp31792099p31792099.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: error -2 (No such file or directory) when mounting fuse-dfs
Hi Elena,

FUSE-DFS is extremely picky about hostnames. All of the following should have the exact same string:
- Output of "hostname" on the namenode.
- fs.default.name
- Primary reverse-DNS of the namenode's IP.

"localhost" is almost certainly not what you want.

Brian

On Jun 7, 2011, at 9:47 AM, elena.otero wrote:
> Hello everyone:
>
> I have installed hadoop 0.20 (Single node) on a ubuntu 11.04 64bit.
> I have been successful on compiling fuse-dfs according to MountableHDFS
> wiki.
> The fuse version that I'm using is the one that is bundled with Natty.
> When I run ./fuse_dfs_wrapper.sh dfs://localhost:9000 ../mnt -d (from
> HADOOP_HOME/src/contrib/fuse-dfs/src/) this is what comes up:
>
> port=9000,server=localhost
> fuse-dfs didn't recognize //mnt/,-2
> fuse-dfs ignoring option -d
>
> [...]
Automatic line number in reducer output
Hi,

I am wondering: is there any built-in function to automatically add a self-incrementing line number to reducer output (like a relational DB auto-key)? I ask because in the 0.19.2 API, I used a linecount variable incremented in the reducer, like:

  public static class Reduce extends MapReduceBase implements Reducer {
    private long linecount = 0;

    public void reduce(Text key, Iterator values,
                       OutputCollector output, Reporter reporter)
        throws IOException {
      // ... some code here ...
      linecount++;
      output.collect(new Text(Long.toString(linecount)), var);
    }
  }

However, I found that this does not work in the 0.20.2 API. If I write the code like:

  public static class Reduce extends org.apache.hadoop.mapreduce.Reducer {
    private long linecount = 0;

    public void reduce(Text key, Iterator values,
                       org.apache.hadoop.mapreduce.Reducer.Context context)
        throws IOException, InterruptedException {
      // ... some code here ...
      linecount++;
      context.write(new Text(Long.toString(linecount)), var);
    }
  }

it seems not to work anymore. I would also like to know, if there are both a combiner and a reducer implemented, how to avoid the line number being written twice (since I only want it in the reducer, not in the combiner).

Thanks!
Shi
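One caveat with any per-reducer counter, independent of the API version: each reduce task starts its own count at zero, so the emitted numbers are only unique within one task, not across the whole job. A common workaround is to give each task a disjoint numbering range derived from its task ID. The sketch below shows just that arithmetic in plain Java, outside of Hadoop; the task IDs and the range width are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

public class UniqueLineNumbers {
    // Each reduce task owns the half-open range [taskId*RANGE, (taskId+1)*RANGE).
    // RANGE must exceed the largest number of records one task can emit.
    static final long RANGE = 1_000_000_000L;

    static List<Long> numberRecords(int taskId, int recordCount) {
        List<Long> ids = new ArrayList<>();
        long next = (long) taskId * RANGE; // task-local counter starts at the range base
        for (int i = 0; i < recordCount; i++) {
            ids.add(next++);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Two "tasks" numbering three records each never collide.
        System.out.println(numberRecords(0, 3)); // [0, 1, 2]
        System.out.println(numberRecords(1, 3)); // [1000000000, 1000000001, 1000000002]
    }
}
```

Inside a real reducer the task ID would come from the job context; the resulting numbers are dense within a task but not globally consecutive. Getting truly consecutive numbers requires a single reducer or a post-processing pass.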
Re: Backing up namenode
It should automatically start writing NN data to that dir. If it's still empty, you should check the NN logs and the permissions on the backup dir.

On Mon, Jun 6, 2011 at 10:21 AM, Mark wrote:
> Thank you. I added another directory with the following configuration and
> restarted my cluster:
>
>   dfs.name.dir
>   /var/hadoop/dfs/name,/var/hadoop/dfs/name.backup
>
> However the name.backup directory is empty. Is there anything I need to
> do to tell it to "backup"?
>
> Thanks
>
> On 6/5/11 12:44 AM, sulabh choudhury wrote:
>> Hey Mark,
>>
>> If you add more than one directory (comma separated) in the variable
>> "dfs.name.dir" it would automatically be copied to all those locations.
>>
>> On Sat, Jun 4, 2011 at 10:14 AM, Mark wrote:
>>> How would I go about backing up our namenode data? I set up the
>>> secondarynamenode on a separate physical machine, but just as a backup
>>> I would like to save the namenode data.
>>>
>>> Thanks

--
Thanks and Regards,
Sulabh Choudhury
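For reference, the XML tags were stripped from Mark's quoted configuration; in hdfs-site.xml the entry would look like the following. The paths are his; in practice the second entry is usually a remote (e.g. NFS) mount so the copy survives loss of the machine:

```xml
<!-- hdfs-site.xml: the NameNode writes its image and edit log to every
     directory listed, so each entry holds a full, current copy. -->
<property>
  <name>dfs.name.dir</name>
  <value>/var/hadoop/dfs/name,/var/hadoop/dfs/name.backup</value>
</property>
```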
Setting Backup Node
Hello. Greetings to everyone.

For some time already we have been testing the HDFS filesystem (without Map/Reduce) for our cluster setup. It is mostly OK. I've encountered a FreeBSD support issue (https://issues.apache.org/jira/browse/HADOOP-7294) that I worked around by creating a custom shell-script "stat" command that mimics the Linux behaviour.

Now I've decided to look at reliability/HA options for the name node. As far as I can see, the secondary name node should be replaced either with a checkpoint node or a backup node. Unfortunately, the documentation does not show any pros or cons of these options. So, anyway, I've decided to go the backup node way. It was strange to me that the default configuration still uses the deprecated secondary name node setup, and that all the scripts support the secondary name node and not the "new way". I did the following:

1) I chose one node to be my backup node.
2) I created a conf/backup file that has my backup node's hostname.
3) I replaced the last line of the start-dfs.sh script (the line that was starting the secondary name node) with the next line:
   "$HADOOP_COMMON_HOME"/bin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts backup --script "$bin"/hdfs start namenode -backup $nameStartOpt
4) I added the dfs.http.address property to my hdfs-site.xml file. Until I did so, my backup node could not find my name node.
5) I created the name directory on the backup host.
6) I stopped the secondary name node.
7) I tried to start my backup node.

Now I am getting

java.io.FileNotFoundException: http://backup:50070/getimage?putimage=1&port=50105&machine=10.112.0.213&token=-24:1842738969:0:1307458998000:1307458455405

messages on the backup node, and

2011-06-07 15:40:51,534 WARN org.mortbay.log: /getimage: java.io.IOException: GetImage failed. java.io.IOException: Inconsistent checkpoint fields. LV = -24 namespaceID = 1842738969 cTime = 0; checkpointTime = 1307458455405. Expecting respectively: -24; 1842738969; 0; 1307458455406

on the name node. Is this OK? Will they sync after some time?

BTW: Is it correct that http://hadoop.apache.org/common/docs/current/index.html points to the 0.20 and not the 0.21 documentation?

Best regards,
Vitalii Tymchyshyn
Re: remotely downloading file
Bill, thanks for the reply. Is there a resource you have available where I can look at the correct way to connect remotely? I seem to be seeing conflicting ways of doing that. I'm looking at:

http://hadoop.apache.org/hdfs/docs/current/api/org/apache/hadoop/hdfs/DFSClient.html
http://hadoop.apache.org/hdfs/docs/current/api/org/apache/hadoop/hdfs/DistributedFileSystem.html

But the examples I'm seeing use the Configuration class, and I don't see that being used in those classes.

Thanks again,
Joe

On Fri, Jun 3, 2011 at 5:05 PM, Habermaas, William <william.haberm...@fatwire.com> wrote:
> You can access HDFS for reading and writing from other machines. The API
> works through the HDFS client which can be anywhere on the network and
> not just on the namenode. You just have to have the Hadoop core jar with
> your application wherever it is going to run.
>
> Bill
>
> -Original Message-
> From: Joe Greenawalt [mailto:joe.greenaw...@gmail.com]
> Sent: Friday, June 03, 2011 4:55 PM
> To: common-user@hadoop.apache.org
> Subject: remotely downloading file
>
> Hi,
> We're interested in using HDFS to store several large file sets to be
> available for download by our customers in the following paradigm:
>
> Customer <- | APPSERVER-CLUSTER {app1,app2,app3} | <- | HDFS |
>
> I had assumed, after browsing the documentation, that pulling the file
> from HDFS to the APPSERVER-CLUSTER could be done programmatically and
> remotely. But after reading the API, it seems that Java code interfacing
> with HDFS needs to run locally? Is that correct?
>
> If it is correct, what is the best/recommended way to deliver
> downloadables to the APPSERVERS (and vice versa), which are hosted on the
> same network but on different machines?
>
> Thanks,
> Joe
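For what it's worth, the usual client-side pattern is not to instantiate DFSClient or DistributedFileSystem directly: you put the namenode address into a Configuration and let FileSystem.get() hand back the right implementation. A minimal sketch, assuming the Hadoop core jar is on the app server's classpath; the hostname and file paths below are hypothetical, and running it needs a live cluster:

```java
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class RemoteHdfsDownload {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the remote namenode; this code can run
        // anywhere on the network, not just on a cluster node.
        conf.set("fs.default.name", "hdfs://namenode.example.com:54310");

        FileSystem fs = FileSystem.get(conf);
        InputStream in = fs.open(new Path("/data/report.bin"));
        OutputStream out = new FileOutputStream("report.bin");
        IOUtils.copyBytes(in, out, 4096, true); // closes both streams
    }
}
```

This is the same mechanism DistributedFileSystem uses underneath; the Configuration just tells FileSystem.get() which filesystem implementation and which namenode to talk to.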
Re: NameNode is starting with exceptions whenever its trying to start datanodes
Check two things:

1. Some of your data node is getting connected, that means password less SSH is not working within nodes.
2. Then clear the dir where your data is persisted in the data nodes, and format the namenode.

It should definitely work then.

Cheers,
Jagaran

From: praveenesh kumar
To: common-user@hadoop.apache.org
Sent: Tue, 7 June, 2011 3:14:01 AM
Subject: Re: NameNode is starting with exceptions whenever its trying to start datanodes

But I don't have any data on my HDFS. I had some data before, but now I have deleted all the files from HDFS. I don't know why the datanodes are taking so long to start; I guess it's because of this exception that startup takes more time.

On Tue, Jun 7, 2011 at 3:34 PM, Steve Loughran wrote:
> On 06/07/2011 10:50 AM, praveenesh kumar wrote:
>> The logs say
>>
>>> The ratio of reported blocks 0.9091 has not reached the threshold 0.9990.
>>> Safe mode will be turned off automatically.
>
> not enough datanodes reported in, or they are missing data
Re: NameNode is starting with exceptions whenever its trying to start datanodes
>> 1. Some of your data node is getting connected, that means password less
>> SSH is not working within nodes.

So you mean that passwordless SSH should also be set up among the datanodes? In Hadoop we used to set up passwordless SSH from the namenode to the datanodes. Do we have to do passwordless SSH among the datanodes also?

On Tue, Jun 7, 2011 at 11:15 PM, jagaran das wrote:
> Check two things:
>
> 1. Some of your data node is getting connected, that means password less
> SSH is not working within nodes.
> 2. Then clear the dir where your data is persisted in the data nodes, and
> format the namenode.
>
> It should definitely work then.
>
> Cheers,
> Jagaran
>
> [...]
Re: NameNode is starting with exceptions whenever its trying to start datanodes
Sorry, I meant: some of your data nodes are NOT getting connected.

From: jagaran das
To: common-user@hadoop.apache.org
Sent: Tue, 7 June, 2011 10:45:59 AM
Subject: Re: NameNode is starting with exceptions whenever its trying to start datanodes

> Check two things:
>
> 1. Some of your data node is getting connected, that means password less SSH is not working within nodes.
> 2. Then clear the dir where your data is persisted on the datanodes and format the namenode.
>
> It should definitely work then.
>
> Cheers,
> Jagaran
Re: NameNode is starting with exceptions whenever its trying to start datanodes
>> Sorry I mean some of your data nodes are not getting connected..

So are you sticking with the solution you suggested, that I should set up passwordless SSH for all datanodes? Because in my Hadoop setup all the datanodes are running fine.

On Tue, Jun 7, 2011 at 11:32 PM, jagaran das wrote:
> Sorry, I meant: some of your data nodes are not getting connected.
Re: NameNode is starting with exceptions whenever its trying to start datanodes
Yes, correct. Passwordless SSH between your namenode and some of your datanodes is not working.

From: praveenesh kumar
To: common-user@hadoop.apache.org
Sent: Tue, 7 June, 2011 10:56:08 AM
Subject: Re: NameNode is starting with exceptions whenever its trying to start datanodes

> So you mean that passwordless SSH should also be set up among the datanodes?
> In Hadoop we normally set up passwordless SSH from the namenode to the datanodes.
> Do we have to set up passwordless SSH among the datanodes as well?
RE: remotely downloading file
Joe,

Take a look at http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample

It should give you an idea of how to read and write HDFS files. The page is somewhat old and the package names have changed a bit between versions, but I hope it will get you on the right track. If you don't want to write code, there are HDFS copy utilities that you can use from shell scripts instead.

Bill

-----Original Message-----
From: Joe Greenawalt [mailto:joe.greenaw...@gmail.com]
Sent: Tuesday, June 07, 2011 1:38 PM
To: common-user@hadoop.apache.org
Subject: Re: remotely downloading file

Bill,
Thanks for the reply. Is there a resource you can point me to that shows the correct way to connect remotely? I seem to be seeing conflicting ways of doing it.

I'm looking at:
http://hadoop.apache.org/hdfs/docs/current/api/org/apache/hadoop/hdfs/DFSClient.html
http://hadoop.apache.org/hdfs/docs/current/api/org/apache/hadoop/hdfs/DistributedFileSystem.html

But the examples I'm seeing use Configuration, and I don't see that being used in those classes.

Thanks again,
Joe

On Fri, Jun 3, 2011 at 5:05 PM, Habermaas, William <william.haberm...@fatwire.com> wrote:
> You can access HDFS for reading and writing from other machines. The API works through the HDFS client, which can be anywhere on the network, not just on the namenode. You just have to have the Hadoop core jar with your application wherever it is going to run.
>
> Bill
>
> -----Original Message-----
> From: Joe Greenawalt [mailto:joe.greenaw...@gmail.com]
> Sent: Friday, June 03, 2011 4:55 PM
> To: common-user@hadoop.apache.org
> Subject: remotely downloading file
>
> Hi,
> We're interested in using HDFS to store several large file sets to be made available for download by our customers, in the following paradigm:
>
> Customer <- | APPSERVER-CLUSTER {app1,app2,app3} | <- | HDFS |
>
> After browsing the documentation I had assumed that pulling files from HDFS to the APPSERVER-CLUSTER could be done programmatically and remotely. But after reading the API, it seems that Java code interfacing with HDFS needs to run locally? Is that correct?
>
> If it is correct, what is the best/recommended way to deliver downloadables to the app servers (and vice versa), which are hosted in the same network but on different machines?
>
> Thanks,
> Joe
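The shell copy utilities Bill mentions can move files in and out of HDFS without any Java code at all. A sketch (the paths and the namenode URI are placeholders):

```shell
# Copy a local file into HDFS, then list and fetch it back out
hadoop fs -put /tmp/localfile.txt /data/infile.txt
hadoop fs -ls /data
hadoop fs -get /data/infile.txt /tmp/downloaded.txt

# The same commands work from any machine that has the Hadoop jars and
# configuration on it; the target namenode can also be given explicitly:
hadoop fs -fs hdfs://namenode-host:9000 -ls /data
```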
Re: NameNode is starting with exceptions whenever its trying to start datanodes
Cleaning the data from the data dir of each datanode and formatting the namenode may help you.

From: praveenesh kumar
To: common-user@hadoop.apache.org
Sent: Tue, 7 June, 2011 11:05:03 AM
Subject: Re: NameNode is starting with exceptions whenever its trying to start datanodes

> So are you sticking with the solution you suggested, that I should set up passwordless SSH for all datanodes?
> Because in my Hadoop setup all the datanodes are running fine.
Re: NameNode is starting with exceptions whenever its trying to start datanodes
Dude, passwordless SSH between my namenode and datanodes is working all fine!

My question is:

*"Are you talking about passwordless SSH between datanodes"*
or
*"Are you talking about passwordless SSH between datanodes and the namenode"*

Because if you are talking about the second case, that is already working fine. As I mentioned, all my datanodes in Hadoop are working fine! I can see all the datanodes using "hadoop fsck /" as well as in the HDFS web UI.

On Tue, Jun 7, 2011 at 11:35 PM, jagaran das wrote:
> Yes, correct. Passwordless SSH between your namenode and some of your datanodes is not working.
Re: NameNode is starting with exceptions whenever its trying to start datanodes
How shall I clean my data dir? By cleaning the data dir, do you mean deleting all the files from HDFS? Is there any special command to clean all the datanodes in one step?

On Tue, Jun 7, 2011 at 11:46 PM, jagaran das wrote:
> Cleaning the data from the data dir of each datanode and formatting the namenode may help you.
Re: remotely downloading file
Thanks, and I've seen this example. I think once I can connect I'm OK, but I'm not sure how to connect remotely. I'm writing a Groovy script just to test the connection; I'll paste it below so you can see what I'm trying to do.

@GrabResolver(name='org.apache.mahout.hadoop', root='http://mymavenrepo/nexus/content/repositories/thirdparty')
@Grab(group='org.apache.mahout.hadoop', module='hadoop-core', version='0.20.203.0')
import org.apache.hadoop.hdfs.DFSClient
import org.apache.hadoop.hdfs.DistributedFileSystem
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hdfs.protocol.HdfsFileStatus

def DIR_HADOOP = "1.1.1.1";
def PORT_HADOOP = "9000";
def config = new Configuration()
config.set("fs.default.name", DIR_HADOOP + ":" + PORT_HADOOP) // got this from some site
def dfs = new DistributedFileSystem()
def dfsClient = dfs.getClient()
def fileInfo = dfsClient.getFileInfo("/DEV")
println fileInfo.isDir()

On Tue, Jun 7, 2011 at 2:11 PM, Habermaas, William <william.haberm...@fatwire.com> wrote:
> Joe,
>
> Take a look at http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample
>
> It should give you an idea of how to read and write HDFS files. The page is somewhat old and the package names have changed a bit between versions, but I hope it will get you on the right track. If you don't want to write code, there are HDFS copy utilities that you can use from shell scripts instead.
>
> Bill
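Before digging into the script itself, it may help to confirm the remote namenode is reachable with the same settings from the shell. Note that fs.default.name is expected to be a full URI with an hdfs:// scheme, which the script above omits; the host and port below simply mirror the script's placeholders:

```shell
# The generic -D option lets the FsShell point at a remote namenode
hadoop fs -D fs.default.name=hdfs://1.1.1.1:9000 -ls /DEV
```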
Re: NameNode is starting with exceptions whenever its trying to start datanodes
I mean running rm -rf * in the datanode data dir. These are debugging steps that I followed.

From: praveenesh kumar
To: common-user@hadoop.apache.org
Sent: Tue, 7 June, 2011 11:19:50 AM
Subject: Re: NameNode is starting with exceptions whenever its trying to start datanodes

> How shall I clean my data dir? By cleaning the data dir, do you mean deleting all the files from HDFS?
> Is there any special command to clean all the datanodes in one step?
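Spelled out, the cleanup being suggested looks roughly like this. It is destructive -- it erases all HDFS data -- and the /path/to/dfs/... directory is a placeholder for whatever dfs.data.dir is set to in your configuration:

```shell
# 1. Stop the cluster
stop-all.sh

# 2. On each datanode, wipe the configured data directory
#    (replace /path/to/dfs/data with your dfs.data.dir value)
rm -rf /path/to/dfs/data/*

# 3. On the namenode, reformat; this assigns a fresh namespace ID,
#    which is why stale datanode directories must be cleared first,
#    otherwise those datanodes refuse to join the new namespace
hadoop namenode -format

# 4. Restart and check that all datanodes report in
start-all.sh
hadoop dfsadmin -report
```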
Linker errors with Hadoop pipes
Hadoop n00b here, just started playing around with Hadoop Pipes. I'm getting linker errors while compiling a simple WordCount example using hadoop-0.20.203 (the current most recent version) that did not appear for the same code under hadoop-0.20.2.

Linker errors of the form: undefined reference to `EVP_sha1' in HadoopPipes.cc. EVP_sha1 (and all of the undefined references I get) are part of the OpenSSL library, which HadoopPipes.cc from hadoop-0.20.203 uses but hadoop-0.20.2 does not. I've tried adjusting my makefile to link against the SSL libraries, but I'm still out of luck. Any ideas would be greatly appreciated. Thanks!

PS, here is my current makefile:

CC = g++
HADOOP_INSTALL = /usr/local/hadoop-0.20.203.0
SSL_INSTALL = /usr/local/ssl
PLATFORM = Linux-amd64-64
CPPFLAGS = -m64 -I$(HADOOP_INSTALL)/c++/$(PLATFORM)/include -I$(SSL_INSTALL)/include

WordCount: WordCount.cc
	$(CC) $(CPPFLAGS) $< -Wall -Wextra -L$(SSL_INSTALL)/lib -lssl -lcrypto -L$(HADOOP_INSTALL)/c++/$(PLATFORM)/lib -lhadooppipes -lhadooputils -lpthread -g -O2 -o $@

The actual program I'm using can be found at http://cs.smith.edu/dftwiki/index.php/Hadoop_Tutorial_2.2_--_Running_C%2B%2B_Programs_on_Hadoop
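One thing worth checking: GNU ld resolves static libraries left to right, so a library that uses OpenSSL symbols (-lhadooppipes here) has to appear before -lssl and -lcrypto on the command line, and in the makefile above the SSL libraries come first. A reordered link line, assuming the same install paths:

```shell
g++ -m64 \
    -I/usr/local/hadoop-0.20.203.0/c++/Linux-amd64-64/include \
    -I/usr/local/ssl/include \
    WordCount.cc \
    -L/usr/local/hadoop-0.20.203.0/c++/Linux-amd64-64/lib \
    -lhadooppipes -lhadooputils -lpthread \
    -L/usr/local/ssl/lib -lssl -lcrypto \
    -g -O2 -o WordCount
```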
Re: Hbase startup error: NoNode for /hbase/master after running out of space
On Mon, Jun 6, 2011 at 8:29 PM, Zhong, Sheng wrote:
> I'd appreciate any help or suggestions. P.S.: we're using Apache Hadoop 0.20.2 and HBase 0.20.3, and ZooKeeper is running via zookeeper-3.2.2 (not managed by HBase).

Can you upgrade your hbase and hadoop?
St.Ack
RE: Hbase startup error: NoNode for /hbase/master after running out of space
Thanks! The issue has been resolved by removing some bad blocks... But St.Ack, we do want an upgrade, and I am starting the research. Please help by providing your suggestions or comments.

If we stick with the Apache version, the latest stable release is hadoop-0.20.203.0rc1.tar.gz. However, it may have a potential compatibility issue with the latest HBase 0.90.2, according to Michael Noll's post at http://www.michael-noll.com/blog/2011/04/14/building-an-hadoop-0-20-x-version-for-hbase-0-90-2/comment-page-1/#comment-19757:

"As of today, Hadoop 0.20.2 is the latest stable release of Apache Hadoop that is marked as ready for production (neither 0.21 nor 0.22 are). Unfortunately, the Hadoop 0.20.2 release is not compatible with the latest stable version of HBase."

The post suggests either building from branch-0.20-append or using the Hadoop jar shipped with HBase.

Andy

-----Original Message-----
From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
Sent: Tuesday, June 07, 2011 5:13 PM
To: common-user@hadoop.apache.org
Subject: Re: Hbase startup error: NoNode for /hbase/master after running out of space

> Can you upgrade your hbase and hadoop?
> St.Ack
Linear scalability question
Hi,

I have a question about the linear scalability of Hadoop. We have a situation where we have to do reduce-side joins on two big tables (10+ TB). This causes a lot of data to be transferred over the network, and the network is becoming a bottleneck. In a few years these tables will hold 100+ TB of data, and the reduce-side joins will demand even more transfer over the network. Since network bandwidth is limited and cannot be increased just by adding more nodes, Hadoop will no longer be linearly scalable in this case. Is my understanding correct? Am I missing anything here? How do people address these kinds of bottlenecks?

Thanks and Regards,
Shantian
re-reading
Hi,

I'm trying to read the InputSplit over and over using the following function in MapperRunner:

@Override
public void run(RecordReader input, OutputCollector output, Reporter reporter) throws IOException {
    RecordReader copyInput = input;

    // First read
    while (input.next(key, value));

    // Second read
    while (copyInput.next(key, value));
}

It can clearly be seen that this won't work, because both RecordReaders are actually the same object. I'm trying to find a way for the second reader to start reading the split again from the beginning. How can I do that?

Thanks,
Mark