core-u...@hadoop.apache.org
Re: map(K1 key, V1 value, OutputCollector output, Reporter reporter) deprecated in 0.20.2?
thanks you guys :D

On Sat, Jan 30, 2010 at 2:16 AM, Jim Twensky wrote:
> Steven,
>
> I recently had the same issues, and I found this blog post very
> helpful on migrating from 0.19 to 0.20.2:
>
> http://sonerbalkir.blogspot.com/2010/01/new-hadoop-api-020x.html
>
> You can download the sample code at the end, which contains a Hadoop
> word count program written using the new API.
>
> Hope this helps.
>
> -Jim
>
> On Thu, Jan 28, 2010 at 9:43 AM, Edward Capriolo wrote:
>> On Thu, Jan 28, 2010 at 8:14 AM, steven zhuang wrote:
>>> hello, all,
>>> As a newbie, I am used to the (k1, v1, k2, v2)-style parameter list
>>> for the map and reduce methods in Mapper and Reducer (as written in
>>> many books), but after several failures I found that in 0.20+, if we
>>> extend the base class org.apache.hadoop.mapreduce.Mapper, map should
>>> look like this:
>>>
>>>     void map(KEYIN key, VALUEIN value, Context context)
>>>         throws IOException, InterruptedException
>>>
>>> A little confusing to me. My question is: why was the old-style map
>>> interface deprecated?
>>> thanks!
>>>
>>> --
>>> best wishes.
>>> steven
>>
>> Steven,
>>
>> The old map/reduce API is still available in org.apache.hadoop.mapred.
>>
>>> My question is: why was the old-style map interface deprecated?
>>
>> Because Hadoop is like a freight train: either hop on or get out of
>> the way! :)
>> Just kidding.
>>
>> Great presentation about how to update code:
>> http://www.slideshare.net/sh1mmer/upgrading-to-the-new-map-reduce-api
>>
>> Some information on the 'why':
>> http://www.cloudera.com/blog/2009/05/07/what%E2%80%99s-new-in-hadoop-core-020/

--
best wishes.
steven
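For concreteness, here is what a word-count mapper looks like against the new
org.apache.hadoop.mapreduce API the thread is discussing. This is a minimal
sketch, not the sample from the linked blog post; the class name and the
tokenizing logic are illustrative:

    // Minimal word-count mapper against the new (0.20+) API. The old
    // map(K1, V1, OutputCollector<K2, V2>, Reporter) signature collapses
    // into map(KEYIN, VALUEIN, Context): the Context replaces both the
    // OutputCollector and the Reporter.
    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class WordCountMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

      private static final IntWritable ONE = new IntWritable(1);
      private final Text word = new Text();

      @Override
      public void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
          word.set(tokens.nextToken());
          // Old API: output.collect(word, ONE); new API:
          context.write(word, ONE);
        }
      }
    }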
Re: Could not obtain block
Ken,

FIXED !!! SO MUCH THANKS.

Setting ulimit at the command prompt wasn't enough; one needs to hard-set it
and reboot, as explained here:
http://posidev.com/blog/2009/06/04/set-ulimit-parameters-on-ubuntu/

2010/1/30 MilleBii
> Increased the "ulimit" to 64000 ... same problem.
> stop/start-all ... same problem, but on a different block which of course
> is present, so it looks like there is nothing wrong with the actual data
> in the hdfs.
>
> I use the Nutch default hadoop 0.19.x -- anything related?
>
> 2010/1/30 Ken Goodhope
>> "Could not obtain block" errors are often caused by running out of
>> available file handles. You can confirm this by going to the shell and
>> entering "ulimit -n". If it says 1024, the default, then you will want
>> to increase it to about 64,000.
>>
>> On Fri, Jan 29, 2010 at 4:06 PM, MilleBii wrote:
>>> X-POST with the Nutch mailing list.
>>>
>>> HEEELP !!!
>>>
>>> Kind of stuck on this one.
>>> I backed up my hdfs data, reformatted the hdfs, put the data back,
>>> tried to merge my segments together, and it explodes again.
>>>
>>> Exception in thread "Lucene Merge Thread #0"
>>> org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException:
>>> Could not obtain block: blk_4670839132945043210_1585
>>> file=/user/nutch/crawl/indexed-segments/20100113003609/part-0/_ym.frq
>>>     at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)
>>>
>>> If I go into the hdfs/data directory, I DO find the faulty block.
>>> Could it be a synchro problem in the segment-merger code?
>>>
>>> 2010/1/29 MilleBii
>>>> I'm looking for some help. I'm a Nutch user; everything was working
>>>> fine, but now I get the following error when indexing.
>>>> I have a single-node pseudo-distributed setup.
>>>> Some people on the Nutch list suggested my disk could be full, so I
>>>> removed many things, and hdfs is far from full.
>>>> This file & directory were perfectly OK the day before.
>>>> I did a "hadoop fsck" ... the report says healthy.
>>>>
>>>> What can I do?
>>>>
>>>> Is it safe to do a Linux fsck, just in case?
>>>>
>>>> Caused by: java.io.IOException: Could not obtain block:
>>>> blk_8851198258748412820_9031
>>>> file=/user/nutch/crawl/indexed-segments/20100111233601/part-0/_103.frq
>>>>
>>>> --
>>>> -MilleBii-
>>
>> --
>> Ken Goodhope
>> Cell: 425-750-5616
>>
>> 362 Bellevue Way NE Apt N415
>> Bellevue WA, 98004

--
-MilleBii-
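For anyone hitting the same wall: a plain "ulimit -n 64000" applies only to
the current shell session, which is why it wasn't enough here. The persistent
approach on Ubuntu (what I understand the linked post to describe; I haven't
verified its exact recipe) is to raise the open-file limit in
/etc/security/limits.conf. A minimal sketch, assuming the Hadoop/Nutch
daemons run as user "hadoop" (adjust the name to your setup):

    # /etc/security/limits.conf
    # Raise the max open file descriptors for the hadoop user; use "*"
    # instead of a username to cover all users. Log out and back in
    # (or reboot) before the new limit takes effect.
    hadoop  soft  nofile  64000
    hadoop  hard  nofile  64000

Verify with "ulimit -n" in a fresh login shell before restarting the daemons.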
hadoop under cygwin issue
Hi,

I am trying to run Hadoop 0.19.2 under cygwin, as per the directions on the
Hadoop "quickstart" web page. I know sshd is running and I can "ssh localhost"
without a password.

This is from my hadoop-site.xml:

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/cygwin/tmp/hadoop-${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapred.job.reuse.jvm.num.tasks</name>
    <value>-1</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>webinterface.private.actions</name>
    <value>true</value>
  </property>

These are the errors from my log files:

2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=9000
2010-01-30 00:03:33,121 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:9000
2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2010-01-30 00:03:33,181 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2010-01-30 00:03:34,603 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=brian,None,Administrators,Users
2010-01-30 00:03:34,603 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2010-01-30 00:03:34,603 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=false
2010-01-30 00:03:34,653 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2010-01-30 00:03:34,653 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2010-01-30 00:03:34,803 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
2010-01-30 00:03:34,813 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent state: storage directory does not exist or is not accessible.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000

=====

2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
problem cleaning system directory: null
java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused: no further information
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
    at org.apache.hadoop.ipc.Client.call(Client.java:700)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at $Proxy4.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)

Thanks,
Brian
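Two separate symptoms appear above, but they share one root cause. The
InconsistentFSStateException means the NameNode's storage directory under
hadoop.tmp.dir was never initialized, so the NameNode exits at startup; the
later "Connection refused" on port 9000 is just the other daemons failing to
reach the dead NameNode. Per the quickstart, the new filesystem has to be
formatted once before the daemons are first started. A sketch, run from the
Hadoop install directory inside cygwin:

    # Creates the dfs/name storage directory under hadoop.tmp.dir
    # (here C:\cygwin\tmp\hadoop-brian). Run once, before the first start;
    # re-formatting later erases existing HDFS contents.
    bin/hadoop namenode -format

    # Then start the daemons and confirm the NameNode log no longer
    # shows the InconsistentFSStateException.
    bin/start-all.sh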
Re: Could not obtain block
Increased the "ulimit" to 64000 ... same problem stop/start-all ... same problem but on a different block which of course present, so it looks like there is nothing wrong with actual data in the hdfs. I use the Nutch default hadoop 0.19.x anything related ? 2010/1/30 Ken Goodhope > "Could not obtain block" errors are often caused by running out of > available > file handles. You can confirm this by going to the shell and entering > "ulimit -n". If it says 1024, the default, then you will want to increase > it to about 64,000. > > On Fri, Jan 29, 2010 at 4:06 PM, MilleBii wrote: > > > X-POST with Nutch mailing list. > > > > HEEELP !!! > > > > Kind of get stuck on this one. > > I backed-up my hdfs data, reformated the hdfs, put data back, try to > merge > > my segments together and it explodes again. > > > > Exception in thread "Lucene Merge Thread #0" > > org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException: > > Could not obtain block: blk_4670839132945043210_1585 > > file=/user/nutch/crawl/indexed-segments/20100113003609/part-0/_ym.frq > >at > > > > > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309) > > > > If I go into the hfds/data directory I DO find the faulty block > > Could it be a synchro problem on the segment merger code ? > > > > 2010/1/29 MilleBii > > > > > I'm looking for some help. I'm Nutch user, everything was working fine, > > but > > > now I get the following error when indexing. > > > I have a single note pseudo distributed set up. > > > Some people on the Nutch list indicated to me that I could full, so I > > > remove many things and hdfs is far from full. > > > This file & directory was perfectly OK the day before. > > > I did a "hadoop fsck"... report says healthy. > > > > > > What can I do ? > > > > > > Is is safe to do a Linux FSCK just in case ? > > > > > > Caused by: java.io.IOException: Could not obtain block: > > > blk_8851198258748412820_9031 > > > > > > file=/user/nutch/crawl/indexed-segments/20100111233601/part-0/_103.frq > > > > > > > > > -- > > > -MilleBii- > > > > > > > > > > > -- > > -MilleBii- > > > > > > -- > Ken Goodhope > Cell: 425-750-5616 > > 362 Bellevue Way NE Apt N415 > Bellevue WA, 98004 > -- -MilleBii-