Hadoop Beijing Meeting has successfully concluded! www.hadooper.cn is ready now.
Hi all, the Hadoop Beijing Meeting successfully concluded on Nov 23. Thank you all for your attention. According to the agreements reached at this meeting, we have finished setting up the hadoop-in-china nonprofit website: www.hadooper.cn. We hope we can form a powerful Hadoop community in China. Take a look at the website, and if you are interested in contributing to the hadooper-in-china community, please drop me an email. We have uploaded the Hadoop Beijing Meeting pictures, slides and videos to www.hadooper.cn; they are also available on the hadoop-in-china Google group (http://groups.google.com/group/hadooper_cn). Please let me know if you have any suggestions. Thanks! heyongqiang 2008-12-01
Re: RE: Hadoop beijing meeting draft agenda is ready.
Hi Ding Hui, we will do our best to make the slides, pictures and possibly videos public on the internet. I will confirm these things with the speakers and coordinate with our volunteers. Thank you for your suggestion. heyongqiang 2008-11-20 From: Ding, Hui Sent: 2008-11-20 01:20:22 To: [EMAIL PROTECTED] Cc: Subject: RE: Hadoop beijing meeting draft agenda is ready. Hi, Some of the talks sound really interesting. Is it possible to videotape this and make it public? Or at least make the slides available? Cheers -Original Message- From: heyongqiang [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 19, 2008 8:40 AM To: core-user; core-dev; hbase-user Subject: Hadoop beijing meeting draft agenda is ready. Hi all, the Hadoop Beijing meeting agenda is ready now. Currently I have only counted people who replied to my email or posted on the Google group; I will send a Word document to the people I have counted. We still welcome participants from companies, institutes and universities. Please drop me an email if you are interested. I tried to send the agenda to the mailing lists, but it always failed after several retries with an exception saying "552 spam score (5.0) exceeded threshold". I have posted the agenda on the Google group (http://groups.google.com/group/hadoop-beijing-meeting/). BTW, the agenda will be adjusted according to the actual situation; the meeting may last only half a day. heyongqiang 2008-11-20
Re: Hadoop beijing meeting draft agenda is ready.
Hi all, I have also uploaded the agenda Word document to the Google group. -- heyongqiang 2008-11-20 - From: heyongqiang Sent: 2008-11-20 00:04:51 To: core-user; core-dev; hbase-user Cc: Subject: Hadoop beijing meeting draft agenda is ready. Hi all, the Hadoop Beijing meeting agenda is ready now. Currently I have only counted people who replied to my email or posted on the Google group; I will send a Word document to the people I have counted. We still welcome participants from companies, institutes and universities. Please drop me an email if you are interested. I tried to send the agenda to the mailing lists, but it always failed after several retries with an exception saying "552 spam score (5.0) exceeded threshold". I have posted the agenda on the Google group (http://groups.google.com/group/hadoop-beijing-meeting/). BTW, the agenda will be adjusted according to the actual situation; the meeting may last only half a day. heyongqiang 2008-11-20
Hadoop beijing meeting draft agenda is ready.
Hi all, the Hadoop Beijing meeting agenda is ready now. Currently I have only counted people who replied to my email or posted on the Google group; I will send a Word document to the people I have counted. We still welcome participants from companies, institutes and universities. Please drop me an email if you are interested. I tried to send the agenda to the mailing lists, but it always failed after several retries with an exception saying "552 spam score (5.0) exceeded threshold". I have posted the agenda on the Google group (http://groups.google.com/group/hadoop-beijing-meeting/). BTW, the agenda will be adjusted according to the actual situation; the meeting may last only half a day. heyongqiang 2008-11-20
Call for speakers at the Hadoop Beijing meeting!
Hi all, so far we have received only one speaker application from outside our team. We now welcome speakers for this meeting. You can choose any topic related to cloud computing. Please send me a brief introduction about yourself and your talk. By the way, this meeting is not meant to be academic; it is just an experience-exchange meeting. Best regards, Yongqiang He 2008-11-17 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China
Re: Re: Hadoop Beijing Meeting
Hi Jeremy Chow, welcome! Please send a brief introduction about yourself and your talk directly to me. I will send you the detailed agenda and other important things next week. Best regards, Yongqiang He 2008-11-12 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Jeremy Chow Sent: 2008-11-12 17:04:46 To: core-user@hadoop.apache.org Cc: Subject: Re: Hadoop Beijing Meeting Hi Mr. He Yongqiang, I apply as a speaker, though it is very hurried. I have always been a fan of Hadoop. This is my technical blog: http://coderplay.javaeye.com/. Regards, Jeremy -- My research interests are distributed systems, parallel computing and bytecode-based virtual machines. http://coderplay.javaeye.com
Re: Hadoop Beijing Meeting
Hello, we have created a Google group for this meeting: http://groups.google.com/group/hadoop-beijing-meeting/. It is fine to discuss this meeting either on the mailing list or in the Google group; we will make announcements in both places. Best regards, Yongqiang He 2008-11-12 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: 永强 何 Sent: 2008-11-12 14:05:00 To: core-user@hadoop.apache.org; [EMAIL PROTECTED] Cc: Subject: Hadoop Beijing Meeting Hello all, we are planning to host a Hadoop Beijing meeting next Sunday (23rd of Nov.). We now welcome speakers and participants! If you are interested in cloud computing topics and can join us that day in Beijing, you are invited; please let me know by dropping me an e-mail. This meeting will be held in: Room 948, 9th floor, Institute of Computing Technology (ICT), No.6 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing, China. It is our great honor to have invited Doctor Li Zha, who will give us a brief welcome speech. We are also trying to invite Doctor Zhiwei Xu, the chief scientist of the Institute of Computing Technology (ICT). We welcome speakers for this meeting with our greatest sincerity; if you are interested in giving a talk at this meeting, please let me know and I will add it to the schedule. Best regards! He Yongqiang Email: [EMAIL PROTECTED] Tel: 86-10-62600919(O) Fax: 86-10-626000900 Key Laboratory of Network Science and Technology, Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China
Re: File permissions issue
Because with that permission set, the "other" class cannot write to the temp directory, and user3 is not in the same group as user2. heyongqiang 2008-07-09 From: Joman Chu Sent: 2008-07-09 13:06:51 To: core-user@hadoop.apache.org Cc: Subject: File permissions issue Hello, On a cluster where I run Hadoop, it seems that the temp directory created by Hadoop (in our case, /tmp/hadoop/) gets its permissions set to "drwxrwxr-x", owned by the first person that runs a job after the Hadoop services are started. This causes file permission problems as we try to run jobs. For example, user1:user1 starts Hadoop using ./start-all.sh. Then user2:user2 runs a Hadoop job. Temp directories (/tmp/hadoop/) are now created on all nodes in the cluster, owned by user2 with permissions "drwxrwxr-x". Now user3:user3 tries to run a job and gets the following exception:
java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1704)
at java.io.File.createTempFile(File.java:1793)
at org.apache.hadoop.util.RunJar.main(RunJar.java:115)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
Why does this happen and how can we fix this? Our current stopgap measure is to run jobs as the user that started Hadoop. That is, in our example, after user1 starts Hadoop, user1 runs a job. Everything seems to work fine then. Thanks, Joman Chu
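One common workaround for this class of problem (not from the original thread, just an illustrative sketch) is to pre-create the shared temp directory as world-writable with the sticky bit set, mode 1777, the same way /tmp itself is set up, so every user can create files but only owners can delete them. The path below is a scratch stand-in for /tmp/hadoop:

```python
# Illustrative sketch: pre-create a shared temp dir with mode 1777
# (rwxrwxrwt), so any user can write into it, as /tmp itself allows.
import os
import stat
import tempfile

def make_shared_tmp(path):
    """Create a shared temp dir and force mode 1777 (world-writable + sticky)."""
    os.makedirs(path, exist_ok=True)
    # chmod explicitly, because the mode passed to makedirs is masked by umask
    os.chmod(path, 0o1777)
    return stat.S_IMODE(os.stat(path).st_mode)

# Demonstrate on a scratch directory (stands in for /tmp/hadoop):
scratch = os.path.join(tempfile.mkdtemp(), "hadoop")
mode = make_shared_tmp(scratch)
print(oct(mode))  # 0o1777
```

With the sticky bit set, user3's createTempFile call would succeed even though user2 created the directory first.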
Re: Re: hadoop download performance when user app adopts multi-thread
Actually this test result is a good result; it was just my misunderstanding of the result, my mistake. The second column is actually the average download rate per thread. This test was run on one node; we also ran tests simultaneously on multiple nodes, and the performance results seem acceptable to us. What you said is right, but this overhead (seek time and I/O consumption) does not seem easy to optimize. Thank you for your attention. Best regards, Yongqiang He 2008-07-09 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Samuel Guo Sent: 2008-07-09 09:47:32 To: core-user@hadoop.apache.org Cc: Subject: Re: hadoop download performance when user app adopts multi-thread heyongqiang wrote:
> The ipc.Client object is designed to be shared across threads, and each thread can only make synchronous RPC calls, meaning each thread makes a call and waits for a result or error. This is implemented by a neat technique: each thread makes a distinct call (with its own call object); the user thread then waits on its call object, which is later notified by the connection's receiver thread. A user thread makes a call by first adding its call object to the call list (later used by the response receiver), then synchronizing on the connection's socket output stream and waiting to write its call out. The connection's receiver thread runs to collect responses on behalf of all user threads.
> What I have not mentioned is that Client actually maintains a connection table.
> In every Client object, a connection culler runs as a daemon whose sole purpose is to remove idle connections from the connection table,
> but it seems that this culler thread does not close the socket associated with the connection; it only makes a mark and does a notify. All the cleanup is handled by the connection thread itself. This is really a wonderful design! Even though the culler thread can cull the connection from the table, the connection thread also includes removal code, because there is a chance that the connection thread encounters an exception.
>
> The above is a brief summary of my understanding of Hadoop's IPC code.
> Below is a test result used to measure the data throughput of Hadoop:
> +--------------+------------------+
> | threadCounts | avg(averageRate) |
> +--------------+------------------+
> |            1 |   53030539.48913 |
> |            2 |  35325499.583756 |
> |            3 |  24998284.969072 |
> |            4 |   19824934.28125 |
> |            5 |  15956391.489583 |
> |            6 |  15948640.175532 |
> |            7 |  14623977.375691 |
> |            8 |  16098080.160131 |
> |            9 |   8967970.3877005 |
> |           10 |  14569087.178947 |
> |           11 |   8962683.6662088 |
> |           12 |  20063735.297872 |
> |           13 |  13174481.053977 |
> |           14 |  10137907.034188 |
> |           15 |   6464513.2013889 |
> |           16 |   23064338.76087 |
> |           17 |   18688537.44385 |
> |           18 |  18270909.854317 |
> |           19 |  13086261.536538 |
> |           20 |  10784059.367347 |
> +--------------+------------------+
>
> The first column is the thread count of my test application; the second column is the average download rate. The rate seems to drop sharply as the thread count increases.
> This is a very simple test application. Can anyone tell me why? Where is the bottleneck when a user app adopts multiple threads?

As you know, a block of an HDFS file is stored as a file in the local filesystem of a datanode. Different threads reading different HDFS files, or different blocks of the same file, may produce a burst of read requests against different local files (HDFS blocks) on a given datanode, so disk seek time and I/O consumption become heavy and response times grow. But this is just the local behavior of a single datanode; the overall throughput of the Hadoop cluster will still be good. So, can you supply more information about your test?

> heyongqiang
> 2008-06-20
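The per-thread call-object pattern described in the quoted summary above can be sketched in a few lines of Python. This is an illustrative analogue, not Hadoop's actual code: each caller registers a call object keyed by id, blocks on it, and a single receiver thread delivers responses and wakes only the thread that owns that call. The fake "wire" and doubling server are stand-ins:

```python
# Illustrative analogue of the ipc.Client pattern: per-thread call objects,
# one shared receiver thread that notifies the waiting caller.
import threading
import queue

class Call:
    def __init__(self, call_id, param):
        self.id = call_id
        self.param = param
        self.result = None
        self.done = threading.Event()   # the caller blocks on this

class Client:
    def __init__(self):
        self.calls = {}                 # pending-call table, shared with receiver
        self.next_id = 0
        self.lock = threading.Lock()
        self.wire = queue.Queue()       # stands in for the connection's socket
        threading.Thread(target=self._receiver, daemon=True).start()

    def call(self, param):
        with self.lock:                 # register the call before sending it
            c = Call(self.next_id, param)
            self.next_id += 1
            self.calls[c.id] = c
        self.wire.put(c.id)             # "write the call out"
        c.done.wait()                   # wait on this thread's own call object
        return c.result

    def _receiver(self):
        while True:                     # one thread collects all responses
            call_id = self.wire.get()
            c = self.calls.pop(call_id) # receiver cleans up the call table
            c.result = c.param * 2      # fake server: doubles the input
            c.done.set()                # wake only the owning caller

client = Client()
results = []
threads = [threading.Thread(target=lambda v=v: results.append(client.call(v)))
           for v in (1, 2, 3)]
for t in threads: t.start()
for t in threads: t.join()
print(sorted(results))  # [2, 4, 6]
```

The point of the design is that the receiver never blocks a caller that isn't waiting for that particular response, which is why the real Client scales to many concurrent threads over one connection.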
Re: Re: modified word count example
Where can I find the Reverse-Index application? heyongqiang 2008-07-09 From: Shengkai Zhu Sent: 2008-07-09 09:06:38 To: core-user@hadoop.apache.org Cc: Subject: Re: modified word count example Another MapReduce application, Reverse-Index, behaves similarly to your description. You can refer to that. On 7/9/08, heyongqiang <[EMAIL PROTECTED] > wrote:
>
> InputFormat's method RecordReader getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException returns a RecordReader. You can implement your own InputFormat and RecordReader:
> 1) the RecordReader keeps the FileSplit (a subclass of InputSplit) as a field in its class;
> 2) the RecordReader's createValue() method always returns the FileSplit's file field.
>
> Hope this helps.
>
> heyongqiang
> 2008-07-09
>
> From: Sandy
> Sent: 2008-07-09 01:45:15
> To: core-user@hadoop.apache.org
> Cc:
> Subject: modified word count example
>
> Hi,
>
> Let's say I want to run a MapReduce job on a series of text files (say x.txt, y.txt and z.txt).
>
> Given the following mapper function in Python (from WordCount.py):
>
> class WordCountMap(Mapper, MapReduceBase):
>    one = IntWritable(1) # removed
>    def map(self, key, value, output, reporter):
>        for w in value.toString().split():
>            output.collect(Text(w), self.one) # how can I modify this line?
>
> Instead of creating pairs of each word found and the numeral one as the example does, is there a function I can invoke to store the name of the file it came from instead?
>
> Thus, I'd have pairs like <"water", "x.txt" >, <"hadoop", "y.txt" >, <"hadoop", "z.txt" >, etc.
>
> I took a look at the javadoc, but I'm not sure I've checked in the right places. Could someone point me in the right direction?
>
> Thanks!
>
> -SM
Re: modified word count example
InputFormat's method RecordReader getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException returns a RecordReader. You can implement your own InputFormat and RecordReader:
1) the RecordReader keeps the FileSplit (a subclass of InputSplit) as a field in its class;
2) the RecordReader's createValue() method always returns the FileSplit's file field.
Hope this helps. heyongqiang 2008-07-09 From: Sandy Sent: 2008-07-09 01:45:15 To: core-user@hadoop.apache.org Cc: Subject: modified word count example Hi, Let's say I want to run a MapReduce job on a series of text files (say x.txt, y.txt and z.txt). Given the following mapper function in Python (from WordCount.py):
class WordCountMap(Mapper, MapReduceBase):
    one = IntWritable(1) # removed
    def map(self, key, value, output, reporter):
        for w in value.toString().split():
            output.collect(Text(w), self.one) # how can I modify this line?
Instead of creating pairs of each word found and the numeral one as the example does, is there a function I can invoke to store the name of the file it came from instead? Thus, I'd have pairs like <"water", "x.txt" >, <"hadoop", "y.txt" >, <"hadoop", "z.txt" >, etc. I took a look at the javadoc, but I'm not sure I've checked in the right places. Could someone point me in the right direction? Thanks! -SM
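The map logic Sandy is asking for, emitting (word, filename) pairs instead of (word, 1), can be sketched in plain Python outside Hadoop. In a real job the filename would come from the custom RecordReader described above (or from job configuration); here it is simply passed in:

```python
# Illustrative sketch (plain Python, outside Hadoop): emit one
# (word, source_filename) pair per word, instead of (word, 1).
def map_word_to_file(filename, line):
    """Return (word, filename) pairs for every word in the input line."""
    return [(word, filename) for word in line.split()]

pairs = map_word_to_file("x.txt", "water hadoop")
print(pairs)  # [('water', 'x.txt'), ('hadoop', 'x.txt')]
```

A reducer over these pairs would then collect, for each word, the set of files it appears in, which is exactly the Reverse-Index shape mentioned in the reply.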
Re: Monthly Hadoop User Group Meeting
Will there be a meeting in Beijing, China in the future? haha heyongqiang 2008-07-09 From: Ajay Anand Sent: 2008-07-09 01:32:10 To: core-user@hadoop.apache.org; [EMAIL PROTECTED]; [EMAIL PROTECTED] Cc: Subject: Monthly Hadoop User Group Meeting The next Hadoop User Group meeting is scheduled for July 22nd from 6 - 7:30 pm at Yahoo! Mission College, Building 1, Training Rooms 3 and 4. Agenda: Cascading - Chris Wenzel; Performance Benchmarking on Hadoop (Terabyte Sort, Gridmix) - Sameer Paranjpye, Owen O'Malley, Runping Qi. Registration and directions: http://upcoming.yahoo.com/event/869166 Look forward to seeing you there! Ajay
Re: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets?
I suspect this error occurred because one datanode quit during the client write, and that datanode was then chosen by the namenode for the client to contact for writing (this is what DFSClient.DFSOutputStream.nextBlockOutputStream does). By default, the client side retries 3 times and sleeps a total of 3*xxx seconds, but the NameNode needs more time to find the dead node. So every time the client wakes up, there is a chance the dead node is chosen again. Maybe you should change the interval at which the NameNode looks for dead nodes, or make the client sleep longer? I have changed the sleep code in DFSClient.DFSOutputStream.nextBlockOutputStream like below:
if (!success) {
  LOG.info("Abandoning block " + block + " and retry...");
  namenode.abandonBlock(block, src, clientName);
  // Connection failed. Let's wait a little bit and retry
  retry = true;
  try {
    if (System.currentTimeMillis() - startTime > 5000) {
      LOG.info("Waiting to find target node: " + nodes[0].getName());
    }
    long time = heartbeatRecheckInterval;
    Thread.sleep(time);
  } catch (InterruptedException iex) {
  }
}
heartbeatRecheckInterval is exactly the recheck interval of the NameNode's dead-node monitor. I also changed the NameNode's dead-node recheck interval to be double the heartbeat interval. Best regards, Yongqiang He 2008-07-08 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Raghu Angadi Sent: 2008-07-08 01:45:19 To: core-user@hadoop.apache.org Cc: Subject: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets? The ConcurrentModificationException looks like a bug; we should file a jira. Regarding why the writes are failing, we need to look at more logs. Could you attach the complete log from one of the failed tasks? Also try to see if there is anything in the NameNode log around that time. Raghu.
C G wrote:
> Hi All:
>
> I've got 0.17.0 set up on a 7-node grid (6 slaves w/datanodes, 1 master running namenode). I'm trying to process a small (180G) dataset. I've done this successfully and painlessly running 0.15.0. When I run 0.17.0 with the same data and same code (w/API changes for 0.17.0 and recompiled, of course), I get a ton of failures. I've increased the number of namenode threads trying to resolve this, but that doesn't seem to help. The errors are of the following flavor:
>
> java.io.IOException: Could not get block locations. Aborting...
> java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> Exception in thread "Thread-2" java.util.ConcurrentModificationException
> Exception closing file /blah/_temporary/_task_200807052311_0001_r_04_0/baz/part-x
>
> As things stand right now, I can't deploy to 0.17.0 (or 0.16.4 or 0.17.1). I am wondering if anybody can shed some light on this, or if others are having similar problems.
>
> Any thoughts, insights, etc. would be greatly appreciated.
>
> Thanks,
> C G
>
> Here's an ugly trace:
> 08/07/06 01:43:29 INFO mapred.JobClient: map 100% reduce 93%
> 08/07/06 01:43:29 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_03_0, Status : FAILED
> java.io.IOException: Could not get block locations. Aborting...
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_03_0: Exception closing file /output/_temporary/_task_200807052311_0001_r_03_0/a/b/part-3
> task_200807052311_0001_r_03_0: java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2095)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_03_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
> task_200807052311_0001_r_03_0: at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
> task_200807052311_0001_r_03_0: at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.fs.FileSystem$Cache.closeAll(Fi
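The timing problem described in the reply above (the client retries faster than the namenode detects the dead node, so the dead node can be handed out again) can be shown with a tiny simulation. This is illustrative only, not Hadoop code; the interval values are made up:

```python
# Illustrative simulation of the retry-vs-detection race: retries that fire
# before the namenode's dead-node recheck can still be given the dead node.
def retries_hitting_dead_node(client_sleep, recheck_interval, max_retries=3):
    """Count retries that happen before the namenode marks the node dead."""
    hits = 0
    t = 0
    for _ in range(max_retries):
        t += client_sleep                  # client sleeps, then retries
        if t < recheck_interval:
            hits += 1                      # namenode still lists the dead node
    return hits

# Short client sleeps: all 3 retries can be handed the dead node again.
print(retries_hitting_dead_node(client_sleep=6, recheck_interval=30))   # 3
# The patch's idea: sleep for the recheck interval, so no retry races it.
print(retries_hitting_dead_node(client_sleep=30, recheck_interval=30))  # 0
```

This is why the patch ties the client's sleep to heartbeatRecheckInterval: by the time the client retries, the namenode has had a chance to exclude the dead datanode.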
Re: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets?
Is the ConcurrentModificationException a Java bug, or something else? Best regards, Yongqiang He 2008-07-08 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Raghu Angadi Sent: 2008-07-08 01:45:19 To: core-user@hadoop.apache.org Cc: Subject: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets? The ConcurrentModificationException looks like a bug; we should file a jira. Regarding why the writes are failing, we need to look at more logs. Could you attach the complete log from one of the failed tasks? Also try to see if there is anything in the NameNode log around that time. Raghu.
C G wrote:
> Hi All:
>
> I've got 0.17.0 set up on a 7-node grid (6 slaves w/datanodes, 1 master running namenode). I'm trying to process a small (180G) dataset. I've done this successfully and painlessly running 0.15.0. When I run 0.17.0 with the same data and same code (w/API changes for 0.17.0 and recompiled, of course), I get a ton of failures. I've increased the number of namenode threads trying to resolve this, but that doesn't seem to help. The errors are of the following flavor:
>
> java.io.IOException: Could not get block locations. Aborting...
> java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> Exception in thread "Thread-2" java.util.ConcurrentModificationException
> Exception closing file /blah/_temporary/_task_200807052311_0001_r_04_0/baz/part-x
>
> As things stand right now, I can't deploy to 0.17.0 (or 0.16.4 or 0.17.1). I am wondering if anybody can shed some light on this, or if others are having similar problems.
>
> Any thoughts, insights, etc. would be greatly appreciated.
> Thanks,
> C G
>
> Here's an ugly trace:
> 08/07/06 01:43:29 INFO mapred.JobClient: map 100% reduce 93%
> 08/07/06 01:43:29 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_03_0, Status : FAILED
> java.io.IOException: Could not get block locations. Aborting...
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_03_0: Exception closing file /output/_temporary/_task_200807052311_0001_r_03_0/a/b/part-3
> task_200807052311_0001_r_03_0: java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2095)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_03_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
> task_200807052311_0001_r_03_0: at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
> task_200807052311_0001_r_03_0: at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
> 08/07/06 01:44:32 INFO mapred.JobClient: map 100% reduce 74%
> 08/07/06 01:44:32 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_01_0, Status : FAILED
> java.io.IOException: Could not get block locations. Aborting...
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_01_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
> task_200807052311_0001_r_01_0: at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
> task_200807052311_0001_r_01_0: at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
> task_200807052311_0001_r_01_0: at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
> task_200807052311_0001_r_01_0:
Re: OK to remove NN's edits file?
I have also encountered this error. I added a try/catch clause around the main code (FSEditLog.loadFSEdits) and skipped past the unknown-opcode and EOF errors. Best regards, Yongqiang He 2008-07-08 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Otis Gospodnetic Sent: 2008-07-07 22:34:02 To: core-user@hadoop.apache.org Cc: Subject: OK to remove NN's edits file? Hello, I have Hadoop 0.16.2 running in a cluster whose Namenode seems to have a corrupt "edits" file. This causes an EOFException during NN init, which causes the NN to exit immediately (exception below). What is the recommended thing to do in such a case? I don't mind losing any of the data referenced in the "edits" file. Should I just remove the edits file, start the NN, and assume the NN will create a new, empty "edits" file and all will be well? This is what I see when the NN tries to start:
2008-07-07 10:58:43,255 ERROR dfs.NameNode - java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at org.apache.hadoop.io.UTF8.readFields(UTF8.java:106)
at org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:90)
at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:433)
at org.apache.hadoop.dfs.FSImage.loadFSEdits(FSImage.java:756)
at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:639)
at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:222)
at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:79)
at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:254)
at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:235)
at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:131)
at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:176)
at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:162)
at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:846)
at org.apache.hadoop.dfs.NameNode.main(NameNode.java:855)
Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: NoSuchMethodException - question to ask Tom White (and others) :-)
In most cases, this error occurs because you have not explicitly implemented a no-argument constructor. Best regards, Yongqiang He 2008-07-06 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Xuan Dzung Doan Sent: 2008-07-06 09:12:43 To: core-user@hadoop.apache.org Cc: Subject: NoSuchMethodException - question to ask Tom White (and others) :-) I'm writing a mapred app in Hadoop 0.16.4 in which I implement my own InputSplit, called BioFileSplit, that extends FileSplit (it adds one int data field to FileSplit). Testing my program in Eclipse yielded an exception trace that roughly looks as follows:
Task Id : task_200807011030_0004_m_00_0, Status : FAILED
java.lang.RuntimeException: java.lang.NoSuchMethodException: edu.bio.ec2alignment.BioFileSplit.<init>()
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:80)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:180)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)
Caused by: java.lang.NoSuchMethodException: edu.bio.ec2alignment.BioFileSplit.<init>()
at java.lang.Class.getConstructor0(Class.java:2706)
at java.lang.Class.getDeclaredConstructor(Class.java:1985)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:74)
I found the following on the net: https://issues.apache.org/jira/browse/HADOOP-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578250#action_12578250 In this, Tom White mentioned a bug that produced an exception trace that looks similar to this. Are these the same problem? If so, this issue has been taken care of in version 0.17.0, right (issue HADOOP-3208)? I'd like Tom or others to verify this. I have a 0.16.4 environment that is stable and that I'm happy with; I'm not sure how stable 0.17.0 is, and I want to justify the decision to upgrade.
If these problems are not the same, can anyone suggest what the issue could actually be? Thanks, David. PS: It looks like version 0.17.0 is no longer available on the download page, only the latest 0.17.1 :-)
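The no-argument constructor requirement behind this failure can be illustrated with a small Python analogue (illustrative only; the class names mirror the thread, and `new_instance` is a stand-in for ReflectionUtils.newInstance, which instantiates splits reflectively via the zero-argument constructor):

```python
# Illustrative analogue: reflective instantiation needs a no-arg constructor.
class FileSplit:
    def __init__(self):
        self.path = None

class BadBioFileSplit(FileSplit):
    def __init__(self, extra):          # no zero-argument constructor
        super().__init__()
        self.extra = extra

class GoodBioFileSplit(FileSplit):
    def __init__(self, extra=0):        # callable with no arguments
        super().__init__()
        self.extra = extra

def new_instance(cls):
    """Stand-in for ReflectionUtils.newInstance: no-arg construction only."""
    return cls()

try:
    new_instance(BadBioFileSplit)
    ok_bad = True
except TypeError:                       # analogue of NoSuchMethodException
    ok_bad = False

print(ok_bad, isinstance(new_instance(GoodBioFileSplit), FileSplit))  # False True
```

In the Java case the fix is the same shape: give BioFileSplit an explicit public no-argument constructor (the framework then populates the fields via readFields).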
Should there be a way to avoid maintaining the whole namespace structure in memory?
In the current HDFS implementation, all INodeFile and INodeDirectory objects are loaded into memory. This happens when the FSNamespace structure is set up at namenode startup, as the namenode replays the fsimage file and edit log file. But if there are millions of files or directories, how can that be handled? I did an experiment creating directories. Before the experiment:
[EMAIL PROTECTED] bin]$ ps -p 9122 -o rss,size,vsize,%mem
   RSS      SZ     VSZ %MEM
153648 1193868 1275340  3.7
After I created the directories, it became:
[EMAIL PROTECTED] bin]$ ps -p 9122 -o rss,size,vsize,%mem
   RSS      SZ     VSZ %MEM
169084 1193868 1275340  4.0
I am trying to improve the fsimage file so that the namenode can locate and load the needed information on demand; just like the Linux VFS, we could keep only an inode cache. This would avoid loading the whole namespace structure at startup. Best regards, Yongqiang He 2008-07-01 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China
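The VFS-style idea proposed above, a bounded inode cache backed by an on-demand loader instead of an all-in-memory namespace, can be sketched as follows. This is illustrative only; `load_from_image` is a hypothetical stand-in for reading one inode record out of the on-disk fsimage:

```python
# Illustrative sketch of an on-demand inode cache with LRU eviction,
# instead of loading every INode at namenode startup.
from collections import OrderedDict

class InodeCache:
    def __init__(self, capacity, load_from_image):
        self.capacity = capacity
        self.load = load_from_image      # hypothetical fsimage lookup
        self.cache = OrderedDict()       # insertion order tracks recency

    def get(self, path):
        if path in self.cache:
            self.cache.move_to_end(path)          # mark as recently used
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)    # evict least recently used
            self.cache[path] = self.load(path)    # cache miss: load on demand
        return self.cache[path]

# A dict stands in for the on-disk image:
image = {"/a": "inode-a", "/b": "inode-b", "/c": "inode-c"}
cache = InodeCache(capacity=2, load_from_image=image.get)
cache.get("/a"); cache.get("/b"); cache.get("/c")  # "/a" gets evicted
print(list(cache.cache))  # ['/b', '/c']
```

The trade-off, as with any demand-paged design, is that namespace operations on cold paths now pay an fsimage lookup instead of a guaranteed in-memory hit.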
Re: Data-local tasks
Hadoop does not implement a clever task scheduler: when a data node heartbeats with the namenode and wants a task, it simply gets one. The selection does not consider the task's input file at all. Best regards, Yongqiang He 2008-06-25 From: Saptarshi Guha Sent: 2008-06-30 21:12:24 To: core-user@hadoop.apache.org Cc: Subject: Data-local tasks Hello, I recall asking this question, but this is in addition to what I've asked. Firstly, to recap my question and Arun's specific response: -- On May 20, 2008, at 9:03 AM, Saptarshi Guha wrote: > Hello, > -- Does the "Data-local map tasks" counter mean the number of tasks that had their input data already present on the machine they are running on? -- i.e., there wasn't a need to ship the data to them. Response from Arun -- Yes. Your understanding is correct. More specifically, it means that the map task got scheduled on a machine on which one of the replicas of its input-split block was present and was served by the datanode running on that machine. *smile* Arun Now, is Hadoop designed to schedule a map task on a machine which has one of the replicas of its input-split block? Failing that, does it then assign the map task to a machine close to one that contains a replica of its input-split block? Are there any performance metrics for this? Many thanks Saptarshi Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha
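What a locality-aware assignment would look like can be sketched in a few lines (illustrative only, not Hadoop's scheduler; block ids and node names are made up): prefer a pending task whose input block has a replica on the heartbeating node, and fall back to any task otherwise.

```python
# Illustrative sketch of a locality-aware task pick at heartbeat time.
def pick_task(pending_tasks, block_locations, node):
    """pending_tasks: list of block ids; block_locations: block -> replica nodes."""
    for task in pending_tasks:
        if node in block_locations[task]:
            return task                          # data-local assignment
    return pending_tasks[0] if pending_tasks else None  # non-local fallback

locations = {"blk_1": {"nodeA", "nodeB"}, "blk_2": {"nodeC"}}
print(pick_task(["blk_1", "blk_2"], locations, "nodeC"))  # blk_2
print(pick_task(["blk_1", "blk_2"], locations, "nodeD"))  # blk_1
```

The "Data-local map tasks" counter Arun describes counts exactly the assignments that take the first branch here; the fallback branch is what forces data to be shipped over the network.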
Re: Is it possible to access the HDFS using webservices?
If you want to access HDFS metadata through web services, that is fine, but it is not a wise way to move the data itself. Going further, the namenode daemon could even be implemented as a web service; it would just be another alternative to RPC.
Best regards, Yongqiang He 2008-07-01 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China
From: [EMAIL PROTECTED]
Sent: 2008-07-01 06:19:30
To: [EMAIL PROTECTED]; core-user@hadoop.apache.org
Cc:
Subject: Is it possible to access the HDFS using webservices?
Hi everybody, I'm trying to access HDFS using web services. The idea is that the web service client can access HDFS using SOAP or REST and has to support all the hdfs shell commands. Is there some work around this? I really appreciate any feedback, Xavier
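As a rough sketch of the "metadata over web services" idea: wrap a metadata operation behind an HTTP endpoint. This is purely hypothetical, built on the JDK's built-in com.sun.net.httpserver, and it returns a canned listing instead of calling a real namenode; bulk data would still go through the datanode streaming protocol rather than SOAP/REST.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical REST-ish wrapper: expose a directory listing over plain HTTP
// instead of Hadoop RPC. Only metadata belongs here, not file contents.
public class HdfsMetaService {
    public static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/ls", exchange -> {
            // Canned response standing in for a real FileSystem.listStatus() call.
            byte[] body = "/user/xavier/part-00000\n".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

Any HTTP client (curl, a browser, a SOAP toolkit's transport) could then consume the listing.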
Re: Re: understanding of client connection code
I noticed that in the DFSClient's DataStreamer thread, the run method sends data out while synchronized on dataQueue. Is this really needed? remove, wait, and getFirst on dataQueue should be synchronized on dataQueue, but does it need to hold the lock while sending a packet out? I doubt it. Can any developer give me a reason for doing that?
heyongqiang 2008-06-23
From: hong
Sent: 2008-06-21 10:10:59
To: core-user@hadoop.apache.org
Cc:
Subject: Re: understanding of client connection code
Brother, are you from Yu Haiyan's group?
On 2008-6-20, at 5:00 PM, heyongqiang wrote:
> The ipc.Client object is designed to be shared across threads, and each thread can only make synchronous RPC calls: the thread calls and then waits for a result or an error. This is implemented by a neat technique: each thread makes a distinct call (with a distinct Call object), and the user thread then waits on its Call object, which will later be notified by the connection's receiver thread. A user thread makes a call by first adding its Call object to the call list (later used by the response receiver), then synchronizing on the connection's socket output stream to write its call out. The connection's thread runs to collect responses on behalf of all user threads.
> What I have not mentioned is that Client actually maintains a connection table. In every Client object, a connection culler runs as a daemon whose sole purpose is to remove idle connections from the connection table. But it seems this culler thread does not close the socket the connection is associated with; it only makes a mark and does a notify. All the cleanup is handled by the connection thread itself. This is really a nice design! Even though the culler thread can cull the connection from the table, the connection thread also includes removal code, because there is a chance that the connection thread encounters some exception.
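The narrower locking the mail argues for can be sketched like this (hypothetical class, not the real DataStreamer): hold the dataQueue lock only long enough to take a packet, then do the slow "network send" without the lock, so writers can keep enqueueing while a packet is on the wire.

```java
import java.util.LinkedList;

// Sketch of narrow lock scope: lock the queue only for queue manipulation,
// never across the (slow) send itself.
public class NarrowLockSender {
    private final LinkedList<byte[]> dataQueue = new LinkedList<>();
    private final StringBuilder wire = new StringBuilder();  // stands in for the socket

    public void enqueue(byte[] packet) {
        synchronized (dataQueue) {
            dataQueue.addLast(packet);
            dataQueue.notifyAll();  // wake a streamer waiting for data
        }
    }

    // One iteration of a DataStreamer-like loop; returns false if queue was empty.
    public boolean sendOne() {
        byte[] packet;
        synchronized (dataQueue) {       // lock held only to dequeue
            if (dataQueue.isEmpty()) return false;
            packet = dataQueue.removeFirst();
        }
        wire.append(new String(packet)); // "network send" happens unlocked
        return true;
    }

    public String sent() { return wire.toString(); }
}
```

Whether the real code can drop the lock during the send depends on who else touches the packet after dequeue; that is exactly the question the mail raises.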
> The above is a brief summary of my understanding of Hadoop's ipc code.
> Below is a test result measuring the data throughput of Hadoop:
> +--------------+------------------+
> | threadCounts | avg(averageRate) |
> +--------------+------------------+
> |            1 |   53030539.48913 |
> |            2 |  35325499.583756 |
> |            3 |  24998284.969072 |
> |            4 |   19824934.28125 |
> |            5 |  15956391.489583 |
> |            6 |  15948640.175532 |
> |            7 |  14623977.375691 |
> |            8 |  16098080.160131 |
> |            9 |   8967970.3877005 |
> |           10 |  14569087.178947 |
> |           11 |   8962683.6662088 |
> |           12 |  20063735.297872 |
> |           13 |  13174481.053977 |
> |           14 |  10137907.034188 |
> |           15 |   6464513.2013889 |
> |           16 |   23064338.76087 |
> |           17 |   18688537.44385 |
> |           18 |  18270909.854317 |
> |           19 |  13086261.536538 |
> |           20 |  10784059.367347 |
> +--------------+------------------+
> The first column is the thread count of my test application; the second column is the average per-thread download rate. The rate seems to drop sharply as the thread count increases. This is a very simple test application. Can anyone tell me why? Where is the bottleneck when a user app uses multiple threads?
>
> heyongqiang
> 2008-06-20
datanode start failure
When I restart HDFS, I get the error below, which causes the datanode to exit. If I delete the folders and files where Hadoop stores its data and then restart, it is OK, but I cannot do that every time I restart... Does anyone know why, and how to avoid it? Thanks!
2008-05-22 09:29:51,215 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = 114.vega/192.168.100.114
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.16.1
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16 -r 635123; compiled by 'hadoopqa' on Sun Mar 9 05:44:19 UTC 2008
************************************************************/
2008-05-22 09:30:10,662 ERROR org.apache.hadoop.dfs.DataNode: java.io.IOException: Incompatible namespaceIDs in /opt/hadoop-0.16.1/filesystem/data: namenode namespaceID = 1486286536; datanode namespaceID = 1825907088
	at org.apache.hadoop.dfs.DataStorage.doTransition(DataStorage.java:298)
	at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:142)
	at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:236)
	at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:162)
	at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:2531)
	at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2475)
	at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2496)
	at org.apache.hadoop.dfs.DataNode.main(DataNode.java:2692)
2008-05-22 09:30:10,663 INFO org.apache.hadoop.dfs.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at 114.vega/192.168.100.114
************************************************************/
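"Incompatible namespaceIDs" typically means the namenode was re-formatted while the datanode kept its old storage. A commonly reported workaround, instead of wiping the data directories, is to rewrite the namespaceID line in the datanode's current/VERSION file to match the namenode's. A sketch of that patch-up, assuming the VERSION file is in Java properties format as in the 0.16-era storage layout (stop the datanode first, and back the file up):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Properties;

// Rewrite the namespaceID in a datanode's <dfs.data.dir>/current/VERSION
// file so it matches the (re-formatted) namenode's namespaceID.
public class FixNamespaceId {
    public static void rewrite(File versionFile, String namenodeNamespaceId)
            throws IOException {
        Properties props = new Properties();
        try (InputStream in = new FileInputStream(versionFile)) {
            props.load(in);  // VERSION is key=value pairs
        }
        props.setProperty("namespaceID", namenodeNamespaceId);
        try (OutputStream out = new FileOutputStream(versionFile)) {
            props.store(out, "patched to match namenode namespaceID");
        }
    }
}
```

For the log above, that would mean setting the datanode's namespaceID from 1825907088 to the namenode's 1486286536.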
Re: Confuse about the Client.Connection
Well, I guess I got the answer. First, Hadoop uses TCP, so a reordering like resultBody_by_threadB arriving ahead of callId_by_threadB cannot occur. Second, the server synchronizes on the response queue while the responder sends response messages, one call at a time. Since the client threads share the same connection to the server, the messages will not be interleaved.
heyongqiang 2008-05-22
From: heyongqiang
Sent: 2008-05-22 13:30:10
To: core-user
Cc:
Subject: Confuse about the Client.Connection
hi, all
I took a look at the source code of org.apache.hadoop.ipc.Client, and I wonder: if two client threads invoke getConnection() specifying the same arguments, they will get the same Connection object, so how can they distinguish their results from each other? I noticed that the results streamed back from the server are collected by the Connection's thread, not the callers' threads, and the Connection's thread expects results as callId_XX, resultBody_XX pairs. Is there a situation in which the Connection's thread collects callId_by_threadA, resultBody_by_threadB, callId_by_threadB, resultBody_by_threadA? I think this situation is plausible; how does the current code handle it?
heyongqiang [EMAIL PROTECTED] 2008-05-22
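The demultiplexing discussed above comes down to a call table keyed by callId: the single receiver thread reads (id, result) pairs off the wire and wakes exactly the caller whose id matches, so results can never be delivered to the wrong thread even when responses come back out of call order. A minimal sketch of that pattern (hypothetical names, not the actual ipc.Client code):

```java
import java.util.HashMap;
import java.util.Map;

// Call-table demultiplexing: each caller registers a Call under a unique id;
// the receiver thread looks the Call up by id and notifies only that caller.
public class CallTable {
    public static class Call {
        public Object value;
        public boolean done;
    }

    private final Map<Integer, Call> calls = new HashMap<>();

    public Call register(int id) {
        Call c = new Call();
        synchronized (calls) { calls.put(id, c); }
        return c;
    }

    // Invoked by the receiver thread for each (id, value) read off the socket.
    public void receive(int id, Object value) {
        Call c;
        synchronized (calls) { c = calls.remove(id); }
        synchronized (c) {
            c.value = value;
            c.done = true;
            c.notifyAll();   // wake only the thread waiting on this call
        }
    }

    public Object waitFor(Call c) throws InterruptedException {
        synchronized (c) {
            while (!c.done) c.wait();
            return c.value;
        }
    }
}
```

Because the id travels with its result, out-of-order delivery across different calls is harmless; only id/body interleaving within one response would be a problem, which TCP plus server-side serialization rules out.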
Confuse about the Client.Connection
hi, all
I took a look at the source code of org.apache.hadoop.ipc.Client, and I wonder: if two client threads invoke getConnection() specifying the same arguments, they will get the same Connection object, so how can they distinguish their results from each other? I noticed that the results streamed back from the server are collected by the Connection's thread, not the callers' threads, and the Connection's thread expects results as callId_XX, resultBody_XX pairs. Is there a situation in which the Connection's thread collects callId_by_threadA, resultBody_by_threadB, callId_by_threadB, resultBody_by_threadA? I think this situation is plausible; how does the current code handle it?
heyongqiang [EMAIL PROTECTED] 2008-05-22
Re: Re: Hadoop summit video capture?
Have you tried http://research.yahoo.com/node/2104? But I cannot download the video, and cannot even find the IE temp file. It seems Yahoo has put some limits on it.
heyongqiang 2008-05-15
From: Cole Flournoy
Sent: 2008-05-15 03:48:20
To: core-user@hadoop.apache.org
Cc:
Subject: Re: Hadoop summit video capture?
They haven't been uploaded yet; we are begging and hoping that whoever has them will post them somewhere. I second Veoh, hadoop rocks. Cole
On Wed, May 14, 2008 at 4:11 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> I tried finding those Hadoop videos on Veoh, but got 0 hits:
> http://www.veoh.com/search.html?type=v&search=hadoop
> Got URL, Ted?
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message -----
> > From: Ted Dunning <[EMAIL PROTECTED]>
> > To: core-user@hadoop.apache.org
> > Sent: Wednesday, May 14, 2008 1:50:02 PM
> > Subject: Re: Hadoop summit video capture?
> >
> > Use Veoh instead. Higher resolution. Higher uptime. Nicer embeds.
> > And the views get chewed up by hadoop instead of google's implementation!
> > (conflict of interest on my part should be noted)
> >
> > On 5/14/08 10:43 AM, "Cole Flournoy" wrote:
> > > Man, Yahoo needs to get their act together with their video service (the videos are still down)! Is there any way someone can upload these videos to YouTube and provide a link?
> > > Thanks, Cole
> > >
> > > On Wed, Apr 23, 2008 at 11:36 AM, Chris Mattmann <[EMAIL PROTECTED]> wrote:
> > > > Thanks, Jeremy. Appreciate it.
> > > > Cheers, Chris
> > > >
> > > > On 4/23/08 8:25 AM, "Jeremy Zawodny" wrote:
> > > > > Certainly...
> > > > > Stay tuned.
> > > > > Jeremy
> > > > >
> > > > > On 4/22/08, Chris Mattmann wrote:
> > > > > > Hi Jeremy,
> > > > > > Any chance that these videos could be made in a downloadable format rather than thru Y!'s player?
> > > > > > For example, I'm traveling right now and would love to watch the rest of the presentations, but for the next few hours I won't have an internet connection.
> > > > > > So, my request won't help me, but may help folks in similar situations.
> > > > > > Just a thought, thanks!
> > > > > > Cheers, Chris
> > > > > >
> > > > > > On 4/22/08 1:27 PM, "Jeremy Zawodny" wrote:
> > > > > > > Okay, things appear to be fixed now.
> > > > > > > Jeremy
> > > > > > >
> > > > > > > On 4/20/08, Jeremy Zawodny wrote:
> > > > > > > > Not yet... there seem to be a lot of cooks in the kitchen on this one, but we'll get it fixed.
> > > > > > > > Jeremy
> > > > > > > >
> > > > > > > > On 4/19/08, Cole Flournoy wrote:
> > > > > > > > > Any news on when the videos are going to work? I am dying to watch them!
> > > > > > > > > Cole
> > > > > > > > >
> > > > > > > > > On Fri, Apr 18, 2008 at 8:10 PM, Jeremy Zawodny wrote:
> > > > > > > > > > Almost...