Hadoop Beijing Meeting has successfully concluded! www.hadooper.cn is ready now.
Hi all, the Hadoop Beijing Meeting successfully concluded on Nov 23. Thank you all for your attention. According to the agreements reached at this meeting, we have finished setting up the hadoop-in-china nonprofit website: www.hadooper.cn. We hope we can form a powerful Hadoop community in China. Take a look at the website, and if you are interested in contributing to the hadooper-in-china community, please drop me an email. We have uploaded the Hadoop Beijing Meeting pictures, slides and videos to www.hadooper.cn; they are also available on the hadoop-in-china Google group (http://groups.google.com/group/hadooper_cn). Please let me know if you have any suggestions. Thanks! heyongqiang 2008-12-01
Re: RE: Hadoop beijing meeting draft agenda is ready.
Hi Ding Hui, we will do our best to make the slides, pictures and possibly videos public on the internet. I will confirm these things with the speakers and coordinate with our volunteers. Thank you for your suggestion. heyongqiang 2008-11-20 From: Ding, Hui Sent: 2008-11-20 01:20:22 To: [EMAIL PROTECTED] Cc: Subject: RE: Hadoop beijing meeting draft agenda is ready. Hi, Some of the talks sound really interesting. Is it possible to videotape this and make it public? Or at least make the slides available? Cheers -Original Message- From: heyongqiang [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 19, 2008 8:40 AM To: core-user; core-dev; hbase-user Subject: Hadoop beijing meeting draft agenda is ready. Hi all, the Hadoop Beijing meeting agenda is ready now. Currently I have only counted people who replied to my email or posted on the Google group; I will send a Word document to the people I have counted. We still welcome participants from companies, institutes and universities. Please drop me an email if you are interested. I tried to send the agenda to the mailing lists, but it always failed after several retries with an exception saying "552 spam score (5.0) exceeded threshold". I have posted the agenda on the Google group (http://groups.google.com/group/hadoop-beijing-meeting/). BTW, the agenda will be adjusted according to the actual situation; the meeting may last only half a day. heyongqiang 2008-11-20
Re: Hadoop beijing meeting draft agenda is ready.
Hi all, I have also uploaded the agenda Word document to the Google group. -- heyongqiang 2008-11-20 - From: heyongqiang Sent: 2008-11-20 00:04:51 To: core-user; core-dev; hbase-user Cc: Subject: Hadoop beijing meeting draft agenda is ready. Hi all, the Hadoop Beijing meeting agenda is ready now. Currently I have only counted people who replied to my email or posted on the Google group; I will send a Word document to the people I have counted. We still welcome participants from companies, institutes and universities. Please drop me an email if you are interested. I tried to send the agenda to the mailing lists, but it always failed after several retries with an exception saying "552 spam score (5.0) exceeded threshold". I have posted the agenda on the Google group (http://groups.google.com/group/hadoop-beijing-meeting/). BTW, the agenda will be adjusted according to the actual situation; the meeting may last only half a day. heyongqiang 2008-11-20
Hadoop beijing meeting draft agenda is ready.
Hi all, the Hadoop Beijing meeting agenda is ready now. Currently I have only counted people who replied to my email or posted on the Google group; I will send a Word document to the people I have counted. We still welcome participants from companies, institutes and universities. Please drop me an email if you are interested. I tried to send the agenda to the mailing lists, but it always failed after several retries with an exception saying "552 spam score (5.0) exceeded threshold". I have posted the agenda on the Google group (http://groups.google.com/group/hadoop-beijing-meeting/). BTW, the agenda will be adjusted according to the actual situation; the meeting may last only half a day. heyongqiang 2008-11-20
Call for speakers at the Hadoop Beijing meeting!
Hi all, so far we have received only one speaker application from outside our team. We now welcome speakers for this meeting. You can choose any topic related to cloud computing. Please send me a brief introduction about yourself and your talk. By the way, this meeting is not meant to be academic; it is just an experience-exchange meeting. Best regards, Yongqiang He 2008-11-17 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China
Re: Re: Hadoop Beijing Meeting
Hi Jeremy Chow, welcome! Please send a brief introduction about yourself and your talk directly to me. I will send you the detailed agenda and other important things next week. Best regards, Yongqiang He 2008-11-12 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Jeremy Chow Sent: 2008-11-12 17:04:46 To: core-user@hadoop.apache.org Cc: Subject: Re: Hadoop Beijing Meeting Hi Mr. He Yongqiang, I apply as a speaker, though it is very hurried. I have always been a fan of Hadoop. This is my technical blog: http://coderplay.javaeye.com/. Regards, Jeremy -- My research interests are distributed systems, parallel computing and bytecode-based virtual machines. http://coderplay.javaeye.com
Re: Hadoop Beijing Meeting
Hello, we have created a Google group for this meeting: http://groups.google.com/group/hadoop-beijing-meeting/. It is fine to discuss this meeting either on the mailing list or in the Google group; we will make announcements in both places. Best regards, Yongqiang He 2008-11-12 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: 永强 何 Sent: 2008-11-12 14:05:00 To: core-user@hadoop.apache.org; [EMAIL PROTECTED] Cc: Subject: Hadoop Beijing Meeting Hello all, we are planning to host a Hadoop Beijing meeting next Sunday (23rd of Nov.). We now welcome speakers and participants! If you are interested in cloud computing topics and can join us that day in Beijing, you are invited; please let me know by dropping me an e-mail. This meeting will be held in: Room 948, 9th floor, Institute of Computing Technology (ICT), No.6 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing, China. It is our great honor to have invited Doctor Li Zha, who will give us a brief welcome speech. We are also trying to invite Doctor Zhiwei Xu, the chief scientist of the Institute of Computing Technology (ICT). We welcome speakers for this meeting with our greatest sincerity; if you are interested in giving a talk at this meeting, please let me know and I will add it to the schedule. Best regards! He Yongqiang Email: [EMAIL PROTECTED] Tel: 86-10-62600919(O) Fax: 86-10-626000900 Key Laboratory of Network Science and Technology, Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China
Re: File permissions issue
Because with that permission set, the "other" class cannot write to the temp directory, and user3 is not in the same group as user2. heyongqiang 2008-07-09 From: Joman Chu Sent: 2008-07-09 13:06:51 To: core-user@hadoop.apache.org Cc: Subject: File permissions issue Hello, On a cluster where I run Hadoop, it seems that the temp directory created by Hadoop (in our case, /tmp/hadoop/) gets its permissions set to "drwxrwxr-x", owned by the first person that runs a job after the Hadoop services are started. This causes file permission problems as we try to run jobs. For example, user1:user1 starts Hadoop using ./start-all.sh. Then user2:user2 runs a Hadoop job. Temp directories (/tmp/hadoop/) are now created on all nodes in the cluster, owned by user2 with permissions "drwxrwxr-x". Now user3:user3 tries to run a job and gets the following exception:
java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1704)
at java.io.File.createTempFile(File.java:1793)
at org.apache.hadoop.util.RunJar.main(RunJar.java:115)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
Why does this happen and how can we fix this? Our current stopgap measure is to run jobs as the user that started Hadoop. That is, in our example, after user1 starts Hadoop, user1 runs a job. Everything seems to work fine then. Thanks, Joman Chu
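One common workaround for this class of problem (not from the original thread, just an illustrative sketch) is to pre-create the shared temp directory as world-writable with the sticky bit set, mode 1777, the same way /tmp itself is set up, so every user can create files but only owners can delete them. The path below is a scratch stand-in for /tmp/hadoop:

```python
# Illustrative sketch: pre-create a shared temp dir with mode 1777
# (rwxrwxrwt), so any user can write into it, as /tmp itself allows.
import os
import stat
import tempfile

def make_shared_tmp(path):
    """Create a shared temp dir and force mode 1777 (world-writable + sticky)."""
    os.makedirs(path, exist_ok=True)
    # chmod explicitly, because the mode passed to makedirs is masked by umask
    os.chmod(path, 0o1777)
    return stat.S_IMODE(os.stat(path).st_mode)

# Demonstrate on a scratch directory (stands in for /tmp/hadoop):
scratch = os.path.join(tempfile.mkdtemp(), "hadoop")
mode = make_shared_tmp(scratch)
print(oct(mode))  # 0o1777
```

With the sticky bit set, user3's createTempFile call would succeed even though user2 created the directory first.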
Re: Re: hadoop download performance when user app adopts multi-thread
Actually this test result is a good result; it was just my misunderstanding of the result, my mistake. The second column is actually the average download rate per thread. This test was run on one node; we also ran tests simultaneously on multiple nodes, and the performance results seem acceptable to us. What you said is right, but this overhead (seek time and I/O consumption) does not seem easy to optimize. Thank you for your attention. Best regards, Yongqiang He 2008-07-09 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Samuel Guo Sent: 2008-07-09 09:47:32 To: core-user@hadoop.apache.org Cc: Subject: Re: hadoop download performance when user app adopts multi-thread heyongqiang wrote:
> The ipc.Client object is designed to be shared across threads, and each thread can only make synchronous RPC calls, meaning each thread makes a call and waits for a result or error. This is implemented by a neat technique: each thread makes a distinct call (with its own call object); the user thread then waits on its call object, which is later notified by the connection's receiver thread. A user thread makes a call by first adding its call object to the call list (later used by the response receiver), then synchronizing on the connection's socket output stream and waiting to write its call out. The connection's receiver thread runs to collect responses on behalf of all user threads.
> What I have not mentioned is that Client actually maintains a connection table.
> In every Client object, a connection culler runs as a daemon whose sole purpose is to remove idle connections from the connection table,
> but it seems that this culler thread does not close the socket associated with the connection; it only makes a mark and does a notify. All the cleanup is handled by the connection thread itself. This is really a wonderful design! Even though the culler thread can cull the connection from the table, the connection thread also includes removal code, because there is a chance that the connection thread encounters an exception.
>
> The above is a brief summary of my understanding of Hadoop's IPC code.
> Below is a test result used to measure the data throughput of Hadoop:
> +--------------+------------------+
> | threadCounts | avg(averageRate) |
> +--------------+------------------+
> |            1 |   53030539.48913 |
> |            2 |  35325499.583756 |
> |            3 |  24998284.969072 |
> |            4 |   19824934.28125 |
> |            5 |  15956391.489583 |
> |            6 |  15948640.175532 |
> |            7 |  14623977.375691 |
> |            8 |  16098080.160131 |
> |            9 |   8967970.3877005 |
> |           10 |  14569087.178947 |
> |           11 |   8962683.6662088 |
> |           12 |  20063735.297872 |
> |           13 |  13174481.053977 |
> |           14 |  10137907.034188 |
> |           15 |   6464513.2013889 |
> |           16 |   23064338.76087 |
> |           17 |   18688537.44385 |
> |           18 |  18270909.854317 |
> |           19 |  13086261.536538 |
> |           20 |  10784059.367347 |
> +--------------+------------------+
>
> The first column is the thread count of my test application; the second column is the average download rate. The rate seems to drop sharply as the thread count increases.
> This is a very simple test application. Can anyone tell me why? Where is the bottleneck when a user app adopts multiple threads?

As you know, a block of an HDFS file is stored as a file in the local filesystem of a datanode. Different threads reading different HDFS files, or different blocks of the same file, may produce a burst of read requests against different local files (HDFS blocks) on a given datanode, so disk seek time and I/O consumption become heavy and response times grow. But this is just the local behavior of a single datanode; the overall throughput of the Hadoop cluster will still be good. So, can you supply more information about your test?

> heyongqiang
> 2008-06-20
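The per-thread call-object pattern described in the quoted summary above can be sketched in a few lines of Python. This is an illustrative analogue, not Hadoop's actual code: each caller registers a call object keyed by id, blocks on it, and a single receiver thread delivers responses and wakes only the thread that owns that call. The fake "wire" and doubling server are stand-ins:

```python
# Illustrative analogue of the ipc.Client pattern: per-thread call objects,
# one shared receiver thread that notifies the waiting caller.
import threading
import queue

class Call:
    def __init__(self, call_id, param):
        self.id = call_id
        self.param = param
        self.result = None
        self.done = threading.Event()   # the caller blocks on this

class Client:
    def __init__(self):
        self.calls = {}                 # pending-call table, shared with receiver
        self.next_id = 0
        self.lock = threading.Lock()
        self.wire = queue.Queue()       # stands in for the connection's socket
        threading.Thread(target=self._receiver, daemon=True).start()

    def call(self, param):
        with self.lock:                 # register the call before sending it
            c = Call(self.next_id, param)
            self.next_id += 1
            self.calls[c.id] = c
        self.wire.put(c.id)             # "write the call out"
        c.done.wait()                   # wait on this thread's own call object
        return c.result

    def _receiver(self):
        while True:                     # one thread collects all responses
            call_id = self.wire.get()
            c = self.calls.pop(call_id) # receiver cleans up the call table
            c.result = c.param * 2      # fake server: doubles the input
            c.done.set()                # wake only the owning caller

client = Client()
results = []
threads = [threading.Thread(target=lambda v=v: results.append(client.call(v)))
           for v in (1, 2, 3)]
for t in threads: t.start()
for t in threads: t.join()
print(sorted(results))  # [2, 4, 6]
```

The point of the design is that the receiver never blocks a caller that isn't waiting for that particular response, which is why the real Client scales to many concurrent threads over one connection.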
Re: Re: modified word count example
Where can I find the Reverse-Index application? heyongqiang 2008-07-09 From: Shengkai Zhu Sent: 2008-07-09 09:06:38 To: core-user@hadoop.apache.org Cc: Subject: Re: modified word count example Another MapReduce application, Reverse-Index, behaves similarly to your description. You can refer to that. On 7/9/08, heyongqiang <[EMAIL PROTECTED] > wrote:
>
> InputFormat's method RecordReader getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException returns a RecordReader. You can implement your own InputFormat and RecordReader:
> 1) the RecordReader keeps the FileSplit (a subclass of InputSplit) as a field in its class;
> 2) the RecordReader's createValue() method always returns the FileSplit's file field.
>
> Hope this helps.
>
> heyongqiang
> 2008-07-09
>
> From: Sandy
> Sent: 2008-07-09 01:45:15
> To: core-user@hadoop.apache.org
> Cc:
> Subject: modified word count example
>
> Hi,
>
> Let's say I want to run a MapReduce job on a series of text files (say x.txt, y.txt and z.txt).
>
> Given the following mapper function in Python (from WordCount.py):
>
> class WordCountMap(Mapper, MapReduceBase):
>    one = IntWritable(1) # removed
>    def map(self, key, value, output, reporter):
>        for w in value.toString().split():
>            output.collect(Text(w), self.one) # how can I modify this line?
>
> Instead of creating pairs of each word found and the numeral one as the example does, is there a function I can invoke to store the name of the file it came from instead?
>
> Thus, I'd have pairs like <"water", "x.txt" >, <"hadoop", "y.txt" >, <"hadoop", "z.txt" >, etc.
>
> I took a look at the javadoc, but I'm not sure I've checked in the right places. Could someone point me in the right direction?
>
> Thanks!
>
> -SM
Re: modified word count example
InputFormat's method RecordReader getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException returns a RecordReader. You can implement your own InputFormat and RecordReader:
1) the RecordReader keeps the FileSplit (a subclass of InputSplit) as a field in its class;
2) the RecordReader's createValue() method always returns the FileSplit's file field.
Hope this helps. heyongqiang 2008-07-09 From: Sandy Sent: 2008-07-09 01:45:15 To: core-user@hadoop.apache.org Cc: Subject: modified word count example Hi, Let's say I want to run a MapReduce job on a series of text files (say x.txt, y.txt and z.txt). Given the following mapper function in Python (from WordCount.py):
class WordCountMap(Mapper, MapReduceBase):
    one = IntWritable(1) # removed
    def map(self, key, value, output, reporter):
        for w in value.toString().split():
            output.collect(Text(w), self.one) # how can I modify this line?
Instead of creating pairs of each word found and the numeral one as the example does, is there a function I can invoke to store the name of the file it came from instead? Thus, I'd have pairs like <"water", "x.txt" >, <"hadoop", "y.txt" >, <"hadoop", "z.txt" >, etc. I took a look at the javadoc, but I'm not sure I've checked in the right places. Could someone point me in the right direction? Thanks! -SM
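The map logic Sandy is asking for, emitting (word, filename) pairs instead of (word, 1), can be sketched in plain Python outside Hadoop. In a real job the filename would come from the custom RecordReader described above (or from job configuration); here it is simply passed in:

```python
# Illustrative sketch (plain Python, outside Hadoop): emit one
# (word, source_filename) pair per word, instead of (word, 1).
def map_word_to_file(filename, line):
    """Return (word, filename) pairs for every word in the input line."""
    return [(word, filename) for word in line.split()]

pairs = map_word_to_file("x.txt", "water hadoop")
print(pairs)  # [('water', 'x.txt'), ('hadoop', 'x.txt')]
```

A reducer over these pairs would then collect, for each word, the set of files it appears in, which is exactly the Reverse-Index shape mentioned in the reply.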
Re: Monthly Hadoop User Group Meeting
Will there be a meeting in Beijing, China in the future? haha heyongqiang 2008-07-09 From: Ajay Anand Sent: 2008-07-09 01:32:10 To: core-user@hadoop.apache.org; [EMAIL PROTECTED]; [EMAIL PROTECTED] Cc: Subject: Monthly Hadoop User Group Meeting The next Hadoop User Group meeting is scheduled for July 22nd from 6 - 7:30 pm at Yahoo! Mission College, Building 1, Training Rooms 3 and 4. Agenda: Cascading - Chris Wenzel; Performance Benchmarking on Hadoop (Terabyte Sort, Gridmix) - Sameer Paranjpye, Owen O'Malley, Runping Qi. Registration and directions: http://upcoming.yahoo.com/event/869166 Look forward to seeing you there! Ajay
Re: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets?
I suspect this error occurred because one datanode quit during the client write, and that datanode was then chosen by the namenode for the client to contact for writing (this is what DFSClient.DFSOutputStream.nextBlockOutputStream does). By default, the client side retries 3 times and sleeps a total of 3*xxx seconds, but the NameNode needs more time to find the dead node. So every time the client wakes up, there is a chance the dead node is chosen again. Maybe you should change the interval at which the NameNode looks for dead nodes, or make the client sleep longer? I have changed the sleep code in DFSClient.DFSOutputStream.nextBlockOutputStream like below:
if (!success) {
  LOG.info("Abandoning block " + block + " and retry...");
  namenode.abandonBlock(block, src, clientName);
  // Connection failed. Let's wait a little bit and retry
  retry = true;
  try {
    if (System.currentTimeMillis() - startTime > 5000) {
      LOG.info("Waiting to find target node: " + nodes[0].getName());
    }
    long time = heartbeatRecheckInterval;
    Thread.sleep(time);
  } catch (InterruptedException iex) {
  }
}
heartbeatRecheckInterval is exactly the recheck interval of the NameNode's dead-node monitor. I also changed the NameNode's dead-node recheck interval to be double the heartbeat interval. Best regards, Yongqiang He 2008-07-08 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Raghu Angadi Sent: 2008-07-08 01:45:19 To: core-user@hadoop.apache.org Cc: Subject: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets? The ConcurrentModificationException looks like a bug; we should file a jira. Regarding why the writes are failing, we need to look at more logs. Could you attach the complete log from one of the failed tasks? Also try to see if there is anything in the NameNode log around that time. Raghu.
C G wrote:
> Hi All:
>
> I've got 0.17.0 set up on a 7-node grid (6 slaves w/datanodes, 1 master running namenode). I'm trying to process a small (180G) dataset. I've done this successfully and painlessly running 0.15.0. When I run 0.17.0 with the same data and same code (w/API changes for 0.17.0 and recompiled, of course), I get a ton of failures. I've increased the number of namenode threads trying to resolve this, but that doesn't seem to help. The errors are of the following flavor:
>
> java.io.IOException: Could not get block locations. Aborting...
> java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> Exception in thread "Thread-2" java.util.ConcurrentModificationException
> Exception closing file /blah/_temporary/_task_200807052311_0001_r_04_0/baz/part-x
>
> As things stand right now, I can't deploy to 0.17.0 (or 0.16.4 or 0.17.1). I am wondering if anybody can shed some light on this, or if others are having similar problems.
>
> Any thoughts, insights, etc. would be greatly appreciated.
>
> Thanks,
> C G
>
> Here's an ugly trace:
> 08/07/06 01:43:29 INFO mapred.JobClient: map 100% reduce 93%
> 08/07/06 01:43:29 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_03_0, Status : FAILED
> java.io.IOException: Could not get block locations. Aborting...
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_03_0: Exception closing file /output/_temporary/_task_200807052311_0001_r_03_0/a/b/part-3
> task_200807052311_0001_r_03_0: java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2095)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_03_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
> task_200807052311_0001_r_03_0: at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
> task_200807052311_0001_r_03_0: at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.fs.FileSystem$Cache.closeAll(Fi
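The timing problem described in the reply above (the client retries faster than the namenode detects the dead node, so the dead node can be handed out again) can be shown with a tiny simulation. This is illustrative only, not Hadoop code; the interval values are made up:

```python
# Illustrative simulation of the retry-vs-detection race: retries that fire
# before the namenode's dead-node recheck can still be given the dead node.
def retries_hitting_dead_node(client_sleep, recheck_interval, max_retries=3):
    """Count retries that happen before the namenode marks the node dead."""
    hits = 0
    t = 0
    for _ in range(max_retries):
        t += client_sleep                  # client sleeps, then retries
        if t < recheck_interval:
            hits += 1                      # namenode still lists the dead node
    return hits

# Short client sleeps: all 3 retries can be handed the dead node again.
print(retries_hitting_dead_node(client_sleep=6, recheck_interval=30))   # 3
# The patch's idea: sleep for the recheck interval, so no retry races it.
print(retries_hitting_dead_node(client_sleep=30, recheck_interval=30))  # 0
```

This is why the patch ties the client's sleep to heartbeatRecheckInterval: by the time the client retries, the namenode has had a chance to exclude the dead datanode.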
Re: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets?
Is the ConcurrentModificationException a Java bug, or something else? Best regards, Yongqiang He 2008-07-08 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Raghu Angadi Sent: 2008-07-08 01:45:19 To: core-user@hadoop.apache.org Cc: Subject: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets? The ConcurrentModificationException looks like a bug; we should file a jira. Regarding why the writes are failing, we need to look at more logs. Could you attach the complete log from one of the failed tasks? Also try to see if there is anything in the NameNode log around that time. Raghu.
C G wrote:
> Hi All:
>
> I've got 0.17.0 set up on a 7-node grid (6 slaves w/datanodes, 1 master running namenode). I'm trying to process a small (180G) dataset. I've done this successfully and painlessly running 0.15.0. When I run 0.17.0 with the same data and same code (w/API changes for 0.17.0 and recompiled, of course), I get a ton of failures. I've increased the number of namenode threads trying to resolve this, but that doesn't seem to help. The errors are of the following flavor:
>
> java.io.IOException: Could not get block locations. Aborting...
> java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> Exception in thread "Thread-2" java.util.ConcurrentModificationException
> Exception closing file /blah/_temporary/_task_200807052311_0001_r_04_0/baz/part-x
>
> As things stand right now, I can't deploy to 0.17.0 (or 0.16.4 or 0.17.1). I am wondering if anybody can shed some light on this, or if others are having similar problems.
>
> Any thoughts, insights, etc. would be greatly appreciated.
> Thanks,
> C G
>
> Here's an ugly trace:
> 08/07/06 01:43:29 INFO mapred.JobClient: map 100% reduce 93%
> 08/07/06 01:43:29 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_03_0, Status : FAILED
> java.io.IOException: Could not get block locations. Aborting...
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_03_0: Exception closing file /output/_temporary/_task_200807052311_0001_r_03_0/a/b/part-3
> task_200807052311_0001_r_03_0: java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2095)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_03_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
> task_200807052311_0001_r_03_0: at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
> task_200807052311_0001_r_03_0: at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
> 08/07/06 01:44:32 INFO mapred.JobClient: map 100% reduce 74%
> 08/07/06 01:44:32 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_01_0, Status : FAILED
> java.io.IOException: Could not get block locations. Aborting...
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_01_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
> task_200807052311_0001_r_01_0: at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
> task_200807052311_0001_r_01_0: at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
> task_200807052311_0001_r_01_0: at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
> task_200807052311_0001_r_01_0:
Re: OK to remove NN's edits file?
I have also encountered this error. I added a try/catch clause around the main code (FSEditLog.loadFSEdits) and skipped past the unknown-opcode and EOF errors. Best regards, Yongqiang He 2008-07-08 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Otis Gospodnetic Sent: 2008-07-07 22:34:02 To: core-user@hadoop.apache.org Cc: Subject: OK to remove NN's edits file? Hello, I have Hadoop 0.16.2 running in a cluster whose Namenode seems to have a corrupt "edits" file. This causes an EOFException during NN init, which causes the NN to exit immediately (exception below). What is the recommended thing to do in such a case? I don't mind losing any of the data referenced in the "edits" file. Should I just remove the edits file, start the NN, and assume the NN will create a new, empty "edits" file and all will be well? This is what I see when the NN tries to start:
2008-07-07 10:58:43,255 ERROR dfs.NameNode - java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at org.apache.hadoop.io.UTF8.readFields(UTF8.java:106)
at org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:90)
at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:433)
at org.apache.hadoop.dfs.FSImage.loadFSEdits(FSImage.java:756)
at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:639)
at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:222)
at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:79)
at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:254)
at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:235)
at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:131)
at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:176)
at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:162)
at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:846)
at org.apache.hadoop.dfs.NameNode.main(NameNode.java:855)
Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: NoSuchMethodException - question to ask Tom White (and others) :-)
In most cases, this error occurs because you have not explicitly implemented a no-argument constructor. Best regards, Yongqiang He 2008-07-06 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China From: Xuan Dzung Doan Sent: 2008-07-06 09:12:43 To: core-user@hadoop.apache.org Cc: Subject: NoSuchMethodException - question to ask Tom White (and others) :-) I'm writing a mapred app in Hadoop 0.16.4 in which I implement my own InputSplit, called BioFileSplit, that extends FileSplit (it adds one int data field to FileSplit). Testing my program in Eclipse yielded an exception trace that roughly looks as follows:
Task Id : task_200807011030_0004_m_00_0, Status : FAILED
java.lang.RuntimeException: java.lang.NoSuchMethodException: edu.bio.ec2alignment.BioFileSplit.<init>()
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:80)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:180)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)
Caused by: java.lang.NoSuchMethodException: edu.bio.ec2alignment.BioFileSplit.<init>()
at java.lang.Class.getConstructor0(Class.java:2706)
at java.lang.Class.getDeclaredConstructor(Class.java:1985)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:74)
I found the following on the net: https://issues.apache.org/jira/browse/HADOOP-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578250#action_12578250 In this, Tom White mentioned a bug that produced an exception trace that looks similar to this. Are these the same problem? If so, this issue has been taken care of in version 0.17.0, right (issue HADOOP-3208)? I'd like Tom or others to verify this. I have a 0.16.4 environment that is stable and that I'm happy with; I'm not sure how stable 0.17.0 is, and I want to justify the decision to upgrade.
If these problems are not the same, can anyone suggest what the issue could actually be? Thanks, David. PS: It looks like version 0.17.0 is no longer available on the download page, only the latest 0.17.1 :-)
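The no-argument constructor requirement behind this failure can be illustrated with a small Python analogue (illustrative only; the class names mirror the thread, and `new_instance` is a stand-in for ReflectionUtils.newInstance, which instantiates splits reflectively via the zero-argument constructor):

```python
# Illustrative analogue: reflective instantiation needs a no-arg constructor.
class FileSplit:
    def __init__(self):
        self.path = None

class BadBioFileSplit(FileSplit):
    def __init__(self, extra):          # no zero-argument constructor
        super().__init__()
        self.extra = extra

class GoodBioFileSplit(FileSplit):
    def __init__(self, extra=0):        # callable with no arguments
        super().__init__()
        self.extra = extra

def new_instance(cls):
    """Stand-in for ReflectionUtils.newInstance: no-arg construction only."""
    return cls()

try:
    new_instance(BadBioFileSplit)
    ok_bad = True
except TypeError:                       # analogue of NoSuchMethodException
    ok_bad = False

print(ok_bad, isinstance(new_instance(GoodBioFileSplit), FileSplit))  # False True
```

In the Java case the fix is the same shape: give BioFileSplit an explicit public no-argument constructor (the framework then populates the fields via readFields).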
Should there be a way to avoid maintaining the whole namespace structure in memory?
In the current HDFS implementation, all INodeFile and INodeDirectory objects are loaded into memory. This happens when the FSNamespace structure is set up at namenode startup, as the namenode replays the fsimage file and edit log file. But if there are millions of files or directories, how can that be handled? I did an experiment creating directories. Before the experiment:
[EMAIL PROTECTED] bin]$ ps -p 9122 -o rss,size,vsize,%mem
   RSS      SZ     VSZ %MEM
153648 1193868 1275340  3.7
After I created the directories, it became:
[EMAIL PROTECTED] bin]$ ps -p 9122 -o rss,size,vsize,%mem
   RSS      SZ     VSZ %MEM
169084 1193868 1275340  4.0
I am trying to improve the fsimage file so that the namenode can locate and load the needed information on demand; just like the Linux VFS, we could keep only an inode cache. This would avoid loading the whole namespace structure at startup. Best regards, Yongqiang He 2008-07-01 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China
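The VFS-style idea proposed above, a bounded inode cache backed by an on-demand loader instead of an all-in-memory namespace, can be sketched as follows. This is illustrative only; `load_from_image` is a hypothetical stand-in for reading one inode record out of the on-disk fsimage:

```python
# Illustrative sketch of an on-demand inode cache with LRU eviction,
# instead of loading every INode at namenode startup.
from collections import OrderedDict

class InodeCache:
    def __init__(self, capacity, load_from_image):
        self.capacity = capacity
        self.load = load_from_image      # hypothetical fsimage lookup
        self.cache = OrderedDict()       # insertion order tracks recency

    def get(self, path):
        if path in self.cache:
            self.cache.move_to_end(path)          # mark as recently used
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)    # evict least recently used
            self.cache[path] = self.load(path)    # cache miss: load on demand
        return self.cache[path]

# A dict stands in for the on-disk image:
image = {"/a": "inode-a", "/b": "inode-b", "/c": "inode-c"}
cache = InodeCache(capacity=2, load_from_image=image.get)
cache.get("/a"); cache.get("/b"); cache.get("/c")  # "/a" gets evicted
print(list(cache.cache))  # ['/b', '/c']
```

The trade-off, as with any demand-paged design, is that namespace operations on cold paths now pay an fsimage lookup instead of a guaranteed in-memory hit.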
Re: Data-local tasks
Hadoop does not implement a clever task scheduler: when a data node heartbeats with the namenode and wants a task, it simply gets one. The selection does not consider the task's input file at all. Best regards, Yongqiang He 2008-06-25 From: Saptarshi Guha Sent: 2008-06-30 21:12:24 To: core-user@hadoop.apache.org Cc: Subject: Data-local tasks Hello, I recall asking this question, but this is in addition to what I've asked. Firstly, to recap my question and Arun's specific response: -- On May 20, 2008, at 9:03 AM, Saptarshi Guha wrote: > Hello, > -- Does the "Data-local map tasks" counter mean the number of tasks that had their input data already present on the machine they are running on? -- i.e., there wasn't a need to ship the data to them. Response from Arun -- Yes. Your understanding is correct. More specifically, it means that the map task got scheduled on a machine on which one of the replicas of its input-split block was present and was served by the datanode running on that machine. *smile* Arun Now, is Hadoop designed to schedule a map task on a machine which has one of the replicas of its input-split block? Failing that, does it then assign the map task to a machine close to one that contains a replica of its input-split block? Are there any performance metrics for this? Many thanks Saptarshi Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha
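What a locality-aware assignment would look like can be sketched in a few lines (illustrative only, not Hadoop's scheduler; block ids and node names are made up): prefer a pending task whose input block has a replica on the heartbeating node, and fall back to any task otherwise.

```python
# Illustrative sketch of a locality-aware task pick at heartbeat time.
def pick_task(pending_tasks, block_locations, node):
    """pending_tasks: list of block ids; block_locations: block -> replica nodes."""
    for task in pending_tasks:
        if node in block_locations[task]:
            return task                          # data-local assignment
    return pending_tasks[0] if pending_tasks else None  # non-local fallback

locations = {"blk_1": {"nodeA", "nodeB"}, "blk_2": {"nodeC"}}
print(pick_task(["blk_1", "blk_2"], locations, "nodeC"))  # blk_2
print(pick_task(["blk_1", "blk_2"], locations, "nodeD"))  # blk_1
```

The "Data-local map tasks" counter Arun describes counts exactly the assignments that take the first branch here; the fallback branch is what forces data to be shipped over the network.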
Re: Is it possible to access the HDFS using webservices?
If you want to access HDFS metadata through web services, that is fine, but it is not a wise way to move the data itself. Going further, the namenode daemon could even be implemented as a web service; it would just be another alternative to RPC.
Best regards, Yongqiang He 2008-07-01 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China
From: [EMAIL PROTECTED]
Sent: 2008-07-01 06:19:30
To: [EMAIL PROTECTED]; core-user@hadoop.apache.org
Cc:
Subject: Is it possible to access the HDFS using webservices?
Hi everybody, I'm trying to access HDFS using web services. The idea is that the web service client can access HDFS using SOAP or REST and has to support all the hdfs shell commands. Is there some work around this? I really appreciate any feedback, Xavier
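As a rough sketch of the "metadata over web services" idea: wrap a metadata operation behind an HTTP endpoint. This is purely hypothetical, built on the JDK's built-in com.sun.net.httpserver, and it returns a canned listing instead of calling a real namenode; bulk data would still go through the datanode streaming protocol rather than SOAP/REST.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical REST-ish wrapper: expose a directory listing over plain HTTP
// instead of Hadoop RPC. Only metadata belongs here, not file contents.
public class HdfsMetaService {
    public static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/ls", exchange -> {
            // Canned response standing in for a real FileSystem.listStatus() call.
            byte[] body = "/user/xavier/part-00000\n".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

Any HTTP client (curl, a browser, a SOAP toolkit's transport) could then consume the listing.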
Re: Re: understanding of client connection code
I noticed that in the DFSClient's DataStreamer thread, the run method sends data out while synchronized on dataQueue. Is this really needed? remove, wait, and getFirst on dataQueue should be synchronized on dataQueue, but does it need to hold the lock while sending a packet out? I doubt it. Can any developer give me a reason for doing that?
heyongqiang 2008-06-23
From: hong
Sent: 2008-06-21 10:10:59
To: core-user@hadoop.apache.org
Cc:
Subject: Re: understanding of client connection code
Brother, are you from Yu Haiyan's group?
On 2008-6-20, at 5:00 PM, heyongqiang wrote:
> The ipc.Client object is designed to be shared across threads, and each thread can only make synchronous RPC calls: the thread calls and then waits for a result or an error. This is implemented by a neat technique: each thread makes a distinct call (with a distinct Call object), and the user thread then waits on its Call object, which will later be notified by the connection's receiver thread. A user thread makes a call by first adding its Call object to the call list (later used by the response receiver), then synchronizing on the connection's socket output stream to write its call out. The connection's thread runs to collect responses on behalf of all user threads.
> What I have not mentioned is that Client actually maintains a connection table. In every Client object, a connection culler runs as a daemon whose sole purpose is to remove idle connections from the connection table. But it seems this culler thread does not close the socket the connection is associated with; it only makes a mark and does a notify. All the cleanup is handled by the connection thread itself. This is really a nice design! Even though the culler thread can cull the connection from the table, the connection thread also includes removal code, because there is a chance that the connection thread encounters some exception.
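The narrower locking the mail argues for can be sketched like this (hypothetical class, not the real DataStreamer): hold the dataQueue lock only long enough to take a packet, then do the slow "network send" without the lock, so writers can keep enqueueing while a packet is on the wire.

```java
import java.util.LinkedList;

// Sketch of narrow lock scope: lock the queue only for queue manipulation,
// never across the (slow) send itself.
public class NarrowLockSender {
    private final LinkedList<byte[]> dataQueue = new LinkedList<>();
    private final StringBuilder wire = new StringBuilder();  // stands in for the socket

    public void enqueue(byte[] packet) {
        synchronized (dataQueue) {
            dataQueue.addLast(packet);
            dataQueue.notifyAll();  // wake a streamer waiting for data
        }
    }

    // One iteration of a DataStreamer-like loop; returns false if queue was empty.
    public boolean sendOne() {
        byte[] packet;
        synchronized (dataQueue) {       // lock held only to dequeue
            if (dataQueue.isEmpty()) return false;
            packet = dataQueue.removeFirst();
        }
        wire.append(new String(packet)); // "network send" happens unlocked
        return true;
    }

    public String sent() { return wire.toString(); }
}
```

Whether the real code can drop the lock during the send depends on who else touches the packet after dequeue; that is exactly the question the mail raises.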
> The above is a brief summary of my understanding of Hadoop's ipc code.
> Below is a test result measuring the data throughput of Hadoop:
> +--------------+------------------+
> | threadCounts | avg(averageRate) |
> +--------------+------------------+
> |            1 |   53030539.48913 |
> |            2 |  35325499.583756 |
> |            3 |  24998284.969072 |
> |            4 |   19824934.28125 |
> |            5 |  15956391.489583 |
> |            6 |  15948640.175532 |
> |            7 |  14623977.375691 |
> |            8 |  16098080.160131 |
> |            9 |   8967970.3877005 |
> |           10 |  14569087.178947 |
> |           11 |   8962683.6662088 |
> |           12 |  20063735.297872 |
> |           13 |  13174481.053977 |
> |           14 |  10137907.034188 |
> |           15 |   6464513.2013889 |
> |           16 |   23064338.76087 |
> |           17 |   18688537.44385 |
> |           18 |  18270909.854317 |
> |           19 |  13086261.536538 |
> |           20 |  10784059.367347 |
> +--------------+------------------+
> The first column is the thread count of my test application; the second column is the average per-thread download rate. The rate seems to drop sharply as the thread count increases. This is a very simple test application. Can anyone tell me why? Where is the bottleneck when a user app uses multiple threads?
>
> heyongqiang
> 2008-06-20
datanode start failure
When I restart HDFS, I get the error below, which causes the datanode to exit. If I delete the folders and files where Hadoop stores its data and then restart, it is OK, but I cannot do that every time I restart... Does anyone know why, and how to avoid it? Thanks!
2008-05-22 09:29:51,215 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = 114.vega/192.168.100.114
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.16.1
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16 -r 635123; compiled by 'hadoopqa' on Sun Mar 9 05:44:19 UTC 2008
************************************************************/
2008-05-22 09:30:10,662 ERROR org.apache.hadoop.dfs.DataNode: java.io.IOException: Incompatible namespaceIDs in /opt/hadoop-0.16.1/filesystem/data: namenode namespaceID = 1486286536; datanode namespaceID = 1825907088
	at org.apache.hadoop.dfs.DataStorage.doTransition(DataStorage.java:298)
	at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:142)
	at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:236)
	at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:162)
	at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:2531)
	at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2475)
	at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2496)
	at org.apache.hadoop.dfs.DataNode.main(DataNode.java:2692)
2008-05-22 09:30:10,663 INFO org.apache.hadoop.dfs.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at 114.vega/192.168.100.114
************************************************************/
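"Incompatible namespaceIDs" typically means the namenode was re-formatted while the datanode kept its old storage. A commonly reported workaround, instead of wiping the data directories, is to rewrite the namespaceID line in the datanode's current/VERSION file to match the namenode's. A sketch of that patch-up, assuming the VERSION file is in Java properties format as in the 0.16-era storage layout (stop the datanode first, and back the file up):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Properties;

// Rewrite the namespaceID in a datanode's <dfs.data.dir>/current/VERSION
// file so it matches the (re-formatted) namenode's namespaceID.
public class FixNamespaceId {
    public static void rewrite(File versionFile, String namenodeNamespaceId)
            throws IOException {
        Properties props = new Properties();
        try (InputStream in = new FileInputStream(versionFile)) {
            props.load(in);  // VERSION is key=value pairs
        }
        props.setProperty("namespaceID", namenodeNamespaceId);
        try (OutputStream out = new FileOutputStream(versionFile)) {
            props.store(out, "patched to match namenode namespaceID");
        }
    }
}
```

For the log above, that would mean setting the datanode's namespaceID from 1825907088 to the namenode's 1486286536.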
Re: Confuse about the Client.Connection
Well, I guess I got the answer. First, Hadoop uses TCP, so a reordering like resultBody_by_threadB arriving ahead of callId_by_threadB cannot occur. Second, the server synchronizes on the response queue while the responder sends response messages, one call at a time. Since the client threads share the same connection to the server, the messages will not be interleaved.
heyongqiang 2008-05-22
From: heyongqiang
Sent: 2008-05-22 13:30:10
To: core-user
Cc:
Subject: Confuse about the Client.Connection
hi, all
I took a look at the source code of org.apache.hadoop.ipc.Client, and I wonder: if two client threads invoke getConnection() specifying the same arguments, they will get the same Connection object, so how can they distinguish their results from each other? I noticed that the results streamed back from the server are collected by the Connection's thread, not the callers' threads, and the Connection's thread expects results as callId_XX, resultBody_XX pairs. Is there a situation in which the Connection's thread collects callId_by_threadA, resultBody_by_threadB, callId_by_threadB, resultBody_by_threadA? I think this situation is plausible; how does the current code handle it?
heyongqiang [EMAIL PROTECTED] 2008-05-22
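The demultiplexing discussed above comes down to a call table keyed by callId: the single receiver thread reads (id, result) pairs off the wire and wakes exactly the caller whose id matches, so results can never be delivered to the wrong thread even when responses come back out of call order. A minimal sketch of that pattern (hypothetical names, not the actual ipc.Client code):

```java
import java.util.HashMap;
import java.util.Map;

// Call-table demultiplexing: each caller registers a Call under a unique id;
// the receiver thread looks the Call up by id and notifies only that caller.
public class CallTable {
    public static class Call {
        public Object value;
        public boolean done;
    }

    private final Map<Integer, Call> calls = new HashMap<>();

    public Call register(int id) {
        Call c = new Call();
        synchronized (calls) { calls.put(id, c); }
        return c;
    }

    // Invoked by the receiver thread for each (id, value) read off the socket.
    public void receive(int id, Object value) {
        Call c;
        synchronized (calls) { c = calls.remove(id); }
        synchronized (c) {
            c.value = value;
            c.done = true;
            c.notifyAll();   // wake only the thread waiting on this call
        }
    }

    public Object waitFor(Call c) throws InterruptedException {
        synchronized (c) {
            while (!c.done) c.wait();
            return c.value;
        }
    }
}
```

Because the id travels with its result, out-of-order delivery across different calls is harmless; only id/body interleaving within one response would be a problem, which TCP plus server-side serialization rules out.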
Confuse about the Client.Connection
hi, all
I took a look at the source code of org.apache.hadoop.ipc.Client, and I wonder: if two client threads invoke getConnection() specifying the same arguments, they will get the same Connection object, so how can they distinguish their results from each other? I noticed that the results streamed back from the server are collected by the Connection's thread, not the callers' threads, and the Connection's thread expects results as callId_XX, resultBody_XX pairs. Is there a situation in which the Connection's thread collects callId_by_threadA, resultBody_by_threadB, callId_by_threadB, resultBody_by_threadA? I think this situation is plausible; how does the current code handle it?
heyongqiang [EMAIL PROTECTED] 2008-05-22
Re: Re: Hadoop summit video capture?
Have you tried http://research.yahoo.com/node/2104? But I cannot download the video, and cannot even find the IE temp file. It seems Yahoo has put some limits on it.
heyongqiang 2008-05-15
From: Cole Flournoy
Sent: 2008-05-15 03:48:20
To: core-user@hadoop.apache.org
Cc:
Subject: Re: Hadoop summit video capture?
They haven't been uploaded yet; we are begging and hoping that whoever has them will post them somewhere. I second Veoh, hadoop rocks. Cole
On Wed, May 14, 2008 at 4:11 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> I tried finding those Hadoop videos on Veoh, but got 0 hits:
> http://www.veoh.com/search.html?type=v&search=hadoop
> Got URL, Ted?
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message -----
> > From: Ted Dunning <[EMAIL PROTECTED]>
> > To: core-user@hadoop.apache.org
> > Sent: Wednesday, May 14, 2008 1:50:02 PM
> > Subject: Re: Hadoop summit video capture?
> >
> > Use Veoh instead. Higher resolution. Higher uptime. Nicer embeds.
> > And the views get chewed up by hadoop instead of google's implementation!
> > (conflict of interest on my part should be noted)
> >
> > On 5/14/08 10:43 AM, "Cole Flournoy" wrote:
> > > Man, Yahoo needs to get their act together with their video service (the videos are still down)! Is there any way someone can upload these videos to YouTube and provide a link?
> > > Thanks, Cole
> > >
> > > On Wed, Apr 23, 2008 at 11:36 AM, Chris Mattmann <[EMAIL PROTECTED]> wrote:
> > > > Thanks, Jeremy. Appreciate it.
> > > > Cheers, Chris
> > > >
> > > > On 4/23/08 8:25 AM, "Jeremy Zawodny" wrote:
> > > > > Certainly...
> > > > > Stay tuned.
> > > > > Jeremy
> > > > >
> > > > > On 4/22/08, Chris Mattmann wrote:
> > > > > > Hi Jeremy,
> > > > > > Any chance that these videos could be made in a downloadable format rather than thru Y!'s player?
> > > > > > For example, I'm traveling right now and would love to watch the rest of the presentations, but for the next few hours I won't have an internet connection.
> > > > > > So, my request won't help me, but may help folks in similar situations.
> > > > > > Just a thought, thanks!
> > > > > > Cheers, Chris
> > > > > >
> > > > > > On 4/22/08 1:27 PM, "Jeremy Zawodny" wrote:
> > > > > > > Okay, things appear to be fixed now.
> > > > > > > Jeremy
> > > > > > >
> > > > > > > On 4/20/08, Jeremy Zawodny wrote:
> > > > > > > > Not yet... there seem to be a lot of cooks in the kitchen on this one, but we'll get it fixed.
> > > > > > > > Jeremy
> > > > > > > >
> > > > > > > > On 4/19/08, Cole Flournoy wrote:
> > > > > > > > > Any news on when the videos are going to work? I am dying to watch them!
> > > > > > > > > Cole
> > > > > > > > >
> > > > > > > > > On Fri, Apr 18, 2008 at 8:10 PM, Jeremy Zawodny wrote:
> > > > > > > > > > Almost...