stop the running job?

2009-01-12 Thread Samuel Guo
Hi all, Is there any method that I can use to stop or suspend a running job in Hadoop? Regards, Samuel

evaluate the size of the input & split them in parallel

2008-11-16 Thread Samuel Guo
Hi all, when I use Hadoop to run Map/Reduce jobs over a large dataset (many thousands of large input files), the client seems to take quite a long time to initialize the job before actually running it. I suspect that it may be stuck while getting the metadata of thousands of files

Re: Hadoop Beijing Meeting

2008-11-12 Thread Samuel Guo
It sounds interesting. 2008/11/12 永强 何 <[EMAIL PROTECTED]> > Hello, all > We are planning to host a Hadoop Beijing meeting next > Sunday (23rd of Nov.). We now welcome speakers and participants! If you are > interested in cloud computing topics and you can join us that day in > Beijing,

Re: DataNode Problem

2008-10-19 Thread Samuel Guo
nce(DataNode.java:2987) >at > org.apache.hadoop.dfs.DataNode.instantiateDataNode(DataNode.java:2942) >at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2950) >at org.apache.hadoop.dfs.DataNode.main(DataNode.java:3072) > > 2008/10/15 Samuel

Re: DataNode Problem

2008-10-15 Thread Samuel Guo
Please check the logs of the nodes that didn't come up. On Wed, Oct 15, 2008 at 6:46 PM, ZhiHong Fu <[EMAIL PROTECTED]> wrote: > Yes, thanks, I have tried as you suggested and I left safemode, but > when I run bin/hadoop dfsadmin -report, there is still no datanode > available. > > 2008/1

Re: Use of 'dfs.replication'

2008-10-11 Thread Samuel Guo
the replication number of your file in HDFS. Please check hadoop-default.xml to see the details about these configurations. Hope it is helpful. On Sun, Oct 12, 2008 at 4:51 AM, Amit k. Saha <[EMAIL PROTECTED]> wrote: > Hi! > What does the value of the property: "dfs.replication" determine? > > Say

Re: Typical Configuration of a task tracker

2008-10-11 Thread Samuel Guo
For a task tracker, you need to specify the JobTracker's address in its hadoop-site.xml. Maybe you can check the log file on your task tracker to see what happened. Hope it is helpful. On Sun, Oct 12, 2008 at 2:45 AM, Amit k. Saha <[EMAIL PROTECTED]> wrote: > Hi! > > I am setting up a Hadoop clust

Re: HDFS error

2008-10-09 Thread Samuel Guo
Does this happen when you write files to HDFS? If so, please check that you have enough space on the disks of your datanode. If it happens when you read files from HDFS, maybe you can run fsck to check whether the file is healthy. Hope it will be helpful. On Fri, Oct 10,

Re: Map and Reduce numbers are not restricted by setNumMapTasks and setNumReduceTasks, JobConf related?

2008-10-06 Thread Samuel Guo
On Tue, Oct 7, 2008 at 1:12 PM, Andy Li <[EMAIL PROTECTED]> wrote: > I think there should be some document indicating that > "setNumMapTasks" and "setNumReduceTasks" > will be overridden by the splitting. It was misleading the first time > I used them. I expect that the files w

Re: Map and Reduce numbers are not restricted by setNumMapTasks and setNumReduceTasks, JobConf related?

2008-10-06 Thread Samuel Guo
The number of mappers depends on your InputFormat. The default InputFormat tries to treat every block of a file as an InputSplit, and you get as many mappers as you have InputSplits. Try raising "mapred.min.split.size" to reduce the number of mappers if you want to. And
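The relationship between block size, minimum split size, and mapper count described in this reply can be sketched roughly as follows. This is a simplified model, not Hadoop's own code: it assumes the old-API FileInputFormat rule split size = max(min size, min(goal size, block size)), ignores the slack Hadoop allows on the last split of a file, and uses hypothetical function names and file sizes chosen only for illustration.

```python
import math

MB = 1024 * 1024


def split_size(goal_size: int, min_size: int, block_size: int) -> int:
    """Assumed rule: never below min_size, otherwise capped by the block size."""
    return max(min_size, min(goal_size, block_size))


def estimate_num_splits(file_sizes, block_size, min_size=1, requested_maps=1):
    """Estimate how many input splits (and so mappers) a set of files yields.

    requested_maps stands in for the hint given via setNumMapTasks; it only
    influences goal_size and cannot force the final count.
    """
    total = sum(file_sizes)
    goal = total // max(requested_maps, 1)
    size = split_size(goal, min_size, block_size)
    # Files are split independently; a split never spans two files.
    return sum(math.ceil(n / size) for n in file_sizes if n > 0)


# Two hypothetical 200 MB files stored with 64 MB blocks.
files = [200 * MB, 200 * MB]

default_splits = estimate_num_splits(files, block_size=64 * MB)
hinted_splits = estimate_num_splits(files, block_size=64 * MB, requested_maps=2)
fewer_splits = estimate_num_splits(files, block_size=64 * MB, min_size=128 * MB)

print(default_splits, hinted_splits, fewer_splits)
```

In this model, asking for two map tasks leaves the split count unchanged because the block size still caps the split size, while raising the minimum split size above the block size is what lowers the count.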

Re: architecture diagram

2008-10-06 Thread Samuel Guo
I think the 'split' Alex talked about is the mapreduce system's action, while the 'split' you described is your mapper's action. I guess that your map/reduce application uses *TextInputFormat* to read your input file. Your input file will first be split into a few splits. These splits may be lik

Re: Adding $CLASSPATH to Map/Reduce tasks

2008-09-26 Thread Samuel Guo
Maybe you can use: bin/hadoop jar -libjars ${your-depends-jars} your.mapred.jar args (see details: http://hadoop.apache.org/core/docs/r0.18.1/api/org/apache/hadoop/mapred/JobShell.html) On Thu, Sep 25, 2008 at 12:26 PM, David Hall <[EMAIL PROTECTED]> wrote: > On Sun, Sep 21, 2008 at 9:41 PM, David H

Re: small hadoop cluster setup question

2008-09-26 Thread Samuel Guo
Could you please attach your configurations and logs? On Fri, Sep 26, 2008 at 6:12 AM, Ski Gh3 <[EMAIL PROTECTED]> wrote: > Hi all, > > I'm trying to set up a small cluster with 3 machines. I'd like to have one > machine serve as the namenode and the jobtracker, while all 3 serve as > the d

Re: Format of the value of "fs.default.name" in hadoop-site.xml

2008-09-22 Thread Samuel Guo
You can check ${HADOOP_HOME}/conf/hadoop-default.xml to see information about "fs.default.name". fs.default.name file:/// The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property
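To make the "scheme and authority" wording in that property description concrete, here is a minimal sketch that splits a candidate "fs.default.name" value into those two parts. It uses only Python's standard URL parser, not any Hadoop API; the helper name and the host namenode.example.com with port 9000 are hypothetical values chosen for illustration.

```python
from urllib.parse import urlsplit


def describe_default_fs(uri: str) -> dict:
    """Split a filesystem URI into its scheme and authority (host and port)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,    # e.g. "file" or "hdfs"
        "authority": parts.netloc, # empty for the local default file:///
        "host": parts.hostname,    # None when there is no authority
        "port": parts.port,        # None when no port is given
    }


# The default value shown in hadoop-default.xml.
local = describe_default_fs("file:///")

# A hypothetical HDFS namenode address.
remote = describe_default_fs("hdfs://namenode.example.com:9000")

print(local)
print(remote)
```

The default value file:/// has a scheme but an empty authority, which is why it refers to the local filesystem rather than to a namenode.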

Re: Need help in hdfs configuration fully distributed way in Mac OSX...

2008-09-16 Thread Samuel Guo
Check the namenode's log on machine1 to see if your namenode started successfully :) On Tue, Sep 16, 2008 at 2:04 PM, souravm <[EMAIL PROTECTED]> wrote: > Hi All, > > I'm facing a problem in configuring hdfs in a fully distributed way in Mac > OSX. > > Here is the topology - > > 1. The namenode i

Re: how to configure a remote client? Has anyone tried this before?

2008-08-06 Thread Samuel Guo
Richard Zhang wrote: Hi folks: How do I configure a remote client for HDFS? For a cluster with a few nodes, could we make one node the remote client rather than one of the data nodes/task nodes? What should be changed in hadoop-site.xml to configure this? Has anyone tried this before? Thanks. Richard

Re: DFS. How to read from a specific datanode

2008-08-06 Thread Samuel Guo
Kevin wrote: Hi, This is about dfs only, not considering mapreduce. It may sound like a strange need, but sometimes I want to read a block from a specific data node which holds a replica. Figuring out which datanodes have the block is easy. But is there an easy way to specify which datanode I want

Re: hadoop download performace when user app adopt multi-thread

2008-07-08 Thread Samuel Guo
heyongqiang wrote: > The ipc.Client object is designed to be shareable across threads, and each > thread can only make synchronous RPC calls, which means each thread calls and > waits for a result or an error. This is implemented by a novel technique: each > thread makes a distinct call (with a different call objec

Re: question about hadoop 0.17 upgrade

2008-05-25 Thread Samuel Guo
[name unreadable] wrote: Upgrading 0.16.3 to 0.17, an error appears when starting dfs and jobtracker. How can I deal with it? Thanks! I have used the "start-dfs.sh -upgrade" command to upgrade the filesystem. Below is the error log: 2008-05-26 09:14:33,463 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG: /*

Re: datanode start failure

2008-05-25 Thread Samuel Guo
heyongqiang wrote: > When I restart HDFS, I encounter the error below, which causes the datanode > to exit. > If I delete the folders and files that Hadoop uses to store information and > then restart, it's OK, but I cannot do that every time I restart... > Does anyone know why and how to avoid it? Thanks! > > 200

everything becomes very slow when the number of writes is larger than the size of the cluster using *TestDFSIO* benchmark?

2008-05-13 Thread Samuel Guo
Hi all, I ran the *TestDFSIO* benchmark on a simple cluster of 2 nodes. The file size is the same in all cases: 2GB. The number of files tried is 1, 2, 4, 8 (write only). The bufferSize is 65536 bytes. The file replication is 1. The results are below: files 1 2 4 8 write -- Throughput(mb/s) 52.89 52.

java.lang.NoClassDefFoundError: org/apache/lucene/index/IndexDeletionPolicy in contrib/index

2008-04-27 Thread Samuel Guo
me know :) Thanks in advance! Best Wishes Samuel Guo

Any API used to get the last modified time of the File in HDFS?

2008-04-20 Thread Samuel Guo
Hi all, Can anyone tell me: is there any API I can use to get metadata such as the last modified time, etc., of a file in HDFS? Thanks a lot :) Best Wishes :) Samuel Guo

Re: any tools to read sequencefile?

2008-04-16 Thread Samuel Guo
2008/4/16, Peeyush Bishnoi <[EMAIL PROTECTED]>: > Hello Samuel > > There is SequenceFileInputFormat for readi

Re: any tools to read sequencefile?

2008-04-16 Thread Samuel Guo
> http://hadoop.apache.org/core/docs/r0.16.2/api/org/apache/hadoop/mapred/InputFormat.html > > --- > > Peeyush > > On Wed, 2008-04-16 at 15:56 +0800, Samuel Guo wrote:

any tools to read sequencefile?

2008-04-16 Thread Samuel Guo
Hi all: Is there any tool that can be used to read the SequenceFile or other InputFormat/OutputFormat? Best Wishes! Samuel Guo