How to optimize a MapReduce procedure?

2009-03-11 Thread ZhiHong Fu
Hello, I'm writing a program that performs Lucene searches over about 12 index directories, all of which are stored in HDFS. It is done like this: 1. We get about 12 index directories through Lucene's indexing functionality, each about 100 MB in size; 2. We store these 12 index directories …
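One common shape for this kind of job (a hedged sketch, not the poster's actual code): treat each index directory path as one map input record, copy that index from HDFS to local disk inside the map, and run an ordinary Lucene search there. The old org.apache.hadoop.mapred API and Lucene 2.x are assumed; the "search.query" parameter, the /tmp scratch path, and the "id"/"contents" field names are placeholders.

    import java.io.IOException;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;

    public class IndexSearchMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      private JobConf conf;

      public void configure(JobConf conf) {
        this.conf = conf;
      }

      // value: one HDFS index directory path per input line (TextInputFormat assumed)
      public void map(LongWritable key, Text value,
                      OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        Path hdfsIndex = new Path(value.toString());
        Path localIndex = new Path("/tmp/" + hdfsIndex.getName());   // hypothetical scratch dir
        FileSystem.get(conf).copyToLocalFile(hdfsIndex, localIndex); // ~100 MB per index
        IndexSearcher searcher = new IndexSearcher(localIndex.toString());
        try {
          String queryText = conf.get("search.query");               // assumed job parameter
          Hits hits = searcher.search(
              new QueryParser("contents", new StandardAnalyzer()).parse(queryText));
          for (int i = 0; i < hits.length(); i++) {
            output.collect(new Text(hits.doc(i).get("id")),          // assumed stored field
                           new Text(Float.toString(hits.score(i))));
          }
        } catch (org.apache.lucene.queryParser.ParseException e) {
          throw new IOException("query parse failed: " + e.getMessage());
        } finally {
          searcher.close();
        }
      }
    }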

Problem in InputSplit

2008-12-11 Thread ZhiHong Fu
Hello, I have encountered a very weird problem with a custom split, in which I define an IndexDirSplit containing a list of index directory paths. I implemented it like this: package zju.edu.tcmsearch.lucene.search.format; import java.io.IOException; import java.io.DataInput; import java.io.DataOutpu…
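The usual pitfalls with a hand-written split are a missing no-argument constructor and write()/readFields() that do not serialize every field. A minimal sketch of such a split (hypothetical, old org.apache.hadoop.mapred API):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.InputSplit;

    // A split is shipped to the task as a Writable, so it needs a public
    // no-arg constructor and matching write()/readFields() implementations.
    public class IndexDirSplit implements InputSplit {

      private List<String> indexDirs = new ArrayList<String>();

      public IndexDirSplit() {}                       // required for deserialization

      public void add(String dir) { indexDirs.add(dir); }

      public List<String> getIndexDirs() { return indexDirs; }

      public long getLength() throws IOException {
        return indexDirs.size();                      // rough size; byte count unknown here
      }

      public String[] getLocations() throws IOException {
        return new String[0];                         // no locality hints
      }

      public void write(DataOutput out) throws IOException {
        out.writeInt(indexDirs.size());
        for (String dir : indexDirs) {
          Text.writeString(out, dir);
        }
      }

      public void readFields(DataInput in) throws IOException {
        indexDirs.clear();                            // split objects may be reused
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
          indexDirs.add(Text.readString(in));
        }
      }
    }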

Output in memory

2008-11-26 Thread ZhiHong Fu
Hello, I have a MapReduce job whose result I don't want stored in a file; I just need it shown to users. So how can I write an OutputFormat for that? For example, the job will read a large amount of data from the database, and then I will process the data suc…
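If the goal is only to show results to users, one simple pattern (a sketch, not the only option) is to skip the custom OutputFormat: let the job write normal output files to a temporary HDFS directory, then have the driver read the part files back for display once the job completes. The output path and the printing here are placeholders.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShowJobResult {
      public static void printResults(Configuration conf, Path outputDir) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        FileStatus[] parts = fs.listStatus(outputDir);
        if (parts == null) {
          return;                                    // output directory missing
        }
        for (FileStatus status : parts) {
          if (!status.getPath().getName().startsWith("part-")) {
            continue;                                // skip _logs etc.
          }
          BufferedReader in = new BufferedReader(
              new InputStreamReader(fs.open(status.getPath())));
          String line;
          while ((line = in.readLine()) != null) {
            System.out.println(line);                // or hand off to the UI layer
          }
          in.close();
        }
      }
    }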

Re: Anyone have a Lucene index InputFormat for Hadoop?

2008-11-12 Thread ZhiHong Fu
I think you can refer to contrib/index; it may be of some help to you. 2008/11/12 Anthony Urso <[EMAIL PROTECTED]> > Anyone have a Lucene index InputFormat already implemented? Failing > that, how about a Writable for the Lucene Document class? > > Cheers, > Anthony >
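As a rough illustration of what a Writable wrapper for a Lucene Document can look like (a sketch only: it carries stored, string-valued fields and nothing else; Lucene 2.x assumed, the class name is hypothetical):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.Fieldable;

    public class DocumentWritable implements Writable {

      private Document doc = new Document();

      public DocumentWritable() {}

      public DocumentWritable(Document doc) { this.doc = doc; }

      public Document get() { return doc; }

      public void write(DataOutput out) throws IOException {
        List fields = doc.getFields();               // raw List in Lucene 2.x
        out.writeInt(fields.size());
        for (Object o : fields) {
          Fieldable f = (Fieldable) o;
          Text.writeString(out, f.name());
          Text.writeString(out, f.stringValue() == null ? "" : f.stringValue());
        }
      }

      public void readFields(DataInput in) throws IOException {
        doc = new Document();
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
          String name = Text.readString(in);
          String value = Text.readString(in);
          // stored only; the original indexing flags are not preserved
          doc.add(new Field(name, value, Field.Store.YES, Field.Index.NO));
        }
      }
    }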

Hadoop+log4j

2008-11-11 Thread ZhiHong Fu
Hello, I'm very sorry to trouble you. I'm developing a MapReduce application, and I can get Log.INFO output in my InputFormat, but in the Mapper or Reducer I can't get anything. And now an error has occurred in the reduce stage; because the code is a little complicated, I can't find where the mistake is …
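Task-side logging usually does work; the catch is where the output goes. Mapper and Reducer code runs in separate task JVMs on the tasktrackers, so its log output lands in each task attempt's userlogs/syslog file (or the task page of the JobTracker web UI), not on the console of the submitting program. A small sketch (hypothetical job, old org.apache.hadoop.mapred API):

    import java.io.IOException;
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class LoggingMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, LongWritable> {

      private static final Log LOG = LogFactory.getLog(LoggingMapper.class);

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, LongWritable> output, Reporter reporter)
          throws IOException {
        // These messages do NOT appear on the client console; look in
        // ${hadoop.log.dir}/userlogs/<task-attempt>/syslog or in the task
        // details page of the JobTracker web UI.
        LOG.info("map called with key=" + key);
        reporter.setStatus("processing " + key);     // also visible in the web UI
        output.collect(value, new LongWritable(1));
      }
    }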

Mapper value is null problem

2008-11-10 Thread ZhiHong Fu
Hello: I have customized a DbRecordAndOpInputFormt which retrieves data from several web services, and the data format is like the data items in database ResultSets. Now I have encountered a problem: I get the right (key, value) in the DbRecordReader next() method, but in the Mapper map(key, value…
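A frequent cause of a null value in map() with the old API: the framework creates the key and value once via createKey()/createValue() and passes those same objects to next(key, value) for every record, so next() must fill them in rather than build fresh objects and ignore its parameters. A self-contained sketch over a pre-fetched list of rows (a real reader would pull the rows from the web service instead):

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.List;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.RecordReader;

    public class DbRecordReader implements RecordReader<LongWritable, Text> {

      private final List<String> rows;
      private final Iterator<String> it;
      private long pos = 0;

      public DbRecordReader(List<String> rows) {
        this.rows = rows;
        this.it = rows.iterator();
      }

      public LongWritable createKey() { return new LongWritable(); }

      public Text createValue() { return new Text(); }

      public boolean next(LongWritable key, Text value) throws IOException {
        if (!it.hasNext()) {
          return false;
        }
        key.set(pos++);          // mutate the objects passed in...
        value.set(it.next());    // ...do NOT replace them with new instances
        return true;
      }

      public long getPos() throws IOException { return pos; }

      public float getProgress() throws IOException {
        return rows.isEmpty() ? 1.0f : (float) pos / rows.size();
      }

      public void close() throws IOException {}
    }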

Customized InputFormat Problem

2008-11-10 Thread ZhiHong Fu
Hello, I am doing a task which will read dbRecord data from a web service, and then I will build an index on it. But inside Hadoop the InputFormat implementations are based on FileInputFormat, so now I have to write my own dbRecordInputFormat, and I do it like this: import java.io.DataInpu…
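An InputFormat does not have to extend FileInputFormat; it only has to hand out splits and record readers. A rough, non-file sketch that divides a logical row range into splits (the "dbrecord.total.rows" setting and the stubbed row fetch are placeholders for calls to the web service):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.InputFormat;
    import org.apache.hadoop.mapred.InputSplit;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RecordReader;
    import org.apache.hadoop.mapred.Reporter;

    public class DbRecordInputFormat implements InputFormat<LongWritable, Text> {

      public static class RangeSplit implements InputSplit {
        long start, end;
        public RangeSplit() {}                         // needed for deserialization
        RangeSplit(long start, long end) { this.start = start; this.end = end; }
        public long getLength() { return end - start; }
        public String[] getLocations() { return new String[0]; }
        public void write(DataOutput out) throws IOException {
          out.writeLong(start); out.writeLong(end);
        }
        public void readFields(DataInput in) throws IOException {
          start = in.readLong(); end = in.readLong();
        }
      }

      public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
        long totalRows = job.getLong("dbrecord.total.rows", 0);   // assumed job setting
        List<InputSplit> splits = new ArrayList<InputSplit>();
        long chunk = Math.max(1L, totalRows / numSplits);
        for (long s = 0; s < totalRows; s += chunk) {
          splits.add(new RangeSplit(s, Math.min(s + chunk, totalRows)));
        }
        return splits.toArray(new InputSplit[splits.size()]);
      }

      public RecordReader<LongWritable, Text> getRecordReader(
          InputSplit split, JobConf job, Reporter reporter) throws IOException {
        final RangeSplit range = (RangeSplit) split;
        return new RecordReader<LongWritable, Text>() {
          private long pos = range.start;
          public LongWritable createKey() { return new LongWritable(); }
          public Text createValue() { return new Text(); }
          public boolean next(LongWritable key, Text value) throws IOException {
            if (pos >= range.end) return false;
            key.set(pos);
            value.set("row-" + pos);                   // stub: fetch row 'pos' here
            pos++;
            return true;
          }
          public long getPos() { return pos; }
          public float getProgress() {
            return (float) (pos - range.start) / (range.end - range.start);
          }
          public void close() {}
        };
      }
    }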

Hadoop with Tomcat

2008-11-09 Thread ZhiHong Fu
Hello: I have implemented a Map/Reduce job which receives data from several web services and processes it with Hadoop. But I want to build a web application, deployed on Tomcat, to manage these web services and monitor the Map/Reduce job's progress. I have read several m…
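For driving a job from a servlet, the usual pattern (sketched here; the job setup itself is a placeholder) is to call JobClient.submitJob(), which returns immediately with a RunningJob handle, and have the monitoring page poll that handle for progress:

    import java.io.IOException;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RunningJob;

    public class JobMonitor {

      private RunningJob running;

      public void submit(JobConf conf) throws IOException {
        // Unlike JobClient.runJob(conf), submitJob() does not block until the
        // job finishes, so the servlet thread is free to return a page.
        running = new JobClient(conf).submitJob(conf);
      }

      // Called by the monitoring page on each refresh.
      public String status() throws IOException {
        if (running == null) {
          return "no job submitted";
        }
        return String.format("job %s: map %.0f%%, reduce %.0f%%, complete=%s",
            running.getJobID(),
            running.mapProgress() * 100, running.reduceProgress() * 100,
            running.isComplete());
      }
    }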

Hadoop start problem

2008-11-08 Thread ZhiHong Fu
Hi: I have encountered a strange problem. I installed Hadoop 0.15.2 on my computer several months ago, but now I want to upgrade to 0.18.1. I deleted the hadoop-0.15.2 directory, copied in Hadoop 0.18.1, and did some very simple configuration following the Hadoop pseudo-distributed…

Re: Help: InputFormat problem?

2008-10-28 Thread ZhiHong Fu
…ltset instead of the DbSplit. I don't understand why. Thanks for any help. 2008/10/27 ZhiHong Fu <[EMAIL PROTECTED]> > > > 2008/10/27 Owen O'Malley <[EMAIL PROTECTED]> > >> If your application that you are drawing from is doing some sort of web >> crawl th…

Re: Help: InputFormat problem?

2008-10-27 Thread ZhiHong Fu
2008/10/27 Owen O'Malley <[EMAIL PROTECTED]> > If your application that you are drawing from is doing some sort of web > crawl that connects to lots of random servers, you may want to use > MultiThreadedMapRunner and do the remote connections in the map. If you are > just connecting to a small set…
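For the "connect to lots of servers from the map" case mentioned above, the multithreaded runner is plugged in through the JobConf. A minimal sketch (the thread count is arbitrary, and the Mapper must be thread-safe because map() is called concurrently):

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.MultithreadedMapRunner;

    public class MultithreadedJobSetup {
      public static void configure(JobConf conf) {
        // map() will be invoked from several threads at once within each task,
        // so the Mapper implementation must be thread-safe.
        conf.setMapRunnerClass(MultithreadedMapRunner.class);
        conf.setInt("mapred.map.multithreadedrunner.threads", 20);  // default is 10
      }
    }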

Re: Help: InputFormat problem?

2008-10-27 Thread ZhiHong Fu
…db/DBInputFormat.java?view=markup > > Regards > Mice > > 2008/10/27 ZhiHong Fu <[EMAIL PROTECTED]>: > > Hello : > > > >In hadoop InputFormat are always based on the InputFileFormat , > > But Now I will get data from a web service application. The da…

Help: InputFormat problem?

2008-10-26 Thread ZhiHong Fu
Hello: In Hadoop, InputFormat implementations are usually based on FileInputFormat, but now I will get data from a web service application, and the data will be wrapped as a ResultSet. Now I am wondering: should I write the ResultSet to a file and then read it back to do the MapReduce job, or how can I pro…
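On the "write it to a file first" option: one straightforward way (a sketch; the column names "id" and "body" are hypothetical) is to stream the ResultSet into a SequenceFile on HDFS and then run the job over that file with SequenceFileInputFormat:

    import java.sql.ResultSet;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class ResultSetDumper {
      public static void dump(Configuration conf, ResultSet rs, Path out) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, conf, out, LongWritable.class, Text.class);
        try {
          LongWritable key = new LongWritable();
          Text value = new Text();
          while (rs.next()) {
            key.set(rs.getLong("id"));        // hypothetical columns
            value.set(rs.getString("body"));
            writer.append(key, value);
          }
        } finally {
          writer.close();
        }
      }
    }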

Index Search problem

2008-10-19 Thread ZhiHong Fu
Hello, I have run the indexUpdater program under the directory "contrib/index", and I followed the way it builds the index. Now I have no idea how I should search the index using the MapReduce model. Any suggestions are welcome. Thanks.

Re: DataNode Problem

2008-10-15 Thread ZhiHong Fu
Well, but on node2, when I run the jps command, I can only see the TaskTracker, not the DataNode. 2008/10/15 ZhiHong Fu <[EMAIL PROTECTED]> > This is the content of node4's log files: > > 2008-10-15 16:18:59,406 INFO org.apache.hadoop.dfs.Da…

Re: DataNode Problem

2008-10-15 Thread ZhiHong Fu
….java:2942) at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2950) at org.apache.hadoop.dfs.DataNode.main(DataNode.java:3072) 2008/10/15 Samuel Guo <[EMAIL PROTECTED]> > please check the logs of the nodes that didn't come up. > > On Wed, Oct 15, 2008…

Re: DataNode Problem

2008-10-15 Thread ZhiHong Fu
…> minute or two to start dfs on a small cluster. Did you wait for some time > for > dfs to start and leave safe mode? > > - Prasad. > > On Wednesday 15 October 2008 01:57:44 pm ZhiHong Fu wrote: > > Hello: > > > > I have installed hadoop on a cluster whic…

DataNode Problem

2008-10-15 Thread ZhiHong Fu
Hello: I have installed Hadoop on a cluster which has 7 nodes: one is the namenode and the other 6 are datanodes. At that time it ran normally, and I also ran the wordcount example, which worked fine. But today I wanted to run a MapReduce application, and it reported an error, and I found some da…

Help! How can I direct specific data to a specific datanode?

2008-09-05 Thread ZhiHong Fu
Hello, I'm a new Hadoop user, and now I have a problem understanding HDFS. In this scenario, I have several databases and want to index them. When I run the indexing map over a database, I have to control which datanode each database's index is stored on, so that when a database is updated, the index can be upda…