The assumption is wrong. Distributed search runs against indexes on local file systems, not HDFS.

It doesn't return because Lucene is trying to search across the indexes in HDFS in real time, which doesn't work because of the network overhead. Depending on the size of the indexes it may actually return after some time, but I have seen it time out even for small indexes.

The short of it is: move the indexes and segments to a local file system, then point the distributed search server at their parent directory. Something like this:

bin/nutch server 8100 /full/path/to/parent/of/local/indexes

It technically doesn't have to be a full path. Then point searcher.dir at a directory containing search-servers.txt, as you have done. The contents of search-servers.txt can stay as you have them.
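
For reference, the searcher.dir setting might look roughly like this in nutch-site.xml. This is only a sketch: the property name is the one mentioned above, but the path is a placeholder I made up for wherever your search-servers.txt lives (nutch/conf in your case).

```xml
<!-- Goes inside the <configuration> element of nutch-site.xml. -->
<property>
  <name>searcher.dir</name>
  <!-- Placeholder path: the directory that contains search-servers.txt,
       not the directory that holds the indexes themselves. -->
  <value>/full/path/to/nutch/conf</value>
</property>
```

The search-servers.txt in that directory then lists one host and port per line, for example "localhost 8100" as you already have it, matching the port given to bin/nutch server.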

Dennis

MilleBii wrote:
I'm trying to search directly from the index in hdfs so in distributed mode

What do I have wrong ?

created  nutch/conf/search-servers.txt with
 localhost 8100

pointed  search.dir in nutch-site.xml to nutch/conf

tried to start search server with either :
 + nutch server 8100  crawl
 + nutch server 8100 hdfs://localhost:9000/user/nutch/crawl

The nutch server command doesn't return to prompt ???
Is this normal should I wait ?

And of course if I try a search it doesn't work
