The assumption is wrong. Distributed search runs against indexes on
local file systems, not on HDFS.
The command doesn't return because Lucene is trying to search the indexes
in HDFS in real time, which doesn't work because of network overhead.
Depending on the size of the indexes it may actually return after some
time, but I have seen it time out even for small indexes.
The short of it is: move the indexes and segments to a local file system,
then point the distributed search server at their parent directory.
Something like this:
bin/nutch server 8100 /full/path/to/parent/of/local/indexes
It technically doesn't have to be a full path. Then point
searcher.dir at a directory containing search-servers.txt, as you have
done. The search-servers.txt can stay as you have it.
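Put together, the whole sequence might look roughly like the sketch below. It is only a sketch: the HDFS URI and the nutch/conf directory are taken from your message, the local path is a placeholder, and I am assuming bin/hadoop and bin/nutch are run from the Nutch install directory. Adjust all of these to your setup.

```shell
# Assumed locations: HDFS URI and conf dir from the original question,
# local path is a placeholder.
HDFS_CRAWL=hdfs://localhost:9000/user/nutch/crawl
LOCAL_CRAWL=/full/path/to/parent/of/local/indexes
CONF_DIR=nutch/conf

# 1. Copy the crawl directory (indexes and segments) out of HDFS
#    onto the local file system.
bin/hadoop fs -copyToLocal "$HDFS_CRAWL" "$LOCAL_CRAWL"

# 2. List the search servers, one "host port" pair per line.
#    searcher.dir in nutch-site.xml should point at this directory.
mkdir -p "$CONF_DIR"
printf 'localhost 8100\n' > "$CONF_DIR/search-servers.txt"

# 3. Start the search server against the local copy.
bin/nutch server 8100 "$LOCAL_CRAWL"
```

The port passed to the server in step 3 has to match the port listed in search-servers.txt.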
Dennis
MilleBii wrote:
I'm trying to search directly from the index in HDFS, i.e. in distributed mode.
What do I have wrong?
created nutch/conf/search-servers.txt with
localhost 8100
pointed searcher.dir in nutch-site.xml to nutch/conf
tried to start the search server with either:
+ nutch server 8100 crawl
+ nutch server 8100 hdfs://localhost:9000/user/nutch/crawl
The nutch server command doesn't return to the prompt.
Is this normal? Should I wait?
And of course if I try a search, it doesn't work.