Ok, I don't per se need distributed search. I was trying to avoid a copy to the local file system, to save resources by working directly off HDFS.

What is the minimum I need to copy over: the index and the segments? Not the crawldb? All the data in the segments?
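The thread below leaves this question open, so the following is a rough sketch only, under these assumptions: a standard crawl layout of index (or indexes), segments, crawldb, and linkdb under /user/nutch/crawl in HDFS; that the search server reads only the index and segments (plus the linkdb, if anchor text is wanted in results); and an illustrative local path of /data/local/crawl. None of these paths come from the thread.

  # Assumed HDFS layout: /user/nutch/crawl/{index,segments,crawldb,linkdb}
  LOCAL=/data/local/crawl        # hypothetical local parent directory
  mkdir -p "$LOCAL"

  # Copy what search reads; the crawldb is only used for generating and
  # updating crawls, so it can presumably stay in HDFS.
  bin/hadoop dfs -copyToLocal /user/nutch/crawl/index    "$LOCAL"/index
  bin/hadoop dfs -copyToLocal /user/nutch/crawl/segments "$LOCAL"/segments
  # The linkdb may also be needed if anchors are shown in results:
  bin/hadoop dfs -copyToLocal /user/nutch/crawl/linkdb   "$LOCAL"/linkdb

  # Then serve from the local parent directory, as suggested below:
  bin/nutch server 8100 "$LOCAL"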
2009/12/13, Dennis Kubes <[email protected]>:
> The assumption is wrong. Distributed search is done from indexes on
> local file systems, not HDFS.
>
> It doesn't return because Lucene is trying to search across the indexes
> in HDFS in real time, which doesn't work because of network overhead.
> Depending on the size of the indexes it may actually return after some
> time, but I have seen it time out even for small indexes.
>
> The short of it is: move the indexes and segments to a local file
> system, then point the distributed search server at their parent
> directory. Something like this:
>
> bin/nutch server 8100 /full/path/to/parent/of/local/indexes
>
> It technically doesn't have to be a full path. Then point searcher.dir
> to a directory with search-servers.txt, as you have done. The
> search-servers.txt entries stay as you have them.
>
> Dennis
>
> MilleBii wrote:
>> I'm trying to search directly from the index in HDFS, so in
>> distributed mode.
>>
>> What do I have wrong?
>>
>> I created nutch/conf/search-servers.txt containing:
>> localhost 8100
>>
>> I pointed searcher.dir in nutch-site.xml to nutch/conf.
>>
>> I tried to start the search server with either:
>> + nutch server 8100 crawl
>> + nutch server 8100 hdfs://localhost:9000/user/nutch/crawl
>>
>> The nutch server command doesn't return to the prompt. Is this
>> normal? Should I wait?
>>
>> And of course, if I try a search, it doesn't work.

--
-MilleBii-
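For reference, a sketch of the searcher.dir wiring Dennis describes: the directory that searcher.dir points to must contain a search-servers.txt listing one "host port" pair per line. The /data/local/searcher path here is illustrative, not from the thread.

  mkdir -p /data/local/searcher
  echo "localhost 8100" > /data/local/searcher/search-servers.txt
  # ... then set the searcher.dir property in nutch-site.xml
  # to /data/local/searcher.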
