I am using ParallelMultiSearcher to search across 10 different servers.
Here is the problem I ran into:
Sometimes when I search for a term, my program just hangs. No result,
no response.
When I took one server (No. 10 in my list of servers) off the list,
searching was as smooth as it was supposed to be. (I was very lucky to
pick that one first.)
At first, I thought I had hit a scalability issue, so I put the No. 10
server back on the list and took another server off instead.
My program hung again.
Then I took some other servers off the list and left No. 10 on it. My
program still hung. But whenever I took No. 10 off the list, the other
servers worked fine together.
So I guessed that the index files on the No. 10 server were corrupted, or
that there was some hardware-related issue.
However, when I used the ParallelMultiSearcher with ONLY the No. 10 server
left in the list, it returned results.
Then I looked back and found the condition that causes my program to hang:
1). There is a hit in the index files on the No. 10 server, and at the
same time
2). There is a hit on any other server too.
If there are hits only on servers other than No. 10, my program runs just
fine.
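The manual elimination described above (dropping servers one at a time)
could be automated roughly as follows. This is only a sketch: the class
name ServerBisect and the "completes" probe are hypothetical, and the probe
is assumed to run one search against the given servers and report whether
it returned instead of hanging.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class ServerBisect {

    /**
     * Tries every single server and every pair of servers, and returns the
     * combinations for which the (hypothetical) probe reports that the
     * search did not complete.
     */
    public static List<List<String>> failingCombos(
            List<String> servers, Predicate<List<String>> completes) {
        List<List<String>> failing = new ArrayList<>();
        for (int i = 0; i < servers.size(); i++) {
            // a server on its own
            List<String> single = Arrays.asList(servers.get(i));
            if (!completes.test(single)) {
                failing.add(single);
            }
            // the same server combined with each later server
            for (int j = i + 1; j < servers.size(); j++) {
                List<String> pair = Arrays.asList(servers.get(i), servers.get(j));
                if (!completes.test(pair)) {
                    failing.add(pair);
                }
            }
        }
        return failing;
    }
}
```

If only pairs show up in the result and no single server does, that matches
the behaviour described here: each server is fine alone and the hang needs
two servers that both have hits.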
I don't believe it's a hardware-related issue, because I can use the
ParallelMultiSearcher with only No. 10 in the list.
And I don't believe it's a bug in my code or in ParallelMultiSearcher,
because the other servers work fine together.
Does anybody have any idea? How should I debug this problem?
Many thanks,
Wenjie
The following is part of my code and some logging output from when it hangs:
-----------code---------------
// generate the query from the string
QueryBuilder qb = new QueryBuilder();
Query query = qb.build(this.queryStr, new StandardAnalyzer());
logger.info(query.toString());
Hits hits = null;
try {
    if (searcher != null) {
        // sort by date (reversed), then by score, then by uid
        Sort sort = new Sort(new SortField[] {
            new SortField("date", true),
            SortField.FIELD_SCORE,
            new SortField("uid")
        });
        logger.debug("BBBBB"); // before search
        hits = searcher.search(query, sort); // searcher is a ParallelMultiSearcher
        logger.debug("AAAAA"); // after search
    }
} catch (IOException e) {
    logger.fatal("Search results can not be returned " + e.getMessage());
    respond();
    throw new RuntimeException(e);
}
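One way to see where the search call gets stuck is to run it under a
timeout and print every thread's stack when the timeout expires. The
following is a minimal sketch using only the JDK (Java 8 or later assumed);
SearchWatchdog and callWithTimeout are hypothetical names and not part of
Lucene.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SearchWatchdog {

    /**
     * Runs the task on a helper thread. Returns its result, or null if it
     * did not finish within timeoutMillis (after dumping all thread stacks
     * to stderr).
     */
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "search-watchdog");
            t.setDaemon(true); // a stuck search must not keep the JVM alive
            return t;
        });
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // dump first, so the stuck thread is still visible in the output
            System.err.println(dumpAllThreads());
            future.cancel(true);
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    /** Formats the name, state and stack of every live thread. */
    public static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            sb.append(e.getKey().getName())
              .append(" [").append(e.getKey().getState()).append("]\n");
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

In the code above, the search line could then become something like
hits = SearchWatchdog.callWithTimeout(() -> searcher.search(query, sort), 30000);
a null result would come with a dump showing which threads are blocked and
in which method. Note that cancel(true) only interrupts the helper thread,
so a thread blocked on network I/O may stay blocked. On Unix, sending
kill -QUIT to the JVM process prints a similar thread dump without any code
change.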
--------logging------------
2006-08-02 16:20:44,261 [Thread-2] (SearchingThread.java:109) INFO - Query
from : user [0, 50]
2006-08-02 16:20:44,261 [Thread-2] (SearchingThread.java:111) INFO -
Received: +content:blah blah
2006-08-02 16:20:44,265 [Thread-145] (SearchingThread.java:144) INFO - Num
of server from DB 2
2006-08-02 16:20:44,277 [Thread-145] (SearchingThread.java:201) INFO -
rmi://vip.s-index5b:1098/1154550044272_810000
2006-08-02 16:20:44,363 [Thread-145] (SearchingThread.java:201) INFO -
rmi://vip.s-index1a:1099/1154550044290_57320000
2006-08-02 16:20:44,368 [Thread-145] (SearchingThread.java:273) INFO -
Number of servers used in Search: 2
2006-08-02 16:20:44,368 [Thread-145] (SearchingThread.java:285) DEBUG -
BBBBB
2006-08-02 16:20:44,404 [Thread-145] (SearchingThread.java:287) DEBUG -
AAAAA
2006-08-02 16:20:44,411 [Thread-145] (SearchingThread.java:329) INFO -
[user] hitCount: 2
It was fine this time because 1a did not have a hit; both results were
returned from 5b, the No. 10 server.
2006-08-02 16:21:37,552 [Thread-2] (SearchingThread.java:109) INFO - Query
from : user [0, 50]
2006-08-02 16:21:37,552 [Thread-2] (SearchingThread.java:111) INFO -
Received: +content:blah blah
2006-08-02 16:21:37,557 [Thread-149] (SearchingThread.java:144) INFO - Num
of server from DB 2
2006-08-02 16:21:37,564 [Thread-149] (SearchingThread.java:201) INFO -
rmi://vip.s-index5b:1098/1154550097561_807000
2006-08-02 16:21:37,572 [Thread-149] (SearchingThread.java:201) INFO -
rmi://vip.s-index1b:1098/1154550097569_659000
2006-08-02 16:21:37,574 [Thread-149] (SearchingThread.java:273) INFO -
Number of servers used in Search: 2
2006-08-02 16:21:37,575 [Thread-149] (SearchingThread.java:285) DEBUG -
BBBBB
I replaced 1a with 1b. It printed neither AAAAA nor hitCount, because 1b
has a hit for this query.
2006-08-02 16:21:58,232 [Thread-2] (SearchingThread.java:109) INFO - Query
from : user [0, 50]
2006-08-02 16:21:58,232 [Thread-2] (SearchingThread.java:111) INFO -
Received: +content:blah blah
2006-08-02 16:21:58,237 [Thread-151] (SearchingThread.java:144) INFO - Num
of server from DB 1
2006-08-02 16:21:58,242 [Thread-151] (SearchingThread.java:201) INFO -
rmi://vip.s-index1b:1098/1154550118239_678000
2006-08-02 16:21:58,244 [Thread-151] (SearchingThread.java:273) INFO -
Number of servers used in Search: 1
2006-08-02 16:21:58,244 [Thread-151] (SearchingThread.java:285) DEBUG -
BBBBB
2006-08-02 16:21:58,253 [Thread-151] (SearchingThread.java:287) DEBUG -
AAAAA
2006-08-02 16:21:58,254 [Thread-151] (SearchingThread.java:329) INFO -
[user] hitCount: 2
I took 5b off; 1b alone returns two hits.
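With a longer log, the hung searches can be picked out mechanically by
looking for a BBBBB marker that is never followed by AAAAA on the same
thread. The sketch below assumes each log entry is on a single line (the
entries above are wrapped by the mail client) and that thread names look
like Thread-149; HungSearchFinder is a hypothetical helper name.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HungSearchFinder {

    // Matches e.g. "... [Thread-149] (SearchingThread.java:285) DEBUG - BBBBB"
    private static final Pattern MARKER =
        Pattern.compile("\\[(Thread-\\d+)\\].*DEBUG - (BBBBB|AAAAA)\\s*$");

    /** Returns the threads that logged BBBBB without a later AAAAA. */
    public static Set<String> threadsStuckInSearch(List<String> logLines) {
        Set<String> open = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher m = MARKER.matcher(line);
            if (!m.find()) {
                continue; // not a before/after marker line
            }
            if (m.group(2).equals("BBBBB")) {
                open.add(m.group(1));    // search started
            } else {
                open.remove(m.group(1)); // search returned
            }
        }
        return open;
    }
}
```

Fed the three runs logged above, this would report only Thread-149, the run
with both 5b and 1b in the list.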