Sebi wrote:
> I see. You got good results, and I would like to get them too. I think there might be one problem: my computer's performance. For about 9000 docs, with the index optimized (using the optimize() function), a search that returned about 70 docs
> takes 1.5 sec. That is still slow.

Please describe your environment: computer, PHP version, PHP configuration.

Please also run one additional test:
measure the execution time of a query that searches for one term but returns an empty result set. This can be done by qualifying the term with a non-existing field:
----------------
$hits = $index->find('non_existing_field_name:word');
----------------
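
One way to take that measurement is to wrap the find() call in a small timer. The sketch below is only an illustration: timeCall() is a hypothetical helper, not part of Zend_Search_Lucene, and the stand-in closure is there only so the example runs without an index. Replace it with the real call shown in the comment.

```php
<?php
// Hypothetical helper (not part of Zend_Search_Lucene): runs a callable
// and returns its result together with the elapsed wall-clock seconds.
function timeCall(callable $fn)
{
    $start = microtime(true);
    $result = $fn();
    $elapsed = microtime(true) - $start;
    return array($result, $elapsed);
}

// Stand-in for the real search so the sketch runs on its own.
// With an opened index, pass this instead:
//   function () use ($index) { return $index->find('non_existing_field_name:word'); }
list($hits, $seconds) = timeCall(function () {
    return array();
});

printf("%d hits in %.4f sec\n", count($hits), $seconds);
```

Timing only the find() call, as here, keeps the cost of reading stored fields out of the measurement.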


> The interesting point is that Luke executes the same query in only 56 ms.

> I have the following questions:

> 1. Why do you think the Luke tool runs the same query in 56 ms? Is PHP
> execution that slow?

I'll check whether the searcher spends time somewhere it doesn't really need to...

> 2. I have version 7 of Zend installed. Should I get the latest snapshot?

It has some improvements, but they don't affect search performance.

> 3. Do you have any advice for improving this search process?

Yes, but you already know them:
- use an optimized index
- don't touch stored fields that you don't need
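
The second point can be illustrated without a real index. As far as I understand the hit objects returned by find(), id and score are available immediately, while the stored document is read only when a stored field is first accessed. The LazyHit class below is a hypothetical stand-in that mimics that assumed behavior and counts the loads; it is not the library's class.

```php
<?php
// Hypothetical stand-in for a search hit: id and score are plain
// properties, stored fields are loaded on first access (assumed behavior).
class LazyHit
{
    public $id;
    public $score;
    public static $loads = 0;   // how many times a stored document was read
    private $doc = null;

    public function __construct($id, $score)
    {
        $this->id = $id;
        $this->score = $score;
    }

    // Called only for properties that are not declared, i.e. stored fields.
    public function __get($field)
    {
        if ($this->doc === null) {
            self::$loads++;     // the expensive step in a real index
            $this->doc = array('title' => 'Document ' . $this->id);
        }
        return isset($this->doc[$field]) ? $this->doc[$field] : null;
    }
}

$hits = array(new LazyHit(1, 0.9), new LazyHit(2, 0.5), new LazyHit(3, 0.1));

// Cheap: only id and score are read, no stored document is loaded.
$ids = array();
foreach ($hits as $hit) {
    $ids[] = $hit->id;
}
$loadsAfterIds = LazyHit::$loads;

// Costly part limited to the hits actually displayed (here: the first one).
$title = $hits[0]->title;
$loadsAfterTitle = LazyHit::$loads;

printf("loads after ids: %d, after one title: %d\n", $loadsAfterIds, $loadsAfterTitle);
```

In a real script the same idea means reading stored fields only for the page of results that is shown, not for every hit.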

I may be able to add something if you send me your search script and index.

With best regards,
   Alexander Veremyev.

Hi Sebi,

1. I've just added the necessary methods.

$index->numDocs() may be used to retrieve the number of non-deleted documents.
$index->maxDoc() returns one greater than the largest possible document number (synonym for $index->count()).
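
A short sketch of how the two counters relate. count() and numDocs() are the index methods named above; FakeIndex is a hypothetical stand-in used only so the example runs without a real index, countDeleted() is an illustrative helper rather than a library function, and the numbers are made up.

```php
<?php
// Hypothetical stand-in exposing the two counters discussed above.
class FakeIndex
{
    private $total;
    private $deleted;

    public function __construct($total, $deleted)
    {
        $this->total = $total;
        $this->deleted = $deleted;
    }

    // All document slots, including documents marked as deleted.
    public function count()
    {
        return $this->total;
    }

    // Only documents that are not deleted.
    public function numDocs()
    {
        return $this->total - $this->deleted;
    }
}

// Illustrative helper: documents deleted but not yet removed by optimize().
function countDeleted($index)
{
    return $index->count() - $index->numDocs();
}

$index = new FakeIndex(9000, 263);
printf("live: %d, deleted: %d\n", $index->numDocs(), countDeleted($index));
```

With a real index, numDocs() should give the exact document count without having to optimize first.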


2. I think this is simply the speed of PHP string/object processing itself, plus the large result set.

I just made some tests:
PHP v5.2, WinXP
AMD Athlon 64 3000+, Seagate ST316082 7AS 160 GB SATA HD

a.
index size - 11,000 documents
optimized index - ~42 MB (document content is also stored)
source documents size - 33 MB

Results:
---------------------------
find() with 11000 docs result set - ~2.0 sec
find() with 4000 docs result set  - ~0.86 sec
find() with 1000 docs result set  - ~0.35 sec
---------------------------

b.
index size - 6,059 documents
optimized index - ~40 MB
source documents size - 31 MB (document content is also stored)

Results:
---------------------------
find() with 6059 docs result set - ~0.90 sec
find() with 2 docs result set  - ~0.17 sec
find() with 0 docs result set  - ~0.17 sec
---------------------------


I think some further optimizations are possible.
Please add an issue to the issue tracker for this (or I can do it).


3. I got one report for a large index some time ago:
Source data: 8 GB
2xAMD 64 Opteron 250
iSCSI 4x36 GB in RAID 1+0
FreeBSD 7.0
Search time is 5-10 sec

I also have some ideas for search optimization, which will work especially for large indices.


With best regards,
    Alexander Veremyev.


Sebi wrote:
Any answer? Alexander?

Anyway I want to add some more questions.

1. $index->count() does not reflect the real content of the database. I
have to optimize the index to get the correct number of documents. Is
there any other way to find the exact document count?

2. I want to reopen the search problem. The search time is too long.

I have 8737 documents indexed right now. When I search for keywords like 'arte', 'galeria', etc., the search takes about 3.15 sec. When I had only 4500 documents, it took about 1.6 sec. The generated query looks like:
+(((titleSrch:galeria)) ((descriptionSrch:galeria)) ((tagsSrch:galeria))) +(countryID:1)
Note that I measure only the time of the find() call, without retrieving the document fields.
I optimized the index using the optimize() function and the search improved: the time dropped to about 1.5 sec (2 times faster). But that is still too slow. I have only 8737 documents and an index size of about 2.7 MB. Another interesting thing is that when I use Luke for searching, the time is only 56 ms. So what is the problem? PHP file system access?
I want help with the search time because that was my first goal: a fast search by relevance. And this is not what I get right now.

3. How will this engine behave when searching an index of 1 million documents?




