[Zope] Re: performance of textindexng2 vs. zctextindex

2005-07-20 Thread Francis Kelly
Thanks for your response, Andreas, and thanks for writing TXNG in the 
first place. I really appreciate the contribution.

>> I recently installed TextIndexNG2 2.1.1
>
> which is *pretty old*. Take a look at v2.2.0, which has been optimized
> over time in different ways. Consider using StupidStorage as documented
> in the release notes.

A question about v2.2.0: on this page
http://www.zope.org/Members/ajung/TextIndexNG
there are three releases listed: 2.2.0, 2.2.0b1, and 2.2.0b2. The "last
modified" column indicates that 2.2.0 was modified most recently, but I'm
assuming that the b1 and b2 versions are in fact more recent. Is that true?

Also, 2.2.0b1 is around 2MB, whereas the other two are about 600K, and I
noticed after downloading 2.2.0 that stemming support seems to be absent.

So, from all this, I'm assuming that if I want stemming support I should
grab the 2MB 2.2.0b1. Is that right?

I'll look into StupidStorage.

>> I've been struck that if the number of search hits is high, TextIndexNG2
>> is much slower than ZCTextIndex. For example, if I do a search on
>> 'podcast' (our site deals w/ podcasting) I get about 14,000 hits.
>> ZCTextIndex returns the results in about 0.1 seconds; TextIndexNG2 takes
>> 31 seconds, or about 300 times longer. In general, the more hits there
>> are, the bigger the difference between the two search indexes.
>
> Query speed depends on different things: the query, the implementation, and
> the operations that need to be performed during the query. Because of some
> of its functionality, TXNG needs to store much more information than
> ZCTextIndex.
>
> As said above, it sometimes did this in a not-so-efficient way.
> You might also look at TextIndexNG V3.

V3 is appealing, but we're running Zope 2.7.6 and I'm not sure that I want
the hassle of installing Five at this point. When we go to 2.8, which we'll
probably do in a couple of months, V3 will be an option.


Thanks for your help,
Francis



___
Zope maillist  -  Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
http://mail.zope.org/mailman/listinfo/zope-announce

http://mail.zope.org/mailman/listinfo/zope-dev )


[Zope] performance of textindexng2 vs. zctextindex

2005-07-19 Thread Francis Kelly
I recently installed TextIndexNG2 2.1.1 on a system running Zope 2.7.6 
on Fedora Core 3. I've been running some comparison tests with 
ZCTextIndex, which is what our site currently uses. We're indexing 
around 50,000 objects at the moment. For TextIndexNG2, this is the 
configuration:


Indexed attributes                       keywordSearchSource
Default encoding                         utf-8
Storage                                  StandardStorage
Stemmer                                  english
Splitter: casefolding                    enabled
Splitter: index single characters        disabled
Splitter: max. length of splitted words  64
Splitter: separator characters           .+-_@
Default query parser                     PyQueryParser
Autoexpansion                            disabled
Stopwords                                english
Normalizer                               European
Use converters                           disabled
Near distance
Left truncation                          disabled



I've been struck that if the number of search hits is high, TextIndexNG2
is much slower than ZCTextIndex. For example, if I do a search on
'podcast' (our site deals with podcasting) I get about 14,000 hits.
ZCTextIndex returns the results in about 0.1 seconds; TextIndexNG2 takes
31 seconds, or about 300 times longer. In general, the more hits there
are, the bigger the difference between the two search indexes.
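
For anyone who wants to reproduce this kind of comparison, here is a minimal timing sketch. It is not the script used for the numbers above. The helper names (time_query, slowdown) and the stand-in query are hypothetical and exist only for illustration; in a real Zope instance the callable would wrap a search against the catalog that holds the index under test. It is written for a current Python rather than the one shipped with Zope 2.7.

```python
import time


def time_query(run_query, repeats=3):
    """Return (best_seconds, hit_count) for a zero-argument query callable."""
    best = None
    hits = 0
    for _ in range(repeats):
        start = time.perf_counter()
        results = run_query()
        hits = len(results)  # touch the result set in case it is built lazily
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best, hits


def slowdown(slow_seconds, fast_seconds):
    """How many times longer the slow query took than the fast one."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    # Stand-in query so the sketch runs without Zope (an assumption, not a
    # real catalog call): it just returns a list of 14,000 placeholder hits.
    fake_query = lambda: ["hit"] * 14000
    seconds, hits = time_query(fake_query)
    print("%d hits in %.4f s" % (hits, seconds))
    # The figures quoted above: 31 s for TextIndexNG2 versus 0.1 s for
    # ZCTextIndex.
    print("ratio: %.0f" % slowdown(31.0, 0.1))
```

Taking the best of several runs keeps one-off effects such as a cold cache from dominating the figure. The same callable can be pointed at each index in turn so both are measured the same way.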


TextIndexNG2 is great: it has many features that we really want and 
perhaps the cost of those features is performance vis-a-vis ZCTextIndex. 
But I'm hoping that maybe I've overlooked an obvious or not-so-obvious 
configuration issue that will enable me to speed up TextIndexNG2.


Thanks for any advice.

Francis Kelly
www.loomia.com


