Re: Memory Usage

2008-07-03 Thread Keith Watson
Thanks very much for this; I'll give it a shot. Keith. On 4 Jul 2008, at 00:02, Paul Smith wrote: (there are around 6,000,000 posts on the message board database) Date encoded as yyMMdd: appears to be using around 30M Date encoded as yyMMddHHmmss: appears to be using more than 400M! I g

Re: Store/Index Email Address in Lucene

2008-07-03 Thread miztaken
Hi there, Thanks for the comment. So basically it would be lame to add a new field for each email address, wouldn't it? How about getting unique tokens from the string of email addresses using the EmailFilter.java class and storing them in a single field? Jamie-52 wrote: > > Hi miztaken > > Check out: >

Multifield Search with OR and AND on different doc Fields

2008-07-03 Thread RanjithStar
My requirement is to search on seven fields, say F1, F2, F3, F4, F5, F6, F7, with F1, F2, F3, F4 in one doc index and F5, F6, F7 in a different doc index. I need to perform a search with ((F1=9 AND F2=4) AND (F3=keyword OR F4=keyword)) OR (F5=9 AND F6=4 AND F7=keyword). For normal search I was doing like th

Re: too many clauses exception

2008-07-03 Thread Chris Lu
This is easy, use: BooleanQuery.setMaxClauseCount(4096); -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net demo: http://search.dbsight.com Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?

too many clauses exception

2008-07-03 Thread Gaurav Sharma
Hi, I am stuck with an exception in Lucene (too many clauses). When I use a wildcard such as a*, I get a too many clauses exception. It says the maximum clause count is set to 1024. Is there any way to increase this count? Can you please help me overcome this? Thanks in advance. -

too many clauses exception

2008-07-03 Thread Gaurav Sharma
Hi, I am stuck with one more exception. When I use a wildcard such as a*, I get a too many clauses exception. It says the maximum clause count is set to 1024. Is there any way to increase this count? Can you please help me overcome this? Thanks in advance. -Gaurav - -Gaura

Re: Store/Index Email Address in Lucene

2008-07-03 Thread Jamie
Hi miztaken Check out: http://openmailarchiva.svn.sourceforge.net/viewvc/openmailarchiva/Server/trunk/src/com/stimulus/archiva/search/EmailFilter.java?view=markup I think it's what you want. I want to index email address in such a way that i can do WildCard, Phrase and Simple search on those it

Re: Lucene Error : java.io.FileNotFoundException

2008-07-03 Thread yugana
I have checked all the jars and tried replacing with the same versions. Still I get the same error. Please let me know what else to check. yug Michael McCandless-2 wrote: > > > It looks like under JBoss you are accidentally using Lucene 1.4, not > 2.3.2. > > Mike > > yugana wrote: > >> >

Re: Memory Usage

2008-07-03 Thread Paul Smith
(there are around 6,000,000 posts on the message board database) Date encoded as yyMMdd: appears to be using around 30M Date encoded as yyMMddHHmmss: appears to be using more than 400M! I guess I would have understood if I was seeing the usage double for sure, or even a little more; no idea
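One plausible contributor to a gap like this, not confirmed in the excerpt above, is the number of distinct terms each encoding produces: a day-granularity field repeats the same value for every post made that day, while a second-granularity field is nearly unique per post. The following is a minimal sketch using only the Java standard library (no Lucene classes); the class name, method name, start date, and post spacing are all hypothetical and chosen only for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.HashSet;
import java.util.Set;

public class DateTermCardinality {

    // Counts the distinct strings produced when `posts` timestamps, spaced
    // `stepSeconds` apart from `start`, are formatted with `pattern`.
    // This models "distinct terms in a date field"; it does not measure memory.
    static int distinctTerms(String pattern, LocalDateTime start, int posts, long stepSeconds) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern);
        Set<String> terms = new HashSet<>();
        for (int i = 0; i < posts; i++) {
            terms.add(start.plusSeconds(i * stepSeconds).format(fmt));
        }
        return terms.size();
    }

    public static void main(String[] args) {
        // Hypothetical workload: 100,000 posts, one every 37 seconds.
        LocalDateTime start = LocalDateTime.of(2008, 1, 1, 0, 0, 0);
        int posts = 100_000;
        long step = 37;
        System.out.println("yyMMdd distinct terms:       "
                + distinctTerms("yyMMdd", start, posts, step));
        System.out.println("yyMMddHHmmss distinct terms: "
                + distinctTerms("yyMMddHHmmss", start, posts, step));
    }
}
```

If distinct-term count does turn out to be the cause, keeping the coarse field for anything that loads terms into memory and the fine-grained value in a separate field would be one option to weigh.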

RE: Store/Index Email Address in Lucene

2008-07-03 Thread John Griffin
Miz, The StandardAnalyzer recognizes email addresses as is. That is, it pays attention to the '@' symbol. Just store an email address in a field and search them normally. This assumes you are going to store the different emails in separate fields. There is an alternative strategy if you need it.

RE: Term Frequency for more complex terms

2008-07-03 Thread John Griffin
Matthew, I'm not totally sure what you are asking, but if it's 'where do I call the explain method from?' it looks like you want to call it from the IndexSearcher class. Look at the API docs for Searcher (the IndexSearcher's superclass). John G. P.S. If that's not it, look for explain in the API do

RE: Search question (newbie)

2008-07-03 Thread John Griffin
Chris, I've had similar requirements in the past. First strip the quotes then create a BooleanQuery consisting of two separate queries. 1. TermQuery for the first term - Fred 2. PrefixQuery for the second term - Flintstone When you add each individual query to the BooleanQuery make sure the Bool
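The first part of the approach described above (strip the quotes, then treat each word as its own clause) can be sketched without any Lucene classes. This is only an illustration under stated assumptions: the class, method, and field names are hypothetical, and the rule "a word ending in * becomes a prefix clause" is an assumption rather than something the excerpt spells out. In real code each resulting clause would presumably be turned into a TermQuery or PrefixQuery and added to a BooleanQuery.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedPrefixSplitter {

    // One parsed clause: the field it applies to, its text, and whether it
    // should be matched as a prefix (hypothetical holder type).
    static final class Clause {
        final String field;
        final String text;
        final boolean prefix;

        Clause(String field, String text, boolean prefix) {
            this.field = field;
            this.text = text;
            this.prefix = prefix;
        }
    }

    // Splits input such as from:"fred flintston*" into per-word clauses that
    // all keep the explicit field, falling back to defaultField if none is given.
    static List<Clause> split(String userQuery, String defaultField) {
        String field = defaultField;
        String body = userQuery.trim();
        int colon = body.indexOf(':');
        if (colon > 0) {
            field = body.substring(0, colon);
            body = body.substring(colon + 1);
        }
        // Strip the quotes so the words can be handled individually.
        body = body.replace("\"", "").trim();

        List<Clause> clauses = new ArrayList<>();
        if (body.isEmpty()) {
            return clauses;
        }
        for (String token : body.split("\\s+")) {
            boolean prefix = token.endsWith("*");
            String text = prefix ? token.substring(0, token.length() - 1) : token;
            if (!text.isEmpty()) {
                clauses.add(new Clause(field, text, prefix));
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        for (Clause c : split("from:\"fred flintston*\"", "body")) {
            System.out.println(c.field + ":" + c.text + (c.prefix ? " (prefix)" : " (term)"));
        }
    }
}
```

Keeping the field on every clause is the point of the sketch: it avoids the second word falling through to the default field.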

Memory Usage

2008-07-03 Thread Keith Watson
Hello All, I have something that's not exactly causing me a major problem, but I would appreciate help in understanding the behaviour here: I have an internet message board, and I soon hope to revamp the code to be using Lucene for searching the threads and posts, as it's far better than

Term Frequency for more complex terms

2008-07-03 Thread Matthew Hall
I have a quick question: could someone point me towards where in the API I'll have to investigate in order to figure out the term frequencies of more complex terms? For example, I want to know the tf of "kit ligand" treated as a phrase. I see that Luke has access to this information in its exp

Enhancing phrase searching in Lucene

2008-07-03 Thread Asbjørn A . Fellinghaug
Hi. I've just finished my master's thesis on how to enhance overall phrase searching in today's search engines. The thesis experiments with a new approach in which I've focused on pairs of words (bigrams). The thesis can be freely downloaded here [1]. What I've specifically

Search question (newbie)

2008-07-03 Thread Chris Bamford
Hi, Can someone point me in the right direction please? How can I trap this situation correctly? I receive user queries like this (quotes included): /from:"fred flintston*"/ Which produces a query string of /+from:fred body:flintston/ (where /body/ is the default field) What I

Store/Index Email Address in Lucene

2008-07-03 Thread miztaken
Hi there, I want to index email addresses in such a way that I can do wildcard, phrase, and simple searches on those items. For each document I will have an email address string, just like CC and TO in mails, e.g. "[EMAIL PROTECTED]; [EMAIL PROTECTED]; john hopkings; [EMAIL PROTECTED]"

Re: Lucene Error : java.io.FileNotFoundException

2008-07-03 Thread Michael McCandless
It looks like under JBoss you are accidentally using Lucene 1.4, not 2.3.2. Mike yugana wrote: Hi, I am indexing content and searching using lucene. It is working fine when I use the simple servlet and jsp mechanism. I am able to search on the indexed content. I tried to implement th