RE: Parsing MSWord

2008-11-11 Thread John Griffin
Dipesh, Start here. http://poi.apache.org/ John G. -Original Message- From: dipesh [mailto:[EMAIL PROTECTED] Sent: Tuesday, November 11, 2008 8:38 PM To: java-user@lucene.apache.org Subject: Parsing MSWord Hello, I wanted to know if there are classes in Lucene that support parsing MSW

RE: Only last field indexed

2008-10-09 Thread John Griffin
o I suspect you're > doing something wrong. > > What evidence do you have that it's only the last field that's > indexed? > > Best > Erick > > On Tue, Oct 7, 2008 at 1:28 PM, John Griffin <[EMAIL PROTECTED] > >wrote: > > > Guys, >

Only last field indexed

2008-10-07 Thread John Griffin
Guys, I'm adding multiple fields with the same name to a document as Store.YES, Index.TOKENIZED and it seems that only the last field entered is indexed. I read about this somewhere here but now I can't find it, naturally. Is there a workaround? Does someone have a pointer to this discussion? Ca

Re-tokenized fields disappear

2008-10-06 Thread John Griffin
My previous question may be moot, but as it stands it is still a problem. Here's a little more info. The same-named fields contain two pieces of information, a code "B05" and a value "1", as follows. The value can be a range such as 1 to 5 or 1 to 100. "codesearch", "B05 1" This field

Re-tokenized fields disappear

2008-10-06 Thread John Griffin
Guys, I have documents with multiple stored, tokenized fields of the same name but different values in them such as: "codesearch", "B01" "codesearch", "B0105" "codesearch", "Q01" Etc; I receive a new code to add to the document so I create a copy of the document, call deleteFields

RE: Field Question

2008-08-23 Thread John Griffin
(true). Mike John Griffin wrote: > Dimitri, > > Field.TOKENIZED and Field.NO_NORMS send their field's contents > through a tokenizer and make their contents indexed and therefore > searchable. Field.UN_TOKENIZED does not send its field's contents > through a >

RE: Field Question

2008-08-22 Thread John Griffin
Dimitri, Field.TOKENIZED and Field.NO_NORMS send their field's contents through a tokenizer and make their contents indexed and therefore searchable. Field.UN_TOKENIZED does not send its field's contents through a tokenizer but it still indexes its contents. Only Field.NO does not index it

RE: Lucene Indexing DB records?

2008-08-22 Thread John Griffin
Try Hibernate Search - http://www.hibernate.org/410.html John G. -Original Message- From: ??? [mailto:[EMAIL PROTECTED] Sent: Friday, August 22, 2008 3:27 AM To: java-user@lucene.apache.org Subject: Lucene Indexing DB records? Guess I don't quite understand why there are so few posts ab

RE: Searching an Index on Another Machine

2008-08-07 Thread John Griffin
Dana, RMI. We use it exclusively and we are able to cluster for fail-over. The clustering complicates things quite a bit so you may not need or want it. John G. -Original Message- From: DanaWhite [mailto:[EMAIL PROTECTED] Sent: Thursday, August 07, 2008 9:54 AM To: java-user@lucene.apac

RE: Searching an Index on Another Machine

2008-08-07 Thread John Griffin
Alexander, RMI. We use it exclusively and we are able to cluster for fail-over. The clustering complicates things quite a bit so you may not need or want it. John G. -Original Message- From: Alexander Aristov [mailto:[EMAIL PROTECTED] Sent: Thursday, August 07, 2008 10:35 AM To: java-us

RE: failed to open an indexer after about 20 queries

2008-08-04 Thread John Griffin
Xh, Sorry about those questions. I received two copies of your email. The first was corrupt. We still need to see more code. No, there isn't any special config necessary. John G. -Original Message- From: xh sun [mailto:[EMAIL PROTECTED] Sent: Monday, August 04, 2008 8:34 PM To: java-use

RE: failed to open an indexer after about 20 queries

2008-08-04 Thread John Griffin
Xh, We need to see a little more code here. Are you reopening the reader for each query? If so, are you closing it each time? We need more information. John G. -Original Message- From: xh sun [mailto:[EMAIL PROTECTED] Sent: Monday, August 04, 2008 8:34 PM To: java-user@lucene.apache.org

RE: Index optimization ...

2008-07-28 Thread John Griffin
Use IndexWriter.setRAMBufferSizeMB(double mb) and you won't have to sacrifice anything. It defaults to 16.0 MB so depending on the size of your index you may want to make it larger. Do some testing at various values to see where the sweet spot is. John G. -Original Message- From: Dragon

Query time analyzer call

2008-07-26 Thread John Griffin
Guys and Gals, Is there a call that can be made at query time to determine what analyzer was used at index time so the same analyzer can be used in the query? If there is I can't find it. John G.

Deletions

2008-07-10 Thread John Griffin
Guys (and Gals), A question on index deletions, what exactly happens to the Lucene document numbers in an index when a document is deleted? Let's say I have a 5 doc index. Document # Doc 0 doc1 1

RE: newbie question (for John Griffin) - fixed

2008-07-10 Thread John Griffin
Chris, -Original Message- From: Chris Bamford [mailto:[EMAIL PROTECTED] Sent: Thursday, July 10, 2008 9:15 AM To: java-user@lucene.apache.org Subject: Re: newbie question (for John Griffin) - fixed Hi John, Please ignore my earlier questions on this subject, as I have got to the

RE: newbie question (for John Griffin)

2008-07-10 Thread John Griffin
riginal Message- From: Chris Bamford [mailto:[EMAIL PROTECTED] Sent: Thursday, July 10, 2008 5:58 AM To: java-user@lucene.apache.org Subject: Re: newbie question (for John Griffin) Hi John, Further to my question below, I did some back-to-basics investigation of PhraseQueries and found that ev

RE: Index different files in different folders in lucene

2008-07-06 Thread John Griffin
they are not related, so when you look for an English string you don't need to look for it in Arabic, and so on. My question: is it possible for Lucene to index multiple folders at the same time and put them in several indexes? Thanks John Griffin-3 wrote: > > Starz, > > How about

RE: Index different files in different folders in lucene

2008-07-06 Thread John Griffin
e and put them in several indexes? thanks John Griffin-3 wrote: > > Starz, > > How about your code so we can see what you are doing? We're flying blind > here. > > John G. > > -Original Message- > From: starz10de [mailto:[EMAIL PROTECTE

RE: Index different files in different folders in lucene

2008-07-05 Thread John Griffin
Starz, How about your code so we can see what you are doing? We're flying blind here. John G. -Original Message- From: starz10de [mailto:[EMAIL PROTECTED] Sent: Saturday, July 05, 2008 12:41 PM To: java-user@lucene.apache.org Subject: Index different files in different folders in lucene

RE: Lucene Error : java.io.FileNotFoundException

2008-07-04 Thread John Griffin
Yug, In the root directory of your JBoss install there is a file that lists all the jar files your version of JBoss is guaranteed to work with (jar-versions.xml). It looks like the jar you're trying to add has a dependency on another jar that doesn't have what it's looking for. Get on the JBoss f

RE: Search question (newbie)

2008-07-04 Thread John Griffin
cene.apache.org Subject: Re: Search question (newbie) John, Thanks, I think I'm getting this now So you created your own BooleanQuery and parsed the string yourself, adding strings as TermQuerys etc., rather than using a QueryParser ? Cheers, - Chris John Griffin wrote: > Chris

RE: Store/Index Email Address in Lucene

2008-07-03 Thread John Griffin
Miz, The StandardAnalyzer recognizes email addresses as is. That is, it pays attention to the '@' symbol. Just store an email address in a field and search them normally. This assumes you are going to store the different emails in separate fields. There is an alternative strategy if you need it.

RE: Term Frequency for more complex terms

2008-07-03 Thread John Griffin
Matthew, I'm not totally sure what you are asking, but if it's 'where do I call the explain method from?' it looks like you want to call it from the IndexSearcher class. Look at the API docs for Searcher (the IndexSearcher's superclass). John G. P.S. If that's not it, look for explain in the API do

RE: Search question (newbie)

2008-07-03 Thread John Griffin
Chris, I've had similar requirements in the past. First strip the quotes then create a BooleanQuery consisting of two separate queries. 1. TermQuery for the first term - Fred 2. PrefixQuery for the second term - Flintstone When you add each individual query to the BooleanQuery make sure the Bool
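The approach in the post above (exact match on the first word, prefix match on the second) can be sketched without Lucene on the classpath. This is only a dependency-free model of the matching logic, not the Lucene API: the class and method names are hypothetical, the whitespace tokenizer is a stand-in for real analysis, and requiring both clauses is an assumption, since the post is cut off before it says how the clauses are combined.

```java
import java.util.Arrays;
import java.util.List;

// Dependency-free model of "exact term + prefix" matching.
// All names here are hypothetical; this is not Lucene code.
class TermPlusPrefixSketch {

    /** Strip quotes, lowercase, split on whitespace (stand-in for an analyzer). */
    static List<String> tokens(String text) {
        return Arrays.asList(text.replace("\"", "").trim().toLowerCase().split("\\s+"));
    }

    /** True if some token equals the term exactly (what a term query checks). */
    static boolean matchesTerm(List<String> docTokens, String term) {
        return docTokens.contains(term);
    }

    /** True if some token starts with the prefix (what a prefix query checks). */
    static boolean matchesPrefix(List<String> docTokens, String prefix) {
        for (String token : docTokens) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /**
     * ASSUMPTION: both clauses are required. The user query must contain
     * exactly the two words described in the post (term, then prefix).
     */
    static boolean matches(String docText, String userQuery) {
        List<String> query = tokens(userQuery);
        List<String> doc = tokens(docText);
        return matchesTerm(doc, query.get(0)) && matchesPrefix(doc, query.get(1));
    }

    public static void main(String[] args) {
        System.out.println(matches("Fred Flintstone", "\"Fred Flint\""));
        System.out.println(matches("Frederick Flintstone", "\"Fred Flint\""));
    }
}
```

Like a boolean combination of two independent clauses, this model ignores word positions; it only checks that each clause is satisfied somewhere in the text.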

RE: Hibernate search (Problem adding new Record)

2008-05-02 Thread John Griffin
P.S. The Hibernate Search forum is at http://forum.hibernate.org/viewforum.php?f=9 John G. -Original Message- From: oyesiji [mailto:[EMAIL PROTECTED] Sent: Friday, May 02, 2008 5:24 PM To: java-user@lucene.apache.org Subject: Hibernate search (Problem adding new Record) I am using

RE: Hibernate search (Problem adding new Record)

2008-05-02 Thread John Griffin
Try this:

    FullTextSession fullTextSession = Search.createFullTextSession(session);
    for (JobDescription jobDescription : jobDescriptions) {
        fullTextSession.save(jobDescription);
    }
    ^ |

RE: Use of Lucene for DB Search

2008-04-10 Thread John Griffin
Prashant, In addition to the other suggestions, take a look at Hibernate Search at http://www.hibernate.org/410.html. It is specifically designed for full text search in DBs. John G _ From: Prashant Saraf [mailto:[EMAIL PROTECTED] Sent: Thursday, April 10, 2008 7:57 AM To: j

RE: How do i get a text summary

2008-02-27 Thread John Griffin
Ravinder, If you want something from an index it has to be IN the index. So, store a summary field in each document and make sure that field is part of the query. John G. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Wednesday, February 27, 2008 7:58 PM To:

RE: Document.setBoost() doesn't work

2008-02-27 Thread John Griffin
Soren, Your documents are being boosted. Because document boost values immediately go through some calculations before being stored in the index, Luke will always show 1.0 as the boost value. There has been some talk recently that this should be removed from Luke since it is actuall

RE: remote stored index

2008-02-17 Thread John Griffin
Don't forget old, venerable RMI. We use it for multiple remote indexes and it works well. John G. -Original Message- From: Jan Pieper [mailto:[EMAIL PROTECTED] Sent: Sunday, February 17, 2008 5:20 AM To: Lucene Mailinglist Subject: remote stored index Hi guys, I want to use lucene for

RE: how to safely periodically reopen the IndexReader?

2008-02-16 Thread John Griffin
Your users won't appreciate your closing the searcher on them. That is, if you have a highly concurrent system. I don't know about 2.3.0 yet. Haven't had much chance to see the changes but with 2.2.0 I use an atomic counter. It's not that much to program. Regards, John G. -Original Message-
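The post mentions an atomic counter but does not show it, so the following is only one possible reading: a reference count that keeps a searcher open until the last in-flight query has released it. The class names are hypothetical and `FakeSearcher` is a stand-in for the real searcher, so the sketch runs without Lucene.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of reference-counting a shared searcher with an atomic counter.
// Hypothetical names; FakeSearcher stands in for the real searcher class.
class RefCountedSearcher {

    /** Stand-in searcher; only close() matters for this sketch. */
    static class FakeSearcher {
        volatile boolean closed = false;

        void close() {
            closed = true;
        }
    }

    private final FakeSearcher searcher;

    // Starts at 1: the owner holds one reference until it retires the searcher.
    private final AtomicInteger refCount = new AtomicInteger(1);

    RefCountedSearcher(FakeSearcher searcher) {
        this.searcher = searcher;
    }

    /** Called before a query; returns null if the searcher is already closed. */
    FakeSearcher acquire() {
        while (true) {
            int current = refCount.get();
            if (current == 0) {
                return null; // already closed, caller must fetch the new holder
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return searcher;
            }
        }
    }

    /** Called after each query, and once by the owner when swapping in a new searcher. */
    void release() {
        if (refCount.decrementAndGet() == 0) {
            searcher.close(); // last user is gone, safe to close
        }
    }

    public static void main(String[] args) {
        FakeSearcher s = new FakeSearcher();
        RefCountedSearcher holder = new RefCountedSearcher(s);
        holder.acquire();             // a query starts
        holder.release();             // owner retires the old searcher
        System.out.println(s.closed); // still open: the query is running
        holder.release();             // the query finishes
        System.out.println(s.closed); // now closed
    }
}
```

With this shape, reopening means creating a new holder for the fresh searcher and calling `release()` once on the old one; queries that already acquired the old searcher finish undisturbed.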

RE: Explanation

2007-11-23 Thread John Griffin
Oh, duh! Of course it is. I've done that before. Thanks Daniel. John G. -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Friday, November 23, 2007 5:52 PM To: java-user@lucene.apache.org Subject: Re: Explanation On Samstag, 24. November 2007, John Griffin

Explanation

2007-11-23 Thread John Griffin
Is there a problem with the term frequency count (tf) and the IndexSearcher.explain method? I'm searching the following string (fieldname is description) for the term 'salesman' and receive the accompanying explanation from IndexSearcher.explain(.). I've highlighted salesman. score => 0.3362463

Re: Document boost, is it working?

2007-10-30 Thread John Griffin
Bruno Dery wrote: Hi all the following is using Lucene 2.2.0. I've been trying to alter the scoring of my search results to boost by date. My idea was to boost documents while indexing using the date but it doesn't work. So I put together this little sample piece of code to investigate furthe
