Re: How to combine multiple fields to a single field for indexing

2006-08-24 Thread Chris Hostetter
: What is generally the reason for indexing both individual fields, and the : general-purpose "content" field ? so you can explicitly query for "name:paris" or "city:paris" instead of just "paris" : name : John Smith : food : subway sandwich : : So the general-purpose "content" would have the f

Re: How to combine multiple fields to a single field for indexing

2006-08-24 Thread Gopikrishnan Subramani
Erik has used a space as the field separator. Maybe you can use a different field separator that your analyzer won't eat up, so that it will change the token position by 1. Gopi On 8/24/06, KEGan <[EMAIL PROTECTED]> wrote: Erik, What is generally the reason for indexing both individual fields

Re: DateTools.set-

2006-08-24 Thread Chris Hostetter
I think the confusion here is that when DateTools looks at a Date object and a Resolution, it does its calculations in GMT (so when you ask what "day" it is at a particular moment, it tells you the current day in GMT, when you ask which month, it tells you the month in GMT, etc...) This may seem
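The GMT effect described above can be reproduced with the plain JDK. The sketch below does not call DateTools; it only assumes that a day-resolution value is derived in GMT, as the message states. The class name `GmtDayDemo`, the helpers `sampleInstant` and `dayIn`, and the choice of `America/Los_Angeles` as the "local" zone are all hypothetical and picked for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

public class GmtDayDemo {

    // Hypothetical sample instant: 2006-08-24 02:21:00 GMT.
    static Date sampleInstant() {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("GMT"));
        cal.clear(); // zero out all fields, including milliseconds
        cal.set(2006, Calendar.AUGUST, 24, 2, 21, 0);
        return cal.getTime();
    }

    // Formats the calendar day (yyyyMMdd) of an instant as seen in the given time zone.
    static String dayIn(Date instant, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(instant);
    }

    public static void main(String[] args) {
        Date instant = sampleInstant();
        // The same instant falls on different calendar days depending on the zone.
        System.out.println("GMT day:   " + dayIn(instant, "GMT"));
        System.out.println("Local day: " + dayIn(instant, "America/Los_Angeles"));
    }
}
```

An instant shortly after midnight GMT still belongs to the previous calendar day in zones west of Greenwich, which is why a value computed in GMT can look "off by one day" when compared with a locally formatted date.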

Re: Tomcat Simple Example

2006-08-24 Thread Michael Wechner
Mag Gam wrote: Thanks! So, when working with Tomcat, for a simple Index + Search, is it recommended to use JSP over servlets? Any advice? Well, the issue seems to me rather that the (X)HTML is hardcoded into the JSP resp. Servlet which creates a maintenance nightmare when one wants to cust

Re: How to combine multiple fields to a single field for indexing

2006-08-24 Thread KEGan
I think I am starting to understand this :) .. Thanks guys. ~KEGan On 8/24/06, Gopikrishnan Subramani <[EMAIL PROTECTED]> wrote: Erik has used a space as the field separator. Maybe you can use a different field separator that your analyzer won't eat up, so that it will change the token position by

RE: Change index structure

2006-08-24 Thread WATHELET Thomas
Thanks a lot. -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 23 August 2006 14:26 To: java-user@lucene.apache.org Subject: Re: Change index structure On Aug 23, 2006, at 6:22 AM, WATHELET Thomas wrote: > If I want to add a new field for example into an existing i

Re: Tomcat Simple Example

2006-08-24 Thread Erik Hatcher
On Aug 24, 2006, at 3:29 AM, Michael Wechner wrote: As an alternative I would rather suggest that one generates a well- defined XML with JSP or a servlet and then applies an XSLT. If somebody is afraid of performance issues then one might want to consider generating the servlet or jsp code dy

Re: How to combine multiple fields to a single field for indexing

2006-08-24 Thread Erik Hatcher
Yeah, I used a cruder form by appending all the text together into a single string with a space separator in that LIA example. Given the position increment gap between instances of same-named fields that is now part of Lucene, I recommend using multiple field instances instead. Er

Re: Tomcat Simple Example

2006-08-24 Thread Michael Wechner
Erik Hatcher wrote: On Aug 24, 2006, at 3:29 AM, Michael Wechner wrote: As an alternative I would rather suggest that one generates a well- defined XML with JSP or a servlet and then applies an XSLT. If somebody is afraid of performance issues then one might want to consider generating the

sharing of Design documents of Lucene

2006-08-24 Thread sachin
It would be nice if someone could share design documents of Lucene with me.

Re: Tomcat Simple Example

2006-08-24 Thread Robert Koberg
Erik Hatcher wrote: On Aug 24, 2006, at 3:29 AM, Michael Wechner wrote: As an alternative I would rather suggest that one generates a well-defined XML with JSP or a servlet and then applies an XSLT. If somebody is afraid of performance issues then one might want to consider generating the serv

NoClassDefFoundError

2006-08-24 Thread Mag Gam
Hi All, I keep getting this error in my tomcatlogs Aug 24, 2006 7:44:09 AM org.apache.catalina.core.ApplicationContext log INFO: Marking servlet search as unavailable Aug 24, 2006 7:44:09 AM org.apache.catalina.core.StandardWrapperValve invoke SEVERE: Allocate exception for servlet search java.

Re: NoClassDefFoundError

2006-08-24 Thread Erik Hatcher
My hunch is you don't have the Lucene JAR in the classpath at runtime. Erik On Aug 24, 2006, at 7:58 AM, Mag Gam wrote: Hi All, I keep getting this error in my tomcatlogs Aug 24, 2006 7:44:09 AM org.apache.catalina.core.ApplicationContext log INFO: Marking servlet search as unav

Re: Test new query parser?

2006-08-24 Thread Mark Miller
It is interesting to note that Lucene would also seem to suffer from bugs when using spans if you only have a single document in the index. At least with the NotSpanQuery, the spans could wrap around the document from end to beginning. This would be unexpected but would also go away if you add

RE: DateTools.set-

2006-08-24 Thread Paul Snyder
OK, that being the case, is there a prescribed method for dealing with this? Does anybody have a "best practice" for me? -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Thursday, August 24, 2006 2:21 AM To: java-user@lucene.apache.org Subject: Re: DateTools.set-

Re: NoClassDefFoundError

2006-08-24 Thread Mag Gam
Thank you! You are right. Seems like Tomcat overwrites my path. I had to manually move the .jar files into Tomcat's presence. On 8/24/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: My hunch is you don't have the Lucene JAR in the classpath at runtime. Erik On Aug 24, 2006, at 7:58 AM,

Document Get question

2006-08-24 Thread Mag Gam
Is it possible to get the document name instead of its entire path? Currently, I have something like this: out.println (doc.get ("path")); // Which gives me /documents/file.txt Is it possible to get "file.txt"

RE: Document Get question

2006-08-24 Thread Mordo, Aviran (EXP N-NANNATEK)
It is up to you. Whatever you put in the document during indexing you'll get back. If you add a field with just the document name you can retrieve that, or just parse the file name from the path. Aviran http://www.aviransplace.com -Original Message- From: Mag Gam [mailto:[EMAIL PROTEC

Re: sharing of Design documents of Lucene

2006-08-24 Thread Michael McCandless
It would be nice if someone could share design documents of Lucene with me. You could start with the javadocs here: http://lucene.apache.org/java/docs/api/index.html Click on the "Document" class to see some description for Documents in particular. Or for a broader "get your feet wet" introduction,

Re: How to combine multiple fields to a single field for indexing

2006-08-24 Thread Suba Suresh
Thanks for everyone's help. I understand how it works now. I can get rid of MultiFieldQueryParser in search. thanks suba suresh. Erik Hatcher wrote: Yeah, I used a cruder form by appending all the text together into a single string with a space separator in that LIA example. Given the posit

Re: Document Get question

2006-08-24 Thread Suba Suresh
Index the "filename" when you are indexing as you did the "path". You can get it back with doc.get("filename"); suba suresh. Mag Gam wrote: Is it possible to get Document Name, instead of its entire path? Currently, i have something like this: out.println (doc.get ("path")); // Which gives

Re: Document Get question

2006-08-24 Thread Ronnie Kolehmainen
If you want to get "file.txt" out of "/documents/file.txt" simply cut off everything before the last "/": String path = doc.get("path"); String name = path != null ? path.substring(path.lastIndexOf("/") + 1) : path; Otherwise, if you want to store only the name in the index, you will have to d
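The substring approach from the message above can be tried on its own, without an index. This is a minimal sketch using only the JDK; the class name `FileNameFromPath` and the helper `fileName` are hypothetical, and the path string stands in for whatever `doc.get("path")` would return.

```java
public class FileNameFromPath {

    // Hypothetical helper: returns everything after the last "/" in the path.
    // A null path is passed through as null; a path without "/" is returned unchanged,
    // because lastIndexOf returns -1 and substring(0) is the whole string.
    static String fileName(String path) {
        if (path == null) {
            return null;
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // Stand-in for the stored "path" field value from the question.
        String path = "/documents/file.txt";
        System.out.println(fileName(path));
    }
}
```

Note that a path ending in "/" yields an empty string with this approach, so a caller may want to handle that case separately.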

controlled library

2006-08-24 Thread Zhao, Xin
Hi, I have a design question. Here is what we try to do for indexing: We designed an indexing tool to generate standard MeSH terms from medical citations, and then use Lucene to save the terms and citations for future search. The information we need to save are: a) the exact mesh terms (top 10) b

Boosting Documents and score calculation

2006-08-24 Thread AlexeyG
Hello, I ran into some very strange behavior in Lucene 1.9. A boost factor under 1.3 does not affect the result score! I wrote a simple test to isolate the issue: Writing test index Creating 3 documents with same KEY and boosts of default, 1.1, 1.2, and 1.3 public static void writeTestI

Re: controlled library

2006-08-24 Thread Dedian Guo
In my solution, you can use one doc for each mesh term, or use different keywords such as "mesh_1""mesh_10" for your top 10 terms...or you can group your mesh terms as one string and then add it to a field, which requires a simple string parser for the group string when you want to read the terms...

Re: Boosting Documents and score calculation

2006-08-24 Thread Chris Hostetter
First off, when trying to make sense of scores you should always use either HitCollector or one of the TopDocs methods of the Searcher interface -- otherwise the "normalize if greater than 1" logic of the Hits class might confuse you. Second: Searcher.explain(Query,int) is your friend ... it wi

Indexmodifier optimize

2006-08-24 Thread vasu shah
Hi, I added one record to the index and did flush(), optimize() and close() in that order. I had one index file _twca.cfs. After inserting the document and doing optimization, I have two index files _twca.cfs and _twcf.cfs (both approx. same size) and deletable file having entry for _twc

Re: Indexmodifier optimize

2006-08-24 Thread Chris Hostetter
: I added one record to the index and did flush(), optimize() and close() in that order. : I had one index file _twca.cfs. After inserting the document and doing optimization, I have two index files _twca.cfs and _twcf.cfs (both approx. same size) and deletable file having entry for _twc

Lucene vs Database Search

2006-08-24 Thread kalpesh patel
Hi, I have an application. It has a large number of records (around 1.2 million) with a possibility of doubling every year. The average number of records added per day is around 3000, distributed over the day. The inserted record has to be searchable immediately once it is entered into the databa

RE: DateTools.set-

2006-08-24 Thread Chris Hostetter
: OK, that being the case, is there a prescribed method for dealing with this? : Does anybody have a "best practice" for me? It depends on how/why you see it as a problem. If all you want to do is sort on the date - you have no problem, they will sort correctly. If you want to display the date

Re: Indexmodifier optimize

2006-08-24 Thread Michael McCandless
Chris Hostetter wrote: : I added one record to the index and did flush(), optimize() and close() in that order. : I had one index file _twca.cfs. After inserting the document and doing optimization, I have two index files _twca.cfs and _twcf.cfs (both approx. same size) and deletable f

Upgrade from 1.4.3 to 1.9.1. Any problems with using existing index files?

2006-08-24 Thread Jed Wesley-Smith
Hello all, We are upgrading from Lucene 1.4.3 to 1.9.1, and have many customers with large existing index files. In our testing we have reused large indexes created in 1.4.3 in 1.9.1 without incident. We have looked through the changelog and the code and can't see any reason there should be a