RE: Highlighter

2006-01-24 Thread Ravi
Hi, I am also having a problem with the highlighter: when I want to highlight a specific field in Lucene, it does not work. Thanks, Ravi Kumar Jaladanki -Original Message- From: Koji Sekiguchi [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 25, 2006 3:53 AM To: java-user@lucene.ap

Re: Keyword fields, Porter stemming, and QueryParser

2006-01-24 Thread Dave Kor
If reindexing doesn't take too much time and effort, you can reindex using the PerFieldAnalyzerWrapper to have different analyzers for each field.

RE: java.io.IOException: read past EOF in BufferedIndexInput.refill

2006-01-24 Thread Dmitry Goldenberg
I see some mentions of NoOpDirectory, as in http://www.gossamer-threads.com/lists/lucene/java-user/14064?search_string=read%20past%20EOF;#14064 which point to the Lucene bug tracker. I just searched the Lucene JIRA and didn't find anything related to NoOpDirectory. Any clues?

java.io.IOException: read past EOF in BufferedIndexInput.refill

2006-01-24 Thread Dmitry Goldenberg
Has anyone seen this exception and been able to resolve the cause? I have seen numerous mentions of it in the Lucene lists archives but no resolutions, looks like. Anyone? Thanks. java.io.IOException: read past EOF at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java

Keyword fields, Porter stemming, and QueryParser

2006-01-24 Thread Dmitry Goldenberg
I'm having a problem with keyword fields and how they're treated by QueryParser. At indexing time, I index my documents, as follows: Content - tokenized, indexed field (the default field) DocType - not tokenized, indexed, stored field ... - other fields The analyzer I use utilizes Port

Re: Sorting by calculated custom score at search time

2006-01-24 Thread Chris Hostetter
: It's not in subversion yet though ;-) : : You have to look here: : http://issues.apache.org/jira/browse/LUCENE-446 Whoops ... sorry about that. I forget how far out on the bleeding edge the code I'm using is sometimes :) It definitely works right now, so you should give it a shot -- but you m

Re: Sorting by calculated custom score at search time

2006-01-24 Thread Yonik Seeley
It's not in subversion yet though ;-) You have to look here: http://issues.apache.org/jira/browse/LUCENE-446 I haven't committed it, because we may be able to do better (maybe removing the difference between Query and ValueSource so you could freely mix the two and not have to wrap ValueSource in

Re: Sorting by calculated custom score at search time

2006-01-24 Thread Chris Hostetter
Take a look at the org.apache.lucene.search.function package in SVN. It provides an API that allows you to define "function" classes that can compute a score for each document using whatever means you want. The overall FunctionQuery can then be wrapped in a BooleanQuery along with whatever other

RE: Highlighter

2006-01-24 Thread Gwyn Carwardine
Yes, I think you're right. On reading the "Lucene in Action" chapter on highlighting I found it squirreled away in the middle of the text. I get the feeling that whilst I have so far found the query parser to be the primary method of building queries, this is not the primary method used by other people.

Re: Sorting by calculated custom score at search time

2006-01-24 Thread gekkokid
How does TSS boost by date? Does it give a small boost increase, like 0.1 or 0.2 x (ArticlePublishDate - IndexCreationDate)? - Original Message - From: "Nick Vincent" <[EMAIL PROTECTED]> To: Sent: Tuesday, January 24, 2006 5:42 PM Subject: Sorting by calculated custom score at search time I

Re: Range queries

2006-01-24 Thread Chris Hostetter
: As Gwyn pointed out, that would make -3 > -2. Personally, I'd use : unsigned numbers and shift the range -- for 16-bit numbers I'd map : -32768..32767 to 0..65535 by adding 32768. I guess you could do that by : overriding getRangeQuery() (LIA, p207 -- wonderful book). There are a lot
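The range shift described in the quoted text can be sketched in plain Java. This is only an illustration under stated assumptions: the class name ShiftedShortEncoder and its encode/decode methods are hypothetical and are not part of Lucene. The zero-padding to five digits is an added assumption, needed so that string (term) order matches numeric order. Wiring such an encoder into getRangeQuery() is not shown.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: encodes signed 16-bit values as
// fixed-width strings whose lexicographic order matches numeric order.
class ShiftedShortEncoder {
    // Adding 32768 maps -32768..32767 onto 0..65535, as suggested in the thread.
    static final int OFFSET = 32768;

    static String encode(short value) {
        // Zero-pad to five digits (assumption) so all terms have equal width.
        return String.format(Locale.ROOT, "%05d", value + OFFSET);
    }

    static short decode(String term) {
        // Reverse the shift to recover the original signed value.
        return (short) (Integer.parseInt(term) - OFFSET);
    }

    public static void main(String[] args) {
        // Print the encoded bounds for the range [-3 TO 2].
        System.out.println(encode((short) -3) + " TO " + encode((short) 2));
    }
}
```

With this encoding, -3 becomes "32765" and 2 becomes "32770", so a range such as [-3 TO 2] would be rewritten to [32765 TO 32770], and -3 sorts before -2 as expected.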

RE: Highlighter

2006-01-24 Thread Koji Sekiguchi
I've never used the .NET port of Lucene and its highlighter, but I believe we have to call Query.rewrite() to expand the query expression when using PhraseQuery, WildcardQuery, RegexQuery and FuzzyQuery, then pass the rewritten query to the highlighter. Hope this helps, Koji > -Original Message- > From: Gwyn Carwar

Highlighter

2006-01-24 Thread Gwyn Carwardine
I'm using the .NET port of highlighter (1.5) and I notice it doesn't highlight range or prefix queries. Is this consistent with the Java version? Only I note my standard reference of www.lucenebook.com seems to support highlighting. Is this using that same highlighter version (couldn't find any v

RE: Sorting by calculated custom score at search time

2006-01-24 Thread Tim.Wright
Nick Vincent [mailto:[EMAIL PROTECTED] wrote: [snip] > From an earlier thread discussing a calculated score based on the hit > score and the age of document I gather that TSS regenerate their indexes > to alter the document boost based on date. I need to be able to sort by > either relevance or

Sorting by calculated custom score at search time

2006-01-24 Thread Nick Vincent
Hi, I am trying to find a way to create scores with a custom formula based on the initial score from Lucene and field values from each document, e.g. for each document: finalScore = searchScore * (popularity) * (userRating) The customer requires this functionality as I have to replace an existi
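The formula in this message, finalScore = searchScore * popularity * userRating, can be illustrated with a small plain-Java sketch. It assumes the hits and their field values have already been collected into memory. The Hit class and sortByFinalScore method are hypothetical names, not Lucene API, and this is not how the FunctionQuery approach discussed in the replies works.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical post-processing step, not Lucene API: re-rank collected hits
// by a custom score computed from the search score and two field values.
class CustomScoreSort {
    static final class Hit {
        final String id;
        final float searchScore;  // score returned by the search
        final float popularity;   // assumed numeric field value
        final float userRating;   // assumed numeric field value

        Hit(String id, float searchScore, float popularity, float userRating) {
            this.id = id;
            this.searchScore = searchScore;
            this.popularity = popularity;
            this.userRating = userRating;
        }

        // finalScore = searchScore * popularity * userRating
        float finalScore() {
            return searchScore * popularity * userRating;
        }
    }

    // Returns a new list ordered by descending finalScore; the input is not modified.
    static List<Hit> sortByFinalScore(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Float.compare(b.finalScore(), a.finalScore()));
        return sorted;
    }
}
```

Sorting in memory like this only suits result sets small enough to collect in full. For large result sets, the score would need to be computed inside the search itself.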

Re: performance implications for an index with large number of documents.

2006-01-24 Thread Ori Schnaps
Hi, Thank you for all the quick and pertinent responses. The index is being optimized every hour due to the number of updates. The JVM has a heap of 2 GB and the machine has a total of 4 GB. Currently the JVM is not configured with the -server parameter and the parallel garbage collection (we are test

Re: No sub-file with id _18.f0 found

2006-01-24 Thread Otis Gospodnetic
Look for File Formats page on the left side of Lucene's home page. Some of that is also in Appendix B of Lucene in Action, but the web version is more up to date. Otis - Original Message From: gekkokid <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tue 24 Jan 2006 07:31:45

Re: Two strange things in Lucene

2006-01-24 Thread Paul . Illingworth
The TooManyClauses exception is due to the prefix query being rewritten to a boolean query that exceeds the boolean query's maximum number of clauses. It's an unchecked exception from the search method that you should probably explicitly catch and then return a helpful message to the user maybe

Re: Range queries

2006-01-24 Thread John Haxby
Erik Hatcher wrote: 2. How do I search for negative numbers in a range. For example field:[-3 TO 2] ? I don't mind hacking code such that my numbers are indexed as +0001 and -0001 and then I can override the query parser to change my query to [-003 TO +002]. However.. "+"

Re: Two strange things in Lucene

2006-01-24 Thread Erik Hatcher
On Jan 24, 2006, at 8:52 AM, Daniel Pfeifer wrote: Today I've been alerted by one of my colleagues that our Lucene-based indexing solution no longer refreshes the searchers and thus we never get any new indexed documents. Since I didn't find anything in the log from log4j I did a "kill -3" on

Two strange things in Lucene

2006-01-24 Thread Daniel Pfeifer
Today I've been alerted by one of my colleagues that our Lucene-based indexing solution no longer refreshes the searchers and thus we never get any new indexed documents. Since I didn't find anything in the log from log4j I did a "kill -3" on the process and found two very interesting things: Almo

Re: performance implications for an index with large number of documents.

2006-01-24 Thread Michael D. Curtin
Hi Ori, Before taking drastic rehosting measures, and introducing the associated software complexity of splitting your application into pieces running on separate machines, I'd recommend looking at the way your document data is distributed and the way you're searching them. Here are some qu

Re: No sub-file with id _18.f0 found

2006-01-24 Thread gekkokid
Is there a web page that lists all the files created in an index, so I can track down the problem I'm having? I'm using the latest source via svn and have rebuilt using ant. Every time I create an index, no matter how basic, I get errors from Luke. - Original Message - From: "gekkokid" <[EMAIL

Re: Lucene Web Site

2006-01-24 Thread Erik Hatcher
On Jan 24, 2006, at 6:48 AM, Mike Streeton wrote: How do you go about getting our product listed on the Powered By Lucene web site (http://wiki.apache.org/jakarta-lucene/PoweredBy) and latest news in the Wiki. Create a wiki account and add it yourself. Self-serve. Erik

Lucene Web Site

2006-01-24 Thread Mike Streeton
How do you go about getting our product listed on the Powered By Lucene web site (http://wiki.apache.org/jakarta-lucene/PoweredBy) and latest news in the Wiki. Many Thanks Mike www.ardentia.com

RE: Selecting the maxium/highest numerical value from a lucene Index)

2006-01-24 Thread Allan Dewar
Thanks for the suggestions... I've only been using Lucene for a few months, so will go with the sort option for now and see how that works. -Original Message- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: 23 January 2006 19:53 To: java-user@lucene.apache.org Subject: Re: Selecting t

RE: Range queries

2006-01-24 Thread Gwyn Carwardine
>> 2. How do I search for negative numbers in a range. For example >> field:[-3 TO >> 2] ? >> >> I don't mind hacking code such that my numbers are indexed as >> +0001 and >> -0001 and then I can override the query parser to change my >> query to >> [-003 TO +002]. However.. "

Re: Range queries

2006-01-24 Thread Erik Hatcher
On Jan 23, 2006, at 10:38 AM, Gwyn Carwardine wrote: Two queries about ranges: 1. field:[a TO z] does not return the same as field:[z TO a] I think it should. The standard QueryParser or even the range query should ascertain the lowest and highest and switch them around if necessary This

Re: performance implications for an index with large number of documents.

2006-01-24 Thread Chris Lamprecht
How much RAM do you have? If you're under linux, can you run something like "iostat -x -d -t 60" and watch your disk usage during searching? If your disk utilization is high, add more RAM (enough to hold your index in RAM) and see if the OS cache solves the problem. I would try this before the c