How to run Lucene in Action TestCase Examples

2009-02-26 Thread Seid Mohammed
I am using NetBeans and want to run some of the JUnit test examples from there. I have added the test programs to a test package in the project window and tried to run them, but they fail to run. I have also added the libs to the libraries. Thanks a lot, Seid M. -- "RABI ZIDNI ILMA"

RE: Use of scanned documents for text extraction and indexing

2009-02-26 Thread Renaud Waldura
There is quite a bit of literature available on this topic. This paper presents a summary. Nothing immediately applicable I'm afraid. Retrieving OCR Text: A survey of current approaches Steven M. Beitzel, Eric C. Jensen, David A Grossman Illinois Institute of Technology It lists a number of othe

Re: Faceted Search using Lucene

2009-02-26 Thread Amin Mohammed-Coleman
Forgot to mention that the previous code that I sent was related to facet search. This is a general search method I have implemented (they can probably be combined...). On Thu, Feb 26, 2009 at 8:21 PM, Amin Mohammed-Coleman wrote: > Hi > I have modified my search code. Here is the following: >

Re: Faceted Search using Lucene

2009-02-26 Thread Amin Mohammed-Coleman
Hi I have modified my search code. Here is the following: [code] public Summary[] search(SearchRequest searchRequest) throws SearchExecutionException { String searchTerm = searchRequest.getSearchTerm(); if (StringUtils.isBlank(searchTerm)) { throw new SearchExecutionException("Search string ca

Re: TermsFilter Usage Question

2009-02-26 Thread Chetan Shah
Shooot... The category type was stored but not indexed! The moment I flipped the flag of the categoryType field to analyzed, I was able to pull the results. -- View this message in context: http://www.nabble.com/TermsFilter-Usage-Question-tp22230841p22231549.html Sent from the Lu

Re: TermsFilter Usage Question

2009-02-26 Thread Chetan Shah
Yep. I searchcriteria and the category type value exists for a given document. -- View this message in context: http://www.nabble.com/TermsFilter-Usage-Question-tp22230841p22231524.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. -

Re: TermsFilter Usage Question

2009-02-26 Thread Erick Erickson
Probably a silly question, but are you *also* sure that you have documents with categoryType:xyz AND the values you are asking for in searchCriteria? I'd guess that you could also check this out in Luke with a simple +contents:blah +categoryType:xyz as a quick sanity check. The other thing it'd be

Re: Simple Java Object Search

2009-02-26 Thread Erick Erickson
This looks pretty good, the search is always looking in your "basketName" field unless the searchBox.getText() returns something like "field:value". What behavior are you seeing (or not seeing that you expect)? You might try creating your index as an FSDirectory then getting a copy of Luke to exa

TermsFilter Usage Question

2009-02-26 Thread Chetan Shah
Why is this code not returning any results? //Create the query and search QueryParser queryParser = new QueryParser("contents", new StandardAnalyzer()); Query query = queryParser.parse(searchCriteria);

RE: Simple Java Object Search

2009-02-26 Thread Ambati, Ravi BGI SF
Hi, Getting started shows how to index files. It does not have an example of how to index a list of objects. I was able to create an index for the list of objects. How do I retrieve a list of objects that match the query? // Create Index Directory di

Re: Simple Java Object Search

2009-02-26 Thread Garth Patil
Sure: http://lucene.apache.org/java/2_4_0/gettingstarted.html On Thu, Feb 26, 2009 at 9:06 AM, Ambati, Ravi BGI SF wrote: > > All, > > I have a list of java objects and would like to index the contents of > those objects. And would like to update the index whenever list of > objects is changed. >

Simple Java Object Search

2009-02-26 Thread Ambati, Ravi BGI SF
All, I have a list of Java objects and would like to index the contents of those objects. I would also like to update the index whenever the list of objects changes. The big question is: when a user searches for something in the index, I would like to get all the objects that matched that search string.

Re: Confidence scores at search time

2009-02-26 Thread Grant Ingersoll
I don't know of anyone doing work on it in the Lucene community. My understanding to date is that it is not really worth trying, but that may in fact be an outdated view. I haven't stayed up on the literature on this subject, so background info on what you are interested in would be help

Use of scanned documents for text extraction and indexing

2009-02-26 Thread Sudarsan, Sithu D.
Hi All: Has any study or research been done on using scanned paper documents as images (maybe PDF), then using OCR or some other technique to extract the text, and on the resulting index quality? Thanks in advance, Sithu D Sudarsan sithu.sudar...@fda.hhs.gov sdsudar...@ualr.edu

Re: contains functionality in Lucene

2009-02-26 Thread Danil ŢORIN
You can generate n-grams: for example, when you index "lucene" you create the tokens "luce", "ucen", "cene". This will increase the term count (and index size); however, at search time you simply search for a single term, which will be extremely fast. It depends how many documents you have, size of each docum
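The n-gram idea in the reply above can be sketched in plain Java. This is only an illustration, not code from the thread: the class and method names (`NGramSketch`, `ngrams`) are made up for this example, it does not use any Lucene classes, and the gram size of 4 is taken from the "luce", "ucen", "cene" example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Lucene.
class NGramSketch {

    // Returns every contiguous substring of length n, in order of position.
    // A term shorter than n yields an empty list.
    static List<String> ngrams(String term, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= term.length(); i++) {
            grams.add(term.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Each gram would be indexed as its own term, so a "contains"
        // search for "ucen" becomes a lookup of one exact term.
        System.out.println(ngrams("lucene", 4)); // prints [luce, ucen, cene]
    }
}
```

With a fixed gram size, a search string longer than the gram size would itself have to be split into grams and all of them required to match, and a search string shorter than the gram size would not match at all; how to handle those cases is a design choice the reply does not cover.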

Re: Field Normalisation in Query across two indexes

2009-02-26 Thread Dino Korah
Would anyone please help me with this? Thanks On 23/02/2009, Dino Korah wrote: > > Guys, > > I have a question on normalisation. > I am using a prehistoric version of lucene; 2.0.0 > > Context: http://markmail.org/message/z5lcz2htjvqsscam > > I have these two scenarios with indexes. > One: > 2 i

Re: contains functionality in Lucene

2009-02-26 Thread Erick Erickson
There is an option to turn leading wildcards on, see QueryParser. All the usual caveats about TooManyClauses apply. Best Erick On Thu, Feb 26, 2009 at 7:59 AM, wrote: > > Hi all, > > We have a business requirement that needs Lucene to search similar to > contains (of SQL) such that we can h

Re: Faceted Search using Lucene

2009-02-26 Thread Amin Mohammed-Coleman
Hi Thanks for your help. I will modify my facet search and my other code to use the recommendations. Would it be ok to get a review of the completed code? I just want to make sure that I'm not doing anything that may cause any problems (threading, memory). Cheers On Thu, Feb 26, 2009 at 1:10

Re: Merging two tokenized fields

2009-02-26 Thread Erick Erickson
Reconstructing a field from an index is 1> slow 2> lossy (what about stemmed words? stopwords? ) UNLESS you have stored the data (Field.Store.YES/COMPRESS), in which case you can just get the field from each index and put it in the new one. Tokenization has little to do with this although you coul

Re: Faceted Search using Lucene

2009-02-26 Thread Michael McCandless
See below -- this is an excerpt from the upcoming Lucene in Action revision (chapter 10). It's a simple class. Use it like this for searching: IndexSearcher searcher = manager.get(); try { searcher.search(...). ...render results... } finally { manager.release(searcher); s

contains functionality in Lucene

2009-02-26 Thread Joseph.Syjuco
Hi all, We have a business requirement that needs Lucene to search similar to contains (of SQL) such that we can have something like *ucen* which should return lucene and lucent ... unfortunately wildcards are not allowed at the start of the search keyword - how should I go about this? Is thi

Merging two tokenized fields

2009-02-26 Thread liat oren
Hi, I have two indexes, each has a tokenized field and I would like to combine them both into one field in a new index. How can it be done? (Is it a good approach or is it better to hold them as untokenized text and only when I create the new index, then to tokenize it?) Many thanks, Liat

Re: background merge hit exception

2009-02-26 Thread Michael McCandless
Also: what sorts of stored fields are you storing? Binary? Compressed? Text with unicode characters? Roughly how many stored fields per document? Mike vivek sar wrote: Hi, We ran into the same issue (corrupted index) using Lucene 2.4.0. There was no outage or system reboot - not su

Re: background merge hit exception

2009-02-26 Thread Michael McCandless
The exception you hit during optimize indicates some corruption in the stored fields file (_X.fdt). Then, the exception hit during CheckIndex was different -- the postings entry (_r0y.frq) for term "desthost:wir" was supposed to have 6 entries but had 0. Did "CheckIndex -fix" allow you to optim

Re: Faceted Search using Lucene

2009-02-26 Thread Amin Mohammed-Coleman
Hi Thanks for your reply. Without sounding completely... silly... how do I go about using the methods you mentioned? Cheers Amin On Thu, Feb 26, 2009 at 10:24 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > > Actually, it's best to use IndexReader.incRef/decRef to track the > Index

Re: Faceted Search using Lucene

2009-02-26 Thread Michael McCandless
Actually, it's best to use IndexReader.incRef/decRef to track the IndexReader. You should not rely on GC to close your IndexReader since this can easily tie up resources (eg open file descriptors) for too long. Mike Michael Stoppelman wrote: If another thread is executing a query with t
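The reference-counting approach recommended above can be illustrated with a small stand-alone Java sketch. It is not Lucene's implementation and does not touch IndexReader: the class `RefCountedSketch` and its `isClosed` flag are hypothetical stand-ins, and "closing" here only flips a boolean where a real reader would release file descriptors.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a shared resource such as an index reader.
class RefCountedSketch {

    // The creator holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    // Take a reference before using the resource; fails once it is closed.
    void incRef() {
        while (true) {
            int current = refCount.get();
            if (current <= 0) {
                throw new IllegalStateException("resource already closed");
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return;
            }
        }
    }

    // Give a reference back; the last one out closes the resource.
    void decRef() {
        if (refCount.decrementAndGet() == 0) {
            closed = true; // a real reader would release its files here
        }
    }

    boolean isClosed() {
        return closed;
    }

    public static void main(String[] args) {
        RefCountedSketch reader = new RefCountedSketch();
        reader.incRef();   // a search thread starts using the reader
        reader.decRef();   // the owner drops its reference after reopening
        System.out.println(reader.isClosed()); // prints false
        reader.decRef();   // the search thread finishes
        System.out.println(reader.isClosed()); // prints true
    }
}
```

The point of the pattern is that the resource is closed deterministically when the last user releases it, instead of waiting for garbage collection. Callers would normally pair `incRef` with `decRef` in a try/finally block.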

Re: Faceted Search using Lucene

2009-02-26 Thread Amin Mohammed-Coleman
Hi Thanks for your reply. I have modified the code to the following: public Map getFacetHitCount(String searchTerm) { QueryParser queryParser = new MultiFieldQueryParser(FieldNameEnum.fieldNameDescriptions(), analyzer); Query baseQuery = null; try { if (!StringUtils.isBlank(searchTerm)) { ba