FieldSelector

2009-02-16 Thread spring
Hi, what kind of fields loads IndexSearcher.Document doc(int i)? Only those with Field.Store.YES? I'm asking because I do not need to load the tokens - should I use a FieldSelector or are these fields not loaded? Thank you - To

Re: FieldSelector

2009-02-16 Thread Grant Ingersoll
On Feb 16, 2009, at 7:00 AM, wrote: Hi, what kind of fields loads IndexSearcher.Document doc(int i)? Only those with Field.Store.YES? Yes, Lucene can only load those fields that are stored. I'm asking because I do not need to load the tokens - should I use a FieldSelector or are thes

Re: FieldSelector

2009-02-16 Thread Erick Erickson
Depending upon the structure of your index, FieldSelector can make a dramatic difference in your query speed. I wrote up some data peculiar to my setup, I think if you search FieldSelector on the Wiki you can find it. Best Erick On Mon, Feb 16, 2009 at 7:00 AM, wrote: > Hi, > > what kind of fie

Pattern for maintaining FSDirectory copy of RAMDirectory

2009-02-16 Thread Joel Halbert
Hi, I have a RAMDirectory based index. The document source for the index is a database table, where content to be indexed is stored alongside a status (pending_index, indexed, pending_delete, deleted). Each time the application is started, and periodically thereafter, all documents from the databa

Re: Pattern for maintaining FSDirectory copy of RAMDirectory

2009-02-16 Thread Erick Erickson
WARNING: I haven't had occasion to use the Directory.copy method, so I'm talking a bit theoretically here. I guess my main issue with your scheme is the usual abnormal termination issues and how you can be absolutely sure about what's in your FSDir. So I guess what I'd do is create some kind

Re: Fragment Highlighter Phrase?

2009-02-16 Thread Mark Miller
Ian Vink wrote: Thanks Mark, I got the latest Contrib bits for Highlighter.net (Jan 28/2008 Version 2.3.2) but it looks similar to the older 2.0.0. There is a QueryScorer only. Any ideas? (Really important to me :) Ian I'll send out an email and see if I can get my hands on the C# port a

Querying for a catagory

2009-02-16 Thread AmigoProgrammer
Hi, I have a number of documents that each relate to a client. I would like to use an index and queries to answer two questions: - Find relevant documents - Find relevant clients The first one is straightforward. For the second one, I am wondering. Should I iterate over the hits and compute the
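One possible reading of "iterate over the hits" is to sum the scores of all hits that belong to the same client and rank clients by that total. The sketch below uses plain Java collections only; the `Hit` class, `rankClients`, and the idea of summing scores are assumptions made for illustration, not something stated in the thread and not part of Lucene.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ClientRanking {
    /** Hypothetical stand-in for one search hit: the client it belongs to and its score. */
    static final class Hit {
        final String clientId;
        final float score;

        Hit(String clientId, float score) {
            this.clientId = clientId;
            this.score = score;
        }
    }

    /** Sums hit scores per client and returns client ids, highest total first. */
    static List<String> rankClients(List<Hit> hits) {
        final Map<String, Float> totals = new HashMap<String, Float>();
        for (Hit h : hits) {
            Float soFar = totals.get(h.clientId);
            totals.put(h.clientId, (soFar == null ? 0f : soFar) + h.score);
        }
        List<String> clients = new ArrayList<String>(totals.keySet());
        // Order by descending total; break ties by client id so the result is stable.
        clients.sort((a, b) -> {
            int byScore = Float.compare(totals.get(b), totals.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return clients;
    }
}
```

Summing favours clients with many moderately relevant documents; taking the maximum score per client instead would favour a single strong match. Which is right depends on what "relevant client" means for the application.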

Faceted Search using Lucene

2009-02-16 Thread Amin Mohammed-Coleman
Hi I am looking at building a faceted search using Lucene. I know that Solr comes with this built in, however I would like to try this by myself (something to add to my CV!). I have been looking around and I found that you can use the IndexReader and use TermVectors. This looks ok but I'm not su
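At its core, a facet count is a tally of one field's values over the documents that matched the query. The sketch below shows only that counting step in plain Java; `FacetCounter`, `countFacet`, and the in-memory `facetValueByDoc` map are hypothetical names used for illustration, and in a real index the per-document values would have to come from the index itself (for example via the IndexReader mentioned above).

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class FacetCounter {
    /**
     * Counts how often each facet value occurs among the matching documents.
     * Documents with no value for the facet are skipped.
     */
    static Map<String, Integer> countFacet(List<Integer> matchingDocs,
                                           Map<Integer, String> facetValueByDoc) {
        // TreeMap keeps facet values in alphabetical order for display.
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Integer docId : matchingDocs) {
            String value = facetValueByDoc.get(docId);
            if (value == null) {
                continue;
            }
            Integer n = counts.get(value);
            counts.put(value, n == null ? 1 : n + 1);
        }
        return counts;
    }
}
```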

[ANN] VTD-XML 2.5

2009-02-16 Thread crackeur
VTD-XML 2.5 is now released. Please go to https://sourceforge.net/project/showfiles.php?group_id=110612&package_id=120172&release_id=661376 to download the latest version. Changes from Version 2.4 (2/2009) * Added separate VTD indexing generating and loading (see http://vtd-xml.sf.net/persi

Re: Querying for a catagory

2009-02-16 Thread Erick Erickson
What constitutes a "relevant client"? If you want to restrict the returned documents to a particular client (or even a set of clients) a simple +client: would do the trick. Or you could create a Filter for "relevant clients". If neither of these helps, could you clarify your definition of a r

Re: What's the best way to store metadata?

2009-02-16 Thread Grant Ingersoll
On Feb 11, 2009, at 1:05 PM, Mandy Günther wrote: Hi all, I want to use Lucene in my project but I have the following problem: My goal is to store metadata in the index. Here is an example of the stuff I need to index 1; My; determiner; 2; project; noun; 1 3; uses; verb;

2.3.2 -> 2.4.0 StandardTokenizer issue

2009-02-16 Thread Philip Puffinburger
We have our own Analyzer which has the following public final TokenStream tokenStream(String fieldname, Reader reader) { TokenStream result = new StandardTokenizer(reader); result = new StandardFilter(result); result = new MyAccentFilter(result); result = new LowerCaseFilter(result)

Re: How to compute the simlarity of a web page?

2009-02-16 Thread Grant Ingersoll
Hmmm, you might be able to do the following: Create a document in a memory index containing the web page Create a query from the keywords Do a search with the query against the memory index and see the score. Alternatively, you could use the corpus statistics plus to create a term vector from
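The term-vector idea can be illustrated without an index at all: build a term-frequency vector for the page and one for the keywords, then compare them. The sketch below is a plain-Java stand-in that uses raw term counts and cosine similarity; it is not Lucene's scoring formula, it ignores corpus statistics, and `PageSimilarity`, `termFrequencies`, and `cosine` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class PageSimilarity {
    /** Lower-cases the text, splits on non-alphanumerics, and counts each term. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<String, Integer>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue;
            }
            Integer n = tf.get(token);
            tf.put(token, n == null ? 1 : n + 1);
        }
        return tf;
    }

    /** Cosine of the angle between two term-frequency vectors; 0.0 if either is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            int va = e.getValue();
            normA += va * va;
            Integer vb = b.get(e.getKey());
            if (vb != null) {
                dot += va * vb;
            }
        }
        for (int vb : b.values()) {
            normB += vb * vb;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

A typical call would be `PageSimilarity.cosine(PageSimilarity.termFrequencies(pageText), PageSimilarity.termFrequencies(keywords))`, where a value near 1 means the page is dominated by the keywords and 0 means no terms are shared.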

Unique Filter on search results

2009-02-16 Thread selvaa
Hi, I am creating a tracker for web applications. I am indexing all the user credentials while they are logging. The problem is, a user might hit the same web page many times during the action period. So my tracker application is indexing the same value for several
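If the duplicates are acceptable in the index and only need to be hidden at display time, one option is to drop repeated values while walking the results. The sketch below keeps the first occurrence of each value in plain Java; `UniqueHits`, `firstOccurrences`, and the use of the page URL as the uniqueness key are assumptions for illustration, since the message is cut off before it says which field repeats.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class UniqueHits {
    /** Returns the input values with later duplicates removed, preserving order. */
    static List<String> firstOccurrences(List<String> pageUrls) {
        Set<String> seen = new HashSet<String>();
        List<String> unique = new ArrayList<String>();
        for (String url : pageUrls) {
            // Set.add returns false when the value was already present.
            if (seen.add(url)) {
                unique.add(url);
            }
        }
        return unique;
    }
}
```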