problem with eclipse and lucene 1.9.1

2006-05-18 Thread Amir Hosein Jadidi Nejad
Hi all, I want to use Lucene 1.9.1 in the Eclipse IDE on Windows, but I can't create a new project with "Java Project from Existing Ant Buildfile" in the New Project window. When I use the build.xml at the top level of the Lucene source folder, the following error occurs: specified buildfile does not contain a

Anyone used Zilverline?

2006-05-18 Thread SG Edwards
Dear all, I am a PhD student working on a text-mining project and I have a literature collection in a Postgres database. I now need to make this rapidly searchable for users through a web-interface so I have been looking at Lucene. I have been looking at the Zilverline search tool

Re: Anyone used Zilverline?

2006-05-18 Thread Chris Lu
You can try DBSight. http://www.dbsight.net You just need to use simple SQL to select content out. And you can customize ranking, analyzers, result templates, etc. You can create a google-like search in 15 minutes. Please let me know your feedback also. Chris Lu

SV: Sort problematics

2006-05-18 Thread Marcus Falck
I have slow subsequent searches. And if I get the cache up and running, is it persisted to disk? /Marcus From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Wed 2006-05-17 16:31 To: java-user@lucene.apache.org Subject: Re: Sort problematics On 5/17/06,

Re: problem with eclipse and lucene 1.9.1

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 2:17 AM, Amir Hosein Jadidi Nejad wrote: I want to use Lucene 1.9.1 in the Eclipse IDE on Windows, but I can't create a new project with "Java Project from Existing Ant Buildfile" in the New Project window. When I use the build.xml at the top level of the Lucene source folder,

Re: caching lucene

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 3:06 AM, Alberto Marqués wrote: Hello I need to know if Lucene has breaks for search, and in case of having it if I can form it and like becoming it. huh? I'm sorry, but I do not understand the question. Erik

Re: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 4:52 AM, Marcus Falck wrote: I have slow subsequent searches. And if I get the cache up and running, is it persisted to disk? No, Lucene's caches are not persisted, only in RAM. Are you using a new IndexReader/IndexSearcher for your subsequent searches? If not, you're

SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
Yes Erik, I'm instantiating a new IndexSearcher for every search. -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 18 May 2006 12:08 To: java-user@lucene.apache.org Subject: Re: SV: Sort problematics On May 18, 2006, at 4:52 AM, Marcus Falck wrote: I

My first question

2006-05-18 Thread Dan Wiggin
Hi Luceners, I'm reading Lucene in Action and trying out the examples. I have some questions: If I have to index and I'm using MultiSearcher to search in my index, what do I have to do for every search? Do I need a new MultiSearcher for every search petition, or can I keep my MultiSearcher object

Re: SV: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 6:41 AM, Marcus Falck wrote: Yes Erik, I'm instantiating a new IndexSearcher for every search. Then don't :) You only need a new IndexSearcher instance when the index itself has changed. -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED]
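
The advice above (keep one searcher and replace it only when the index has changed) can be sketched without touching the Lucene API at all. This is a minimal illustration under stated assumptions, not code from the thread: `SearcherHolder`, `currentVersion` and `opener` are hypothetical names, and the "version" is simply any number that changes whenever the index changes. Mapping it onto a real Lucene version check is left as an assumption.

```java
import java.util.function.LongFunction;
import java.util.function.LongSupplier;

/** Hypothetical holder: caches one searcher and rebuilds it only when the index version changes. */
final class SearcherHolder<S> {
    private final LongSupplier currentVersion; // assumed: reports the current index version
    private final LongFunction<S> opener;      // assumed: opens a searcher for that version
    private S searcher;                        // the shared, reused instance
    private long openedVersion;                // version the cached searcher was opened at
    private int opens;                         // how many times a searcher was actually opened

    SearcherHolder(LongSupplier currentVersion, LongFunction<S> opener) {
        this.currentVersion = currentVersion;
        this.opener = opener;
    }

    /** Returns the cached searcher, reopening it only if the index version has moved on. */
    synchronized S get() {
        long version = currentVersion.getAsLong();
        if (searcher == null || version != openedVersion) {
            searcher = opener.apply(version);
            openedVersion = version;
            opens++;
        }
        return searcher;
    }

    synchronized int openCount() {
        return opens;
    }
}
```

Every request would call `get()` instead of constructing its own searcher. The sketch deliberately leaves out closing the previous searcher and warming the new one, which matter in practice when the index changes often.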

SV: SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
Yes I know. But the index is changed constantly. / Marcus -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 18 May 2006 12:52 To: java-user@lucene.apache.org Subject: Re: SV: SV: Sort problematics On May 18, 2006, at 6:41 AM, Marcus Falck wrote: Yes

Re: My first question

2006-05-18 Thread Marc Dauncey
I think that's meant to be "partition", which would definitely make sense in the context of using a MultiSearcher to search logical domain-specific partitions within an app. - Original Message From: Erik Hatcher [EMAIL PROTECTED] To: java-user@lucene.apache.org Sent: Thursday, 18 May,

Re: My first question

2006-05-18 Thread Dan Wiggin
Excuse me, hehe. I searched and "petition" doesn't exist in English; it is a silly translation of a Spanish word that means "request". EXCUSE ME.

cache in lucene

2006-05-18 Thread Alberto Marqués
My question is whether Lucene caches the results of its searches in memory to optimize them, and whether this cache can be accessed in order to configure it.

How to do analysis when creating a query programmatically?

2006-05-18 Thread Satuluri, Venu_Madhav
Hi all, I have recently shifted to creating queries programmatically rather than using the QueryParser as this gave me more flexibility. I am facing a new problem, though: when indexing my fields are being analyzed (on a per-field basis: most are being stemmed etc, some are keywords returned as

Re: SV: SV: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 7:04 AM, Marcus Falck wrote: Yes I know. But the index is changed constantly. Then use Solr :)) Erik / Marcus -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 18 May 2006 12:52 To: java-user@lucene.apache.org Subject:

Re: My first question

2006-05-18 Thread Erik Hatcher
So if I got this right, you're asking if you should use a new MultiSearcher instance for every request. Absolutely not. The techniques for managing when to construct new instances and bring them online and how to warm them up is an interesting topic that I believe Solr has a very nice

Re: How to do analysis when creating a query programmatically?

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 8:08 AM, Satuluri, Venu_Madhav wrote: Is there any way to run my Query object through my analyzer? Or is there another solution? But of course. Have a look at the source code to QueryParser.getFieldQuery() - it does this very thing. I'm glad to see more folks

SV: SV: SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
Doesn't Solr use the same sort implementation as Lucene? -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 18 May 2006 14:57 To: java-user@lucene.apache.org Subject: Re: SV: SV: SV: Sort problematics On May 18, 2006, at 7:04 AM, Marcus Falck wrote:

Identifying Analyzers used in Microsoft's Lookout Lucene index for emails

2006-05-18 Thread mark harwood
For those who don't know Lookout is now a Microsoft product based on a .net lucene port which indexes all your Outlook mails. (http://www.lookoutsoft.com/Lookout/download.html) Lookout does a good job of integrating with Outlook and periodically indexing the content but I'd like to put a better

Re: Sort problematics

2006-05-18 Thread karl wettin
On Thu, 2006-05-18 at 16:22 +0200, Marcus Falck wrote: Doesn't solr use the same sort implementation as Lucene ? Solr comes with more cache. Is it a requirement that the new data is instantly available?

SV: Sort problematics

2006-05-18 Thread Marcus Falck
OK. I just set up a machine running Solr and now I will index a couple of gigabytes to see the difference in performance (using a sort). But since my real index will be around 2 TB in size, I don't think sorting is the right way to go? I'm pretty sure I will have to modify the ranking. And yes

Re: SV: SV: SV: Sort problematics

2006-05-18 Thread Yonik Seeley
On 5/18/06, Marcus Falck [EMAIL PROTECTED] wrote: Doesn't solr use the same sort implementation as Lucene ? Yes, but Solr handles the mechanics of warming up a new searcher in the background to avoid those lengthy first-time hits to the FieldCache and norms, and it warms any configured caches

RE: How to do analysis when creating a query programmatically?

2006-05-18 Thread Satuluri, Venu_Madhav
Thanks very much Erik. The QueryParser method was pretty useful in writing my own one. -Venu -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Thursday, May 18, 2006 7:09 PM To: java-user@lucene.apache.org Subject: Re: How to do analysis when creating a query

SV: SV: SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
But it will still require A LOT of RAM just to cache! -Original Message- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: 18 May 2006 17:24 To: java-user@lucene.apache.org Subject: Re: SV: SV: SV: Sort problematics On 5/18/06, Marcus Falck [EMAIL PROTECTED] wrote:

Re: Sort problematics

2006-05-18 Thread Yonik Seeley
On 5/18/06, Marcus Falck [EMAIL PROTECTED] wrote: But since my real index will be around 2 TB in size, I don't think sorting is the right way to go? I'm pretty sure I will have to modify the ranking. They are both sorts, and they both use a priority queue. The differences shouldn't be that great
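
The point that ranking by score and sorting by a field are both "sorts" backed by a priority queue can be illustrated with a generic top-N selection. This is only a sketch of the general technique, not Lucene's own hit queue; `TopN` and `top` are hypothetical names, and plain float scores stand in for whatever sort key is used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper: selects the n best scores with a heap that never holds more than n entries. */
final class TopN {
    static List<Float> top(float[] scores, int n) {
        List<Float> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        // Min-heap: the weakest of the currently kept scores sits on top.
        PriorityQueue<Float> heap = new PriorityQueue<>(n);
        for (float score : scores) {
            if (heap.size() < n) {
                heap.add(score);
            } else if (score > heap.peek()) {
                heap.poll();      // evict the weakest kept score
                heap.add(score);  // keep the better one instead
            }
        }
        result.addAll(heap);
        result.sort(Collections.reverseOrder()); // best first
        return result;
    }
}
```

Only n entries are held at any time, so memory depends on the number of results requested rather than on the number of matching documents. Swapping the float for a field value changes the comparison, not the structure.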

How are results merged from a multisearcher?

2006-05-18 Thread Tom Emerson
Greetings, Could someone describe how the results from multiple indices are merged when using a MultiSearcher? My naive intuition is that the scores for documents found in each index could be wildly different, so what criteria is used to merge the scored docs? Many thanks in advance, -tree

Re: problem with eclipse and lucene 1.9.1

2006-05-18 Thread Tom Emerson
When adding the Lucene jar file to an Eclipse project you can attach the source code to the jarfile, which allows you to step into Lucene without actually having to build it. This is really convenient, and is easily done. Assuming you have the Lucene jar in your project you attach the source by:

Re: How are results merged from a multisearcher?

2006-05-18 Thread Ken Krugler
Greetings, Could someone describe how the results from multiple indices are merged when using a MultiSearcher? My naive intuition is that the scores for documents found in each index could be wildly different, so what criteria is used to merge the scored docs? I believe they are blindly

Re: SV: SV: SV: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 11:33 AM, Marcus Falck wrote: But it will still require A LOT of RAM just to cache! Well, the more RAM you have the better when it comes to Solr responsiveness, I'm sure. But, Solr leverages some caching cleverness so the queries and filters used most frequently are

SV: Sort problematics

2006-05-18 Thread Marcus Falck
I'm well aware of the trade-offs. But if you were aware of the large amounts of data that this system should be able to search, you wouldn't propose the usage of a database. Since I have a separate alert service for immediate alerts up and running, I may be able to make trade-offs with the data

SV: SV: SV: SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
Sounds very interesting. But my system shouldn't benefit much from this kind of caching, since the searches are triggered by thousands of different companies that mainly search for their own products. / Marcus From: Erik Hatcher

Re: Sort problematics

2006-05-18 Thread Yonik Seeley
On 5/18/06, Marcus Falck [EMAIL PROTECTED] wrote: I'm well aware of the trade-offs. But if you were aware of the large amounts of data that this system should be able to search, you wouldn't propose the usage of a database. If you have a hard requirement of instantly seeing any update, you

should I create a redundant field instead of using BooleanQueries?

2006-05-18 Thread Paulo Silveira
Hello An example: my document has 3 fields: field1, field2 and field3. I have to make queries for each field, and sometimes using all the fields. Should I use a BooleanQuery when searching for a string in the 3 fields, or should I create a redundant field4 (where field4 is the concat of

SV: Sort problematics

2006-05-18 Thread Marcus Falck
Hi, where can I read more about the Lucene sort implementation? Does there exist any documentation on the sorting except for the Lucene API docs? / Marcus From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Thu 2006-05-18 20:39 To:

Re: How are results merged from a multisearcher?

2006-05-18 Thread Daniel Naber
On Thursday 18 May 2006 18:36, Ken Krugler wrote: Could someone describe how the results from multiple indices are merged when using a MultiSearcher? My naive intuition is that the scores for documents found in each index could be wildly different, so what criteria is used to merge the

SV: Sort problematics

2006-05-18 Thread Marcus Falck
From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Thu 2006-05-18 20:39 To: java-user@lucene.apache.org Subject: Re: Sort problematics On 5/18/06, Marcus Falck [EMAIL PROTECTED] wrote: I'm well aware of the trade-offs. But if you were aware of the large

Re: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 4:25 PM, Marcus Falck wrote: Where can I read more about the Lucene sort implementation? Does there exist any documentation on the sorting except for the Lucene API docs? Well, there is Lucene in Action which covers sorting in a fair bit of detail. I hear that book is

SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
Well that book is cool =) From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Thu 2006-05-18 22:56 To: java-user@lucene.apache.org Subject: Re: SV: Sort problematics On May 18, 2006, at 4:25 PM, Marcus Falck wrote: Where can I read more about the Lucene

Re: Sort problematics

2006-05-18 Thread Yonik Seeley
On 5/18/06, Marcus Falck [EMAIL PROTECTED] wrote: If I use Lucene's default implementation of the TermScorer and search for "you OR her", the term scorer will give a higher score to documents containing both terms. This is a problem (in our application) since in this case we want the same score on

Re: SV: Sort problematics

2006-05-18 Thread Günther Starnberger
On Thu, May 18, 2006 at 10:53:23PM +0200, Marcus Falck wrote: Hello, The term scorer will give a higher score to documents containing both terms. This is a problem (in our application) since in this case we want the same score on documents as long as they contain 1 of the terms (since we are

Re: How are results merged from a multisearcher?

2006-05-18 Thread Tom Emerson
OK, but what does "merged correctly" mean? That is really the crux of my question: what are the merging semantics across indices with possibly divergent IDFs? On 5/18/06, Daniel Naber [EMAIL PROTECTED] wrote: On Thursday 18 May 2006 18:36, Ken Krugler wrote: Could someone describe how the

Re: How are results merged from a multisearcher?

2006-05-18 Thread Daniel Naber
On Thursday 18 May 2006 23:26, Tom Emerson wrote: OK, but what does "merged correctly" mean? I assume it means: querying over several indices gives the same ranking as if the documents were in one index. Regards Daniel -- http://www.danielnaber.de
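
One way to read "the same ranking as if the documents were in one index" is that term statistics are combined across the sub-indices before scoring, so every document is scored against the same IDF. The sketch below shows only that arithmetic and makes no claim about how MultiSearcher is implemented. `GlobalIdf` and `mergedIdf` are hypothetical names, and the formula log(numDocs / (docFreq + 1)) + 1 is used on the assumption that it matches Lucene's default similarity of this era.

```java
/** Hypothetical illustration: IDF from per-index statistics versus IDF from summed statistics. */
final class GlobalIdf {
    /** Assumed default-style formula: log(numDocs / (docFreq + 1)) + 1. */
    static double idf(long docFreq, long numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    /** IDF computed as if all sub-indices were a single index: sum the statistics first. */
    static double mergedIdf(long[] docFreqs, long[] numDocs) {
        long totalDocFreq = 0;
        long totalDocs = 0;
        for (int i = 0; i < docFreqs.length; i++) {
            totalDocFreq += docFreqs[i];
            totalDocs += numDocs[i];
        }
        return idf(totalDocFreq, totalDocs);
    }
}
```

For a term found in 5 of 1000 documents in one index and 50 of 100 in another, `idf(5, 1000)` and `idf(50, 100)` differ widely, which is the divergence the question is about. `mergedIdf` yields one shared value equal to `idf(55, 1100)`, the value a single combined index would produce.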

SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
Hi Günther. We thought in terms of an index containing the search profiles, and searching that index using the documents as queries. But we couldn't really figure it out. We have an alert service up and running today using Verity's implementation of alerts. So we looked at the Verity

Re: How are results merged from a multisearcher?

2006-05-18 Thread Ken Krugler
On Thursday 18 May 2006 18:36, Ken Krugler wrote: Could someone describe how the results from multiple indices are merged when using a MultiSearcher? My naive intuition is that the scores for documents found in each index could be wildly different, so what criteria is used to merge

using MultiFieldQueryParser and WildcardQuery?

2006-05-18 Thread Van Nguyen
I've read through the book and was unable to find a solution to this problem. Currently, my query looks like this: (+description_short:white +description_short:hard +description_short:hat) (+description_long:white +description_long:hard +description_long:hat) using a MultiFieldQueryParser and

Re: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 5:22 PM, Günther Starnberger wrote: On Thu, May 18, 2006 at 10:53:23PM +0200, Marcus Falck wrote: Hello, The term scorer will give a higher score to documents containing both terms. This is a problem (in our application) since in this case we want the same score on documents

Re: using MultiFieldQueryParser and WildcardQuery?

2006-05-18 Thread Erick Erickson
I'm pretty sure that just submitting the query will work. You might want to use the QueryParser(String, Analyzer) form. Don't be put off by the fact that the String is the default field, it doesn't make any difference given that you qualify each term with the field. In fact, you can even use a

Re: using MultiFieldQueryParser and WildcardQuery?

2006-05-18 Thread Chris Hostetter
I think you may have misunderstood his question. I believe what he was saying is that his current use of MultiFieldParser will give him ...blah... based on the input string "white hard hat"; but what he'd like to get is the same final query structure, but with the individual clauses being