date:20050128

Re: Re-Indexing a moving target???

2005-01-28 Thread Nader Henein

We'll need a little more detail to help you: what are the sizes of your updates, and how often are they updated? 1) No, just re-open the index writer every time you re-index; according to you it's a moderately changing index, so just keep a flag on the rows and batch-index every so often. 2) It all
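
The "flag the rows, then batch" idea above can be sketched in plain Java. This is only an illustration under stated assumptions: the class DirtyRowBatcher and its methods are hypothetical, the flag is kept in memory rather than as a database column, and the actual re-indexing step (opening an index writer and re-adding each document) is left as a comment.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: remembers which rows changed since the last batch run.
final class DirtyRowBatcher {
    // LinkedHashSet de-duplicates ids and keeps first-marked order.
    private final Set<Long> dirtyIds = new LinkedHashSet<Long>();

    // Called whenever a row is inserted or updated.
    synchronized void markDirty(long rowId) {
        dirtyIds.add(rowId);
    }

    // Called by a periodic job: returns the pending ids and clears the flags.
    synchronized List<Long> drain() {
        List<Long> batch = new ArrayList<Long>(dirtyIds);
        dirtyIds.clear();
        return batch;
    }

    public static void main(String[] args) {
        DirtyRowBatcher batcher = new DirtyRowBatcher();
        batcher.markDirty(7L);
        batcher.markDirty(3L);
        batcher.markDirty(7L); // already flagged, no duplicate entry
        // A real job would open an index writer here, re-index each row, then close it.
        System.out.println(batcher.drain()); // prints [7, 3]
    }
}
```

A periodic job would call drain() every so often and re-index only the returned rows, so the index writer is opened once per batch rather than once per change.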

Re-Indexing a moving target???

2005-01-28 Thread Yousef Ourabi

Hey, We are using lucene to index a moderately changing database, and I have a couple of questions on a performance strategy. 1) Should we just have one index writer open until the system comes down...or create a new index writer each time we re-index our data-set. 2) Does anyone have any thoughts..

Simple question about concurrency

2005-01-28 Thread Peter Kim

Hi, I'm still mostly a beginner, both with Java and Lucene, so I apologize if these are dumb questions. Is making index-modifying operations "safe" as simple as just doing the following? synchronized (writer) { while (IndexReader.isLocked(directory)) wait(); writ

Re: Penalty for storing unrelated field?

2005-01-28 Thread Andy Goodell

You should be fine. On Fri, 28 Jan 2005 15:21:50 -0600, Bill Tschumy <[EMAIL PROTECTED]> wrote: > I just want to make sure > that adding the unrelated field to a single doc won't cause all the > other documents to increase their storage space. > -- I have lots of fields that only occur in one d

Penalty for storing unrelated field?

2005-01-28 Thread Bill Tschumy

I have an index containing a lot of documents with common fields. Is there any speed/space penalty for adding an unrelated document with a totally unrelated field? I want to store a version number and maybe a few other bits of meta-info in the index. I just want to make sure that adding the

Re: query term frequency

2005-01-28 Thread markharw00d

This from the highlighter package will give you the IDF : WeightedTerm[] QueryTermExtractor.getIdfWeightedTerms(Query query, IndexReader reader, String fieldName)

Re: query term frequency

2005-01-28 Thread Grant Ingersoll

I implemented a Query version of the TermVector org.apache.lucene.search.QueryTermVector Works off of an array of Strings or a String and an Analyzer. Is this what you are looking for? >>> [EMAIL PROTECTED] 1/28/2005 6:33:18 AM >>> On Jan 27, 2005, at 10:24 PM, Jonathan Lasko wrote: > No, the

Re: Lucene in Action hits desk in UK

2005-01-28 Thread Otis Gospodnetic

Hello, I've asked the publisher ( http://www.manning.com ) yesterday. I don't know about the exact stores, but apparently they do have a distributor in Singapore, so you should be able to find Lucene in Action there soon. Otis --- jac jac <[EMAIL PROTECTED]> wrote: > > Just wondering: > > Is

Re: Disk space used by optimize

2005-01-28 Thread Otis Gospodnetic

Morus, that description of 3 sets of index files is what I was imagining, too. I'll have to test and add to the book errata, it seems. Thanks for the info, Otis --- Morus Walter <[EMAIL PROTECTED]> wrote: > Otis Gospodnetic writes: > > Hello, > > > > Yes, that is how optimize works - copies a

Re: Loading a large index

2005-01-28 Thread Otis Gospodnetic

Edwin, --- Edwin Tang <[EMAIL PROTECTED]> wrote: > I have three indices really that I search via ParallelMultiSearcher. > All three > are being updated constantly. We would like to be able to perform a > search on > the indices and have the results reflect the latest documents > indexed. However,

Re: total number of (unique) terms in the index

2005-01-28 Thread Otis Gospodnetic

I don't think there is a direct way to get the number of (unique) terms in the index, so yes, I think you'll have to loop through TermEnum and count. Otis --- Jonathan Lasko <[EMAIL PROTECTED]> wrote: > I'm looking for the total number of unique terms in the index. I see > > that I can get a T

Re: carrot2 question too - Re: Fun with the Wikipedia

2005-01-28 Thread Owen Densmore

I looked at the Carrot2 docs which mentioned dimension reduction via singular value decomposition (SVD) .. and other forms too I think. Question: Does anyone have pointers to successful clustering techniques used with lucene? I'm particularly interested in 2D and 3D graphics as well, possibly

document numbers

2005-01-28 Thread Jonathan Lasko

Yet another burning question :-). Can someone explain how the document numbers in Lucene documents work? For example, the TermDocs.doc() method returns "the current doc number." How can I get this doc number if I just have a Document? Here's the context. I'm working on implementing Justin Z

Re: lucene query (sql kind)

2005-01-28 Thread Erik Hatcher

Ross - I'm really perplexed by your message. You create HTML from a database so that you can index it with Lucene, yet wish you could simply index the data in your database tied to a primary key directly, right? Well, you're in luck - you already can do this! What are you using for indexing?

RE: lucene query (sql kind)

2005-01-28 Thread Ross Rankin

I agree. My site is all dynamic pages created from the database. Right now, I have to have a process create dummy pages, index them with Lucene, then translate the Lucene results into meaningful links. It actually works better than it sounds, however it could be easier. If I could just give Luc

total number of (unique) terms in the index

2005-01-28 Thread Jonathan Lasko

I'm looking for the total number of unique terms in the index. I see that I can get a TermEnum of all the terms in the index, but what is the fastest way to get the total number of terms? Jonathan

Loading a large index

2005-01-28 Thread Edwin Tang

I have three indices really that I search via ParallelMultiSearcher. All three are being updated constantly. We would like to be able to perform a search on the indices and have the results reflect the latest documents indexed. However, that would mean I need to "refresh" my searcher. Because of th

Re: lucene query (sql kind)

2005-01-28 Thread jian chen

I like your idea and think you are quite right. I see quite some people are using lucene to the extreme such that relational database functionalities are replaced by lucene. However, storing everything in lucene and use it as a relational type of database will be kind of re-inventing the wheel. Fo

Re: carrot2 question too - Re: Fun with the Wikipedia

2005-01-28 Thread Akmal Sarhan

Hello, we have been experimenting with carrot2 and are very pleased so far. Only one issue: there is no release, not even an alpha one, and the dependencies seem to be patched (jama). Are there any intentions to have any releases in the near future? thanks Akmal Am Montag, den 17.01.2005, 10:15 +

Re: Search results excerpt similar to Google

2005-01-28 Thread Maik Schreiber

Storing in the index has some performance benefits in the CVS version of Lucene, as you can store term position offset information and avoid having to re-analyze for highlighting. Speaking of which, is there a planned release date for a version that contains this feature? -- Maik Schreiber *

Re: Search results excerpt similar to Google

2005-01-28 Thread Erik Hatcher

On Jan 28, 2005, at 1:46 AM, Jason Polites wrote: I think they do a proximity result based on keyword matches. So... If you search for "lucene" and the document returned has this word at the very start and the very end of the document, then you will see the two sentences (sequences of words) su

Re: lucene query (sql kind)

2005-01-28 Thread mark harwood

I've added some user-defined lucene functions to HSQLDB and I've been able to run queries like the following one: select top 10 lucene_highlight(adText) from ads where pricePounds <200 and lucene_query('bass guitar drums',id)>0 order by lucene_score(id) DESC I've had similar success with Derby (

Re: lucene query (sql kind)

2005-01-28 Thread sunil goyal

Hello, Thanks, It works fine. > The field parameter simply defines the default field for all queries > without an explicit field specification (:). > Using 'field AND field' as default field does not make sense but does > not hurt as long as the default field is not used. > I'm not sure why you c

Re: lucene query (sql kind)

2005-01-28 Thread Morus Walter

sunil goyal writes: > > I was just trying that... > > QueryParser qp = new QueryParser("field AND field", new StandardAnalyzer()); > Query query = qp.parse("name:\"john\" AND age:[10 TO 16]"); > > It works fine with this. Do I need to specify that QueryParser should > expect things in order > "f

Re: lucene query (sql kind)

2005-01-28 Thread David Escuer

I've merged some different fields in one query, with the name of one of these fields as the second parameter in the static method, and it worked fine. Also, you can do a little query parser, and build the queries with BooleanQuery. David sunil goyal wrote: Hello, I was just trying that... Qu

RE: rackmount lucene/nutch - Re: google mini? who needs it when Lucene is there

2005-01-28 Thread Cocula Remi

In addition to this discussion I would like to mention my efforts in creating a wrapper around Lucene with the LuceneServer project (http://sourceforge.net/projects/luceneserver/). It uses RMI to make indexes available over a network and includes automation tasks. I am currently working on a se

Re: lucene query (sql kind)

2005-01-28 Thread sunil goyal

Hello, I was just trying that... QueryParser qp = new QueryParser("field AND field", new StandardAnalyzer()); Query query = qp.parse("name:\"john\" AND age:[10 TO 16]"); It works fine with this. Do I need to specify that QueryParser should expect things in order "field AND field". Or can I do wi

Re: lucene query (sql kind)

2005-01-28 Thread David Escuer

Hello, To build queries, you can generate a query string like "(text:house OR text:car) AND (keywords:building)", and then parse it with the QueryParser.parse method to get the Lucene query. It's not 100% SQL-like syntax, but it's clearer than the raw Lucene syntax. Hope it helps David sunil goy
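
A minimal sketch of assembling such a query string in plain Java before handing it to QueryParser.parse. The class QueryStrings and its helpers anyOf and allOf are hypothetical names, not part of Lucene, and the sketch assumes field values contain no characters that need escaping in query syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for building Lucene query-syntax strings.
final class QueryStrings {
    private QueryStrings() {}

    // Builds "(field:v1 OR field:v2 ...)" for one field.
    static String anyOf(String field, List<String> values) {
        List<String> clauses = new ArrayList<String>();
        for (String value : values) {
            clauses.add(field + ":" + value);
        }
        return "(" + String.join(" OR ", clauses) + ")";
    }

    // Joins already-parenthesized groups with AND.
    static String allOf(List<String> groups) {
        return String.join(" AND ", groups);
    }

    public static void main(String[] args) {
        String query = allOf(Arrays.asList(
                anyOf("text", Arrays.asList("house", "car")),
                anyOf("keywords", Arrays.asList("building"))));
        // prints (text:house OR text:car) AND (keywords:building)
        System.out.println(query);
    }
}
```

The resulting string can be cached and later passed to the query parser; building a BooleanQuery directly, as suggested elsewhere in this thread, avoids the string round trip.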

Re: lucene query (sql kind)

2005-01-28 Thread PA

On Jan 28, 2005, at 12:40, sunil goyal wrote: I want to run dynamic queries against the lucene index. Is there any native syntax available for Lucene so that I can query, by first generating the query in say an XML or SQL like format (cache this query) and then use this query over lucene index. Ta

lucene query (sql kind)

2005-01-28 Thread sunil goyal

Hello all, I want to run dynamic queries against the lucene index. Is there any native syntax available for Lucene so that I can query, by first generating the query in say an XML or SQL like format (cache this query) and then use this query over lucene index. e.g. So a lucene query syntax in w

Re: query term frequency

2005-01-28 Thread Erik Hatcher

On Jan 27, 2005, at 10:24 PM, Jonathan Lasko wrote: No, the number of occurrences of a term in a Query. Nothing built-in gives you this. You'd have to dissect the Query clause-by-clause and cast each clause to the proper type to pull the terms from them. The Highlighter code does this. If th

Re: rackmount lucene/nutch - Re: google mini? who needs it when Lucene is there

2005-01-28 Thread mark harwood

>>Also need http://jcifs.samba.org/ so you can spider >>windows file shares. That project also has a very nice servlet filter that is used to provide automatic authentication of Windows clients using the NTLM protocol.

Re: Disk space used by optimize

2005-01-28 Thread Morus Walter

Otis Gospodnetic writes: > Hello, > > Yes, that is how optimize works - copies all existing index segments > into one unified index segment, thus optimizing it. > > see hit #1: http://www.lucenebook.com/search?query=optimize+disk+space > > However, three times the space sounds a bit too much, or

33 matches
