We'll need a little more detail to help you: what are the sizes of your
updates, and how often do they happen?
1) No, just re-open the index writer every time you re-index. By your
description it's a moderately changing index, so just keep a flag on the
changed rows and batch-index them every so often.
2) It all
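The flag-and-batch idea in 1) can be sketched roughly as below. This is only an illustration under assumptions: FlaggedRows is a hypothetical in-memory stand-in for the database table, and the actual indexing step (open a writer, index the batch, close the writer) is left out because it depends on your setup.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a table whose rows carry a "needs re-index" flag. */
class FlaggedRows {
    private final Map<Integer, String> rows = new LinkedHashMap<Integer, String>();
    private final Set<Integer> dirty = new LinkedHashSet<Integer>();

    /** Inserts or updates a row and flags it for the next indexing batch. */
    synchronized void upsert(int id, String text) {
        rows.put(id, text);
        dirty.add(id);
    }

    /** Returns the ids flagged since the last batch and clears the flags. */
    synchronized List<Integer> drainDirty() {
        List<Integer> batch = new ArrayList<Integer>(dirty);
        dirty.clear();
        return batch;
    }

    /** Current text of a row, or null if the row does not exist. */
    synchronized String get(int id) {
        return rows.get(id);
    }
}
```

A periodic job would call drainDirty(), open an index writer, re-index just those rows, and close the writer again, so that no writer stays open between batches.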
Hey,
We are using Lucene to index a moderately changing
database, and I have a couple of questions on a
performance strategy.
1) Should we just keep one index writer open until the
system comes down, or create a new index writer each
time we re-index our data set?
2) Does anyone have any thoughts..
Hi,
I'm still mostly a beginner, both with Java and Lucene, so I apologize
if these are dumb questions.
Is making index-modifying operations "safe" as simple as just doing the
following?
synchronized (writer) {
    while (IndexReader.isLocked(directory))
        wait();
    writ
You should be fine.
On Fri, 28 Jan 2005 15:21:50 -0600, Bill Tschumy <[EMAIL PROTECTED]> wrote:
> I just want to make sure
> that adding the unrelated field to a single doc won't cause all the
> other documents to increase their storage space.
> --
I have lots of fields that only occur in one d
I have an index containing a lot of documents with common fields. Is
there any speed/space penalty for adding an unrelated document with a
totally unrelated field? I want to store a version number and maybe a
few other bits of meta-info in the index. I just want to make sure
that adding the
This method from the highlighter package will give you the IDF:
WeightedTerm[] QueryTermExtractor.getIdfWeightedTerms(Query query,
IndexReader reader, String fieldName)
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional comman
I implemented a Query version of the TermVector
org.apache.lucene.search.QueryTermVector
Works off of an array of Strings or a String and an Analyzer. Is this
what you are looking for?
>>> [EMAIL PROTECTED] 1/28/2005 6:33:18 AM >>>
On Jan 27, 2005, at 10:24 PM, Jonathan Lasko wrote:
> No, the
Hello,
I asked the publisher (http://www.manning.com) yesterday. I don't
know about the exact stores, but apparently they do have a distributor
in Singapore, so you should be able to find Lucene in Action there
soon.
Otis
--- jac jac <[EMAIL PROTECTED]> wrote:
>
> Just wondering:
>
> Is
Morus,
that description of 3 sets of index files is what I was imagining, too.
I'll have to test and add to the book errata, it seems.
Thanks for the info,
Otis
--- Morus Walter <[EMAIL PROTECTED]> wrote:
> Otis Gospodnetic writes:
> > Hello,
> >
> > Yes, that is how optimize works - copies a
Edwin,
--- Edwin Tang <[EMAIL PROTECTED]> wrote:
> I have three indices really that I search via ParallelMultiSearcher.
> All three
> are being updated constantly. We would like to be able to perform a
> search on
> the indices and have the results reflect the latest documents
> indexed. However,
I don't think there is a direct way to get the number of (unique) terms
in the index, so yes, I think you'll have to loop through TermEnum and
count.
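The counting loop would look roughly like the sketch below. To keep it self-contained, TermCursor is a hypothetical stand-in for the single next() method of TermEnum, not the Lucene class itself; with a real index the cursor would be the TermEnum you already get from the reader (check the Javadoc of your version for the exact calls, and close the enum when done).

```java
import java.util.Arrays;
import java.util.Iterator;

/** Stand-in for the one enumeration method used here; NOT the Lucene TermEnum class. */
interface TermCursor {
    /** Advances to the next term; returns false once the terms are exhausted. */
    boolean next();
}

class TermCounter {
    /** Counts terms by walking the cursor from its start to the end. */
    static int countTerms(TermCursor terms) {
        int count = 0;
        while (terms.next()) {
            count++;
        }
        return count;
    }

    /** Illustrative cursor over a fixed list of terms, standing in for an index. */
    static TermCursor over(final String... terms) {
        final Iterator<String> it = Arrays.asList(terms).iterator();
        return new TermCursor() {
            public boolean next() {
                if (!it.hasNext()) {
                    return false;
                }
                it.next();
                return true;
            }
        };
    }
}
```

The cost is one pass over every term, so on a large index it may be worth caching the count and recomputing it only after the index changes.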
Otis
--- Jonathan Lasko <[EMAIL PROTECTED]> wrote:
> I'm looking for the total number of unique terms in the index. I see
>
> that I can get a T
I looked at the Carrot2 docs which mentioned dimension reduction via
singular value decomposition (SVD) .. and other forms too I think.
Question: Does anyone have pointers to successful clustering techniques
used with lucene? I'm particularly interested in 2D and 3D graphics as
well, possibly
Yet another burning question :-). Can someone explain how the document
numbers in Lucene documents work? For example, the TermDocs.doc()
method returns "the current doc number." How can I get this doc number
if I just have a Document?
Here's the context. I'm working on implementing Justin Z
Ross - I'm really perplexed by your message. You create HTML from a
database so that you can index it with Lucene, yet wish you could
simply index the data in your database tied to a primary key directly,
right?
Well, you're in luck - you already can do this!
What are you using for indexing?
I agree. My site is all dynamic pages created from the database. Right
now, I have to have a process create dummy pages, index them with Lucene,
then translate the Lucene results into meaningful links. It actually works
better than it sounds; however, it could be easier.
If I could just give Luc
I'm looking for the total number of unique terms in the index. I see
that I can get a TermEnum of all the terms in the index, but what is the
fastest way to get the total number of terms?
Jonathan
I have three indices really that I search via ParallelMultiSearcher. All three
are being updated constantly. We would like to be able to perform a search on
the indices and have the results reflect the latest documents indexed. However,
that would mean I need to "refresh" my searcher. Because of th
I like your idea and think you are quite right. I see quite a few
people using Lucene to such an extreme that relational database
functionality is replaced by Lucene.
However, storing everything in Lucene and using it as a relational type
of database would be kind of re-inventing the wheel. Fo
Hello,
We have been experimenting with Carrot2 and are very pleased so far.
There is only one issue: there is no release, not even an alpha one, and
the dependencies seem to be patched (jama).
Are there any plans for a release in the near future?
thanks
Akmal
On Monday, 17.01.2005, 10:15 +
Storing in the index has some performance benefits in the CVS
version of Lucene, as you can store term position offset information and
avoid having to re-analyze for highlighting.
Speaking of which, is there a planned release date for a version that
contains this feature?
--
Maik Schreiber *
On Jan 28, 2005, at 1:46 AM, Jason Polites wrote:
I think they do a proximity result based on keyword matches. So... If
you search for "lucene" and the document returned has this word at the
very start and the very end of the document, then you will see the two
sentences (sequences of words) su
I've added some user-defined lucene functions to
HSQLDB and I've been able to run queries like the
following one:
select top 10 lucene_highlight(adText) from ads where
pricePounds <200 and lucene_query('bass guitar
drums',id)>0 order by lucene_score(id) DESC
I've had similar success with Derby (
Hello,
Thanks, it works fine.
> The field parameter simply defines the default field for all queries
> without an explicit field specification (:).
> Using 'field AND field' as default field does not make sense but does
> not hurt as long as the default field is not used.
> I'm not sure why you c
sunil goyal writes:
>
> I was just trying that...
>
> QueryParser qp = new QueryParser("field AND field", new StandardAnalyzer());
> Query query = qp.parse("name:\"john\" AND age:[10 TO 16]");
>
> It works fine with this. Do I need to specify that QueryParser should
> expect things in order
> "f
I've merged several different fields in one query, passing the name of
one of these fields as the second parameter of the static method, and it
worked fine.
Also, you can write a small query parser of your own and build the
queries with BooleanQuery.
David
sunil goyal wrote:
Hello,
I was just trying that...
Qu
In addition to this discussion, I would like to mention my efforts in creating
a wrapper around Lucene with the LuceneServer project
(http://sourceforge.net/projects/luceneserver/).
It uses RMI to make indexes available over a network and includes automation
tasks.
I am currently working on a se
Hello,
I was just trying that...
QueryParser qp = new QueryParser("field AND field", new StandardAnalyzer());
Query query = qp.parse("name:\"john\" AND age:[10 TO 16]");
It works fine with this. Do I need to specify that QueryParser should
expect things in order
"field AND field". Or can I do wi
Hello,
To build queries, you can generate a query like "(text:house OR
text:car) AND (keywords:building)", and then
parse it with the QueryParser.parse method to get the Lucene query.
It is not 100% SQL-like syntax, but it's clearer
than the Lucene syntax.
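As a rough illustration of generating such a string, here is a minimal sketch. QueryStringBuilder is a hypothetical helper, not part of Lucene, and it assumes the terms contain no characters that are special to the query syntax (nothing is escaped).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Lucene): builds a query string for QueryParser. */
class QueryStringBuilder {

    /** ORs the terms of one field together, e.g. (text:house OR text:car). */
    static String orGroup(String field, List<String> terms) {
        List<String> parts = new ArrayList<String>();
        for (String term : terms) {
            parts.add(field + ":" + term); // assumes the term needs no escaping
        }
        return "(" + String.join(" OR ", parts) + ")";
    }

    /** ANDs the per-field groups together, in the map's iteration order. */
    static String build(Map<String, List<String>> termsByField) {
        List<String> groups = new ArrayList<String>();
        for (Map.Entry<String, List<String>> entry : termsByField.entrySet()) {
            groups.add(orGroup(entry.getKey(), entry.getValue()));
        }
        return String.join(" AND ", groups);
    }

    public static void main(String[] args) {
        Map<String, List<String>> fields = new LinkedHashMap<String, List<String>>();
        fields.put("text", Arrays.asList("house", "car"));
        fields.put("keywords", Arrays.asList("building"));
        // Prints: (text:house OR text:car) AND (keywords:building)
        System.out.println(build(fields));
    }
}
```

The resulting string can be cached and handed to the query parser whenever the query needs to run.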
Hope it helps
David
sunil goy
On Jan 28, 2005, at 12:40, sunil goyal wrote:
I want to run dynamic queries against the Lucene index. Is there any
native syntax available for Lucene so that I can first generate the
query in, say, an XML- or SQL-like format (caching this query) and then
run this query over the Lucene index?
Ta
Hello all,
I want to run dynamic queries against the Lucene index. Is there any
native syntax available for Lucene so that I can first generate the
query in, say, an XML- or SQL-like format (caching this query) and then
run this query over the Lucene index?
e.g. So a lucene query syntax in w
On Jan 27, 2005, at 10:24 PM, Jonathan Lasko wrote:
No, the number of occurrences of a term in a Query.
Nothing built-in gives you this. You'd have to dissect the Query
clause-by-clause and cast each clause to the proper type to pull the
terms from them. The Highlighter code does this.
If th
>>Also need http://jcifs.samba.org/ so you can spider
>>windows file shares.
That project also has a very nice servlet filter that
is used to provide automatic authentication of Windows
clients using the NTLM protocol.
Otis Gospodnetic writes:
> Hello,
>
> Yes, that is how optimize works - copies all existing index segments
> into one unified index segment, thus optimizing it.
>
> see hit #1: http://www.lucenebook.com/search?query=optimize+disk+space
>
> However, three times the space sounds a bit too much, or