> Well, it doesn't, since there is no justification of why it is the
> way it is. It's like saying, here is that car with 5 wheels... enjoy
> driving.
> > - I think the explanations there would also answer at least some of your
> > questions.
I hoped it would answer *some* of the questions... (not al
Well, it doesn't, since there is no justification of why it is the way it is.
It's like saying, here is that car with 5 wheels... enjoy driving.
Karl
Original Message
Date: Sun, 10 Dec 2006 13:12:29 -0800
From: Doron Cohen <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Be
Yonik Seeley wrote:
It's read on demand, per indexed field.
So assuming your index is optimized (a single segment), then it
increases by one byte[] each time you search on a new field.
OK, makes sense then. Thanks!
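As a rough illustration of the growth Yonik describes, here is a minimal sketch. It assumes each norms array holds one byte per document in the index; the class name, method name, and the index size of 5,000,000 documents are made up for the example and are not taken from this thread.

```java
/** Back-of-envelope estimate only; not part of Lucene. */
final class NormsMemoryEstimate {

    /**
     * Bytes held by norms arrays once the given number of distinct fields
     * has been searched, assuming one byte per document per field.
     */
    static long normsBytes(int searchedFields, long maxDoc) {
        return (long) searchedFields * maxDoc;
    }

    public static void main(String[] args) {
        long maxDoc = 5_000_000L; // hypothetical index size
        for (int fields = 1; fields <= 3; fields++) {
            System.out.println(fields + " field(s) searched: "
                    + normsBytes(fields, maxDoc) + " bytes of norms");
        }
    }
}
```

Under that assumption, memory stops growing once every indexed field has been searched at least once.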
On 12/11/06, Eric Jain <[EMAIL PROTECTED]> wrote:
Yonik Seeley wrote:
> There is no real document boost at the index level... it is simply
> multiplied into the boost for every field of that document. So it
> comes down to what fields you want that index-time boost to take
> effect on (as well a
Yonik Seeley wrote:
There is no real document boost at the index level... it is simply
multiplied into the boost for every field of that document. So it
comes down to what fields you want that index-time boost to take
effect on (as well as length normalization).
Come to think of it, I do have
On 12/11/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
Eric, you said you aren't using any Field.Index.NO_NORMS fields, but
SegmentReader.ones should only be used if you do use NO_NORMS, so things don't
add up here.
norms(fieldThatDoesntExist) will also return fakeNorms (ones)
-Yonik
http:
Nuno,
If you stop or block all operations that can change the index (e.g. deletes and
additions), you can safely copy the whole index directory. If you do it from
Java, you can use Lucene's own Lock class to lock the index against modifications,
copy the index directory, and unlock the index.
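A minimal sketch of the "block writers, then copy" approach, using only the JDK. It does not use Lucene's Lock class: `WRITE_GUARD` is a hypothetical application-level monitor that every piece of code adding or deleting documents would also have to synchronize on, and it assumes the index is a flat directory of regular files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

final class IndexBackup {

    /** Hypothetical guard: all index-modifying code in the application must hold it too. */
    static final Object WRITE_GUARD = new Object();

    /** Copies every regular file in indexDir into backupDir while writers are blocked. */
    static int copyIndex(Path indexDir, Path backupDir) throws IOException {
        synchronized (WRITE_GUARD) {
            Files.createDirectories(backupDir);
            int copied = 0;
            try (Stream<Path> files = Files.list(indexDir)) {
                for (Path src : (Iterable<Path>) files::iterator) {
                    if (Files.isRegularFile(src)) {
                        Files.copy(src, backupDir.resolve(src.getFileName()),
                                StandardCopyOption.REPLACE_EXISTING);
                        copied++;
                    }
                }
            }
            return copied;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java IndexBackup <indexDir> <backupDir>
        int n = copyIndex(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Copied " + n + " files");
    }
}
```

The guard only protects against writers inside the same JVM; another process modifying the index would need a different mechanism.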
Otis
-
Eric, you said you aren't using any Field.Index.NO_NORMS fields, but
SegmentReader.ones should only be used if you do use NO_NORMS, so things don't
add up here.
Otis
- Original Message
From: Yonik Seeley <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, December 11, 200
> I do want to use document boosting... Is that independent from field
> boosting? The length normalization on the other hand may not be
> necessary.
They "go together" - see "Score Boosting" in
http://lucene.apache.org/java/docs/scoring.html
---
On 12/11/06, Eric Jain <[EMAIL PROTECTED]> wrote:
I do want to use document boosting... Is that independent from field
boosting? The length normalization on the other hand may not be necessary.
There is no real document boost at the index level... it is simply
multiplied into the boost for ever
Yonik Seeley wrote:
On 12/11/06, Eric Jain <[EMAIL PROTECTED]> wrote:
I've noticed that after stress-testing my application (uses Lucene 2.0) for
a while, I have almost 200mb of byte[]s hanging around, the top two
culprits being:
24 x SegmentReader.Norm.bytes = 112mb
2 x SegmentReader.ones
On 12/11/06, Eric Jain <[EMAIL PROTECTED]> wrote:
I've noticed that after stress-testing my application (uses Lucene 2.0) for
a while, I have almost 200mb of byte[]s hanging around, the top two
culprits being:
24 x SegmentReader.Norm.bytes = 112mb
2 x SegmentReader.ones = 16mb
Each in
Hi guys,
I'm wondering how I can cut certain index entries out of one index and paste
them into another index? For instance, I have indexed a particular file with
its contents and other necessary info into a particular index folder, then I would
like to move the index info that I have indexed to othe
Hi,
I have one Java service that uses Lucene as its text search engine. This is
working perfectly, but I don't know how to dump/back up its filesystem index
catalog.
Can I simply do a hot copy, without stopping the service and with the index open?
Thanks in advance.
--
Nuno Alexandre Carvalho
---
I've noticed that after stress-testing my application (uses Lucene 2.0) for
a while, I have almost 200mb of byte[]s hanging around, the top two
culprits being:
24 x SegmentReader.Norm.bytes = 112mb
2 x SegmentReader.ones = 16mb
The second one isn't a big deal, but I wonder what's the e
Version 1.9 of VTD-XML, available in C, C#, and Java, is now released.
This version contains XPath-related performance enhancements and bug
fixes. To download the latest release, please visit
http://sourceforge.net/project/showfiles.php?group_id=110612.
For latest performance report, please vis
On 12/11/06, Daniel Naber <[EMAIL PROTECTED]> wrote:
On Monday 11 December 2006 19:18, Andreas Kohn wrote:
> After some debugging, and some tests with the original snowball
> distribution from snowball.tartarus.org, it seems that the attached
> change is needed to avoid the exception.
The attac
Andreas, I could generate the error as you describe.
You can report this bug in http://issues.apache.org/jira/browse/LUCENE
There seem to be a few updates in http://snowball.tartarus.org not
reflected currently in Lucene -
- SnowballProgram.java has this bug fix as you describe
The algorithms
if you are trying to think of Lucene's docid as a meaningful number, you
are doing something wrong.
A lot of people want to view Lucene docids the same way they look at
auto-incremented unique keys in a database -- don't do that. Instead
think of them as memory addresses in C or C++ ... they are
: Isn't it also true that using Field.Index.NO_NORMS when creating the field will
: remove it from the scoring formula? I thought I read that somewhere, but now
: can't find where.
queries on fields with NO_NORMS will still contribute to the score, but
the field *length* and/or field boosts won'
Hi,
while playing with the various stemmers of Lucene(-1.9.1), I got an
index out of bounds exception:
lucene-1.9.1>java -cp
build/contrib/snowball/lucene-snowball-1.9.2-dev.jar
net.sf.snowball.TestApp Kp bla.txt
Exception in thread "main" java.lang.reflect.InvocationTargetException
at su
Span Queries also return positional information
On Dec 11, 2006, at 12:12 PM, Steven Rowe wrote:
abdul aleem wrote:
How to actually retrieve the content of search,
Most of the examples in Lucene in Action
Searcher gives the results found in number of
documents
but I couldn't find an API to r
abdul aleem wrote:
> How to actually retrieve the content of search,
>
> Most of the examples in Lucene in Action
> Searcher gives the results found in number of
> documents
>
> but I couldn't find an API to retrieve the line or
> paragraph where the search is matched
Hi Abdul,
I don't know w
I really miss this feature in Lucene too.
Whatever Mohammed's requirements are, I surely see some
improvements in search performance.
My argument here is: why doesn't Lucene provide a mechanism for
supplying custom document ids?
> -Original Message-
> From: Find Me [mailt
Thanks Erick,
I will take a look,
Apologies but a basic question,
How to actually retrieve the content of search,
Most of the examples in Lucene in Action
Searcher gives the results found in number of
documents
but I couldn't find an API to retrieve the line or
paragraph where the search is ma
On 12/11/06, Waheed Mohammed <[EMAIL PROTECTED]> wrote:
Hello,
Is there a way to influence Lucene's generation of ids while indexing?
My requirement is that I want to have different indexes where no index should
have ids that have been assigned to an index earlier.
for instance
IDX1 : {0.1
I don't believe that this is possible. Or desirable. Lucene IDs are mutable,
even within an index. That is, if you index docs that get, say, IDs 1, 2, 3,
4, 5 and delete doc 2 and optimize, Docs 4 and 5 get reassigned IDs 3 and 4
(or something similar).
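The renumbering described above can be pictured with a toy model. This is only an illustration of dense renumbering after deletions, not how Lucene implements it; the class and method names are hypothetical, and the 1-based ids simply follow the example in this message.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of ids shifting after a delete plus optimize; not Lucene code. */
final class DocIdRenumbering {

    /** Maps each surviving old id to its new id, numbering densely from firstId. */
    static Map<Integer, Integer> compact(List<Integer> ids, Set<Integer> deleted, int firstId) {
        Map<Integer, Integer> mapping = new LinkedHashMap<>();
        int next = firstId;
        for (int id : ids) {
            if (!deleted.contains(id)) {
                mapping.put(id, next++);
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> mapping =
                compact(Arrays.asList(1, 2, 3, 4, 5), Collections.singleton(2), 1);
        System.out.println(mapping); // prints {1=1, 3=2, 4=3, 5=4}
    }
}
```

Any id stored outside the index before the compaction would now point at the wrong document, or at none.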
You're far better off controlling this your
On 11 Dec 2006, at 16:15, Waheed Mohammed wrote:
Is there a way to influence Lucene's generation of ids while indexing?
If you speak of the Lucene "document number", then no. And are you
aware of the fact that document numbers are eligible for change at
any time (optimization) without giving
Then you really want to look at the classes that do the work with filters if
you require milliseconds. You should be just fine
On 12/11/06, abdul aleem <[EMAIL PROTECTED]> wrote:
Many thanks to All,
well kind of puzzled because ours is a fast moving log
down to Milliseconds :( as we deal w
Hello,
Is there a way to influence Lucene's generation of ids while indexing?
My requirement is that I want to have different indexes where no index should have
ids that have been assigned to an index earlier.
for instance
IDX1 : {0.100}
IDX2: {101...200}
IDX3: {201...300}
but not
Apologies, forgot to mention there are great people
around in this group, they are of great help as well
:):)
--- abdul aleem <[EMAIL PROTECTED]> wrote:
> Many thanks to All,
>
> well kind of puzzled because ours is a fast moving
> log
> down to Milliseconds :( as we deal with forex on a
> finan
Many thanks to All,
well kind of puzzled because ours is a fast moving log
down to Milliseconds :( as we deal with forex on a
financial system.
I'm sure there will be workarounds, actually most of
the time it is enough to search within 2 log files of
1-5MB size, coz we are more interested in second
>>Extend QueryParser to sort this out.
The latest version in SVN has changed the default QueryParser behaviour to use
RangeFilters instead of RangeQuerys
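To see why a term-expanding range query can hit the "Too many Boolean clauses" limit on millisecond timestamps, here is a small simulation in plain Java. It does not call Lucene; it assumes one Boolean clause per distinct indexed term inside the range and a default limit of 1024 clauses, and the class, method names, and timestamp values are made up for the example.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

/** Simulation only: counts the clauses a term-expanding range query would need. */
final class RangeExpansionSketch {

    /** Assumed default maximum clause count of a Boolean query. */
    static final int MAX_CLAUSES = 1024;

    /** Number of distinct indexed terms in [lower, upper], i.e. one clause each. */
    static int clausesNeeded(NavigableSet<String> indexedTerms, String lower, String upper) {
        return indexedTerms.subSet(lower, true, upper, true).size();
    }

    static boolean exceedsLimit(int clauses) {
        return clauses > MAX_CLAUSES;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>();
        long start = 1165871549000L; // hypothetical epoch-millis timestamp
        for (long t = start; t < start + 2000; t++) {
            terms.add(String.format("%013d", t)); // one term per millisecond
        }
        int needed = clausesNeeded(terms,
                String.format("%013d", start), String.format("%013d", start + 1999));
        System.out.println(needed + " clauses needed, limit " + MAX_CLAUSES
                + ", exceeds limit: " + exceedsLimit(needed));
    }
}
```

Two seconds of millisecond-resolution log entries are already enough to pass the limit in this model, which is the motivation for a filter that does not build one clause per term.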
- Original Message
From: Mike Streeton <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, 11 December, 2006 1:35:47
I would use a RangeFilter instead of using the default Boolean query as
this will always break at some point with Too many Boolean clauses.
Extend QueryParser to sort this out. As far as extracting information
from log files I would look at creating yourself a LogAnalyzer that can
interpret the co
Many thanks Grant,
I will now dirty my hands with Lucene to get our
requirements
regards,
Abdul
--- Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> See below
>
> On Dec 11, 2006, at 7:04 AM, abdul aleem wrote:
>
> > Hi All,
> >
> > I'm a Lucene newbie,
> >
> >
> > Requirement :
> > ==
As far as the appropriateness of Lucene, it's an open question, but I think
it'd be fine. If it isn't, you have an "interesting" problem.
About timestamps. This has been discussed a LOT on the thread, since they're
not as straight-forward as you might assume. See the thread *"Date ranges -
getti
See below
On Dec 11, 2006, at 7:04 AM, abdul aleem wrote:
Hi All,
I'm a Lucene newbie,
Requirement :
==
a) Build a log viewer tool, search log files for
keywords and time stamp
b) files in production approx 200 logs per day and
each log file may range from 1MB - 5MB
Lucene
Hi All,
I'm a Lucene newbie,
Requirement :
==
a) Build a log viewer tool, search log files for
keywords and time stamp
b) files in production approx 200 logs per day and
each log file may range from 1MB - 5MB
Lucene
We wanted to utilize Lucene's search capabilities
espec
Daniel Naber wrote:
On Saturday 09 December 2006 02:25, Scott Smith wrote:
What is the best way to do this? Is changing the boost the right
answer? Can a field's boost be zero?
Yes, just use: term1 term2 category1^0 category2^0. Erick's Filter idea is
also useful.
Isn't it also true that