Here are some remarks from what I learned by inspecting the code (quite
a while ago now, but the principle shouldn't have changed)...
When an IndexReader opens the segments of an index it
- grabs the commit lock,
- reads the "segments" file for the list of segment names.
- opens the files for ea
[
http://issues.apache.org/jira/browse/LUCENE-507?page=comments#action_12376874 ]
Andi Vajda commented on LUCENE-507:
---
My apologies, I didn't notice this until it was mentioned today.
The "//required by gcj" comment is not something I added or need.
The f
Any chance at a last plea for LUCENE-362? It saves me an enormous
amount of unnecessary allocation for the common case of a single large
compressed field. It is an expert-level api that needs to be used
carefully, but has no effect on any behavior if you don't use it.
http://issues.apache.org/ji
[
http://issues.apache.org/jira/browse/LUCENE-558?page=comments#action_12376849 ]
Chuck Williams commented on LUCENE-558:
---
There is one potentially important benefit of this approach over LUCENE-545.
By having the narrower more concrete API (list of
: I should have been more clear: I'm not asking for new feature requests.
: Rather for known, high-priority, bugs.
I don't know if it's high priority, but LUCENE-546 seems to be a trivial
bug with a trivial fix ("seems to be", i'm judging purely by the patch)
2.0 also seems like the best time
On 4/27/06, Robert Engels <[EMAIL PROTECTED]> wrote:
> What about making IndexReader & IndexWriter interfaces? Or creating
> interfaces for these (IReader & IWriter?), and making all of the classes use
> the interfaces?
There is a drawback to interfaces too... you can't easily add an extra
method
Robert Engels wrote:
What about making IndexReader & IndexWriter interfaces? Or creating
interfaces for these (IReader & IWriter?), and making all of the classes use
the interfaces?
I should have been more clear: I'm not asking for new feature requests.
Rather for known, high-priority, bugs.
Maybe a fix for
http://issues.apache.org/jira/browse/LUCENE-556
might be warranted?
-Yonik
On 4/27/06, Doug Cutting <[EMAIL PROTECTED]> wrote:
> Are there any changes folks think we need before we make the 2.0
> release? The major change from 1.9, removal of deprecated items, has
> been made. A
What about making IndexReader & IndexWriter interfaces? Or creating
interfaces for these (IReader & IWriter?), and making all of the classes use
the interfaces?
-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 27, 2006 5:20 PM
To: java-dev@lucene.apache
karl wettin wrote:
Not critical in any way, but I would not mind if Term and Document were
interfaces instead of final classes.
That's not likely to happen before the 2.0 release. We're looking for
high-priority, back-compatible bug fixes at this point.
Doug
--
On 28 Apr 2006, at 00:19, Doug Cutting wrote:
Are there any changes folks think we need before we make the 2.0
release? The major change from 1.9, removal of deprecated items,
has been made. Anything else critical?
Not critical in any way, but I would not mind if Term and Document
were int
On 28 Apr 2006, at 00:30, Marvin Humphrey wrote:
On Apr 27, 2006, at 2:35 PM, karl wettin wrote:
What will be required in the IndexReader? Is it enough to add
getBoost() in the TermEnum? How would the value be sent to the
scorer?
It wouldn't be the TermEnum, it would be a TermDocs subclass.
On Apr 27, 2006, at 2:35 PM, karl wettin wrote:
What will be required in the IndexReader? Is it enough to add
getBoost() in the TermEnum? How would the value be sent to the scorer?
It wouldn't be the TermEnum, it would be a TermDocs subclass. If
we're talking BOOST_PER_POSITION, it would
Are there any changes folks think we need before we make the 2.0
release? The major change from 1.9, removal of deprecated items, has
been made. Anything else critical?
Doug
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For ad
Chuck Williams (JIRA) wrote:
545 is a good improvement. [ ... ] Is there interest in committing 545?
I think we should probably get the 2.0 release out the door before we do
that.
Doug
[ http://issues.apache.org/jira/browse/LUCENE-559?page=all ]
Emre Bayram updated LUCENE-559:
---
Attachment: TurkishAnalyzer.java
TurkishStemFilter.java
TurkishStemmer.java
> Turkish Analyzer for Lucene
> --
Turkish Analyzer for Lucene
---
Key: LUCENE-559
URL: http://issues.apache.org/jira/browse/LUCENE-559
Project: Lucene - Java
Type: Improvement
Components: Analysis
Reporter: Emre Bayram
I have developed an Analyzer for Turkish, thank
[
http://issues.apache.org/jira/browse/LUCENE-558?page=comments#action_12376828 ]
Chuck Williams commented on LUCENE-558:
---
You're right about IndexReader.document(int), although it appears you removed
(package api) FieldsReader.doc(int). I've been re
Marvin Humphrey wrote:
Incidentally, how about calling it BOOST_PER_POSITION instead?
+1, that is more consistent with other naming.
Doug
On 27 Apr 2006, at 18:41, Doug Cutting wrote:
karl wettin wrote:
Boost per position, etc. sounds very expensive.
Indeed. It will probably nearly double the size of indexes and
also increase search time. But it is also very powerful. Consider
the posting representation Google describes on
Now that I think about it, putting the score-multiplier into the
FreqFile does offer a benefit I hadn't considered before. It makes
it possible to tie the score multiplier to a term within a doc,
rather than a field within a doc.
Say you have a doc with a "body" field that's 1000 terms l
On Apr 27, 2006, at 12:17 PM, Doug Cutting wrote:
Marvin Humphrey wrote:
Moving away from cached norms was the second of three major
changes to the file format on my agenda, and the one I was all
but certain I wouldn't be able to sell to the Lucene community.
The first was using bytec
[ http://issues.apache.org/jira/browse/LUCENE-545?page=all ]
Grant Ingersoll updated LUCENE-545:
---
Attachment: newFiles.tar.gz
Forgot the new files.
> Field Selection and Lazy Field Loading
> --
>
> Key: L
[
http://issues.apache.org/jira/browse/LUCENE-558?page=comments#action_12376784 ]
Grant Ingersoll commented on LUCENE-558:
IndexReader.document(int n) is still in there. All prior APIs work and the
introduction of Fieldable is a drop in replacement
Marvin Humphrey wrote:
Moving away from cached norms was the second of three major changes to
the file format on my agenda, and the one I was all but certain I
wouldn't be able to sell to the Lucene community. The first was using
bytecounts at the head of Strings.
The third was storing st
[
http://issues.apache.org/jira/browse/LUCENE-558?page=comments#action_12376781 ]
Chuck Williams commented on LUCENE-558:
---
545 is certainly more general and could handle all the cases. I looked at it
briefly before doing this version and was concerne
On Apr 27, 2006, at 9:41 AM, Doug Cutting wrote:
karl wettin wrote:
My own immediate thought is to compromise by allowing boost per
term in document. Simply remove the norms-methods from the
IndexReader and add a new one to the TermEnum and fall back on
the field boost. How would the v
[
http://issues.apache.org/jira/browse/LUCENE-140?page=comments#action_12376780 ]
Jason Lambert commented on LUCENE-140:
--
I was having this problem intermittently while indexing over multiple threads
and I have found that the following steps can cause
On 4/27/06, Robert Engels <[EMAIL PROTECTED]> wrote:
> I thought each segment maintained its own list of deleted documents
Right.
> (since segments are WRITE ONCE
Yes, but deletions are the exception to that rule. Once written,
segment files never change, except for the file that tracks deleted
Doug, can you please elaborate on this?
I thought each segment maintained its own list of deleted documents (since
segments are WRITE ONCE, and when that segment is merged or optimized it
would "go away" anyway, as the deleted documents are removed).
In my reopen() implementation, I check to see if
[
http://issues.apache.org/jira/browse/LUCENE-557?page=comments#action_12376775 ]
Hoss Man commented on LUCENE-557:
-
In my haste to upload the testing patch before I left work, I failed to mention
that it exposes 9 test failures, suggesting at least two bug
> I think the 'public static IndexReader.reopen(IndexReader old)' method I
> proposed can easily compare the current list of segments for the directory of
> old to those that old already has open, and determine which can be reused and
> which new segments must be opened.
This makes sense. Coul
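
The segment comparison described in the quoted proposal can be sketched roughly as follows. This is only an illustration under stated assumptions: ReopenPlan and compute() are hypothetical names that do not exist in Lucene, segments are represented by name only, and no readers are actually opened.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, NOT part of Lucene: decides which segments a
// reopen could keep, which it must open, and which it can close.
public class ReopenPlan {
    public final List<String> reused = new ArrayList<>();
    public final List<String> toOpen = new ArrayList<>();
    public final List<String> toClose = new ArrayList<>();

    // openSegments: names the old reader already has open.
    // currentSegments: names listed in the directory right now.
    public static ReopenPlan compute(List<String> openSegments,
                                     List<String> currentSegments) {
        ReopenPlan plan = new ReopenPlan();
        Set<String> open = new LinkedHashSet<>(openSegments);
        Set<String> current = new LinkedHashSet<>(currentSegments);

        // Walk the current list so the plan keeps the index's segment order.
        for (String name : currentSegments) {
            if (open.contains(name)) {
                plan.reused.add(name);   // already open, can be shared
            } else {
                plan.toOpen.add(name);   // new since the old reader was opened
            }
        }
        // Segments that were merged away are no longer listed.
        for (String name : openSegments) {
            if (!current.contains(name)) {
                plan.toClose.add(name);
            }
        }
        return plan;
    }
}
```

A name match alone is probably not sufficient in practice: a reused segment may have gained deletions since it was opened, so its deleted-documents data would presumably still need to be refreshed.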
karl wettin wrote:
My own immediate thought is to compromise by allowing boost per term in
document. Simply remove the norms-methods from the IndexReader and add
a new one to the TermEnum and fall back on the field boost. How would
the value be picked up by the scorer?
Boost per position,
[
http://issues.apache.org/jira/browse/LUCENE-556?page=comments#action_12376758 ]
jm commented on LUCENE-556:
---
I used a custom version and my queries work now, but I am not sure whether this
is ok...it's mostly an easy shot I took:
public class LuceneMatchAllDocs
Ask the question on the lucene users list, not the dev-list.
And, Read a book. Read the javadoc. Read the samples.
-Original Message-
From: Anton Feldmann [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 27, 2006 10:05 AM
To: java-dev@lucene.apache.org; java-user@lucene.apache.org
Subject:
Hi
I wrote an Indexer which indexes all the contents of a text, and the
sentences are separated into another Document.
"Document document = new Document(new Field ("contents", reader ));
StringTokenizer token = new StringTokenizer(contents.replaceAll(". ",
"\\.x\\") , "\\.x\
[
http://issues.apache.org/jira/browse/LUCENE-558?page=comments#action_12376723 ]
Grant Ingersoll commented on LUCENE-558:
This is pretty much what I started out with when I first started working on
Lazy/Selective Field loading and I think it is a l
[ http://issues.apache.org/jira/browse/LUCENE-558?page=all ]
Chuck Williams updated LUCENE-558:
--
Attachment: LuceneTrunk.patch
> Selective field loading
> ---
>
> Key: LUCENE-558
> URL: http://issues.apache.org/jira
Selective field loading
---
Key: LUCENE-558
URL: http://issues.apache.org/jira/browse/LUCENE-558
Project: Lucene - Java
Type: New Feature
Components: Index
Versions: 2.0
Environment: All
Reporter: Chuck Williams
Provides a new
On 26 Apr 2006, at 19:18, Doug Cutting wrote:
karl wettin wrote:
How about refactoring fields to something like:
[Document](fieldName)<#> {0..1} ->[Field +boost]<#>
{0..*} -> [FieldValue +store +index +termVector]
If you think you have a simple, back-compatible way to do this,
pleas