Chris,
That's exactly what I was looking for. Thanks for the info and the
clarification on where to post my questions.
Regards,
Kyle
On Wed, Jun 25, 2008 at 5:12 PM, Chris Hostetter <[EMAIL PROTECTED]>
wrote:
>
> : and how to use them? For a concrete example I'm looking to do a query
> : on
On 24-Jun-08, at 1:28 PM, Yonik Seeley wrote:
Something to consider for Lucene 3 is to have something to retrieve
Similarity per-field rather than passing the field name into some
functions...
+1
I've felt that this was the "proper" (and more useful) way to do
things for a long time
(http
On Wed, Jun 25, 2008 at 5:06 PM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:
> Hmmm... that seems like it would be confusing: particularly since in the
> IndexWriter case the "Query" param would never make sense. changing
> IndexWriter.getSimilarity to take a "String fieldName" and changing
> Searc
: and how to use them? For a concrete example I'm looking to do a query
: on a date field to find documents earlier than a specified date or
: later than a specified date. Ex: date:( >20070101) or date:
: (<20070101). I looked at the range query feature but it didn't appear
: to cover this cas
: > i assume you mean "Searcher.getSimilarity(String fieldName, Query q)" to
: > replace the current Searcher.getSimilarity() right?
:
: No, I meant Similarity (it's more like a factory method on the
: Similarity class).
: The Searcher.getSimilarity() could remain unchanged.
: A Similarity is wha
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608189#action_12608189
]
Yonik Seeley commented on LUCENE-1316:
--
bq. is your point that without synchronizatio
I am not sure, BooleanQuery takes something that can score, e.g. being a
Clause or a Query, the contract requires some sort of scoring functionality.
We use DocIdSetQuery for some of the scoring capabilities such as constant
score (with boosting), age decay, and using the new scoring api in 2.3.
Ma
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608187#action_12608187
]
robert engels commented on LUCENE-1316:
---
Hoss, that is indeed the case, another thre
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608183#action_12608183
]
Hoss Man commented on LUCENE-1316:
--
bq. if thread A deleted a document, and then thread B
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608162#action_12608162
]
[EMAIL PROTECTED] edited comment on LUCENE-1316 at 6/25/08 12:40 PM:
---
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608162#action_12608162
]
Mark Miller commented on LUCENE-1316:
-
If I remember correctly, volatile does not work
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608160#action_12608160
]
Yonik Seeley commented on LUCENE-1316:
--
bq. declaring the deletedDocs volatile should
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608149#action_12608149
]
robert engels commented on LUCENE-1316:
---
The Pattern#5 referenced (cheap read-write
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608148#action_12608148
]
robert engels commented on LUCENE-1316:
---
According to
http://www.ibm.com/developerw
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608147#action_12608147
]
Yonik Seeley commented on LUCENE-1316:
--
bq. why would deletes stop being instantly
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608146#action_12608146
]
robert engels commented on LUCENE-1316:
---
According to the java memory model, hasDele
On Wed, Jun 25, 2008 at 2:19 PM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:
> : Might also consider passing in more optional context when retrieving
> : the similarity for a field (such as a Query, if searching).
> : Something like Similarity.getSimilarity(String field, Query q).
>
> i assume you m
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608137#action_12608137
]
Hoss Man commented on LUCENE-1316:
--
bq. Code that depended on deletes being instantly vis
: Might also consider passing in more optional context when retrieving
: the similarity for a field (such as a Query, if searching).
: Something like Similarity.getSimilarity(String field, Query q).
i assume you mean "Searcher.getSimilarity(String fieldName, Query q)" to
replace the current Sear
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608134#action_12608134
]
Yonik Seeley commented on LUCENE-1316:
--
> a more generalized improvements would proba
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Feak updated LUCENE-1316:
--
I like Hoss' suggestion better. I'll try that fix locally and if it provides
the same improvement, I will
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608129#action_12608129
]
Hoss Man commented on LUCENE-1316:
--
rather than attempting localized optimizations of ind
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608128#action_12608128
]
Yonik Seeley commented on LUCENE-1316:
--
Although this doesn't solve the general probl
On Wednesday 25 June 2008 18:45:16, John Wang wrote:
> Hi Paul:
> Regarding your comment on adding required/prohibited to
> BooleanQuery:
>
> Based on the new api on DocIdSet and DocIdSetIterator
> abstractions, we also developed decorators such as
> AndDocIdSet,OrDocIdSet and NotDocIdS
On Wed, Jun 25, 2008 at 11:30 AM, Jason Rutherglen
<[EMAIL PROTECTED]> wrote:
> I read other parts of the email but glanced over this part. Would terms be
> automatically sorted as they came in? If implemented it would be nice to be
> able to get an encoded representation (probably byte array) of
Hi Paul:
Regarding your comment on adding required/prohibited to BooleanQuery:
Based on the new api on DocIdSet and DocIdSetIterator abstractions, we
also developed decorators such as AndDocIdSet,OrDocIdSet and NotDocIdSet,
furthermore a DocIdSetQuery class that honors the Query api con
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Feak updated LUCENE-1316:
--
Further investigation indicates that the ValueSourceQuery$ValueSourceScorer may
suffer from the same issu
On Wednesday 25 June 2008 17:05:17, John Wang wrote:
> Thanks Paul and Mike for the feedback.
> Paul, for us, sparsity of the docIds determines which data structure
> to use. Where cardinality gives some of that, min/max docId would
> also help, example:
>
> say maxdoc=100, cardinality = 7, doc
[
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Feak updated LUCENE-1316:
--
Attachment: MatchAllDocsQuery.java
My version of MatchAllDocsQuery.java which has the modification in
Avoidable synchronization bottleneck in MatchAlldocsQuery$MatchAllScorer
Key: LUCENE-1316
URL: https://issues.apache.org/jira/browse/LUCENE-1316
Project: Lucene - Java
I read other parts of the email but glanced over this part. Would terms be
automatically sorted as they came in? If implemented it would be nice to be
able to get an encoded representation (probably byte array) of the document
and postings which could be written to a log, and then reentered in an
No reason. Done!
Erik
On Jun 25, 2008, at 11:05 AM, Jason Rutherglen wrote:
It seems like it could, it even has serialVersionUID defined.
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mai
Thanks Paul and Mike for the feedback.
Paul, for us, sparsity of the docIds determines which data structure to use.
Where cardinality gives some of that, min/max docId would also help,
example:
say maxdoc=100, cardinality = 7, docids: {0,1,...6} or
{3,4...9}, using arrayDocIdSet wou
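
The point about min/max docId can be illustrated with a small sketch. It is not code from this thread: the class and method names below are hypothetical, and it only computes two density figures for the sets quoted above; it does not try to guess which data structure the message goes on to recommend.

```java
public class DocIdDensitySketch {

    /** Fraction of the whole index [0, maxDoc) that the set covers. */
    static double globalDensity(int cardinality, int maxDoc) {
        return (double) cardinality / maxDoc;
    }

    /** Fraction of the set's own span [minDocId, maxDocId] that is occupied. */
    static double localDensity(int cardinality, int minDocId, int maxDocId) {
        return (double) cardinality / (maxDocId - minDocId + 1);
    }

    public static void main(String[] args) {
        int maxDoc = 100;
        int cardinality = 7;

        // docids {0,1,...6}: sparse relative to the index, dense within its span.
        System.out.println(globalDensity(cardinality, maxDoc));
        System.out.println(localDensity(cardinality, 0, 6));

        // docids {3,4...9}: same cardinality, same picture.
        System.out.println(localDensity(cardinality, 3, 9));
    }
}
```

Cardinality alone reports both sets as sparse (7 of 100), while the min/max span shows that each is a fully occupied run of consecutive ids.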
It seems like it could, it even has serialVersionUID defined.
+1
On 24 Jun 2008 at 22:28, Yonik Seeley wrote:
Something to consider for Lucene 3 is to have something to retrieve
Similarity per-field rather than passing the field name into some
functions...
benefits:
- Would allow customizing most Similarity functions per-field
- Performance: Similarity for
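
A minimal sketch of the per-field idea, assuming nothing about the eventual Lucene 3 API: `FieldSimilarity` and `SimilarityProvider` are hypothetical stand-ins, not Lucene classes. It only shows the shape of "look the policy up once per field" as opposed to passing the field name into each scoring function.

```java
import java.util.HashMap;
import java.util.Map;

public class PerFieldSimilaritySketch {

    /** Hypothetical scoring policy for one field; not Lucene's Similarity. */
    interface FieldSimilarity {
        float lengthNorm(int numTerms);
    }

    /** Hypothetical provider that resolves the policy for a field by name. */
    static class SimilarityProvider {
        private final FieldSimilarity defaultSim;
        private final Map<String, FieldSimilarity> perField = new HashMap<>();

        SimilarityProvider(FieldSimilarity defaultSim) {
            this.defaultSim = defaultSim;
        }

        void set(String field, FieldSimilarity sim) {
            perField.put(field, sim);
        }

        /** Returns the field's own policy, or the default if none was set. */
        FieldSimilarity get(String field) {
            return perField.getOrDefault(field, defaultSim);
        }
    }

    public static void main(String[] args) {
        // Default: shorter fields weigh more (1 / sqrt(numTerms)).
        SimilarityProvider provider =
                new SimilarityProvider(n -> (float) (1.0 / Math.sqrt(n)));
        // Override for "title": ignore field length entirely.
        provider.set("title", n -> 1.0f);

        // Resolve once per field, then reuse without further name lookups.
        FieldSimilarity body = provider.get("body");
        FieldSimilarity title = provider.get("title");
        System.out.println(body.lengthNorm(4));   // prints 0.5
        System.out.println(title.lengthNorm(4));  // prints 1.0
    }
}
```

Resolving the policy once per field moves the field-name dispatch out of the scoring calls themselves, which is one way to read the customization and performance benefits listed above.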
[
https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608039#action_12608039
]
Jason Rutherglen commented on LUCENE-1314:
--
Here is the code of the SegmentReader
On Wed, Jun 25, 2008 at 6:29 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
> We've also discussed at one point creating an IndexReader impl that searches
> the RAM buffer that DocumentsWriter writes to when adding documents. I
> think it's easier than it sounds, on first glance, because Docume
I understand what you are saying. I am not sure it is worth "clearly quite
a bit more work" given how easy it is to simply be able to have more control
over the IndexReader deletedDocs BitVector which seems like a feature that
should be in there anyways, perhaps even allowing SortedVIntList to be
[
https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607954#action_12607954
]
Michael McCandless commented on LUCENE-1314:
bq. In my SegmentReader subclass
John Wang wrote:
The problem I am having is stated below, I don't know how to
add the minDoc and maxDoc values to the index while keeping backward
compatibility.
Unfortunately, TermInfo file format just isn't extensible at the
moment, so I think for now you'll have to break backward
Jason Rutherglen wrote:
One of the bottlenecks I have noticed testing Ocean realtime search
is the delete process which involves writing several files for each
possibly single delete of a document in SegmentReader. The best way
to handle the deletes is to simply keep them in memory witho
Jason Rutherglen wrote:
For Ocean I created a workaround where the IndexCommits from
IndexDeletionPolicy are saved in a map in order to achieve deleting
based on the IndexReader. It would be more straightforward to
delete from the IndexCommit in IndexReader.
It seems like we are mixing
Nadav Har'El wrote:
Recently an index I've been building passed the 2 GB mark, and after I
optimize()ed it into one segment over 2 GB, it stopped working.
Nadav, which platform did you hit this on? I think I've created > 2
GB index on 32 bit WinXP just fine. How many platforms are really
On Wednesday 25 June 2008 07:03:59, John Wang wrote:
> Hi guys:
> Perhaps I should have posted this to this list in the first
> place.
>
> I am trying to work on a patch to expose, for each term, minDoc
> and maxDoc. These values can be retrieved while constructing the
> TermInfo.
>
> Know
Hi,
Recently an index I've been building passed the 2 GB mark, and after I
optimize()ed it into one segment over 2 GB, it stopped working.
Apparently, this is a known problem (on 32 bit JVMs), and mentioned in the FAQ,
http://wiki.apache.org/lucene-java/LuceneFAQ question "Is there a way to limit