Mark Miller wrote:
Michael McCandless wrote:
Mark Miller wrote:
Mark Miller wrote:
Which new sort stuff are you referring to? Is it LUCENE-1471?
Yes. First thing I did was try and patch this in, but the sort
tests failed. It would be the right order, but like the two
center docs wo
Mark Miller wrote:
Mark Miller wrote:
Mark Miller wrote:
Which new sort stuff are you referring to? Is it LUCENE-1471?
Yes. First thing I did was try and patch this in, but the sort
tests failed. It would be the right order, but like the two center
docs would be reversed or something.
Mark Miller wrote:
Mark Miller wrote:
Which new sort stuff are you referring to? Is it LUCENE-1471?
Yes. First thing I did was try and patch this in, but the sort tests
failed. It would be the right order, but like the two center docs
would be reversed or something. No time to dig in, so I
Michael McCandless wrote:
Mark Miller wrote:
Mark Miller wrote:
Which new sort stuff are you referring to? Is it LUCENE-1471?
Yes. First thing I did was try and patch this in, but the sort tests
failed. It would be the right order, but like the two center docs
would be reversed or somet
Mark Miller wrote:
Mark Miller wrote:
Which new sort stuff are you referring to? Is it LUCENE-1471?
Yes. First thing I did was try and patch this in, but the sort
tests failed. It would be the right order, but like the two center
docs would be reversed or something. No time to dig in,
Hi Mark,
> Thanks for the ref to that bug Uwe, was indeed the problem.
This is now committed: updates in FieldSortedHitQueue, new super-interface
for FieldCache.Parsers and SortField changes (see Mike's commit as I have no
committer status yet).
Uwe
-
Mark Miller wrote:
Which new sort stuff are you referring to? Is it LUCENE-1471?
Yes. First thing I did was try and patch this in, but the sort tests
failed. It would be the right order, but like the two center docs
would be reversed or something. No time to dig in, so I just switch to
the
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654488#action_12654488
]
Jason Rutherglen commented on LUCENE-831:
-
M. McCandless:
"This is an interesting
Hi Mark,
> I'm going to dig in more tonight I hope. The main issue is that using
> SortType.AUTO blows up because the MultiSearcher code expects it already
> to have been resolved to a sort type, but my hack kept that from
> happening so it hits a switch statement for AUTO that throws an
> excepti
Michael McCandless wrote:
Mark Miller wrote:
I tried a quick poor man's version using a MultiSearcher and wrapping
the sub readers as searchers. Other than some AUTO sort field
detection problems, all tests do appear to pass.
Excellent, that sounds like a tentatively positive result, though
Mark Miller wrote:
I tried a quick poor man's version using a MultiSearcher and wrapping
the sub readers as searchers. Other than some AUTO sort field
detection problems, all tests do appear to pass.
Excellent, that sounds like a tentatively positive result, though we
do need to get to th
I tried a quick poor man's version using a MultiSearcher and wrapping the
sub readers as searchers. Other than some AUTO sort field detection
problems, all tests do appear to pass. The new sort stuff for
MultiSearcher may be a tiny bit off...sort tests fail, though are only
slightly off, with th
Mark Miller wrote:
Michael McCandless wrote:
Mark Miller wrote:
What do we get from this though? A MultiSearcher (with the
scoring issues) that can properly do rewrite? Won't we have to
take MultiSearcher's scoring baggage into this as well?
If this can work, what we'd get is far bette
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654418#action_12654418
]
Robert Newson commented on LUCENE-831:
--
Yes, something like that. I made a Document c
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654417#action_12654417
]
Uwe Schindler commented on LUCENE-831:
--
{quote}This is an interesting idea. Say we cre
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654413#action_12654413
]
Michael McCandless commented on LUCENE-831:
---
bq. It seems with this field cache
Michael McCandless wrote:
Mark Miller wrote:
What do we get from this though? A MultiSearcher (with the scoring
issues) that can properly do rewrite? Won't we have to take
MultiSearcher's scoring baggage into this as well?
If this can work, what we'd get is far better reopen() performance
w
Mark Miller wrote:
What do we get from this though? A MultiSearcher (with the scoring
issues) that can properly do rewrite? Won't we have to take
MultiSearcher's scoring baggage into this as well?
If this can work, what we'd get is far better reopen() performance
when you sort-by-field, wi
What do we get from this though? A MultiSearcher (with the scoring
issues) that can properly do rewrite? Won't we have to take
MultiSearcher's scoring baggage into this as well?
Michael McCandless wrote:
On thinking more about this... I think with a few small changes we
could achieve Sort by
On thinking more about this... I think with a few small changes we
could achieve Sort by field without materializing a full array. We
can decouple this change from LUCENE-831.
I think all that's needed is:
* Expose sub-readers (LUCENE-1475) by adding IndexReader[]
IndexReader.getSubReade
Hello Robert,
> This is why I think for many users the field cache is not the best
> solution. If you have lots of documents but searchers that return
> relatively few, then using filters and sorting the results using
> stored fields is far more efficient.
>
> It seems to me that the field cache
One thing to keep in mind about using the field cache for filter
caching.
The filter bitset cache at worst holds 8 documents per byte (and with
bitset compression this can be even more efficient).
Using the field cache will instead cost bytes per document, most
likely at least an orde
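
To put rough numbers on the comparison above, here is a minimal sketch in Java. It assumes an uncompressed bitset (one bit per document) and a field cache that stores one fixed-width value per document. The class name, method names, and the 10 million document index size are hypothetical, chosen only for illustration.

```java
public class CacheFootprint {
    // One bit per document, rounded up to whole bytes (8 documents per byte).
    public static long bitsetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    // A fixed number of bytes per document, e.g. 4 for an int[] of field values.
    public static long fieldCacheBytes(long maxDoc, int bytesPerValue) {
        return maxDoc * bytesPerValue;
    }

    public static void main(String[] args) {
        long maxDoc = 10_000_000L; // hypothetical index size
        System.out.println("bitset:      " + bitsetBytes(maxDoc) + " bytes");
        System.out.println("int[] cache: " + fieldCacheBytes(maxDoc, 4) + " bytes");
    }
}
```

Under these assumptions the bitset needs 1,250,000 bytes and the int array 40,000,000 bytes, a factor of 32. Wider value types (long, String) would widen the gap further.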
Michael McCandless wrote:
I'd like to decouple "upgraded to Object" vs "materialize full array",
ie, so we can access native values w/o materializing the full array.
I also think "upgrade to Object" is dangerous to even offer since it's
so costly.
I'm right with you. I didn't think the Ob
> >>> MultiSearcher has a few aspects I don't like.
> >>
> >> Do you mean the score differences vs IndexSearcher(MultiReader), or
> >> is there something else?
> > And rewrite does not work properly. And to get 30 docs over 3
> > indexes, you ask for 90. And sort twice.
>
> I'm thinking we stick w
Mark Miller wrote:
MultiSearcher has a few aspects I don't like.
Do you mean the score differences vs IndexSearcher(MultiReader), or
is there something else?
And rewrite does not work properly. And to get 30 docs over 3
indexes, you ask for 90. And sort twice.
I'm thinking we stick wi
Mark Miller wrote:
Michael McCandless wrote:
Today, with IndexSearcher(MultiReader), the FieldSortedHitQueue asks
FieldCache to materialize the full array for each field. Whereas
MultiSearcher only asks each child reader to materialize its array for
the field, which is better because on r
Marvin Humphrey wrote:
On Sat, Dec 06, 2008 at 04:21:04PM -0500, Mark Miller wrote:
And to get 30 docs over 3 indexes, you ask for 90. And sort twice.
However, this scales with the number of segments, not the number of documents.
Marvin Humphre
Right. They are all minor gripes to be s
On Sat, Dec 06, 2008 at 04:21:04PM -0500, Mark Miller wrote:
> And to get 30 docs over 3 indexes, you ask for 90. And sort twice.
However, this scales with the number of segments, not the number of documents.
Marvin Humphrey
-
MultiSearcher has a few aspects I don't like.
Do you mean the score differences vs IndexSearcher(MultiReader), or is
there something else?
And rewrite does not work properly. And to get 30 docs over 3 indexes,
you ask for 90. And sort twice.
Minor gripes, but they bug me nonetheless.
- Mark
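
The "ask for 90, sort twice" pattern can be sketched roughly as follows. This is not MultiSearcher's actual code, only a minimal illustration in Java: the Hit class, the int sort key, and the tie-break on doc id are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TopNMerge {
    // Hypothetical hit: a global doc id plus the int field value used for sorting.
    public static final class Hit {
        public final int doc;
        public final int key;
        public Hit(int doc, int key) { this.doc = doc; this.key = key; }
    }

    // Sort by field value; ties are broken by doc id (an assumption of this sketch).
    static final Comparator<Hit> BY_KEY =
        Comparator.<Hit>comparingInt(h -> h.key).thenComparingInt(h -> h.doc);

    // First sort: each sub-index produces its own top n.
    public static List<Hit> topN(List<Hit> hits, int n) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(BY_KEY);
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    // Second sort: up to n * subIndexes.size() candidates are merged, best n kept.
    public static List<Hit> search(List<List<Hit>> subIndexes, int n) {
        List<Hit> candidates = new ArrayList<>();
        for (List<Hit> sub : subIndexes) {
            candidates.addAll(topN(sub, n));
        }
        return topN(candidates, n);
    }
}
```

With n = 30 and three sub-indexes, the candidate list holds up to 90 hits before the final cut. The extra work grows with the number of sub-indexes rather than with the number of matching documents.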
Michael McCandless wrote:
Today, with IndexSearcher(MultiReader), the FieldSortedHitQueue asks
FieldCache to materialize the full array for each field. Whereas
MultiSearcher only asks each child reader to materialize its array for
the field, which is better because on reopen we only need to ini
Mark Miller wrote:
EG when sorting by field, we could pull say an IntData iterator from
the reader, and then access the int values in docID order as we visit
the docs.
We need random access after collecting/visiting though...do we put
what we collect into a map? If a lot of docs match?
I g
Mark Miller wrote:
Michael McCandless (JIRA) wrote:
However, stepping back, this is a poor approach. We should instead be
doing what MultiSearcher does, which is gather top results
per-sub-reader, and then merge-sort the results. At that point, to do
the merge, we only need actual field va
Michael McCandless (JIRA) wrote:
However, stepping back, this is a poor approach. We should instead be
doing what MultiSearcher does, which is gather top results
per-sub-reader, and then merge-sort the results. At that point, to do
the merge, we only need actual field values for those docs in
Michael McCandless (JIRA) wrote:
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654057#action_12654057 ]
Michael McCandless commented on LUCENE-831:
---
One
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654109#action_12654109
]
Robert Newson commented on LUCENE-831:
--
The conflict was easy to resolve, it was just
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654106#action_12654106
]
Uwe Schindler commented on LUCENE-831:
--
Maybe we need two trunks or branches or whatev
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654105#action_12654105
]
Uwe Schindler commented on LUCENE-831:
--
Maybe every assignee should tag his issues that
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654069#action_12654069
]
Robert Newson commented on LUCENE-831:
--
This enhancement is particularly interesting t
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654064#action_12654064
]
Mark Miller commented on LUCENE-831:
Ah, the dirty secret of 831 - there is plenty more
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654057#action_12654057
]
Michael McCandless commented on LUCENE-831:
---
One more thing here... while random
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654055#action_12654055
]
Michael McCandless commented on LUCENE-831:
---
[Note: my understanding of this are
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653757#action_12653757
]
Michael McCandless commented on LUCENE-831:
---
bq. change norm caching to use new c
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649393#action_12649393
]
Mark Miller commented on LUCENE-831:
I think this would actually be better if all cache
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649391#action_12649391
]
Mark Miller commented on LUCENE-831:
bq. i haven't had any time to do further work on t
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649385#action_12649385
]
Alex Vigdor commented on LUCENE-831:
To be honest, the cache never successfully refille
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649191#action_12649191
]
Mark Miller commented on LUCENE-831:
You've tried the patch? Awesome!
How long did it
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649188#action_12649188
]
Mark Miller commented on LUCENE-831:
Also - your reopen time will vary greatly dependin
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649180#action_12649180
]
Alex Vigdor commented on LUCENE-831:
Another useful feature that seems like it would be
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644786#action_12644786
]
Mark Miller commented on LUCENE-831:
This is missing a good way to manage the caching o
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625470#action_12625470
]
Hoss Man commented on LUCENE-831:
-
bq. What benefit do you see to this? Does it offer anyth
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625098#action_12625098
]
Mark Miller commented on LUCENE-831:
bq. change norm caching to use new caches (if not
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625084#action_12625084
]
Mark Miller commented on LUCENE-831:
I've got the function package happily deprecated w
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624844#action_12624844
]
Mark Miller commented on LUCENE-831:
That patch may have a goof, I'll peel off another
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624835#action_12624835
]
Fuad Efendi commented on LUCENE-831:
Would be nice to have TermVectorCache (if term vec
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624529#action_12624529
]
Mark Miller commented on LUCENE-831:
Deprecating the function package is problematic. D
: > Right, if the updates come through IndexWriter or through a different
: > IndexReader. But if you do the updates with an IndexReader (which
: > eventually commits to disk), and also use that IndexReader for
: > searching, we may need to synchronize?
:
: IMO, if we want to support somethi
Yonik Seeley wrote:
On Fri, Mar 28, 2008 at 10:43 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
Yonik Seeley wrote:
On Fri, Mar 28, 2008 at 9:20 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
But eg LUCENE-1231 talks about maybe eventually allowing updates to
fields, like how norm
On Fri, Mar 28, 2008 at 10:43 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
> Yonik Seeley wrote:
> > On Fri, Mar 28, 2008 at 9:20 AM, Michael McCandless
> > <[EMAIL PROTECTED]> wrote:
> >> But eg LUCENE-1231 talks about maybe eventually allowing updates to
> >> fields, like how norms ca
Yonik Seeley wrote:
On Fri, Mar 28, 2008 at 9:20 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
But eg LUCENE-1231 talks about maybe eventually allowing updates to
fields, like how norms can be updated in a reader today. If we do
that, eg as part of flexible indexing, then we might need
On Fri, Mar 28, 2008 at 9:20 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
> But eg LUCENE-1231 talks about maybe eventually allowing updates to
> fields, like how norms can be updated in a reader today. If we do
> that, eg as part of flexible indexing, then we might need to worry
> about
Mark Miller wrote:
Right Michael...of course. Sometimes I cannot see the forest or the
trees...
We don't have to worry that much about the IndexReader being read
only though do we? There is not much worry now - I believe that a
deleted doc field value remains in the field cache until the
Right Michael...of course. Sometimes I cannot see the forest or the trees...
We don't have to worry that much about the IndexReader being read only
though do we? There is not much worry now - I believe that a deleted doc
field value remains in the field cache until the Reader is reopened. If
s
I was picturing that you'd first call an API on IndexReader to
retrieve an object for accessing your stored field, like the
IndexReader.getCachedData() in the current patch on LUCENE-831. That
method must be synchronized so that the underlying cache is initially
loaded by at most one thr
The reason I am thinking you have to synch on every getCachedField call
is that the cache needs to be lazily loaded...I don't see a way to do
this without sync unless you have an ugly "you must call this
method before repeatedly calling getCachedField."
Maybe I am wrong? Or maybe the cost
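
One way to get lazy loading without a separate "call this first" method is to let a concurrent map run the loader at most once per field. The sketch below is only an illustration in Java, not the LUCENE-831 patch: the class LazyFieldCache, its loader function, and the int[] value type are hypothetical, and getCachedField here merely borrows the name used in the discussion.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class LazyFieldCache {
    private final ConcurrentHashMap<String, int[]> cache = new ConcurrentHashMap<>();
    private final Function<String, int[]> loader;      // hypothetical: reads values from the index
    public final AtomicInteger loads = new AtomicInteger(); // counts real loads, for demonstration

    public LazyFieldCache(Function<String, int[]> loader) {
        this.loader = loader;
    }

    // computeIfAbsent applies the loader atomically, so a field is loaded
    // at most once even if several threads ask for it at the same time.
    public int[] getCachedField(String field) {
        return cache.computeIfAbsent(field, f -> {
            loads.incrementAndGet();
            return loader.apply(f);
        });
    }
}
```

Callers simply call getCachedField every time. Whether the per-call cost of the map lookup is acceptable inside a tight scoring loop is a separate question that this sketch does not answer.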
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582576#action_12582576
]
Michael McCandless commented on LUCENE-831:
---
I think if we can finally move to ha
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582480#action_12582480
]
Mark Miller commented on LUCENE-831:
Hmm...how do we avoid having to pull the cached fi
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582471#action_12582471
]
Mark Miller commented on LUCENE-831:
>If you're going to incrementally update a FieldCa
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582443#action_12582443
]
Michael Busch commented on LUCENE-831:
--
{quote}
The benefit then is that reopen() of a
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582422#action_12582422
]
Michael McCandless commented on LUCENE-831:
---
One question here: should we switch
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581758#action_12581758
]
Yonik Seeley commented on LUCENE-831:
-
> I agree that it would be nice to skip the Stri
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581375#action_12581375
]
Mark Miller commented on LUCENE-831:
Right, I think it's used in MultiSearcher and Paral
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581307#action_12581307
]
Hoss Man commented on LUCENE-831:
-
Mark:
I haven't looked at this issue or any of the code
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581189#action_12581189
]
Mark Miller commented on LUCENE-831:
I spent a little time getting this patch somewhat
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559341#action_12559341
]
Michael Busch commented on LUCENE-831:
--
{quote}
As with Lucene 2.3 the reopen is possi
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559008#action_12559008
]
Uwe Schindler commented on LUCENE-831:
--
I did some extensive tests with Lucene 2.3 tod
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515914
]
Otis Gospodnetic commented on LUCENE-831:
-
Catching up with java-dev (just 300-400 more emails to go! ;)), I
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513777
]
Hoss Man commented on LUCENE-831:
-
thanks for the feedback mark ... i honestly haven't looked at this patch since
th
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509958
]
Mark Miller commented on LUCENE-831:
I think this patch is great. Not only does it make all of the sort caching
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12484518
]
Otis Gospodnetic commented on LUCENE-831:
-
I haven't looked at the patch yet. However, I do know that a coll