[
https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739944#action_12739944
]
Michael McCandless commented on LUCENE-1771:
bq. however that project has apac
I always thought flexible indexing is not only for storing your
app-specific data next to terms/docs.
Something more along the lines of efficient geo search, or ability to
try out various index encoding schemes without patching lucene.
In other words, this is something that can be a basis for
easy
[
https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739949#action_12739949
]
Michael McCandless commented on LUCENE-1782:
{quote}
My reason for this, is th
[
https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-1782:
---
Attachment: LUCENE-1782.patch
OK new patch attached w/ the above renaming. I added
Agreed.
Grant's idea is something new and I think useful, ie offering some
sort of pluggability of what's stored in payloads, sitting entirely
outside (above) Lucene's core.
Maybe we should call it 'Flexible Payloads', or something, to
differentiate the two.
Mike
On Thu, Aug 6, 2009 at 5:10 AM,
improve performance of contrib/TestCompoundWordTokenFilter
--
Key: LUCENE-1786
URL: https://issues.apache.org/jira/browse/LUCENE-1786
Project: Lucene - Java
Issue Type: Test
C
[
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739974#action_12739974
]
Michael McCandless commented on LUCENE-1781:
So, here's one thing that worries
[
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-1781:
---
Fix Version/s: (was: 2.9)
3.1
> Large distances in Spatial go
[
https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739978#action_12739978
]
Michael McCandless commented on LUCENE-1771:
Patch looks good Mark! This is a
On Aug 6, 2009, at 5:48 AM, Michael McCandless wrote:
Agreed.
Yes, the ability to do things like implement Okapi, Language Modeling
or very sparse indexes (although we kind of have that already) would
not fit in with this stuff. Of course, those couldn't be solved
through the Attribute
[
https://issues.apache.org/jira/browse/LUCENE-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740033#action_12740033
]
Mark Miller commented on LUCENE-1785:
-
I think this might have to be 3.1 ...
> Simple
[
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Miller updated LUCENE-1486:
Fix Version/s: (was: 2.9)
3.1
3.0
> Wildcards, ORs etc i
[
https://issues.apache.org/jira/browse/LUCENE-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740034#action_12740034
]
Mark Miller commented on LUCENE-1767:
-
I'm about to push this to 3.1 unless someone sp
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Miller updated LUCENE-1749:
Attachment: LUCENE-1749.patch
I still haven't looked at this in the detail that I want to, but time
[
https://issues.apache.org/jira/browse/LUCENE-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Smith updated LUCENE-1784:
--
Attachment: LUCENE-1784.patch
Patch that makes BooleanWeight and DisjunctionMaxWeight protected
also
[
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-1768:
--
Fix Version/s: 2.9
I think this should be in 2.9. Any chance to do this? In my opinion, it sh
I think there is an issue here, but I didn't follow the TokenStream
improvements very closely.
In Solr, CapitalizationFilterFactory has a CharArray set that it loads
up with keep words - it then checks (with the old TokenStream API) each
token (char array) to see if it should keep it. I think
Index: src/java/org/apache/solr/analysis/CapitalizationFilterFactory.java
===
--- src/java/org/apache/solr/analysis/CapitalizationFilterFactory.java
(revision
778975)
+++ src/java/org/apache/solr/analysis/CapitalizationFilterFactory.
[
https://issues.apache.org/jira/browse/LUCENE-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll resolved LUCENE-1341.
-
Resolution: Fixed
Lucene Fields: (was: [Patch Available])
Committed revision 80
I looked into the code of this Filter. It is very simple and should work out
of the box. There is no cloning done. When the indexer calls incrementToken,
the delegation to next(Token) does not clone at all. It just uses the
encapsulated Token instance (inside the AttributeImpl TokenWrapper) as
reus
uwe look at the patch i pasted in haste (i have a delivery guy here, sorry).
the filter had a bug all along (it was using termBuffer.length for
some length calculations).
On Thu, Aug 6, 2009 at 11:17 AM, Uwe Schindler wrote:
> I looked into the code of this Filter. It is very simple and should wo
I have seen your mail, but this bug should not be related to the new Token
API; it should occur with the old API, too.
I did not look very closely into the implementations; I only checked who
changes what in which way. And I see that there is only one Token instance
with a termBuffer that is changed. No p
> I have seen your mail, but this bug should not be related to the new Token
> API; it should occur with the old API, too.
Maybe the problem is an unrelated change:
https://issues.apache.org/jira/browse/LUCENE-1762
This issue changed the default length of the termBuffer in
Token/TermAttributeImpl. Beca
the bug does occur with the old api (some of the evaluations have
incorrect length, but they are not keep words).
it just doesn't happen to make any tests fail (i guess
termBufferLength() happens to == termBuffer.length() for all the
tested keep words) with the old jar file...
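A minimal sketch of the kind of length mix-up described above, not the Solr filter itself. The class and method names are hypothetical and no Lucene classes are used: a reusable char buffer is usually larger than the term it holds, so a keep-word lookup that uses the buffer's capacity as the term length only succeeds when the two happen to be equal.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a reusable token; this is NOT Lucene's Token class.
class TermLengthSketch {
    // The backing array is deliberately larger than most terms.
    private char[] buffer = new char[16];
    // Number of valid chars at the start of buffer.
    private int length = 0;

    void setTerm(String term) {
        if (term.length() > buffer.length) {
            buffer = new char[term.length()];
        }
        term.getChars(0, term.length(), buffer, 0);
        length = term.length();
    }

    // Buggy lookup: treats the whole backing array as the term.
    boolean keepUsingCapacity(Set<String> keepWords) {
        return keepWords.contains(new String(buffer, 0, buffer.length));
    }

    // Correct lookup: uses only the valid prefix of the buffer.
    boolean keepUsingLength(Set<String> keepWords) {
        return keepWords.contains(new String(buffer, 0, length));
    }

    public static void main(String[] args) {
        Set<String> keep = new HashSet<>(Arrays.asList("and", "the"));
        TermLengthSketch token = new TermLengthSketch();
        token.setTerm("and");
        System.out.println("capacity-based lookup: " + token.keepUsingCapacity(keep));
        System.out.println("length-based lookup:   " + token.keepUsingLength(keep));
    }
}
```

In this sketch the two lookups agree only for terms that fill the buffer exactly, which would be consistent with a test suite passing until the default buffer size changes.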
On Thu, Aug 6, 2009
that makes perfect sense
On Thu, Aug 6, 2009 at 11:31 AM, Uwe Schindler wrote:
>> I have seen your mail, but this bug should not be related to the new Token
>> API; it should occur with the old API, too.
>
> Maybe the problem is an unrelated change:
> https://issues.apache.org/jira/browse/LUCENE-1762
>
Mark, I looked at this and think it might be unrelated to tokenstreams.
I think the length argument being provided to processWord(char[]
buffer, int offset, int length, int wordCount) in that filter might be
incorrectly calculated.
This is the method that checks the keep list.
(There is trailing
Thanks a lot guys.
Uwe: that's why I was asking ;) I had no proof it was the TokenStream
API, that just seemed a likely candidate - I'm not familiar with that
filter, but it worked with a version of Lucene right before the
TokenStream improvements patch, and then started failing after.
When I
Test passes with this patch - thanks a lot Robert ! I was going to ask
you to create a solr issue, but I see you already have, thanks!
No need to create a test I think - put in the new Lucene jars and it
fails, so likely that's good enough. Though it is spooky that the test
passed without the n
Mark, I agree it could use some more tests in the future, like many things :)
On Thu, Aug 6, 2009 at 11:52 AM, Mark Miller wrote:
> Test passes with this patch - thanks a lot Robert ! I was going to ask you
> to create a solr issue, but I see you already have, thanks!
>
> No need to create a test
Thanks, we are always here to help :-)
> Test passes with this patch - thanks a lot Robert ! I was going to ask
> you to create a solr issue, but I see you already have, thanks!
>
> No need to create a test I think - put in the new Lucene jars and it
> fails, so likely that's good enough. Though
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740155#action_12740155
]
Mark Miller commented on LUCENE-1749:
-
P.S. I'm not sure we want to go with the way I
[
https://issues.apache.org/jira/browse/LUCENE-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740165#action_12740165
]
Mark Miller commented on LUCENE-1760:
-
tokenstream still says token is deprecated
> T
I think it is a fairly common use case (relative to the rather uncommon
use case of using SpanQuery that is) to want to do something like:
...
SpanQuery sq = ...
topDocs = searcher.search(tq, 10);
Spans spans = sq.getSpans(searcher.getIndexReader());
for (int i = 0; i < topDocs.scoreDocs.length;
seek() seems somewhat doable, although inefficient because the
underlying TermPositions supports seek, but that really would only
allow us to go back to the beginning, I think (besides the fact that
Spans is an interface and it would break back compat, ugh!).
Collector route seems more pro
>> besides the fact that Spans is an interface and it would break back
compat, ugh!
back compat is almost out the window for Spans and 2.9 - we already
broke it with the payloads, so PayloadSpans had been merged to Spans. I
don't know that we have time to squeeze anything in (2.9 is so close !
With a single search one might end up collecting lots of span info
that will be thrown away because the document score is too low.
So I think the best way is to first collect the best hits in the usual
way, and then get the spans of the query (effectively once more,
but now without SpanScorer in b
On Aug 6, 2009, at 2:31 PM, Paul Elschot wrote:
With a single search one might end up collecting lots of span info
that will be thrown away because the document score is too low.
Presumably, you would only collect it if the result was actually put
onto the PriorityQueue, in other words, aft
[
https://issues.apache.org/jira/browse/LUCENE-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reopened LUCENE-1760:
Reopening so we don't forget Mark's last comment...
> TokenStream API javadoc improve
[
https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740213#action_12740213
]
Luis Alves commented on LUCENE-1782:
I'm not able to apply your latest patch,
all fil
[
https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740219#action_12740219
]
Michael McCandless commented on LUCENE-1782:
I think if you run these commands
[
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reassigned LUCENE-1768:
--
Assignee: Uwe Schindler
> NumericRange support for new query parser
>
[
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740222#action_12740222
]
Yonik Seeley commented on LUCENE-1768:
--
bq. I think, this should be in 2.9.
The stan
Standard Tokenizer doesn't recognise I.B.M as an acronym; it requires that it
end with a dot, i.e. I.B.M.
Key: LUCENE-1787
URL: https://issues.apache.org/jira/browse/LUCENE-17
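A rough illustration of the behaviour reported in this issue, using java.util.regex rather than the tokenizer's own grammar. Both patterns and all names below are assumptions made for the example: one pattern requires a dot after every letter, the other makes the final dot optional.

```java
import java.util.regex.Pattern;

// Illustrative only: these regexes are NOT the StandardTokenizer grammar.
class AcronymPatternSketch {
    // Every letter, including the last, must be followed by a dot.
    static final Pattern TRAILING_DOT_REQUIRED =
            Pattern.compile("[A-Za-z]\\.(?:[A-Za-z]\\.)+");

    // Letters separated by dots; the final dot is optional.
    static final Pattern TRAILING_DOT_OPTIONAL =
            Pattern.compile("[A-Za-z](?:\\.[A-Za-z])+\\.?");

    static boolean strict(String s) {
        return TRAILING_DOT_REQUIRED.matcher(s).matches();
    }

    static boolean lenient(String s) {
        return TRAILING_DOT_OPTIONAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"I.B.M.", "I.B.M"}) {
            System.out.println(s + " strict=" + strict(s) + " lenient=" + lenient(s));
        }
    }
}
```

Whether the optional trailing dot is the right fix for the tokenizer is a separate question that the issue itself has to settle.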
[
https://issues.apache.org/jira/browse/LUCENE-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740232#action_12740232
]
Yonik Seeley commented on LUCENE-1787:
--
You would want it to be greedy such that it w
[
https://issues.apache.org/jira/browse/LUCENE-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740233#action_12740233
]
Shai Erera commented on LUCENE-1787:
We should fix ACRONYM, not ACRONYM_DEP right? ACR
But still you might collect spans for docs unnecessarily during processing.
If a doc is added to the PQ and later removed, then the spans collection was
just a waste of time (unless the collection comes in free during query
processing).
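To make the cost argument concrete, here is a small sketch under simplifying assumptions: "collecting spans" is reduced to a counter, and the class, method and parameter names are hypothetical rather than Lucene API. It counts how often spans would be collected if that happened every time a doc entered the priority queue, compared with the number of docs that remain in the queue at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical model of top-N collection; no Lucene classes are involved.
class LazySpanCollectionSketch {
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Keeps the topN highest-scoring docs (returned in no particular order).
    // eagerSpanWork[0] is incremented each time a doc enters the queue,
    // i.e. each time spans would be collected under the eager strategy.
    static List<Hit> topN(float[] scores, int topN, int[] eagerSpanWork) {
        // Min-heap on score, so the weakest surviving hit is at the head.
        PriorityQueue<Hit> pq =
                new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
        for (int doc = 0; doc < scores.length; doc++) {
            if (pq.size() < topN) {
                pq.add(new Hit(doc, scores[doc]));
                eagerSpanWork[0]++;
            } else if (scores[doc] > pq.peek().score) {
                pq.poll(); // evicted: any spans collected for this doc were wasted
                pq.add(new Hit(doc, scores[doc]));
                eagerSpanWork[0]++;
            }
        }
        return new ArrayList<>(pq);
    }

    public static void main(String[] args) {
        int[] eagerSpanWork = new int[1];
        float[] scores = {0.1f, 0.2f, 0.3f, 0.4f, 0.5f};
        List<Hit> hits = topN(scores, 2, eagerSpanWork);
        System.out.println("eager span collections: " + eagerSpanWork[0]);
        System.out.println("span collections needed after search: " + hits.size());
    }
}
```

With scores arriving in increasing order every doc enters the queue, which is the worst case for the eager strategy; the second pass only ever touches the docs that survive.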
Also, if you build a paging search UI, then as soon as the us
On Aug 6, 2009, at 4:25 PM, Shai Erera wrote:
But still you might collect spans for docs unnecessarily during
processing. If a doc is added to the PQ and later removed, then the
spans collection was just a waste of time (unless the collection
comes in free during query processing).
sure,
Cleanup highlighter test class
--
Key: LUCENE-1788
URL: https://issues.apache.org/jira/browse/LUCENE-1788
Project: Lucene - Java
Issue Type: Task
Components: contrib/highlighter
Reporter: Mar
Only w/ ScoreDocs we reuse the same instance. So I guess we'd like to do the
same here.
Seems like providing a TopSpansCollector is what you want, only unlike
TopFieldCollector which populates the fields post search, you'd like to do
it during search.
I've been typing and deleting suggestions for
On Aug 6, 2009, at 5:06 PM, Shai Erera wrote:
Only w/ ScoreDocs we reuse the same instance. So I guess we'd like
to do the same here.
Seems like providing a TopSpansCollector is what you want, only
unlike TopFieldCollector which populates the fields post search,
you'd like to do it durin
[
https://issues.apache.org/jira/browse/LUCENE-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Miller updated LUCENE-1788:
Attachment: LUCENE-1788.patch
> Cleanup highlighter test class
> --
>
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740256#action_12740256
]
Hoss Man commented on LUCENE-1749:
--
Mark: I'll start working on improving the docs (and o
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740265#action_12740265
]
Hoss Man commented on LUCENE-1749:
--
H...
actually mark, testing out your latest patc
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740272#action_12740272
]
Mark Miller commented on LUCENE-1749:
-
I think that TestCustomScoreQuery, TestFieldSco
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740275#action_12740275
]
Mark Miller commented on LUCENE-1749:
-
Here is the output - it appears to think String
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740278#action_12740278
]
Mark Miller commented on LUCENE-1749:
-
{quote}(Actually: that seems like a worth while
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated LUCENE-1749:
-
Attachment: LUCENE-1749.patch
checkpoint: no functional change from mark's previous patch, just improved
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740308#action_12740308
]
Mark Miller commented on LUCENE-1749:
-
Okay, sorry - I messed up when merging with tru
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740311#action_12740311
]
Hoss Man commented on LUCENE-1749:
--
bq. I think that TestCustomScoreQuery, TestFieldScore
[
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated LUCENE-1749:
-
Attachment: LUCENE-1749.patch
bq. the interesting thing is that the CacheEntry.toString() doesn't show t
[
https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740319#action_12740319
]
Luis Alves commented on LUCENE-1782:
I finally was able to apply the patch in eclipse.
[
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740320#action_12740320
]
Bill Bell commented on LUCENE-1781:
---
I did some additional testing, and here is the new
[
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bill Bell updated LUCENE-1781:
--
Attachment: LLRect.java
Large distance fixer
> Large distances in Spatial go beyond Prime Meridian
>
Hey everybody, over in LUCENE-1749 i'm trying to make sanity checking of
the FieldCache possible, and i'm banging my head into a few walls, and
hoping people can help me fill in the gaps about how sorting w/FieldCache
is *supposed* to work.
For starters: i was getting confused why some debugg
[
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740328#action_12740328
]
Hoss Man commented on LUCENE-1789:
--
This idea originated in LUCENE-1749, see these comme
[
https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated LUCENE-1771:
-
Attachment: LUCENE-1771.patch
FWIW: the last patch was giving me compile errors because BoostingNearQuer
getDocValues should provide a MultiReader DocValues abstraction
---
Key: LUCENE-1789
URL: https://issues.apache.org/jira/browse/LUCENE-1789
Project: Lucene - Java
Issue Type: Improv
[
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740340#action_12740340
]
Luis Alves commented on LUCENE-1768:
You could still do something similar by simply ov
[
https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740343#action_12740343
]
Mark Miller commented on LUCENE-1771:
-
Thanks - BoostingNearQuery was just added, so i
[
https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Miller updated LUCENE-1771:
Attachment: LUCENE-1771.bc-tests.patch
workarounds for the back compat test branch
> Using explai
[
https://issues.apache.org/jira/browse/LUCENE-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Miller resolved LUCENE-1788.
-
Resolution: Fixed
Lucene Fields: [New, Patch Available] (was: [New])
> Cleanup highlight
[
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740362#action_12740362
]
Mark Miller commented on LUCENE-1789:
-
It's basically what I did as a first attempt at
Boosting Max Term Query
---
Key: LUCENE-1790
URL: https://issues.apache.org/jira/browse/LUCENE-1790
Project: Lucene - Java
Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersol
[
https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740378#action_12740378
]
Mark Miller commented on LUCENE-1790:
-
What about a common class with chooseable aggre
[
https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated LUCENE-1790:
Attachment: LUCENE-1790.patch
Will commit tomorrow or Saturday, as it is a pretty minor va
[
https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740380#action_12740380
]
Grant Ingersoll commented on LUCENE-1790:
-
Was actually just thinking we could hav
See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/912/changes
-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
Enhance QueryUtils and CheckHits to wrap everything they check in
MultiReader/MultiSearcher
---
Key: LUCENE-1791
URL: https://issues.apache.org/jira/browse/LUCENE-1791
[
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated LUCENE-1791:
-
Attachment: LUCENE-1791.patch
Patch showing what i have in mind.
Current patch causes 14 failures in Te