[jira] Commented: (LUCENE-1536) if a filter can support random access API, we should use it

2010-04-07 Thread Michael McCandless (JIRA)
flex landing), so... we just need a way to pass this Bits down to the low level scorers that actually pull a postings list. But, we should only do this if the filter is not sparse. Also: the filter must be inverted, and ORed with the deleted docs. This can result in enormous perf gains for searches d

[jira] Commented: (LUCENE-2306) contrib/xml-query-parser: NumericRangeQuery and -Filter support

2010-03-27 Thread Mark Harwood (JIRA)
seful and I'm hoping to add smart dtd-driven query entry into Luke. > contrib/xml-query-parser: NumericRangeQuery and -Filter support > --- > > Key: LUCENE-2306 > URL: https://issues.

[jira] Updated: (LUCENE-2306) contrib/xml-query-parser: NumericRangeQuery and -Filter support

2010-03-27 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2306: -- Summary: contrib/xml-query-parser: NumericRangeQuery and -Filter support (was: contrib/xml

RE: Lucene Filter

2010-03-04 Thread Dyutiman
yaa... and now I am trying with multiple filters. Thanks -- View this message in context: http://old.nabble.com/Lucene-Filter-tp27756577p27778081.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com

RE: Lucene Filter

2010-03-03 Thread Uwe Schindler
Maybe now it's also running correctly with the filter? - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Dyutiman [mailto:dyutiman.chaudh...@gmail.com] > Sent: Wednesday, March 03, 2010 2:34 PM

Re: Lucene Filter

2010-03-03 Thread Dyutiman
Oho... actually I didn't check that part of my code at all Thanks a lot for pointing out this to me. The search is running perfectly now thanks Dyutiman -- View this message in context: http://old.nabble.com/Lucene-Filter-tp27756577p27768251.html Sent from the Lucene - Java Deve

Re: Lucene Filter

2010-03-03 Thread mark harwood
-- From: Dyutiman To: java-dev@lucene.apache.org Sent: Wed, 3 March, 2010 11:40:29 Subject: Re: Lucene Filter Thanks Erick, I tried Luke and it seems that my index is fine (see the screenshot attached http://old.nabble.com/file/p27767115/luke.JPG luke.JPG ). That means I did something w

Re: Lucene Filter

2010-03-03 Thread Dyutiman
SearchUtil.java ). If you could please check it once, that would be very helpful. thanks again Dyutiman -- View this message in context: http://old.nabble.com/Lucene-Filter-tp27756577p27767115.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com

Re: Lucene Filter

2010-03-02 Thread Erick Erickson
Taking a quick glance at the code, I don't see anything obviously wrong as far as the problem you describe goes. What happens if you just add a required clause to your query string rather than use a Filter? Something like +sentiment:positive? If you do that, query.toString is your f

Re: Lucene Filter

2010-03-02 Thread Dyutiman
code > > HTH > Erick > > On Tue, Mar 2, 2010 at 9:35 AM, Dyutiman > wrote: > >> >> Hi, >> I am new in this forum and new to Lucene also. I m getting some issue >> while >> trying to filter my Lucene result. >> >> While creating the

Re: Lucene Filter

2010-03-02 Thread Erick Erickson
c, etc. Cure this by moving the new Document inside the while loop If this doesn't help, please show your indexing and searching code HTH Erick On Tue, Mar 2, 2010 at 9:35 AM, Dyutiman wrote: > > Hi, > I am new in this forum and new to Lucene also. I m getting some i

Lucene Filter

2010-03-02 Thread Dyutiman
Hi, I am new in this forum and new to Lucene also. I m getting some issue while trying to filter my Lucene result. While creating the index I am creating a field called sentiment and possible values are 'positive', 'negative' & 'neutral', I am indexing

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-09 Thread Robert Muir (JIRA)
the help here! > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java > Issue Type: New Feat

[jira] Resolved: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-09 Thread Robert Muir (JIRA)
revision 91. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java > Is

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-08 Thread Robert Muir (JIRA)
mmit this to the flex branch tomorrow. The only differences from Uwe's patch will be: * ensure the barred-O (ø) is correct in Anders' name for the NOTICE.txt * remove the unused instance variable in the enum, as it is unused and irrelevant for FilteredTermsEnum > Automaton Query/Filter (sc

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-08 Thread Michael McCandless (JIRA)
e to start looking at committing this to flex so we do not have to work with huge patches? +1 > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-08 Thread Robert Muir (JIRA)
see our testcases exercise the important bits (not .toString or .toDot or other things, but those work too). If you have concerns or think it is confusing, i will do my best to try to figure out ways to simplify or improve it from here. > Automaton Query/Filter (scalabl

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-08 Thread Uwe Schindler (JIRA)
's looks good, my change was only adding the method param and removing the access to the now private tenum. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jir

[jira] Updated: (LUCENE-1271) ClassCastException when using ParallelMultiSearcher.search(Query query, Filter filter, int n, Sort sort)

2009-12-07 Thread Mark Miller (JIRA)
ery, > Filter filter, int n, Sort sort) > > > Key: LUCENE-1271 > URL: https://issues.apache.org/jira/browse/LUCENE-1271 > P

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-07 Thread Robert Muir (JIRA)
help a bit when seeking. instead a char[] is reused, and nextString() etc returns boolean if more solutions exist. I think its actually more readable in a way, need to reorganize a bit more but I need a break from this enum. > Automaton Query/Filter (scalable re

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-07 Thread Robert Muir (JIRA)
. and it simplifies code. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java >

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-07 Thread Robert Muir (JIRA)
d by state number for caching transitions, instead of a hashmap. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lu

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-06 Thread Robert Muir (JIRA)
compat. I added it. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java > Is

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-06 Thread Robert Muir (JIRA)
> Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java > Issue Type: New Feature >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-06 Thread Robert Muir (JIRA)
you get a chance to review it, I can create a new version of the flex branch patch for this issue... this would resolve one of my "big 3 complaints" about complexity of the code. > Automaton Query/Filter (scalable regex) > --- > >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-06 Thread Michael McCandless (JIRA)
put this nextValidUTF16String in UnicodeUtil and also use it in SegmentReader.LegacyTermEnum to replace the "hack", just in case someone else wrote an enum like mine. +1 > Automaton Query/Filter (scalable regex) > --- > >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
= new TermRef(t.text()); {code} instead it could read something like tr = new TermRef(UnicodeUtil.nextValidUTF16String(t.text())); > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://i

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
. So I think I do not absolutely hate the unicode handling code in this enum anymore. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/bro

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
(validUTF16String), and pervert it slightly into nextValidUTF16String. all the tests pass using this on trunk and flex, and I think it reads much easier. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 >

[jira] Issue Comment Edited: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
dealt with it already) // in this case we have to bump to \uE000 (the lowest possible "upper BMP") unpaired L -> \uE000 edit: sorry for the many edits :) > Automaton Query/Filter (scalable regex) > --- > >

[jira] Issue Comment Edited: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
e 4/5: // an unpaired low surrogate. this is invalid when not preceded by lead surrogate // (and if there was one, the above rules would have dealt with it already) // in this case we have to bump to \uE000 (the lowest possible "upper BMP"

[jira] Issue Comment Edited: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
is is invalid when not preceded by lead surrogate // (and if there was one, the above rules would have dealt with it already) // in this case we have to bump to \uE000 (the lowest possible "upper BMP") unpaired L -> \uE000 > Automaton Query/Filter (scalable regex)

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
by lead surrogate // (and if there was one, the above rules would have dealt with it already) // in this case we have to bump to \uE000 (the lowest possible "upper BMP") unpaired L -> \uE000 > Automaton Query/Filter (scalable regex) > ---

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
il an accept state or a loop). > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java >

[jira] Issue Comment Edited: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller (JIRA)
s (Author: markrmil...@gmail.com): Sorry - haven't been paying a lot of attention to all of the Unicode issues/talk lately. Could you briefly explain cleanupPosition? What's the case where a seek position cannot be converted to UTF-8? > Automaton Query/Filte

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller (JIRA)
attention to all of the Unicode issues/talk lately. Could you briefly explain cleanupPosition? What's the case where a seek position cannot be converted to UTF-8? > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
or FilteredTermsEnum (see his branch patch, I think its easier there). if you have ideas how we can simplify any of this in trunk for easier readability (instead of just adding absurd amounts of comments as I did), I'd be very interested. > Automaton

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller (JIRA)
n the position of testing anyway - else I'll look like a moron when I +1 this thing ;) bq. If you save this test setup, I'll save it for sure. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
here). > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java > Issue Type: New Feature >

Re: [jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir
onfused. Please >>> anybody help me out to know which part of codes you are working with. How >>> should I participate in work? Thank you! >>> >>> >>> >>> >>> >>> On Sat, Dec 5, 2009 at 1:02 PM, Uwe Schindler (JIRA) wrote: >>> >>>>

[jira] Issue Comment Edited: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller (JIRA)
ndard corpus). I think the benches you have already done are probably plenty good for benefits testing. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller (JIRA)
going to take more time. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
ss I think you are right about the partial dump. I am indexing the full dump now (at least I think). I will look at it too, at least for curiosity's sake. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 >

[jira] Issue Comment Edited: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller (JIRA)
you have already done are probably plenty good for benefits testing. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller (JIRA)
of things. bq. More interesting to see the benefits... Right, but I'm not really testing for benefits - more for correctness and no loss of performance. I think the benches you have already done are probably plenty good for benefits testing. > Automaton Query/

Re: [jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Ghazal Gharooni
panel] >>> >>> Uwe Schindler updated LUCENE-1606: >>> -- >>> >>> Attachment: (was: LUCENE-1606-flex.patch) >>> >>> > Automaton Query/Filter (scalable regex) >>> > -

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
rsian corpus I mentioned with nearly 500k terms... > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java

Re: [jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir
-- >> >> Attachment: (was: LUCENE-1606-flex.patch) >> >> > Automaton Query/Filter (scalable regex) >> > --- >> > >> > Key: LUCENE-1606 >> > URL: https://issu

Re: [jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller
606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel > ] > > Uwe Schindler updated LUCENE-1606: > -- > >Attachment: (was: LUCENE-1606-flex.patch) > > > Automaton Query/Filter (scalable regex)

Re: [jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Simon Willnauer
etabpanels:all-tabpanel >> ] >> >> Uwe Schindler updated LUCENE-1606: >> -- >> >>    Attachment:     (was: LUCENE-1606-flex.patch) >> >> > Automaton Query/Filter (scalable regex) >> > ---

Re: [jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Ghazal Gharooni
pache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel] > > Uwe Schindler updated LUCENE-1606: > -- > > Attachment: (was: LUCENE-1606-flex.patch) > > >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
o hear its doing so well on such a "small" index as wikipedia, as I would think automata overhead would make it slower (although this can probably be optimized away) > Automaton Query/Filter (scalable regex) > --- > >

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1606: -- Attachment: (was: LUCENE-1606-flex.patch) > Automaton Query/Filter (scalable re

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
*xxx*** > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java > Issue Type: New Feature

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller (JIRA)
are testing I'm not sure at the moment - but its wikipedia dumps, so I'd guess its rather high actually. It is hitting the standard analyzer going in (mainly because I didn't think about changing it on building the indexes). And the queries are getting hit with the lowercase fil

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
instead of rewrite but with reverted LUCENE-2110, which was stupid. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Proj

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
orks on String. btw how many uniq terms is the field you are testing... this is where it starts to help with ?, when you have a ton of unique terms. But I am glad you are testing with hopefully a smaller # of uniq terms, this is probably more common. > Automaton Query/Filte

[jira] Issue Comment Edited: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller (JIRA)
think Robert has mentioned). So far I haven't seen any anomalies in time taken or anything of that nature. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https:

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Mark Miller (JIRA)
is used (as I think Robert has mentioned). So far I haven't seen any anomalies in time taken or anything of that nature. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issu

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
very strange things like seeking forward and backwards and returning all strange stati. Will think about one tomorrow. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1606: -- Attachment: (was: LUCENE-1606-flex.patch) > Automaton Query/Filter (scalable re

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
gain wrong patch. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java >

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
work for today, I am exhausted like the enums. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Jav

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1606: -- Attachment: (was: LUCENE-1606-flex.patch) > Automaton Query/Filter (scalable re

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
LUCENE-2110. Robert: Can you test performance again and compare with old? > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
your patch the performance is the same. But the code is much simpler and easier to read... great work. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1606: -- Attachment: (was: LUCENE-1606-flex.patch) > Automaton Query/Filter (scalable re

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
the nextSeekTerm method to be more straightforward. Robert: Sorry, it would be better to test this one *g* > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/bro

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Robert Muir (JIRA)
with the old and new flex patch, I do not want to commit 2110 before. Uwe I will run a benchmark on both versions! > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apa

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-05 Thread Uwe Schindler (JIRA)
, as soon as 2110 is committed I will upload a new patch. But its hard to differentiate between all modified files. Robert: Can you do performance tests with the old and new flex patch, I do not want to commit 2110 before. > Automaton Query/Filter (scalable re

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-04 Thread Robert Muir (JIRA)
used in both modes without any concern that it will ever hurt performance. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 >

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-04 Thread Robert Muir (JIRA)
, for experimenting or whatever. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java > Is

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-04 Thread Mark Miller (JIRA)
Fantastic commenting man - this whole patch is pretty darn thorough. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 >

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-04 Thread Robert Muir (JIRA)
what else needs to be done here, please review if you can. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-02 Thread Robert Muir (JIRA)
to invalid locations when walking thru the DFA, because these will be replaced by U+FFFD, and terms could be skipped, or we go backwards, creating a loop. Thats why i spent so much time on this. > Automaton Query/Filter (scalable regex) > --- > >

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-12-02 Thread Robert Muir (JIRA)
to valid UTF-8. {code} if you have ideas on how to make this nicer I am happy to hear them. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-30 Thread Robert Muir (JIRA)
lect this. Currently I cheat and take advantage of this property (in trunk) to make the code simpler. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-30 Thread Michael McCandless (JIRA)
now \u can be in the index, and I can seek to it (it won't get replaced with \uFFFD). Yes, \u should be untouched now (though I haven't verified -- actually I'll go add it to the test we already have for \u). > Automaton Query/

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-30 Thread Robert Muir (JIRA)
I need >to know, otherwise it will either skip \u terms, or go into a loop. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606

[jira] Issue Comment Edited: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-25 Thread Robert Muir (JIRA)
nt for the entire query. this can be determined from the state/transitions of the path being evaluated, but its not a one-liner! > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: http

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-25 Thread Robert Muir (JIRA)
but its not a one-liner! > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java > Issue Typ

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-25 Thread Michael McCandless (JIRA)
branch to put it back... > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-25 Thread Robert Muir (JIRA)
) to make this determination, thanks for the idea! > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-25 Thread Michael McCandless (JIRA)
ek itself vs next() Lucene" > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-25 Thread Michael McCandless (JIRA)
eek Lucene, based on how costly nextString() is... > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Jav

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-25 Thread Robert Muir (JIRA)
f I were to do this, then that would kill the TermRef comparison speedup, because then no matter how much i optimize "my seek" nextString(), it needs to do the unicode conversion, which we have seen is expensive across many terms. > Automaton Q

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-25 Thread Michael McCandless (JIRA)
rch through the indexed terms... and not doing a scan when it determines the term you're seeking to is within the same index block. But I don't think this'll impact your tests with a large suffix since each seek will jump way ahead to a new index bloc

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Robert Muir (JIRA)
. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java > Issue Type: New Feature >

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Robert Muir (JIRA)
iteratively, in case someone builds some monster automaton from a 2 page regexp or something like that. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/bro

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1606: Attachment: (was: LUCENE-1606.patch) > Automaton Query/Filter (scalable re

[jira] Issue Comment Edited: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Robert Muir (JIRA)
a finite language (in the wildcard case, no *), we should not do the next() call. but more benchmarking is needed, with more patterns, especially on flex branch to determine if this heuristic is best. > Automaton Query/Filter (scalable regex) > --

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Robert Muir (JIRA)
se, no *), we should not do the next() call. but more benchmarking is needed, with more patterns, especially on flex branch to determine if this heuristic is best. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCE

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Robert Muir (JIRA)
ntation here, I'm hoping we can come up with better ideas that work well on average. One problem is, what is an "average" regular expression or wildcard query :) > Automaton Query/Filter (scalable regex) > --- > >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Michael McCandless (JIRA)
Well, the seeks need to be done anyway... so you can't work around that. The only question is if a wasted next() was done before each, I guess... > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Robert Muir (JIRA)
nd compute the next place to go... (and create a few objects along the way) > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Robert Muir (JIRA)
ant rework of this maybe should take place in flex (although I still think this is an improvement for trunk already), to fully take advantage of it. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 >

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Robert Muir (JIRA)
nd it gives me abcdaa back, I'll do the same thing again. the reason is, somewhere down the line there could be abcdaa1234 :) > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Michael McCandless (JIRA)
's "close"). That said, seeking on trunk is a lot more costly than seeking on flex, because trunk has to make a new [cloned] SegmentTermEnum for each seek. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCE

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Robert Muir (JIRA)
time ago, perhaps we should re-test to see if its appropriate? > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Luc

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-11-24 Thread Michael McCandless (JIRA)
e next XXX1234 term to try to seek to (and we should never use next() on the enum)? > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606
