[jira] [Commented] (LUCENE-4511) TermsFilter might return wrong results if a field is not indexed or not present in the index

2013-03-22 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610617#comment-13610617
 ] 

Commit Tag Bot commented on LUCENE-4511:


[branch_4x commit] Simon Willnauer
http://svn.apache.org/viewvc?view=revision&revision=1404132

LUCENE-4511: TermsFilter might return wrong results if a field is not indexed 
or not present in the index


 TermsFilter might return wrong results if a field is not indexed or not 
 present in the index
 

 Key: LUCENE-4511
 URL: https://issues.apache.org/jira/browse/LUCENE-4511
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/other
Affects Versions: 4.0, 4.1, 5.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4511.patch, LUCENE-4511.patch, LUCENE-4511.patch, 
 LUCENE-4511.patch, LUCENE-4511.patch


 TermsFilter returns early if a term returns null from AIR#terms(term) while it 
 should just continue. I will upload a test and fix shortly
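
 The behaviour described above can be illustrated with a minimal sketch in plain Java. It is not the Lucene code or the attached patch: the index is modelled as nested maps, and MissingFieldSketch, FieldTerm and matchingDocs are hypothetical names chosen for illustration. A field absent from the map plays the role of a terms lookup that returns null.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the filter loop; none of these names are Lucene APIs. */
final class MissingFieldSketch {

  /** A (field, text) pair standing in for a Lucene Term. */
  static final class FieldTerm {
    final String field;
    final String text;

    FieldTerm(String field, String text) {
      this.field = field;
      this.text = text;
    }
  }

  /**
   * index maps field name -> (term text -> doc ids). A field missing from the
   * map plays the role of a terms lookup that returns null.
   */
  static Set<Integer> matchingDocs(
      Map<String, Map<String, List<Integer>>> index, List<FieldTerm> filterTerms) {
    Set<Integer> result = new TreeSet<>();
    for (FieldTerm term : filterTerms) {
      Map<String, List<Integer>> termsOfField = index.get(term.field);
      if (termsOfField == null) {
        // The reported bug corresponds to "return result;" here, which drops
        // every term that follows the missing field.
        continue;
      }
      List<Integer> docs = termsOfField.get(term.text);
      if (docs != null) {
        result.addAll(docs);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Map<String, List<Integer>>> index =
        Map.of("title", Map.of("lucene", List.of(1, 3)));
    List<FieldTerm> terms =
        List.of(new FieldTerm("missing", "x"), new FieldTerm("title", "lucene"));
    System.out.println(matchingDocs(index, terms)); // prints [1, 3]
  }
}
```

 With an early return in place of the continue, the same input would produce an empty result, because the missing field is visited first.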

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4511) TermsFilter might return wrong results if a field is not indexed or not present in the index

2012-10-31 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487733#comment-13487733
 ] 

Michael McCandless commented on LUCENE-4511:


bq. Regarding PrefixCodedTerms I don't think this buys us much here since 
usecases are not likely to share prefixes? 

Well I suspect TermsFilter is often used with many terms, at which
point prefix coding will usually reduce memory required.

Do you have a sense of how many terms typical ElasticSearch usage
uses?  Seems like it must be highish since we're compacting terms into
single byte[] in the first place.

It would also be nice to reuse the same existing code instead of
inventing yet another way to pack terms into bytes (hrm:
FieldCache/DocValues is yet another place where we do this).

But I agree we don't need to improve that now ... we can refactor
later ... progress not perfection.

Hmm maybe add an explicit test for the no terms provided case?
(Maybe I missed it ...).  Also: maybe this should not be IAE but
rather just return a filter accepting nothing?  (I think this is
what the current one does today.)  Ie, just don't add lastTermsAndField
if previousField is null in the ctor.

Otherwise +1 to the new patch.  Thanks Simon!





[jira] [Commented] (LUCENE-4511) TermsFilter might return wrong results if a field is not indexed or not present in the index

2012-10-31 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487742#comment-13487742
 ] 

Simon Willnauer commented on LUCENE-4511:

bq. Well I suspect TermsFilter is often used with many terms, at which
point prefix coding will usually reduce memory required.

The main point here is really reducing the number of objects. In Lucene we often 
focus on reducing the memory footprint, but even if we don't save much here we 
are still friendlier to the GC, which is my main concern. That is also why I 
don't care too much about the prefix coded stuff. Yet we should consolidate 
this. I will open another issue.
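
To make the object-count argument concrete, here is a minimal sketch in plain Java. It is not the patch itself: PackedTermsSketch and its methods are hypothetical names. It packs all terms into one byte[] plus an int[] of offsets, so N terms cost two arrays rather than N separate term objects, and it trims the buffer to the exact size at the end.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration: many terms stored in two arrays. */
final class PackedTermsSketch {
  private final byte[] bytes;  // all term bytes, back to back
  private final int[] offsets; // term i spans offsets[i] .. offsets[i + 1]

  PackedTermsSketch(List<String> terms) {
    int[] offs = new int[terms.size() + 1];
    byte[] buffer = new byte[16];
    int end = 0;
    for (int i = 0; i < terms.size(); i++) {
      byte[] term = terms.get(i).getBytes(StandardCharsets.UTF_8);
      if (buffer.length < end + term.length) {
        // grow only when the next term really does not fit
        buffer = Arrays.copyOf(buffer, Math.max(buffer.length * 2, end + term.length));
      }
      System.arraycopy(term, 0, buffer, end, term.length);
      offs[i] = end;
      end += term.length;
    }
    offs[terms.size()] = end;
    this.bytes = Arrays.copyOf(buffer, end); // shrink to the exact size
    this.offsets = offs;
  }

  int size() {
    return offsets.length - 1;
  }

  String term(int i) {
    return new String(bytes, offsets[i], offsets[i + 1] - offsets[i], StandardCharsets.UTF_8);
  }

  int packedLength() {
    return bytes.length;
  }

  public static void main(String[] args) {
    PackedTermsSketch packed = new PackedTermsSketch(List.of("foo", "barbaz", "q"));
    System.out.println(packed.term(1) + " " + packed.packedLength()); // prints barbaz 10
  }
}
```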

bq. Hmm maybe add an explicit test for the no terms provided case?
I will add one before I commit. I don't think we should be smart here. It's 
likely a bug if nothing is provided.
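
The strict behaviour argued for here could look roughly like the following sketch. It is an illustration only: NonEmptyTermsSketch is a hypothetical class, not the TermsFilter constructor, and the exception message is invented.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of rejecting an empty term list up front. */
final class NonEmptyTermsSketch {
  private final List<String> terms;

  NonEmptyTermsSketch(List<String> terms) {
    if (terms == null || terms.isEmpty()) {
      // Treat "no terms" as a caller bug instead of silently matching nothing.
      throw new IllegalArgumentException("at least one term is required");
    }
    this.terms = new ArrayList<>(terms); // defensive copy
  }

  int termCount() {
    return terms.size();
  }

  public static void main(String[] args) {
    System.out.println(new NonEmptyTermsSketch(List.of("a", "b")).termCount()); // prints 2
    try {
      new NonEmptyTermsSketch(List.of());
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The alternative raised earlier in the thread, returning a filter that accepts nothing, would replace the throw with an empty internal state.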




[jira] [Commented] (LUCENE-4511) TermsFilter might return wrong results if a field is not indexed or not present in the index

2012-10-31 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487756#comment-13487756
 ] 

Uwe Schindler commented on LUCENE-4511:

Wasn't there another patch available that uses AutomatonTermsEnum with MTQ to 
provide the same functionality? The automaton was this Daciuk-Mihov thingie. 
Maybe we can make a MultiTermQuery out of this one (the Filter is then included 
by the rewrite mode, too)?




[jira] [Commented] (LUCENE-4511) TermsFilter might return wrong results if a field is not indexed or not present in the index

2012-10-31 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487775#comment-13487775
 ] 

Uwe Schindler commented on LUCENE-4511:

It is not a problem, I was on vacation, so I only followed the mailing list on 
my mobile phone... We should in all cases port this automaton from the tests 
into core/query-module. I think the main issue is: Some tests in core use it, 
but maybe we can move those tests to the module. Or we move TermsFilter to 
core...




[jira] [Commented] (LUCENE-4511) TermsFilter might return wrong results if a field is not indexed or not present in the index

2012-10-30 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487212#comment-13487212
 ] 

Simon Willnauer commented on LUCENE-4511:

if nobody objects I will commit the current patch tomorrow.




[jira] [Commented] (LUCENE-4511) TermsFilter might return wrong results if a field is not indexed or not present in the index

2012-10-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487249#comment-13487249
 ] 

Michael McCandless commented on LUCENE-4511:


Do we need to check for the no terms provided case (throw IAE)?  Else we seem 
to make a TermsAndField w/ null field?  Or is that harmless (matches no 
docs)...?  Maybe we need a test for it ...

I think the ArrayUtil.grow can be a  not a =?  Should we shrink the byte[] 
down in the end?

Can we rename .terms -> .termBytes?

Typo: "don't use case we could pollute the cache here easily" -> "don't use 
cache since we could pollute the cache here easily"

Typo?: "no freq if we don't need them" -> "no freq since we don't need them"

Maybe equals should also compare the hashCode first (since we compute/cache it 
up front)?
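
A minimal sketch of that suggestion, assuming a class that computes its hash once in the constructor. CachedHashSketch is a hypothetical name, not the class in the patch. Comparing the cached int first lets equals reject most unequal instances without scanning the bytes.

```java
import java.util.Arrays;

/** Hypothetical illustration of an equals() that checks a cached hash first. */
final class CachedHashSketch {
  private final byte[] termBytes;
  private final int hashCode; // computed once, up front

  CachedHashSketch(byte[] termBytes) {
    this.termBytes = termBytes.clone();
    this.hashCode = Arrays.hashCode(this.termBytes);
  }

  @Override
  public int hashCode() {
    return hashCode;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    }
    if (!(obj instanceof CachedHashSketch)) {
      return false;
    }
    CachedHashSketch other = (CachedHashSketch) obj;
    // Cheap int comparison first: different hashes imply different contents.
    return hashCode == other.hashCode && Arrays.equals(termBytes, other.termBytes);
  }

  public static void main(String[] args) {
    CachedHashSketch a = new CachedHashSketch(new byte[] {1, 2, 3});
    CachedHashSketch b = new CachedHashSketch(new byte[] {1, 2, 3});
    CachedHashSketch c = new CachedHashSketch(new byte[] {1, 2, 4});
    System.out.println(a.equals(b) + " " + a.equals(c)); // prints true false
  }
}
```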

Should currentTermsAndField be renamed to lastTermsAndField?  It's always the 
last completed field right?

Hmm I suddenly realized: I think this code is doing the same thing that 
FrozenBufferedDeletes does (see PrefixCodedTerms ... which takes even fewer 
bytes since it shares prefixes).  Maybe we should just use that?
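
To show why sharing prefixes takes fewer bytes, here is a small sketch of prefix coding over sorted terms. It is an assumption-laden illustration and not Lucene's PrefixCodedTerms: the class PrefixCodingSketch, its one-byte length fields and the 255-byte term limit are all invented for brevity. Each term is stored as the number of leading bytes it shares with the previous term, the suffix length, and the suffix bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of prefix coding; not Lucene's PrefixCodedTerms. */
final class PrefixCodingSketch {

  /** Encodes sorted terms as [sharedPrefixLength][suffixLength][suffix bytes]. */
  static byte[] encode(List<String> sortedTerms) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] previous = new byte[0];
    for (String term : sortedTerms) {
      byte[] current = term.getBytes(StandardCharsets.UTF_8);
      if (current.length > 255) {
        throw new IllegalArgumentException("sketch only supports terms up to 255 bytes");
      }
      int shared = 0;
      int max = Math.min(previous.length, current.length);
      while (shared < max && previous[shared] == current[shared]) {
        shared++;
      }
      int suffixLength = current.length - shared;
      out.write(shared);
      out.write(suffixLength);
      out.write(current, shared, suffixLength);
      previous = current;
    }
    return out.toByteArray();
  }

  /** Rebuilds the terms by reusing the shared prefix of the previous term. */
  static List<String> decode(byte[] encoded) {
    List<String> terms = new ArrayList<>();
    byte[] previous = new byte[0];
    int pos = 0;
    while (pos < encoded.length) {
      int shared = encoded[pos++] & 0xFF;
      int suffixLength = encoded[pos++] & 0xFF;
      byte[] current = new byte[shared + suffixLength];
      System.arraycopy(previous, 0, current, 0, shared);
      System.arraycopy(encoded, pos, current, shared, suffixLength);
      pos += suffixLength;
      terms.add(new String(current, StandardCharsets.UTF_8));
      previous = current;
    }
    return terms;
  }

  public static void main(String[] args) {
    List<String> terms = List.of("apple", "apply", "apricot");
    byte[] encoded = encode(terms);
    System.out.println(encoded.length + " bytes, round trip: " + decode(encoded));
  }
}
```

For "apple", "apply", "apricot" this layout needs 17 bytes, against 20 for a plain one-byte-length-prefixed layout. The saving depends entirely on how many prefixes the sorted terms share, which is the point debated in this thread.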




[jira] [Commented] (LUCENE-4511) TermsFilter might return wrong results if a field is not indexed or not present in the index

2012-10-30 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487417#comment-13487417
 ] 

Chris Male commented on LUCENE-4511:


+1 to these improvements.

Another typo: "to optimize for this case and to be fitler-cache friendly we" 
-> "filter-cache"




[jira] [Commented] (LUCENE-4511) TermsFilter might return wrong results if a field is not indexed or not present in the index

2012-10-29 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486021#comment-13486021
 ] 

Michael McCandless commented on LUCENE-4511:


Nice catch!

Hmm should we set lastField = field (and termsEnum = null) before continue (so 
we don't keep calling fields.terms() on the non-existent field), and then 
change that bogus if to check if termsEnum != null?
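
The suggestion can be sketched as follows, again with the index modelled as nested maps rather than real Lucene classes. FieldLookupSketch and its lookups counter are hypothetical. The field is remembered even when the lookup comes back null, so later terms of the same missing field skip the lookup entirely.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration: one lookup per field, even for missing fields. */
final class FieldLookupSketch {
  /** Counts simulated fields.terms(field) calls. */
  int lookups;

  /** terms holds {field, text} pairs, sorted by field. */
  Set<Integer> matchingDocs(
      Map<String, Map<String, List<Integer>>> index, List<String[]> terms) {
    Set<Integer> result = new TreeSet<>();
    String lastField = null;
    Map<String, List<Integer>> termsOfField = null; // stands in for termsEnum
    for (String[] term : terms) {
      String field = term[0];
      if (!field.equals(lastField)) {
        lookups++;
        termsOfField = index.get(field); // null when the field is missing
        lastField = field; // remembered even for a missing field
      }
      if (termsOfField == null) {
        continue; // no further lookup for the same missing field
      }
      List<Integer> docs = termsOfField.get(term[1]);
      if (docs != null) {
        result.addAll(docs);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Map<String, List<Integer>>> index =
        Map.of("title", Map.of("lucene", List.of(1), "solr", List.of(2)));
    List<String[]> terms = List.of(
        new String[] {"missing", "a"},
        new String[] {"missing", "b"},
        new String[] {"title", "lucene"},
        new String[] {"title", "solr"});
    FieldLookupSketch sketch = new FieldLookupSketch();
    System.out.println(sketch.matchingDocs(index, terms) + " lookups=" + sketch.lookups);
  }
}
```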






[jira] [Commented] (LUCENE-4511) TermsFilter might return wrong results if a field is not indexed or not present in the index

2012-10-29 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486165#comment-13486165
 ] 

Michael McCandless commented on LUCENE-4511:


Wow, this looks good!  We could also make an outer array w/ one entry (holding 
field name and array of terms I guess) per field, instead of the array of 
booleans marking the transition.

Hmm, but, you are calling terms.iterator once per term in each field?  Can we 
call that only once per field instead?
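
One way to picture the per-field layout is the sketch below. The names PerFieldGroupingSketch and FieldGroup are hypothetical and the terms are plain strings rather than Lucene terms. Sorted (field, text) pairs are collapsed into one entry per field, so whatever per-field setup is needed (such as obtaining an iterator) can run once per entry instead of once per term.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of one entry per field holding that field's terms. */
final class PerFieldGroupingSketch {

  static final class FieldGroup {
    final String field;
    final List<String> terms = new ArrayList<>();

    FieldGroup(String field) {
      this.field = field;
    }
  }

  /** Input: {field, text} pairs that are already sorted by field. */
  static List<FieldGroup> group(List<String[]> sortedPairs) {
    List<FieldGroup> groups = new ArrayList<>();
    FieldGroup current = null;
    for (String[] pair : sortedPairs) {
      if (current == null || !current.field.equals(pair[0])) {
        current = new FieldGroup(pair[0]); // field transition: start a new entry
        groups.add(current);
      }
      current.terms.add(pair[1]);
    }
    return groups;
  }

  public static void main(String[] args) {
    List<String[]> pairs = List.of(
        new String[] {"a", "x"}, new String[] {"a", "y"}, new String[] {"b", "z"});
    for (FieldGroup g : group(pairs)) {
      // Per-field work would go here, once per group.
      System.out.println(g.field + " -> " + g.terms);
    }
  }
}
```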

At some point/density it may be worth union-ing the terms into an automaton and 
using Terms.intersect ... we've talked about doing that before ... but we should 
do that separately.
