Re: 2.9/3.0 plan Java 1.5
Just for the record, to pick up this point of Grant's: Grant Ingersoll wrote: IIRC, we also agreed that we didn't feel any compelling reason to make a sweeping change to generics, but would likely just add them as we see 'em, unless of course someone wants to do a wholesale patch. I'll go on record as saying that if doing a 'wholesale patch' is the easiest way, I'm more than happy to do so. As an experiment I tried using a combination of Eclipse's infer generic type arguments (which is brilliant, but not perfect) and manual changes (where Eclipse doesn't quite manage to nail it) and managed to get ~2000 'use of raw types' warnings throughout the Lucene trunk codebase down to ~1000 in the space of an hour or so. There's a little bit of manual tidy-up involved but it's something I've done plenty of before (both internally and on external APIs, which obviously require more care) -- but if you want someone to do the gruntwork, well, just let me know when the 3.0-dev branch exists and is ready for commits and I'll set aside a day and give it a crack. Cheers, Paul - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
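For readers unfamiliar with the 'use of raw types' warning mentioned above, here is a minimal sketch of the kind of change such a patch makes. The class and method names are hypothetical and are not taken from the Lucene codebase; the point is only to contrast a raw collection with its parameterized equivalent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not code from the Lucene trunk.
final class GenericsSketch {

    // Pre-generics style: the compiler reports "use of raw types" here,
    // and the caller needs an explicit cast that can fail at runtime.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String firstRaw() {
        List names = new ArrayList();
        names.add("lucene");
        return (String) names.get(0);
    }

    // Java 1.5 style: the element type is declared once, the cast disappears,
    // and adding a non-String becomes a compile-time error.
    static String firstGeneric() {
        List<String> names = new ArrayList<String>();
        names.add("lucene");
        return names.get(0);
    }
}
```

Both methods return the same value; only the amount of compile-time checking differs, which is why much of the conversion can be done mechanically.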
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657398#action_12657398 ] Mark Miller commented on LUCENE-1483: - I'm on board with whatever you think is best. I'll keep playing with ords. I spent some time last night putting in most of the rest of the cleanup/finish-up that was left outside of the comparators. There's a handful of tests in non-SortTest classes that still fail though, so I still have to fix those. I'll do that, give ords a little play time, and then I think the patch will be fairly close. Then we can take it in and bench on a fairly close to done version. Change IndexSearcher to use MultiSearcher semantics for multiple subreaders --- Key: LUCENE-1483 URL: https://issues.apache.org/jira/browse/LUCENE-1483 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch FieldCache and Filters are forced down to a single segment reader, allowing for individual segment reloading on reopen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657445#action_12657445 ] Mark Miller commented on LUCENE-1483: Hey Mike, how about this one? BooleanScorer can collect hits out of order if you force it (against the contract). I think it's an issue with doc base type stuff.
Re: solr NumberUtils to lucene?
It would be great to get it consistent; I cherry-picked when someone pointed it out to me.
Erik Hatcher wrote: My thoughts... bring over any simple functions like these that are generally useful. At a quick glance, the functions in Solr's NumberUtils are generally useful and fit well in Lucene's NumberTools. What's the harm? Erik
On Dec 16, 2008, at 9:14 PM, Ryan McKinley wrote: I posted this same question for the same reasons a while back... http://markmail.org/message/mji7jnpa5xjfflmw I'm looking at local lucene and trying to figure out how it could go into lucene. As is, locallucene depends on solr since it needs NumberUtils. Any change of heart for moving it into lucene?
-- Patrick O'Leary AOL Local Search Technologies Phone: + 1 703 265 8763 You see, wire telegraph is a kind of a very, very long cat. You pull his tail in New York and his head is meowing in Los Angeles. Do you understand this? And radio operates exactly the same way: you send signals here, they receive them there. The only difference is that there is no cat. - Albert Einstein
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657458#action_12657458 ] Mark Miller commented on LUCENE-1483: I didn't think it should be a problem either, since we just push everything to one reader, but it seems to be: the only tests not passing involve allowDocsOutOfOrder=true. Do the search with it true, do the same search with it false, and you get 3 and 4 docs. 2 or 3 tests involving that fail. I don't have time to dig in till tonight though - thought you might shortcut me to the answer :)
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657466#action_12657466 ] Doug Cutting commented on LUCENE-1483:
bq. I would actually be fine with keeping HitCollector, adding a default setNextReader method, that either throws UOE or (if we are strongly against exceptions) returns false indicating it cannot handle sequential readers.
Could we instead add a new HitCollector subclass, that adds the setNextReader, then use 'instanceof' to decide whether to wrap or not?
bq. I really don't fully understand BooleanScorer!
The original version of BooleanScorer uses a ~16k array to score windows of docs. So it scores docs 0-16k first, then docs 16k-32k, etc. For each window it iterates through all query terms and accumulates a score in table[doc%16k]. It also stores in the table a bitmask representing which terms contributed to the score. Non-zero scores are chained in a linked list. At the end of scoring each window it then iterates through the linked list and, if the bitmask matches the boolean constraints, collects a hit. For boolean queries with lots of frequent terms this can be much faster, since it does not need to update a priority queue for each posting, instead performing constant-time operations per posting. The only downside is that it results in hits being delivered out-of-order within the window, which means it cannot be nested within other scorers. But it works well as a top-level scorer.
The new BooleanScorer2 implementation instead works by merging priority queues of postings, albeit with some clever tricks. For example, a pure conjunction (all terms required) does not require a priority queue. Instead it sorts the posting streams at the start, then repeatedly skips the first to the last. If the first ever equals the last, then there's a hit. When some terms are required and some terms are optional, the conjunction can be evaluated first, then the optional terms can all skip to the match and be added to the score. Thus the conjunction can reduce the number of priority queue updates for the optional terms.
Does that help any?
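The window-at-a-time scheme Doug describes for the original BooleanScorer can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Lucene's actual code: the names `WindowScorerSketch` and `HitSink` are hypothetical, the window is 16 docs rather than ~16k, every posting contributes a score of 1.0, only "required term" constraints are checked, and the bitmask limits the sketch to at most 31 terms.

```java
// Hypothetical, simplified sketch of window-based boolean scoring.
interface HitSink {
    void collect(int doc, float score);
}

final class WindowScorerSketch {
    static final int WINDOW = 16; // tiny window for readability (assumption)

    /**
     * postings[t] holds the sorted doc IDs containing term t.
     * requiredMask has bit t set when term t must match.
     */
    static void score(int[][] postings, int requiredMask, int maxDoc, HitSink sink) {
        int[] pos = new int[postings.length]; // read position per posting list
        float[] score = new float[WINDOW];    // accumulated score per slot
        int[] bits = new int[WINDOW];         // which terms touched each slot
        int[] next = new int[WINDOW];         // linked list of touched slots

        for (int base = 0; base < maxDoc; base += WINDOW) {
            int head = -1;
            // Pass 1: each term adds its postings for this window; O(1) per posting.
            for (int t = 0; t < postings.length; t++) {
                while (pos[t] < postings[t].length && postings[t][pos[t]] < base + WINDOW) {
                    int slot = postings[t][pos[t]++] % WINDOW;
                    if (bits[slot] == 0) { // first touch: chain the slot
                        next[slot] = head;
                        head = slot;
                    }
                    bits[slot] |= 1 << t;
                    score[slot] += 1.0f;
                }
            }
            // Pass 2: walk the chain, collect slots whose bitmask satisfies the
            // constraints, and reset each slot for the next window.
            for (int slot = head; slot != -1; slot = next[slot]) {
                if ((bits[slot] & requiredMask) == requiredMask) {
                    sink.collect(base + slot, score[slot]);
                }
                bits[slot] = 0;
                score[slot] = 0f;
            }
        }
    }
}
```

Because the chain is walked in reverse order of first touch, hits within a window are not delivered in doc ID order, which is the out-of-order behavior mentioned above. Across windows the order is still increasing.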
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657468#action_12657468 ] Mark Miller commented on LUCENE-1483:
bq. Could we instead add a new HitCollector subclass, that adds the setNextReader, then use 'instanceof' to decide whether to wrap or not?
Woah! Don't make me switch all that again! I've got wrist injuries here :) The reason I lost the instanceof is that we would have to deprecate the HitCollector implementations because they need to extend HitCollector. Mike seemed against deprecating those if we could get away with it, so I've since dropped that. I've already gone back and forth - what's it going to be? I'll admit I don't like using the exception trap I'm using now, but I don't much like the return true/false method either...
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657406#action_12657406 ] Mark Miller commented on LUCENE-1483: Hmmm...we had a reason for deprecating HitCollector though. At first it was to do the capability check (instance of HitCollector would be wrapped), but that didn't pan out. I think we also liked it because people got deprecation warnings though - so that they would know to implement that method for 3.0 when we would take out the wrapper.
[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657468#action_12657468 ] markrmil...@gmail.com edited comment on LUCENE-1483 at 12/17/08 10:34 AM:
bq. Could we instead add a new HitCollector subclass, that adds the setNextReader, then use 'instanceof' to decide whether to wrap or not?
Woah! Don't make me switch all that again! I've got wrist injuries here :) The reason I lost the instanceof is that we would have to deprecate the HitCollector implementations because they need to extend HitCollector. Mike seemed against deprecating those if we could get away with it, so I've since dropped that. I've already gone back and forth - what's it going to be? I'll admit I don't like using the exception trap I'm using now, but I don't much like the return true/false method either...
*Edit* Ah, I see, you have a new tweak on it this time. Extend HitCollector rather than HitCollector extending the new type... Nice, I think this is the way to go.
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657476#action_12657476 ] Doug Cutting commented on LUCENE-1483:
bq. Woah! Don't make me switch all that again!
Sorry, I'm just tossing out ideas. Don't take me too seriously...
bq. The reason I lost the instanceof is that we would have to deprecate the HitCollector implementations because they need to extend HitCollector.
Would we? I was suggesting that, if we're going to have two APIs, one expert and one non-expert, then we could make the expert API a subclass and not deprecate or otherwise alter HitCollector. I do not like using exceptions for normal control flow. Instanceof is better, but not ideal. A default implementation of an expert method that returns 'false', as Mike suggested, isn't bad and might be best. It requires neither deprecation, exceptions nor instanceof. Would we have a subclass that overrides this that's used as a base class for optimized implementations?
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657414#action_12657414 ] Mark Miller commented on LUCENE-1483: Okay, I hate the idea of leaving in the wrapper, but it is true that's too difficult of a method for HitCollector (to be required anyway). setReader is a jump in understanding above setDocBase, which was bad enough.
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657483#action_12657483 ] Michael McCandless commented on LUCENE-1483: {quote} Does that help any? {quote} Yes, thanks! So much so that I'm going to go add that blurb to the javadocs...
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657481#action_12657481 ] Michael McCandless commented on LUCENE-1483: {quote} Would we have a subclass that overrides this that's used as a base class for optimized implementations? {quote} If we do this, I don't think we need a new base class for expert collectors; they can simply subclass HitCollector and override the setNextReader method? Though one downside of this approach is the simple HitCollector API is polluted with this advanced method, and HitCollector's collect method gets different args depending on what that method returns. It's a somewhat confusing API. I guess I'd actually prefer subclassing HitCollector (SequentialHitCollector? AdvancedHitCollector? SegmentedHitCollector?), adding setNextReader only to that subclass, and using instanceof to wrap HitCollector subclasses.
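A rough sketch of the subclass-plus-instanceof option Mike prefers might look like the following. The classes here are simplified stand-ins, not the real Lucene types: `MultiReaderHitCollector` is the name floated later in the thread, while `DocBaseWrapper`, `SegmentSearchSketch`, the `Object reader` placeholder for an IndexReader, and the exact setNextReader signature are assumptions made for illustration.

```java
// Simplified stand-in for the existing, non-expert API: receives global doc IDs.
abstract class HitCollector {
    public abstract void collect(int doc, float score);
}

// Expert API as a subclass: told when the search moves to the next segment,
// and receives segment-relative doc IDs in collect().
abstract class MultiReaderHitCollector extends HitCollector {
    public abstract void setNextReader(Object reader, int docBase);
}

// Hypothetical adapter: lets a plain HitCollector keep seeing global doc IDs.
final class DocBaseWrapper extends MultiReaderHitCollector {
    private final HitCollector delegate;
    private int docBase;

    DocBaseWrapper(HitCollector delegate) {
        this.delegate = delegate;
    }

    public void setNextReader(Object reader, int docBase) {
        this.docBase = docBase;
    }

    public void collect(int doc, float score) {
        delegate.collect(doc + docBase, score); // rebase to a global doc ID
    }
}

final class SegmentSearchSketch {
    // The capability check: expert collectors pass through, others get wrapped.
    static MultiReaderHitCollector wrapIfNeeded(HitCollector c) {
        return (c instanceof MultiReaderHitCollector)
                ? (MultiReaderHitCollector) c
                : new DocBaseWrapper(c);
    }

    // segmentHits[s] holds the segment-relative matches of segment s.
    static void search(int[][] segmentHits, int[] docBases, HitCollector collector) {
        MultiReaderHitCollector target = wrapIfNeeded(collector);
        for (int seg = 0; seg < segmentHits.length; seg++) {
            target.setNextReader(null, docBases[seg]); // null stands in for the reader
            for (int doc : segmentHits[seg]) {
                target.collect(doc, 1.0f);
            }
        }
    }
}
```

With this shape, existing HitCollector subclasses need no changes and no deprecation; the cost is one instanceof check per search and an extra addition per hit for wrapped collectors.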
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657482#action_12657482 ] Mark Miller commented on LUCENE-1483:
bq. Woah! Don't make me switch all that again! Sorry, I'm just tossing out ideas. Don't take me too seriously...
Same here. If you guys have 100 ideas, I'd do it 100 times. No worries. Just wrist frustration :) I misunderstood you anyway.
bq. It requires neither deprecation, exceptions nor instanceof.
Okay, fair points. I guess my main dislike was having to call it, see what it returns, and then maybe call it again. That turned me off as much as instanceof. I'm still liking the suggestion you just made myself... Mike?
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657489#action_12657489 ] Mark Miller commented on LUCENE-1483:
bq. I guess I'd actually prefer subclassing HitCollector (SequentialHitCollector? AdvancedHitCollector? SegmentedHitCollector?), adding setNextReader only to that subclass, and using instanceof to wrap HitCollector subclasses.
That's actually what I prefer as well (and what I tried). I used MultiReaderHitCollector. Still thinking about the name...
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657493#action_12657493 ] Michael McCandless commented on LUCENE-1483: I like MultiReaderHitCollector!
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657402#action_12657402 ] Mark Miller commented on LUCENE-1483:
bq. I have still one question: Why do we need the new DocCollector? Is this really needed? Would it be not OK to just add the offset before calling collect()?
If it's not needed, let's get rid of it. We don't want to deprecate HitCollector if we don't have to. The main reason I can see that we are doing it at the moment is that the TopFieldValueDocCollector needs that hook so that it can set the next IndexReader for each Comparator. The Comparator needs it to create the fieldcaches and map ords from one reader to the next. Also, it lets us do the docBase stuff, which is nice because you add the docBase less often if done in the collector.
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657331#action_12657331 ] Michael McCandless commented on LUCENE-1483: {quote} I just don't think that ords without fallback is going to get very good. I'm wondering if we should even try too hard if ord with val fallback does so well. {quote} Maybe we can try a bit more (I'll run perf tests on your next iteration here?) and then start wrapping things up? Progress not perfection! We can further improve this later.
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657403#action_12657403 ] Michael McCandless commented on LUCENE-1483: {quote} Why do we need the new DocCollector? Is this really needed? Would it be not OK to just add the offset before calling collect()? {quote} I'd like to allow for 'expert' cases, where the collector is told when we advance to the next sequential reader and can do something at that point (like our sort-by-field collector does). But then still allow for 'normal' cases, where the collector is unchanged with what we have today (ie it receives the real docID). The core collectors would use the expert API to eke out all performance; external collectors can use either, but the 'normal' one would be simplest (and match back compat). So then how to implement this approach... I would actually be fine with keeping HitCollector, adding a default setNextReader method, that either throws UOE or (if we are strongly against exceptions) returns false indicating it cannot handle sequential readers. Then when we run searches we simply check if the collector is an expert one (does not throw UOE or return false from setNextReader) and if it isn't we wrap it with DocBaseCollector (which adds the doc base for every collect() call). Change IndexSearcher to use MultiSearcher semantics for multiple subreaders --- Key: LUCENE-1483 URL: https://issues.apache.org/jira/browse/LUCENE-1483 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch FieldCache and Filters are forced down to a single segment reader, allowing for individual segment reloading on reopen. 
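The wrapper idea in Michael's comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: SketchCollector, DocBaseWrapper and SegmentSearchSketch are hypothetical names, setNextReader here takes just a doc base and returns a boolean (the "return false" variant from the comment), and none of this is taken from the LUCENE-1483 patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for HitCollector; not the Lucene class.
abstract class SketchCollector {
    abstract void collect(int doc, float score);

    // Default: "I cannot handle per-segment doc IDs" (the non-exception variant).
    boolean setNextReader(int docBase) {
        return false;
    }
}

// Hypothetical wrapper: adds the current doc base on every collect() call,
// so a "normal" collector keeps receiving top-level doc IDs.
final class DocBaseWrapper extends SketchCollector {
    private final SketchCollector delegate;
    private int docBase;

    DocBaseWrapper(SketchCollector delegate) {
        this.delegate = delegate;
    }

    @Override
    boolean setNextReader(int docBase) {
        this.docBase = docBase;
        return true;
    }

    @Override
    void collect(int doc, float score) {
        delegate.collect(docBase + doc, score);
    }
}

final class SegmentSearchSketch {
    // hitsPerSegment[s] holds segment-relative doc IDs matched in segment s.
    static void search(int[][] hitsPerSegment, int[] segmentSizes, SketchCollector collector) {
        // Probe once: an "expert" collector returns true, anything else gets wrapped.
        SketchCollector c = collector.setNextReader(0) ? collector : new DocBaseWrapper(collector);
        int docBase = 0;
        for (int seg = 0; seg < hitsPerSegment.length; seg++) {
            c.setNextReader(docBase);
            for (int doc : hitsPerSegment[seg]) {
                c.collect(doc, 1.0f);
            }
            docBase += segmentSizes[seg];
        }
    }

    // Convenience for trying it out: a "normal" collector that records the IDs it sees.
    static List<Integer> collectGlobalIds(int[][] hitsPerSegment, int[] segmentSizes) {
        final List<Integer> seen = new ArrayList<>();
        search(hitsPerSegment, segmentSizes, new SketchCollector() {
            @Override
            void collect(int doc, float score) {
                seen.add(doc);
            }
        });
        return seen;
    }
}
```

With two segments of 5 and 4 docs, hits {0, 2} and {1, 3} reach the plain collector as top-level IDs 0, 2, 6 and 8, while a collector that overrides setNextReader to return true would see the segment-relative IDs and track the base itself.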
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657400#action_12657400 ] Uwe Schindler commented on LUCENE-1483: --- I have still one question: Why do we need the new DocCollector? Is this really needed? Would it be not OK to just add the offset before calling collect()? Change IndexSearcher to use MultiSearcher semantics for multiple subreaders --- Key: LUCENE-1483 URL: https://issues.apache.org/jira/browse/LUCENE-1483 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch FieldCache and Filters are forced down to a single segment reader, allowing for individual segment reloading on reopen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657409#action_12657409 ] Michael McCandless commented on LUCENE-831: --- {quote} this will turn more into an API overhaul than an IndexReader reopen time saver. {quote} ...and given the progress on LUCENE-1483 (copying values into the sort queues), I think this new FieldCache API should probably be primarily an iteration API. Complete overhaul of FieldCache API/Implementation -- Key: LUCENE-831 URL: https://issues.apache.org/jira/browse/LUCENE-831 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Hoss Man Fix For: 3.0 Attachments: ExtendedDocument.java, fieldcache-overhaul.032208.diff, fieldcache-overhaul.diff, fieldcache-overhaul.diff, LUCENE-831.03.28.2008.diff, LUCENE-831.03.30.2008.diff, LUCENE-831.03.31.2008.diff, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch Motivation: 1) Completely overhaul the API/implementation of FieldCache type things... a) eliminate global static map keyed on IndexReader (thus eliminating synch block between completely independent IndexReaders) b) allow more customization of cache management (ie: use expiration/replacement strategies, disk backed caches, etc) c) allow people to define custom cache data logic (ie: custom parsers, complex datatypes, etc... anything tied to a reader) d) allow people to inspect what's in a cache (list of CacheKeys) for an IndexReader so a new IndexReader can be likewise warmed. e) Lend support for smarter cache management if/when IndexReader.reopen is added (merging of cached data from subReaders). 2) Provide backwards compatibility to support existing FieldCache API with the new implementation, so there is no redundant caching as client code migrates to new API.
[jira] Resolved: (LUCENE-1484) Remove SegmentReader.document synchronization
[ https://issues.apache.org/jira/browse/LUCENE-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1484. Resolution: Fixed Fix Version/s: 2.9 Committed revision 727338. Thanks Jason! Remove SegmentReader.document synchronization - Key: LUCENE-1484 URL: https://issues.apache.org/jira/browse/LUCENE-1484 Project: Lucene - Java Issue Type: Improvement Components: Index Affects Versions: 2.4 Reporter: Jason Rutherglen Assignee: Michael McCandless Fix For: 2.9 Attachments: LUCENE-1484.patch, LUCENE-1484.patch Original Estimate: 96h Remaining Estimate: 96h This is probably the last synchronization issue in Lucene. It is the document method in SegmentReader. It is avoidable by using a threadlocal for FieldsReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657455#action_12657455 ] Michael McCandless commented on LUCENE-1483: {quote} BooleanScorer can collect hits out of order if you force it (against the contract). {quote} Hmmm... right. You mean if you pass in allowDocsOutOfOrder=true (defaults to false). I think this should not be a problem? (Though, I really don't fully understand BooleanScorer!). Since we are running scoring per-segment, each segment might collect its docIDs out of order, but all such docs are still within the current segment. Then when we advance to the new segment, the collector can do something if it needs to, and then collection proceeds again on the next segment's docs, possibly out of order. Ie, the out-of-orderness never jumps across a segment and then back again? But this is a challenge for LUCENE-831, if we go with a primarily iterator-driven API. Change IndexSearcher to use MultiSearcher semantics for multiple subreaders --- Key: LUCENE-1483 URL: https://issues.apache.org/jira/browse/LUCENE-1483 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch FieldCache and Filters are forced down to a single segment reader, allowing for individual segment reloading on reopen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657330#action_12657330 ] Michael McCandless commented on LUCENE-1483: {quote} the binary search gives back -insertionpoint - 1, the insertion point for banana is 1, so -1 -1 = -2. So I reverse that and subtract 2 to get 0 right? It lands on apple. {quote} Hmm -- I didn't realize binarySearch returns the insertion point on a miss. So your logic (negate then subtract 2) makes perfect sense now. Just to be sure... maybe you should temporarily add asserts, when a negative index is returned, that values[-index-2].compareTo(newValue) < 0 and values[-index-1].compareTo(newValue) > 0 (making sure those array accesses are in bounds)? {quote} (I dont remember off hand why subord has to start at 1 not 0, but i remember it didnt work otherwise) {quote} This is very important -- that 1 is equivalent to the original 0.5 proposal, ie, think of subord as the 2nd digit in a 2-digit number. That 2nd digit being non-zero is how we know that even though banana's ord landed on apple's, banana is in fact *not* equal to apple (because the subord for banana is > 0) and is instead *between* apple and orange.
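The insertion-point arithmetic discussed in the comment above can be tried out with a small sketch. It relies only on the documented behaviour of java.util.Arrays.binarySearch (a miss returns -(insertionPoint) - 1); the class and method names are hypothetical and are not the names used in the patch.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the ord/subord idea; not code from LUCENE-1483.
final class OrdLookupSketch {
    // Maps a value onto a segment's sorted values as {ord, subord}.
    // Exact hit: {index, 0}. Miss: {insertionPoint - 1, 1}, i.e. the value sits
    // just after the entry at that ord. A value that sorts before every entry
    // gets ord -1.
    static int[] ordAndSubord(String[] sortedValues, String value) {
        int index = Arrays.binarySearch(sortedValues, value);
        if (index >= 0) {
            return new int[] { index, 0 };
        }
        // index == -(insertionPoint) - 1, so -index - 2 == insertionPoint - 1
        return new int[] { -index - 2, 1 };
    }

    // Compares ord first and subord second, like a 2-digit number.
    static int compare(int[] a, int[] b) {
        if (a[0] != b[0]) {
            return Integer.compare(a[0], b[0]);
        }
        return Integer.compare(a[1], b[1]);
    }
}
```

For the sorted values {apple, orange}, banana misses with a return value of -2, which maps to ord 0 and subord 1: it lands on apple's ord but compares greater than apple (0, 0) and less than orange (1, 0).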
[jira] Commented: (LUCENE-1494) Additional features for searching for value across multiple fields (many-to-one style)
[ https://issues.apache.org/jira/browse/LUCENE-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657353#action_12657353 ] Andrzej Bialecki commented on LUCENE-1494: --- Luke should work with trunk, possibly with only minor patches. Just grab the luke-0.9.jar and add jars from Lucene trunk on the classpath. Additional features for searching for value across multiple fields (many-to-one style) -- Key: LUCENE-1494 URL: https://issues.apache.org/jira/browse/LUCENE-1494 Project: Lucene - Java Issue Type: New Feature Components: Search Affects Versions: 2.4 Reporter: Paul Cowan Priority: Minor Attachments: LUCENE-1494-multifield.patch, LUCENE-1494-positionincrement.patch This issue is to cover the changes required to do a search across multiple fields with the same name in a fashion similar to a many-to-one database. Below is my post on java-dev on the topic, which details the changes we need: --- We have an interesting situation where we are effectively indexing two 'entities' in our system, which share a one-to-many relationship (imagine 'User' and 'Delivery Address' for demonstration purposes). At the moment, we index one Lucene Document per 'many' end, duplicating the 'one' end data, like so: userid: 1 userfirstname: fred addresscountry: au addressphone: 1234 userid: 1 userfirstname: fred addresscountry: nz addressphone: 5678 userid: 2 userfirstname: mary addresscountry: au addressphone: 5678 (note: 2 Documents indexed for user 1). This is somewhat annoying for us, because when we search in Lucene the results we want back (conceptually) are at the 'user' level, so we have to collapse the results by distinct user id, etc. etc (let alone that it blows out the size of our index enormously). So why do we do it? 
It would make more sense to use multiple fields: userid: 1 userfirstname: fred addresscountry: au addressphone: 1234 addresscountry: nz addressphone: 5678 userid: 2 userfirstname: mary addresscountry: au addressphone: 5678 But imagine the search +addresscountry:au +addressphone:5678. We'd like this to match ONLY Mary, but of course it matches Fred also because he matches both those terms (just for different addresses). There are two aspects to the approach we've (more or less) got working but I'd like to run them past the group and see if they're worth trying to get them into Lucene proper (if so, I'll create a JIRA issue for them) 1) Use a modified SpanNearQuery. If we assume that country + phone will always be one token, we can rely on the fact that the positions of 'au' and '5678' in Fred's document will be different. SpanQuery q1 = new SpanTermQuery(new Term("addresscountry", "au")); SpanQuery q2 = new SpanTermQuery(new Term("addressphone", "5678")); SpanQuery snq = new SpanNearQuery(new SpanQuery[]{q1, q2}, 0, false); The slop of 0 means that we'll only return those where the two terms are in the same position in their respective fields. This works brilliantly, BUT requires a change to SpanNearQuery's constructor (which checks that all the clauses are against the same field). Are people amenable to perhaps adding another constructor to SNQ which doesn't do the check, or subclassing it to do the same (give it a protected non-checking constructor for the subclass to call)? 2) It gets slightly more complicated in the case of variable-length terms. For example, imagine if we had an 'address' field ('123 Smith St') which will result in (1 to n) tokens; slop 0 in a SpanNearQuery won't work here, of course.
One thing we've toyed with is the idea of using getPositionIncrementGap -- if we knew that 'address' would be, at most, 20 tokens, we might use a position increment gap of 100, and make the slop factor 50; this works fine for the simple case (yay!), but with a great many addresses-per-user starts to get more complicated, as the gap counts from the last term (so the position sequence for a single value field might be 0, 100, 200, but for the address field it might be 0, 1, 2, 3, 103, 104, 105, 106, 206, 207... so it's going to get out of sync). The simplest option here seems to be changing (or supplementing) public int getPositionIncrementGap(String fieldname) to public int getPositionIncrementGap(String fieldname, int currentPos) so that we can override that to round up to the nearest 100 (or whatever) based on currentPos. The default implementation could just delegate to getPositionIncrementGap(). --- Patches (x2) to follow shortly
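The "round up to the nearest 100 based on currentPos" idea from point 2 above comes down to a little arithmetic, sketched here. The class and method names are hypothetical, and how currentPos relates to the last token's position inside the indexer is an assumption; the sketch only shows how a gap could be chosen so that currentPos plus the gap falls on the next block boundary.

```java
// Hypothetical helper; not part of Lucene's Analyzer API or the attached patches.
final class BlockAlignedGapSketch {
    // Returns a gap such that currentPos + gap is the next multiple of blockSize
    // strictly greater than currentPos.
    static int gapToNextBlock(int currentPos, int blockSize) {
        if (blockSize <= 0 || currentPos < 0) {
            throw new IllegalArgumentException("blockSize must be > 0 and currentPos >= 0");
        }
        return blockSize - (currentPos % blockSize);
    }
}
```

With a block size of 100, an address value whose last token sits at position 3 would get a gap of 97, so the next value starts at 100 rather than 103, and the multi-token field stays in step with a single-token field whose values sit at 0, 100, 200.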
[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657401#action_12657401 ] Jeremy Volkman commented on LUCENE-831: --- A couple things:
# Looking at the getCachedData method for MultiReader and MultiSegmentReader, it doesn't appear that the CacheData objects from merge operations are cached. Is there any reason for this?
# I've written a merge method for StringIndexCacheKey. The process isn't all that complicated (apart from all of the off-by-ones), but it's expensive.
{code:java}
// Uses java.util.Arrays
public boolean isMergable() {
  return true;
}

private static class OrderNode {
  int index;
  OrderNode next;
}

public CacheData mergeData(int[] starts, CacheData[] data) throws UnsupportedOperationException {
  int[] mergedOrder = new int[starts[starts.length - 1]];
  // Lookup map is 1-based
  String[] mergedLookup = new String[starts[starts.length - 1] + 1];

  // Unwrap cache payloads and flip order arrays
  StringIndex[] unwrapped = new StringIndex[data.length];

  /* Flip the order arrays (reverse indices and values)
   * Since the ord map has a many-to-one relationship with the lookup table,
   * the flipped structure must be one-to-many which results in an array of
   * linked lists.
   */
  OrderNode[][] flippedOrders = new OrderNode[data.length][];
  for (int i = 0; i < data.length; i++) {
    StringIndex si = (StringIndex) data[i].getCachePayload();
    unwrapped[i] = si;
    flippedOrders[i] = new OrderNode[si.lookup.length];
    for (int j = 0; j < si.order.length; j++) {
      // Prepend doc j to the list of docs that share this ord
      OrderNode a = new OrderNode();
      a.index = j;
      a.next = flippedOrders[i][si.order[j]];
      flippedOrders[i][si.order[j]] = a;
    }
  }

  // Lookup map is 1-based
  int[] lookupIndices = new int[unwrapped.length];
  Arrays.fill(lookupIndices, 1);

  int lookupIndex = 0;
  String currentVal;
  int currentSeg;
  while (true) {
    currentVal = null;
    currentSeg = -1;
    int remaining = 0;
    // Find the next ordered value from all the segments
    for (int i = 0; i < unwrapped.length; i++) {
      if (lookupIndices[i] < unwrapped[i].lookup.length) {
        remaining++;
        String that = unwrapped[i].lookup[lookupIndices[i]];
        // Keep the smallest candidate seen so far
        if (currentVal == null || currentVal.compareTo(that) > 0) {
          currentVal = that;
          currentSeg = i;
        }
      }
    }
    if (remaining == 1) {
      break;
    } else if (remaining == 0) {
      /* The only way this could happen is if there are 0 segments or if
       * all segments have 0 terms. In either case, we can return
       * early.
       */
      return new CacheData(new StringIndex(
          new int[starts[starts.length - 1]], new String[1]));
    }
    if (!currentVal.equals(mergedLookup[lookupIndex])) {
      lookupIndex++;
      mergedLookup[lookupIndex] = currentVal;
    }
    OrderNode a = flippedOrders[currentSeg][lookupIndices[currentSeg]];
    while (a != null) {
      mergedOrder[a.index + starts[currentSeg]] = lookupIndex;
      a = a.next;
    }
    lookupIndices[currentSeg]++;
  }
  // (the posted snippet ends here; the rest of the method is not shown)
{code}
[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657445#action_12657445 ] markrmil...@gmail.com edited comment on LUCENE-1483 at 12/17/08 8:59 AM: --- Hey Mike, how about this one? BooleanScorer can collect hits out of order if you force it (against the contract). I think it's an issue with basedoc type stuff. Actually, I'll clarify that - I think it's an issue with the multiple reader mojo - didn't mean to put it solely on adding bases in particular yet.
RE: [Fwd: Re: 2.9, 3.0 and deprecation]
Hello Patrick, You are almost right about what the trie algorithm does. The idea behind it is to match as many documents as possible per term, so that the number of TermDocs seeks stays low. This is done by using the most precise terms (which match only a few documents) for the borders of the range and the least precise terms (which match more documents) for the center of the range. Because of this, the maximum number of TermDocs seeks has a hard upper bound that depends only on the trie parameters, not on the index size or on how large the range is [see javadocs and LUCENE-1470 for numbers]. As a result, all ranges execute in about the same time. Uwe
- UWE SCHINDLER Webserver/Middleware Development PANGAEA - Publishing Network for Geoscientific and Environmental Data MARUM - University of Bremen Room 2500, Leobener Str., D-28359 Bremen Tel.: +49 421 218 65595 Fax: +49 421 218 65505 http://www.pangaea.de/ E-mail: uschind...@pangaea.de
_ From: patrick o'leary [mailto:polear...@aol.com] Sent: Tuesday, December 16, 2008 4:51 PM To: java-dev@lucene.apache.org Subject: Re: [Fwd: Re: 2.9, 3.0 and deprecation] Yes, typo.. long day yesterday Uwe Schindler wrote: I've only read through the jdoc of tier so far, but I'm guessing it's doing a dictionary search and splitting the index reader's position based on the result being less than or greater than upper / lower values. Which may be faster than a TermDocs seek, and certainly worth while investigating. Do you mean JDOC of Trie here? Uwe
-- Patrick O'Leary AOL Local Search Technologies Phone: + 1 703 265 8763 You see, wire telegraph is a kind of a very, very long cat. You pull his tail in New York and his head is meowing in Los Angeles. Do you understand this? And radio operates exactly the same way: you send signals here, they receive them there. The only difference is that there is no cat. - Albert Einstein http://www.linkedin.com/in/pjaol
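To make the "precise terms at the borders, coarse terms in the center" idea from Uwe's message concrete, here is a rough sketch of how a numeric range might be split across precision levels. It is an illustration only and not the LUCENE-1470 code: the names are hypothetical, it handles non-negative longs only, and "a term at shift s" is assumed to mean a value with its lowest s bits dropped, so that one such term matches 2^s adjacent values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of trie-style range splitting; not the contrib implementation.
final class TrieRangeSketch {
    // Every value v with lo <= v <= hi, to be matched by terms indexed at the given shift.
    static final class SubRange {
        final long lo;
        final long hi;
        final int shift;

        SubRange(long lo, long hi, int shift) {
            this.lo = lo;
            this.hi = hi;
            this.shift = shift;
        }
    }

    // Splits [min, max] into sub-ranges; step is the number of bits dropped per level.
    static List<SubRange> split(long min, long max, int step) {
        if (min < 0 || max < min || step < 1 || step > 31) {
            throw new IllegalArgumentException("need 0 <= min <= max and 1 <= step <= 31");
        }
        List<SubRange> out = new ArrayList<>();
        for (int shift = 0; ; shift += step) {
            long diff = 1L << (shift + step);
            long mask = ((1L << step) - 1L) << shift;
            // Does the lower/upper border stick out of a full block at the next level?
            boolean hasLower = (min & mask) != 0L;
            boolean hasUpper = (max & mask) != mask;
            long nextMin = (hasLower ? min + diff : min) & ~mask;
            long nextMax = (hasUpper ? max - diff : max) & ~mask;
            boolean done = shift + step >= 63 || nextMin > nextMax
                    || nextMin < min || nextMax > max;
            if (done) {
                // No coarser level fits inside the range: emit what is left here.
                out.add(range(min, max, shift));
                break;
            }
            if (hasLower) {
                out.add(range(min, min | mask, shift)); // precise terms, lower border
            }
            if (hasUpper) {
                out.add(range(max & ~mask, max, shift)); // precise terms, upper border
            }
            min = nextMin;
            max = nextMax;
        }
        return out;
    }

    private static SubRange range(long lo, long hi, int shift) {
        // A term at this shift covers all values sharing its upper bits.
        return new SubRange(lo, hi | ((1L << shift) - 1L), shift);
    }
}
```

For example, split(3, 42, 2) yields five sub-ranges: 3..3 and 40..42 at shift 0, 4..15 and 32..39 at shift 2, and 16..31 at shift 4. Each level contributes at most two border sub-ranges, which is why the number of terms visited depends on the step and value width rather than on the width of the query range.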
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657410#action_12657410 ] Michael McCandless commented on LUCENE-1483: {quote} so that they would know to implement that method for 3.0 when we would take out the wrapper. {quote} Right but the new insight (for me at least) is it's OK for external collectors to not code to the expert API. Ie previously we wanted to force migration to the expert API, but now I think it's OK to allow normal API and expert API to exist together. Change IndexSearcher to use MultiSearcher semantics for multiple subreaders --- Key: LUCENE-1483 URL: https://issues.apache.org/jira/browse/LUCENE-1483 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch FieldCache and Filters are forced down to a single segment reader, allowing for individual segment reloading on reopen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657579#action_12657579 ] Mark Miller commented on LUCENE-1483: - {quote}Hmmm... right. You mean if you pass in allowDocsOutOfOrder=true (defaults to false). I think this should not be a problem? (Though, I really don't fully understand BooleanScorer!). Since we are running scoring per-segment, each segment might collect its docIDs out of order, but all such docs are still within the current segment. Then when we advance to the new segment, the collector can do something if it needs to, and then collection proceeds again on the next segment's docs, possibly out of order. Ie, the out-of-orderness never jumps across a segment and then back again?{quote} I was off base with my guess - it's actually only using one reader for that test (3 or 4 docs). It's got to be that the HitCollector the out-of-order scorer uses needs to be tweaked. Last tests to fix.
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657586#action_12657586 ] Mark Miller commented on LUCENE-1483: - Hmmm...working more with ints as ords rather than double...it gives us ints but it complicates things a bit. Before, the only ords that had to be sorted and suborded were ones that didn't map on the new Reader exactly. With an int ord, *everything* you add is going to collide, and you need the ords in the queue added to the double lists and you need to fall down to the subord much more often... interesting... I guess I'll go with it for now though... Change IndexSearcher to use MultiSearcher semantics for multiple subreaders --- Key: LUCENE-1483 URL: https://issues.apache.org/jira/browse/LUCENE-1483 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch FieldCache and Filters are forced down to a single segment reader, allowing for individual segment reloading on reopen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657594#action_12657594 ] Michael McCandless commented on LUCENE-1483: Hang on -- if the value carries over to the new segment (and you set subord to 0) then you don't need to add those ords to the double lists? Change IndexSearcher to use MultiSearcher semantics for multiple subreaders --- Key: LUCENE-1483 URL: https://issues.apache.org/jira/browse/LUCENE-1483 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch FieldCache and Filters are forced down to a single segment reader, allowing for individual segment reloading on reopen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-504) FuzzyQuery produces a java.lang.NegativeArraySizeException in PriorityQueue.initialize if I use Integer.MAX_VALUE as BooleanQuery.MaxClauseCount
[ https://issues.apache.org/jira/browse/LUCENE-504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657602#action_12657602 ] George Papas commented on LUCENE-504: - Hi, This is still an issue in 2.4.0. I know this is low priority, but has there been any more thinking about how to address this? Thanks George. FuzzyQuery produces a java.lang.NegativeArraySizeException in PriorityQueue.initialize if I use Integer.MAX_VALUE as BooleanQuery.MaxClauseCount -- Key: LUCENE-504 URL: https://issues.apache.org/jira/browse/LUCENE-504 Project: Lucene - Java Issue Type: Bug Components: Search Affects Versions: 1.9 Reporter: Joerg Henss Priority: Minor Attachments: BooleanQuery.java.diff, fuzzyquery.patch, PriorityQueue.java.diff, TestFuzzyQueryError.java PriorityQueue creates a java.lang.NegativeArraySizeException when initialized with Integer.MAX_VALUE, because Integer overflows. I think this could be a general problem with PriorityQueue. The error occurred when I set BooleanQuery.MaxClauseCount to Integer.MAX_VALUE and used a FuzzyQuery for searching.
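The overflow described in LUCENE-504 above is easy to reproduce in isolation. The sketch below assumes, as an illustration only, that the queue sizes its backing array as maxSize + 1; the class and method names are hypothetical and this is not the attached PriorityQueue.java.diff.

```java
// Hypothetical illustration of the int overflow; not Lucene's PriorityQueue.
final class HeapSizeSketch {
    // Naive sizing: wraps around to Integer.MIN_VALUE when maxSize is Integer.MAX_VALUE.
    static int naiveHeapSize(int maxSize) {
        return maxSize + 1;
    }

    // One possible guard: reject sizes that cannot be represented instead of wrapping.
    static int checkedHeapSize(int maxSize) {
        if (maxSize < 0 || maxSize == Integer.MAX_VALUE) {
            throw new IllegalArgumentException("maxSize must be in [0, Integer.MAX_VALUE - 1]");
        }
        return maxSize + 1;
    }
}
```

Passing the naive result for Integer.MAX_VALUE to new Object[...] throws NegativeArraySizeException, which matches the symptom in the report; whether to reject such a size, clamp it, or grow the heap lazily is a design decision this sketch does not try to settle.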
[jira] Updated: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1483: Attachment: LUCENE-1483.patch This patch is entering the finishing stages I think. This one is pretty much functionally complete and all tests should pass. There is still a bunch of polish to be done though. There are still the following sort types: SortField.STRING_VAL, STRING_ORD, STRING_ORD_VAL, and STRING is currently set to straight ord. I think the ord case is still pretty slow, I'm sure there are still a few optimizations left, but it would be nice to see where it's at. There is still an issue with custom FieldComparators - they are currently passed the top level reader in the hook - this still needs to be addressed somehow. We also need a test for one. - Mark Change IndexSearcher to use MultiSearcher semantics for multiple subreaders --- Key: LUCENE-1483 URL: https://issues.apache.org/jira/browse/LUCENE-1483 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch FieldCache and Filters are forced down to a single segment reader, allowing for individual segment reloading on reopen.
[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657608#action_12657608 ] markrmil...@gmail.com edited comment on LUCENE-1483 at 12/17/08 3:29 PM: This patch is entering the finishing stages, I think. This one is pretty much functionally complete and all tests should pass. There is still a bunch of polish to be done though. There are still the following sort types: SortField.STRING_VAL, STRING_ORD, STRING_ORD_VAL, and STRING is currently set to straight ord. I think the ord case is still pretty slow; I'm sure there are still a few optimizations left, but it would be nice to see where it's at. There is still an issue with custom FieldComparators - they are currently passed the top-level reader in the hook - this still needs to be addressed somehow. We also need a test for one. - Mark (ignore the couple of setDocBases you see in contrib - I've got 'em) was (Author: markrmil...@gmail.com): This patch is entering the finishing stages, I think. This one is pretty much functionally complete and all tests should pass. There is still a bunch of polish to be done though. There are still the following sort types: SortField.STRING_VAL, STRING_ORD, STRING_ORD_VAL, and STRING is currently set to straight ord. I think the ord case is still pretty slow; I'm sure there are still a few optimizations left, but it would be nice to see where it's at. There is still an issue with custom FieldComparators - they are currently passed the top-level reader in the hook - this still needs to be addressed somehow. We also need a test for one. - Mark Change IndexSearcher to use MultiSearcher semantics for multiple subreaders --- Key: LUCENE-1483 URL: https://issues.apache.org/jira/browse/LUCENE-1483 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch FieldCache and Filters are forced down to a single segment reader, allowing for individual segment reloading on reopen.
[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657620#action_12657620 ] Mark Miller commented on LUCENE-1483: bq. Hang on - if the value carries over to the new segment (and you set subord to 0) then you don't need to add those ords to the double lists? What was actually happening: I noticed it wasn't quite working right after switching ords to ints from double, and I realized the problem was that there was always going to be a collision for the sort list, whereas before, there was only a sortable collision when more than one mapped-from ord collided. So I thought that out wrong and figured you needed to sort the current ord as well, but in fact, of course you don't: I just needed to assume there is always a collision that adds to the sort list, not wait for 2 mapped-from ords to collide. Change IndexSearcher to use MultiSearcher semantics for multiple subreaders --- Key: LUCENE-1483 URL: https://issues.apache.org/jira/browse/LUCENE-1483 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch FieldCache and Filters are forced down to a single segment reader, allowing for individual segment reloading on reopen.
[jira] Commented: (LUCENE-1465) NearSpansOrdered.getPayload does not return the payload from the minimum match span
[ https://issues.apache.org/jira/browse/LUCENE-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657662#action_12657662 ] Mark Miller commented on LUCENE-1465: This is an odd one, Jonathan. It's actually for the unordered case (the others were for the ordered). I am not exactly clear on what's going on yet. When I look at the payloads coming back, it would seem we are getting 0,7,7 when we should get 6,7,7. When I look at the offsets for the spans that I get the payloads from, though, they appear correct. It's returning the payloads from the right offsets, it seems, but somehow one of those payloads is from the term at position 0? Very odd. So when I debug in, it does indeed look like the first match happens at index 6... but the term offsets are start: 2147483647, end: -2147483648 (that is, Integer.MAX_VALUE and Integer.MIN_VALUE). What the heck? This is going to take some more time... NearSpansOrdered.getPayload does not return the payload from the minimum match span --- Key: LUCENE-1465 URL: https://issues.apache.org/jira/browse/LUCENE-1465 Project: Lucene - Java Issue Type: Bug Components: Search Affects Versions: 2.4 Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 2.4.1, 2.9 Attachments: LUCENE-1465.patch, LUCENE-1465.patch, LUCENE-1465.patch, LUCENE-1465.patch, Test.java
[jira] Updated: (LUCENE-1327) TermSpans skipTo() doesn't always move forwards
[ https://issues.apache.org/jira/browse/LUCENE-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1327: Fix Version/s: (was: 2.3.3) 2.9 TermSpans skipTo() doesn't always move forwards --- Key: LUCENE-1327 URL: https://issues.apache.org/jira/browse/LUCENE-1327 Project: Lucene - Java Issue Type: Bug Components: Query/Scoring, Search Affects Versions: 1.9, 2.0.0, 2.1, 2.2, 2.3, 2.3.1, 2.3.2, 2.4, 2.9, 3.0 Reporter: Moti Nisenson Fix For: 2.9 In TermSpans (or the anonymous Spans class returned by SpanTermQuery, depending on the version), the skipTo() method is improperly implemented if the target doc is less than or equal to the current doc:

    public boolean skipTo(int target) throws IOException {
      // are we already at the correct position?
      if (doc >= target) {
        return true;
      }
      ...

This violates the correct behavior (as described in the Spans interface documentation), which is that skipTo() should always move forwards. In other words, the correct implementation would be:

    if (doc >= target) {
      return next();
    }

This bug causes particular problems if one wants to use the payloads feature: if one loads a payload, then performs a skipTo() to the same document, then tries to load the next payload, the spans has not changed position and it attempts to load the same payload again (which is an error).
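To make the "always move forwards" contract from LUCENE-1327 concrete, here is a small self-contained sketch. It does not use the real Spans interface; ForwardOnlySpans and its parallel docs/positions arrays are hypothetical stand-ins for a term's occurrences, with each entry playing the role of one span that could carry a payload.

```java
// Toy model of a span enumerator; not Lucene's TermSpans.
class ForwardOnlySpans {
    private final int[] docs;      // doc id of each occurrence, sorted ascending
    private final int[] positions; // position of each occurrence within its doc
    private int pos = -1;          // index of the current occurrence, -1 before the first

    ForwardOnlySpans(int[] docs, int[] positions) {
        this.docs = docs;
        this.positions = positions;
    }

    // Advance to the next occurrence; false once exhausted.
    boolean next() {
        if (pos < docs.length) {
            pos++;
        }
        return pos < docs.length;
    }

    int doc() { return docs[pos]; }

    int position() { return positions[pos]; }

    // Moves to the first occurrence AFTER the current one whose doc is >= target.
    // Even when the current doc already satisfies the target, we still advance.
    boolean skipTo(int target) {
        if (pos >= 0 && pos < docs.length && docs[pos] >= target) {
            return next();
        }
        while (next()) {
            if (docs[pos] >= target) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ForwardOnlySpans spans = new ForwardOnlySpans(new int[] {1, 1, 3}, new int[] {0, 4, 2});
        spans.next();
        spans.skipTo(1); // same doc as the current one, but the position still changes
        System.out.println(spans.doc() + ":" + spans.position());
    }
}
```

With the buggy variant (returning true without advancing), the skipTo(1) call above would leave the enumerator on the first occurrence, which is exactly the situation where a second payload load hits the same payload again.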
[jira] Updated: (LUCENE-1405) Support for new Resources model in ant 1.7 in Lucene ant task.
[ https://issues.apache.org/jira/browse/LUCENE-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1405: Fix Version/s: (was: 2.3.3) 2.9 Support for new Resources model in ant 1.7 in Lucene ant task. -- Key: LUCENE-1405 URL: https://issues.apache.org/jira/browse/LUCENE-1405 Project: Lucene - Java Issue Type: Improvement Components: contrib/* Affects Versions: 2.3.2 Reporter: Przemyslaw Sztoch Fix For: 2.9 Attachments: lucene-ant1.7-newresources.patch The Ant task for Lucene should use the modern Resource model (not only the FileSet child element). There is a patch with the required changes.

Supported by both the old (ant 1.6) and new (ant 1.7) resources model:

    <index> <!-- Lucene Ant Task -->
      <fileset ... />
    </index>

Supported only by the new (ant 1.7) resources model:

    <index> <!-- Lucene Ant Task -->
      <filelist ... />
    </index>

    <index> <!-- Lucene Ant Task -->
      <userdefinied-filesource ... />
    </index>
[jira] Updated: (LUCENE-1361) QueryParser should have a setDateFormat(DateFormat) method
[ https://issues.apache.org/jira/browse/LUCENE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1361: Fix Version/s: (was: 2.3.3) 2.9 QueryParser should have a setDateFormat(DateFormat) method -- Key: LUCENE-1361 URL: https://issues.apache.org/jira/browse/LUCENE-1361 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.3.2 Reporter: ocean Priority: Minor Fix For: 2.9 Currently the only way to change the date format used by QueryParser.java is to override the getRangeQuery method. This seems a bit excessive to me. Since QueryParser isn't threadsafe (like DateFormat), I would suggest that a DateFormat field be introduced (protected DateFormat dateFormat) along with a setter (public void setDateFormat(DateFormat format)) so that it's easier to customize the date format in queries. If there are good reasons against this (can't imagine, but who knows), why not introduce a protected 'DateFormat createDateFormat()' method so that, again, it's easier for clients to override this logic?
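A rough sketch of what the LUCENE-1361 proposal could look like follows. This is not QueryParser itself: DateFormatParserSketch, getDateFormat() and parseDate() are hypothetical names used only to show the proposed field and setter in isolation, and the Locale.US short-date fallback is an assumption made for the example.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical stand-in for a parser that needs a configurable date format.
class DateFormatParserSketch {

    // The field suggested in the issue; null means "fall back to the default".
    protected DateFormat dateFormat;

    // The setter suggested in the issue.
    public void setDateFormat(DateFormat format) {
        this.dateFormat = format;
    }

    // Assumed default: a short, locale-specific date format.
    protected DateFormat getDateFormat() {
        return dateFormat != null
            ? dateFormat
            : DateFormat.getDateInstance(DateFormat.SHORT, Locale.US);
    }

    // Where a range-query implementation would turn a term into a Date.
    public Date parseDate(String text) throws ParseException {
        return getDateFormat().parse(text);
    }

    public static void main(String[] args) throws ParseException {
        SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        iso.setTimeZone(TimeZone.getTimeZone("UTC"));

        DateFormatParserSketch parser = new DateFormatParserSketch();
        parser.setDateFormat(iso);
        System.out.println(parser.parseDate("2008-12-17").getTime());
    }
}
```

Because DateFormat instances are not thread-safe, a parser configured this way should stay confined to one thread, which matches the issue's observation that QueryParser is not thread-safe either.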