[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737424#action_12737424 ] Hoss Man commented on LUCENE-1749: -- Quick responses to some other comments... bq. I chos

[jira] Updated: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1749: - Attachment: LUCENE-1749-hossfork.patch This is a complete overhaul of the internals of FieldCacheSanityC

[jira] Commented: (LUCENE-1460) Change all contrib TokenStreams/Filters to use the new TokenStream API

2009-07-30 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737414#action_12737414 ] Michael Busch commented on LUCENE-1460: --- I wonder if we should just deprecate Prefix

Re: [jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Michael Busch
On 7/30/09 4:10 AM, Michael McCandless wrote: Plus, the original motivation for this (LUCENE-1195) was because queries in general look up the same term at least 2 times during their execution (weight (idf computation), get postings), and so I think we wanted to ensure that a single thread doing i

[jira] Closed: (LUCENE-1772) Upgrade Clover to 2.5.1

2009-07-30 Thread Nick Pellow (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pellow closed LUCENE-1772. --- Resolution: Duplicate Lucene Fields: (was: [Patch Available, New]) This was a duplicate of

[jira] Updated: (LUCENE-1769) Fix wrong clover analysis because of backwards-tests, upgrade clover to 2.4.3 or better

2009-07-30 Thread Nick Pellow (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pellow updated LUCENE-1769: Attachment: nicks-LUCENE-1769.patch A patch to upgrade to Clover 2.5.1. Clover 2.5.1 can be down

[jira] Issue Comment Edited: (LUCENE-1769) Fix wrong clover analysis because of backwards-tests, upgrade clover to 2.4.3 or better

2009-07-30 Thread Nick Pellow (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737366#action_12737366 ] Nick Pellow edited comment on LUCENE-1769 at 7/30/09 4:59 PM: --

[jira] Commented: (LUCENE-1769) Fix wrong clover analysis because of backwards-tests, upgrade clover to 2.4.3 or better

2009-07-30 Thread Nick Pellow (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737366#action_12737366 ] Nick Pellow commented on LUCENE-1769: - Hi Uwe, Great work getting Clover upgraded. I

[jira] Created: (LUCENE-1772) Upgrade Clover to 2.5.1

2009-07-30 Thread Nick Pellow (JIRA)
Upgrade Clover to 2.5.1 --- Key: LUCENE-1772 URL: https://issues.apache.org/jira/browse/LUCENE-1772 Project: Lucene - Java Issue Type: Task Components: Build Reporter: Nick Pellow Lucene is current

Re: SegmentReader field cache merging?

2009-07-30 Thread Mark Miller
Perhaps in separate patch that's limited to field cache merging we can simply modify our existing field cache code (i.e. not rewrite field caching in general) (in conjunction with IW.getReader and segment merging) to automatically (or with a settings callback in IW for which fields should be auto

SegmentReader field cache merging?

2009-07-30 Thread Jason Rutherglen
I know this has been somewhat stuck in LUCENE-831 which seems to have blown up quite a bit over time and is untouched of late? Perhaps in separate patch that's limited to field cache merging we can simply modify our existing field cache code (i.e. not rewrite field caching in general) (in conjunct

[jira] Resolved: (LUCENE-1695) Update the Highlighter to use the new TokenStream API

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1695. - Resolution: Fixed I've committed this. We can reopen if someone brings up a new argument. Puttin

[jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737314#action_12737314 ] Mark Miller commented on LUCENE-1486: - Alright, then - do you have time to handle that

[jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-30 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737311#action_12737311 ] Michael Busch commented on LUCENE-1486: --- +1 for moving it to contrib. Then the users

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737294#action_12737294 ] Mark Miller commented on LUCENE-1749: - This issue was a fantastic idea by the way! >

[jira] Commented: (LUCENE-1771) Using explain may double ram reqs for fieldcaches when using ValueSourceQuery/CustomScoreQuery

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737288#action_12737288 ] Mark Miller commented on LUCENE-1771: - reminder to add warning for custom queries - yo

[jira] Updated: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1749: Attachment: LUCENE-1749.patch Here is a rough draft for an explain fix. Explain for custom and va

Re: [jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-30 Thread Mark Miller
I'd almost rather maintain fixes in the JIRA issue. Because it's just two separate classes, I think that's a good enough central place? I'm not really opposed to putting it in contrib, but the replacement is not likely to be exactly the same, and I don't know that we should distribute something we pl

[jira] Created: (LUCENE-1771) Using explain may double ram reqs for fieldcaches when using ValueSourceQuery/CustomScoreQuery

2009-07-30 Thread Mark Miller (JIRA)
Using explain may double ram reqs for fieldcaches when using ValueSourceQuery/CustomScoreQuery -- Key: LUCENE-1771 URL: https://issues.apache.org/jira/browse/LUCENE-1771

[jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-30 Thread Mark Harwood (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737270#action_12737270 ] Mark Harwood commented on LUCENE-1486: -- No objections to pulling from core given the

[jira] Issue Comment Edited: (LUCENE-1769) Fix wrong clover analysis because of backwards-tests, upgrade clover to 2.4.3 or better

2009-07-30 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737184#action_12737184 ] Uwe Schindler edited comment on LUCENE-1769 at 7/30/09 1:16 PM:

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737254#action_12737254 ] Mark Miller commented on LUCENE-1749: - bq. If the CustomScoreQuery class(es) push the

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737247#action_12737247 ] Mark Miller commented on LUCENE-1749: - bq. (BTW: random thought that occurred to me la

[jira] Commented: (LUCENE-1745) Add ability to specify compilation/matching flags to RegexCapabiltiies implementations

2009-07-30 Thread Marc Zampetti (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737228#action_12737228 ] Marc Zampetti commented on LUCENE-1745: --- What is the status of having this patch rev

[jira] Updated: (LUCENE-1745) Add ability to specify compilation/matching flags to RegexCapabiltiies implementations

2009-07-30 Thread Marc Zampetti (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marc Zampetti updated LUCENE-1745: -- Fix Version/s: 2.4.2 Lucene Fields: [New, Patch Available] (was: [New]) > Add ability to

[jira] Updated: (LUCENE-1745) Add ability to specify compilation/matching flags to RegexCapabiltiies implementations

2009-07-30 Thread Marc Zampetti (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marc Zampetti updated LUCENE-1745: -- Fix Version/s: (was: 2.4.2) Incorrectly added the "Fixed Version" in the last update. > A

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737217#action_12737217 ] Hoss Man commented on LUCENE-1749: -- Mark: thanks for looking into the tests. If the Cust

[jira] Issue Comment Edited: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737212#action_12737212 ] Mark Miller edited comment on LUCENE-1486 at 7/30/09 11:17 AM: -

[jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737212#action_12737212 ] Mark Miller commented on LUCENE-1486: - Okay, so I guess the question is - who objects

[jira] Updated: (LUCENE-1567) New flexible query parser

2009-07-30 Thread Adriano Crestani (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adriano Crestani updated LUCENE-1567: - Attachment: (was: lucene_trunk_FlexQueryParser_2009july29_v11.patch) > New flexible

Re: ConcurrentMergeScheduler and MergePolicy question

2009-07-30 Thread Jason Rutherglen
If the app is creating many small segments LUCENE-1313 will help by keeping them in ram until they are too large. Smaller segments will be merged into a large segment ram -> disk. Then disk -> disk is faster as we're only merging larger segments. IW will not pause while writing the hopefully fairly

[jira] Created: (LUCENE-1770) WikipediaQueryMaker

2009-07-30 Thread Mark Miller (JIRA)
WikipediaQueryMaker --- Key: LUCENE-1770 URL: https://issues.apache.org/jira/browse/LUCENE-1770 Project: Lucene - Java Issue Type: New Feature Reporter: Mark Miller Priority: Trivial Attachmen

[jira] Updated: (LUCENE-1770) WikipediaQueryMaker

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1770: Attachment: LUCENE-1770.patch > WikipediaQueryMaker > --- > > Key:

[jira] Updated: (LUCENE-1769) Fix wrong clover analysis because of backwards-tests, upgrade clover to 2.4.3 or better

2009-07-30 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1769: -- Attachment: LUCENE-1769.patch This patch uses the features of clover 2.0. We can only commit

[jira] Created: (LUCENE-1769) Fix wrong clover analysis because of backwards-tests, upgrade clover to 2.4.3 or better

2009-07-30 Thread Uwe Schindler (JIRA)
Fix wrong clover analysis because of backwards-tests, upgrade clover to 2.4.3 or better --- Key: LUCENE-1769 URL: https://issues.apache.org/jira/browse/LUCENE-1769 Pr

Re: ConcurrentMergeScheduler and MergePolicy question

2009-07-30 Thread Mark Miller
Michael McCandless wrote: On the impact of search performance for large vs small mergeFactors, I think the jury is still out. People should keep testing that (and report back!). Right - I think things may have changed a bit with per segment as well - you have to pay a bit more for more seg

Re: ConcurrentMergeScheduler and MergePolicy question

2009-07-30 Thread Michael McCandless
The merge selection (LogMergePolicy) tries to merge "roughly" equal sized (measured in bytes) segments together, so it creates a "roughly" log-staircase pattern. I agree, in an NRT app, larger mergeFactor is likely best since it minimizes reopen time overall. It's also important to setMergedSegme

[jira] Commented: (LUCENE-1695) Update the Highlighter to use the new TokenStream API

2009-07-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737129#action_12737129 ] Mark Miller commented on LUCENE-1695: - Alright - this is no idle threat. I'm gonna com

Re: ConcurrentMergeScheduler and MergePolicy question

2009-07-30 Thread Shai Erera
I think that when LUCENE-1750 is finished, you will be able to: 1) Create a MergePolicy that limits the segments size it's about to merge to a certain size. 2) Then have a daemon or something that runs on "idle" times and calls optimize(maxNumSegments), or even open a new writer w/ the default merg

Re: ConcurrentMergeScheduler and MergePolicy question

2009-07-30 Thread Mark Miller
bq. we've always said to keep the merge factor small for search reasons, at least in the high-update case. I think we have been wrong. A bunch of segments vs optimized is about the same speed I think. I'd always read that too, but Mike said it didn't make sense once, and some simple testing seemed t

Re: ConcurrentMergeScheduler and MergePolicy question

2009-07-30 Thread Grant Ingersoll
Note also response from Mike that talks a little bit about something along these lines: http://www.lucidimagination.com/search/document/fa990adba4d2572b/is_there_a_way_to_control_when_merges_happen#f6f0bfeef4bf9a39 -Grant On Jul 30, 2009, at 10:35 AM, Grant Ingersoll wrote: Given a large segm

ConcurrentMergeScheduler and MergePolicy question

2009-07-30 Thread Grant Ingersoll
Given a large segment and a bunch of small segments, how does the ConcurrentMergeScheduler (CMS) work? Does it always merge the smaller segments into the bigger one, or does it merge the smaller segments together? Something I've been thinking about: Given a high update environment (and

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Carl Austin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737107#action_12737107 ] Carl Austin commented on LUCENE-1690: - The cache in terminfosreader is for everything

Re: [jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Richard Marr
2009/7/30 Michael McCandless : > Good question... Good answer. Thanks. I guess the next step then is to understand why the TermInfo cache isn't getting the performance to where it could be. It'll take me a while to get to the point where I can answer that question. If anyone's in a hurry it'd pro

Re: [jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Michael McCandless
On Thu, Jul 30, 2009 at 6:28 AM, Richard Marr wrote: > Yeah, having this stuff stored centrally behind the IndexReader seems > like a better idea than having it in client classes. My shallow > knowledge of the code isn't helping me explain why it's not performing > though. > > Out of interest, how

Re: backwards compat tests

2009-07-30 Thread Michael McCandless
On Wed, Jul 29, 2009 at 5:11 PM, Uwe Schindler wrote: >> > My suggestion was to write the build script in a way that it checks out >> the >> > branch with the same revision number as the current base dir (trunk). >> >> I think this would work, as long as we always commit top-level and >> back-compa

Re: [jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Richard Marr
Yeah, having this stuff stored centrally behind the IndexReader seems like a better idea than having it in client classes. My shallow knowledge of the code isn't helping me explain why it's not performing though. Out of interest, how come it's a per-thread cache? I don't understand all the issues

Re: Build failed in Hudson: Lucene-trunk #902

2009-07-30 Thread Michael McCandless
On Thu, Jul 30, 2009 at 6:14 AM, Uwe Schindler wrote: > I forgot to mention: only works for clover 2.x this is why I > updated. The update was simple, I only had to change one line in > common-build.xml and add the testsources tag with the current > junit.include/exclude and ASCIIFoldingFilter e

RE: Build failed in Hudson: Lucene-trunk #902

2009-07-30 Thread Uwe Schindler
> >> > I found out that clover-setup supports a special advanced tag > >> > "<testsources>": > >> > <testsources> is an Ant fileset which should only be used if Clover's > >> > default test detection is not adequate. Clover's default test > detection > >> > algorithm is used to distinguish test cases if this element is > om

Re: Build failed in Hudson: Lucene-trunk #902

2009-07-30 Thread Michael McCandless
On Thu, Jul 30, 2009 at 5:53 AM, Uwe Schindler wrote: >> > I found out that clover-setup supports a special advanced tag >> > "<testsources>": >> > <testsources> is an Ant fileset which should only be used if Clover's >> > default test detection is not adequate. Clover's default test detection >> > algorithm is used to dist

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737059#action_12737059 ] Michael McCandless commented on LUCENE-1690: OK now I feel silly -- this cache

RE: Build failed in Hudson: Lucene-trunk #902

2009-07-30 Thread Uwe Schindler
> > I found out that clover-setup supports a special advanced tag > > "<testsources>": > > <testsources> is an Ant fileset which should only be used if Clover's > > default test detection is not adequate. Clover's default test detection > > algorithm is used to distinguish test cases if this element is omitted. > > That so

Re: Build failed in Hudson: Lucene-trunk #902

2009-07-30 Thread Michael McCandless
On Thu, Jul 30, 2009 at 4:51 AM, Uwe Schindler wrote: >> I'm guessing it was the empty source file I accidentally left in for >> LUCENE-1754, that Hoss removed (thanks!). I think clover saw that as >> an attempt to instrument a source in the empty-string package. >> >> I'm unfamiliar w/ how to conf

[jira] Commented: (LUCENE-1763) MergePolicy should require an IndexWriter upon construction

2009-07-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737054#action_12737054 ] Michael McCandless commented on LUCENE-1763: Patch looks good Shai! Only chan

[jira] Updated: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Richard Marr (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Marr updated LUCENE-1690: - Attachment: LUCENE-1690.patch This is the latest version. I wasn't working on it at quite such a

RE: Build failed in Hudson: Lucene-trunk #902

2009-07-30 Thread Uwe Schindler
> I'm guessing it was the empty source file I accidentally left in for > LUCENE-1754, that Hoss removed (thanks!). I think clover saw that as > an attempt to instrument a source in the empty-string package. > > I'm unfamiliar w/ how to configure clover, but I agree we should make > sure it's testi

[jira] Created: (LUCENE-1768) NumericRange support for new query parser

2009-07-30 Thread Uwe Schindler (JIRA)
NumericRange support for new query parser - Key: LUCENE-1768 URL: https://issues.apache.org/jira/browse/LUCENE-1768 Project: Lucene - Java Issue Type: New Feature Components: QueryParser

[jira] Commented: (LUCENE-1567) New flexible query parser

2009-07-30 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737031#action_12737031 ] Uwe Schindler commented on LUCENE-1567: --- {quote} Can you create a new "jira issue" w

[jira] Updated: (LUCENE-1567) New flexible query parser

2009-07-30 Thread Adriano Crestani (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adriano Crestani updated LUCENE-1567: - Attachment: lucene_trunk_FlexQueryParser_2009july30_v12.patch {quote} There is something

[jira] Commented: (LUCENE-1567) New flexible query parser

2009-07-30 Thread Luis Alves (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737022#action_12737022 ] Luis Alves commented on LUCENE-1567: Hi Adriano There is something wrong with your pa