[jira] Created: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Ning Li (JIRA)
Resources not always reclaimed in scorers after each search --- Key: LUCENE-686 URL: http://issues.apache.org/jira/browse/LUCENE-686 Project: Lucene - Java Issue Type: Bug Com

[jira] Updated: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-686?page=all ] Ning Li updated LUCENE-686: --- Attachment: ScorerResourceGC.patch A patch is attached: - The patch is based on the latest version from trunk. - The patch includes a test called TestScorerResourceG

Include BM25 in Lucene?

2006-10-17 Thread J.Zhu
Hi, All, I am an enthusiastic user of Lucene and it is very helpful to my projects at hand. As probabilistic models such as BM25 are very popular among research communities now, do you have any plan to incorporate some of them in future Lucene release? I believe that will make Lucene even more pop

Re: Include BM25 in Lucene?

2006-10-17 Thread Grant Ingersoll
Hi Jianhan, I am not aware, however, of anyone working on a BM25 implementation. We are a volunteer project, though, so we are always open to contributions! -Grant On Oct 17, 2006, at 5:50 AM, J.Zhu wrote: Hi, All, I am an enthusiastic user of Lucene and it is very helpful to my proje

RE: Include BM25 in Lucene?

2006-10-17 Thread J.Zhu
Hi, Grant, If I would like to contribute, what should I do? I am not a good Java developer myself though. Can I work with someone also interested? Jianhan -Original Message- From: Grant Ingersoll [mailto:[EMAIL PROTECTED] Sent: 17 October 2006 11:56 To: java-dev@lucene.apache.org Subje

jira workflow

2006-10-17 Thread Steven Parkes
As a member of a number of Lucene subprojects dev lists, I've been comparing the way the Jira workflow is used on the different projects. In particular, I've been noting the difference between the workflows that Lucene Java and Hadoop use. Hadoop in particular has a state for "patch submitted" whi

Re: Include BM25 in Lucene?

2006-10-17 Thread Vic Bancroft
J.Zhu wrote: If I would like to contribute, what should I do? I am not a good Java developer myself though. Can I work with someone also interested? In some of my group's usage of Lucene over large document collections, we have split the documents across several machines. This has led to a

RE: email replies to JIRA issues

2006-10-17 Thread Steven Parkes
Hmmm ... do we want to change the default reply-to? I'd be +1 for that. The stuff would appear on dev anyway, right, because the Jira events get mailed there? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley Actually, it looks like only repli

Re: jira workflow

2006-10-17 Thread Erik Hatcher
Steven, Thanks for this prod. I've been meaning to debrief the Lucene dev group on a recent visit I made to IBM's Silicon Valley Labs where I presented Lucene and met with their search gurus. A recurring theme from them was the management of the Java Lucene project, and what they and we can

Re: jira workflow

2006-10-17 Thread Grant Ingersoll
Looking into the payload patch is next on my list after I finish up the benchmarking, but I know the payload patch is going to require several sets of eyes and some discussion. I would like to see the more flexible index format addressed and I am not sure about the removal of Fieldable as

Re: email replies to JIRA issues

2006-10-17 Thread Yonik Seeley
On 10/17/06, Steven Parkes <[EMAIL PROTECTED]> wrote: Hmmm ... do we want to change the default reply-to? I'd be +1 for that. The stuff would appear on dev anyway, right, because the Jira events get mailed there? It's being looked into already, but the dev list adds its own reply-to, so it mig

[jira] Commented: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Paul Elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-686?page=comments#action_12442943 ] Paul Elschot commented on LUCENE-686: - Three things: Is there an actual memory leak problem related to this? In ReqExclScorer the two scorers can also be clos

RE: Include BM25 in Lucene?

2006-10-17 Thread J.Zhu
Hi, Vic, Unfortunately BM25 uses IDF as well so splitting documents across machines will also affect it. How about storing these as global statistical data for sharing the search on these machines? The equation of BM25 is clearly stated in Robertson's paper "Simple, proven approaches to text retr
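For readers following the thread, the per-term BM25 weight under discussion can be sketched as below. This is a minimal illustration of the commonly cited form of the formula, not code from Lucene or from any patch mentioned here. The class name Bm25Sketch, the method names, and the choice of ln(N / n) as the IDF variant are assumptions made for the example; other IDF variants are in use.

```java
/**
 * Minimal sketch of a per-term BM25 weight (hypothetical class, not Lucene API).
 * Assumes docFreq >= 1 and avgDocLen > 0.
 */
public class Bm25Sketch {

    /** IDF variant assumed for this sketch: ln(N / n). Other variants exist. */
    public static double idf(long numDocs, long docFreq) {
        return Math.log((double) numDocs / (double) docFreq);
    }

    /**
     * Weight of one query term in one document.
     * tf: term frequency in the document; docLen / avgDocLen: length normalization;
     * k1 and b: the usual BM25 tuning parameters.
     */
    public static double termScore(double tf, double docLen, double avgDocLen,
                                   long numDocs, long docFreq, double k1, double b) {
        if (tf <= 0.0) {
            return 0.0; // a term that does not occur contributes nothing
        }
        // Documents longer than average are penalized, shorter ones are boosted.
        double lengthNorm = 1.0 - b + b * (docLen / avgDocLen);
        return idf(numDocs, docFreq) * (tf * (k1 + 1.0)) / (tf + k1 * lengthNorm);
    }
}
```

Both numDocs and docFreq appear in the IDF factor, which is why splitting a collection across machines affects BM25 in the same way it affects the existing IDF-based scoring.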

Re: email replies to JIRA issues

2006-10-17 Thread Chris Hostetter
: Actually, it looks like only replies going to [EMAIL PROTECTED] (the default) : will be added as comments. Our setup has the reply-to set to : [EMAIL PROTECTED] by default, so those probably won't be added. Is this controlled on a Project basis? ... I tried making an issue in the Solr project

[jira] Commented: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-686?page=comments#action_12442987 ] Ning Li commented on LUCENE-686: > Is there an actual memory leak problem related to this? Right now no. For example, in FS based directories, the index inputs te

Re: jira workflow

2006-10-17 Thread Chris Hostetter
: "patch submitted" which, as it's used on Hadoop, seems to facilitate : communication. State changes (open->patch submitted, : patch-submitted->open) seem to help communications between contributors : and reviewers. Looking at the Lucene Java Jira, sometimes "[patch]" is : put at the beginning of

Re: jira workflow

2006-10-17 Thread Doug Cutting
Steven Parkes wrote: Is there sufficient interest to consider this for Lucene Java? (I'd write "any interest", but since I'm interested, there's at least some.) +1 If there is agreement, I can gladly change Lucene Java to use Hadoop's Jira workflow. The other thing I was thinking of was th

Re: email replies to JIRA issues

2006-10-17 Thread Doug Cutting
Yonik Seeley wrote: It's being looked into already, but the dev list adds its own reply-to, so it might not be as simple as changing the reply-to in outbound JIRA messages. Right, that's the problem. Doug

[jira] Commented: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Paul Elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-686?page=comments#action_12443014 ] Paul Elschot commented on LUCENE-686: - In SpanTermQuery in the spans package, there is also a TermPositions that might need to be closed. This means that closin

Re: email replies to JIRA issues

2006-10-17 Thread Doug Cutting
Chris Hostetter wrote: Is this controlled on a Project basis? ... I tried making an issue in the Solr project (where i figured i'd be spamming less people) to experiment with this and couldn't get any messages sent to [EMAIL PROTECTED] to show up in the test issue i created... Jira's outgoing

Re: jira workflow

2006-10-17 Thread Doug Cutting
Chris Hostetter wrote: Perhaps we should have two new statuses: "rough patch available" and "polished patch available" ? The other thing i would like to see tracked better is the distinction between issues that have unit tests attached and issues that do not ... this is somewhat orthogonal to patch a

Re: jira workflow

2006-10-17 Thread Grant Ingersoll
Is it explicitly stated anywhere what the workflow is/should be? -Grant On Oct 17, 2006, at 1:51 PM, Doug Cutting wrote: Steven Parkes wrote: Is there sufficient interest to consider this for Lucene Java? (I'd write "any interest", but since I'm interested, there's at least some.) +1 I

Re: Java 1.5

2006-10-17 Thread Doug Cutting
Chuck Williams wrote: I think the last discussion ended with the main counter-argument being lack of support by gcj. Current top of GCJ News: *June 6, 2006* RMS approved the plan to use the Eclipse compiler as the new gcj front end. Work is being done on the gcj-eclipse branch; it can alread

Re: [jira] Commented: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Chris Hostetter
: But as long as there is no real memory leak, what is the point of : adding this close functionality? I think the concern is not so much that Lucene core as is has any leaks -- but that subclasses of core implementations have no mechanism for safely managing resources. Custom Directory implim

Re: jira workflow

2006-10-17 Thread Doug Cutting
Grant Ingersoll wrote: Is it explicitly stated anywhere what the workflow is/should be? I can't see the workflow when I'm not logged in. I think you might have to be a Jira administrator to view workflows. Some information about them is up at: http://www.atlassian.com/software/jira/docs/v

RE: jira workflow

2006-10-17 Thread Steven Parkes
Think it's up to us. Hadoop has customized their workflow. It's not the easiest process in the world. I know it takes a certain amount of effort to transition issues from one workflow to another, so should we want to do this, we should try to not plan on lots of tweaks? -Original Message-

Re: [jira] Commented: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Paul Elschot
On Tuesday 17 October 2006 20:32, Chris Hostetter wrote: > > : But as long as there is no real memory leak, what is the point of > : adding this close functionality? > > I think the concern is not so much that Lucene core as is has any leaks -- > but that subclasses of core implementations have

Re: jira workflow

2006-10-17 Thread Doug Cutting
Doug Cutting wrote: I've attached an image of it to this message. Well, that didn't work. Instead I added it to Hadoop's wiki: http://wiki.apache.org/lucene-hadoop-data/attachments/JiraWorkflow/attachments/workflow.png Doug

Re: [jira] Commented: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Chris Hostetter
: When custom Scorers and/or Directories need a close method, it can : also be provided by subclassing Scorer, IndexSearcher and Directory : in the custom code. : Not providing this close method in the Lucene core passes the message : that a working implementation is possible without it. it seems

Re: email replies to JIRA issues

2006-10-17 Thread Chris Hostetter
: > Is this controlled on a Project basis? ... I tried making an issue in the : > Solr project (where i figured i'd be spamming less people) to experiment : > with this and couldn't get any messages sent to [EMAIL PROTECTED] to show up in the : > test issue i created... : : Jira's outgoing messag

RE: jira workflow

2006-10-17 Thread Steven Parkes
Looking at some currently closed issues, it looks like it should be possible to add comments and links to closed issues. It provides the comment button anyway. I haven't tried to push it. We could test on the test SOLR issue. Looking at the Jira docs, it looks like you can configure closed issues

Re: [jira] Commented: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Paul Elschot
On Tuesday 17 October 2006 21:10, Chris Hostetter wrote: > > : When custom Scorers and/or Directories need a close method, it can > : also be provided by subclassing Scorer, IndexSearcher and Directory > : in the custom code. > : Not providing this close method in the Lucene core passes the messag

Re: Include BM25 in Lucene?

2006-10-17 Thread Chuck Williams
Vic Bancroft wrote on 10/17/2006 02:44 AM: > In some of my group's usage of Lucene over large document collections, > we have split the documents across several machines. This has led to > a concern of whether the inverse document frequency was appropriate, > since the score seems to be dependent
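The concern quoted above is that each machine computes IDF from its own slice of the collection. One way to picture the "global statistics" idea raised earlier in the thread is to sum the per-machine counts before computing IDF. The sketch below is only an illustration under that assumption; GlobalIdfSketch and ShardStats are hypothetical names, not Lucene classes, and ln(N / n) is again an assumed IDF variant.

```java
/**
 * Sketch: compute one collection-wide IDF from per-machine counts
 * (hypothetical classes, not Lucene API).
 */
public class GlobalIdfSketch {

    /** Counts reported by one machine for one term. */
    public static final class ShardStats {
        final long numDocs;  // documents held by this machine
        final long docFreq;  // of those, how many contain the term

        public ShardStats(long numDocs, long docFreq) {
            this.numDocs = numDocs;
            this.docFreq = docFreq;
        }
    }

    /** IDF over the whole collection: ln(sum of numDocs / sum of docFreq). */
    public static double globalIdf(ShardStats[] shards) {
        long totalDocs = 0;
        long totalDocFreq = 0;
        for (ShardStats s : shards) {
            totalDocs += s.numDocs;
            totalDocFreq += s.docFreq;
        }
        if (totalDocFreq == 0) {
            return 0.0; // assumed convention: a term seen nowhere contributes nothing
        }
        return Math.log((double) totalDocs / (double) totalDocFreq);
    }
}
```

With two machines of 1000 documents each, where a term occurs in 10 documents on one and 990 on the other, the local IDFs differ widely while the summed counts give a single value that every machine can use.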

RE: email replies to JIRA issues

2006-10-17 Thread Steven Parkes
It looks like there is per-project configuration: http://www.atlassian.com/software/jira/docs/latest/issue_creation_email.html. It also looks like it requires a mailbox per project, which [EMAIL PROTECTED] doesn't sound like. -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTE

Re: jira workflow

2006-10-17 Thread Doug Cutting
Steven Parkes wrote: My main concern is that things get lost in lists that grow without bound. I've never been too concerned about the size of the open bug list. It can be searched, if one wishes to find whether there's already an issue before filing a new one. The lists I try to keep small

Re: email replies to JIRA issues

2006-10-17 Thread Doug Cutting
Steven Parkes wrote: It looks like there is per-project configuration: http://www.atlassian.com/software/jira/docs/latest/issue_creation_email.html. It also looks like it requires a mailbox per project, which [EMAIL PROTECTED] doesn't sound like. That half of things is working. If you reply

Re: jira workflow

2006-10-17 Thread Grant Ingersoll
A comment from a committer or contributor should be sufficient to explain why something has not been committed, fixed or whatever, and what action might next be needed. The scarcest resource is committers. So we want to be able to focus their activities. "Patch Available" is a call to

[jira] Commented: (LUCENE-365) [PATCH] Performance improvement to DisjunctionSumScorer

2006-10-17 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-365?page=comments#action_12443056 ] Yonik Seeley commented on LUCENE-365: - Paul, would it be possible to get a version with a license granted to the ASF? Also, a single patch file would be preferr

[jira] Commented: (LUCENE-544) MultiFieldQueryParser field boost multiplier

2006-10-17 Thread Matt Ericson (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-544?page=comments#action_12443058 ] Matt Ericson commented on LUCENE-544: - I have been working on this exact same problem. Have you created any tests for it? I am attaching my version. My version

[jira] Updated: (LUCENE-544) MultiFieldQueryParser field boost multiplier

2006-10-17 Thread Matt Ericson (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-544?page=all ] Matt Ericson updated LUCENE-544: Attachment: QueryParserPatch This is my version of the Query Parser that will allow the users to boost some fields > MultiFieldQueryParser field boost multipl

Re: [jira] Commented: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Ning Li
A new scorer that requires reclaiming resources could be used by many other scorers such as boolean scorers and conjunction scorers. Then those scorers should have a closing method, and so should the ones that use those scorers... A general closing method would be better, wouldn't it?
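The propagation argument in this message can be sketched as follows: if any leaf scorer holds a resource, every scorer that wraps it needs a close method that forwards the call. The types below (ClosableScorer, LeafScorer, CompositeScorer) are hypothetical and exist only for this illustration; they are not the classes in the attached patch or in Lucene.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of close() propagating through nested scorers (hypothetical types). */
public class ScorerCloseSketch {

    /** Assumed minimal contract: a scorer that can release what it holds. */
    public interface ClosableScorer {
        void close();
    }

    /** Stands in for a scorer that owns a resource, e.g. an open input. */
    public static final class LeafScorer implements ClosableScorer {
        public boolean closed = false;

        @Override
        public void close() {
            closed = true; // a real scorer would release its resource here
        }
    }

    /** Stands in for boolean/conjunction-style scorers built from sub-scorers. */
    public static final class CompositeScorer implements ClosableScorer {
        private final List<ClosableScorer> subScorers;

        public CompositeScorer(List<ClosableScorer> subScorers) {
            this.subScorers = new ArrayList<>(subScorers);
        }

        @Override
        public void close() {
            // Forward the call so nested leaves are reached at any depth.
            for (ClosableScorer s : subScorers) {
                s.close();
            }
        }
    }
}
```

Closing the outermost scorer reaches every leaf, which is the case for putting the method on a common base type rather than on individual custom scorers.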

[jira] Updated: (LUCENE-544) MultiFieldQueryParser field boost multiplier

2006-10-17 Thread Otis Gospodnetic (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-544?page=all ] Otis Gospodnetic updated LUCENE-544: Attachment: (was: QueryParserPatch) > MultiFieldQueryParser field boost multiplier > > >

[jira] Commented: (LUCENE-365) [PATCH] Performance improvement to DisjunctionSumScorer

2006-10-17 Thread Paul Elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-365?page=comments#action_12443062 ] Paul Elschot commented on LUCENE-365: - The ASF licence is in the sources, this is from before jira. I'll add a patch with the ASF licence soon. > [PATCH] Per

[jira] Updated: (LUCENE-544) MultiFieldQueryParser field boost multiplier

2006-10-17 Thread Matt Ericson (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-544?page=all ] Matt Ericson updated LUCENE-544: Attachment: QueryParserPatch My Version of the QueryParser that will allow you to boost your fields This version used a Map to keep track of what field to boos

[jira] Resolved: (LUCENE-443) ConjunctionScorer tune-up

2006-10-17 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-443?page=all ] Yonik Seeley resolved LUCENE-443. - Fix Version/s: 2.1 Resolution: Fixed Assignee: Yonik Seeley Thanks! I just committed this. > ConjunctionScorer tune-up > ---

[jira] Closed: (LUCENE-415) Merge error during add to index (IndexOutOfBoundsException)

2006-10-17 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-415?page=all ] Yonik Seeley closed LUCENE-415. --- Fix Version/s: 1.9 Resolution: Fixed Closing (a jira bug prevented me from closing this earlier). > Merge error during add to index (IndexOutOfBoundsExcep

RE: jira workflow

2006-10-17 Thread Steven Parkes
The cases I'm thinking of are those issues, be they bug fixes or critical functionality, that are submitted by non-contributors, people that aren't going to do the coding themselves. I might be very interested in some of those, probably the bugs in particular. I'm just trying to figure out how I'll

[jira] Commented: (LUCENE-365) [PATCH] Performance improvement to DisjunctionSumScorer

2006-10-17 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-365?page=comments#action_12443077 ] Yonik Seeley commented on LUCENE-365: - > The ASF licence is in the sources Yeah, I was just trying to be safe. It seems like there might be a small distinction

Re: jira workflow

2006-10-17 Thread Doug Cutting
Steven Parkes wrote: It wasn't really about having a list of needs clarification. It was more about bounding open. I suppose it's my product development side showing. Generally we tried not to leave issues open indefinitely, for fear of not getting back to a customer. Perhaps there's nothing comp

[jira] Updated: (LUCENE-365) [PATCH] Performance improvement to DisjunctionSumScorer

2006-10-17 Thread Paul Elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-365?page=all ] Paul Elschot updated LUCENE-365: Attachment: DisjunctionSumScorer20061017.patch This patch obsoletes DisjunctionSumScorer and ScorerDocQueue as attached earlier. All tests pass here. TestDisj

[jira] Resolved: (LUCENE-544) MultiFieldQueryParser field boost multiplier

2006-10-17 Thread Otis Gospodnetic (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-544?page=all ] Otis Gospodnetic resolved LUCENE-544. - Resolution: Fixed I decided to go with Matt's version - smaller change to the class + a unit test. Thanks Matt! Karl: if any functionality from you

RE: jira workflow

2006-10-17 Thread Steven Parkes
Just because you've gotten back doesn't mean the issue is gone. No, just clarifying whose court the ball is in. In the patch available case, it's clear. Looking at some long-standing open issues, sometimes discussion peters out and it's not clear whether submitter and commenters are on

Re: jira workflow

2006-10-17 Thread Grant Ingersoll
So is Hadoop's workflow acceptable to all? +1 I can put the diagram up on Wiki if this is approved. Also, is there a way to add links from JIRA? -Grant

Re: jira workflow

2006-10-17 Thread Yonik Seeley
On 10/17/06, Doug Cutting <[EMAIL PROTECTED]> wrote: So is Hadoop's workflow acceptable to all? +1 for Hadoop's JIRA workflow. As far as how that workflow should be used, I think we need a wiki page. -Yonik

[jira] Commented: (LUCENE-365) [PATCH] Performance improvement to DisjunctionSumScorer

2006-10-17 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-365?page=comments#action_12443135 ] Yonik Seeley commented on LUCENE-365: - Thanks Paul, This patch seemed to revert the following: http://issues.apache.org/jira/secure/attachment/12324730/Disjunct

[jira] Resolved: (LUCENE-365) [PATCH] Performance improvement to DisjunctionSumScorer

2006-10-17 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-365?page=all ] Yonik Seeley resolved LUCENE-365. - Fix Version/s: 2.1 Resolution: Fixed Assignee: Yonik Seeley (was: Lucene Developers) > [PATCH] Performance improvement to DisjunctionSumScore

Re: [jira] Commented: (LUCENE-686) Resources not always reclaimed in scorers after each search

2006-10-17 Thread Chris Hostetter
: > : When custom Scorers and/or Directories need a close method, it can : > : also be provided by subclassing Scorer, IndexSearcher and Directory : > it seems like that would handicap adoption of new Queries/Directories ... : > I don't know how many people would have been interested in : > Consta