MoreLikeThis: fieldNames and Query

2010-02-13 Thread Shay Banon
Hi, I have a few questions regarding more like this: 1. In MoreLikeThis, it seems like the check for fieldNames being null and fetching them from the reader is not done for all the like methods. For example, it does not look like it is done at all for like(Reader r), and on the other hand

[jira] Commented: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-12-15 Thread Otis Gospodnetic (JIRA)
you please use the Lucene code format? (Eclipse/IntelliJ templates are at the bottom of http://wiki.apache.org/lucene-java/HowToContribute ) Extension to MoreLikeThis to use tag information Key: LUCENE-1910 URL

[jira] Updated: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-11-26 Thread Thomas D'Silva (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated LUCENE-1910: --- Attachment: (was: LUCENE-1910.patch) Extension to MoreLikeThis to use tag information

[jira] Commented: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-11-26 Thread Thomas D'Silva (JIRA)
, the time taken to generate a MoreLikeThisUsingTags query is constant. Thanks, Thomas Extension to MoreLikeThis to use tag information Key: LUCENE-1910 URL: https://issues.apache.org/jira/browse/LUCENE-1910

[jira] Updated: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-11-26 Thread Thomas D'Silva (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated LUCENE-1910: --- Attachment: LUCENE-1910.patch Extension to MoreLikeThis to use tag information

[jira] Assigned: (LUCENE-1993) MoreLikeThis - allow to exclude terms that appear in too many documents (patch included)

2009-10-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1993: -- Assignee: Michael McCandless MoreLikeThis - allow to exclude terms

[jira] Commented: (LUCENE-1993) MoreLikeThis - allow to exclude terms that appear in too many documents (patch included)

2009-10-20 Thread Michael McCandless (JIRA)
shortly. MoreLikeThis - allow to exclude terms that appear in too many documents (patch included) Key: LUCENE-1993 URL: https://issues.apache.org/jira/browse/LUCENE-1993

[jira] Resolved: (LUCENE-1993) MoreLikeThis - allow to exclude terms that appear in too many documents (patch included)

2009-10-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1993. Resolution: Fixed Fix Version/s: 3.0 Thanks Christian! MoreLikeThis

[jira] Updated: (LUCENE-1993) MoreLikeThis - allow to exclude terms that appear in too many documents (patch included)

2009-10-19 Thread Christian Steinert (JIRA)
MoreLikeThis - allow to exclude terms that appear in too many documents (patch included) Key: LUCENE-1993 URL: https://issues.apache.org/jira/browse/LUCENE-1993

[jira] Created: (LUCENE-1993) MoreLikeThis - allow to exclude terms that appear in too many documents (patch included)

2009-10-19 Thread Christian Steinert (JIRA)
MoreLikeThis - allow to exclude terms that appear in too many documents (patch included) Key: LUCENE-1993 URL: https://issues.apache.org/jira/browse/LUCENE-1993

[jira] Commented: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-10-05 Thread Mark Harwood (JIRA)
documents? Unfortunately, I can't see this being generally useful until the performance is improved dramatically. Extension to MoreLikeThis to use tag information Key: LUCENE-1910 URL: https://issues.apache.org

[jira] Issue Comment Edited: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-10-04 Thread Thomas D'Silva (JIRA)
document terms for a given are cached in a hashmap once they have been generated in order to speed up subsequent lookups. Extension to MoreLikeThis to use tag information Key: LUCENE-1910 URL: https

[jira] Updated: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-10-04 Thread Thomas D'Silva (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated LUCENE-1910: --- Attachment: (was: LUCENE-1910.patch) Extension to MoreLikeThis to use tag information

[jira] Commented: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-09-21 Thread Mark Harwood (JIRA)
a lot of searches. I need to spend a little more time looking at it before I understand it in more detail. Before then - have you tested this on a big (millions of docs/terms) index? Some performance figures would be useful to accompany this. Cheers, Mark Extension to MoreLikeThis to use tag

[jira] Updated: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-09-14 Thread Thomas D'Silva (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated LUCENE-1910: --- Priority: Minor (was: Major) Extension to MoreLikeThis to use tag information

[jira] Created: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-09-13 Thread Thomas D'Silva (JIRA)
Extension to MoreLikeThis to use tag information Key: LUCENE-1910 URL: https://issues.apache.org/jira/browse/LUCENE-1910 Project: Lucene - Java Issue Type: New Feature Components

[jira] Updated: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-09-13 Thread Thomas D'Silva (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated LUCENE-1910: --- Attachment: LUCENE-1910.patch Extension to MoreLikeThis to use tag information

[jira] Updated: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-09-13 Thread Thomas D'Silva (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated LUCENE-1910: --- Attachment: (was: LUCENE-1910.patch) Extension to MoreLikeThis to use tag information

[jira] Updated: (LUCENE-1910) Extension to MoreLikeThis to use tag information

2009-09-13 Thread Thomas D'Silva (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated LUCENE-1910: --- Attachment: LUCENE-1910.patch Extension to MoreLikeThis to use tag information

MoreLikeThis Extension for documents that have tags

2009-09-03 Thread Thomas D'Silva
Hi, I would like to contribute a class based on the MoreLikeThis class in contrib/queries that generates a query based on the tags associated with a document. The class assumes that documents are tagged with a set of tags (which are stored in the index in a separate Field). The class determines

[jira] Updated: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Richard Marr (JIRA)
is in the IndexReader, but suspect that would be a can of worms. Comments? Morelikethis queries are very slow compared to other search types - Key: LUCENE-1690 URL: https://issues.apache.org/jira/browse

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Michael McCandless (JIRA)
I'm confused: how come you are not already seeing the benefits of this cache? You ought to see MLT queries going faster. This core cache was first added in 2.4.x; it looks like you were testing against 2.4.1 (from the Affects Version on this issue). Morelikethis queries are very slow

Re: [jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Richard Marr
testing against 2.4.1 (from the Affects Version on this issue). Morelikethis queries are very slow compared to other search types - Key: LUCENE-1690 URL: https://issues.apache.org/jira/browse/LUCENE

Re: [jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Michael McCandless
On Thu, Jul 30, 2009 at 6:28 AM, Richard Marr richard.m...@gmail.com wrote: Yeah, having this stuff stored centrally behind the IndexReader seems like a better idea than having it in client classes. My shallow knowledge of the code isn't helping me explain why it's not performing though. Out

Re: [jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Richard Marr
2009/7/30 Michael McCandless luc...@mikemccandless.com: Good question... Good answer. Thanks. I guess the next step then is to understand why the TermInfo cache isn't getting the performance to where it could be. It'll take me a while to get to the point where I can answer that question. If

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Carl Austin (JIRA)
that is not MLT related. A lot of MLTs use the same terms, and I have a good size cache for it, meaning most terms I use in MLT can be retrieved from there. Seeing as MLT in my circumstance is one of the slower bits, this can give me a good advantage. Morelikethis queries are very slow compared to other

Re: [jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-30 Thread Michael Busch
On 7/30/09 4:10 AM, Michael McCandless wrote: Plus, the original motivation for this (LUCENE-1195) was because queries in general look up the same term at least 2 times during their execution (weight (idf computation), get postings), and so I think we wanted to ensure that a single thread doing

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-29 Thread Richard Marr (JIRA)
noticed. Please ignore the latest patch. Morelikethis queries are very slow compared to other search types - Key: LUCENE-1690 URL: https://issues.apache.org/jira/browse/LUCENE-1690 Project

[jira] Updated: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-28 Thread Richard Marr (JIRA)
some feedback in the meantime? Morelikethis queries are very slow compared to other search types - Key: LUCENE-1690 URL: https://issues.apache.org/jira/browse/LUCENE-1690 Project: Lucene

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-28 Thread Michael McCandless (JIRA)
like it'll incorrectly put 0 into the cache, when the field was in the top-level cache but the term text wasn't in the 2nd level cache? Morelikethis queries are very slow compared to other search types - Key

[jira] Closed: (LUCENE-1697) MoreLikeThis should use the new Token API

2009-07-24 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch closed LUCENE-1697. - Resolution: Duplicate This will be fixed as part of LUCENE-1460. MoreLikeThis should use

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-20 Thread Carl Austin (JIRA)
and this is unbounded. Perhaps this should be an LRU cache with a settable maximum number of entries to stop it growing forever if you do a lot of like this queries on large indexes with many unique terms. Otherwise nice addition, has sped up my more like this queries a bit. Morelikethis queries
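
The bounded cache suggested in this comment could look roughly like the following. This is a minimal sketch in plain Java, not the code from the LUCENE-1690 patch: the class name `LruDocFreqCache`, the field/term string key, and the cached `Integer` document frequency are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache of document frequencies with a settable maximum size. */
class LruDocFreqCache {
    private final LinkedHashMap<String, Integer> map;

    LruDocFreqCache(final int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.map = new LinkedHashMap<String, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                // Evict the least recently used entry once the limit is exceeded.
                return size() > capacity;
            }
        };
    }

    /** Returns the cached document frequency, or null if the term is not cached. */
    synchronized Integer get(String field, String termText) {
        return map.get(key(field, termText));
    }

    synchronized void put(String field, String termText, int docFreq) {
        map.put(key(field, termText), docFreq);
    }

    synchronized int size() {
        return map.size();
    }

    // Assumed key format: field and term text joined by a NUL separator.
    private static String key(String field, String termText) {
        return field + '\u0000' + termText;
    }
}
```

With a limit of 2, inserting a third term evicts whichever of the first two was used least recently, so memory stays bounded regardless of how many unique terms the index holds.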

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-20 Thread Richard Marr (JIRA)
binding to a specific IndexReader instance. I think I can handle that. Carl, do you have any data on how this has changed performance in your system? My use case is a limited vocabulary so the performance gain was large. Morelikethis queries are very slow compared to other search types

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-07-20 Thread Carl Austin (JIRA)
. Morelikethis queries are very slow compared to other search types - Key: LUCENE-1690 URL: https://issues.apache.org/jira/browse/LUCENE-1690 Project: Lucene - Java Issue Type

[jira] Resolved: (LUCENE-1272) Support for boost factor in MoreLikeThis

2009-07-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1272. Resolution: Fixed Thanks Jonathan! Support for boost factor in MoreLikeThis

[jira] Created: (LUCENE-1697) MoreLikeThis should use the new Token API

2009-06-16 Thread Grant Ingersoll (JIRA)
MoreLikeThis should use the new Token API - Key: LUCENE-1697 URL: https://issues.apache.org/jira/browse/LUCENE-1697 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll

[jira] Commented: (LUCENE-1697) MoreLikeThis should use the new Token API

2009-06-16 Thread Mark Miller (JIRA)
don't want this one Grant, we should assign to Michael as this is a part of LUCENE-1460. MoreLikeThis should use the new Token API - Key: LUCENE-1697 URL: https://issues.apache.org/jira/browse/LUCENE-1697

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-06-15 Thread Richard Marr (JIRA)
a little longer for me to do. I'll have a think about it. Morelikethis queries are very slow compared to other search types - Key: LUCENE-1690 URL: https://issues.apache.org/jira/browse/LUCENE-1690

[jira] Updated: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-06-13 Thread Richard Marr (JIRA)
. It shouldn't affect any applications that don't opt-in to using it, and applications that do should see an order of magnitude performance improvement for MLT queries. This cache implementation is tied to the MLT object but can be cleared on demand. Morelikethis queries are very slow compared

[jira] Commented: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-06-13 Thread Michael McCandless (JIRA)
include the IndexReader in the cache key? Then it'd be functionally equivalent and we could enable it by default? Morelikethis queries are very slow compared to other search types - Key: LUCENE-1690 URL
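
One possible reading of "include the IndexReader in the cache key" is sketched below in plain Java. It is an illustration only, not the patch under discussion: the reader is represented as a plain `Object`, and the class name `PerReaderDocFreqCache` and the field/term key format are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical cache that scopes document frequencies to a reader instance. */
class PerReaderDocFreqCache {
    // Weak keys: a reader's entries become collectable once the reader itself is unreferenced.
    private final Map<Object, Map<String, Integer>> byReader = new WeakHashMap<>();

    /** Returns the cached value for this reader, or null if nothing is cached. */
    synchronized Integer get(Object reader, String field, String termText) {
        Map<String, Integer> perReader = byReader.get(reader);
        return perReader == null ? null : perReader.get(key(field, termText));
    }

    synchronized void put(Object reader, String field, String termText, int docFreq) {
        byReader.computeIfAbsent(reader, r -> new HashMap<>())
                .put(key(field, termText), docFreq);
    }

    // Assumed key format: field and term text joined by a NUL separator.
    private static String key(String field, String termText) {
        return field + '\u0000' + termText;
    }
}
```

Because lookups are scoped to the reader that produced the value, a reopened reader never sees stale frequencies from an older one, which is the property that would make such a cache safe to enable by default.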

[jira] Created: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

2009-06-12 Thread Richard Marr (JIRA)
Morelikethis queries are very slow compared to other search types - Key: LUCENE-1690 URL: https://issues.apache.org/jira/browse/LUCENE-1690 Project: Lucene - Java Issue Type

[jira] Updated: (LUCENE-1272) Support for boost factor in MoreLikeThis

2009-06-03 Thread Jonathan Leibiusky (JIRA)
in MoreLikeThis Key: LUCENE-1272 URL: https://issues.apache.org/jira/browse/LUCENE-1272 Project: Lucene - Java Issue Type: New Feature Components: contrib/* Reporter: Jonathan

[jira] Updated: (LUCENE-1272) Support for boost factor in MoreLikeThis

2009-06-03 Thread Jonathan Leibiusky (JIRA)
for boost factor in MoreLikeThis Key: LUCENE-1272 URL: https://issues.apache.org/jira/browse/LUCENE-1272 Project: Lucene - Java Issue Type: New Feature Components: contrib

[jira] Commented: (LUCENE-1272) Support for boost factor in MoreLikeThis

2009-06-02 Thread Otis Gospodnetic (JIRA)
for you to update this patch to work with the trunk, so I can apply it? Thanks! Support for boost factor in MoreLikeThis Key: LUCENE-1272 URL: https://issues.apache.org/jira/browse/LUCENE-1272 Project: Lucene

[jira] Resolved: (LUCENE-896) Let users set Similarity for MoreLikeThis

2008-11-12 Thread Otis Gospodnetic (JIRA)
, New]) Actually, my copy of MLT already takes Similarity in ctor and has set/getSimilarity, so no patch is needed. You want/need that isNoise method protected? Let users set Similarity for MoreLikeThis - Key: LUCENE-896

[jira] Updated: (LUCENE-1272) Support for boost factor in MoreLikeThis

2008-11-12 Thread Otis Gospodnetic (JIRA)
] (was: [Patch Available, New]) Fix Version/s: 2.9 Assignee: Otis Gospodnetic I don't see any harm in this, I'll make the change later this week. Support for boost factor in MoreLikeThis Key: LUCENE-1272 URL: https

[jira] Resolved: (LUCENE-1298) MoreLikeThis ignores custom similarity

2008-06-04 Thread Grant Ingersoll (JIRA)
. MoreLikeThis ignores custom similarity -- Key: LUCENE-1298 URL: https://issues.apache.org/jira/browse/LUCENE-1298 Project: Lucene - Java Issue Type: Bug Reporter: Grant Ingersoll

[jira] Created: (LUCENE-1298) MoreLikeThis ignores custom similarity

2008-06-03 Thread Grant Ingersoll (JIRA)
MoreLikeThis ignores custom similarity -- Key: LUCENE-1298 URL: https://issues.apache.org/jira/browse/LUCENE-1298 Project: Lucene - Java Issue Type: Bug Reporter: Grant Ingersoll

[jira] Updated: (LUCENE-1298) MoreLikeThis ignores custom similarity

2008-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1298: Attachment: LUCENE-1298.patch Patch MoreLikeThis ignores custom similarity

[jira] Resolved: (LUCENE-1295) Make retrieveTerms(int docNum) public in MoreLikeThis

2008-06-02 Thread Grant Ingersoll (JIRA)
revision 662413. Make retrieveTerms(int docNum) public in MoreLikeThis - Key: LUCENE-1295 URL: https://issues.apache.org/jira/browse/LUCENE-1295 Project: Lucene - Java Issue Type

[jira] Commented: (LUCENE-1295) Make retrieveTerms(int docNum) public in MoreLikeThis

2008-05-30 Thread Otis Gospodnetic (JIRA)
retrieveTerms(int docNum) public in MoreLikeThis - Key: LUCENE-1295 URL: https://issues.apache.org/jira/browse/LUCENE-1295 Project: Lucene - Java Issue Type: Improvement Components

[jira] Commented: (LUCENE-1295) Make retrieveTerms(int docNum) public in MoreLikeThis

2008-05-29 Thread Grant Ingersoll (JIRA)
? {quote} I see MLT is full of tabs, should you feel like fixing the formatting. {quote} Yeah, I noticed that too, and it is quite egregious, but I thought we avoided formatting changes, but I am happy to make an exception here. Make retrieveTerms(int docNum) public in MoreLikeThis

[jira] Created: (LUCENE-1295) Make retrieveTerms(int docNum) public in MoreLikeThis

2008-05-28 Thread Grant Ingersoll (JIRA)
Make retrieveTerms(int docNum) public in MoreLikeThis - Key: LUCENE-1295 URL: https://issues.apache.org/jira/browse/LUCENE-1295 Project: Lucene - Java Issue Type: Improvement

[jira] Updated: (LUCENE-1295) Make retrieveTerms(int docNum) public in MoreLikeThis

2008-05-28 Thread Grant Ingersoll (JIRA)
docNum) public in MoreLikeThis - Key: LUCENE-1295 URL: https://issues.apache.org/jira/browse/LUCENE-1295 Project: Lucene - Java Issue Type: Improvement Components: contrib

[jira] Commented: (LUCENE-1295) Make retrieveTerms(int docNum) public in MoreLikeThis

2008-05-28 Thread Otis Gospodnetic (JIRA)
, should you feel like fixing the formatting. Make retrieveTerms(int docNum) public in MoreLikeThis - Key: LUCENE-1295 URL: https://issues.apache.org/jira/browse/LUCENE-1295 Project: Lucene - Java

[jira] Updated: (LUCENE-896) Let users set Similarity for MoreLikeThis

2008-05-16 Thread Otis Gospodnetic (JIRA)
Seems very reasonable. I'll commit on Monday. Let users set Similarity for MoreLikeThis - Key: LUCENE-896 URL: https://issues.apache.org/jira/browse/LUCENE-896 Project: Lucene - Java Issue Type

[jira] Created: (LUCENE-1272) Support for boost factor in MoreLikeThis

2008-04-24 Thread Jonathan Leibiusky (JIRA)
Support for boost factor in MoreLikeThis Key: LUCENE-1272 URL: https://issues.apache.org/jira/browse/LUCENE-1272 Project: Lucene - Java Issue Type: New Feature Components: contrib

[jira] Updated: (LUCENE-1272) Support for boost factor in MoreLikeThis

2008-04-24 Thread Jonathan Leibiusky (JIRA)
in MoreLikeThis Key: LUCENE-1272 URL: https://issues.apache.org/jira/browse/LUCENE-1272 Project: Lucene - Java Issue Type: New Feature Components: contrib/* Reporter: Jonathan Leibiusky

[jira] Created: (LUCENE-896) Let users set Similarity for MoreLikeThis

2007-05-30 Thread Ryan McKinley (JIRA)
Let users set Similarity for MoreLikeThis - Key: LUCENE-896 URL: https://issues.apache.org/jira/browse/LUCENE-896 Project: Lucene - Java Issue Type: Improvement Components: Other

[jira] Updated: (LUCENE-896) Let users set Similarity for MoreLikeThis

2007-05-30 Thread Ryan McKinley (JIRA)
for Similarity. This also fixes a couple javadoc typos and makes isNoiseWord() protected Let users set Similarity for MoreLikeThis - Key: LUCENE-896 URL: https://issues.apache.org/jira/browse/LUCENE-896 Project

MoreLikeThis

2006-04-16 Thread Dean Hoover
Hi, Lucene is completely new to me. I just downloaded 1.9.1 and started experimenting with it. I am a bit confused though. I want to use the MoreLikeThis class, which appears in the javadoc, but does not exist in code. Where can I find it? Dean

Re: MoreLikeThis

2006-04-16 Thread Chris Hostetter
: Lucene is completely new to me. I just downloaded 1.9.1 and started : experimenting with it. I am a bit confused though. I want to use the : MoreLikeThis class, which appears in the javadoc, but does not exist in : code. Where can I find it? if you look at the way the main javadoc index