date:20090130

Re: Sorting lucene search results

2009-01-30 Thread Anshum

Hi Mitu, Could we keep usage/implementation questions on the user forum? It would help keep things segregated :). About your problem though, I wouldn't know about the .NET port. You could (in Java Lucene) use: public TopFieldDocCollector(IndexReader reader, Sort sort, int numHits) i.e.:

Sorting lucene search results

2009-01-30 Thread mitu2009

Hi, I'm using the following code to execute a search query in Lucene.Net: var collector = new GroupingHitCollector(searcher.GetIndexReader()); searcher.Search(myQuery, collector); resultsCount = collector.Hits.Count; How do I sort these search results based on a field? I need to use collector object(

Build failed in Hudson: Lucene-trunk #723

2009-01-30 Thread Apache Hudson Server

See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/723/changes Changes: [uschindler] fix javadocs [uschindler] Add some extra check for validity of c'tor parameters in TrieRangeFilter [mikemccand] LUCENE-1314: add IndexReader.clone(boolean readOnly) and reopen(boolean readOnly) -

[jira] Commented: (LUCENE-1506) Adding FilteredDocIdSet and FilteredDocIdSetIterator

2009-01-30 Thread John Wang (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669105#action_12669105 ] John Wang commented on LUCENE-1506: --- Thanks Michael! > Adding FilteredDocIdSet and Filt

[jira] Created: (LUCENE-1533) Deleted documents as a Filter or top level Query

2009-01-30 Thread Jason Rutherglen (JIRA)

Deleted documents as a Filter or top level Query Key: LUCENE-1533 URL: https://issues.apache.org/jira/browse/LUCENE-1533 Project: Lucene - Java Issue Type: Improvement Components: In

Re: Realtime Search

2009-01-30 Thread Jason Rutherglen

> deletes made through reader (by docID) are immediately visible, but through writer are buffered until a flush or reopen? This is what I was thinking, IW buffers deletes, IR does not. Making IW.deletes visible immediately by applying them to the IR makes sense as well. What should be the behavio

Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread eks dev

indeed :) From: Paul Elschot To: java-dev@lucene.apache.org Sent: Friday, 30 January, 2009 23:37:08 Subject: Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs On Friday 30 January 2009 23:24:42 eks

Re: BloomFilter-s with Lucene

2009-01-30 Thread eks dev

Unfortunately this code is not mine, but it is rather simple to try: int bloom_filter = 0; for (char accent : accents) { bloom_filter = bloom_filter | 1 << (accent & 0x1F); } The rest is easy; this works well for 10-20 chars per bloom_filter, depends on distribu
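A minimal Java sketch of how the snippet above could be used as a pre-check. The class name AccentPrefilter and the method mightContain are made up for illustration and do not come from the original message; only the bit-setting loop follows the quoted code. The low 5 bits of each character select one of 32 bits, so a clear bit means the character is definitely not in the set, while a set bit only means "maybe".

```java
/** Hypothetical wrapper around the 32-bit mask from the message above. */
final class AccentPrefilter {
    private final int mask;

    AccentPrefilter(char[] accents) {
        int bits = 0; // start from an empty mask
        for (char accent : accents) {
            bits |= 1 << (accent & 0x1F); // low 5 bits pick one of 32 positions
        }
        this.mask = bits;
    }

    /** false: definitely not an accent; true: possibly one, take the slow path. */
    boolean mightContain(char c) {
        return (mask & (1 << (c & 0x1F))) != 0;
    }
}
```

Only characters for which mightContain returns true would need to go through the large switch statement. With a single hash and 32 bits, false positives become common as the set grows, which fits the remark that this works for roughly 10-20 characters per mask.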

Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread Paul Elschot

On Friday 30 January 2009 23:24:42 eks dev wrote: ... > > This is conceptually almost equal (fully equal, when Paul gets "Filters as > boolean clauses" done) to having a separate, single-valued field indexed > > isDeleted {true, false} > > where each Query gets implicitly transformed to "Origin

Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread eks dev

Maybe we should close this issue with a won't-fix and start a new one for filtered deletions? A few thoughts, without looking at the code, just thinking aloud :) It is an inverted filter that we are talking about here; Lucene uses Filter as a pass filter (a set bit defines a document that should pas
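The "inverted filter" idea can be sketched with plain java.util.BitSet, without any Lucene classes. The class and method names below are hypothetical and this is not how Lucene implements deletions; it only illustrates that a deleted-docs bit set (set bit = drop) is the complement of a pass filter (set bit = keep).

```java
import java.util.BitSet;

/** Hypothetical helper: turns a deleted-docs bit set into a pass filter. */
final class DeletionsAsFilter {
    static BitSet toPassFilter(BitSet deleted, int maxDoc) {
        BitSet pass = new BitSet(maxDoc);
        pass.set(0, maxDoc);   // start by letting every document through
        pass.andNot(deleted);  // then clear the bit of each deleted document
        return pass;
    }
}
```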

Re: BloomFilter-s with Lucene

2009-01-30 Thread Andi Vajda

On Fri, 30 Jan 2009, eks dev wrote: I have used them for speeding up huge switch clauses in charset normalization (e.g. lowercase and accent->plain form mapping). A large number of accented characters (which causes a big switch statement) appear seldom in the corpus (the big majority being unaccented).

[jira] Commented: (LUCENE-1314) IndexReader.clone

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669030#action_12669030 ] Michael McCandless commented on LUCENE-1314: {quote} I'm thinking of implement

[jira] Commented: (LUCENE-1532) File based spellcheck with doc frequencies supplied

2009-01-30 Thread Mark Miller (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669029#action_12669029 ] Mark Miller commented on LUCENE-1532: - Our spellchecking def needs improvement. I lik

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669026#action_12669026 ] Michael McCandless commented on LUCENE-1476: bq. We need more performance data

Re: BloomFilter-s with Lucene

2009-01-30 Thread eks dev

I have used them for speeding up huge switch clauses in charset normalization (e.g. lowercase and accent->plain form mapping). A large number of accented characters (which causes a big switch statement) appear seldom in the corpus (the big majority being unaccented). On a negative test, you do just simple a

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669025#action_12669025 ] Michael McCandless commented on LUCENE-1476: {quote} This seems like something

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669024#action_12669024 ] Michael McCandless commented on LUCENE-1476: {quote} Thanks for running all th

Re: BloomFilter-s with Lucene

2009-01-30 Thread pdecrem

Well. I used 2 Broder similarity measures, and it works well. You obviously need to pick the right size BFs. Navendu Jain has a paper called "Using Bloom Filters to Refine Web Search Results", which I think is relevant here. It talks about how to remove near-duplicate search results using BFs. --

[jira] Commented: (LUCENE-1532) File based spellcheck with doc frequencies supplied

2009-01-30 Thread Eks Dev (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669018#action_12669018 ] Eks Dev commented on LUCENE-1532: - bq. so it can suggest a very obscure word rather than a

Re: BloomFilter-s with Lucene

2009-01-30 Thread Andrzej Bialecki

markharw00d wrote: Andrzej Bialecki wrote: Funny, I was having vague thoughts about this today too having been concerned about some of the big arrays that can end up in a typical Lucene app. Aside from providing space-efficient lookups, another application for BloomFilters is in similarity me

Re: BloomFilter-s with Lucene

2009-01-30 Thread markharw00d

Andrzej Bialecki wrote: Funny, I was having vague thoughts about this today too having been concerned about some of the big arrays that can end up in a typical Lucene app. Aside from providing space-efficient lookups, another application for BloomFilters is in similarity measures e.g. ANDing 2
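A rough Java sketch of what "ANDing two filters" for similarity might look like. It assumes both filters were built with the same size and hash functions and are exposed as java.util.BitSet; the class name and the bit-level Jaccard ratio are illustrative assumptions, not the measure used by anyone in this thread.

```java
import java.util.BitSet;

/** Hypothetical similarity proxy over the bit sets of two Bloom filters. */
final class BloomSimilarity {
    /** Shared set bits divided by all set bits; 0.0 when both filters are empty. */
    static double bitJaccard(BitSet a, BitSet b) {
        BitSet and = (BitSet) a.clone();
        and.and(b);                       // bits set in both filters
        BitSet or = (BitSet) a.clone();
        or.or(b);                         // bits set in either filter
        int union = or.cardinality();
        return union == 0 ? 0.0 : (double) and.cardinality() / union;
    }
}
```

Because several keys can map to the same bits, this ratio only approximates the overlap of the underlying sets, and the error grows as the filters fill up.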

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread Jason Rutherglen (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668971#action_12668971 ] Jason Rutherglen commented on LUCENE-1476: -- {quote} Just run sortBench2.py in co

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread Jason Rutherglen (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668954#action_12668954 ] Jason Rutherglen commented on LUCENE-1476: -- > Maybe we should close this issue wi

[jira] Created: (LUCENE-1532) File based spellcheck with doc frequencies supplied

2009-01-30 Thread David Bowen (JIRA)

File based spellcheck with doc frequencies supplied --- Key: LUCENE-1532 URL: https://issues.apache.org/jira/browse/LUCENE-1532 Project: Lucene - Java Issue Type: New Feature Componen

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread Marvin Humphrey (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668946#action_12668946 ] Marvin Humphrey commented on LUCENE-1476: - > Actually I used your entire patch on

BloomFilter-s with Lucene

2009-01-30 Thread Andrzej Bialecki

Hi all, I've been using BloomFilters for various tasks, and I can't shake the feeling that they could be of some use in Lucene internals, to speed up various membership tests, especially if we look for 100% correct negatives, and we can accept a small rate of false positives. For example, le
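For readers unfamiliar with the structure, here is a minimal Bloom filter sketch in Java showing the property described above: a negative answer is always correct, while a positive answer may occasionally be wrong. The class name, the double-hashing scheme, and the mixing constant are assumptions chosen for brevity and are not taken from the message or from Lucene.

```java
import java.util.BitSet;

/** Hypothetical minimal Bloom filter over String keys. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void add(String key) {
        int h1 = key.hashCode();
        int h2 = mix(h1);
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(h1, h2, i));
        }
    }

    /** false: key was never added; true: key was probably added. */
    boolean mightContain(String key) {
        int h1 = key.hashCode();
        int h2 = mix(h1);
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: the i-th probe position is h1 + i * h2, reduced modulo size.
    private int index(int h1, int h2, int i) {
        return Math.floorMod(h1 + i * h2, size);
    }

    // Cheap second hash derived from the first; forced odd so the step is never 0.
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        return h | 1;
    }
}
```

Choosing size and hashCount trades memory against the false-positive rate; the values would need tuning for any real membership test.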

[jira] Commented: (LUCENE-1314) IndexReader.clone

2009-01-30 Thread Jason Rutherglen (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668941#action_12668941 ] Jason Rutherglen commented on LUCENE-1314: -- I'm thinking of implementing a follow

[jira] Commented: (LUCENE-1314) IndexReader.clone

2009-01-30 Thread Jason Rutherglen (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668930#action_12668930 ] Jason Rutherglen commented on LUCENE-1314: -- Cool, cheers Mike! > IndexReader.clo

[jira] Commented: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668910#action_12668910 ] Michael McCandless commented on LUCENE-1483: One immediate workaround would be

[jira] Commented: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-30 Thread Yonik Seeley (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668891#action_12668891 ] Yonik Seeley commented on LUCENE-1483: -- My previous comment: {quote} I tracked down h

[jira] Commented: (LUCENE-1478) Missing possibility to supply custom FieldParser when sorting search results

2009-01-30 Thread Yonik Seeley (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668889#action_12668889 ] Yonik Seeley commented on LUCENE-1478: -- Apologies, I meant to post in LUCENE-1483 >

Re: Realtime Search

2009-01-30 Thread Michael McCandless

Jason Rutherglen wrote: > > We'd also need to ensure when a merge kicks off, the SegmentReaders > > used by the merging are not newly reopened but also "borrowed" from > > The IW merge code currently opens the SegmentReader with a 4096 > buffer size (different than the 1024 default), how will thi

[jira] Updated: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1476: --- Attachment: hacked-deliterator.patch Alas I had a bug in my original test (my Se

[jira] Commented: (LUCENE-1478) Missing possibility to supply custom FieldParser when sorting search results

2009-01-30 Thread Uwe Schindler (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668822#action_12668822 ] Uwe Schindler commented on LUCENE-1478: --- Ah, now I understand (by the way, I do not

[jira] Commented: (LUCENE-1478) Missing possibility to supply custom FieldParser when sorting search results

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668818#action_12668818 ] Michael McCandless commented on LUCENE-1478: bq. the parser should be a singl

[jira] Resolved: (LUCENE-1314) IndexReader.clone

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1314. Resolution: Fixed Committed revision 739238. Thanks Jason! > IndexReader.clone >

[jira] Commented: (LUCENE-1478) Missing possibility to supply custom FieldParser when sorting search results

2009-01-30 Thread Uwe Schindler (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668815#action_12668815 ] Uwe Schindler commented on LUCENE-1478: --- By the way: The Cache of FieldCache instanc

[jira] Commented: (LUCENE-1478) Missing possibility to supply custom FieldParser when sorting search results

2009-01-30 Thread Uwe Schindler (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668812#action_12668812 ] Uwe Schindler commented on LUCENE-1478: --- bq. Uwe, would that result in a memory leak

[jira] Updated: (LUCENE-1506) Adding FilteredDocIdSet and FilteredDocIdSetIterator

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1506: --- Attachment: LUCENE-1506.patch Thanks John! I made a few tweaks ("downgraded" to Jav

[jira] Commented: (LUCENE-1478) Missing possibility to supply custom FieldParser when sorting search results

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668802#action_12668802 ] Michael McCandless commented on LUCENE-1478: bq. Write a FloatParser that maps

[jira] Commented: (LUCENE-1478) Missing possibility to supply custom FieldParser when sorting search results

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668799#action_12668799 ] Michael McCandless commented on LUCENE-1478: Yonik, why was the failure so int

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs

2009-01-30 Thread Michael McCandless (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668797#action_12668797 ] Michael McCandless commented on LUCENE-1476: bq. Presumably you spliced the im

[jira] Commented: (LUCENE-1478) Missing possibility to supply custom FieldParser when sorting search results

2009-01-30 Thread Uwe Schindler (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668777#action_12668777 ] Uwe Schindler commented on LUCENE-1478: --- After reading your comment several times an

43 matches
