[
https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668777#action_12668777
]
Uwe Schindler commented on LUCENE-1478:
---
After reading your comment several times
[
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668797#action_12668797
]
Michael McCandless commented on LUCENE-1476:
bq. Presumably you spliced the
[
https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668799#action_12668799
]
Michael McCandless commented on LUCENE-1478:
Yonik, why was the failure so
[
https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668802#action_12668802
]
Michael McCandless commented on LUCENE-1478:
bq. Write a FloatParser that maps
[
https://issues.apache.org/jira/browse/LUCENE-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-1506:
---
Attachment: LUCENE-1506.patch
Thanks John! I made a few tweaks (downgraded to Java
[
https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668812#action_12668812
]
Uwe Schindler commented on LUCENE-1478:
---
bq. Uwe, would that result in a memory
[
https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668815#action_12668815
]
Uwe Schindler commented on LUCENE-1478:
---
By the way: The Cache of FieldCache
[
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-1476:
---
Attachment: hacked-deliterator.patch
Alas I had a bug in my original test (my
Jason Rutherglen jason.rutherg...@gmail.com wrote:
We'd also need to ensure when a merge kicks off, the SegmentReaders
used by the merging are not newly reopened but also borrowed from
The IW merge code currently opens the SegmentReader with a 4096
buffer size (different than the 1024
[
https://issues.apache.org/jira/browse/LUCENE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668889#action_12668889
]
Yonik Seeley commented on LUCENE-1478:
--
Apologies, I meant to post in LUCENE-1483
[
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668910#action_12668910
]
Michael McCandless commented on LUCENE-1483:
One immediate workaround would be
[
https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668930#action_12668930
]
Jason Rutherglen commented on LUCENE-1314:
--
Cool, cheers Mike!
[
https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668941#action_12668941
]
Jason Rutherglen commented on LUCENE-1314:
--
I'm thinking of implementing a follow
Hi all,
I've been using BloomFilters for various tasks, and I can't shake the
feeling that they could be of some use in Lucene internals, to speed up
various membership tests, especially if we look for 100% correct
negatives, and we can accept a small rate of false positives.
For example,
[
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668946#action_12668946
]
Marvin Humphrey commented on LUCENE-1476:
-
Actually I used your entire patch on
File based spellcheck with doc frequencies supplied
---
Key: LUCENE-1532
URL: https://issues.apache.org/jira/browse/LUCENE-1532
Project: Lucene - Java
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668954#action_12668954
]
Jason Rutherglen commented on LUCENE-1476:
--
Maybe we should close this issue
[
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668971#action_12668971
]
Jason Rutherglen commented on LUCENE-1476:
--
{quote}
Just run sortBench2.py in
Andrzej Bialecki wrote:
Funny, I was having vague thoughts about this today too having been
concerned about some of the big arrays that can end up in a typical
Lucene app. Aside from providing space-efficient lookups, another
application for BloomFilters is in similarity measures e.g. ANDing
markharw00d wrote:
Andrzej Bialecki wrote:
Funny, I was having vague thoughts about this today too having been
concerned about some of the big arrays that can end up in a typical
Lucene app. Aside from providing space-efficient lookups, another
application for BloomFilters is in similarity
[
https://issues.apache.org/jira/browse/LUCENE-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12669018#action_12669018
]
Eks Dev commented on LUCENE-1532:
-
bq. so it can suggest a very obscure word rather than a
Well, I used 2 Broder similarity measures, and it works well. You obviously
need to pick the right size for the Bloom filters.
Navendu Jain has a paper called "Using Bloom Filters to Refine Web Search
Results", which I think is relevant here. It talks about how to remove
near-duplicate search results using Bloom filters.
[
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12669024#action_12669024
]
Michael McCandless commented on LUCENE-1476:
{quote}
Thanks for running all
[
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12669025#action_12669025
]
Michael McCandless commented on LUCENE-1476:
{quote}
This seems like something
I have used them to speed up huge switch clauses in charset normalization
(e.g. lowercasing and accent-to-plain-form mapping). There is a big number of
accented characters (which causes a big switch statement) that seldom appear in
the corpus (the big majority being unaccented). If negative test, you do just simple
[
https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12669026#action_12669026
]
Michael McCandless commented on LUCENE-1476:
bq. We need more performance data
[
https://issues.apache.org/jira/browse/LUCENE-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12669029#action_12669029
]
Mark Miller commented on LUCENE-1532:
-
Our spellchecking def needs improvement.
I
On Fri, 30 Jan 2009, eks dev wrote:
I have used them to speed up huge switch clauses in charset
normalization (e.g. lowercasing and accent-to-plain-form mapping). There is a
big number of accented characters (which causes a big switch statement) that
seldom appear in the corpus (the big majority being unaccented).
Maybe we should close this issue with a won't-fix and start a new one for
filtered deletions?
A few thoughts, without looking at the code, just thinking aloud :)
What we are talking about here is an inverted filter: Lucene uses Filter as a
pass filter (a set bit defines a document that should
On Friday 30 January 2009 23:24:42 eks dev wrote:
...
This is conceptually almost equal (fully equal, once Paul gets Filters as
boolean clauses done) to having a separate, single-valued indexed field
isDeleted {true, false}
where each Query gets implicitly transformed to OriginalQuery
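The inversion discussed above (deleted-docs bits mark documents to drop, while a pass filter marks documents to keep) can be sketched with plain java.util.BitSet. This is only an illustration: the class and method names are hypothetical, and it does not use Lucene's BitVector, Filter, or DocIdSet classes.

```java
import java.util.BitSet;

// Hypothetical sketch, not Lucene code: plain BitSets stand in for
// the deleted-docs bits and for a pass filter.
final class DeletedDocsFilterSketch {

    // Invert a "deleted" bit set (set bit = drop the doc) into a
    // pass filter (set bit = keep the doc).
    static BitSet toPassFilter(BitSet deleted, int maxDoc) {
        BitSet pass = new BitSet(maxDoc);
        pass.set(0, maxDoc);   // start with every document passing
        pass.andNot(deleted);  // clear the bits of deleted documents
        return pass;
    }

    // Keep only the hits that the pass filter lets through.
    static BitSet filterHits(BitSet hits, BitSet pass) {
        BitSet result = (BitSet) hits.clone();
        result.and(pass);
        return result;
    }

    public static void main(String[] args) {
        int maxDoc = 8;
        BitSet deleted = new BitSet(maxDoc);
        deleted.set(2);
        deleted.set(5);

        BitSet hits = new BitSet(maxDoc);
        hits.set(1);
        hits.set(2);
        hits.set(6);

        BitSet visible = filterHits(hits, toPassFilter(deleted, maxDoc));
        System.out.println(visible); // prints {1, 6}
    }
}
```

Seen this way, applying deletions is the same operation as ANDing the query's hits with one more filter, which is the equivalence the message describes.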
Unfortunately this code is not mine, but it is rather simple to try:
int bloom_filter = 0;
for (char accent : accents) {
    // set one bit per accented char, selected by its low 5 bits
    bloom_filter = bloom_filter | (1 << (accent & 0x1F));
}
The rest is easy; this works well for 10-20 chars per bloom_filter, depends on
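A minimal sketch of how such a one-word Bloom filter might be built and queried follows. The class and method names are hypothetical, the three accented characters are arbitrary examples, and the single "hash" is simply the low 5 bits of the character, as in the snippet above.

```java
// Hypothetical sketch of a 32-bit, single-hash Bloom filter for chars.
final class AccentBloomSketch {

    // Build the mask: one bit per accented char, chosen by its low 5 bits.
    static int buildMask(char[] accents) {
        int mask = 0;
        for (char accent : accents) {
            mask |= 1 << (accent & 0x1F);
        }
        return mask;
    }

    // false: the char is definitely not in the set, so skip the big switch.
    // true: the char may be in the set, so fall through to the exact mapping.
    static boolean mightBeAccent(int mask, char c) {
        return (mask & (1 << (c & 0x1F))) != 0;
    }

    public static void main(String[] args) {
        // e-acute, u-umlaut, n-tilde, written as escapes to avoid encoding issues
        char[] accents = {'\u00e9', '\u00fc', '\u00f1'};
        int mask = buildMask(accents);

        System.out.println(mightBeAccent(mask, '\u00e9')); // member
        System.out.println(mightBeAccent(mask, 'a'));      // definite negative
        // 'i' (0x69) shares its low 5 bits with e-acute (0xE9), so it is a
        // false positive and would still go through the exact mapping.
        System.out.println(mightBeAccent(mask, 'i'));
    }
}
```

A negative answer is always correct, so the common unaccented characters can skip the switch entirely. With only 32 bits, false positives grow quickly as more characters are added, which fits the remark that this works well for roughly 10-20 chars per filter.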
indeed :)
From: Paul Elschot paul.elsc...@xs4all.nl
To: java-dev@lucene.apache.org
Sent: Friday, 30 January, 2009 23:37:08
Subject: Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet,
IndexReader returns DocIdSet deleted docs
On Friday 30
deletes made through reader (by docID) are immediately visible, but
through writer are buffered until a flush or reopen?
This is what I was thinking, IW buffers deletes, IR does not. Making
IW.deletes visible immediately by applying them to the IR makes sense
as well.
What should be the
Deleted documents as a Filter or top level Query
Key: LUCENE-1533
URL: https://issues.apache.org/jira/browse/LUCENE-1533
Project: Lucene - Java
Issue Type: Improvement
Components:
[
https://issues.apache.org/jira/browse/LUCENE-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12669105#action_12669105
]
John Wang commented on LUCENE-1506:
---
Thanks Michael!
Adding FilteredDocIdSet and
See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/723/changes
Changes:
[uschindler] fix javadocs
[uschindler] Add some extra check for validity of c'tor parameters in
TrieRangeFilter
[mikemccand] LUCENE-1314: add IndexReader.clone(boolean readOnly) and
reopen(boolean readOnly)
Hi,
I'm using the following code to execute a search query in Lucene.Net:
var collector = new GroupingHitCollector(searcher.GetIndexReader());
searcher.Search(myQuery, collector);
resultsCount = collector.Hits.Count;
How do I sort these search results based on a field?
I need to use collector
Hi Mitu,
Could we have usage/implementation-based questions at the user forum?
Would help keep things segregated :).
About your problem though, I wouldn't know about the .net port. You
could (in Java Lucene) use:
public TopFieldDocCollector(IndexReader reader, Sort sort, int numHits)
i.e.: