Source code for an accent-removal filter

2005-02-01 Thread Peter Pimley
Hi. In December I made some posts concerning a filter that could work by getting the Unicode name of a character and trying to figure out the closest Latin equivalent. For example, if it encountered character 00C1 LATIN CAPITAL LETTER A WITH ACUTE, it would be clever enough to replace that
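A minimal sketch of the name-based idea in the snippet above, assuming Java's `Character.getName` as the source of Unicode names. The class `UnicodeNameAccentStripper` and its methods are hypothetical and are not the code posted to the list; it only handles names of the form "LATIN CAPITAL/SMALL LETTER X WITH ..." and leaves everything else untouched.

```java
public class UnicodeNameAccentStripper {

    private static final String CAPITAL = "LATIN CAPITAL LETTER ";
    private static final String SMALL = "LATIN SMALL LETTER ";

    /** Returns the plain Latin letter named before " WITH ", or the input code point. */
    static int baseLetter(int codePoint) {
        String name = Character.getName(codePoint); // null for unassigned code points
        if (name == null) {
            return codePoint;
        }
        int with = name.indexOf(" WITH ");
        if (with < 0) {
            return codePoint; // no modifier in the name, e.g. LATIN SMALL LETTER SHARP S
        }
        String base = name.substring(0, with);
        // Only accept single-letter bases such as "LATIN CAPITAL LETTER A".
        if (base.startsWith(CAPITAL) && base.length() == CAPITAL.length() + 1) {
            return base.charAt(CAPITAL.length());
        }
        if (base.startsWith(SMALL) && base.length() == SMALL.length() + 1) {
            return Character.toLowerCase(base.charAt(SMALL.length()));
        }
        return codePoint;
    }

    /** Applies baseLetter to every code point of the input. */
    static String strip(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> out.appendCodePoint(baseLetter(cp)));
        return out.toString();
    }
}
```

Working from names rather than from Unicode decomposition means letters such as LATIN SMALL LETTER O WITH STROKE, which have no canonical decomposition, are also mapped to their base letter.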

Re: Optimising A Security Filter

2004-12-20 Thread Erik Hatcher
f the filter represent the entire index, or just the results that match the query? It represents the entire index at the time it was instantiated. This is important to know in case documents are later added to the index. Is not worrying about filters and simply checking the returned Hit List b

Re: Optimising A Security Filter

2004-12-20 Thread Paul Elschot
he text of my docs is in > Lucene, but the permissions are in my RDBMS. I can > write a filter (in fact have done so) that loops > through the documents in the passed IndexReader and > queries the DB to detect if the user is permissioned > for them, setting the relevant BitSet. My

Optimising A Security Filter

2004-12-19 Thread Steve Skillcorn
Hello All; I bought the Lucene in Action ebook, which is excellent and I can strongly recommend. One question that has arisen from the book though is custom filters. I have the situation where the text of my docs is in Lucene, but the permissions are in my RDBMS. I can write a filter (in fact
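The filter described in this post could be sketched roughly as follows. `PermissionFilterSketch` and the `isPermitted` predicate are hypothetical stand-ins: in a real Lucene filter of that era the loop would presumably live in `Filter.bits(IndexReader)`, `maxDoc` would come from the reader, and the predicate would be the database lookup.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

public class PermissionFilterSketch {

    /**
     * Builds one bit per document number; a set bit means the user may see
     * that document. isPermitted stands in for the RDBMS permission check.
     */
    static BitSet permittedDocs(int maxDoc, IntPredicate isPermitted) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc = 0; doc < maxDoc; doc++) {
            if (isPermitted.test(doc)) {
                bits.set(doc);
            }
        }
        return bits;
    }
}
```

Written this way the cost is one permission check per document in the index, regardless of how many documents the query actually matches.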

RE: Filter !!!

2004-12-07 Thread Natarajan.T
Thanks for your kind help, Erik. -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 07, 2004 12:56 PM To: Lucene Users List Subject: Re: Filter !!! On Dec 7, 2004, at 12:55 AM, Chris Hostetter wrote: > > : Hits hits = indexSearcher.

Re: Filter !!!

2004-12-07 Thread Chris Hostetter
and then some. (the Filter[] chain is what I was planning, but the int[] logic idea is something I hadn't considered ... I figured when I needed multiple Filters combined with different operators I could just build a tree of Filters, but I'm guessing this approach will come in hand

Re: Filter !!!

2004-12-06 Thread Erik Hatcher
On Dec 7, 2004, at 12:55 AM, Chris Hostetter wrote: : Hits hits = indexSearcher.search(searchQuery, filter) // here I want : to pass multiple filter... (DateFilter,QueryFilter) You can write a Filter that takes in multiple filters and ANDs them together (or ORs them, it's not clear

RE: Filter !!!

2004-12-06 Thread Natarajan.T
Thanks for your response.. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Hostetter Sent: Tuesday, December 07, 2004 11:26 AM To: Lucene Users List Subject: Re: Filter !!! : Hits hits = indexSearcher.search(searchQuery, filter

Re: Filter !!!

2004-12-06 Thread Chris Hostetter
: Hits hits = indexSearcher.search(searchQuery, filter) // here I want : to pass multiple filter... (DateFilter,QueryFilter) You can write a Filter that takes in multiple filters and ANDs them together (or ORs them, it's not clear what you want) Hits h = s.search(q,new AndFilter(
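One way such an AND-combining filter might look is sketched below. The `AndFilter` implementation referred to in the message is not shown in this listing, so this is a guess at the idea rather than that code: `DocFilter` is a hypothetical stand-in for Lucene's `Filter`, reduced to "return one bit per document number".

```java
import java.util.BitSet;

public class AndFilterSketch {

    /** Hypothetical stand-in for a Lucene Filter: one bit per document number. */
    interface DocFilter {
        BitSet bits(int maxDoc);
    }

    /** Returns a filter that passes only documents accepted by every given filter. */
    static DocFilter and(DocFilter... filters) {
        return maxDoc -> {
            BitSet result = new BitSet(maxDoc);
            result.set(0, maxDoc); // start with every document allowed
            for (DocFilter f : filters) {
                result.and(f.bits(maxDoc)); // intersect with each filter in turn
            }
            return result;
        };
    }
}
```

An OR variant would start from an empty `BitSet` and call `or` instead of `and`.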

Filter !!!

2004-12-06 Thread Natarajan.T
Hi All, I want to pass multiple filter (QueryFilter, DateFilter) objects to the search method. See below: Hits hits = indexSearcher.search(searchQuery, filter) // here I want to pass multiple filters... (DateFilter, QueryFilter) How can I handle this? Regards, Natarajan.

Re: Filter for a search refinement

2004-11-21 Thread Erik Hatcher
On Nov 21, 2004, at 8:34 AM, Nicolas Maisonneuve wrote: yes ...it's the same kind of feature... (i didn't see this Filter !, shame on me) but my method is maybe faster because with the queryFilter an internal search is launched and not with my method It'd be interesting for you

Re: Filter for a search refinement

2004-11-21 Thread Erik Hatcher
QueryFilter keys off the hits from a previous search to light up the bits for documents to pass the filter. The previous search hits all have a score > 0 already, so no need to be concerned with score there. Erik On Nov 21, 2004, at 8:49 AM, Nicolas Maisonneuve wrote: hmm jus

Re: Filter for a search refinement

2004-11-21 Thread Nicolas Maisonneuve
<[EMAIL PROTECTED]> wrote: > yes ...it's the same kind of feature... (i didn't see this Filter !, > shame on me) > but my method is maybe faster because with the queryFilter an internal > search is launched and not with my method > > nicolas > > > > > O

Re: Filter for a search refinement

2004-11-21 Thread Nicolas Maisonneuve
yes ...it's the same kind of feature... (i didn't see this Filter !, shame on me) but my method is maybe faster because with the queryFilter an internal search is launched and not with my method nicolas On Sun, 21 Nov 2004 05:06:12 -0500, Erik Hatcher <[EMAIL PROTECTED]> wrote:

Re: Filter for a search refinement

2004-11-21 Thread Erik Hatcher
Nicolas - how does your filter differ from the capabilities available from the built-in QueryFilter? It seems at first glance to be nearly the same thing. Erik On Nov 21, 2004, at 4:52 AM, Nicolas Maisonneuve wrote: I developed a filter to search in filtering the search with anterior

Re: Accent filter

2004-09-28 Thread John Moylan
in my index. But unfortunately I couldn't find anything in either Lucene or the lucene-sandbox to solve the problem. So I wrote an accent filter and thought that I might as well share it with you guys :) package dk.atira.s

Accent filter

2004-09-28 Thread Bo Gundersen
Hi, I am certainly not the first, and probably not the last, to have had problems with accented characters in my index. But unfortunately I couldn't find anything in either Lucene or the lucene-sandbox to solve the problem. So I wrote an accent filter and thought that I might as well share

Re: Custom filter

2004-08-24 Thread roy-lucene-user
SSIVE indexes so updating needs to be planned out. In the meantime we continue with 1.2. So, just for curiosity's sake... any clue on the filter? Or perhaps someone could clue me in on what kind of terms the query parser creates ( and what the searcher class does with them ) when it has s

Re: Custom filter

2004-08-20 Thread Erik Hatcher
On Aug 20, 2004, at 6:48 PM, [EMAIL PROTECTED] wrote: We're currently in lucene 1.2... haven't moved to 1.3 yet. Skip 1.3 and go straight to 1.4.1 :) Upgrade - why not? Erik - To unsubscribe, e-mail: [EMAIL PROTECTED] For a

Re: Custom filter

2004-08-20 Thread roy-lucene-user
We're currently in lucene 1.2... haven't moved to 1.3 yet. Roy. On Fri, 20 Aug 2004 18:46:29 -0400, Erik Hatcher wrote > Have you considered using the built-in QueryFilter for this? Why > isn't it sufficient for your needs?

Re: Custom filter

2004-08-20 Thread Erik Hatcher
Have you considered using the built-in QueryFilter for this? Why isn't it sufficient for your needs? Erik On Aug 20, 2004, at 6:32 PM, [EMAIL PROTECTED] wrote: Hi guys! I was hoping someone here could help me out with a custom filter. We have an index of emails and do some search

Custom filter

2004-08-20 Thread roy-lucene-user
Hi guys! I was hoping someone here could help me out with a custom filter. We have an index of emails and do some searches on the text of an email message and also searches based on the email addresses in a To, From or CC. Since we also do searches on a bunch of emails, we created a custom

Re: Performance when computing a filter using hundreds of diff terms.

2004-08-06 Thread Paul Elschot
Kevin, On Thursday 05 August 2004 23:32, Kevin A. Burton wrote: > I'm trying to compute a filter to match documents in our index by a set > of terms. > > For example some documents have a given field 'category' so I need to > compute a filter with multiple categorie

Performance when computing a filter using hundreds of diff terms.

2004-08-05 Thread Kevin A. Burton
I'm trying to compute a filter to match documents in our index by a set of terms. For example some documents have a given field 'category' so I need to compute a filter with multiple categories. The problem is that our category list is > 200 items so it takes about 80 secon

How to feed Filter into Query/IndexReader?

2004-04-06 Thread Holger Klawitter
Hi there, I am trying to allow hit counters for partial queries in my search app. From what I understand I have to 1.) create partial Queries, 2.) search them via FilterIndexReader, 3.) grab the number of hits for each search 4.) creat

Re: The Filter got called more than one time

2004-03-30 Thread Erik Hatcher
Use a caching mechanism for your filter, so the bitset is not regenerated. CachingWrappingFilter is your friend :) Erik On Mar 30, 2004, at 2:28 PM, Ching-Pei Hsing wrote: Hi, We implemented a Filter that performs filtering based on some internal pricing logic. While testing we discovered
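The caching idea suggested here can be sketched without any Lucene classes. Everything below is hypothetical: the type parameter `R` stands in for an index reader, and `delegate` stands in for the expensive filter computation. The point is only that the bitset is computed once per reader and then reused.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

public class CachingFilterSketch<R> {

    private final Function<R, BitSet> delegate;
    // Weak keys let a cached bitset be collected once its reader is no longer in use.
    private final Map<R, BitSet> cache = new WeakHashMap<>();

    public CachingFilterSketch(Function<R, BitSet> delegate) {
        this.delegate = delegate;
    }

    /** Computes the bitset on the first call for a reader and returns the cached one afterwards. */
    public synchronized BitSet bits(R reader) {
        return cache.computeIfAbsent(reader, delegate);
    }
}
```

Because the same `BitSet` instance is handed back on every call, callers in this sketch must treat it as read-only.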

The Filter got called more than one time

2004-03-30 Thread Ching-Pei Hsing
Hi, We implemented a Filter that performs filtering based on some internal pricing logic. While testing we discovered that this filter got called several times, not like the FAQ says, exactly one time. And the number of calls made was based on how big the result set was. I printed out the calling

Re: HTML tag filter...

2004-01-10 Thread Erik Hatcher
On Jan 10, 2004, at 1:43 PM, [EMAIL PROTECTED] wrote: would it be possible to implement an Analyzer that filters HTML code out of an HTML page. As a result I would have only the text free of any tagging. The dilemma is that in a general sense there are multiple fields in HTML. At least "title" and

Re: HTML tag filter...

2004-01-10 Thread Stefan Groschupf
If you browse the CVS of nutch.org you will find an implementation. HTH Stefan On 10.01.2004 at 19:43, [EMAIL PROTECTED] wrote: Hi group, would it be possible to implement an Analyzer that filters HTML code out of an HTML page. As a result I would have only the text free of any tagging. Is it m

HTML tag filter...

2004-01-10 Thread ambiesense
Hi group, would it be possible to implement an Analyzer that filters HTML code out of an HTML page? As a result I would have only the text free of any tagging. Is it maybe better to use other existing open source software for that? Has anybody here tried that? Cheers, Ralf

Powerpoint filter.

2003-09-02 Thread Gregor Heinrich
Hi all, this is partly connected to Docco's new filters. Is there anyone interested in a PowerPoint filter? This format seems to have been kept incredibly obscure by our big friends in Redmond. I only found a shareware tool called CZ-PPT2TXT, but open source would i.m.h.o. be a bit more u

Use filter instead of searching Re: Error when trying to match file path

2002-12-30 Thread Che Dong
First, index the file path field as an untokenized field: Field("filePath", file.getAbsolutePath(), true, true, false). Second, construct a prefix filter for the searcher. I wrote a StringFilter.java for match and prefix match, which can be downloaded from: http://www.chedon

newbie Filter question

2002-10-18 Thread Todd McGrath
e no results. In my test cases, if I hard code a filter with just "top" or create two filters: "top" and "firms", there are results. I have a feeling this is related to filter terms being tokenized. What are the best ways to use a filter where the text contains spaces?

using NOT in queries and searching using more than one filter

2002-08-08 Thread Minh Kama Yie
can point me in the right direction? Also, I was wondering if anyone has ever searched using more than one filter at a time to do searching via IndexSearcher? Thanks in advance. Regards, Minh Kama Yie

Re: Few questions regarding the design of the Filter class

2002-05-24 Thread Christian Meunier
More or less : "A ChainableFilter allows multiple filters to be chained such that the result is the intersection of all the filters." I apply an OR operator on filters which are based on the same field (hence the issue, I need to know on which field the filter is based) (All my f

RE: Few questions regarding the design of the Filter class

2002-05-24 Thread Armbrust, Daniel C.
Looks to me like you're looking for Kelvin Tan's chainable filter http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg01168.html Dan -Original Message- From: Christian Meunier [mailto:[EMAIL PROTECTED]] Sent: Friday, May 24, 2002 5:38 AM To: Lucene Users List Subject: Re

Re: Few questions regarding the design of the Filter class

2002-05-24 Thread Christian Meunier
> > A workaround for what? It's not clear what you're trying to do. > Here is what i am trying to do: A simple class to filter a fiel

RE: Few questions regarding the design of the Filter class

2002-05-24 Thread cutting
> From: Christian Meunier > > > > From: Christian Meunier > > > > > > Why there is not method to get the field on which the filter > > > is used to restrict the search ? > > > > A filter may not always restrict the search to a single >

Re: Few questions regarding the design of the Filter class

2002-05-23 Thread Christian Meunier
- Original Message - From: <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Thursday, May 23, 2002 10:04 PM Subject: RE: Few questions regarding the design of the Filter class > > From: Christian Meunier > > > > Hi, i have few questions regarding the Fil

RE: Few questions regarding the design of the Filter class

2002-05-23 Thread cutting
> From: Christian Meunier > > Hi, i have few questions regarding the Filter class. > > Why this is not an interface ? No good reason. Since interfaces have some performance penalties with most JVMs, when I first wrote Lucene I only used interfaces where multiple inheritance wa

Few questions regarding the design of the Filter class

2002-05-22 Thread Christian Meunier
Hi, I have a few questions regarding the Filter class. Why is this not an interface? Why is there no method to get the field on which the filter is used to restrict the search? Thanks in advance. Best regards, Christian Meunier

Re: HTML Analyzer & filter

2002-04-16 Thread David Black
>> To: [EMAIL PROTECTED] >> Subject: HTML Analyzer & filter >> >> >> Not to seem too lazy but I was just beginning to write an HTML Filter >> and Analyzer and thought..."gee, I bet someone has done this >> already". >> Are there any Apache/GPL HT

RE: HTML Analyzer & filter

2002-04-16 Thread Halácsy Péter
> -Original Message- > From: David Black [mailto:[EMAIL PROTECTED]] > Sent: Tuesday, April 16, 2002 5:07 PM > To: [EMAIL PROTECTED] > Subject: HTML Analyzer & filter > > > Not to seem too lazy but I was just beginning to write an HTML Filter > and Ana

HTML Analyzer & filter

2002-04-16 Thread David Black
Not to seem too lazy but I was just beginning to write an HTML Filter and Analyzer and thought..."gee, I bet someone has done this already". Are there any Apache/GPL HTML filters out there as a part of another project or that anyone on this list would be willing to contribute. Than

RE: Chainable Filter contribution

2002-03-28 Thread Armbrust, Daniel C.
iel C. Cc: [EMAIL PROTECTED] Subject: Re: Chainable Filter contribution Dan, Totally my bad. I had since changed it but hadn't posted it to the list coz I didn't think anyone found it useful. Here's the correct version. I haven't really documented since it's pretty strai

RE: Chainable Filter contribution

2002-03-27 Thread Strittmatter Stephan (external)
Kevin, it does not matter, but would be nice. Stephan > -Original Message- > From: Kelvin Tan [mailto:[EMAIL PROTECTED]] > Sent: Thursday, March 28, 2002 8:15 AM > To: Lucene Users List > Subject: Re: Chainable Filter contribution > > > Stephan, > >

Re: Chainable Filter contribution

2002-03-27 Thread Kelvin Tan
"'Lucene Users List'" <[EMAIL PROTECTED]> Sent: Thursday, March 28, 2002 2:54 PM Subject: RE: Chainable Filter contribution > Hi Kelvin, > > I did something similar, only doing XOR for my chains. > But now your improved filter is better than my own. > I think I

RE: Chainable Filter contribution

2002-03-27 Thread Strittmatter Stephan (external)
Hi Kelvin, I did something similar, only doing XOR for my chains. But now your improved filter is better than my own. I think I will replace my own with yours. Will it be part of Lucene in future? Regards, Stephan > -Original Message- > From: Kelvin Tan [mailto:[EMAIL PROTECTED]]

Re: Chainable Filter contribution

2002-03-27 Thread Kelvin Tan
iginal Message - From: "Armbrust, Daniel C." <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Thursday, March 28, 2002 5:17 AM Subject: Chainable Filter contribution > I found this in the mailing list, and I do need something like this, as I > need to apply more

Re: character filter issue

2002-01-08 Thread Otis Gospodnetic
Hello, See http://jguru.com/faq/view.jsp?EID=538308 Have you tried that? Otis --- "Oshima, Scott" <[EMAIL PROTECTED]> wrote: > Suppose we have one field with one string abc-xxx.com > > When I query for abc-xxx.com it returns 0 hits. > > BUT when i query for something like xxx.com it returns r

RE: character filter issue/tokenizing host names

2002-01-08 Thread Halácsy Péter
> -Original Message- > From: Oshima, Scott [mailto:[EMAIL PROTECTED]] > Sent: Tuesday, January 08, 2002 11:53 PM > To: 'Lucene Users List' > Subject: character filter issue > > > Suppose we have one field with one string abc-xxx.com > > When I

character filter issue

2002-01-08 Thread Oshima, Scott
Suppose we have one field with one string abc-xxx.com When I query for abc-xxx.com it returns 0 hits. BUT when i query for something like xxx.com it returns results fine. not sure what lucene is doing with the dashes. i am using the default standardfilter, lowercasefilter, stopfilter and porte

WG: Filter and stop-words

2001-12-03 Thread Ruffieux Stephane, yellowworld extern
-Original Message- From: Brian Brown [SMTP:[EMAIL PROTECTED]] <mailto:[SMTP:[EMAIL PROTECTED]]> Sent: Monday, 3 December 2001 18:06 To: [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]> Subject: Fw: Filter and stop-words I have just started work on a French stemmer

Fw: Filter and stop-words

2001-12-03 Thread Brian Brown
). Should we consider collaboration? Brian Brown - Original Message - From: "Ruffieux Stephane, yellowworld extern" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Monday, December 03, 2001 1:33 PM Subject: WG: Filter and stop-words Yes, it is a way.

WG: Filter and stop-words

2001-12-03 Thread Ruffieux Stephane, yellowworld extern
PROTECTED]> Subject: Filter and stop-words I'm new to Lucene. First of all I would like to know if there is a search archive like "sun servlets list". My first problem is that I want to index a Portuguese database and I need to remove the "s" (plural) and

RE: Filter and stop-words

2001-12-03 Thread Karl Øie
/portuguese/stemmer.html Best regards, Karl Øie -Original Message- From: Bizu de Anúncio [mailto:[EMAIL PROTECTED]] Sent: 3 December 2001 13:22 To: [EMAIL PROTECTED] Subject: Filter and stop-words I'm new to Lucene. First of all I would like to know if there is a search archive like "su

Filter and stop-words

2001-12-03 Thread Bizu de Anúncio
I'm new to Lucene. First of all I would like to know if there is a search archive like "sun servlets list". My first problem is that I want to index a Portuguese database and I need to remove the "s" (plural) and accents (à é ...) from the words. Is there