OK, I think "query string" is a bit too specific. More generally,
what I need is access, from inside a filter, to the complete string
(not only the current token) being analyzed.
A very dirty workaround would be a "collector filter" that collects all
tokens after WhitespaceTokenizer and somehow makes them available to
the following filters, wouldn't it?
That way, at least by the last call to incrementToken(), I would have the original string.
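A minimal sketch of the "collector filter" idea, written against a plain Iterator<String> rather than the real Lucene TokenStream API so that it runs on its own. CollectingFilter and CollectorDemo are hypothetical names, not Lucene or Solr classes. It drains the upstream tokens into a buffer, exposes them joined by single spaces, and then replays them downstream. Note the assumptions: the joined text is only a reconstruction (original whitespace is lost), and buffering everything is exactly the heap cost warned about for real documents.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a buffering TokenFilter; not a Lucene class. */
final class CollectingFilter implements Iterator<String> {
    private final Iterator<String> input;                 // upstream "tokenizer"
    private final Deque<String> buffer = new ArrayDeque<>();
    private String collected;                             // null until upstream is drained

    CollectingFilter(Iterator<String> input) {
        this.input = input;
    }

    /** Drains the upstream once and remembers all tokens in order. */
    private void fill() {
        if (collected != null) {
            return;
        }
        while (input.hasNext()) {
            buffer.add(input.next());
        }
        collected = String.join(" ", buffer);
    }

    /** All upstream tokens joined by one space (a reconstruction, not the raw input). */
    String collectedText() {
        fill();
        return collected;
    }

    @Override
    public boolean hasNext() {
        fill();
        return !buffer.isEmpty();
    }

    /** Replays the buffered tokens one by one, in their original order. */
    @Override
    public String next() {
        fill();
        if (buffer.isEmpty()) {
            throw new NoSuchElementException();
        }
        return buffer.remove();
    }
}

final class CollectorDemo {
    public static void main(String[] args) {
        CollectingFilter filter =
            new CollectingFilter(Arrays.asList("quick", "brown", "fox").iterator());
        System.out.println(filter.collectedText());
        while (filter.hasNext()) {
            System.out.println(filter.next());
        }
    }
}
```

In this sketch the complete text is available before the first token is replayed, not only at the last call, because the whole upstream is consumed up front.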
Bernd
On 26.10.2011 10:26, Uwe Schindler wrote:
The input from StringReader does not help you:
- in the case of QueryParser it is *not* the query string!!!
- storing it in an attribute would blow up your heap for real documents
Uwe
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]
-----Original Message-----
From: Bernd Fehling [mailto:[email protected]]
Sent: Wednesday, October 26, 2011 10:06 AM
To: [email protected]
Subject: Re: accessing the query string from inside TokenFilter
From what I can see in the debugger the analyzer chain is implemented as a
stack with last filter at the bottom and the first filter at the top.
An analyzer query chain of:
charFilter: MappingCharFilterFactory
tokenizer : WhitespaceTokenizerFactory
filter : PatternReplaceFilterFactory
filter : LowerCaseFilterFactory
filter : ShingleFilterFactory
filter : SynonymFilterFactory
has a chain of:
this.input(SynonymFilter) --> input(ShingleFilter) -->
input(LowerCaseFilter) --> input(PatternReplaceFilter) -->
input(WhitespaceTokenizer) --> input(MappingCharFilter) -->
input(CharReader) --> input(StringReader).str
So I can always "see" the input of StringReader, but can I access it?
Bernd
On 26.10.2011 09:37, Chris Male wrote:
We've also lost the full query string by the time the QP creates its
TokenStream, right? Because the QP tokenizes on whitespace.
On Wed, Oct 26, 2011 at 8:32 PM, Uwe Schindler<[email protected]> wrote:
Hi Simon,
The problem is the exchanged consumer/producer role. Once the
TokenStream calls clearAttributes() the attributes are gone, but the
query parser can only set the attribute *before* calling
incrementToken(), so you have no chance to get them, as the Tokenizer
has cleared it before any filter can read it (unless we use an attribute
whose clear() is a no-op, which would fail lots of tests, as it's a hack).
Uwe
-----Original Message-----
From: Simon Willnauer [mailto:[email protected]]
Sent: Wednesday, October 26, 2011 9:21 AM
To: [email protected]
Subject: Re: accessing the query string from inside TokenFilter
What Uwe says is correct though. What we could possibly do is add
a query attribute that is set in a query parser (you can do that
yourself though).
Not sure if it is worth it and if we should do it.
simon
On Wed, Oct 26, 2011 at 8:58 AM, Uwe Schindler<[email protected]> wrote:
Hi,
QueryParser and TokenStreams are clearly separated, there is no way
to get the query string from inside a TokenStream (and there cannot
be, because QP is a consumer of the TS, which is used not only for
query parsing). The only chance you have is to use a ThreadLocal
that you set before the query is parsed and then use it in the
TokenFilter.
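A minimal sketch of the ThreadLocal approach described above, kept free of Lucene dependencies so it runs on its own. QueryStringHolder, ComparingFilter and ThreadLocalDemo are hypothetical names introduced for illustration; ComparingFilter only stands in for a real TokenFilter, and the whitespace split only stands in for the tokenizer. The caller sets the query string before analysis starts and clears it in a finally block, so the value does not leak to the next request handled by the same pooled thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-thread holder for the query string; not part of Lucene or Solr. */
final class QueryStringHolder {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private QueryStringHolder() {}

    static void set(String queryString) { CURRENT.set(queryString); }

    static String get() { return CURRENT.get(); }

    static void clear() { CURRENT.remove(); }
}

/** Stand-in for a TokenFilter: it sees one token at a time, plus the holder. */
final class ComparingFilter {
    /** Marks a token that equals the complete query string (ignoring case). */
    String process(String token) {
        String full = QueryStringHolder.get();
        boolean isWholeQuery = full != null && full.equalsIgnoreCase(token);
        return isWholeQuery ? token + "|WHOLE" : token;
    }
}

final class ThreadLocalDemo {
    /** Sets the holder, runs the "analysis" on the same thread, then clears the holder. */
    static List<String> analyze(String queryString) {
        QueryStringHolder.set(queryString);
        try {
            ComparingFilter filter = new ComparingFilter();
            List<String> out = new ArrayList<>();
            // Whitespace split as a stand-in for the tokenizer.
            for (String token : queryString.trim().split("\\s+")) {
                out.add(filter.process(token));
            }
            return out;
        } finally {
            QueryStringHolder.clear();
        }
    }

    public static void main(String[] args) {
        System.out.println(analyze("foo bar"));
        System.out.println(analyze("foo"));
    }
}
```

This only works if parsing and analysis happen on the thread that set the value, which is an assumption about the calling code and not something the sketch enforces.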
Uwe
-----Original Message-----
From: Bernd Fehling [mailto:[email protected]]
Sent: Wednesday, October 26, 2011 8:33 AM
To: [email protected]
Subject: accessing the query string from inside TokenFilter
Dear list,
while writing a TokenFilter for my analyzer chain I need access to the
query string from inside my TokenFilter for some comparison, but the
filters work with a TokenStream and get separate tokens.
Currently I cannot get any access to the query string.
It would be great to have such functionality in Lucene/Solr.
Should I open a JIRA issue for it, or is there a wish list somewhere?
Best regards
Bernd
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected] For
additional commands, e-mail: [email protected]
--
*************************************************************
Bernd Fehling Universitätsbibliothek Bielefeld
Dipl.-Inform. (FH) Universitätsstr. 25
Tel. +49 521 106-4060 Fax. +49 521 106-4052
[email protected] 33615 Bielefeld
BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************