On 9/2/2012 8:54 PM, Steven A Rowe wrote:
Hi Damerian,
One way to handle your scenario is to hold on to the previous token, and only
emit a token after you reach at least the second token (or at end-of-stream).
Your incrementToken() method could look something like:
1. Get current attributes: input.incrementToken()
2. If previous token does not exist:
2a. Store current attributes as previous token (see
AttributeSource#cloneAttributes)
2b. Get current attributes: input.incrementToken()
3. Check for and store conditions that will affect the previous token's attributes
4. Store current attributes as next token (see AttributeSource#cloneAttributes)
5. Copy previous token into current attributes (see AttributeSource#copyTo);
the target will be "this", which is an AttributeSource.
6. Make changes based on conditions found in step #3 above
7. set previous token = next token
8. return true
(Everywhere I say "token" I mean "instance of AttributeSource".)
The final token in the input stream will need special handling, as will
single-token input streams.
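The buffering idea in the steps above can be sketched without any Lucene classes. This is only an illustration under stated assumptions: tokens are modeled as plain non-null strings instead of AttributeSource instances, and the class name LookaheadSketch and its methods are hypothetical. Each call to next() emits the held-back token together with the token that follows it, and the last token (or the only token) is flushed with an empty follower.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: plain strings stand in for AttributeSource instances.
public class LookaheadSketch {
    private final Iterator<String> input;
    private String previous;   // token that has been read but not yet emitted
    private boolean started;   // false until the first token has been buffered

    LookaheadSketch(Iterator<String> input) {
        this.input = input;
    }

    /** Returns "token|followingToken", or null once the stream is exhausted. */
    String next() {
        if (!started) {
            started = true;
            // Steps 2 and 2a: no previous token yet, so buffer the first one.
            if (input.hasNext()) {
                previous = input.next();
            }
        }
        if (previous == null) {
            return null; // empty stream, or everything has been emitted
        }
        // Read one token ahead; null means the buffered token is the last one.
        String current = input.hasNext() ? input.next() : null;
        // Steps 5 and 6: emit the previous token, adjusted by what follows it.
        String emitted = previous + "|" + (current == null ? "" : current);
        // Step 7: the token just read becomes the buffered token.
        previous = current;
        return emitted;
    }

    /** Convenience helper that drains a whole token list through next(). */
    static List<String> drain(List<String> tokens) {
        LookaheadSketch sketch = new LookaheadSketch(tokens.iterator());
        List<String> out = new ArrayList<>();
        for (String t = sketch.next(); t != null; t = sketch.next()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(List.of("My", "name", "is", "James", "Bond")));
    }
}
```

In a real TokenFilter the "|" concatenation would be replaced by whatever attribute change is needed, for example adjusting the position increment of the emitted token.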
Good luck,
Steve
-----Original Message-----
From: Damerian [mailto:dameria...@gmail.com]
Sent: Thursday, February 09, 2012 2:19 PM
To: java-user@lucene.apache.org
Subject: Access next token in a stream
Hello, I want to implement a custom filter. My question is quite simple,
but I cannot find a solution to it no matter how I try:
How can I access the TermAttribute of the token that follows the one I
currently have in my stream?
For example, in the phrase "My name is James Bond", if, let's say, I am at
the token [My], I would like to be able to check the TermAttribute of
the following token [name] and fix my position increment accordingly.
Thank you in advance!
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Hi Steve,
Thank you for your immediate reply. I will try your solution, but I feel
that it does not solve my case.
What I am trying to build is a filter that joins two consecutive
terms/tokens that start with a capital letter (it tries to find all
the names/surnames and make each pair one token). So in my aforementioned
example, when I examine [James], even if I store its TermAttribute in a
temporary token, how can I check the next one, [Bond], and join them
without actually emitting [James] on its own (and thereby creating a
term for it in my inverted index)?
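A minimal sketch of this case, assuming tokens are plain strings rather than Lucene attributes: the class CapitalizedJoinSketch and its methods are hypothetical names, and position increments are not modeled. The point it illustrates is that a held token is never emitted until the following token has been inspected, so [James] does not reach the output on its own when [Bond] follows it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: plain strings stand in for tokens and their attributes.
public class CapitalizedJoinSketch {

    static boolean startsWithCapital(String token) {
        return !token.isEmpty() && Character.isUpperCase(token.charAt(0));
    }

    /** Joins each pair of adjacent capitalized tokens into a single token. */
    static List<String> joinCapitalizedPairs(List<String> tokens) {
        List<String> out = new ArrayList<>();
        Iterator<String> input = tokens.iterator();
        // Hold the first token back instead of emitting it immediately.
        String held = input.hasNext() ? input.next() : null;
        while (held != null) {
            // Look one token ahead before deciding what to emit.
            String next = input.hasNext() ? input.next() : null;
            if (next != null && startsWithCapital(held) && startsWithCapital(next)) {
                // Both capitalized: emit one joined token; neither appears alone.
                out.add(held + " " + next);
                held = input.hasNext() ? input.next() : null;
            } else {
                // No join: emit the held token and hold the lookahead instead.
                out.add(held);
                held = next;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [My, name, is, James Bond]
        System.out.println(joinCapitalizedPairs(List.of("My", "name", "is", "James", "Bond")));
    }
}
```

This version joins pairs only, so three capitalized tokens in a row produce one joined pair followed by a single token; joining longer runs would need the held token to keep accumulating until a non-capitalized token arrives.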
Thank you again for your insight. I would really appreciate any other
views on the matter.
Regards, Damerian