On Thu, 19 Mar 2020, Marc Jeurissen wrote:
PyLucene version: 8.1.1
Hi all,
When you have a custom tokenizer (class CustomTokenizer(PythonTokenizer)),
you don't seem to be able to override any method besides incrementToken
(so not end, reset, close).
Is this correct?
Correct, the only native method in PythonTokenizer.java meant to be
implemented in Python is incrementToken() since that is what Tokenizer.java
documents as being the method to extend.
This doesn't mean that you can't add your own extension points. Just edit
PythonTokenizer.java, add the native methods you wish to implement from
Python, and rebuild extensions.jar and PyLucene. If you override reset() or
close(), you probably still want to ensure that the parent versions are
called from your own Python overrides by casting your instance to the parent
class with that class's cast_() method, using something like
Tokenizer.cast_(mytok).reset()
Andi..
Thank you very much
Kind regards,
Marc Jeurissen
Bibliotheek UAntwerpen
Stadscampus - Ve35.303
Venusstraat 35 - 2000 Antwerpen
marc.jeuris...@uantwerpen.be
T +32 3 265 49 71