On Jul 17, 2010, at 22:30, Andi Vajda va...@apache.org wrote:
On Jul 17, 2010, at 22:23, Martin mar...@webscio.net wrote:
Hi there,
I'm trying to extend the PythonTokenizer class to build my own
custom tokenizer, but I seem to get stuck pretty soon after
that. I know that I'm
Hey,
Thanks for the tips. The Java people pointed me towards the KeywordTokenizer,
which returns the full content as a single token (not a very
intuitive name in my opinion, but anyway). I might still need to extend
this to do some customizations, so I'll look into the PythonAnalyzer