Re: Building a custom Tokenizer

2010-07-18 Thread Andi Vajda
On Jul 17, 2010, at 22:30, Andi Vajda va...@apache.org wrote: On Jul 17, 2010, at 22:23, Martin mar...@webscio.net wrote: Hi there, I'm trying to extend the PythonTokenizer class to build my own custom tokenizer, but I seem to get stuck pretty soon after that. I know that I'm

Re: Building a custom Tokenizer

2010-07-18 Thread Martin
Hey, thanks for the tips. The Java people pointed me towards the KeywordTokenizer, which returns the full content as one token (not a very intuitive name in my opinion, but anyway). I might still need to extend this to do some customizations, so I'll look into the PythonAnalyzer
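
For anyone reading this thread later, here is a rough, stand-alone sketch of the behaviour described above: the whole input comes back as a single token. It is plain Python and does not use the Lucene or PyLucene API; the function name keyword_tokenize, the buffer_size parameter and the (term, start, end) tuple layout are made up for illustration only. Yielding one empty token for empty input is a choice of this sketch, not a statement about what Lucene does.

```python
import io


def keyword_tokenize(reader, buffer_size=256):
    """Hypothetical helper: read the whole stream and yield it as one token.

    This only imitates the "full content as one token" idea from the thread;
    it is not the Lucene KeywordTokenizer implementation.
    """
    chunks = []
    while True:
        chunk = reader.read(buffer_size)
        if not chunk:  # empty string means end of stream
            break
        chunks.append(chunk)
    text = "".join(chunks)
    # Exactly one (term, start_offset, end_offset) tuple for the whole input.
    yield (text, 0, len(text))


tokens = list(keyword_tokenize(io.StringIO("New York City")))
print(tokens)
```

A whitespace tokenizer would split the same input into three tokens; here the spaces stay inside the single term, which is what you want for fields that should match only as a whole.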