2011/1/4 Julien Nioche <[email protected]>:
> Hi,
>
> Interesting! I'll definitely have a closer look at this and see if / how
> pignlproc could be a good match with Behemoth (
> https://github.com/jnioche/behemoth). Speaking of which, I'll probably write
> an openNLP wrapper for Behemoth at some point. Feel free to get in touch if
> this is of interest.

OpenNLP already ships UIMA wrappers that could probably be used to run it on a Behemoth setup. However, I really like the simplicity of Pig compared to a UIMA runtime, which adds an intermediate Java object (i.e. the CAS and its type system) that puts further pressure on the JVM garbage collector. Pig supports optional type declarations to optimize processing when needed; otherwise data is just treated as byte[], with no wrapping overhead and no useless memory allocation that can hurt GC performance.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
