Hi Pablo! Sounds like really good progress.
How about using spaCy with some rule-based processing to filter the output of a generative LLM?

Best,
Eddie

On Tue, Sep 5, 2023 at 3:59 PM Pablo Duboue <pablo.dub...@gmail.com> wrote:
> Hi,
>
> Just an update that might be useful for the upcoming board report.
>
> The work on UIMA CPP is ongoing, with good progress.
>
> A suitable delivery mechanism for project releases has been identified
> (a Docker image), and the hope is to get a UIMA CPP Docker Hub slot once
> the chair files an INFRA issue for it.
>
> Issues regarding Python multi-interpreter support have been identified,
> and two possible releases of the Pythonator are available: one fully
> embedded and the other with multi-interpreter support. Sadly, NumPy, a
> popular numeric processing library (and a requirement for most machine
> learning packages in Python), does not support multiple interpreters.
>
> Using the fully embedded library, a proof of concept running an
> annotator with the popular spaCy NLP library was executed successfully.
>
> Work is still ongoing to package UIMA CPP so it builds as a system
> library inside the Docker image rather than using custom paths.
>
> I'm currently looking for a good application of UIMA CPP that processes
> large amounts of text related to LLMs (the topic du jour), to attract
> more interest in the new release.
>
> Once this base release is done, the objective is to support aggregates
> natively in UIMA CPP and to bring back the JNI code that would allow
> running UIMAJ annotators in the pipeline.
>
> Questions? Comments? Would love to hear them.
>
> P
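P.S. In case it helps, here is a minimal sketch of the spaCy rule-based filtering idea. The matcher pattern and the drop-the-whole-sentence policy are just illustrative assumptions on my part, not anything we've agreed on; it uses a blank English pipeline with a rule-based sentencizer so no trained model download is needed:

```python
# Sketch: rule-based post-filtering of generative-LLM output with spaCy.
# The matcher rule below ("as an AI" self-references) is an illustrative
# assumption; real rules would come from the actual application.
import spacy
from spacy.matcher import Matcher

# Blank English pipeline plus a rule-based sentencizer (no model needed).
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

matcher = Matcher(nlp.vocab)
matcher.add("SELF_REF", [[{"LOWER": "as"}, {"LOWER": "an"}, {"LOWER": "ai"}]])

def filter_llm_output(text: str) -> str:
    """Keep only the sentences that trigger no matcher rule."""
    doc = nlp(text)
    # Record the start index of every sentence containing a match.
    flagged = {doc[start:end].sent.start for _, start, end in matcher(doc)}
    return " ".join(s.text for s in doc.sents if s.start not in flagged)

print(filter_llm_output("As an AI, I cannot say. The sky is blue."))
# → The sky is blue.
```

The same logic would sit naturally inside a UIMA CPP spaCy annotator: the LLM output becomes the document text, and the filtered sentences become annotations.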