Hi!

On Mon, Sep 28, 2015 at 02:31:03PM +0100, Baker James D wrote:
> I would like to draw your attention to a text analytics framework that has
> just been released by Dstl (part of the UK Ministry of Defence). It uses UIMA
> as part of its underlying architecture but provides additional functionality
> on top of that, and simplifies much of the user configuration and experience,
> as well as the development process. A number of collection readers,
> annotators and consumers are included as part of the framework.
>
> The tool is called Baleen, and is released under Apache Software License 2.
>
> There is more information about the tool on the press release
> (https://www.gov.uk/government/news/dstl-adds-to-open-source-software), and
> on the GitHub page (https://github.com/dstl/baleen).

Thanks for the heads up. However, I haven't found any clear summary of what the framework is capable of right now. I think you might want to expand the generic description a bit with some examples and use cases.

I have been looking around a bit, and it seems that e.g.

  https://github.com/dstl/baleen/blob/master/baleen/baleen-annotators/src/main/java/uk/gov/dstl/baleen/annotators/cleaners/MergeAdjacentQuantities.java

is something that could be pretty useful, but you might want to make the capabilities easier to discover in order to attract more users and contributors.

Best,

Petr Baudis