Hi Erik, all,

not to forget the UIMA Regex Annotator [1].

Currently there is no Stanbol EnhancementEngine that supports pattern-based
feature extraction (e.g. by using regex). A generic EnhancementEngine similar
to the UIMA Regex Annotator would definitely be a well-received addition to
Apache Stanbol.

A configuration would need to include two things:

(1) the pattern (e.g. a regex) used to detect and extract features from the
    text
(2) the annotation: how features extracted by the pattern should be
    represented as RDF (e.g. the construct part of a SPARQL query that gets
    its data from the pattern instead of from the select part of the query)

WDYT
Rupert

[1] http://uima.apache.org/downloads/sandbox/RegexAnnotatorUserGuide/RegexAnnotatorUserGuide.html

On Tue, Jul 23, 2013 at 7:51 PM, Erik Antelman <eantel...@gmail.com> wrote:
> I know several pipelines (GATE and several commercial NLP pipelines)
> annotate dates, measurements, address etc
> http://gate.ac.uk/sale/tao/splitap6.html#x36-736000F.7.
>
> I would like to have the same type of rule/re pattern annotation capability
> in a Stanbol chain. I would rather not just throw in GATE, but am thinking
> that perhaps the regex primitive in the Jena rule engine or using
> constructive SPARQL statements could achieve the same capability with
> existing core components.
>
> Obviously such an engine could utilize language/locale data to select
> pattern variants and to some degree the POS annotations.
>
> What ideas or feedback do people have on this?
>
> for(;;); /* eantel...@gmail.com */

--
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11             ++43-699-11108907
| A-5500 Bischofshofen
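
PS: To make the two-part configuration described above a bit more concrete,
here is a minimal sketch in Python. It is only an illustration under stated
assumptions: CONFIG, annotate, the ex: prefix and the {placeholder} template
syntax are all hypothetical names made up for this example. It is neither the
Stanbol EnhancementEngine API nor the UIMA Regex Annotator configuration
format, and the triple templates only loosely imitate the construct part of a
SPARQL query.

```python
import re

# Hypothetical configuration: none of these names come from Stanbol or UIMA.
CONFIG = {
    # (1) the pattern: ISO-style dates such as 2013-07-23, with named groups
    "pattern": r"\b(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})\b",
    # (2) the annotation: triple templates, loosely modelled on the construct
    # part of a SPARQL query; {name} is filled in from the regex match
    "construct": [
        ("{annotation}", "rdf:type", "ex:DateAnnotation"),
        ("{annotation}", "ex:selectedText", '"{text}"'),
        ("{annotation}", "ex:start", "{start}"),
        ("{annotation}", "ex:end", "{end}"),
        ("{annotation}", "ex:value", '"{year}-{month}-{day}"^^xsd:date'),
    ],
}


def annotate(text, config):
    """Return (subject, predicate, object) string triples for every match."""
    triples = []
    for index, match in enumerate(re.finditer(config["pattern"], text)):
        # Named groups become bindings, plus a few built-in ones that
        # describe the matched span itself.
        bindings = dict(match.groupdict())
        bindings.update(
            annotation="_:annotation{}".format(index),
            text=match.group(0),
            start=match.start(),
            end=match.end(),
        )
        for template in config["construct"]:
            triples.append(tuple(part.format(**bindings) for part in template))
    return triples


if __name__ == "__main__":
    for triple in annotate("Sent on 2013-07-23 at 19:51.", CONFIG):
        print(" ".join(triple), ".")
```

The language/locale-specific pattern variants Erik mentions could, under the
same assumptions, be handled by keeping one such configuration per language
and selecting it by the detected language of the text.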