It would be nice if uimaFIT provided a Maven plugin to automatically generate descriptors for aggregates. If we come up with a convention for factories, e.g. "a class with static methods that do not take any parameters and that return descriptors" or "methods that bear a specific Java annotation (e.g. @AutoGenerateDescriptor)", it should be possible to implement such a Maven plugin.
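
A minimal sketch of how the annotation-based convention could be discovered with plain Java reflection. Everything here is an assumption for illustration: the `@AutoGenerateDescriptor` annotation does not exist in uimaFIT, `DescriptorFactoryScanner` and `ExampleFactory` are hypothetical names, and the factory methods return a `String` as a stand-in for a real descriptor so that the example runs without UIMA on the classpath.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

public class DescriptorFactoryScanner {

    /** Hypothetical marker annotation; not part of uimaFIT. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface AutoGenerateDescriptor {}

    /** Hypothetical factory class following the proposed convention. */
    public static class ExampleFactory {
        @AutoGenerateDescriptor
        public static String getStandardPipeline() {
            return "standard"; // stand-in for a real descriptor object
        }

        // Not annotated, so the scanner skips it.
        public static String helper() {
            return "helper";
        }

        // Annotated but takes a parameter, so the scanner skips it.
        @AutoGenerateDescriptor
        public static String withParam(String name) {
            return name;
        }
    }

    /**
     * Invokes every public static no-arg method on the given class that
     * carries the marker annotation and returns the results keyed by
     * method name (sorted, so the output order is deterministic).
     */
    public static Map<String, Object> collect(Class<?> factory)
            throws ReflectiveOperationException {
        Map<String, Object> descriptors = new TreeMap<>();
        for (Method m : factory.getDeclaredMethods()) {
            int mods = m.getModifiers();
            boolean eligible = m.isAnnotationPresent(AutoGenerateDescriptor.class)
                    && Modifier.isStatic(mods)
                    && Modifier.isPublic(mods)
                    && m.getParameterCount() == 0
                    && m.getReturnType() != void.class;
            if (eligible) {
                descriptors.put(m.getName(), m.invoke(null));
            }
        }
        return descriptors;
    }

    public static void main(String[] args) throws Exception {
        // prints {getStandardPipeline=standard}
        System.out.println(collect(ExampleFactory.class));
    }
}
```

In an actual plugin the methods would presumably return an AnalysisEngineDescription, and the plugin would serialize each result to an XML file (e.g. one file per method name) instead of collecting them in a map.
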

Cheers,

-- Richard

On 16.04.2014, at 05:21, Steven Bethard <steven.beth...@gmail.com> wrote:

> +1. And note that once you have a descriptor, you can generate the
> XML, so we should arrange to replace the current XML descriptors with
> ones generated automatically from the uimaFIT code. That should reduce
> some synchronization problems when the Java code was changed but the
> XML descriptor was not.
>
> Steve
>
> On Tue, Apr 15, 2014 at 8:52 AM, Miller, Timothy
> <timothy.mil...@childrens.harvard.edu> wrote:
>> The discussion in the other thread with Abraham Tom gave me an idea I
>> wanted to float to the list. We have been using some UIMAFit pipeline
>> builders in the temporal project that maybe could be moved into
>> clinical-pipeline. For example, look to this file:
>>
>> http://svn.apache.org/viewvc/ctakes/trunk/ctakes-temporal/src/main/java/org/apache/ctakes/temporal/pipelines/TemporalExtractionPipeline_ImplBase.java?view=markup
>>
>> with the static methods getPreprocessorAggregateBuilder() and
>> getLightweightPreprocessorAggregateBuilder() [no umls].
>>
>> So my idea would be to create a class in clinical-pipeline
>> (CTakesPipelines) with static methods for some standard pipelines (to
>> return AnalysisEngineDescriptions instead of AggregateBuilders?):
>>
>> getStandardUMLSPipeline() -- builds pipeline currently in
>> AggregatePlaintextUMLSProcessor.xml
>> getFullPipeline() -- same as above but with SRL, constituency parsing,
>> etc., every component in ctakes
>>
>> We could then potentially merge our entry points -- I think Abraham's
>> experience points out that this is currently confusing, as well as
>> probably not implemented optimally. For example, either
>> ClinicalPipelineWithUmls or BagOfCUIsGenerator would use that static
>> method to run a uimafit-style pipeline. Maybe we can slowly deprecate
>> our xml descriptors too unless people feel strongly about keeping those
>> around.
>>
>> Another benefit is that the cTAKES API is then trivial -- if you import
>> ctakes into your pom file getting a UIMA pipeline is one UimaFit call:
>>
>> builder.add(CTAKESPipelines.getStandardUMLSPipeline());
>>
>>
>> I think this would actually be pretty easy to implement, but hoping to
>> get some feedback on whether this is a good direction.
>>
>> Tim