Re: General question about UimaFIT
Hi.
Thanks for your answers. So it seems that programming-language independence is the primary reason for doing things the way they have been done.

Many thanks,
Asher.

2016-09-09 18:49 GMT+03:00 Richard Eckart de Castilho:
> On 09.09.2016, at 17:37, Asher Stern wrote:
> >
> > Your explanation really makes things clear, and also answers my question.
> >
> > I still wonder whether some automatic mechanism can be developed to
> > automatically generate TypeDescription and TypeSystemDescription directly
> > from a Java class (under some conditions).
> > This can shorten the learning curve of UIMA and remove the need for
> > automatically-generated code, as well as tracking XML files in the
> > classpath. (Such benefits are actually part of the primary goals of
> > UimaFIT, aren't they?)
> > Though, such a development, even if possible, would not be trivial.
>
> The way that UIMA works, the JCas files are not meant to be a canonical
> source of metadata information. The typesystem is an independent schema
> and the JCas classes are a convenience. They can be used, but they do
> not have to be used. In some cases, it is more reasonable to use
> the CAS API instead of the JCas API and to operate entirely without
> the JCas classes. E.g. an annotation editor where you can flexibly
> define types (such as WebAnno) would rely on the CAS API where
> annotations are accessed by name instead of on the JCas API where
> compiled Java classes are required.
>
> So JCas is optional, but type descriptors are not.
>
> With components, it is different. The descriptor is meaningless without
> the implementing component.
>
> If somebody thought it would be worth the effort to generate type
> descriptors from annotated Java classes, it wouldn't be just annotating
> fields or methods. Some code generation would probably be involved, maybe
> comparable to how Lombok works. That approach has its own drawbacks,
> ranging from requiring a compiler plugin to e.g. Eclipse being unable to
> search for references to auto-generated methods.
>
> I'll not spend my time atm to follow that idea, but maybe you want to try
> it? ;)
>
> Btw. there is ongoing work on a reimplementation of the CAS (and JCas)
> towards a UIMA v3 in the future. If you consider diving into this, you may
> want to read a bit in the recent archives of the developer mailing list first.
>
> Cheers,
>
> -- Richard
Re: CPE processors and analysis engines
At least put the AEs and the consumers into separate CAS processors. Otherwise you run into either of the two situations:

1) AEs are not parallelized - no performance gain
2) consumers are parallelized - depending on their implementation, you might be getting mangled data, missing data, etc.

Cheers,

-- Richard

> On 06.09.2016, at 10:56, armin.weg...@bka.bund.de wrote:
>
> Hi!
>
> What's the best practice to combine analysis engines into CAS processors?
> Should every analysis engine become its own CAS processor? Should analysis
> engines be combined to aggregates which become CAS processors? What are the
> conditions for doing so: technical, semantical, logical?
>
> Best,
> Armin
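The two situations above can be illustrated with plain Java concurrency. This is only an analogy, not CPE code: `analyze`, `run` and the class name are hypothetical stand-ins, and the thread pool merely plays the role of a parallelized CAS processor. The analysis step is stateless per document and is fanned out over several threads, while the consumer step touches its non-thread-safe sink from a single thread only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class StagedPipelineSketch {

    // Stand-in for an analysis engine: no shared state, so it can run in parallel.
    static String analyze(String doc) {
        return doc.toUpperCase();
    }

    // Analysis runs on a thread pool; consumption happens on the calling thread only.
    static List<String> run(List<String> docs, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (final String doc : docs) {
                Callable<String> task = () -> analyze(doc);
                pending.add(pool.submit(task));
            }
            // Stand-in for a consumer: the sink is not thread-safe,
            // so it is written from exactly one thread, in submission order.
            List<String> sink = new ArrayList<>();
            for (Future<String> result : pending) {
                sink.add(result.get());
            }
            return sink;
        } finally {
            pool.shutdown();
        }
    }
}
```

If the sink were instead written from inside the pooled tasks, the order and even the completeness of its contents would depend on thread timing, which is the "mangled data, missing data" case.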
Re: UIMA RUTA script with stringfunctions doesnt work when executing from UIMA-Fit
On 08.09.2016, at 19:46, Peter Klügl wrote:
>
> Sorry, that was the wrong extension... it should read:
>
> = AnalysisEngineFactory.createEngine(RutaEngine.class,
>     RutaEngine.PARAM_ADDITIONAL_EXTENSIONS,
>     new String[]{BooleanOperationsExtension.class.getName()});
>
> btw it's better to create a description first, and then the analysis engine.

Unless you want to reuse the engine, it's actually a good idea not to instantiate the engine at all, but to leave that to e.g. SimplePipeline.runPipeline(), to the CPE or some other pre-built execution code which also ensures that all the lifecycle events are properly invoked on the engine.

Cheers,

-- Richard
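A minimal sketch of the "description first, let the pipeline instantiate" approach, assuming uimaFIT 2.x and uimaj-core are on the classpath. The `LanguageSetter` annotator is a hypothetical stand-in for RutaEngine (so that no Ruta dependency is needed), and the class and method names are made up for illustration.

```java
import org.apache.uima.analysis_engine.AnalysisEngineDescription;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.factory.AnalysisEngineFactory;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;

public class DescriptionFirst {

    // Hypothetical annotator standing in for RutaEngine.
    public static class LanguageSetter extends JCasAnnotator_ImplBase {
        @Override
        public void process(JCas jcas) {
            jcas.setDocumentLanguage("en");
        }
    }

    static String run(String text) throws Exception {
        // Only a description is built here; no engine is instantiated by hand.
        AnalysisEngineDescription desc =
                AnalysisEngineFactory.createEngineDescription(LanguageSetter.class);

        JCas jcas = JCasFactory.createJCas();
        jcas.setDocumentText(text);

        // runPipeline creates the engine from the description, processes the CAS
        // and takes care of the engine lifecycle calls.
        SimplePipeline.runPipeline(jcas, desc);
        return jcas.getDocumentLanguage();
    }
}
```

For the Ruta case, the description would presumably be created the same way with `RutaEngine.class` and the parameters from the quoted snippet, but that variant is not shown or tested here.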
Re: General question about UimaFIT
On 09.09.2016, at 17:37, Asher Stern wrote:
>
> Your explanation really makes things clear, and also answers my question.
>
> I still wonder whether some automatic mechanism can be developed to
> automatically generate TypeDescription and TypeSystemDescription directly
> from a Java class (under some conditions).
> This can shorten the learning curve of UIMA and remove the need for
> automatically-generated code, as well as tracking XML files in the
> classpath. (Such benefits are actually part of the primary goals of
> UimaFIT, aren't they?)
> Though, such a development, even if possible, would not be trivial.

The way that UIMA works, the JCas files are not meant to be a canonical source of metadata information. The typesystem is an independent schema and the JCas classes are a convenience. They can be used, but they do not have to be used. In some cases, it is more reasonable to use the CAS API instead of the JCas API and to operate entirely without the JCas classes. E.g. an annotation editor where you can flexibly define types (such as WebAnno) would rely on the CAS API where annotations are accessed by name instead of on the JCas API where compiled Java classes are required.

So JCas is optional, but type descriptors are not.

With components, it is different. The descriptor is meaningless without the implementing component.

If somebody thought it would be worth the effort to generate type descriptors from annotated Java classes, it wouldn't be just annotating fields or methods. Some code generation would probably be involved, maybe comparable to how Lombok works. That approach has its own drawbacks, ranging from requiring a compiler plugin to e.g. Eclipse being unable to search for references to auto-generated methods.

I'll not spend my time atm to follow that idea, but maybe you want to try it? ;)

Btw. there is ongoing work on a reimplementation of the CAS (and JCas) towards a UIMA v3 in the future. If you consider diving into this, you may want to read a bit in the recent archives of the developer mailing list first.

Cheers,

-- Richard
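To make the "accessed by name" point concrete, here is a rough sketch of working with the CAS API only, without any JCas classes. It assumes uimaj-core 2.x on the classpath and reuses the "Token" type and "length" feature from the snippet elsewhere in this thread; the class and method names are made up for illustration.

```java
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.Feature;
import org.apache.uima.cas.Type;
import org.apache.uima.cas.text.AnnotationFS;
import org.apache.uima.cas.text.AnnotationIndex;
import org.apache.uima.resource.metadata.TypeDescription;
import org.apache.uima.resource.metadata.TypeSystemDescription;
import org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl;
import org.apache.uima.util.CasCreationUtils;

public class CasApiByName {

    static int firstTokenLength() throws Exception {
        // Define the type system programmatically; no XML and no generated classes.
        TypeSystemDescription tsd = new TypeSystemDescription_impl();
        TypeDescription tokenDesc = tsd.addType("Token", "", CAS.TYPE_NAME_ANNOTATION);
        tokenDesc.addFeature("length", "", CAS.TYPE_NAME_INTEGER);

        CAS cas = CasCreationUtils.createCas(tsd, null, null);
        cas.setDocumentText("This is a test.");

        // Look up the type and its feature by name at runtime.
        Type tokenType = cas.getTypeSystem().getType("Token");
        Feature lengthFeature = tokenType.getFeatureByBaseName("length");

        // Annotate "This" (offsets 0..4) and store its length in the feature.
        AnnotationFS token = cas.createAnnotation(tokenType, 0, 4);
        token.setIntValue(lengthFeature, token.getCoveredText().length());
        cas.addFsToIndexes(token);

        // Read the annotation back through the index, again without JCas classes.
        AnnotationIndex<AnnotationFS> index = cas.getAnnotationIndex(tokenType);
        for (AnnotationFS annotation : index) {
            return annotation.getIntValue(lengthFeature);
        }
        return -1;
    }
}
```

The trade-off is the one described above: names are resolved at runtime, so a misspelled type or feature name only shows up when the code runs, whereas the JCas classes would catch it at compile time.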
Re: General question about UimaFIT
And I guess you don't get JCas classes for your type system without going through JCasGen, which is another disadvantage of generating the types on the fly. It also kind of goes against the idea that the type system should be something you can rely on for communication between components, so it would tend to be static.

Just out of curiosity, what's the use case for this (except maybe unit testing as Armin mentioned)?

Best,
Jens

On Fri, Sep 9, 2016 at 4:31 PM, Richard Eckart de Castilho wrote:
> On 09.09.2016, at 13:39, Asher Stern wrote:
> >
> > Hi Armin.
> > Thanks for your quick answer!
> >
> > While the workaround is indeed helpful, I am still curious why there is no
> > regular mechanism to define new types and create new descriptors
> > programmatically, much like all other UIMA components?
>
> Sure you can define types programmatically... it's just that for the
> case of types, defining them through XML is actually more convenient.
> Mind that the type-system is implementation independent! You can think
> of it as a DTD or XSD.
>
> If you want to programmatically create a type, you can do this:
>
>     TypeSystemDescription tsd = new TypeSystemDescription_impl();
>     TypeDescription tokenTypeDesc = tsd.addType("Token", "", CAS.TYPE_NAME_ANNOTATION);
>     tokenTypeDesc.addFeature("length", "", CAS.TYPE_NAME_INTEGER);
>
>     CAS cas = CasCreationUtils.createCas(tsd, null, null);
>     cas.setDocumentText("This is a test.");
>
> Check out [1] slides 20 following.
>
> Cheers,
>
> -- Richard
>
> [1] https://github.com/dkpro/dkpro-tutorials/blob/master/GSCL2013/tags/latest/slides/GSCL2013UIMATutorialUKP.pdf
Re: initiating CpeComponentDescriptor from String or InputStream
Thanks, Richard!

That's also the solution I can think of: getting the descriptor contents from the database and saving them as temp files, then using those temp files to initiate the CAS processors.

Best,
Xiaobin

On Fri, Sep 9, 2016 at 5:28 PM, Richard Eckart de Castilho wrote:
> Afaik there is no such thing. That is why the uimaFIT CpeBuilder
> stores programmatically created engine descriptors in temporary
> files.
>
> Cheers,
>
> -- Richard
>
> > On 09.09.2016, at 17:15, Chen Xiaobin wrote:
> >
> > Hi,
> > I am wondering if there is a way to initiate a CpeComponentDescriptor from
> > an InputStream or a String, instead of from a physical descriptor file.
> > I am using the following code originally:
> >
> >     CpeCasProcessor casProcessor = CpeDescriptorFactory.produceCasProcessor(ae.getName());
> >     CpeComponentDescriptor componentDescriptor = CpeDescriptorFactory.produceComponentDescriptor("path/to/aeDescriptor.xml");
> >     casProcessor.setCpeComponentDescriptor(componentDescriptor);
> >     cpeDescription.addCasProcessor(casProcessor);
> >
> > But now in my application, all AE descriptors are stored in a database as
> > Strings. I need to construct a CPE and add some AEs to the CPE.
> >
> > Is there a way to substitute the second line of the above code to something
> > like:
> >     CpeComponentDescriptor componentDescriptor =
> >         CpeDescriptorFactory.produceComponentDescriptor(**descriptorFromAnInputStream or String**);
> > The UIMA API does not provide such a method in the CpeDescriptorFactory.
> >
> > Thank you!
> >
> > Xiaobin
> > --
> > Eberhard Karls Universität Tübingen
> > LEAD Graduate School
> > Doctoral Candidate
> > Gartenstraße 29A · 72074 Tübingen · Germany
> > Phone +49 1765 7634 683

--
Xiaobin Chen
LEAD Graduate School & Research Network
Gartenstr. 29A, 72076 Tübingen
Re: General question about UimaFIT
Hi Richard.
Many thanks!
Your explanation really makes things clear, and also answers my question.

I still wonder whether some automatic mechanism can be developed to automatically generate TypeDescription and TypeSystemDescription directly from a Java class (under some conditions).
This can shorten the learning curve of UIMA and remove the need for automatically-generated code, as well as tracking XML files in the classpath. (Such benefits are actually part of the primary goals of UimaFIT, aren't they?)
Though, such a development, even if possible, would not be trivial.

2016-09-09 17:31 GMT+03:00 Richard Eckart de Castilho:
> On 09.09.2016, at 13:39, Asher Stern wrote:
> >
> > Hi Armin.
> > Thanks for your quick answer!
> >
> > While the workaround is indeed helpful, I am still curious why there is no
> > regular mechanism to define new types and create new descriptors
> > programmatically, much like all other UIMA components?
>
> Sure you can define types programmatically... it's just that for the
> case of types, defining them through XML is actually more convenient.
> Mind that the type-system is implementation independent! You can think
> of it as a DTD or XSD.
>
> If you want to programmatically create a type, you can do this:
>
>     TypeSystemDescription tsd = new TypeSystemDescription_impl();
>     TypeDescription tokenTypeDesc = tsd.addType("Token", "", CAS.TYPE_NAME_ANNOTATION);
>     tokenTypeDesc.addFeature("length", "", CAS.TYPE_NAME_INTEGER);
>
>     CAS cas = CasCreationUtils.createCas(tsd, null, null);
>     cas.setDocumentText("This is a test.");
>
> Check out [1] slides 20 following.
>
> Cheers,
>
> -- Richard
>
> [1] https://github.com/dkpro/dkpro-tutorials/blob/master/GSCL2013/tags/latest/slides/GSCL2013UIMATutorialUKP.pdf
Re: initiating CpeComponentDescriptor from String or InputStream
Can you just write out the component descriptor to a file and pass that to the factory? I think you need a path since the underlying code needs a UIMA-style include, which supports import by name or location. Perhaps the CPE could do this for you with a new API like the one you are suggesting, but I think the quickest path for you is to create a file from the string.

    public static CpeComponentDescriptor produceComponentDescriptor(String aPath) {
        CpeComponentDescriptor componentDescriptor = new CpeComponentDescriptorImpl();
        CpeInclude include = new CpeIncludeImpl();
        include.set(aPath);
        componentDescriptor.setInclude(include);
        return componentDescriptor;
    }

-jerry

On Fri, Sep 9, 2016 at 11:15 AM, Chen Xiaobin wrote:
> Hi,
> I am wondering if there is a way to initiate a CpeComponentDescriptor from
> an InputStream or a String, instead of from a physical descriptor file.
> I am using the following code originally:
>
>     CpeCasProcessor casProcessor = CpeDescriptorFactory.produceCasProcessor(ae.getName());
>     CpeComponentDescriptor componentDescriptor = CpeDescriptorFactory.produceComponentDescriptor("path/to/aeDescriptor.xml");
>     casProcessor.setCpeComponentDescriptor(componentDescriptor);
>     cpeDescription.addCasProcessor(casProcessor);
>
> But now in my application, all AE descriptors are stored in a database as
> Strings. I need to construct a CPE and add some AEs to the CPE.
>
> Is there a way to substitute the second line of the above code to something
> like:
>     CpeComponentDescriptor componentDescriptor =
>         CpeDescriptorFactory.produceComponentDescriptor(**descriptorFromAnInputStream or String**);
> The UIMA API does not provide such a method in the CpeDescriptorFactory.
>
> Thank you!
>
> Xiaobin
> --
> Eberhard Karls Universität Tübingen
> LEAD Graduate School
> Doctoral Candidate
> Gartenstraße 29A · 72074 Tübingen · Germany
> Phone +49 1765 7634 683
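The "create a file from string" step could look roughly like the following. It is a sketch using only the Java standard library; the class name, method name and the "aeDescriptor" file prefix are made up for illustration, and UTF-8 is assumed as the descriptor encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DescriptorTempFiles {

    // Writes descriptor XML (e.g. loaded from a database) to a temp file
    // and returns the path of that file.
    static Path writeToTempFile(String descriptorXml) throws IOException {
        Path file = Files.createTempFile("aeDescriptor", ".xml");
        Files.write(file, descriptorXml.getBytes(StandardCharsets.UTF_8));
        // Ask the JVM to remove the file on normal exit.
        file.toFile().deleteOnExit();
        return file;
    }
}
```

The returned path would then go where the original code passes "path/to/aeDescriptor.xml", e.g. `CpeDescriptorFactory.produceComponentDescriptor(path.toString())`. Whether a plain file path or a file: URI is the right form for the include is not settled in this thread, so that part is worth checking against the actual UIMA version in use.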
Re: initiating CpeComponentDescriptor from String or InputStream
Afaik there is no such thing. That is why the uimaFIT CpeBuilder stores programmatically created engine descriptors in temporary files.

Cheers,

-- Richard

> On 09.09.2016, at 17:15, Chen Xiaobin wrote:
>
> Hi,
> I am wondering if there is a way to initiate a CpeComponentDescriptor from
> an InputStream or a String, instead of from a physical descriptor file.
> I am using the following code originally:
>
>     CpeCasProcessor casProcessor = CpeDescriptorFactory.produceCasProcessor(ae.getName());
>     CpeComponentDescriptor componentDescriptor = CpeDescriptorFactory.produceComponentDescriptor("path/to/aeDescriptor.xml");
>     casProcessor.setCpeComponentDescriptor(componentDescriptor);
>     cpeDescription.addCasProcessor(casProcessor);
>
> But now in my application, all AE descriptors are stored in a database as
> Strings. I need to construct a CPE and add some AEs to the CPE.
>
> Is there a way to substitute the second line of the above code to something
> like:
>     CpeComponentDescriptor componentDescriptor =
>         CpeDescriptorFactory.produceComponentDescriptor(**descriptorFromAnInputStream or String**);
> The UIMA API does not provide such a method in the CpeDescriptorFactory.
>
> Thank you!
>
> Xiaobin
> --
> Eberhard Karls Universität Tübingen
> LEAD Graduate School
> Doctoral Candidate
> Gartenstraße 29A · 72074 Tübingen · Germany
> Phone +49 1765 7634 683
initiating CpeComponentDescriptor from String or InputStream
Hi,
I am wondering if there is a way to initiate a CpeComponentDescriptor from an InputStream or a String, instead of from a physical descriptor file.
I am using the following code originally:

    CpeCasProcessor casProcessor = CpeDescriptorFactory.produceCasProcessor(ae.getName());
    CpeComponentDescriptor componentDescriptor = CpeDescriptorFactory.produceComponentDescriptor("path/to/aeDescriptor.xml");
    casProcessor.setCpeComponentDescriptor(componentDescriptor);
    cpeDescription.addCasProcessor(casProcessor);

But now in my application, all AE descriptors are stored in a database as Strings. I need to construct a CPE and add some AEs to the CPE.

Is there a way to substitute the second line of the above code to something like:

    CpeComponentDescriptor componentDescriptor =
        CpeDescriptorFactory.produceComponentDescriptor(**descriptorFromAnInputStream or String**);

The UIMA API does not provide such a method in the CpeDescriptorFactory.

Thank you!

Xiaobin
--
Eberhard Karls Universität Tübingen
LEAD Graduate School
Doctoral Candidate
Gartenstraße 29A · 72074 Tübingen · Germany
Phone +49 1765 7634 683
Re: General question about UimaFIT
On 09.09.2016, at 13:39, Asher Stern wrote:
>
> Hi Armin.
> Thanks for your quick answer!
>
> While the workaround is indeed helpful, I am still curious why there is no
> regular mechanism to define new types and create new descriptors
> programmatically, much like all other UIMA components?

Sure you can define types programmatically... it's just that for the case of types, defining them through XML is actually more convenient. Mind that the type-system is implementation independent! You can think of it as a DTD or XSD.

If you want to programmatically create a type, you can do this:

    TypeSystemDescription tsd = new TypeSystemDescription_impl();
    TypeDescription tokenTypeDesc = tsd.addType("Token", "", CAS.TYPE_NAME_ANNOTATION);
    tokenTypeDesc.addFeature("length", "", CAS.TYPE_NAME_INTEGER);

    CAS cas = CasCreationUtils.createCas(tsd, null, null);
    cas.setDocumentText("This is a test.");

Check out [1] slides 20 following.

Cheers,

-- Richard

[1] https://github.com/dkpro/dkpro-tutorials/blob/master/GSCL2013/tags/latest/slides/GSCL2013UIMATutorialUKP.pdf
Re: General question about UimaFIT
Hi Armin.
Thanks for your quick answer!

While the workaround is indeed helpful, I am still curious why there is no regular mechanism to define new types and create new descriptors programmatically, much like all other UIMA components?
I mean, what is the difference between the type system and all other UIMA components that forced the UimaFIT engineers to keep the XML-based definitions for types, while getting rid of XML for all the rest of UIMA?

2016-09-09 13:59 GMT+03:00 :
> Hi Asher!
>
> As a workaround, you can use an empty type system,
>
>     TypeSystemDescription tsd = TypeSystemDescriptionFactory.createTypeSystemDescription("EmptyTypeSystem");
>
> add types programmatically,
>
>     tsd.addType(typeName, null, CAS.TYPE_NAME_ANNOTATION);
>
> and get them later with
>
>     Type type = cas.getTypeSystem().getType(typeName);
>
> The empty type system is an XML descriptor file without types residing
> somewhere in the class path. I use this for unit testing when I need a
> fresh type system.
>
> Cheers,
> Armin
>
> -----Original Message-----
> From: Asher Stern [mailto:aste...@gmail.com]
> Sent: Friday, 9 September 2016 12:17
> To: user@uima.apache.org
> Subject: General question about UimaFIT
>
> Hi.
> I have a general question regarding UimaFIT.
> In UimaFIT there is no longer a need to write and deal with XML files, thanks
> to new classes and annotations.
>
> This is the case for almost all UIMA components, like AE, AAE, CPE, etc.
> However, for type-system definition, XML files are still required.
> My question is why?
> Is there a technical issue that makes it impossible to get rid of
> type-system XMLs? Or is it intentional due to some policy?
>
> Thanks in advance,
> Asher
Re: General question about UimaFIT
Hi Asher!

As a workaround, you can use an empty type system,

    TypeSystemDescription tsd = TypeSystemDescriptionFactory.createTypeSystemDescription("EmptyTypeSystem");

add types programmatically,

    tsd.addType(typeName, null, CAS.TYPE_NAME_ANNOTATION);

and get them later with

    Type type = cas.getTypeSystem().getType(typeName);

The empty type system is an XML descriptor file without types residing somewhere in the class path. I use this for unit testing when I need a fresh type system.

Cheers,
Armin

-----Original Message-----
From: Asher Stern [mailto:aste...@gmail.com]
Sent: Friday, 9 September 2016 12:17
To: user@uima.apache.org
Subject: General question about UimaFIT

Hi.
I have a general question regarding UimaFIT.
In UimaFIT there is no longer a need to write and deal with XML files, thanks to new classes and annotations.

This is the case for almost all UIMA components, like AE, AAE, CPE, etc.
However, for type-system definition, XML files are still required.
My question is why?
Is there a technical issue that makes it impossible to get rid of type-system XMLs? Or is it intentional due to some policy?

Thanks in advance,
Asher
General question about UimaFIT
Hi.
I have a general question regarding UimaFIT.
In UimaFIT there is no longer a need to write and deal with XML files, thanks to new classes and annotations.

This is the case for almost all UIMA components, like AE, AAE, CPE, etc.
However, for type-system definition, XML files are still required.
My question is why?
Is there a technical issue that makes it impossible to get rid of type-system XMLs? Or is it intentional due to some policy?

Thanks in advance,
Asher