RE: RE: Textmarker - Qualification of Types
Hi Peter,

That was really helpful.

Thanks again,
Armin

-----Original Message-----
From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de]
Sent: Monday, 6 May 2013 15:39
To: user@uima.apache.org
Subject: Re: RE: Textmarker - Qualification of Types

Hi,

I should have mentioned that you can update your old TextMarker projects to new UIMA Ruta projects with a context action: right-click on an old project and select "UIMA Ruta -> Update Project".

Best,
Peter

On 06.05.2013 10:56, Peter Klügl wrote:

Hi,

a snapshot update site is here:
http://people.apache.org/~pkluegl/temp/2.0.1-SNAPSHOT/eclipse-update-site/

The parent folder also contains ruta-core-2.0.1-SNAPSHOT.jar and the documentation, since the Jenkins build just failed.

Best,
Peter

On 06.05.2013 07:53, armin.weg...@bka.bund.de wrote:

Hi Peter,

That is fine. I'm using the 2.0.0 core jar from Maven Central. Can you give me a snapshot update site, please?

Thank you,
Armin

-----Original Message-----
From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de]
Sent: Friday, 3 May 2013 15:04
To: user@uima.apache.org
Subject: Re: Textmarker - Qualification of Types

Hi,

it's fixed now in the trunk. Which version do you use? Just let me know if you need a snapshot update site or help with the new projects.

Best,
Peter

On 03.05.2013 14:28, Peter Klügl wrote:

Hi,

On 03.05.2013 13:52, armin.weg...@bka.bund.de wrote:

> Hi,
>
> I'm running Textmarker on a CAS XMI file with a lot of annotations from different annotators and different type systems. Some type names are used more than once, but in different namespaces. All types are defined in the type system included with TYPESYSTEM. Prepending the namespace to a type name in a Textmarker script does not work. How do I tell Textmarker which namespace to use?

I just checked it and found a bug. I will fix it ASAP and commit it to the trunk. Please note that the name TextMarker has been replaced by Ruta in the trunk.

Normally, you should be able to reference a type by its complete namespace. You can also define a shortcut with a type variable, since writing out the complete namespace every time is tedious and confusing.

    // some imports...
    TYPE NUM_DKPro = de.tudarmstadt.ukp.dkpro.core.api.syntax.type.dependency.NUM;
    TYPE NUM_Ruta = org.apache.uima.ruta.type.NUM;
    NUM_Ruta PERIOD NUM_Ruta{-> MARK(...)};

Thanks for reporting this.

Best,
Peter

> Thanks,
> Armin
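Editorial sketch: the name clash discussed in this thread can be modelled in a few lines of plain Java. This is only an illustration of why the short name NUM is ambiguous while a fully qualified name or an alias is not. It is not how UIMA Ruta resolves types internally, and the class TypeNameResolver and its methods are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this class is not part of UIMA or UIMA Ruta.
final class TypeNameResolver {

    // short name -> all fully qualified type names that end in it
    private final Map<String, List<String>> byShortName = new LinkedHashMap<>();
    // user-defined alias -> fully qualified type name
    private final Map<String, String> aliases = new LinkedHashMap<>();

    /** Registers a fully qualified type name, as a type system descriptor would. */
    void declare(String qualifiedName) {
        String shortName = qualifiedName.substring(qualifiedName.lastIndexOf('.') + 1);
        byShortName.computeIfAbsent(shortName, k -> new ArrayList<>()).add(qualifiedName);
    }

    /** Registers a shortcut, comparable to the TYPE variables in the script above. */
    void alias(String alias, String qualifiedName) {
        aliases.put(alias, qualifiedName);
    }

    /** Resolves an alias, a qualified name, or an unambiguous short name. */
    String resolve(String name) {
        if (aliases.containsKey(name)) {
            return aliases.get(name);
        }
        if (name.indexOf('.') >= 0) {
            return name; // already qualified, taken as-is
        }
        List<String> candidates = byShortName.get(name);
        if (candidates == null) {
            throw new IllegalArgumentException("Unknown type: " + name);
        }
        if (candidates.size() > 1) {
            throw new IllegalArgumentException("Ambiguous type " + name + ": " + candidates);
        }
        return candidates.get(0);
    }

    public static void main(String[] args) {
        TypeNameResolver resolver = new TypeNameResolver();
        resolver.declare("de.tudarmstadt.ukp.dkpro.core.api.syntax.type.dependency.NUM");
        resolver.declare("org.apache.uima.ruta.type.NUM");
        resolver.alias("NUM_Ruta", "org.apache.uima.ruta.type.NUM");

        System.out.println(resolver.resolve("NUM_Ruta"));
        try {
            resolver.resolve("NUM");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the alias lookup runs first, so a script can use the short alias everywhere and only spell out the namespace once.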
Re: Ruta 2.0.1-SNAPSHOT - CVS directory exception
Hi,

thanks for reporting this. I will fix it ASAP.

Peter

On 07.05.2013 09:32, armin.weg...@bka.bund.de wrote:

Hello Peter,

In 2.0.1-SNAPSHOT, the RutaLauncher complains about the CVS directory in the input folder:

    Exception in thread "main" java.io.FileNotFoundException: projectLocation/input/CVS (Is a directory)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at org.apache.uima.util.FileUtils.file2String(FileUtils.java:155)
        at org.apache.uima.ruta.ide.launching.RutaLauncher.processFile(RutaLauncher.java:149)
        at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:119)

Cheers,
Armin
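Editorial sketch: the stack trace in this thread shows a directory (CVS) being opened as if it were a document. One common way to avoid this kind of failure is to filter the input folder down to regular files before reading anything. The code below is a minimal illustration using only the JDK. It is not the actual fix that went into RutaLauncher, and the class InputFolderScan with its methods listInputFiles and demo are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of UIMA Ruta.
final class InputFolderScan {

    /** Returns the regular files directly inside inputFolder, sorted by path; directories are skipped. */
    static List<File> listInputFiles(File inputFolder) {
        List<File> files = new ArrayList<>();
        File[] entries = inputFolder.listFiles();
        if (entries == null) {
            return files; // not a directory, or it could not be read
        }
        Arrays.sort(entries);
        for (File entry : entries) {
            if (entry.isFile()) { // false for CVS, .svn and any other directory
                files.add(entry);
            }
        }
        return files;
    }

    /**
     * Builds a throwaway input folder with two documents and a CVS directory,
     * then returns the names that survive the filter. The temporary folder is
     * left behind for brevity.
     */
    static List<String> demo() {
        try {
            File input = Files.createTempDirectory("input").toFile();
            Files.write(new File(input, "a.txt").toPath(), "first".getBytes(StandardCharsets.UTF_8));
            Files.write(new File(input, "b.txt").toPath(), "second".getBytes(StandardCharsets.UTF_8));
            Files.createDirectory(new File(input, "CVS").toPath());

            List<String> names = new ArrayList<>();
            for (File file : listInputFiles(input)) {
                names.add(file.getName());
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Until a fixed snapshot is available, moving or excluding version-control directories from the input folder would sidestep the exception in the same way.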
PhD Research Studentship, Natural Language Processing, University of Sheffield, UK
Applications are invited for a fully funded PhD studentship on computing the veracity of social media content. Application closing date is 31 May 2013.

The aim of this studentship is to design natural language processing methods to compute the veracity of social media content and deal with the specifics of medical language. The goal is to model, identify, and verify healthcare-related misinformation and disinformation, as they spread across online media (e.g. patient forums) and social networks.

The studentship is hosted at the Department of Computer Science at the University of Sheffield, UK.

For more information, see:
http://www.jobs.ac.uk/job/AGL052/phd-research-studentship/
RE: Extending TextMarker with new actions
Hi Peter,

What you proposed would work fine for what I was trying to do!

Cheers,
Will

-----Original Message-----
From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de]
Sent: Tuesday, May 07, 2013 3:42 AM
To: user@uima.apache.org
Subject: Re: Extending TextMarker with new actions

Hi,

On 06.05.2013 18:26, William Karl Thompson wrote:

> Hi Peter,
>
> I like the simplified regular expression rule syntax -- very handy. It's almost exactly what I wanted. However, one thing I'm wondering is how to create an annotation with features using such rules. I have in mind something like the following:
>
>     (regex string) -> 1 = CREATE(FooType, feat = bar);
>
> Here's a possible variant of the above that I can imagine would be useful too:
>
>     (regex) (string) -> CREATE(FooType, feat1 = GROUP(1), feat2=GROUP(2));
>
> What are your thoughts on this?

I think I won't be able to use the existing code of the CREATE action for this, and it will also be problematic in the grammar without creating a new context. What about something like:

    (regexp) (string) -> Type1, 1 = Type2 (feat = 2);

This will of course not work with numeric feature values, but there isn't an auto-cast anyway...

Best,
Peter

> Cheers,
> Will

-----Original Message-----
From: William Karl Thompson
Sent: Thursday, May 02, 2013 1:49 PM
To: user@uima.apache.org
Subject: RE: Extending TextMarker with new actions

Thank you very much, I will try it.

-----Original Message-----
From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de]
Sent: Thursday, May 02, 2013 12:42 PM
To: user@uima.apache.org
Subject: Re: Extending TextMarker with new actions

On 02.05.2013 19:16, William Karl Thompson wrote:

> I see you're way ahead of me! I'll take a look at this -- is it in the latest on trunk?
Yes, and there is also a unit test (if you are interested in some ready-to-work examples): org.apache.uima.ruta.RegExpRuleTest.java (.ruta, .txt)

Peter

-----Original Message-----
From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de]
Sent: Thursday, May 02, 2013 12:14 PM
To: user@uima.apache.org
Subject: Re: Extending TextMarker with new actions

Hi,

oh, I am afraid I recently added something like that for the 2.0.1 release; it is not yet included in the 2.0.0 release. This does not mean that I would not include the action in UIMA Ruta ;-)

Here is the excerpt from the documentation:

<section id="ugr.tools.ruta.language.regexprule">
  <title>Simple Rules based on Regular Expressions</title>
  <para>
    The Ruta language includes, in addition to the normal rules, a simplified rule syntax for processing regular expressions. These simple rules consist of two parts separated by <quote>-&gt;</quote>: The left part is the regular expression (flags: DOTALL and MULTILINE), which may contain capturing groups. The right part defines which kind of annotations should be created for each match of the regular expression. If a type is given without a group index, then an annotation of that type is created for the complete regular expression match, which corresponds to group 0. These simple rules can be restricted to match only within certain annotations using the BLOCK construct, and ignore all filtering settings.
  </para>
  <programlisting><![CDATA[
RegExpRule      -> StringExpression "->" GroupAssignment ("," GroupAssignment)* ";"
GroupAssignment -> TypeExpression | NumberExpression "=" TypeExpression
]]></programlisting>
  <para>
    The following example contains a simple rule, which is able to create annotations of two different types. It creates an annotation of the type <quote>T1</quote> for each match of the complete regular expression and an annotation of the type <quote>T2</quote> for each match of the first capturing group.
  </para>
  <programlisting><![CDATA["A(.*?)C" -> T1, 1 = T2;]]></programlisting>
</section>

On 02.05.2013 19:06, William Karl Thompson wrote:

I forgot to mention, the numeric argument in the proposed MARKREGEXP action indicates which capturing group of the regular expression is used to generate the region for the annotation of the specified type.

-----Original Message-----
From: William Karl Thompson
Sent: Thursday, May 02, 2013 12:02 PM
To: user@uima.apache.org
Subject: RE: Extending TextMarker with new actions

Peter,

Thanks for helping me get going on this; it now works like a charm! I have been able to generate extensions and have them recognized by the Eclipse IDE as per your instructions. Very nice!

In the process of doing this, I had an idea for a possibly useful action to add to the current set. The basic idea is to implement functionality similar to that found in the RegularExpressionAnnotator, one of the UIMA add-ons:
http://uima.apache.org/sandbox.html#regex.annotator

This allows you to define a set of regular expression matches, and to mark an annotation on the