RE: RE: Textmarker - Qualification of Types

2013-05-07 Thread Armin.Wegner
Hi Peter,

That was really helpful,

Thanks again,
Armin 

-Original Message-
From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de]
Sent: Monday, 6 May 2013 15:39
To: user@uima.apache.org
Subject: Re: RE: Textmarker - Qualification of Types

Hi,

I should have mentioned that you can update your old TextMarker projects to 
new UIMA Ruta projects with a context action: right-click on an old project 
and select UIMA Ruta -> Update Project.

Best,

Peter

On 06.05.2013 10:56, Peter Klügl wrote:
 Hi,

 a snapshot update site is here:

 http://people.apache.org/~pkluegl/temp/2.0.1-SNAPSHOT/eclipse-update-site/

 The parent folder also contains ruta-core-2.0.1-SNAPSHOT.jar and the 
 documentation, since the Jenkins build just failed.

 Best,

 Peter

 On 06.05.2013 07:53, armin.weg...@bka.bund.de wrote:
 Hi Peter,

 That is fine. I'm using the 2.0.0 core jar from Maven Central. Can you give 
 me a snapshot update site, please?

 Thank you,
 Armin

 -Original Message-
 From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de]
 Sent: Friday, 3 May 2013 15:04
 To: user@uima.apache.org
 Subject: Re: Textmarker - Qualification of Types

 Hi,

 it's fixed now in the trunk.

 Which version do you use? Just let me know if you need a snapshot update 
 site or help with the new projects.

 Best,

 Peter


 On 03.05.2013 14:28, Peter Klügl wrote:
 Hi,

 On 03.05.2013 13:52, armin.weg...@bka.bund.de wrote:
 Hi,

 I'm running Textmarker on a CAS XMI file with a lot of annotations from 
 different annotators and different type systems. Some type names are 
 used more than once, but in different namespaces. All types are defined 
 in the type system included with TYPESYSTEM. Prepending the namespace to a 
 type name in a Textmarker script does not work. How do I tell Textmarker 
 which namespace to use?

 I just checked it and found a bug. I will fix it ASAP and commit it 
 to the trunk. Please note that the name textmarker was replaced by 
 ruta in the trunk.

 Normally, you should be able to reference a type by its complete 
 namespace, or define a shortcut with a type variable, since writing out 
 the complete namespace everywhere is tedious and confusing.

 // some imports...

 TYPE NUM_DKPro = de.tudarmstadt.ukp.dkpro.core.api.syntax.type.dependency.NUM;
 TYPE NUM_Ruta = org.apache.uima.ruta.type.NUM;

 NUM_Ruta PERIOD NUM_Ruta{-> MARK(...)};
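
The ambiguity discussed above can be illustrated with a minimal Java sketch. It is not the Ruta or UIMA implementation: the class TypeNameResolver and its methods are hypothetical and only show why a short name such as NUM cannot be resolved once two type systems define it, while a fully qualified name (or a variable bound to one) can.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of UIMA or Ruta: it only illustrates why a
// short type name becomes ambiguous when two type systems define it.
final class TypeNameResolver {
    // short name -> all registered fully qualified names ending with it
    private final Map<String, List<String>> byShortName = new LinkedHashMap<>();

    void register(String qualifiedName) {
        String shortName = qualifiedName.substring(qualifiedName.lastIndexOf('.') + 1);
        byShortName.computeIfAbsent(shortName, k -> new ArrayList<>()).add(qualifiedName);
    }

    // Qualified names pass through if they are registered; short names
    // resolve only when exactly one candidate exists.
    String resolve(String name) {
        String shortName = name.substring(name.lastIndexOf('.') + 1);
        List<String> candidates = byShortName.get(shortName);
        if (candidates == null) {
            throw new IllegalArgumentException("Unknown type: " + name);
        }
        if (name.indexOf('.') >= 0) {
            if (candidates.contains(name)) {
                return name;
            }
            throw new IllegalArgumentException("Unknown type: " + name);
        }
        if (candidates.size() > 1) {
            throw new IllegalArgumentException("Ambiguous type " + name + ": " + candidates);
        }
        return candidates.get(0);
    }

    public static void main(String[] args) {
        TypeNameResolver resolver = new TypeNameResolver();
        resolver.register("de.tudarmstadt.ukp.dkpro.core.api.syntax.type.dependency.NUM");
        resolver.register("org.apache.uima.ruta.type.NUM");
        resolver.register("org.apache.uima.ruta.type.PERIOD");
        System.out.println(resolver.resolve("PERIOD"));
        System.out.println(resolver.resolve("org.apache.uima.ruta.type.NUM"));
        try {
            resolver.resolve("NUM");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the short name PERIOD resolves because only one registered type ends with it, whereas NUM is rejected until the caller spells out the namespace.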

 Thanks for reporting this.

 Best,

 Peter


 Thanks,
 Armin




Re: Ruta 2.0.1-SNAPSHOT - CVS directory exception

2013-05-07 Thread Peter Klügl
Hi,

thanks for reporting this. I will fix it ASAP.

Peter

On 07.05.2013 09:32, armin.weg...@bka.bund.de wrote:
 Hello Peter,

 In 2.0.1-SNAPSHOT the RutaLauncher complains about the CVS directory in the 
 input folder:

 Exception in thread "main" java.io.FileNotFoundException: projectLocation/input/CVS (Is a directory)
   at java.io.FileInputStream.open(Native Method)
   at java.io.FileInputStream.<init>(FileInputStream.java:138)
   at org.apache.uima.util.FileUtils.file2String(FileUtils.java:155)
   at org.apache.uima.ruta.ide.launching.RutaLauncher.processFile(RutaLauncher.java:149)
   at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:119)
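
The stack trace shows a directory being opened as if it were a file. A minimal Java sketch of one way to avoid that is to skip anything in the input folder that is not a regular file. This is not the actual RutaLauncher fix: the class InputFolderReader and its methods are hypothetical, and the temporary folder in demo() merely stands in for the project's input folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not taken from RutaLauncher.
final class InputFolderReader {
    // Returns the regular files directly inside 'folder', sorted by path.
    // Subdirectories such as CVS are skipped instead of being opened as files.
    static List<Path> regularFiles(Path folder) {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(folder)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    files.add(entry);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(files);
        return files;
    }

    // Builds a temporary input folder containing a CVS directory and one
    // document, then reads every regular file in it.
    static List<String> demo() {
        try {
            Path folder = Files.createTempDirectory("input");
            Files.createDirectory(folder.resolve("CVS"));
            Files.write(folder.resolve("doc.txt"), "some text".getBytes(StandardCharsets.UTF_8));
            List<String> read = new ArrayList<>();
            for (Path file : regularFiles(folder)) {
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                read.add(file.getFileName() + ": " + text);
            }
            return read;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```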

 Cheers,
 Armin
  



PhD Research Studentship, Natural Language Processing, University of Sheffield, UK

2013-05-07 Thread Genevieve M Gorrell
Applications are invited for a fully funded PhD studentship on computing
the veracity of social media content.

Application closing date is 31 May 2013.

The aim of this studentship is to design natural language processing
methods to compute the veracity of social media content and deal with the
specifics of medical language. The goal is to model, identify, and verify
healthcare-related misinformation and disinformation, as they spread across
online media (e.g. patient forums) and social networks. The studentship is
hosted at the Department of Computer Science at the University of
Sheffield, UK.

For more information, see:

http://www.jobs.ac.uk/job/AGL052/phd-research-studentship/


RE: Extending TextMarker with new actions

2013-05-07 Thread William Karl Thompson
Hi Peter, 

What you proposed would work fine for what I was trying to do!

Cheers,

Will

-Original Message-
From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de] 
Sent: Tuesday, May 07, 2013 3:42 AM
To: user@uima.apache.org
Subject: Re: Extending TextMarker with new actions

Hi,

On 06.05.2013 18:26, William Karl Thompson wrote:
 Hi Peter,

 I like the simplified regular expression rule syntax -- very handy. It's 
 almost exactly what I wanted.  However, one thing I'm wondering is how to 
 create an annotation with features using such rules. I have in mind something 
 like the following:

 "(regex string)" -> 1 = CREATE(FooType, "feat" = "bar");

 Here's a possible variant of the above that I can imagine would be useful 
 too:

 "(regex) (string)" -> CREATE(FooType, "feat1" = GROUP(1), "feat2" = GROUP(2));

 What are your thoughts on this?

I think I won't be able to use the existing code of the CREATE action for this 
and it will also be problematic in the grammar without creating a new context.

What about something like:

"(regexp) (string)" -> Type1, 1 = Type2 ("feat" = 2);

This will of course not work with numeric feature values, but there isn't an 
auto-cast anyway...
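
To make the proposed semantics concrete, here is a minimal Java sketch using java.util.regex. It is an illustration under assumptions, not Ruta code: the class GroupFeatureSketch is hypothetical, annotations are represented as plain strings, and the reading of the proposal (Type1 on the whole match, Type2 on group 1, feature feat set to the text of group 2) is an interpretation of the rule above. The feature value stays a string, which mirrors the remark about the missing auto-cast.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical emulation of:  "(regexp) (string)" -> Type1, 1 = Type2 ("feat" = 2);
final class GroupFeatureSketch {
    // Returns annotations as "Type[begin,end]" strings; the Type2 annotation
    // additionally carries the covered text of group 2 as a string feature.
    static List<String> apply(String regex, String text) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL | Pattern.MULTILINE).matcher(text);
        List<String> annotations = new ArrayList<>();
        while (m.find()) {
            annotations.add("Type1[" + m.start() + "," + m.end() + "]");
            String featureValue = m.group(2); // always a string, never a number
            annotations.add("Type2[" + m.start(1) + "," + m.end(1) + "]{feat=" + featureValue + "}");
        }
        return annotations;
    }

    public static void main(String[] args) {
        for (String annotation : apply("(\\w+) (\\d+)", "page 42")) {
            System.out.println(annotation);
        }
        // A numeric feature would need an explicit conversion by the caller:
        int asNumber = Integer.parseInt("42");
        System.out.println(asNumber + 1);
    }
}
```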

Best,

Peter
 


 Cheers,

 Will

 -Original Message-
 From: William Karl Thompson
 Sent: Thursday, May 02, 2013 1:49 PM
 To: user@uima.apache.org
 Subject: RE: Extending TextMarker with new actions

 Thank you very much, I will try it.

 -Original Message-
 From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de]
 Sent: Thursday, May 02, 2013 12:42 PM
 To: user@uima.apache.org
 Subject: Re: Extending TextMarker with new actions

 On 02.05.2013 19:16, William Karl Thompson wrote:
 I see you're way ahead of me! I'll take a look at this -- is it in the 
 latest on trunk?

 Yes, and there is also a unit test (if you are interested in some 
 ready-to-work examples): 
 org.apache.uima.ruta.RegExpRuleTest.java (.ruta, .txt)

 Peter

 -Original Message-
 From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de]
 Sent: Thursday, May 02, 2013 12:14 PM
 To: user@uima.apache.org
 Subject: Re: Extending TextMarker with new actions

 Hi,

 oh, I am afraid I recently added something like that for the 2.0.1 
 release, not yet included in the 2.0.0 release. This does not mean 
 that I would not include the action in UIMA Ruta ;-)

 Here is the excerpt from the documentation:

 <section id="ugr.tools.ruta.language.regexprule">
   <title>Simple Rules based on Regular Expressions</title>
   <para>
     The Ruta language includes, in addition to the normal rules, a 
     simplified rule syntax for processing regular expressions.
     These simple rules consist of two parts separated by
     <quote>-></quote>: The left part is the regular expression
     (flags: DOTALL and MULTILINE), which may contain capturing groups. 
     The right part defines which kind of annotations
     should be created for each match of the regular expression. If a 
     type is given without a group index, then an annotation of that type is
     created for the complete regular expression match, which corresponds 
     to group 0. These simple rules can be restricted to match only within
     certain annotations using the BLOCK construct, and ignore all 
     filtering settings.
   </para>

   <programlisting><![CDATA[
 RegExpRule      -> StringExpression "->" GroupAssignment ("," GroupAssignment)* ";"
 GroupAssignment -> TypeExpression | NumberExpression "=" TypeExpression
 ]]></programlisting>

   <para>
     The following example contains a simple rule, which is able to 
     create annotations of two different types. It creates an annotation
     of the type <quote>T1</quote> for each match of the complete regular 
     expression and an annotation
     of the type <quote>T2</quote> for each match of the first capturing 
     group.
   </para>

   <programlisting><![CDATA["A(.*?)C" -> T1, 1 = T2;]]></programlisting>

 </section>
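
The behaviour described in the excerpt can be approximated in plain Java. The sketch below is not Ruta's implementation: the class RegExpRuleSketch is hypothetical and represents annotations as "Type[begin,end]" strings. It only mirrors what the documentation states, namely the DOTALL and MULTILINE flags, one annotation for the complete match (group 0), and one for the assigned capturing group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical emulation of a rule such as:  "A(.*?)C" -> T1, 1 = T2;
final class RegExpRuleSketch {
    // typeForMatch is assigned to the complete match (group 0),
    // typeForGroup to the capturing group with index 'group'.
    static List<String> apply(String regex, String text,
                              String typeForMatch, int group, String typeForGroup) {
        Pattern pattern = Pattern.compile(regex, Pattern.DOTALL | Pattern.MULTILINE);
        Matcher m = pattern.matcher(text);
        List<String> annotations = new ArrayList<>();
        while (m.find()) {
            annotations.add(typeForMatch + "[" + m.start() + "," + m.end() + "]");
            // start(group) is -1 when the group did not take part in the match
            if (m.start(group) >= 0) {
                annotations.add(typeForGroup + "[" + m.start(group) + "," + m.end(group) + "]");
            }
        }
        return annotations;
    }

    public static void main(String[] args) {
        for (String annotation : apply("A(.*?)C", "xABCx ADDC", "T1", 1, "T2")) {
            System.out.println(annotation);
        }
    }
}
```

For the input "xABCx ADDC" the sketch yields T1 over "ABC" and "ADDC" and T2 over "B" and "DD".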




 On 02.05.2013 19:06, William Karl Thompson wrote:
 I forgot to mention, the numeric argument in the proposed MARKREGEXP action 
 indicates which capturing group of the regular expression is used to 
 generate the region for the annotation of the specified type.

 -Original Message-
 From: William Karl Thompson
 Sent: Thursday, May 02, 2013 12:02 PM
 To: user@uima.apache.org
 Subject: RE: Extending TextMarker with new actions

 Peter,

 Thanks for helping me to get going on this, it now works like a charm! Have 
 been able to generate extensions and have them be recognized by the Eclipse 
 IDE as per your instructions. Very nice!

 In the process of doing this, I had an idea for a possibly useful 
 action to be added to the current set. The basic idea is to implement 
 functionality similar to that found in the RegularExpressionAnnotator, 
 one of the UIMA addons:

 http://uima.apache.org/sandbox.html#regex.annotator

 This allows you to define a set of regular expression matches, and to mark 
 an annotation on the