Revision: 17690
http://sourceforge.net/p/gate/code/17690
Author: dgmaynard
Date: 2014-03-18 13:04:42 +0000 (Tue, 18 Mar 2014)
Log Message:
-----------
more rules to deal with numberletter combos which shouldn't be part of names
Modified Paths:
--------------
gate/trunk/plugins/ANNIE/resources/NE/first.jape
gate/trunk/plugins/ANNIE/resources/NE/firstname.jape
gate/trunk/plugins/ANNIE/resources/NE/main.jape
Added Paths:
-----------
gate/trunk/plugins/ANNIE/resources/NE/numberletter.jape
Modified: gate/trunk/plugins/ANNIE/resources/NE/first.jape
===================================================================
--- gate/trunk/plugins/ANNIE/resources/NE/first.jape 2014-03-18 12:24:37 UTC (rev 17689)
+++ gate/trunk/plugins/ANNIE/resources/NE/first.jape 2014-03-18 13:04:42 UTC (rev 17690)
@@ -14,7 +14,7 @@
*/
Phase: First
-Input: Token
+Input: Token NumberLetter
Options: control = appelt
// this has to be run first of all
@@ -56,7 +56,15 @@
-->
:tag.ClosedClass = {rule = "ClosedClass"}
+Rule: NumberLetter
+Priority: 100
+(
+ {NumberLetter}
+):tag
+-->
+{}
+
Rule: Upper
// define what can be a possible proper noun - cater for the fact that POS tag might not be correct
(
Modified: gate/trunk/plugins/ANNIE/resources/NE/firstname.jape
===================================================================
--- gate/trunk/plugins/ANNIE/resources/NE/firstname.jape 2014-03-18 12:24:37 UTC (rev 17689)
+++ gate/trunk/plugins/ANNIE/resources/NE/firstname.jape 2014-03-18 13:04:42 UTC (rev 17690)
@@ -14,7 +14,7 @@
*/
Phase: FirstName
-Input: Token Lookup ClosedClass
+Input: Token Lookup ClosedClass NumberLetter
Options: control = appelt
@@ -126,7 +126,7 @@
Rule: Initials1
(
- ({Token.orth == upperInitial, Token.length =="1", !ClosedClass, !Lookup}
+ ({Token.orth == upperInitial, Token.length =="1", !ClosedClass, !Lookup, !NumberLetter}
({Token.string == "."})?
)+
):tag
@@ -137,8 +137,8 @@
Rule: Initials2
(
- {Token.orth == allCaps, Token.length == "2", !Lookup, !ClosedClass} |
- {Token.orth == allCaps, Token.length == "3", !Lookup, !ClosedClass}
+ {Token.orth == allCaps, Token.length == "2", !Lookup, !ClosedClass, !NumberLetter} |
+ {Token.orth == allCaps, Token.length == "3", !Lookup, !ClosedClass, !NumberLetter}
):tag
-->
:tag.Initials = {rule = "Initials2"}
Modified: gate/trunk/plugins/ANNIE/resources/NE/main.jape
===================================================================
--- gate/trunk/plugins/ANNIE/resources/NE/main.jape 2014-03-18 12:24:37 UTC (rev 17689)
+++ gate/trunk/plugins/ANNIE/resources/NE/main.jape 2014-03-18 13:04:42 UTC (rev 17690)
@@ -14,7 +14,8 @@
*/
MultiPhase: TestTheGrammars
-Phases:
+Phases:
+numberletter
first
firstname
name
Added: gate/trunk/plugins/ANNIE/resources/NE/numberletter.jape
===================================================================
--- gate/trunk/plugins/ANNIE/resources/NE/numberletter.jape (rev 0)
+++ gate/trunk/plugins/ANNIE/resources/NE/numberletter.jape 2014-03-18 13:04:42 UTC (rev 17690)
@@ -0,0 +1,21 @@
+
+Phase: NumberLetter
+Input: Token SpaceToken
+Options: control = appelt
+
+
+
+
+Rule: NumberLetter
+// A word that's adjoining a number with no spaces between should not be considerd as part of any entity
+
+(
+ {Token.kind == number}
+)
+( {Token.kind == word}
+):tag
+-->
+:tag.NumberLetter = {rule = "NumberLetter"}
+
+
+
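Not part of the commit: a rough Python sketch of what the new NumberLetter phase appears to do and how the !NumberLetter constraint then affects the Initials rules. Everything here is an illustrative assumption. The regex tokeniser, the dictionary layout and the function names (tokenize, mark_number_letter, initials_candidates) are hypothetical and are not GATE or JAPE APIs. The Initials check is collapsed into a single "all caps, 1 to 3 letters" test, and the !Lookup and !ClosedClass constraints are left out.

```python
import re

# Rough stand-in for a tokeniser: runs of digits, runs of letters,
# runs of whitespace, or any other single character.
TOKEN_RE = re.compile(r"\d+|[^\W\d_]+|\s+|.", re.DOTALL)


def tokenize(text):
    """Split text into dicts with a string, a kind and a number_letter flag."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        s = match.group()
        if s.isdigit():
            kind = "number"
        elif s.isalpha():
            kind = "word"
        elif s.isspace():
            kind = "space"
        else:
            kind = "punctuation"
        tokens.append({"string": s, "kind": kind, "number_letter": False})
    return tokens


def mark_number_letter(tokens):
    """Flag a word that directly follows a number with nothing in between."""
    for prev, cur in zip(tokens, tokens[1:]):
        if prev["kind"] == "number" and cur["kind"] == "word":
            cur["number_letter"] = True
    return tokens


def initials_candidates(tokens):
    """Simplified Initials check: short all-caps words that are not flagged."""
    return [
        t["string"]
        for t in tokens
        if t["kind"] == "word"
        and t["string"].isupper()
        and 1 <= len(t["string"]) <= 3
        and not t["number_letter"]
    ]


if __name__ == "__main__":
    text = "Flight 3G left with J. Smith and 4X4 cars"
    print(initials_candidates(mark_number_letter(tokenize(text))))
```

In this sketch the "G" in "3G" and the "X" in "4X4" are flagged and so never become initials candidates, while the "J" in "J. Smith" still does. A letter separated from the number by a space, as in "3 G", is not flagged, which mirrors why the JAPE phase lists SpaceToken in its Input.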
_______________________________________________
GATE-cvs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/gate-cvs