Yes, this would help address that multiple permutations example. The new getOriginalText method would return something like "Acute|Disease". Right now I'm thinking of just using a vertical bar as the delimiter, to start with at least, but I think it should be configurable.
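
To make the idea concrete, here is a rough sketch of how the matched words of a disjoint span could be joined with a configurable delimiter. It is not the committed implementation: the class name OriginalTextSketch, the helper originalText, and the representation of matched spans as {begin, end} offset pairs are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the cTAKES API. */
final class OriginalTextSketch {

    /** Assumed default delimiter (the vertical bar mentioned above). */
    static final String DEFAULT_DELIMITER = "|";

    private OriginalTextSketch() {
    }

    /**
     * Joins the pieces of the document text that were actually matched.
     * Each span is assumed to be a {begin, end} pair of character offsets,
     * with end exclusive, as with annotation begin/end offsets.
     */
    static String originalText(String docText, List<int[]> matchedSpans, String delimiter) {
        List<String> parts = new ArrayList<>();
        for (int[] span : matchedSpans) {
            parts.add(docText.substring(span[0], span[1]));
        }
        return String.join(delimiter, parts);
    }

    /** Same as above, using the default delimiter. */
    static String originalText(String docText, List<int[]> matchedSpans) {
        return originalText(docText, matchedSpans, DEFAULT_DELIMITER);
    }
}
```

For "Acute alcoholic liver disease." with matched spans {0, 5} and {22, 29}, this sketch gives "Acute|disease". The casing follows the document text here; whether the real method should return the document's words or the dictionary entry's form is still open.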

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Chen, Pei
Sent: Tuesday, October 01, 2013 9:38 AM
To: [email protected]
Subject: CTAKES-248- include original covered text of NEs which can't be recovered post if NE is from a disjoint span

This sounds pretty cool.
James, will this address the multiple permutations lookup example: "Acute alcoholic liver disease."
There is a cui: C0001314: Acute Disease, but if you getCoveredText(), on the UMLSConcept, you would actually get the same "Acute alcoholic liver disease" instead of "Acute Disease".
So, there is a new field called getOriginalText() that matched the hit?

> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Monday, September 30, 2013 5:49 PM
> To: [email protected]
> Subject: svn commit: r1527792 - /ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
>
> Author: james-masanz
> Date: Mon Sep 30 21:48:01 2013
> New Revision: 1527792
>
> URL: http://svn.apache.org/r1527792
> Log:
> CTAKES-248 - for named entities, since the annotation just has the begin and
> end offset, it is requested to have a way to get the original covered text
> (especially for disjoint spans) so it is possible to know which words in the
> covered text were actually used in the matching to the dictionary entry
>
> Modified:
>     ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
>
> Modified: ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
> URL: http://svn.apache.org/viewvc/ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml?rev=1527792&r1=1527791&r2=1527792&view=diff
> ==============================================================================
> Binary files - no diff available.
