Hi all, I just have a couple of notes to expand upon what Guergana wrote.
Anafora requires a schema for annotation, and it requires text files to be in a certain structure. I just checked in the text files for annotation and the schema that we plan to use in ctakes-examples-res src/main/resources/org/apache/ctakes/examples/annotation/ .

Everybody is welcome to use the schema and notes, or to create annotations using another tool for all to share.

As a disclaimer: Anafora is not associated with cTAKES. My opinion is that the cTAKES dev list should not be overused for Anafora Q&A.

Thanks,
Sean

-----Original Message-----
From: Savova, Guergana [mailto:guergana.sav...@childrens.harvard.edu]
Sent: Tuesday, January 31, 2017 3:42 PM
To: dev@ctakes.apache.org
Subject: gold standard annotations for cTAKES

A while ago our physician colleague John Green created 16 realistic-looking (but fake) clinical notes. Many thanks again, John! These notes are in ctakes-examples/data/notes. We now volunteer to annotate them with gold annotations. The main elements with their attributes are:

Medications, Attributes ::= span associatedCode change_status_model conditional dosage_model duration_model end_date form_model frequency_model generic negation_indicator route_model start_date strength_model subject uncertainty_indicator

Signs/Symptoms, Attributes ::= associated_code body_location conditional course duration end_time generic historyOf negation_indicator relative_temporal_context severity start_time subject uncertainty_indicator

Anatomical Sites, Attributes ::= associatedCode conditional generic negation_indicator subject uncertainty_indicator

Disease/Disorders, Attributes ::= associated_code body_location conditional course duration end_time generic historyOf negation_indicator relative_temporal_context severity start_time subject uncertainty_indicator

Procedures, Attributes ::= associated_code body_location conditional duration end_time generic historyOf method negation_indicator relative_temporal_context start_time subject uncertainty_indicator

We expect to have the gold annotations by the end of March. We are using the Anafora annotation tool (https://github.com/weitechen/anafora) and will release the annotations in XML format.

Regards,
--Guergana

Guergana Savova, PhD, FACMI
Associate Professor
PI Natural Language Processing Lab
Boston Children's Hospital and Harvard Medical School
300 Longwood Avenue
Mailstop: BCH3092
Enders 144.1
Boston, MA 02115
Tel: (617) 919-2972
Fax: (617) 730-0817
guergana.sav...@childrens.harvard.edu
Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv
ctakes.apache.org
thyme.healthnlp.org
cancer.healthnlp.org
share.healthnlp.org
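
For anyone who wants to compare the element types listed in the quoted message, here is a minimal Python sketch that copies the five attribute lists into a dictionary. The attribute names are taken verbatim from the message, including its mix of associatedCode and associated_code. The names SCHEMA and elements_with are hypothetical helpers for illustration only; they are not part of cTAKES or Anafora, and this is not the checked-in schema file.

```python
# Element types and attributes as listed in the quoted message.
# SCHEMA and elements_with are illustrative names, not a cTAKES or Anafora API.
SCHEMA = {
    "Medications": (
        "span associatedCode change_status_model conditional dosage_model "
        "duration_model end_date form_model frequency_model generic "
        "negation_indicator route_model start_date strength_model subject "
        "uncertainty_indicator"
    ).split(),
    "Signs/Symptoms": (
        "associated_code body_location conditional course duration end_time "
        "generic historyOf negation_indicator relative_temporal_context "
        "severity start_time subject uncertainty_indicator"
    ).split(),
    "Anatomical Sites": (
        "associatedCode conditional generic negation_indicator subject "
        "uncertainty_indicator"
    ).split(),
    "Disease/Disorders": (
        "associated_code body_location conditional course duration end_time "
        "generic historyOf negation_indicator relative_temporal_context "
        "severity start_time subject uncertainty_indicator"
    ).split(),
    "Procedures": (
        "associated_code body_location conditional duration end_time generic "
        "historyOf method negation_indicator relative_temporal_context "
        "start_time subject uncertainty_indicator"
    ).split(),
}


def elements_with(attribute):
    """Return the element types whose attribute list contains `attribute`."""
    return [name for name, attrs in SCHEMA.items() if attribute in attrs]


if __name__ == "__main__":
    # Attributes shared by every element type in the list above.
    shared = set.intersection(*(set(attrs) for attrs in SCHEMA.values()))
    print(sorted(shared))
    print(elements_with("severity"))
```

A lookup such as elements_with("method") shows which element types carry a given attribute, which may help when checking annotations against the list.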