for best F1 measure [was RE: cTakes
Annotation Comparison
Sean (or others),
Of the various configuration options described below, which values/choices
would you recommend for best F1 measure for something like the ShARe/CLEF 2013
task?
https://sites.google.com/site/shareclefehealth/
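For reference, here is a minimal sketch of how an exact-match F1 could be computed when comparing cTAKES output against a gold standard. It is not the official ShARe/CLEF scorer and not the comparison tool discussed in this thread. The `Annotation` tuple, the function name, and the CUI values are hypothetical placeholders, and it assumes an annotation counts as a match only when document, span offsets, and CUI are all identical.

```python
from typing import Iterable, NamedTuple, Tuple


class Annotation(NamedTuple):
    """Hypothetical record for one annotation (not a cTAKES type)."""
    doc_id: str
    begin: int
    end: int
    cui: str


def precision_recall_f1(
    gold: Iterable[Annotation], predicted: Iterable[Annotation]
) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over two annotation sets."""
    gold_set = set(gold)
    pred_set = set(predicted)
    tp = len(gold_set & pred_set)   # in both gold and system output
    fp = len(pred_set - gold_set)   # system output not in gold
    fn = len(gold_set - pred_set)   # gold missed by the system
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Placeholder data for illustration only; the CUIs are made up.
    gold = [
        Annotation("doc1", 10, 18, "C0000001"),
        Annotation("doc1", 40, 52, "C0000002"),
        Annotation("doc2", 5, 12, "C0000003"),
    ]
    predicted = [
        Annotation("doc1", 10, 18, "C0000001"),  # true positive
        Annotation("doc2", 5, 12, "C0000003"),   # true positive
        Annotation("doc1", 60, 70, "C0000004"),  # false positive
        Annotation("doc2", 30, 35, "C0000005"),  # false positive
    ]
    p, r, f = precision_recall_f1(gold, predicted)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

A looser variant (overlapping spans, or span-only matching that ignores the CUI) would change only how the set membership is decided; the counts and formulas stay the same.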
I'm
, 2014 10:43 AM
To: dev@ctakes.apache.org; kim.eb...@imatsolutions.com
Subject: RE: cTakes Annotation Comparison
Also check out stats that Sean ran before releasing the new component on:
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-fast/doc/DictionaryLookupStats.docx
From
Thanks for this, Bruce! Very interesting work. It confirms what I've seen
in my small tests that I've done in a non-systematic way. Did you happen to
capture the number of false positives yet (annotations made by cTAKES that
are not in the human adjudicated standard)? I've seen a lot of dictionary
were similar.
Thank you everyone!
--Guergana
-Original Message-
From: David Kincaid [mailto:kincaid.d...@gmail.com]
Sent: Friday, December 19, 2014 9:02 AM
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison
Thanks for this, Bruce! Very interesting work. It confirms what
:02 AM
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison
Thanks for this, Bruce! Very interesting work. It confirms what I've seen in
my small tests that I've done in a non-systematic way. Did you happen to
capture the number of false positives yet (annotations made by cTAKES
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison
Guergana,
I'm curious about the number of records that are in your gold standard sets, or if
your gold standard set was run through a long-running cTAKES process. I know at
some point we fixed a bug in the old dictionary lookup
and the fast one were similar.
Thank you everyone!
--Guergana
-Original Message-
From: David Kincaid [mailto:kincaid.d...@gmail.com]
Sent: Friday, December 19, 2014 9:02 AM
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison
Thanks for this, Bruce
@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison
Our analysis against the human adjudicated gold standard from this SHARE corpus
is using a simple check to see if the cTakes output included the annotation
specified by the gold standard. The initial results I reported were for exact
matches
cuis are added, removed, deprecated, and
moved from one TUI to another.
Sean
-Original Message-
From: Savova, Guergana [mailto:guergana.sav...@childrens.harvard.edu]
Sent: Friday, December 19, 2014 1:28 PM
To: dev@ctakes.apache.org
Subject: RE: cTakes Annotation Comparison
Several
-
From: Savova, Guergana [mailto:guergana.sav...@childrens.harvard.edu]
Sent: Friday, December 19, 2014 1:28 PM
To: dev@ctakes.apache.org
Subject: RE: cTakes Annotation Comparison
Several thoughts:
1. The ShARE corpus annotates only mentions of type Diseases/Disorders and
only Anatomical
I’m bringing it up in case the Human Annotations were done using a different
version.
From: Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com]
Sent: Friday, December 19, 2014 1:40 PM
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison
Sean,
I don't think that would be an issue
*Subject:* Re: cTakes Annotation Comparison
Guergana,
I'm curious about the number of records that are in your gold standard
sets, or if your gold standard set was run through a long-running
cTAKES process. I know at some point we fixed a bug in the old
dictionary lookup that caused
Ebert [mailto:kim.eb...@perfectsearchcorp.com]
*Sent:* Friday, December 19, 2014 10:25 AM
*To:* dev@ctakes.apache.org
*Subject:* Re: cTakes Annotation Comparison
Guergana,
I'm curious about the number of records that are in your gold standard sets,
or if your
:* Friday, December 19, 2014 1:47 PM
*To:* Chen, Pei; dev@ctakes.apache.org
*Subject:* Re: cTakes Annotation Comparison
Pei,
I don't think bugs/issues should be part of determining if one algorithm
vs the other is superior. Obviously, it is worth mentioning the bugs, but
if the fast lookup
that you'd only have two
matches per document (100 docs?).
Thanks,
Sean
-Original Message-
From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com]
Sent: Friday, December 19, 2014 3:23 PM
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison
Sean,
I tried
on this? It is really bizarre that you'd only
have two matches per document (100 docs?).
Thanks,
Sean
-Original Message-
From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com]
Sent: Friday, December 19, 2014 3:23 PM
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison
Sean
horribly
inaccurate.
Thanks
-Original Message-
From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com]
Sent: Friday, December 19, 2014 3:29 PM
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison
Correction -- So far, I did steps 1 and 2 of Sean's email.
[image
:* Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com]
*Sent:* Friday, December 19, 2014 3:37 PM
*To:* dev@ctakes.apache.org
*Subject:* Re: cTakes Annotation Comparison
My original results were using a newly downloaded cTakes 3.2.1 with the
separately downloaded resources copied
, December 19, 2014 3:37 PM
*To:* dev@ctakes.apache.org
*Subject:* Re: cTakes Annotation Comparison
My original results were using a newly downloaded cTakes 3.2.1 with the
separately downloaded resources copied in. There were no changes to any of
the configuration files.
As far as this last
-
From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com]
Sent: Friday, December 19, 2014 5:05 PM
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison
My apologies to Sean and everyone,
I am happy to report that I found a bug in our analysis tools that was missing
the last
Bruce,
Thanks for this-- very useful.
Perhaps Sean Finan can comment more,
but it's also probably worth comparing to an adjudicated, human-annotated
gold standard.
--Pei
-Original Message-
From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com]
Sent: Thursday, December 18,
Actually, we are working on a similar tool to compare it to the human
adjudicated standard for the set we tested against. I didn't mention it
before because the tool isn't complete yet, but initial results for the set
(excluding those marked as CUI-less) were as follows:
Human adjudicated