Hi Gandhi,
Again, sorry for the late reply!
To present and compare outputs for the same phrase across multiple
pipelines, I have put together a comprehensive (throw-away) repository that
shows and explains the differences between the results of the proper cTAKES
system and the MySQL REST one. The phrase was: severe bipolar i disorder.
I tried to keep the README short but informative, with useful commentary.
Each result is packaged under its system and pipeline:
https://github.com/MatthewVita/cTAKES-Special-Case-QA
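For anyone who wants to compare two of the XML outputs without reading them
side by side, here is a rough sketch (not part of the repository; the
function names are made up for illustration, and it assumes the outputs are
well-formed XML). It counts elements by tag name in each document and lists
the tags whose counts differ:

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(xml_text):
    """Count elements by local tag name, ignoring XML namespaces."""
    root = ET.fromstring(xml_text)
    # Namespaced tags look like "{uri}Name"; keep only the part after "}".
    return Counter(el.tag.rsplit("}", 1)[-1] for el in root.iter())


def diff_counts(left, right):
    """Return {tag: (left_count, right_count)} for tags whose counts differ."""
    return {
        tag: (left[tag], right[tag])  # a Counter yields 0 for missing tags
        for tag in sorted(set(left) | set(right))
        if left[tag] != right[tag]
    }


if __name__ == "__main__":
    # Hypothetical file names; point these at two saved outputs.
    with open("cvd.xml", encoding="utf-8") as f:
        cvd = count_tags(f.read())
    with open("rest.xml", encoding="utf-8") as f:
        rest = count_tags(f.read())
    for tag, (n_cvd, n_rest) in diff_counts(cvd, rest).items():
        print(f"{tag}: cvd={n_cvd} rest={n_rest}")
```

Run against one CVD output and one REST output, this should show which
annotation types one system emits and the other does not.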
Please let me know if there are other debugging approaches to try with this
issue. I'm not quite sure how to move forward :).
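In case it helps with reproducing the REST side for more than one pipeline,
here is a minimal sketch of the same POST as the cURL call quoted below,
parameterized by pipeline name. The helper names are made up, the URL is the
local one from the cURL call, and the exact spelling of the pipeline values
("Default", "Full") is an assumption:

```python
import urllib.parse
import urllib.request

# Local endpoint taken from the cURL call; adjust host and port as needed.
BASE_URL = "http://localhost:8080/ctakes-web-rest/service/analyze"


def build_request(text, pipeline):
    """Build (but do not send) a POST request for the given pipeline."""
    query = urllib.parse.urlencode({"pipeline": pipeline})
    return urllib.request.Request(
        f"{BASE_URL}?{query}",
        data=text.encode("utf-8"),
        headers={"cache-control": "no-cache"},
        method="POST",
    )


def analyze(text, pipeline, timeout=60):
    """Send the request and return the response body as text."""
    request = build_request(text, pipeline)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Requires a running ctakes-web-rest instance.
    for name in ("Default", "Full"):
        print(name, len(analyze("severe bipolar i disorder", name)))
```

No Content-Type header is set explicitly, so urllib falls back to the same
form-encoded default that cURL uses with -d.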
Thanks,
Matthew Vita
On Wed, Jun 5, 2019 at 11:23 PM Matthew Vita wrote:
> Sorry for the delay (work is busy) - will report back soon with the
> pipelines.
>
> Thanks,
> Matthew Vita
>
>
>
> On Mon, Jun 3, 2019 at 8:19 AM gandhi rajan wrote:
>
>> Hi Matt, we need to see which types of pipelines are used in both cases.
>> Did you try using pipeline=full instead of pipeline=default?
>>
>> The full pipeline can give more information, I guess.
>>
>> On Monday, June 3, 2019, Matthew Vita wrote:
>>
>> > Hi Gandhi and All,
>> >
>> > (Correction: my previous statement about the MySQL web rest version “not
>> > working in its current state” is only partially true. I was able to HTTP
>> > POST “Hypertension” and get correct results. However, I’ll be showing
>> > that it’s not working for all cases below.)
>> >
>> > My testing/debugging as of today was to set up the following
>> > environments and compare the XMLs:
>> >
>> >
>> > 1. Environment #1 - cTAKES Web Rest MySQL version @ 1850060 (with output
>> >    xml on, per Gandhi) with the resource data loaded in via plain SQL
>> > 2. Environment #2 - cTAKES proper @ 1850060 with the resources data
>> >    loaded on disk
>> >
>> >
>> > This setup allows for the data to be the same in either MySQL or HSQLDB.
>> >
>> >
>> >
>> >
>> > Furthermore, I made sure that the MySQL database had these following
>> > entries because I chose to use ‘severe bipolar i disorder’ as my test
>> > string:
>> >
>> > - cui_terms(236784,12,13,'severe bipolar i disorder , most recent episode mixed , with psychotic features','features')
>> > - TUI(236784,48)
>> > - PREFTERM(236784,'Severe mixed bipolar I disorder with psychotic features')
>> > - SNOMEDCT_US(236784,10981006)
>> >
>> >
>> >
>> >
>> > Here’s the result of using the regular cTAKES setup with CVD and
>> > AggregatePlaintextFastUMLSProcessor:
>> >
>> > severe_bipolar_i_disorder_cvd.xml -
>> > https://gist.github.com/MatthewVita/93000a05a5d0f4ef6a4267359c63b510
>> >
>> > Here’s the result of using cTAKES Web Rest MySQL with cURL:
>> >
>> > curl -X POST \
>> >   'http://localhost:8080/ctakes-web-rest/service/analyze?pipeline=Default' \
>> >   -H 'cache-control: no-cache' \
>> >   -d 'severe bipolar i disorder'
>> >
>> > severe_bipolar_i_disorder_rest.xml -
>> > https://gist.github.com/MatthewVita/341f8c9a3552f3db9352917b810a20b0
>> >
>> >
>> > The results show that the CVD output is much better. The REST output
>> > doesn’t even pick up on the main disorder. *Any thoughts or further
>> > debugging ideas are welcome!*
>> >
>> >
>> >
>> > *Sort of unrelated:* I have a good amount of work getting the MySQL
>> > version’s README instructions cleaned up and removing some other bugs in
>> > the issue tracker. I wonder if it would be Apache license compliant for
>> > the main SVN web rest to link to this one? Perhaps this repo can be
>> > changed to “GoTeamEpsilon/ctakes-mysql-rest-service”?
>> >
>> >
>> > Thanks,
>> > Matthew Vita
>> >
>> >
>> >
>> > On Fri, May 31, 2019 at 10:11 PM gandhi rajan wrote:
>> >
>> > > Hi Matt, I would check whether the XML output from cTAKES contains the
>> > > terms to isolate the issue.
>> > >
>> > > On Saturday, June 1, 2019, Matthew Vita wrote:
>> > >
>> > > > Hi Jeff,
>> > > >
>> > > > Not sure I ran into that same issue. Sorry.
>> > > >
>> > > > In terms of MySQL, I suppose it is faster because it's not in-memory
>> > > > based (to be fair, HSQLDB can utilize disks). Another factor is that
>> > > > you can load balance multiple servers in a "stateless" way if you
>> > > > have a heavy-load environment, because MySQL stands alone.
>> > > >
>> > > >
>> > > >
>> > > > Hi Gandhi,
>> > > >
>> > > > I'm using trunk@1850060 with the MySQL-based codebase on GitHub.
>> > > > Everything builds and it even connects to all of the tables and
>> > > > models; however, it doesn't pick up terms.
>> > > >
>> > > > Where do you think is a good place to start, with respect to
>> > > > debugging? The frustrating part is that there are no errors in the
>> > > > catalina logs :).
>> > > >
>> > > > Thanks,
>> > > > Matthew Vita
>> > > >
>> > > >
>> > > >
>> > > > On Thu, May 30, 2019 at 2:52