Gandhi,

I really appreciate this information. I have started working out the schema
and plan on writing a program that will automatically prepare a script to
work with MySQL. Work in progress. Can you do a quick review of my MySQL
schema so far?

CREATE SCHEMA CTAKES_DATA;

use CTAKES_DATA;

CREATE TABLE CUI_TERMS (
  CUI BIGINT NOT NULL,
  RINDEX INT NOT NULL,
  TCOUNT INT NOT NULL,
  TEXT VARCHAR(255) NOT NULL,
  RWORD VARCHAR(48) NOT NULL
);
CREATE INDEX IDX_CUI_TERMS ON CUI_TERMS (RWORD);

CREATE TABLE TUI (
  CUI BIGINT NOT NULL,
  TUI INT NOT NULL
);
CREATE INDEX IDX_TUI ON TUI (CUI);

CREATE TABLE PREFTERM (
  CUI BIGINT NOT NULL,
  PREFTERM VARCHAR(511) NOT NULL
);
CREATE INDEX IDX_PREFTERM ON PREFTERM (CUI);

CREATE TABLE RXNORM (
  CUI BIGINT NOT NULL,
  RXNORM BIGINT NOT NULL
);
CREATE INDEX IDX_RXNORM ON RXNORM (CUI);

CREATE TABLE SNOMEDCT_US (
  CUI BIGINT NOT NULL,
  SNOMEDCT_US BIGINT NOT NULL
);
CREATE INDEX IDX_SNOMEDCT_US ON SNOMEDCT_US (CUI);
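
A rough Python sketch of the conversion program mentioned above. It assumes
that the data rows in the generated '.script' file are one-line INSERT INTO
statements without a trailing semicolon, and that every other line is
HSQLDB-specific and can be dropped. The function names are hypothetical and
not part of cTAKES. Differences in string escaping between HSQLDB and MySQL
(for example backslashes) are not handled here.

```python
from typing import Iterable, Iterator, List


def to_mysql_statements(script_lines: Iterable[str]) -> Iterator[str]:
    """Yield semicolon-terminated INSERT statements from .script lines."""
    for raw in script_lines:
        line = raw.strip()
        # Assumption: only data rows are wanted. SET, CREATE, GRANT and
        # other non-INSERT lines are treated as HSQLDB-specific and skipped.
        if not line.upper().startswith("INSERT INTO "):
            continue
        # Add the statement terminator that the mysql client expects.
        yield line if line.endswith(";") else line + ";"


def split_into_parts(statements: Iterable[str], parts: int) -> List[List[str]]:
    """Distribute statements round-robin so they can be loaded in parallel."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    buckets: List[List[str]] = [[] for _ in range(parts)]
    for i, stmt in enumerate(statements):
        buckets[i % parts].append(stmt)
    return buckets
```

Passing the lines of the '.script' file through to_mysql_statements and
writing each list returned by split_into_parts to its own file would give
scripts that can be loaded from separate mysql sessions, along the lines of
the five-way split described in the quoted message below.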

Quick question: do you use the AIR table?

Thanks,

Matthew Vita
www.matthewvita.com

On Mon, Oct 9, 2017 at 1:14 AM, Gandhi Rajan Natarajan <
[email protected]> wrote:

> Hi Matthew,
>
> First, I should say that I am also new to cTAKES. Unfortunately, I could
> not find any documentation on this. I followed a crude approach, since this
> is a one-time activity. This is what I did:
>
> 1) I used the dictionary generator GUI to generate SNOMED, RxNorm and MedDRA
> dictionary data, which produced a '.script' file under my
> <ctakes_home>\resources\org\apache\ctakes\dictionary\lookup\fast\<project_name>
> folder.
> 2) The '.script' file contains HSQLDB-specific statements. I removed the
> HSQLDB-specific statements I did not need and manually converted the rest
> to MySQL-specific queries.
> 3) I added a semicolon at the end of each line in the script using a text
> editor and split the file into five parts. Then I ran those five script
> files in five different mysql command-line sessions. It took approximately
> 4 hours to load all the data into the MySQL DB.
>
> As I mentioned earlier, I'm not sure whether this is the right way to
> proceed. But with no documentation available for using MySQL with cTAKES,
> this is the approach that worked for me. I hope it is helpful.
>
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Matthew Vita [mailto:[email protected]]
> Sent: Monday, October 09, 2017 10:41 AM
> To: [email protected]
> Subject: Re: HSQLDB out of memory with custom dictionary
>
> Gandhi,
>
> Thank you for the reply. Do you have any documentation on how to
> accomplish this?
>
> Thanks,
>
> Matthew Vita
> www.matthewvita.com
>
> On Sun, Oct 8, 2017 at 3:14 AM, Gandhi Rajan Natarajan <
> [email protected]> wrote:
>
> > Hi Matthew,
> >
> > I feel that using a MySQL DB would be a better idea than using the
> > in-memory HSQLDB. In fact, this also comes in handy when you are planning
> > to deploy cTAKES as a web application, as in our case.
> >
> > Regards,
> > Gandhi
> >
> > -----Original Message-----
> > From: Matthew Vita [mailto:[email protected]]
> > Sent: Sunday, October 08, 2017 6:02 AM
> > To: [email protected]
> > Subject: HSQLDB out of memory with custom dictionary
> >
> > Hi Sean, Tim, cTAKES Community,
> >
> > I have put together what I am considering a pretty standard dictionary
> > with sources from the following:
> >
> >    - MEDLINEPLUS
> >    - MSH
> >    - NCI
> >    - NDFRT
> >    - CHV
> >    - CSP
> >    - ICPC2P
> >    - MEDCIN
> >    - SNOMED
> >    - RXNORM
> >    - ICD10
> >
> > However, when copied over to cTAKES (handled by the handy Dictionary
> > Creator GUI) HSQLDB runs out of memory.
> >
> > This is my first experience with HSQLDB so you’ll have to excuse my
> > limited knowledge here. I do understand that it can run either
> > in memory or on disk, but I’m not sure how to configure this.
> >
> > Here is how I am connecting to it:
> >
> >
> >   <dictionary>
> >     <name>sno_rx_16abTerms</name>
> >     <implementationName>org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary</implementationName>
> >     <properties>
> >       <property key="jdbcDriver" value="org.hsqldb.jdbcDriver" />
> >       <property key="jdbcUrl" value="jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab/sno_rx_16ab" />
> >       <property key="jdbcUser" value="sa" />
> >       <property key="jdbcPass" value="" />
> >       <property key="rareWordTable" value="cui_terms" />
> >       <property key="umlsUrl" value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser" />
> >       <property key="umlsVendor" value="NLM-6515182895" />
> >       <property key="umlsUser" value="CHANGE_ME" />
> >       <property key="umlsPass" value="CHANGE_ME" />
> >     </properties>
> >   </dictionary>
> >
> >   <dictionary>
> >
> >
> >
> > Can I configure HSQLDB to be used on disk? If this is not a good
> > approach, can I spin up MySQL in its place?
> >
> >
> > Sorry if this has been asked before.
> >
> >
> > Thanks,
> >
> > Matthew Vita
> > www.matthewvita.com
> > This email and any files transmitted with it are confidential and
> > intended solely for the use of the individual or entity to whom they are
> > addressed.
> > If you are not the named addressee you should not disseminate,
> > distribute or copy this e-mail. Please notify the sender or system
> > manager by email immediately if you have received this e-mail by
> > mistake and delete this e-mail from your system. If you are not the
> > intended recipient you are notified that disclosing, copying,
> > distributing or taking any action in reliance on the contents of this
> > information is strictly prohibited and against the law.
> >
