Hi Abad,

> · How can we point cTAKES application to multiple dictionaries.
> Currently only sno_rx_16ab is pointed to the application, how can I tweak
> it to point that to multiple dictionary simultaneously. Or you meant to say
> create a fresh dictionary with all the vocabularies and point just that in
> cTAKES.

If you go back in the archive a bit, you should find a thread where I went
into detail on how to add multiple dictionaries. Combining all dictionaries
into a fresh dictionary is not recommended for obvious reasons. If you can't
find the thread, I will dig it up.

> · So for these edits I will have to add INSERT queries to
> respective tables in the sno_rx_16ab.script file right? Do I need to make
> any more changes for these tokens to get reflected in cTAKES.

Nope! That is all that is needed. The next time you launch cTAKES, it should
recognize your new entries.

> · If it is a non-existing CUI , I can get the respective CUI,TUI from
> here https://uts.nlm.nih.gov//metathesaurus.html
> right?

Correct! Remember that the ontology has multiple inheritance, so you need to
grab all the TUIs for a given CUI.

> · Based on the source I will have to add entry to respective table
> right? Like SNOMED,RxNORM,ICD 10 and a CUI will belong to either one of it
> and not in all. Correct me if am wrong on this understanding

That is also correct. Most of the time, a dictionary contains only one CODE
table, so it is not even a question. However, sno_rx_16ab is an exception: it
has a CODE table for both SNOMEDCT_US and RXNORM. The two mostly do not
overlap. I do remember a couple of exceptions but, where that happens, the
Metathesaurus will show it.
For example: 'Acebutolol' (CUI: C0000946) has two SNOMEDCT_US codes
(372815001 and 68088000) **and** an RXNORM code of 149.

> · PREFTERM table will be having only one entry for each CUI right?
> Basically it’s a one-to-one mapping between CUI and PREFTERM . Correct me
> if am wrong on this understanding.

You are correct here also. It is a one-to-one mapping, although the system
appears to tolerate a missing PREFTERM.

*Rémy Sanouillet*
NLP Engineer
[email protected] <[email protected]>

ForeSee Medical, Inc.
12555 High Bluff Drive, Suite 100
San Diego, CA 92130

NOTICE: This e-mail message and all attachments transmitted with it are
intended solely for the use of the addressee and may contain legally
privileged and confidential information. If the reader of this message is
not the intended recipient, or an employee or agent responsible for
delivering this message to the intended recipient, you are hereby notified
that any dissemination, distribution, copying, or other use of this message
or its attachments is strictly prohibited. If you have received this
message in error, please notify the sender immediately by replying to this
message and please delete it from your computer.

On Mon, Jun 1, 2020 at 7:56 AM <[email protected]> wrote:

> Thank you Remy and Peter for your responses. I hope you guys are doing
> good and safe in this lock down period. Could you pls. help me on my below
> queries in creating an additional dictionary.
>
> · How to create additional dictionary. You meant to say using the
> UMLS tool , so that using that tool we create .script files from .RRF files?
>
> · How can we point cTAKES application to multiple dictionaries.
> Currently only sno_rx_16ab is pointed to the application, how can I tweak
> it to point that to multiple dictionary simultaneously. Or you meant to say
> create a fresh dictionary with all the vocabularies and point just that in
> cTAKES.
>
> I hope Remy was explaining editing the existing dictionary where I would
> deal with two scenarios where one was with existing CUI and other was with
> Non-existing CUI. Could you pls. resolve the below queries regarding the
> same.
>
> · So for these edits I will have to add INSERT queries to
> respective tables in the sno_rx_16ab.script file right? Do I need to make
> any more changes for these tokens to get reflected in cTAKES.
>
> · If it is a non-existing CUI , I can get the respective CUI,TUI
> from here https://uts.nlm.nih.gov//metathesaurus.html
> right?
>
> · Based on the source I will have to add entry to respective table
> right? Like SNOMED,RxNORM,ICD 10 and a CUI will belong to either one of it
> and not in all. Correct me if am wrong on this understanding
>
> · PREFTERM table will be having only one entry for each CUI right?
> Basically it’s a one-to-one mapping between CUI and PREFTERM . Correct me
> if am wrong on this understanding.
>
> Thanks & Regards
>
> *Abad Ayyub*
>
> Vnet: 406170 | Cell : +91-9447379028
>
> *From:* Remy Sanouillet <[email protected]>
> *Sent:* Friday, May 29, 2020 9:25 PM
> *To:* [email protected]
> *Cc:* [email protected]
> *Subject:* Re: Building a new custom dictionary or Updating/Adding values
> to the existing dictionary in cTAKES
>
> *[External]*
>
> Hello Abad,
>
> The short answer is, yes, the sno_rx_16ab can be "hacked". A couple of
> caveats are that any mistake can stop all recognition and you will lose all
> your mods on updates. So an additional dictionary is a recommended approach.
>
> There are two cases. Either the CUI you are adding already exists and you
> are just adding a synonym. In that case, you only need to add one line:
>
> INSERT INTO CUI_TERMS VALUES(CUI,RINDEX,TCOUNT,TEXT,RWORD)
>
> where:
>
> - CUI is the cui, 'nuff said
> - TEXT is the tokenized lowercase string for the entry. In your case
>   'pap smear'. Most punctuation is a separate token. Single quotes are
>   escaped by doubling them
> - RWORD is the one token in TEXT that is the most indicative (least
>   common) which will be used as the index in the lookup. In your case
>   probably 'pap' since it is not as common as 'smear'
> - RINDEX is the index of RWORD in TEXT. First token is 0 which is the
>   case for 'pap'
> - TCOUNT is the token count for TEXT. In your case, 2
>
> So you would want to add:
>
> INSERT INTO CUI_TERMS VALUES(200845,0,2,'pap smear','pap')
>
> If the entry is a non-existing one, you will need to add a few more
> lines. Their positions are unimportant as long as they are below the header
> lines (below the final "SET SCHEMA PUBLIC" line).
>
> 1. INSERT INTO TUI VALUES(CUI,TUI)
>    One line for each TUI in the taxonomy
> 2. INSERT INTO SNOMEDCT_US VALUES(CUI,SNOMED)
>    assuming you are adding a SNOMED
> 3. INSERT INTO PREFTERM VALUES(CUI,PREFTERM)
>    where PREFTERM is the pretty string to describe the entry. It need not
>    correspond to any indexed entry. It is used for display once the lookup
>    has been successful.
>
> That's it. Use at your own discretion. No guarantees.
>
> *Rémy Sanouillet*
> NLP Engineer
> [email protected] <[email protected]>
>
> ForeSee Medical, Inc.
> 12555 High Bluff Drive, Suite 100
> San Diego, CA 92130
>
> On Fri, May 29, 2020 at 7:34 AM <[email protected]> wrote:
>
> Hi Team,
>
> We set up cTAKES 4.0.0 as our NLP engine for our profile recently . We have
> faced situations where some of the expected tokens are not picked up by
> cTAKES during clinical text extraction. So our first thought process was to
> identify where the dictionary is configured and how that can be updated.
> After some code analysis it was found that the dictionary is configured in
> the below path under ctakes/resources for sources RxNorm and SNOMEDCT_US
>
> We were able to open the hsqldb using the hsql db gui and found out that
> some of our required entries are already there . So if I come specifically
> to our current problem. The Pap Smear and Mammogram are two clinical terms
> which are not currently recognized by cTAKES in our profile.
>
> · If I look into the .script file , Pap Smear and
> Mammogram/Mammography is already present in the .script file and in the
> respective tables. PFB a snapshot as below
>
> But still this was not recognised by cTAKES. I see there are some filters
> working on top of the available entries in dictionary(ctakes-gui and
> ctakes-gui-res). Will that be because of these filters the tokens are not
> recognized as expected. Could you pls. share us what exactly these filters
> do. This will help us in future also when we are trying to add new terms
> into the dictionary
>
> · What are the steps to do if we need to add/edit entries into the
> existing dictionaries.
> I see we can add/edit the existing values in
> .script files but our primary doubt is if suppose I have a term ‘xyz’ to
> be added to dictionary how can I get the CUI and other values like
> TUI,RINDEX,TCOUNT and PREFTERM. Is it fine if I can give any random value
> for the TUI/CUI/RINDEX/TCOUNT. I could also see options to create custom
> bsv dictionaries but couldn’t see much documentation for it. Kindly advise
> which is the better option from the below 3.
>
> o Generate a custom dictionary using the MetamorphoSys UMLS installation
> tool (where we provide sources as ICD10,RxNORM,SNOMEDCT_US) and leverage the
> full set of .rrf files in the meta folder . Is this approach better if the
> entries to be populated are maximal?
>
> o Add/edit the available dictionary sno_rx_16ab and in that case how to
> provide valid values for each columns like CUI, TUI,RINDEX,TCOUNT and
> PREFTERM. If the entries to be populated are minimal is this approach would
> be better?.
>
> o Use a custom bsv , in that case how should we add values to custom
> bsv. Could you also provide a sample in that case.
>
> I found a Metathesaurus browser in the below url , where I can search for
> the terms and get the CUI and the respective source like ICD/CPT/MDR. But
> still I was unable to get the other required attributes to be populated
> like TUI,RINDEX,TCOUNT and PREFTERM. Could you pls. brief what these
> attributes signifies
>
> https://uts.nlm.nih.gov//metathesaurus.html
>
> Kindly advise us on how to proceed on this and correct us if we went wrong
> somewhere.
> This would be of great help for us
>
> P.S : We comply with UMLS license
>
> Thanks & Regards
>
> *Abad Ayyub*
>
> Vnet: 406170 | Cell : +91-9447379028
>
> This e-mail and any files transmitted with it are for the sole use of the
> intended recipient(s) and may contain confidential and privileged
> information. If you are not the intended recipient(s), please reply to the
> sender and destroy all copies of the original message. Any unauthorized
> review, use, disclosure, dissemination, forwarding, printing or copying of
> this email, and/or any action taken in reliance on the contents of this
> e-mail is strictly prohibited and may be unlawful. Where permitted by
> applicable law, this e-mail and other e-mail communications sent to and
> from Cognizant e-mail addresses may be monitored.
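
For anyone who would rather generate the INSERT lines than type them by hand,
here is a minimal Python sketch of the row-building rules described in the
thread above. It is an illustration under assumptions, not part of cTAKES:
the function names (sql_quote, cui_terms_insert, new_concept_inserts) are
made up for this example; the caller is expected to supply tokens that are
already split the way cTAKES would split them; and the TUI and code values
are assumed to be stored as plain numbers, which you should confirm against
the existing rows in your own .script file. The CUI 9999999, code 123456789
and the PREFTERM string in the second call are placeholders, not real UMLS
data.

```python
from __future__ import annotations


def sql_quote(value: str) -> str:
    """Quote a string for the .script file; single quotes are doubled."""
    return "'" + value.replace("'", "''") + "'"


def cui_terms_insert(cui: int, tokens: list[str], rword: str) -> str:
    """Build the CUI_TERMS line that adds one synonym to an existing CUI.

    tokens: the entry already split into tokens (tokenization is NOT done here).
    rword:  the most indicative (least common) token, chosen by the caller.
    """
    tokens = [t.lower() for t in tokens]
    rword = rword.lower()
    if rword not in tokens:
        raise ValueError(f"rword {rword!r} is not one of the tokens {tokens}")
    rindex = tokens.index(rword)  # position of RWORD in TEXT, first token is 0
    tcount = len(tokens)          # number of tokens in TEXT
    text = " ".join(tokens)
    return (
        "INSERT INTO CUI_TERMS VALUES("
        f"{cui},{rindex},{tcount},{sql_quote(text)},{sql_quote(rword)})"
    )


def new_concept_inserts(
    cui: int,
    tuis: list[int],
    code_table: str,
    codes: list[int],
    prefterm: str,
) -> list[str]:
    """Build the extra lines needed when the CUI is not yet in the dictionary.

    Assumption: TUIs and codes are written as bare numbers, one row each.
    """
    lines = [f"INSERT INTO TUI VALUES({cui},{tui})" for tui in tuis]
    lines += [f"INSERT INTO {code_table} VALUES({cui},{code})" for code in codes]
    lines.append(f"INSERT INTO PREFTERM VALUES({cui},{sql_quote(prefterm)})")
    return lines


if __name__ == "__main__":
    # The 'pap smear' synonym from the thread.
    print(cui_terms_insert(200845, ["pap", "smear"], "pap"))
    # Placeholder values only; look up the real CUI, TUIs and codes in the
    # Metathesaurus browser before adding anything.
    for line in new_concept_inserts(
        9999999, [60], "SNOMEDCT_US", [123456789], "Example procedure"
    ):
        print(line)
```

The first call reproduces the 'pap smear' line given in the thread. The
generated lines still have to be pasted below the final "SET SCHEMA PUBLIC"
line of the .script file, and a new concept also needs at least one
CUI_TERMS line so that there is text to look up.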
