Hi Jeff,

The short answer: No, LVG is not in the pipeline created by
DefaultFastPipeline.piper.

Longer answer:
In older versions of the dictionary lookup, the Lexical Variant Generator
module (LVG) was recommended to capture lexical variants of terms.  However,
the dictionary resource already contains variants, so the LVG module should
not make much of a difference.  When the fast lookup was new several years
ago, I ran a test with and without LVG on two datasets, and the difference was
along the lines of +1-2% recall, -1% precision.

I think that ClinicalPipelineFactory.getFastPipeline() was a copy-paste of the
previous .getClinicalPipeline() but with the dictionary module replaced.  So,
LVG is still in the pipeline created by that method.

When I (more recently) wrote the piper file that you reference, I left out LVG
because the added burden didn't seem to warrant its presence.  By burden I
don't just mean the speed decrease and memory footprint.  There have been a lot
of configuration problems with LVG on various systems, which made cTAKES
difficult to use.

The diagram that you reference places LVG after the dictionary lookup and
after the part-of-speech tagger, while the page on LVG
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+-+LVG lists those
as the two modules that may benefit from its presence.  That diagram is very
old and should definitely be updated.  Both the diagram and the page on LVG
predate, and so do not account for, the fast dictionary lookup.

Sean


________________________________________
From: Jeffrey Miller <[email protected]>
Sent: Tuesday, February 19, 2019 10:53 AM
To: [email protected]
Subject: DefaultFastPipeline.piper and LVG Annotator [EXTERNAL]

Hi,

I was wondering if the LVG Annotator is included in DefaultFastPipeline.piper
<https://svn.apache.org/repos/asf/ctakes/trunk/ctakes-clinical-pipeline-res/src/main/resources/org/apache/ctakes/clinical/pipeline/DefaultFastPipeline.piper>.
I have tried to trace through all the includes, but I cannot find it.
However, when I look at the code for
ClinicalPipelineFactory.getFastPipeline(), it seems to be included.
<https://github.com/apache/ctakes/blob/513bb49ebb98c4ac63f690c7b88a82aff18947b8/ctakes-clinical-pipeline/src/main/java/org/apache/ctakes/clinicalpipeline/ClinicalPipelineFactory.java#L98>
From the documentation in this flow diagram
<https://cwiki.apache.org/confluence/download/attachments/68718172/ctakes-3.1-dependencies.png?version=1&modificationDate=1488992146000&api=v2>
from the components documentation page
<https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+Component+Use+Guide>,
it seems to be a recommended component for the dictionary annotator.

Thanks for your help,
Jeff
