RE: SubjectClearTkAnalysisEngine not working [EXTERNAL]

2018-01-17 Thread Finan, Sean
Yes.

Find your pipeline addition of "SimpleSegmentAnnotator" and replace it with 
"BsvRegexSectionizer".  It is most likely the first thing in the pipeline.

-Original Message-
From: Ratan Sharma [mailto:ratanc...@gmail.com] 
Sent: Wednesday, January 17, 2018 12:53 PM
To: dev@ctakes.apache.org
Subject: Re: SubjectClearTkAnalysisEngine not working [EXTERNAL]

No, I guess I am not running it anywhere explicitly. Is that the reason 
segment.getPreferredText() is always null for me?
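
Until a sectionizer fills in the preferred text, the printout can at least guard against the missing value. Below is a minimal sketch in plain Java, assuming (as this thread suggests) that the preferred text stays null when no sectionizer has populated it. `SegmentLabels`, `labelFor`, the "S1" ID, and the "UNKNOWN_SECTION" placeholder are hypothetical names for illustration and are not part of cTAKES.

```java
public class SegmentLabels {

    // Returns the preferred text when it is present, otherwise falls back to
    // the segment ID, and finally to a fixed placeholder (hypothetical value).
    static String labelFor(String preferredText, String id) {
        if (preferredText != null && !preferredText.trim().isEmpty()) {
            return preferredText;
        }
        if (id != null && !id.trim().isEmpty()) {
            return id;
        }
        return "UNKNOWN_SECTION";
    }

    public static void main(String[] args) {
        // No preferred text available: the ID is used as the label.
        System.out.println(labelFor(null, "S1"));
        // Preferred text available: it wins over the ID.
        System.out.println(labelFor("Family Medical History", "S1"));
    }
}
```

In the segment loop from the question, this would be called as labelFor(segment.getPreferredText(), segment.getId()).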

On Wed, Jan 17, 2018 at 11:14 PM, Finan, Sean < 
sean.fi...@childrens.harvard.edu> wrote:

> Are you running the BsvRegexSectionizer?

RE: SubjectClearTkAnalysisEngine not working [EXTERNAL]

2018-01-17 Thread Finan, Sean
Are you running the BsvRegexSectionizer? 


Re: SubjectClearTkAnalysisEngine not working [EXTERNAL]

2018-01-17 Thread Ratan Sharma
Thanks, Timothy, for the details. I guess I was looking for sections/segments
only. I tried to work out that piece but was unable to pull the information.
Can you please guide me a bit?

Below is the code snippet to pull segment information.

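import java.util.List;
import org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation;
import org.apache.ctakes.typesystem.type.textsem.LabMention;
import org.apache.ctakes.typesystem.type.textspan.Segment;
import org.apache.uima.fit.util.JCasUtil;

// Print each identified annotation with its polarity and subject.
for (IdentifiedAnnotation entity : JCasUtil.select(jcas, IdentifiedAnnotation.class)) {
    if (entity.getTypeID() != 0) {
        System.out.println("Entity: " + entity.getTypeID()
                + " === Text: " + entity.getCoveredText()
                + " === Polarity: " + entity.getPolarity()
                + " === Subject: " + entity.getSubject()
                + " === EntityName: " + entity.getType());
    }
}

// Print each segment with the number of lab mentions it covers.
for (Segment segment : JCasUtil.select(jcas, Segment.class)) {
    List<LabMention> mentions = JCasUtil.selectCovered(jcas, LabMention.class, segment);
    System.out.println("LATEST DATA : " + segment.getPreferredText()
            + " (" + segment.getId() + "): " + mentions.size() + " mention(s)");
}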


The first for loop gives me the correct result, but the second for loop gives
a null error.



Re: SubjectClearTkAnalysisEngine not working [EXTERNAL]

2018-01-16 Thread Miller, Timothy
OK, it sounds like a slight misunderstanding of what "subject" refers
to. The subject field refers to _who_ is the subject of an event.

This is important to differentiate diseases that are mentioned because
the patient is experiencing them ("pt has colon cancer") from those
that might be mentioned because a family member had them ("mother had
breast cancer").

What you're talking about sounds more like "Sections", which I think in
cTAKES are called "segments". There is a regex-based section finder in
cTAKES, but it is not enabled by default because it would usually need
to be customized for a given institution's notes.
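
To give a rough idea of what a regex-based section finder does, here is a minimal sketch in plain Java. It is not the cTAKES sectionizer and makes no claim about how that component works internally. The class name `RegexSectionSketch`, the header list, and the sample note are all assumptions made up for illustration. The header list is exactly the part that would need to be tailored to a given institution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSectionSketch {

    // One detected section: its header name and character span in the note.
    static final class Section {
        final String name;
        final int begin;
        final int end;

        Section(String name, int begin, int end) {
            this.name = name;
            this.begin = begin;
            this.end = end;
        }
    }

    // Hypothetical header pattern: one of a few known names at the start of a
    // line, followed by a colon. A real site would need its own list.
    private static final Pattern HEADER = Pattern.compile(
            "^(Vital Signs|Physical Examination|Family Medical History|Lab Results):",
            Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);

    // Each section runs from its header to the start of the next header,
    // or to the end of the note for the last one.
    static List<Section> findSections(String note) {
        List<String> names = new ArrayList<>();
        List<Integer> starts = new ArrayList<>();
        Matcher m = HEADER.matcher(note);
        while (m.find()) {
            names.add(m.group(1));
            starts.add(m.start());
        }
        List<Section> sections = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            int end = (i + 1 < starts.size()) ? starts.get(i + 1) : note.length();
            sections.add(new Section(names.get(i), starts.get(i), end));
        }
        return sections;
    }

    // Returns the name of the section containing the offset, or null if none.
    static String sectionAt(List<Section> sections, int offset) {
        for (Section s : sections) {
            if (offset >= s.begin && offset < s.end) {
                return s.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up sample note, for illustration only.
        String note = "Family Medical History: Mother had breast cancer.\n"
                + "Lab Results: Hemoglobin 13.2\n";
        List<Section> sections = findSections(note);
        int offset = note.indexOf("breast cancer");
        System.out.println(sectionAt(sections, offset));
    }
}
```

Looking up a mention's begin offset with sectionAt is the plain-Java analogue of asking which segment covers an annotation.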

Tim



Re: SubjectClearTkAnalysisEngine not working [EXTERNAL]

2018-01-16 Thread Ratan Sharma
I am trying to find out whether an entity falls into one of these categories,
and my understanding was that the subject could give me this information.

The SUBJECT it belongs to, such as:
*"Vital Signs", "BP", "Physical Examination", "Family Medical History",
"Lab Results"*

Any idea how to achieve this?


On Wed, Jan 17, 2018 at 1:05 AM, Miller, Timothy <
timothy.mil...@childrens.harvard.edu> wrote:

> What output would you like? What are you expecting?
>
> This field in theory could have a few different values: patient,
> family_member, other, donor(iirc?)
>
> But in reality our training data was very skewed towards the patient
> label, and the representation we used for training is not great at
> picking up section-wide cues that would be helpful (like a family
> history section header). So in practice it almost always will say
> "patient." It may occasionally get something very obvious, such as
> "Mother had breast cancer." I don't know if it will get this exact
> example; it probably needs to look exactly like a training instance
> because we had very few to generalize from.
> Thanks
> Tim
>
>
> On Wed, 2018-01-17 at 00:57 +0530, Ratan Sharma wrote:
> > I am able to pull entity information for the different sections
> > correctly, but I am facing issues when it comes to pulling subject
> > information. The subject is always pulled as "PATIENT".
> >
> > I do have this added in the AssertionPipeline:
> > builder.add( SubjectCleartkAnalysisEngine.createAnnotatorDescription() );
> >
> >
> > Here is some sample output:
> >
> > Entity: 3 === Text: Blood Transfusion === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.SignSymptomMention
> > Entity: 6 === Text: Blood === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.AnatomicalSiteMention
> > Entity: 3 === Text: Transfusion Reaction === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.SignSymptomMention
> > Entity: 5 === Text: Transfusion === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.ProcedureMention
> > Entity: 2 === Text: HIV === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention
> > Entity: 6 === Text: Sickle Cell === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.AnatomicalSiteMention
> > Entity: 2 === Text: Neurologic Disorders === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention
> > Entity: 2 === Text: Autoimmune Disorders === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention
> > Entity: 3 === Text: Autoimmune === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.SignSymptomMention
> > Entity: 2 === Text: Autoimmune Disorders === Polarity: -1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention
> > Entity: 3 === Text: Autoimmune === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.SignSymptomMention