RE: SubjectClearTkAnalysisEngine not working [EXTERNAL]

2018-01-17 Thread Finan, Sean
Yes.

Find your pipeline addition of "SimpleSegmentAnnotator" and replace it with 
"BsvRegexSectionizer".  It is most likely the first thing in the pipeline.
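
For readers unfamiliar with what a regex-based sectionizer does, here is a minimal conceptual sketch using only the Java standard library. It is not the cTAKES BsvRegexSectionizer implementation; the class name SectionSketch, the header pattern, and the sample note are assumptions made for illustration, with header names borrowed from the examples given elsewhere in this thread. It splits a note into sections at known headers and then reports which section covers a given character span.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SectionSketch {

    /** A section of the note: header name plus character offsets [begin, end). */
    public static final class Section {
        public final String name;
        public final int begin;
        public final int end;

        Section(String name, int begin, int end) {
            this.name = name;
            this.begin = begin;
            this.end = end;
        }
    }

    // Hypothetical header pattern: a known header at the start of a line, followed by a colon.
    private static final Pattern HEADER = Pattern.compile(
            "^(Vital Signs|Physical Examination|Family Medical History|Lab Results):",
            Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);

    /** Splits the text into sections, each running from its header to the next header. */
    public static List<Section> findSections(String text) {
        List<Section> sections = new ArrayList<>();
        Matcher m = HEADER.matcher(text);
        String name = null;
        int begin = 0;
        while (m.find()) {
            if (name != null) {
                sections.add(new Section(name, begin, m.start()));
            }
            name = m.group(1);
            begin = m.start();
        }
        if (name != null) {
            sections.add(new Section(name, begin, text.length()));
        }
        return sections;
    }

    /** Returns the name of the section covering the span, or null if none does. */
    public static String sectionOf(List<Section> sections, int begin, int end) {
        for (Section s : sections) {
            if (s.begin <= begin && end <= s.end) {
                return s.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up sample note, for illustration only.
        String note = "Family Medical History: Mother had breast cancer.\n"
                + "Lab Results: Hemoglobin 9.1\n";
        List<Section> sections = findSections(note);
        int begin = note.indexOf("breast cancer");
        System.out.println(sectionOf(sections, begin, begin + "breast cancer".length()));
    }
}
```

Text before the first header is left unassigned in this sketch. A real institution's notes would likely need a different, customized set of header patterns.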

-----Original Message-----
From: Ratan Sharma [mailto:ratanc...@gmail.com] 
Sent: Wednesday, January 17, 2018 12:53 PM
To: dev@ctakes.apache.org
Subject: Re: SubjectClearTkAnalysisEngine not working [EXTERNAL]

No, I guess I am not running it anywhere explicitly. Is that the reason 
segment.getPreferredText() is always null for me?

On Wed, Jan 17, 2018 at 11:14 PM, Finan, Sean < 
sean.fi...@childrens.harvard.edu> wrote:

> Are you running the BsvRegexSectionizer?

RE: SubjectClearTkAnalysisEngine not working [EXTERNAL]

2018-01-17 Thread Finan, Sean
Are you running the BsvRegexSectionizer? 


Re: SubjectClearTkAnalysisEngine not working [EXTERNAL]

2018-01-17 Thread Ratan Sharma
Thanks, Timothy, for the details. I guess I was looking for sections/segments
only. I tried to work that piece out but am unable to pull the information.
Can you please guide me a bit?

Below is the code snippet to pull segment information.

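import java.util.List;
import org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation;
import org.apache.ctakes.typesystem.type.textsem.LabMention;
import org.apache.ctakes.typesystem.type.textspan.Segment;
import org.apache.uima.fit.util.JCasUtil;

// Print every identified annotation that has a non-zero type ID.
for (IdentifiedAnnotation entity : JCasUtil.select(jcas, IdentifiedAnnotation.class)) {
    if (entity.getTypeID() != 0) {
        System.out.println("Entity: " + entity.getTypeID()
                + " === Text: " + entity.getCoveredText()
                + " === Polarity: " + entity.getPolarity()
                + " === Subject: " + entity.getSubject()
                + " === EntityName: " + entity.getType());
    }
}

// Print each segment with the number of lab mentions it covers.
for (Segment segment : JCasUtil.select(jcas, Segment.class)) {
    List<LabMention> mentions = JCasUtil.selectCovered(jcas, LabMention.class, segment);
    System.out.println("LATEST DATA : " + segment.getPreferredText()
            + " (" + segment.getId() + "): " + mentions.size() + " mention(s)");
}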


The first for loop gives me the correct result, but the second for loop
gives a null error.
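
As a small defensive sketch for the second loop, the label can fall back to the segment id when no preferred text has been set. This uses only the Java standard library; the class SegmentLabelSketch, the method describe, and the ids "S1" and "S2" are hypothetical names for illustration, and the values would come from segment.getPreferredText() and segment.getId() in the snippet above. It does not address why the preferred text is null in the first place.

```java
import java.util.Objects;

public class SegmentLabelSketch {

    /** Builds the report line, falling back to the segment id when no preferred text was set. */
    public static String describe(String preferredText, String id, int mentionCount) {
        String label = (preferredText == null || preferredText.isEmpty()) ? id : preferredText;
        // If both the preferred text and the id are missing, use a visible placeholder.
        return Objects.toString(label, "<unnamed>")
                + " (" + id + "): " + mentionCount + " mention(s)";
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for what a Segment would return.
        System.out.println(describe(null, "S1", 3));
        System.out.println(describe("Lab Results", "S2", 3));
    }
}
```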


On Wed, Jan 17, 2018 at 1:14 AM, Miller, Timothy <
timothy.mil...@childrens.harvard.edu> wrote:

> OK, it sounds like a slight misunderstanding of what "subject" refers
> to. The subject field refers to _who_ is the subject of an event.
>
> This is important to differentiate diseases that are mentioned because
> the patient is experiencing them ("pt has colon cancer") from those
> that might be mentioned because a family member had them ("mother had
> breast cancer").
>
> What you're talking about sounds more like "Sections", which I think in
> cTAKES are called "segments". There is a regex-based section finder in
> cTAKES but it is not enabled by default because it would usually need
> to be customized for a given institution's notes.
>
> Tim
>
>
> On Wed, 2018-01-17 at 01:10 +0530, Ratan Sharma wrote:
> > I am trying to find out whether an entity falls in one of these
> > categories, and my understanding was that subject could give me this
> > information.
> >
> > SUBJECT it belongs to, like -
> > *"Vital Signs", "BP", "Physical Examination", "Family Medical History",
> > "Lab Results"*
> >
> > Any idea how to achieve this?
> >
> >
> > On Wed, Jan 17, 2018 at 1:05 AM, Miller, Timothy <
> > timothy.mil...@childrens.harvard.edu> wrote:
> >
> > >
> > > What output would you like? What are you expecting?
> > >
> > > This field in theory could have a few different values: patient,
> > > family_member, other, donor(iirc?)
> > >
> > > But in reality our training data was very skewed towards the patient
> > > label, and the representation we used for training is not great at
> > > picking up section-wide cues that would be helpful (like a family
> > > history section header). So in practice it almost always will say
> > > "patient." It may occasionally get something very obvious: "Mother had
> > > breast cancer"
> > > I don't know if it will get this exact example, it probably needs to
> > > look exactly like a training instance because we had very few to
> > > generalize from.
> > > Thanks
> > > Tim
> > >
> > >
> > > On Wed, 2018-01-17 at 00:57 +0530, Ratan Sharma wrote:
> > > >
> > > > I am able to pull entity information for different sections
> > > > correctly, but I am facing issues when it comes to pulling subject
> > > > information. The subject is always pulled as "PATIENT".
> > > >
> > > > I do have this added in the AssertionPipeline:
> > > > builder.add( SubjectCleartkAnalysisEngine.createAnnotatorDescription() );
> > > >
> > > >
> > > > Here is some sample output:
> > > >
> > > > Entity: 3 === Text: Blood Transfusion === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.SignSymptomMention
> > > > Entity: 6 === Text: Blood === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.AnatomicalSiteMention
> > > > Entity: 3 === Text: Transfusion Reaction === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.SignSymptomMention
> > > > Entity: 5 === Text: Transfusion === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.ProcedureMention
> > > > Entity: 2 === Text: HIV === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention
> > > > Entity: 6 === Text: Sickle Cell === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.AnatomicalSiteMention
> > > > Entity: 2 === Text: Neurologic Disorders === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention
> >