RE: SubjectClearTkAnalysisEngine not working [EXTERNAL]
Yes. Find your pipeline addition of "SimpleSegmentAnnotator" and replace it with "BsvRegexSectionizer". It is most likely the first thing in the pipeline.

-----Original Message-----
From: Ratan Sharma [mailto:ratanc...@gmail.com]
Sent: Wednesday, January 17, 2018 12:53 PM
To: dev@ctakes.apache.org
Subject: Re: SubjectClearTkAnalysisEngine not working [EXTERNAL]

No, I guess I am not running it anywhere explicitly. Is that the reason segment.getPreferredText() is always null for me.

On Wed, Jan 17, 2018 at 11:14 PM, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:

> Are you running the BsvRegexSectionizer?
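
For readers unfamiliar with what a regex-based sectionizer does, here is a minimal plain-Java sketch of the general idea: find section headers with a regular expression, turn them into character spans, and look up which span an entity offset falls in. This is not the cTAKES BsvRegexSectionizer implementation and uses no cTAKES classes. The class name, the header pattern, and the methods findSections and sectionOf are all hypothetical, and the header list is only borrowed from the section names mentioned in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSectionSketch {

    /** A section of the note: header name plus the character span it covers. */
    public static final class Section {
        public final String name;
        public final int begin; // offset of the header's first character
        public final int end;   // offset of the next header, or the end of the note

        Section(String name, int begin, int end) {
            this.name = name;
            this.begin = begin;
            this.end = end;
        }
    }

    // Hypothetical header pattern; a real site would tune this to its own notes.
    private static final Pattern HEADER = Pattern.compile(
            "^(Vital Signs|Physical Examination|Family Medical History|Lab Results):",
            Pattern.MULTILINE);

    /** Splits the note into sections, each running from one header to the next. */
    public static List<Section> findSections(String note) {
        List<Section> sections = new ArrayList<>();
        Matcher matcher = HEADER.matcher(note);
        String currentName = null;
        int currentBegin = 0;
        while (matcher.find()) {
            if (currentName != null) {
                // Close the previous section where this header starts.
                sections.add(new Section(currentName, currentBegin, matcher.start()));
            }
            currentName = matcher.group(1);
            currentBegin = matcher.start();
        }
        if (currentName != null) {
            // The last section runs to the end of the note.
            sections.add(new Section(currentName, currentBegin, note.length()));
        }
        return sections;
    }

    /** Returns the name of the section containing the offset, or "UNKNOWN". */
    public static String sectionOf(List<Section> sections, int offset) {
        for (Section section : sections) {
            if (offset >= section.begin && offset < section.end) {
                return section.name;
            }
        }
        return "UNKNOWN"; // e.g. text that appears before the first header
    }

    public static void main(String[] args) {
        String note = "Vital Signs: BP 120/80\n"
                + "Family Medical History: mother had breast cancer\n"
                + "Lab Results: HIV negative\n";
        List<Section> sections = findSections(note);
        int entityOffset = note.indexOf("breast cancer");
        System.out.println(sectionOf(sections, entityOffset)); // prints "Family Medical History"
    }
}
```

In a cTAKES pipeline the equivalent lookup is what the Segment annotations provide once a sectionizer has created them; the sketch only shows why the header patterns usually need tailoring to a given institution's notes.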
RE: SubjectClearTkAnalysisEngine not working [EXTERNAL]
Are you running the BsvRegexSectionizer?

-----Original Message-----
From: Ratan Sharma [mailto:ratanc...@gmail.com]
Sent: Wednesday, January 17, 2018 12:17 PM
To: dev@ctakes.apache.org
Subject: Re: SubjectClearTkAnalysisEngine not working [EXTERNAL]

Thanks Timothy for the details. I guess I was looking for sections/segments only. I tried to work out that piece but was unable to pull the information. Can you please guide me a bit?

Below is the code snippet to pull segment information.

    for (IdentifiedAnnotation entity : JCasUtil.select(jcas, IdentifiedAnnotation.class)) {
        if (entity.getTypeID() != 0)
            System.out.println("Entity: " + entity.getTypeID() +
                    " === Text: " + entity.getCoveredText() +
                    " === Polarity: " + entity.getPolarity() +
                    " === Subject: " + entity.getSubject() +
                    " === EntityName: " + entity.getType());
    }

    for (Segment segment : JCasUtil.select(jcas, Segment.class)) {
        List<LabMention> mentions = JCasUtil.selectCovered(jcas, LabMention.class, segment);
        System.out.println("LATEST DATA : " + segment.getPreferredText() +
                " (" + segment.getId() + "): " + mentions.size() + " mention(s)");
    }

The first for loop gives me the correct result, but the second for loop gives a null error.
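
Because segment.getPreferredText() can come back null when nothing in the pipeline has set it, the second loop in the snippet can guard against that while the pipeline is being fixed. Below is a minimal plain-Java sketch of such a guard. The class SegmentLabelSketch and the helper label are hypothetical, plain strings stand in for the values returned by getPreferredText() and getId(), and the id strings in main are illustrative only.

```java
public class SegmentLabelSketch {

    /**
     * Picks a printable label for a segment: the preferred text if it is set,
     * otherwise the id, otherwise a fixed placeholder.
     */
    public static String label(String preferredText, String id) {
        if (preferredText != null && !preferredText.isEmpty()) {
            return preferredText;
        }
        if (id != null && !id.isEmpty()) {
            return id;
        }
        return "UNLABELED";
    }

    public static void main(String[] args) {
        // No preferred text set, so the id is used instead.
        System.out.println(label(null, "SIMPLE_SEGMENT")); // prints "SIMPLE_SEGMENT"
        // Preferred text wins when it is present.
        System.out.println(label("Lab Results", "LAB"));   // prints "Lab Results"
    }
}
```

Inside the loop this would be called as label(segment.getPreferredText(), segment.getId()). It only makes the output readable; getting real section names still depends on running a sectionizer, as suggested in the reply.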
Re: SubjectClearTkAnalysisEngine not working [EXTERNAL]
Thanks Timothy for the details. I guess I was looking for sections/segments only. I tried to work out that piece but was unable to pull the information. Can you please guide me a bit?

Below is the code snippet to pull segment information.

    for (IdentifiedAnnotation entity : JCasUtil.select(jcas, IdentifiedAnnotation.class)) {
        if (entity.getTypeID() != 0)
            System.out.println("Entity: " + entity.getTypeID() +
                    " === Text: " + entity.getCoveredText() +
                    " === Polarity: " + entity.getPolarity() +
                    " === Subject: " + entity.getSubject() +
                    " === EntityName: " + entity.getType());
    }

    for (Segment segment : JCasUtil.select(jcas, Segment.class)) {
        List<LabMention> mentions = JCasUtil.selectCovered(jcas, LabMention.class, segment);
        System.out.println("LATEST DATA : " + segment.getPreferredText() +
                " (" + segment.getId() + "): " + mentions.size() + " mention(s)");
    }

The first for loop gives me the correct result, but the second for loop gives a null error.

On Wed, Jan 17, 2018 at 1:14 AM, Miller, Timothy <timothy.mil...@childrens.harvard.edu> wrote:

> OK, it sounds like a slight misunderstanding of what "subject" refers to. The subject field refers to _who_ is the subject of an event.
>
> This is important to differentiate diseases that are mentioned because the patient is experiencing them ("pt has colon cancer") from those that might be mentioned because a family member had them ("mother had breast cancer").
>
> What you're talking about sounds more like "Sections", which I think in cTAKES are called "segments". There is a regex-based section finder in cTAKES but it is not enabled by default because it would usually need to be customized for a given institution's notes.
>
> Tim
>
> On Wed, 2018-01-17 at 01:10 +0530, Ratan Sharma wrote:
> > I am trying to find out whether an entity falls in one of these categories, and my understanding was that subject can get me this information.
> >
> > SUBJECT it belongs to, like:
> > *"Vital Signs", "BP", "Physical Examination", "Family Medical History", "Lab Results"*
> >
> > Any idea how to achieve this?
> >
> > On Wed, Jan 17, 2018 at 1:05 AM, Miller, Timothy <timothy.mil...@childrens.harvard.edu> wrote:
> > > What output would you like? What are you expecting?
> > >
> > > This field in theory could have a few different values: patient, family_member, other, donor(iirc?)
> > >
> > > But in reality our training data was very skewed towards the patient label, and the representation we used for training is not great at picking up section-wide cues that would be helpful (like a family history section header). So in practice it almost always will say "patient." It may occasionally get something very obvious: "Mother had breast cancer"
> > > I don't know if it will get this exact example; it probably needs to look exactly like a training instance because we had very few to generalize from.
> > > Thanks
> > > Tim
> > >
> > > On Wed, 2018-01-17 at 00:57 +0530, Ratan Sharma wrote:
> > > > I am able to pull entity information for different sections correctly, but I am facing issues when it comes to pulling subject information. The subject is always pulled as "PATIENT".
> > > >
> > > > I do have this added in the AssertionPipeline:
> > > > builder.add( SubjectCleartkAnalysisEngine.createAnnotatorDescription() );
> > > >
> > > > Here is some sample output:
> > > >
> > > > Entity: 3 === Text: Blood Transfusion === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.SignSymptomMention
> > > > Entity: 6 === Text: Blood === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.AnatomicalSiteMention
> > > > Entity: 3 === Text: Transfusion Reaction === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.SignSymptomMention
> > > > Entity: 5 === Text: Transfusion === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.ProcedureMention
> > > > Entity: 2 === Text: HIV === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention
> > > > Entity: 6 === Text: Sickle Cell === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.AnatomicalSiteMention
> > > > Entity: 2 === Text: Neurologic Disorders === Polarity: 1 === Subject: patient === EntityName: org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention