RE: [gpc-informatics] #44: portable HERON ETL for NAACCR

2015-02-12 Thread Dan Connolly
The query terms were chosen by the breast cancer team two months ago; see 
#204 <https://informatics.gpcnetwork.org/trac/Project/ticket/204> (with related 
discussion in #167 <https://informatics.gpcnetwork.org/trac/Project/ticket/167> 
and #32 <https://informatics.gpcnetwork.org/trac/Project/ticket/32>). I don't 
see any new information that would motivate them to reconsider. For example, 
see discussion of the SEER Site Summary last 
August <http://listserv.kumc.edu/pipermail/gpc-dev/2014q3/000399.html>.

To reiterate: the SEER site recode is a function of not just the C50X primary 
site info but also histology. I suppose in theory it's possible to encode this 
in an i2b2 query, but that's not how the breast cancer team decided to do it.

Digging into the references I provided...

excerpt from 
seer_recode.sql <https://informatics.kumc.edu/work/browser/heron_load/seer_recode.sql>:

/* Breast */ when (site between 'C500' and 'C509')
  and  not (histology between '9050' and '9055'
   or histology = '9140'
   or histology between '9590' and '9992') then '26000'

excerpt from SEER Site Recode ICD-O-3 (1/27/2003) 
Definition <http://seer.cancer.gov/siterecode/icdo3_d01272003/>:

Site Group | ICD-O-3 Site | ICD-O-3 Histology (Type)                               | Recode
Breast     | C500-C509    | excluding 9590-9989, and sometimes 9050-9055, 9140 [+] | 26000

[+] http://seer.cancer.gov/siterecode/icdo3_d01272003/#_+
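
As an illustration only, the rule in the seer_recode.sql excerpt above can be 
sketched in Python. The function name seer_breast_recode is hypothetical and 
not part of HERON; the sketch covers only the Breast branch and assumes site 
and histology arrive as fixed-width code strings, compared the same way the 
SQL compares them.

```python
from typing import Optional


def seer_breast_recode(site: str, histology: str) -> Optional[str]:
    """Return '26000' (Breast) when the excerpted rule matches, else None.

    Hypothetical helper: mirrors only the Breast branch of the SQL excerpt.
    """
    # Fixed-width code strings compare correctly as plain strings.
    in_breast_site = "C500" <= site <= "C509"
    excluded_histology = (
        "9050" <= histology <= "9055"
        or histology == "9140"
        # Upper bound follows the SQL excerpt ('9992'), not the SEER table.
        or "9590" <= histology <= "9992"
    )
    if in_breast_site and not excluded_histology:
        return "26000"
    return None
```

For example, seer_breast_recode("C509", "8500") returns "26000", while the 
same site with histology "9590" returns None: the site alone does not decide 
the recode.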

--
Dan


From: gpc-dev-boun...@listserv.kumc.edu [gpc-dev-boun...@listserv.kumc.edu] on 
behalf of Nadkarni, Prakash [prakash-nadka...@uiowa.edu]
Sent: Thursday, February 12, 2015 6:51 PM
To: gpc-dev@listserv.kumc.edu
Subject: RE: [gpc-informatics] #44: portable HERON ETL for NAACCR

We did not include the SEER Site Summary. The ICD-O-3 topography code pattern 
for the site, C50% (e.g., C508, C509, etc.), identifies the breast cancer 
subtree unambiguously, and so we used this.
I believe that the SEER summary is in fact derived from the topography codes, 
which are the primary data entered by the registry folks and are more finely 
granular (e.g., breast cancer has subtypes like Paget's disease), but 
the use of English as opposed to numbers in the SEER field makes the output 
more human-comprehensible without having to use an ICD-O-3 code book. To anyone 
who has implemented i2b2, however, the SEER field is redundant, because i2b2 
allows search by key phrases corresponding to the codes; it does not add 
information over and above what the ICD-O-3 topography code provides.
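
For concreteness, a topography-only selection like the C50% pattern described 
above might look like the following sketch. The tumor table, its columns, and 
the sample rows are made up for illustration; they are not the NAACCR or i2b2 
schema.

```python
import sqlite3

# Hypothetical sample rows: (case_id, ICD-O-3 site, ICD-O-3 histology).
SAMPLE_TUMORS = [
    (1, "C509", "8500"),
    (2, "C508", "9590"),
    (3, "C341", "8140"),
]


def topography_only_case_ids(rows):
    """Return case ids whose site matches the C50% pattern, in id order."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table tumor (case_id integer, site text, histology text)"
        )
        conn.executemany("insert into tumor values (?, ?, ?)", rows)
        cursor = conn.execute(
            "select case_id from tumor where site like 'C50%' order by case_id"
        )
        return [case_id for (case_id,) in cursor]
    finally:
        conn.close()
```

With these sample rows the query selects cases 1 and 2: the pattern looks only 
at the site column, so histology plays no part in the match.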

Prakash


From: GPC Informatics [d...@madmode.com]
Sent: Thursday, February 12, 2015 11:43 AM
To: dconno...@kumc.edu; Nadkarni, Prakash; jd...@umn.edu
Cc: ngra...@kumc.edu; ple...@uwhealth.org; gbus...@mcw.edu
Subject: Re: [gpc-informatics] #44: portable HERON ETL for NAACCR

#44: portable HERON ETL for NAACCR
----------------------------------+----------------------------------
 Reporter:  dconnolly             |       Owner:  prakashnadkarni
     Type:  enhancement           |      Status:  assigned
 Priority:  major                 |   Milestone:  bc-survey-cohort-def
Component:  etl-dev               |  Resolution:
 Keywords:  breast-cancer-cohort  |  Blocked By:
 Blocking:  119, 227              |
----------------------------------+----------------------------------

Comment (by dconnolly):

Prakash,

Did you include the SEER site recode in your ETL? We're not seeing any
breast cancer diagnosis data (`\i2b2\naaccr\SEER Site\Breast\`) in what
you submitted 02/11/2015 3:26pm.

For reference:
- [https://informatics.kumc.edu/work/wiki/TumorRegistry TumorRegistry]
  in the KUMC wiki
- [https://informatics.kumc.edu/work/blog/2012/01/seer-recode-sql Adding SEER Site Recode to HERON Tumor Registry integration],
  Jan 2012 blog item

--
Ticket URL: 
http://informatics.gpcnetwork.org/trac/Project/ticket/44#comment:17
gpc-informatics http://informatics.gpcnetwork.org/
Greater Plains Network - Informatics



Notice: This UI Health Care e-mail (including attachments) is covered by the 
Electronic Communications Privacy Act, 18 U.S.C. 2510-2521, is confidential and 
may be legally privileged. If you are not the intended recipient, you are 
hereby notified that any retention, dissemination, distribution, or copying of 
this communication is strictly prohibited. Please reply to the sender that you 
have received the message in error, then delete it. Thank you.

___
Gpc-dev mailing list
Gpc-dev@listserv.kumc.edu
http://listserv.kumc.edu/mailman/listinfo/gpc-dev


RE: [gpc-informatics] #44: portable HERON ETL for NAACCR

2015-01-05 Thread Lenon Patrick
Thx again guys, very helpful, many blind alleys avoided.  ;)

-----Original Message-----
From: Dan Connolly [mailto:dconno...@kumc.edu] 
Sent: Monday, January 05, 2015 3:28 PM
To: Lenon Patrick; 'gpc-dev@listserv.kumc.edu'
Cc: Nathan Graham; Mish Thomas F
Subject: RE: [gpc-informatics] #44: portable HERON ETL for NAACCR

1) I don't remember any waiting period, but it was 2 years ago, so who knows.

2) yes

3) No; ICD-10 isn't relevant to NAACCR ETL.

-- 
Dan


From: Lenon Patrick [ple...@uwhealth.org]
Sent: Monday, January 05, 2015 2:17 PM
To: 'gpc-dev@listserv.kumc.edu'; Dan Connolly
Cc: Nathan Graham; Lenon Patrick; Mish Thomas F
Subject: RE: [gpc-informatics] #44: portable HERON ETL for NAACCR

Thx, some additional questions:

1) I have not heard back from the WHO person, and my membership seems to be 
pending or something, so no downloads show as available for now.  Did you have 
to go through some waiting period like this?   (Maybe I mistakenly got in the 
commercial license queue instead of the research license queue?)

2) In your naaccr_concepts_load.sql, you refer to tables who.topo, who.morph2, 
and who.morph3.  Do those correspond to the ICD-O-3_CSV-metadata.zip and 
ICD-O-2_CSV.zip you refer to below?

3) Did you not download ICD-10?

Thanks as always for your assistance.


-----Original Message-----
From: GPC Informatics [mailto:d...@madmode.com]
Sent: Monday, January 05, 2015 12:59 PM
To: dconno...@kumc.edu; Lenon Patrick
Cc: ngra...@kumc.edu
Subject: Re: [gpc-informatics] #44: portable HERON ETL for NAACCR

#44: portable HERON ETL for NAACCR
----------------------------------+----------------------------------
 Reporter:  dconnolly             |       Owner:  lenonpat
     Type:  enhancement           |      Status:  assigned
 Priority:  major                 |   Milestone:  bc-survey-cohort-def
Component:  etl-dev               |  Resolution:
 Keywords:  breast-cancer-cohort  |  Blocked By:
 Blocking:  119                   |
----------------------------------+----------------------------------
Changes (by dconnolly):

 * owner:  dconnolly => lenonpat


Comment:

 Patrick,

 I found my (30 Jan 2013) notes on downloading materials from WHO...

 The pointer in my notes is
 http://www.who.int/classifications/icd/adaptations/oncology/en/index.html
 (along with
 http://apps.who.int/classifications/apps/icd/ClassificationDownloadNR/license.htm)

 and I recorded the md5sums of what I downloaded

 - `b088c4e4bd2d685c9dd04e3b3c14c98b` ICD-O-3_CSV-metadata.zip
 - `1308ce6f4ef93c67137154cc6a723fc6` ICD-O-2_CSV.zip
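
 One way to check downloaded copies against the md5sums recorded above is
 sketched below. The function names and the idea of keeping both zip files
 in a single directory are assumptions made for illustration; only the file
 names and checksums come from the notes.

```python
import hashlib
from pathlib import Path

# File names and md5sums as recorded in the notes above.
EXPECTED_MD5 = {
    "ICD-O-3_CSV-metadata.zip": "b088c4e4bd2d685c9dd04e3b3c14c98b",
    "ICD-O-2_CSV.zip": "1308ce6f4ef93c67137154cc6a723fc6",
}


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(directory: Path) -> dict:
    """Map each expected file name to True if present with a matching MD5."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = directory / name
        results[name] = path.is_file() and md5_of(path) == expected
    return results
```

 Calling check_downloads(Path(".")) in the download directory reports, per
 file, whether it is present and matches. A mismatch may only mean that WHO
 has republished the archive since January 2013.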

--
Ticket URL: http://informatics.gpcnetwork.org/trac/Project/ticket/44#comment:8


Re: [gpc-informatics] #44: portable HERON ETL for NAACCR

2014-08-26 Thread GPC Informatics
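#44: portable HERON ETL for NAACCR
----------------------------------+----------------------------------
 Reporter:  dconnolly             |       Owner:  dconnolly
     Type:  enhancement           |      Status:  assigned
 Priority:  major                 |   Milestone:  data-agg1
Component:  etl-dev               |  Resolution:
 Keywords:                        |  Blocked By:
 Blocking:  119                   |
----------------------------------+----------------------------------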
Changes (by dconnolly):

 * milestone:  => data-agg1


--
Ticket URL: http://informatics.gpcnetwork.org/trac/Project/ticket/44#comment:3