Dear Dan,

Sorry for the delayed response, but I wanted to write back about the
block_crosswalk and how it can be used with the geocoded patient records
and the census data.

The geocoded patient records have an attribute called *ID*, which is the
2010 census block assigned to each case.

The block_crosswalk file has unique identifiers for a wide variety of
census units, including 2010 census tracts, 2010 census block groups, 2010
county subdivisions, 2010 counties, 2010 urban areas, 2013 census tracts,
2013 places, etc.

Census and ACS data are available for those different geographic units
(2010 census tracts, 2010 census block groups, 2010 county subdivisions,
2010 counties, 2010 urban areas, 2013 census tracts, 2013 places, etc).

How I would use the block crosswalk is as follows:

1. Join the block_crosswalk to the patient records. Use GBLOCK in the
crosswalk and ID in the patient records as the linking keys.

2. On the joined data, keep only the unique identifiers you want from the
crosswalk (GTRACT, GBG, GTRACT_ACS). So, you will have a patient record
with a 2010 census block ID and one or more IDs from the block crosswalk.

3. You can then pull census data from the appropriate geographic unit
(e.g., tract, block group, county) and link it to the patient records.

By using the block_crosswalk, you won't have to jump through so many hoops
to create unique identifiers (e.g., no substrings required).

Questions - let me know!

Yours,
Dave

On Mon, Jan 9, 2017 at 5:00 PM, Dan Connolly <dconno...@kumc.edu> wrote:

> Thanks, Dave (and Tom and company).
>
> I had lost track of KUMC_census_acs_documentation.docx.
>
> And yes, I have the block_crosswalk. It wasn't obvious to me how to use it
> when I first looked at it, but perhaps I'll get it soon. Meanwhile, an
> example or two would be nice to have.
>
> Maybe we can trade?
>
> Some notes from Maren and Sravani include:
>
> Sample query to get median incomes by block group
>  {{{
>  select BLK.STUSAB,BLK.STATE,BLK.COUNTY,BLK.TRACT,BLK.BLKGRP,BLK.UHD001 as
>  MedianHouseholdIncome,GEOKUMC.* from mpc.geocoded_kumc GEOKUMC
>  JOIN MPC.ACS_20135A_BLCK_GRP_150_2 BLK ON BLK.TRACT=GEOKUMC.TRACT_ID AND
>  BLK.STATE=GEOKUMC.FIPSST AND BLK.BLKGRP=SUBSTR(GEOKUMC.BLOCK_ID,1,1);
>  }}}
>
>
> Sample query to get median incomes by Tract
>  {{{
>  select TRACT.UHD001,GEOKUMC.* from mpc.geocoded_kumc GEOKUMC
>  JOIN MPC.ACS_20135A_TRACT_140_2 TRACT ON TRACT.TRACT=GEOKUMC.TRACT_ID AND
>  TRACT.STATE=GEOKUMC.FIPSST;
>  }}}
>
>
> These use an MPC database schema something like:
> https://informatics.gpcnetwork.org/trac/Project/
> attachment/ticket/509/geo_acs_tables.sql
> generated using https://github.com/dckc/i2b2-
> geo/blob/master/census_schema_gen.py
>
> --
> Dan
>
> ------------------------------
> *From:* David Van Riper [vanri...@umn.edu]
> *Sent:* Monday, January 09, 2017 4:46 PM
> *To:* Dan Connolly
> *Cc:* <gpc-dev@listserv.kumc.edu>; Sravani Chandaka; Maren Wennberg;
> Green, Timothy A.; Alex Bokov
> *Subject:* Re: geocoding: how to join at tract, census block resolution?
>
> Dear Dan,
>
> In the Word document (KUMC_census_acs_documentation.docx) available at
> this link
> <https://drive.google.com/folderview?id=0B_ChrVjKL2WYMXE3azlGYm54Zkk&usp=sharing>,
> I included the fields that are required to uniquely identify census units
> in the various .dat files. They are listed on pages 5 and 6.
>
> Block groups are uniquely identified by concatenating STATE, COUNTY,
> TRACT, and BLKGRP. Census tracts are uniquely identified by concatenating
> STATE, COUNTY, and TRACT.
>
> On the kumc_geocoded.txt you will want to do the following concatenations
> for tract and block group:
>
> FIPSSTCO + TRACT_ID = census tract
>
> FIPSSTCO + TRACT_ID + first character of the BLOCK_ID (left most
> character) = block group
>
> FIPSTCO + TRACT_ID + BLOCK_ID = census block
>
>
> Do you also have access to a zip archive called block_crosswalk.zip? It
> should have been provided to each site from Justin Dale. That ZIP archive
> contains a crosswalk that links census blocks to a large number of other
> census identifiers. You can use the crosswalk to link census identifiers to
> the geocoded data on the census block.
>
> Yours,
> Dave
>
>
>
> On Mon, Jan 9, 2017 at 2:38 PM, Dan Connolly <dconno...@kumc.edu> wrote:
>
>> Hi Dave,
>>
>> More questions!
>>
>> I'm copying gpc-dev so we can all learn together. Before you reply all,
>> note the public archive http://listserv.kumc.edu/pipermail/gpc-dev/
>>
>> I'm working with Maren and Sravani on this stuff. Joining on zip code was
>> straightforward: we used the zcta5 column of acs_20135a/*zcta_860*
>> /ge.00_file.dat.gz.
>>
>> But now we're trying to join on tract and census block. What columns of
>> acs_20135a/*blck_grp_150*/ge.00_file.dat.gz and acs_20135a/*tract_140*
>> /ge.00_file.dat.gz should we use? What are their primary key columns?
>> Which documentation would  tell us this?
>>
>> BLKGRP looks like a single-digit number; it's clearly not a unique key to
>> the blck_grp_150 data.
>>
>> TRACT is a bigger number, but it's not a unique key to *tract_140*
>> either. It seems to be unique within state, though. Is that right? Should
>> we join TRACT_ID and ST_ABRV from geocoded_kumc.txt with TRACT and STUSAB
>> of tract_140?
>>
>> Have any other GPC sites figured this out yet? MU? UTHSCSA?
>>
>> For reference, these are the columns of our kumc_geocoded.txt data:
>>
>>         'OID,Join_Count,TARGET_FID,Loc_name,Status,Score,Match_type,'
>>         'X,Y,Match_addr,DISP_LON,DISP_LAT,SIDE,'
>>         'ARC_Address,ARC_City,ARC_State,'
>>         'ARC_Zip,ADDRESS,CITY,STATE,ZIP,New_X,New_Y,ID,BLOCK_ID,'
>>         'FIPSST,FIPSCO,FIPSSTCO,TRACT_ID,ST_ABRV,CO_NAME,ST_NAME
>>
>> context:
>> https://informatics.gpcnetwork.org/trac/Project/ticket/140
>> https://informatics.gpcnetwork.org/trac/Project/ticket/350
>>
>> --
>> Dan
>>
>>
>
>
> --
> David Van Riper
> Spatial Analysis Core Director
> Minnesota Population Center
> University of Minnesota
>
> Email: vanri...@umn.edu
> Phone: 612-625-7375 <(612)%20625-7375>
>



-- 
David Van Riper
Spatial Analysis Core Director
Minnesota Population Center
University of Minnesota

Email: vanri...@umn.edu
Phone: 612-625-7375
_______________________________________________
Gpc-dev mailing list
Gpc-dev@listserv.kumc.edu
http://listserv.kumc.edu/mailman/listinfo/gpc-dev

Reply via email to