Hello,

I was reviewing our code and noticed that we were not joining on county to get 
the median income by block group. A sample query to get median income by block 
group should instead be:

{{{
select BLK.STUSAB,BLK.STATE,BLK.COUNTY,BLK.TRACT,BLK.BLKGRP,BLK.UHD001 as 
MedianIncome,GEOKUMC.* from mpc.geocoded_kumc GEOKUMC
JOIN MPC.ACS_20135A_BLCK_GRP_150_2 BLK ON BLK.TRACT=GEOKUMC.TRACT_ID AND 
BLK.STATE=GEOKUMC.FIPSST AND BLK.BLKGRP=SUBSTR(GEOKUMC.BLOCK_ID,1,1) AND 
BLK.COUNTY = GEOKUMC.FIPSCO;
}}}

Best,
Maren

From: Dan Connolly
Sent: Monday, January 09, 2017 5:01 PM
To: David Van Riper <vanri...@umn.edu>
Cc: <gpc-dev@listserv.kumc.edu> <gpc-dev@listserv.kumc.edu>; Sravani Chandaka 
<schand...@kumc.edu>; Maren Wennberg <mwennb...@kumc.edu>; Green, Timothy A. 
<green...@health.missouri.edu>; Alex Bokov <bo...@uthscsa.edu>
Subject: RE: geocoding: how to join at tract, census block resolution?

Thanks, Dave (and Tom and company).

I had lost track of KUMC_census_acs_documentation.docx.

And yes, I have the block_crosswalk. It wasn't obvious to me how to use it when 
I first looked at it, but perhaps I'll get it soon. Meanwhile, an example or 
two would be nice to have.

Maybe we can trade?

Some notes from Maren and Sravani include:

Sample query to get median incomes by block group
 {{{
 select BLK.STUSAB,BLK.STATE,BLK.COUNTY,BLK.TRACT,BLK.BLKGRP,BLK.UHD001 as
 MedianHouseholdIncome,GEOKUMC.* from mpc.geocoded_kumc GEOKUMC
 JOIN MPC.ACS_20135A_BLCK_GRP_150_2 BLK ON BLK.TRACT=GEOKUMC.TRACT_ID AND
 BLK.STATE=GEOKUMC.FIPSST AND BLK.BLKGRP=SUBSTR(GEOKUMC.BLOCK_ID,1,1);
 }}}


Sample query to get median incomes by Tract
 {{{
 select TRACT.UHD001,GEOKUMC.* from mpc.geocoded_kumc GEOKUMC
 JOIN MPC.ACS_20135A_TRACT_140_2 TRACT ON TRACT.TRACT=GEOKUMC.TRACT_ID AND
 TRACT.STATE=GEOKUMC.FIPSST;
 }}}


These use an MPC database schema something like:
https://informatics.gpcnetwork.org/trac/Project/attachment/ticket/509/geo_acs_tables.sql
generated using 
https://github.com/dckc/i2b2-geo/blob/master/census_schema_gen.py

--
Dan
________________________________
From: David Van Riper [vanri...@umn.edu]
Sent: Monday, January 09, 2017 4:46 PM
To: Dan Connolly
Cc: <gpc-dev@listserv.kumc.edu<mailto:gpc-dev@listserv.kumc.edu>>; Sravani 
Chandaka; Maren Wennberg; Green, Timothy A.; Alex Bokov
Subject: Re: geocoding: how to join at tract, census block resolution?
Dear Dan,

In the Word document (KUMC_census_acs_documentation.docx) available at this 
link<https://drive.google.com/folderview?id=0B_ChrVjKL2WYMXE3azlGYm54Zkk&usp=sharing>,
 I included the fields that are required to uniquely identify census units in 
the various .dat files. They are listed on pages 5 and 6.

Block groups are uniquely identified by concatenating STATE, COUNTY, TRACT, and 
BLKGRP. Census tracts are uniquely identified by concatenating STATE, COUNTY, 
and TRACT.

On the kumc_geocoded.txt you will want to do the following concatenations for 
tract and block group:

FIPSSTCO + TRACT_ID = census tract

FIPSSTCO + TRACT_ID + first character of the BLOCK_ID (left most character) = 
block group

FIPSTCO + TRACT_ID + BLOCK_ID = census block


Do you also have access to a zip archive called block_crosswalk.zip? It should 
have been provided to each site from Justin Dale. That ZIP archive contains a 
crosswalk that links census blocks to a large number of other census 
identifiers. You can use the crosswalk to link census identifiers to the 
geocoded data on the census block.

Yours,
Dave



On Mon, Jan 9, 2017 at 2:38 PM, Dan Connolly 
<dconno...@kumc.edu<mailto:dconno...@kumc.edu>> wrote:
Hi Dave,

More questions!

I'm copying gpc-dev so we can all learn together. Before you reply all, note 
the public archive http://listserv.kumc.edu/pipermail/gpc-dev/

I'm working with Maren and Sravani on this stuff. Joining on zip code was 
straightforward: we used the zcta5 column of 
acs_20135a/zcta_860/ge.00_file.dat.gz.

But now we're trying to join on tract and census block. What columns of 
acs_20135a/blck_grp_150/ge.00_file.dat.gz and 
acs_20135a/tract_140/ge.00_file.dat.gz should we use? What are their primary 
key columns? Which documentation would  tell us this?

BLKGRP looks like a single-digit number; it's clearly not a unique key to the 
blck_grp_150 data.

TRACT is a bigger number, but it's not a unique key to tract_140 either. It 
seems to be unique within state, though. Is that right? Should we join TRACT_ID 
and ST_ABRV from geocoded_kumc.txt with TRACT and STUSAB of tract_140?

Have any other GPC sites figured this out yet? MU? UTHSCSA?

For reference, these are the columns of our kumc_geocoded.txt data:

        'OID,Join_Count,TARGET_FID,Loc_name,Status,Score,Match_type,'
        'X,Y,Match_addr,DISP_LON,DISP_LAT,SIDE,'
        'ARC_Address,ARC_City,ARC_State,'
        'ARC_Zip,ADDRESS,CITY,STATE,ZIP,New_X,New_Y,ID,BLOCK_ID,'
        'FIPSST,FIPSCO,FIPSSTCO,TRACT_ID,ST_ABRV,CO_NAME,ST_NAME

context:
https://informatics.gpcnetwork.org/trac/Project/ticket/140
https://informatics.gpcnetwork.org/trac/Project/ticket/350

--
Dan



--
David Van Riper
Spatial Analysis Core Director
Minnesota Population Center
University of Minnesota

Email: vanri...@umn.edu<mailto:vanri...@umn.edu>
Phone: 612-625-7375
_______________________________________________
Gpc-dev mailing list
Gpc-dev@listserv.kumc.edu
http://listserv.kumc.edu/mailman/listinfo/gpc-dev

Reply via email to