Can anyone comment on the accuracy of the Tiger geocoder vs MapMarker? Thanks Mark
On Tue, Mar 2, 2010 at 11:40 AM, Stephen Woodbridge <wood...@swoodbridge.com> wrote:

> Hi Kevin,
>
> I have worked with the Tiger data for about 10 years now. The recent
> improvements in Tiger are really great to see, but not without their own set
> of issues. Tiger has a lot of known limitations based on the rules, regs and
> requirements of the US Census. The recent work has georectified the street
> data and added lots of new streets based on digitizing high-res satellite
> imagery, but that does not let you read the street names, so they are added
> after the fact. There are a lot of street segments that do not have names.
> We can only hope that these will be added over time. Because of
> non-disclosure, address ranges can be weird also. Many small streets have
> address ranges 1-100 encoded on them, in spite of the fact that the real
> address ranges only run from 1-20. This has the effect of skewing all the
> locations to the front end of the street.
>
> Because language is ambiguous and input contains typos and sounds-like
> errors, fuzzy searching is employed. Most geocoders do some form of fuzzy
> searching, so you often run into the Main St vs Main Ln issue or you find
> W Main St when you are searching for E Main St.
>
> When a geocoder says "Found it!", you need to be prepared to say "Found
> what?" or be tolerant of mis-geocodes. I like geocoders that score the
> results and return them in ranked order.
>
> In general a geocoder can never be better than its data and can in fact be
> much worse than its data. Fuzzy searching lets you find possible candidates
> in the data that might not have been encoded correctly in either the input
> address or the data address, but with uncertainty about whether this is the
> actual location wanted.
>
> You might also want to look at PAGC Geocoder. It is written in C and uses
> some statistical matching techniques which are very good. There are some
> changes in one of the branches that let you load all the Tiger data for the
> US.
>
> http://www.pagcgeo.org/
>
> -Steve
>
> Kevin Galligan wrote:
>
>> I actually bought an early access copy of the book. I work in Linux and
>> have been playing around with different geocoders and the tiger files. Most
>> recently with a Ruby geocoder, for no other reason than I'm trying to find
>> one that is fairly complete and functional.
>>
>> Any idea how "production quality" this particular one is? If it's fairly
>> high, I'll probably put some time in to get it working on Linux. I have the
>> full 2009 tiger dataset on an EC2 block drive, waiting to import into a
>> different database.
>>
>> Right now I'm using ZIP+4 data to get a rough geocode, which is good
>> enough for what we're doing, but it only gets 92% of our non-PO Box data.
>> From my experience with the tiger data, it only adds a couple percent at
>> most above that, but the geocoders I've used have been pretty hacky, so it's
>> possible that was the issue. Also, some of them seem to not be concerned
>> with stuff like matching "Main St" when you're looking for "Main Ln", which
>> is pretty terrible.
>>
>> On the plus side, if there is major work going on with this geocoder (or
>> any tiger geocoder), I have a huge national data volume that will help
>> stress test the system.
>>
>> Recently I've been toying with USC's free geocoder project. In some areas
>> it actually gets about half of the data I previously could not, which is
>> impressive.
>>
>> The really frustrating thing is, in general, the first 90% is cheap/free.
>> The next 3-4% is marginally expensive. The rest is really pricey.
>>
>> Is there any idea how complete the tiger data is, and why there is this
>> apparent lack of data in there? I find it strange. Some streets are just
>> missing. Stuff like that.
>>
>> Rambling. Anyway, will take a look later. Thoughts on the quality of the
>> geocoder appreciated.
>>
>> -Kevin
>>
>> On Fri, Feb 26, 2010 at 11:52 PM, Paragon Corporation <l...@pcorp.us> wrote:
>>
>> David,
>>
>> As a matter of fact we've been working on that for chapter 10 of our
>> upcoming book and think we have it all working. As a part of the example
>> generation process for our chapter 10, we had to come up with a way to load
>> the tables that works on both Windows and Linux. Unfortunately we haven't
>> had a chance to test the Linux loading approach, but it is pretty much a
>> parallel of the Windows approach.
>>
>> To do so we started out with Steve's code, added some additional skeleton
>> tables and a database function that generates a command line script for the
>> respective OS. Hopefully it all makes sense from the readme file we have
>> packaged.
>>
>> We also changed one of the functions because there was an error in it and
>> revised it slightly to work with Tiger 2009 data. You can download our
>> slightly hacked version of Steve's code from our chapter 10 page.
>>
>> Steve -- if you are listening we are hoping to remerge your version with our
>> loader part and bring it back into the PostGIS distribution as part of
>> PostGIS 1.5.1 or 2.0 release.
>>
>> http://www.postgis.us/chapter_10
>>
>> Leo and Regina
>> http://www.postgis.us/
>>
>> -----Original Message-----
>> From: postgis-users-boun...@postgis.refractions.net
>> [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Dave Fuhry
>> Sent: Friday, February 26, 2010 3:04 PM
>> To: PostGIS Users Discussion
>> Subject: [postgis-users] TIGER geocoder with Census 2009 shapefiles
>>
>> I'm trying to set up the TIGER geocoder from
>> http://www.snowman.net/git/tiger_geocoder/ which is new and aims to work
>> with the new TIGER shapefiles. I'm trying with the 2009 shapefiles from
>> http://www2.census.gov/geo/tiger/TIGER2009/.
>>
>> I'm not sure how to create the roads_local table (derived closely from
>> completechain in the old version). A join between edges and addr?
>>
>> Wondering if anyone can offer any direction. A relevant ticket is
>> http://trac.osgeo.org/postgis/ticket/135. The out-of-date file which was
>> used to create the roads_local table is tables/roads_local.sql, in the
>> above repository.
>>
>> -Dave
>>
>>                 Table "tiger.edges"
>>    Column   |          Type          | Modifiers
>> ------------+------------------------+------------------------------------------------------------
>>  gid        | integer                | not null default nextval('public.edges_gid_seq'::regclass)
>>  statefp    | character varying(2)   |
>>  countyfp   | character varying(3)   |
>>  tlid       | bigint                 |
>>  tfidl      | bigint                 |
>>  tfidr      | bigint                 |
>>  mtfcc      | character varying(5)   |
>>  fullname   | character varying(100) |
>>  smid       | character varying(22)  |
>>  lfromadd   | character varying(12)  |
>>  ltoadd     | character varying(12)  |
>>  rfromadd   | character varying(12)  |
>>  rtoadd     | character varying(12)  |
>>  zipl       | character varying(5)   |
>>  zipr       | character varying(5)   |
>>  featcat    | character varying(1)   |
>>  hydroflg   | character varying(1)   |
>>  railflg    | character varying(1)   |
>>  roadflg    | character varying(1)   |
>>  olfflg     | character varying(1)   |
>>  passflg    | character varying(1)   |
>>  divroad    | character varying(1)   |
>>  exttyp     | character varying(1)   |
>>  ttyp       | character varying(1)   |
>>  deckedroad | character varying(1)   |
>>  artpath    | character varying(1)   |
>>  persist    | character varying(1)   |
>>  gcseflg    | character varying(1)   |
>>  offsetl    | character varying(1)   |
>>  offsetr    | character varying(1)   |
>>  tnidf      | bigint                 |
>>  tnidt      | bigint                 |
>>  the_geom   | public.geometry        |
>>
>>                 Table "tiger.addr"
>>   Column   |         Type          | Modifiers
>> -----------+-----------------------+-----------------------------------------------------------
>>  gid       | integer               | not null default nextval('public.addr_gid_seq'::regclass)
>>  tlid      | bigint                |
>>  fromhn    | character varying(12) |
>>  tohn      | character varying(12) |
>>  side      | character varying(1)  |
>>  zip       | character varying(5)  |
>>  plus4     | character varying(4)  |
>>  fromtyp   | character varying(1)  |
>>  totyp     | character varying(1)  |
>>  fromarmid | integer               |
>>  toarmid   | integer               |
>>  arid      | character varying(22) |
>>  mtfcc     | character varying(5)  |
>>  statefp   | character varying(2)  | not null
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users@postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>

--
Mark Vantzelfde
NetMasters, Inc.
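
For anyone reading this thread later, Steve's point about padded address ranges can be made concrete with a few lines of Python. This is only a minimal sketch, assuming a geocoder that places a house by plain linear interpolation over the encoded range. The function name interpolate_fraction and the example street are hypothetical, and the sketch ignores odd/even side parity, which a real geocoder would have to handle.

```python
def interpolate_fraction(house_number: int, range_from: int, range_to: int) -> float:
    """Return how far along a street segment (0.0 to 1.0) a house number falls,
    using plain linear interpolation over the encoded address range."""
    if range_from == range_to:
        # Single-address range: nothing to interpolate, use the midpoint.
        return 0.5
    fraction = (house_number - range_from) / (range_to - range_from)
    # Clamp so out-of-range house numbers stay on the segment.
    return min(1.0, max(0.0, fraction))


# Hypothetical street: real addresses run 1-20, but the data encodes 1-100.
last_house = 20
encoded = interpolate_fraction(last_house, 1, 100)  # 19/99, about 0.19
actual = interpolate_fraction(last_house, 1, 20)    # 1.0, the far end

print(f"encoded range 1-100 places house {last_house} at {encoded:.0%} of the segment")
print(f"true range 1-20 would place it at {actual:.0%}")
```

With the padded range, the last house on the street lands about a fifth of the way along the segment, which is the "skewed to the front end" effect described above.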
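
The "score the results and return them in ranked order" idea from Steve's message can be sketched the same way. This is an illustration only: it uses Python's standard difflib.SequenceMatcher as a stand-in similarity measure, not the matching used by the TIGER geocoder, PAGC, or MapMarker. The function name rank_candidates and the street list are hypothetical.

```python
from difflib import SequenceMatcher


def rank_candidates(query: str, candidates: list[str]) -> list[tuple[float, str]]:
    """Score each candidate street name against the query (1.0 = identical)
    and return (score, name) pairs, best match first."""
    q = query.lower()
    scored = [
        (SequenceMatcher(None, q, name.lower()).ratio(), name)
        for name in candidates
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical street names present in the data.
street_names = ["W Main St", "Main Ln", "E Main St", "Main St"]

for score, name in rank_candidates("E Main St", street_names):
    print(f"{score:.2f}  {name}")
```

The exact match comes back first with a score of 1.0, but near-misses such as W Main St still score high. Returning the scores lets the caller apply a cutoff or review close calls instead of trusting a bare "Found it!".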