Dieter-- You may be able to simply paste your separate address components into a single character vector for each data frame, convert everything to a consistent case with tolower(), and then use agrep() for approximate matching based on Levenshtein edit distance (the minimum number of insertions, deletions, and substitutions needed to turn one string into the other). You may also want to preprocess first, e.g. by replacing two or more consecutive spaces with a single space.
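A minimal sketch of that approach, using only base R. The data frames, the column names (street, nr, plz, city), the helper names normalize() and match_one(), and the max.distance value of 0.2 are all made up for illustration and would need adjusting to your data. Note that agrep() looks for an approximate match of the pattern anywhere inside each candidate string and can return several hits, so taking the first hit as done here is a simplification.

```r
# Hypothetical example data; column names are assumptions.
households <- data.frame(
  street = c("Hauptstrase", "Bahnhofstr. "),
  nr     = c("12", "3a"),
  plz    = c("1010", "4020"),
  city   = c("Wien", "Linz"),
  stringsAsFactors = FALSE
)
gis <- data.frame(
  street = c("Bahnhofstrasse", "Hauptstrasse", "Ringweg"),
  nr     = c("3a", "12", "7"),
  plz    = c("4020", "1010", "8010"),
  city   = c("Linz", "Wien", "Graz"),
  stringsAsFactors = FALSE
)

# Paste the components into one string per row, lower-case it,
# and collapse runs of whitespace into a single space.
normalize <- function(df) {
  x <- tolower(paste(df$street, df$nr, df$plz, df$city))
  x <- gsub("[[:space:]]+", " ", x)
  trimws(x)
}

key_h <- normalize(households)
key_g <- normalize(gis)

# Return the index of the first approximate match, or NA if there is none.
# Rows with several hits should really be reviewed by hand.
match_one <- function(key, candidates, max.distance = 0.2) {
  hits <- agrep(key, candidates, max.distance = max.distance, fixed = TRUE)
  if (length(hits) == 0L) NA_integer_ else hits[1L]
}

gis_row <- vapply(key_h, match_one, integer(1),
                  candidates = key_g, USE.NAMES = FALSE)

matches <- data.frame(household = key_h, gis = key_g[gis_row],
                      stringsAsFactors = FALSE)
print(matches)
```

Because the whole address is matched as one string, information that sits at the end of one component in one database and at the start of the next component in the other does not count against the match.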
If that is not enough, look in the CRAN task view on Natural Language Processing
http://cran.r-project.org/web/views/NaturalLanguageProcessing.html
for more approximate-matching tools.

Based on my experience with US property assessors' geocoding, I do not recommend matching approximately component by component and then combining the distances numerically: one of the most common variants is the same information placed at the end of one component (line) in one source and at the beginning of the next component (line) in the other.

Good luck.

Tom

On Tue, Jul 24, 2012 at 6:51 AM, Dieter Mayr <dieter.m...@boku.ac.at> wrote:
> Dear all,
>
> I have a rather simple problem. Maybe someone has experience
> with it and knows an easy solution...
> I want to geocode some household data, which contain the exact address
> (street, street no., postal code, city) in columns. Furthermore, I have another
> database with the addresses and the GIS data of ALL houses in the areas
> (again: street, street no., postal code, city).
>
> So I simply have to match these two databases. However, in many cases
> addresses are spelled slightly differently. Thus I think I need some kind of
> algorithm to combine these two data sets.
> Does anyone know an easy way to do it? Rows contain numbers as well as
> alphabetical street/city names.
>
> Thanks a lot in advance and kind regards,
>
> Dieter Mayr
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo