Hi,

I have collected hospital data from multiple sources. However, each source 
uses a different name for the same hospital. I am trying to build a clean 
list with no duplicates. I am using R and couldn't resolve this with 
stringdist_join. I would appreciate any suggestions on an approach.

For example, a hospital in Guntur (A.P.) is listed under the following names 
(Example 1). Can we mark (or eliminate) the duplicates?

Example 1
SANKARA EYE HOSPITAL(GUNTUR) 
SANKARA EYE HOSPITAL 
SANKARA EYE HOSPITAL ( A UNIT OF SRI KANCHI KAMA KOTI MEDICAL TRUST)   


Example 2
ASHIRWAD HEART HOSPITAL ( GHATKOPAR ) 
Ashirwad Heart Hospital 
ASHIRWAD HEART HOSPITAL ( GHATKOPAR ) 
Ashirwad Heart Hospita-Ghatkopar   
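
To make the question concrete, here is a minimal sketch of the kind of 
result I am after, using only base R. It is a sketch under assumptions, not 
a tested solution: normalize_name() is a hypothetical helper, the rule of 
dropping everything after the first "(" or "-" is a guess that happens to fit 
these examples (it would damage names that contain a genuine hyphen), and the 
threshold max_edits = 2 is arbitrary.

```r
# Example names copied from the two examples above.
hospitals <- c(
  "SANKARA EYE HOSPITAL(GUNTUR)",
  "SANKARA EYE HOSPITAL",
  "SANKARA EYE HOSPITAL ( A UNIT OF SRI KANCHI KAMA KOTI MEDICAL TRUST)",
  "ASHIRWAD HEART HOSPITAL ( GHATKOPAR )",
  "Ashirwad Heart Hospital",
  "ASHIRWAD HEART HOSPITAL ( GHATKOPAR )",
  "Ashirwad Heart Hospita-Ghatkopar"
)

# Hypothetical helper: reduce a raw name to a comparison key.
normalize_name <- function(x) {
  x <- toupper(x)
  x <- sub("[(-].*$", "", x)       # assumption: text after "(" or "-" is a locality/trust suffix
  x <- gsub("[^A-Z0-9 ]", " ", x)  # replace remaining punctuation with spaces
  x <- gsub(" +", " ", x)          # collapse repeated spaces
  trimws(x)
}

keys <- normalize_name(hospitals)

# Pairwise edit (Levenshtein) distances between the keys, via utils::adist.
d <- adist(keys)

# Single-linkage clustering: two names share a group when they are connected
# by a chain of keys that each differ by at most max_edits edits.
max_edits <- 2  # assumed threshold, would need tuning on the real data
group <- cutree(hclust(as.dist(d), method = "single"), h = max_edits)

result  <- data.frame(name = hospitals, key = keys, group = group)
deduped <- result[!duplicated(result$group), ]  # keep the first name seen per group
```

On these seven names this should give two groups, one per hospital. On the 
full data it would presumably have to be run separately per city, since the 
same hospital name in two different cities would otherwise be merged.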

Thanks
Ram
