Re: synonym dictionaries of person names
Was it raw POS-tagged data or just raw data? Can you share the code / process you used?

--
Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Thu, Jan 29, 2015 at 3:34 PM, Mark Harwood <mark.harw...@elasticsearch.com> wrote:

> I've built one before from raw data, but you need:
>
> 1) a *lot* of data
> 2) a unique ID per person
> 3) some noise/variation in the names recorded for each person
>
> The input is of this form:
>
> personID  recorded_name
> ========  =============
> 1         Rob
> 1         Robert
> 1         Bob
> 2         Dave
> 2         David
> 2         Alice
> ...
>
> The output is a weighted graph of name<->variant, e.g. Robert == Bob with a
> strong confidence rating.
> Using this I know not just real names but also typos, e.g. that "Janes" is
> more likely to be "James" than "Jane" (a common typo due to key locations
> on the keyboard).

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4Zup6FroPitENCjBohH8Zxjtcs_H4fCvWmL1nQeD8zZL7w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Re: synonym dictionaries of person names
I've built one before from raw data, but you need:

1) a *lot* of data
2) a unique ID per person
3) some noise/variation in the names recorded for each person

The input is of this form:

personID  recorded_name
========  =============
1         Rob
1         Robert
1         Bob
2         Dave
2         David
2         Alice
...

The output is a weighted graph of name<->variant, e.g. Robert == Bob with a strong confidence rating.
Using this I know not just real names but also typos, e.g. that "Janes" is more likely to be "James" than "Jane" (a common typo due to key locations on the keyboard).

On Thursday, January 29, 2015 at 5:28:33 AM UTC, David Kemp wrote:
>
> I am looking for synonym dictionaries of person names that I can use with
> the Elasticsearch synonym analyser,
> e.g. dictionaries that map "Ted" to "Edward", and "Bill" to "William".
> I am curious to know what others are using.
> So far I have found these two possible sources:
>
> https://code.google.com/p/nickname-and-diminutive-names-lookup/downloads/list
> https://github.com/DallanQ/Names/wiki/Name-variant-files
>
> And perhaps
> http://www.behindthename.com
>
> Thanks,
> David
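For readers who want something concrete, here is a minimal sketch of the kind of process Mark describes: group the recorded names by person ID, count how often two names are recorded for the same person, and turn that into an edge weight. The actual code was not posted in the thread, so everything here is an assumption: the function name `build_variant_graph` is hypothetical, the weight is a simple Jaccard overlap chosen for illustration, and the rows for person 3 are made up to show a weight below 1.0.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_variant_graph(records):
    """Build weighted name<->variant edges from (person_id, recorded_name) rows.

    Assumption: the edge weight is the Jaccard overlap of the sets of people
    recorded under each name. The thread does not say which weighting was used.
    """
    names_by_person = defaultdict(set)
    for person_id, name in records:
        names_by_person[person_id].add(name.strip().lower())

    name_counts = Counter()  # number of people recorded under each name
    pair_counts = Counter()  # number of people recorded under both names
    for names in names_by_person.values():
        name_counts.update(names)
        for a, b in combinations(sorted(names), 2):
            pair_counts[(a, b)] += 1

    graph = {}
    for (a, b), both in pair_counts.items():
        either = name_counts[a] + name_counts[b] - both
        graph[(a, b)] = both / either
    return graph


# Rows for persons 1 and 2 come from the example in the message above;
# person 3 is invented purely for illustration.
records = [
    (1, "Rob"), (1, "Robert"), (1, "Bob"),
    (2, "Dave"), (2, "David"), (2, "Alice"),
    (3, "Robert"), (3, "Bob"),
]

graph = build_variant_graph(records)
for (a, b), weight in sorted(graph.items(), key=lambda kv: -kv[1]):
    print(f"{a} <-> {b}: {weight:.2f}")
```

With only a handful of rows every co-occurrence looks significant, including the "Alice" noise on person 2, which is why the "a *lot* of data" requirement matters: spurious pairs only fade once real variants have many more co-occurrences behind them.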
synonym dictionaries of person names
I am looking for synonym dictionaries of person names that I can use with the Elasticsearch synonym analyser, e.g. dictionaries that map "Ted" to "Edward", and "Bill" to "William".
I am curious to know what others are using.
So far I have found these two possible sources:

https://code.google.com/p/nickname-and-diminutive-names-lookup/downloads/list
https://github.com/DallanQ/Names/wiki/Name-variant-files

And perhaps
http://www.behindthename.com

Thanks,
David
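To illustrate how a name dictionary like the ones asked about might be plugged in, here is a small sketch that turns groups of equivalent names into comma-separated synonym rules and places them in index settings using a `synonym` token filter. The helper `to_synonym_rules`, the filter name `name_synonyms`, and the analyzer name `person_name` are hypothetical names picked for this example; the two name pairs are the ones from the question. Treat it as a starting point under those assumptions, not a recommended production mapping.

```python
import json


def to_synonym_rules(variant_groups):
    """Turn groups of equivalent names into comma-separated synonym rules."""
    rules = []
    for group in variant_groups:
        # Lowercase and de-duplicate so the rules match lowercased tokens.
        names = sorted({n.strip().lower() for n in group if n.strip()})
        if len(names) > 1:
            rules.append(", ".join(names))
    return rules


# The two pairs mentioned in the question.
groups = [["Ted", "Edward"], ["Bill", "William"]]

# Index settings body; "name_synonyms" and "person_name" are made-up names.
settings = {
    "settings": {
        "analysis": {
            "filter": {
                "name_synonyms": {
                    "type": "synonym",
                    "synonyms": to_synonym_rules(groups),
                }
            },
            "analyzer": {
                "person_name": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "name_synonyms"],
                }
            },
        }
    }
}

print(json.dumps(settings, indent=2))
```

The printed JSON could be sent as the body when creating an index. The `lowercase` filter is placed before the synonym filter so that the lowercased rules line up with the tokens they are compared against.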