Re: synonym dictionaries of person names

2015-02-01 Thread Itamar Syn-Hershko
Was it raw POS-tagged data or just raw data? Can you share the code /
process you used?

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Thu, Jan 29, 2015 at 3:34 PM, Mark Harwood 
mark.harw...@elasticsearch.com wrote:

 I've built one before from raw data but you need:
 1) a *lot* of data
 2) a unique ID per person
 3) some noise/variation in the names recorded for each person

 The input is of this form:

 personID   recorded_name
 ========   =============
 1          Rob
 1          Robert
 1          Bob
 2          Dave
 2          David
 2          Alice
 ...

 The output is a weighted graph of name variants, e.g. Robert == Bob with a
 strong confidence rating.
 Using this I know not just real names but also typos, e.g. that Janes is
 more likely to be James than Jane (a common typo because of where the keys
 sit on the keyboard).
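A minimal sketch of how such a graph could be built from input of that form. This is not necessarily the process described above: the function name `build_variant_graph`, the use of a Jaccard-style weight, and the extra sample rows for persons 3 and 4 are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_variant_graph(records):
    """Build a weighted name-variant graph.

    records: iterable of (person_id, recorded_name) pairs.
    Returns a dict mapping (name_a, name_b), with name_a < name_b,
    to a weight between 0 and 1.
    """
    # Collect the distinct names recorded for each person.
    names_by_person = defaultdict(set)
    for person_id, name in records:
        names_by_person[person_id].add(name.strip().lower())

    name_count = defaultdict(int)  # number of people recorded under a name
    pair_count = defaultdict(int)  # number of people recorded under both names
    for names in names_by_person.values():
        for name in names:
            name_count[name] += 1
        for a, b in combinations(sorted(names), 2):
            pair_count[(a, b)] += 1

    # Assumed weighting (Jaccard): people with both names divided by
    # people with either name.
    graph = {}
    for (a, b), both in pair_count.items():
        graph[(a, b)] = both / (name_count[a] + name_count[b] - both)
    return graph


# Rows for persons 1 and 2 come from the example input; persons 3 and 4
# are made up so that the weights differ.
sample = [
    (1, "Rob"), (1, "Robert"), (1, "Bob"),
    (2, "Dave"), (2, "David"), (2, "Alice"),
    (3, "Robert"), (3, "Bob"),
    (4, "Alice"),
]
graph = build_variant_graph(sample)
for pair, weight in sorted(graph.items(), key=lambda item: -item[1]):
    print(pair, round(weight, 2))
```

In this toy sample the noisy pair (alice, dave) ends up with a lower weight than (dave, david), because Alice is also recorded for a person who has no other names. With real data you would probably also want a minimum count per pair before trusting a weight.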




 On Thursday, January 29, 2015 at 5:28:33 AM UTC, David Kemp wrote:

 I am looking for synonym dictionaries of person names that I can use with
 the Elasticsearch synonym analyser.
 e.g. dictionaries that map Ted to Edward, and Bill to William.
 I am curious to know what others are using.
 So far I have found these two possible sources:

 https://code.google.com/p/nickname-and-diminutive-names-lookup/downloads/list
 https://github.com/DallanQ/Names/wiki/Name-variant-files

 And perhaps
 http://www.behindthename.com

 Thanks,
 David

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/6a473177-7fdd-49d9-95e3-538b51df57f1%40googlegroups.com

 For more options, visit https://groups.google.com/d/optout.




synonym dictionaries of person names

2015-01-28 Thread David Kemp
I am looking for synonym dictionaries of person names that I can use with 
the Elasticsearch synonym analyser.
e.g. dictionaries that map Ted to Edward, and Bill to William.
I am curious to know what others are using.
So far I have found these two possible sources:

https://code.google.com/p/nickname-and-diminutive-names-lookup/downloads/list
https://github.com/DallanQ/Names/wiki/Name-variant-files

And perhaps
http://www.behindthename.com

Thanks,
David
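
For context, a small sketch of how a dictionary like this might be turned into index settings for the Elasticsearch synonym token filter. The filter and analyzer names (`person_name_synonyms`, `person_name`) and the helper `synonym_settings` are hypothetical; the two name groups are the examples from the message above.

```python
import json


def synonym_settings(groups):
    """Build index settings with a synonym filter for person names.

    groups: list of lists of equivalent names, e.g. [["ted", "edward"]].
    Each group becomes one comma-separated rule in Solr synonym format.
    """
    rules = [", ".join(group) for group in groups]
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Hypothetical filter name.
                    "person_name_synonyms": {
                        "type": "synonym",
                        "synonyms": rules,
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name; lowercasing runs first,
                    # so the rules are written in lower case.
                    "person_name": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "person_name_synonyms"],
                    }
                },
            }
        }
    }


body = synonym_settings([["ted", "edward"], ["bill", "william"]])
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body when creating the index. For a large dictionary, the filter's `synonyms_path` option pointing at a file is likely a better fit than inline rules.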
