Sorry for the late reply. I was out for some days.

First thanks to those friends who mention the soundex. Since this is a
data structure question, I think just saying using soundex is not the
proper answer. :-) Also, to my understanding, soundex is good for
exact match, not approximate match (find all similar ones).

I got what Dhruva and Arun mean here. Since usually we only care about
the top K most similar words, so for each given words, we can solve th
problem by N + K*logK, where N is to scan through all the words, and
the K*logK is the heap that keep the top K most similar ones. It is
better than the N*logN as I previously thought.

Thanks.


On Mar 28, 8:49 pm, "Kevin" <[EMAIL PROTECTED]> wrote:
> This is from an interview question, see how do you guys think of it:
>
> Given a Enlgish word, how to find these words that pronounce similar
> to it?
>
> You can have some reasonable assumptions, such as: you have an
> dictionary which tells you how each word pronounce (such as its
> phonetic symbols), there is an existing function that can give you the
> similarity of two words' pronunciation (for example, double d =
> checkSimilarity(WordA, WordB), where d is between 0.0 to 1.0). You are
> allowed to have other assumptions, if you think it is reasonable and
> will help.
>
> A straight way is to compare the given word with all the words in the
> dictionary, and order the d values, then return the top ranking ones.
> This will take N comparisons for each given word. Plus the time to
> sort the result (N*logN).
>
> Since this supposed to be a data structure and algorithm question, any
> better ideas?
>
> I am thinking we can precompute a NxN similarity matrix, but even with
> that, it is still N*logN for each word since we need to sort/rank it.
>
> Another method, of course, is to use the above method to precompute
> and pre-ranking for every word --- but this sounds not an interesting
> idea as for an interview question. :-)


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Algorithm Geeks" group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~----------~----~----~----~------~----~------~--~---

Reply via email to