[algogeeks] Re: Hashing with strings

Don Tue, 26 Jul 2011 06:31:18 -0700

Usually a hash function is one-way, meaning that you can't recover the
original string. That is because there are many more possible values
for the string than for the hash index, making the hash function a
many-to-one relationship.


A very common hash function which I believe was mentioned in Knuth was

int hash(char *string)
{
  int result = 0;
  for(int i = 0; string[i]; ++i)
    result = 61*result + string[i];
  return result;
}

The result would then be used modulus the table size.

Here is one use for hashing:

If you want to find duplicated words in a large file, you could build
a hash table with words you have encountered in the file. Each time
you encounter a word, you would first search for it to determine if it
is already in the table. If so, you would increment its counter.
Otherwise, you would add it and set the counter to one. Then, when you
are done processing the file, any entries with a counter value of more
than one would be duplicated words. This would be far faster than
searching clear through the file for each word.

Don

On Jul 26, 8:13 am, syl <abeygau...@gmail.com> wrote:
> thanks for the info...i saw a method of using 4 bytes of string
> together and then add them and finally take a modulus....doing such a
> complex thing ...is thr any way to recover the string back using the
> key only.....can you give an example where you have seen using hashing
> with strings...that would clarify my doubt to more extent...
>
> On Jul 26, 6:07 pm, Don <dondod...@gmail.com> wrote:
>
> > A string hash function typically takes a string as an argument and
> > returns an integer which can be used as an index into a hash table
> > which allows it to be found quickly. The purpose is to relate a string
> > to something else in an efficient way. For instance, a symbol table
> > which stores variable names and needs to quickly find the type and the
> > value (or the address of the value) could use a hash table to avoid
> > having to search through the whole table. There are many theories on
> > the best hash functions for hashing strings. While two different
> > strings may produce the same hash index, a good hash function should
> > produce a reasonably uniform distribution of results, and should
> > produce different results for similar strings. Any hashing method
> > needs to deal with "collisions" or how to handle the case where two
> > strings are hashed to the same index. Some schemes use a linked list
> > or binary tree withing the hash "bucket", and some use a rehash which
> > stores one of the strings somewhere else. If a hash table becomes too
> > full, it can become much less efficient, particularly if the method of
> > resolving collisions is not well designed.
>
> > Don
>
> > On Jul 26, 7:49 am, syl <abeygau...@gmail.com> wrote:
>
> > > can anyone please tell me how to do hashing with strings......just
> > > wanna confirm...and most importantly what is the use of it...i have
> > > never used it....
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Algorithm Geeks" group.
To post to this group, send email to algogeeks@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.

[algogeeks] Re: Hashing with strings

Reply via email to