[algogeeks] Re: check string A and B isPermutation ?

2006-04-16 Thread Timak

Padmanabhan, can you elaborate a bit on your suggestion?

Padmanabhan Natarajan [EMAIL PROTECTED] writes:

 how about a randomized approach if the string is truly big. Generate
 *relatively* prime numbers for each character in string and compute the
 product. Repeat it N times to minimize the chances of error. It must easy to
 observe that as you increase N, your chances of error decreases.

 On 4/14/06, Timak [EMAIL PROTECTED] wrote:


 Gene,

 If in reality space is no consern, I'd argue that just comparing
 the histograms (what BigYan suggested, except that he tried to
 optimize it) is the best approach.  It is O(2N) time and O(w) space,
 while your proposal is O(2wN) time and O(N) space.  Copmaring the
 histograms is also the simplest.

 Timak

 Gene  [EMAIL PROTECTED] writes:

  Let's be real.  I can't imagine a application where it could make much
  of a difference.  Can you?
 
  Almost certainly it is simplicity and beauty and angels dancing on pins
  we're talking about.
 
  By the way, you can add about 16 characters to my function above and it
  becomes a true sort (not a sort into 1-hamming or gray code order).
 
 
 


 

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-15 Thread Gene


Timak wrote:
 Hey, I think it's cool too :-) BTW, how exactly do you generalize it?

Well, I'm going to let that as a puzzle for Algorithm Geeks!  It's
equally cool.  And I've never seen it published anywhere.


  There are a few other things to think about however.
 
  First, simplest depends on environment.  In perl you can compare two
  strings $a and $b with
 
  if (join('', sort(split '', $a)) eq join('', sort(split '', $b)) {
# a is a permutation of b!
  }
 
  Histogram code would be more complex and would probably run slower.

 Why, can be pretty simple too.  Something like:

 bool is_perm(char* a, char* b)
 {
   while (*a) hist[*(a++)]++;
   while (*b) if (--hist[*(b++)]  0) return true;
   return false;
 }

Ah, but in the first place I was talking about Perl.  Histogramming
with arrays in Perl is really awkward. My whole point is that different
environments often lead to different best solutions.  You also have
omitted zeroing the histogram, which isn't fair because your function
won't work without it.

Initializing arrays often crops up as the most expensive operation in
interesting algorithms.
So I'll point out that there is a cool data structure that you can
tack on to an array.  This data structure needs only constant time
per array access to maintain.  Yet it allows you to treat the array as
though it's initialized to zero even though it really is not.  The down
side is that this data structure needs O(V) additional storage, where V
is the number of array entries eventually set to values different from
zero.  Again I'll let the details as a puzzle for the news group.

Unfortunately, your code above doesn't work.  For example it will
return false on is_perm(a, a) .  Did you mean to have the true and
false in the other order?  Even if you reverse them it doesn't work.
Try is_perm(aa, a) .  Your code will miss the fact that hist['a']
is 1 at the end.  So your algorithm should really be called
is_superset(a, b)

 However, please note that you don't need to compare the
 histograms (although those were my words).  You just need to
 accumulate the histogram of the first string, and then just decrement
 the counts when scanning the second one; since you would know that the
 lengths are the same, a trial to decrement a zero would mean the
 strings are not permutations.  All this is O(N).

Sorry this is not correct. To make your algorithm work, you'd need to
verify that the incremented/decremented histogram is all zeros at the
end (see my example above).  This final verification pass is O(2^w).

Another approach that doesn't need the verification pass is to use
is_superset(a, b)  is_superset(b, a).  But now you need to zero the
histogram twice!

IMHO if you have big characters, the histogram is best kept in a hash
table where the keys are letters and the values are counts.   But this
again is more complex.

Great discussion!  Thanks.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-15 Thread Timak

Yeah, I mistakenly swaped the true and false.

My code works, but it assumes that you know that the lengths of the
strings are the same.  BTW, the lengths could be calculated on the
fly.  Also the hist doesn't have to be a vector, it could be a hash,
as you mentioned.

Here's a full c++ implementation using an STL map (which is most often
implemented using a red-black tree).

#include iostream
#include map

std::mapchar, int hist;

bool is_perm(char* a, char* b)
{
int cnt = 0;
while (*a) {
hist[*(a++)]++;
cnt++;
}
while (*b) {
if (--hist[*(b++)]  0)
return false;
cnt--;
}
return !cnt;
}

main() {
std::cout  (is_perm(aab, abab) ? y : n)
   std::endl;
}

I think STL calls the default constructors so no need make sure the
counters are zero.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-14 Thread Timak

Gene, 

If in reality space is no consern, I'd argue that just comparing
the histograms (what BigYan suggested, except that he tried to
optimize it) is the best approach.  It is O(2N) time and O(w) space,
while your proposal is O(2wN) time and O(N) space.  Copmaring the
histograms is also the simplest.

Timak

Gene [EMAIL PROTECTED] writes:

 Let's be real.  I can't imagine a application where it could make much
 of a difference.  Can you?

 Almost certainly it is simplicity and beauty and angels dancing on pins
 we're talking about.

 By the way, you can add about 16 characters to my function above and it
 becomes a true sort (not a sort into 1-hamming or gray code order).


 

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-14 Thread Gene

I agree with much of what you say!  I just think character sorting into
hamming-1 order is cool, especially since it generalizes to a true sort
with just a slight modification.

There are a few other things to think about however.

First, simplest depends on environment.  In perl you can compare two
strings $a and $b with

if (join('', sort(split '', $a)) eq join('', sort(split '', $b)) {
  # a is a permutation of b!
}

Histogram code would be more complex and would probably run slower.

Another thing is that it's meaningless to talk about O(2N) because
O(2N)=O(N).

Finally, for characters of w bits, the histogram requires O(2^w log N)
space in the general case, not O(w) as you say.

Moreover, run time is also O(2^w) to compare those histograms.  For
example, if you are working in Unicode where characters might have up
to 4 bytes, you'd need 4 billion histogram entries (perhaps 16
gigabytes), and it would take a while to compare these.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-14 Thread Padmanabhan Natarajan
how about a randomized approach if the string is truly big. Generate *relatively* prime numbers for each character in string and compute the product. Repeat it N times to minimize the chances of error. It must easy to observe that as you increase N, your chances of error decreases.
On 4/14/06, Timak [EMAIL PROTECTED] wrote:

Gene,If in reality space is no consern, I'd argue that just comparingthe histograms (what BigYan suggested, except that he tried tooptimize it) is the best approach.It is O(2N) time and O(w) space,

while your proposal is O(2wN) time and O(N) space.Copmaring thehistograms is also the simplest.TimakGene 
[EMAIL PROTECTED] writes:
 Let's be real.I can't imagine a application where it could make much of a difference.Can you? Almost certainly it is simplicity and beauty and angels dancing on pins we're talking about.
 By the way, you can add about 16 characters to my function above and it becomes a true sort (not a sort into 1-hamming or gray code order).

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups Algorithm Geeks group.  To post to this group, send email to algogeeks@googlegroups.com  To unsubscribe from this group, send email to [EMAIL PROTECTED]  For more options, visit this group at http://groups.google.com/group/algogeeks  -~--~~~~--~~--~--~---


[algogeeks] Re: check string A and B isPermutation ?

2006-04-13 Thread Gene

Let's be real.  I can't imagine a application where it could make much
of a difference.  Can you?

Almost certainly it is simplicity and beauty and angels dancing on pins
we're talking about.

By the way, you can add about 16 characters to my function above and it
becomes a true sort (not a sort into 1-hamming or gray code order).


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-11 Thread Rajiv Mathews

I had written..
So, using a sort the complexity really is
O(2n log 2), Isn't it?
Oh yeah..
That's terribly bad logic!

@Gene
Use radix sort to make it O(N W) where W is the bit-width of the
characters in the string, hence O(N).
I don't really get how a radix pass
is going to help here either..
Wouldn't it just order the strings, while
we really want to order the characters in
the strings?


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-11 Thread [EMAIL PROTECTED]

i think we just construct a string C that repeate string A twice, for
example, string A is 'abcd', and string C is 'abcdabcd', then check
whether string B is a substring of string C.

by the way, the string C can be saved by utilizing string A and modular
function, for example, the char at position i in string C is exactly
the char at i%len(A) in string A.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-11 Thread BiGYaN

Assuming that you cannot have any letter repeated more than 15 times.

Create a struct of 26, 4 bit bitfields. This is going to take 26x4 bits
or 13 bytes of memory.

Use this structure to store the character count of each character of
the first string.

Now proceed to the second string and for each character, decrease it's
corresponding character count by 1 if it is greater than 0. If it is 0
then the second string is not a permutation of the first string.

At the end of processin the second string check whether each caracter
count is 0. If they all are 0 then the strings are permutations of one
another.

Of course you can modify the algo to accomodate higher maximum
character counts. In any case the time required is O(n) and space
required is 26 x ceiling ( log ( maximum_individual_character_count,
base=2) ) bits.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-11 Thread Padmanabhan Natarajan
How about the prime number approach I mentioned above. Any comments?On 4/11/06, Dhyanesh [EMAIL PROTECTED]
 wrote:I also particularly dont like radix sort. Because it is kind of a
hack. It is not a comparison based sort. It will work with charactersas you have here. But what if you have not two strings of characters,but two arrays of objects which you can only compare. You would
not be able to apply radix sort then.-DhyaneshOn 4/11/06, Kevin [EMAIL PROTECTED] wrote: Talking about radix sorting, my experience is that many job
 interviewers (who are coders/developer themselvef) do not like it. Each time when I mentioned it, they usually did not get happy with it. What could be the reason? Any bad things in practical for radix
 sorting? 

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups Algorithm Geeks group.  To post to this group, send email to algogeeks@googlegroups.com  To unsubscribe from this group, send email to [EMAIL PROTECTED]  For more options, visit this group at http://groups.google.com/group/algogeeks  -~--~~~~--~~--~--~---


[algogeeks] Re: check string A and B isPermutation ?

2006-04-11 Thread timak

With the prime number approach you would need to somehow facilitate
large numbers.  Also, you need to store those primes which would take
a lot of space, although admittedly you only need to take that space
once.  In addition, multiplication is a much heavier opperation than
increment.

BiGYaN's approach is nice but does not scale for large strings.  If
the strings are enourmous then an array of links to the counts with
the ability to increase the number of bits in a counter on the fly
might work, although that's getting complex and would take space.

If space is the main consern, wouldn't some sort of radix sort work
here?

You could sort the bytes of each string according to one bit (say the
least significant of the relevant bits), that would partition the
strings into two parts.  The sort could be done by starting both from
left and right ends and exchanging elements that are not in the appropriate
partition (left or right).  Once done with the first bit, check if the
partitioning position is different, if it is then A is no permitation
of B and stop.  Otherwise continue with the next relevant bit.

This is O(N * w) where w is the number of bits need to be checked
before returning FALSE or the total number of relevant bits in a
character in the worst case.  No additional space required.

--Timak

Padmanabhan Natarajan [EMAIL PROTECTED] writes:

 How about the prime number approach I mentioned above. Any comments?

 On 4/11/06, Dhyanesh [EMAIL PROTECTED]  wrote:


 I also particularly dont like radix sort. Because it is kind of a
 hack. It is not a comparison based sort. It will work with characters
 as you have here. But what if you have not two strings of characters,
 but two arrays of objects which you can only compare. You would
 not be able to apply radix sort then.

 -Dhyanesh

 On 4/11/06, Kevin [EMAIL PROTECTED] wrote:
 
  Talking about radix sorting, my experience is that many job
  interviewers (who are coders/developer themselvef) do not like it. Each
  time when I mentioned it, they usually did not get happy with it.
 
  What could be the reason? Any bad things in practical for radix
  sorting?
 
 
  
 






 

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-11 Thread Timak

[EMAIL PROTECTED] writes:

 So this is no better than just having an array of the counts per
 character, in terms of space and worse in terms of speed. :-/

Actually, what was I thinking, the radix approach would be better,
spacewise.  In terms of space it would require O(w) space where w is
the number of bits used for the determination.  The array of sums
approach (what Pramod Patangay suggested earlier) requires O(K) space
where K is the number of characters.  Since w is logK or less, the
radix approach is better.

Speedwise, the array of sums would be O(2N), while the radix is
O(2N*w).  Both are O(N) since w is a constant relatively small, in
case of huge N.

[boy, I just replied to my own message for a second time :-)]


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-10 Thread pramod

Something on the lines of counting sort will do. Make an array
Count[26] where Count[1] will have the number of a's etc. While reading
string A, keep note of the count of the characters in Count and while
reading B, just check if the same number of characters occur.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-10 Thread shishir

HI,
how do you intend to use hash ( as in which hash function) to check if
the hash keys generated for both A and B are equal.

Thanks,
Shishir


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-10 Thread Kevin

But the problem is:
there are much more than 26 possible characters in a string, even ASCII
have 100+ characters I think, not to mention those unicode, those which
use 2 or even more bytes as a character. So the Count[] will be too
large to be practical useful.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-10 Thread Padmanabhan Natarajan
How about this?Assign a prime number to each of the the characters. Multiply each of the numbers in each string and compare the product. If they are the same then it is a permutation.Cheers,Nat
On 4/10/06, Kevin [EMAIL PROTECTED] wrote:
But the problem is:there are much more than 26 possible characters in a string, even ASCIIhave 100+ characters I think, not to mention those unicode, those whichuse 2 or even more bytes as a character. So the Count[] will be too
large to be practical useful.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups Algorithm Geeks group.  To post to this group, send email to algogeeks@googlegroups.com  To unsubscribe from this group, send email to [EMAIL PROTECTED]  For more options, visit this group at http://groups.google.com/group/algogeeks  -~--~~~~--~~--~--~---


[algogeeks] Re: check string A and B isPermutation ?

2006-04-10 Thread Kevin

But multiply prime numbers will run out of range of integer quickly (as
I remember multiply the first 10-20 prime will get a too large number
to fit into an integer).


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-10 Thread Padmanabhan Natarajan
OK OK! Use the modulus operator then. Assign prime numbers mod p where p is a distinct large prime to characters and multiply.On 4/10/06, Kevin 
[EMAIL PROTECTED] wrote:But multiply prime numbers will run out of range of integer quickly (as
I remember multiply the first 10-20 prime will get a too large numberto fit into an integer).

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups Algorithm Geeks group.  To post to this group, send email to algogeeks@googlegroups.com  To unsubscribe from this group, send email to [EMAIL PROTECTED]  For more options, visit this group at http://groups.google.com/group/algogeeks  -~--~~~~--~~--~--~---


[algogeeks] Re: check string A and B isPermutation ?

2006-04-10 Thread Gene


Kevin wrote:
 Given two strings A and B, how to quickly check whether they are
 permutation or not?
 For example, abcd and bcda are permutation: same chars, just
 different order.

 Using hash: hash A and check B will use time O(N), at the cost of extra
 N space. How about do not use that much of extra space?

 One naive method: sort A and B by their chars, and do a compare will
 take O(NlogN) time. Any faster method?

Use radix sort to make it O(N W) where W is the bit-width of the
characters in the string, hence O(N).


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: check string A and B isPermutation ?

2006-04-10 Thread [EMAIL PROTECTED]

Just of the top of my head.
You're dealing with *2* strings here.
So the *naive* method you quoted wont be that
bad.
Use a sort. The basic operation here being
string compare, which takes up O(n) time
NOTE 'n' is the length of the strings.
So, using a sort the complexity really is
O(2n log 2), Isn't it?
Which is O(n) really, n being length of the strings.
NO extra space either.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---