Re: Does Regex help in this case ?

2003-09-07 Thread Li Ngok Lam
Thanks John, and Rob.

This reply is quite close to what I am going to do,
but a critical point is still missing, so I'll try to explain
my question further.


  That's about the whole story, but I'll make it short.
  For example, I have a list like this :
 
  123ABCDEF456
  123456
  654WXYZ321
  987654321
  ABCDEF123456
  WXYZ321
 
  By user's INTEGER input , I will have to find how many similar
  patterns are matched within the list according to certain chars (user's
  input ) :
 
  For example, I input '3', then I will get the result like this :
 
  Res1: 123ABCDEF456 is similar to 123456
  Res2: 123ABCDEF456 is similar to ABCDEF123456
  Res3: 654WXYZ321 is similar to 987654321
  Res4: 654WXYZ321 is similar to WXYZ321
 
  In case , if a pattern match happens, then the elem in list will not
  be shown again even another match happens. Okay, thaz my
  homework for how to deal with the output.
 
  The question I want to ask is how to tell ( or is this a good starting
  point ) the regex to compare the patterns freely ? So I can get
  654WXYZ321 match 987654321 and also match WXYZ321 ?
 
  I hope I can explain my question well.

 I'm not sure exactly what you want but maybe this will give you some
ideas:


It does, and that's about where my code currently stands.

 #!/usr/bin/perl
 use warnings;
 use strict;

 my @data = qw(
 123ABCDEF456
 123456
 654WXYZ321
 987654321
 ABCDEF123456
 WXYZ321
 );

 for my $x ( @data ) {
     for my $y ( @data ) {
         next if $x eq $y or length( $x ) < length( $y );
         my $count = () = $x =~ /[\Q$y\E]/g;
         my $perc = ( $count / length $x ) * 100;
         printf "%-12s %-12s  %2d %2d  %6.2f %%\n", $x, $y, length $x, $count, $perc;
     }
 }

 __END__

 Produces this output:

 123ABCDEF456 123456        12  6   50.00 %
 123ABCDEF456 654WXYZ321    12  6   50.00 %

For what I want, this is not a match.
If my input is 3, then the scanning process is like this:

123 compare 654WXYZ321 = false
23A compare 654WXYZ321 = false
3AB compare 654WXYZ321 = false
ABC compare 654WXYZ321 = false
...
...
456 compare 654WXYZ321 = false

Here, 3 means that every run of 3 consecutive chars from the string forms a
pattern, which is then compared against the elements in the list.


 123ABCDEF456 987654321     12  6   50.00 %
 123ABCDEF456 ABCDEF123456  12 12  100.00 %
 123ABCDEF456 WXYZ321       12  3   25.00 %
 654WXYZ321   123456        10  6   60.00 %
 654WXYZ321   987654321     10  6   60.00 %
 654WXYZ321   WXYZ321       10  7   70.00 %
 987654321    123456         9  6   66.67 %
 987654321    WXYZ321        9  3   33.33 %
 ABCDEF123456 123ABCDEF456  12 12  100.00 %
 ABCDEF123456 123456        12  6   50.00 %
 ABCDEF123456 654WXYZ321    12  6   50.00 %
 ABCDEF123456 987654321     12  6   50.00 %
 ABCDEF123456 WXYZ321       12  3   25.00 %
 WXYZ321      123456         7  3   42.86 %



Judging from the result, the matching is character based. So
ZXCVBNM is a 100 % match for MNBVCXZ, but what I am trying to
do would treat this as a 0 % match, unless my input is '1'.

I hope I have explained my question well this time; thanks for
any further advice. =)
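
A minimal sketch of the difference described above, assuming "similar" means sharing a run of N consecutive characters. The helper name `chunk_hits` is made up for illustration and is not from the thread.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Count the $len-character substrings of $y that also occur somewhere in $x.
# (Hypothetical helper; one possible reading of the requirement.)
sub chunk_hits {
    my ( $x, $y, $len ) = @_;
    my $hits = 0;
    for my $offset ( 0 .. length( $y ) - $len ) {
        $hits++ if index( $x, substr( $y, $offset, $len ) ) >= 0;
    }
    return $hits;
}

my ( $x, $y ) = qw( ZXCVBNM MNBVCXZ );

# Character-class matching ignores order: it counts every char of $x
# that appears anywhere in $y.
my $char_hits = () = $x =~ /[\Q$y\E]/g;

printf "char class: %d of %d\n", $char_hits, length $x;
printf "len %d: %d shared chunks\n", $_, chunk_hits( $x, $y, $_ ) for 1 .. 3;
```

For this pair the character-class count is 7 of 7, while `chunk_hits` finds no shared chunk for lengths 2 and 3 and finds 7 for length 1, which is the "0 % unless my input is '1'" behaviour asked for.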




-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Does Regex help in this case ?

2003-09-07 Thread John W. Krahn
Li Ngok Lam wrote:
 
 Thanks John, and Rob.
 
 This reply is quite close to what I am going to do,
 but some critical point is wanted here  I'll try to explain
 my question further
 
 It does, and thaz about my coding currently up to.
 
  #!/usr/bin/perl
  use warnings;
  use strict;
 
  my @data = qw(
  123ABCDEF456
  123456
  654WXYZ321
  987654321
  ABCDEF123456
  WXYZ321
  );
 
  for my $x ( @data ) {
      for my $y ( @data ) {
          next if $x eq $y or length( $x ) < length( $y );
          my $count = () = $x =~ /[\Q$y\E]/g;
          my $perc = ( $count / length $x ) * 100;
          printf "%-12s %-12s  %2d %2d  %6.2f %%\n", $x, $y, length $x, $count, $perc;
      }
  }
 
  __END__
 
  Produces this output:
 
  123ABCDEF456 123456        12  6   50.00 %
  123ABCDEF456 654WXYZ321    12  6   50.00 %
 
 For what I want, this is not a match.
 if my input is 3, than, the scanning process is like this :
 
 123 compare 654WXYZ321 = false
 23A  compare 654WXYZ321 = false
 3AB  compare 654WXYZ321 = false
 ABC  compare 654WXYZ321 = false
 ...
 ...
 456 cmp 654WXYZ321 = false
 
 In case, 3 means,  each 3 chars from the string formed a pattern
 and trying to compare with elems in the list.

Maybe this is closer to what you want:

#!/usr/bin/perl
use warnings;
use strict;

my $len = 3;

my @data = qw(
123ABCDEF456
123456
654WXYZ321
987654321
ABCDEF123456
WXYZ321
);

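for my $x ( @data ) {
    for my $y ( @data ) {
        # Skip self-comparison and any $y longer than $x.
        next if $x eq $y or length( $x ) < length( $y );
        my $count = 0;
        # Slide a $len-character window along $y; stop once the chunk runs short.
        for ( my $offset = 0; length( my $chunk = substr $y, $offset++, $len ) == $len; ) {
            # index() returns -1 when $chunk does not occur in $x.
            $count += index( $x, $chunk ) >= 0;
        }
        printf "%-12s %-12s  %2d %2d\n", $x, $y, length $x, $count;
    }
}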

__END__

Produces this output:

123ABCDEF456 123456        12  2
123ABCDEF456 654WXYZ321    12  0
123ABCDEF456 987654321     12  0
123ABCDEF456 ABCDEF123456  12  6
123ABCDEF456 WXYZ321       12  0
654WXYZ321   123456        10  0
654WXYZ321   987654321     10  2
654WXYZ321   WXYZ321       10  5
987654321    123456         9  0
987654321    WXYZ321        9  1
ABCDEF123456 123ABCDEF456  12  6
ABCDEF123456 123456        12  4
ABCDEF123456 654WXYZ321    12  0
ABCDEF123456 987654321     12  0
ABCDEF123456 WXYZ321       12  0
WXYZ321      123456         7  0
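
One way the chunk test might be turned into the "Res1 ... Res4" listing from the original question, offered as a sketch only. It assumes that "will not be shown again" means an element is skipped once it has appeared in any reported pair. The names `shares_chunk`, `%seen` and `@results` are illustrative, not from the thread.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $len  = 3;
my @data = qw( 123ABCDEF456 123456 654WXYZ321 987654321 ABCDEF123456 WXYZ321 );

# True if some $len-character substring of $y also occurs in $x.
sub shares_chunk {
    my ( $x, $y, $len ) = @_;
    for my $offset ( 0 .. length( $y ) - $len ) {
        return 1 if index( $x, substr( $y, $offset, $len ) ) >= 0;
    }
    return 0;
}

my ( %seen, @results );
for my $x ( @data ) {
    next if $seen{$x};    # already reported as the partner of an earlier element
    for my $y ( @data ) {
        next if $x eq $y or $seen{$y};
        next unless shares_chunk( $x, $y, $len );
        push @results, "$x is similar to $y";
        $seen{$x} = $seen{$y} = 1;    # assumed rule: never show either again
    }
}

printf "Res%d: %s\n", $_ + 1, $results[$_] for 0 .. $#results;
```

With the six strings from the thread and a length of 3, this reports the same four pairs as the listing in the original question. A different reading of the "not shown again" rule would need a different `%seen` policy.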



John
-- 
use Perl;
program
fulfillment




Re: Does Regex help in this case ?

2003-09-07 Thread R. Joseph Newton
Li Ngok Lam wrote:

 Hi all,

 That's about the whole story, but I'll make it short.
 For example, I have a list like this :

 123ABCDEF456
 123456
 654WXYZ321
 987654321
 ABCDEF123456
 WXYZ321

 By user's INTEGER

Please don't do this.  The word integer is not being used as a global
constant.  It doesn't help us understand.

 input , I will have to find how many similar
 patterns are matched within the list according to certain chars

You already told us, accurately, that you sought an integer here.  Please,
don't even mention chars in input.  The relevant user input is its integer
value, which you later tell us is the minimum length desired for matches
between elements.

 (user's
 input ) :


Reading the passage above as originally written, I assume that only the
strings which contained the character '3' would be considered in the match.


 For example, I input '3', then I will get the result like this :

 Res1: 123ABCDEF456 is similar to 123456         # each contains char '3'
 Res2: 123ABCDEF456 is similar to ABCDEF123456   # each contains char '3'
 Res3: 654WXYZ321 is similar to 987654321        # each contains char '3'
 Res4: 654WXYZ321 is similar to WXYZ321          # each contains char '3'

How about:

I want to compare a set of strings for pattern matches.  The user will
enter a number, then the program should seek, for each string in the array,
all strings for which a pattern of that length can be found in common
between them.

Please don't throw in technical terminology that is not necessary to
understand the problem.  You will get much better help, and write much
more powerful code, by keeping things as simple as you can make them.

Joseph





Does Regex help in this case ?

2003-09-06 Thread Li Ngok Lam
Hi all, 

That's about the whole story, but I'll make it short.
For example, I have a list like this :

123ABCDEF456
123456
654WXYZ321
987654321
ABCDEF123456
WXYZ321

By user's INTEGER input , I will have to find how many similar
patterns are matched within the list according to certain chars (user's
input ) :

For example, I input '3', then I will get the result like this :

Res1: 123ABCDEF456 is similar to 123456
Res2: 123ABCDEF456 is similar to ABCDEF123456
Res3: 654WXYZ321 is similar to 987654321
Res4: 654WXYZ321 is similar to WXYZ321

If a pattern match happens, then that element of the list will not
be shown again even if another match happens. Okay, that's my
homework for how to deal with the output.

The question I want to ask is how to tell the regex (or is a regex even
a good starting point?) to compare the patterns freely, so that
654WXYZ321 matches 987654321 and also matches WXYZ321.

I hope I can explain my question well.

Thanks in advance,
Li





Re: Does Regex help in this case ?

2003-09-06 Thread Rob Dixon

Li Ngok Lam wrote:

 Hi all,

 That's about the whole story, but I'll make it short.
 For example, I have a list like this :

 123ABCDEF456
 123456
 654WXYZ321
 987654321
 ABCDEF123456
 WXYZ321

 By user's INTEGER input , I will have to find how many similar
 patterns are matched within the list according to certain chars (user's
 input ) :

 For example, I input '3', then I will get the result like this :

 Res1: 123ABCDEF456 is similar to 123456
 Res2: 123ABCDEF456 is similar to ABCDEF123456
 Res3: 654WXYZ321 is similar to 987654321
 Res4: 654WXYZ321 is similar to WXYZ321

 In case , if a pattern match happens, then the elem in list will not
 be shown again even another match happens. Okay, thaz my
 homework for how to deal with the output.

 The question I want to ask is how to tell ( or is this a good starting
 point ) the regex to compare the patterns freely ? So I can get
 654WXYZ321 match 987654321 and also match WXYZ321 ?

 I hope I can explain my question well.

My best guess is that you want to compare an element of the array with
all other elements to find similar ones. But I'm not sure how they should
match. Explain what you mean by 'similar'. Also try reducing the list to
something like

  my @words = qw/ A1 1A AA 11 /;

and explain what your output should be.
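
As a sketch of what such a reduced example could look like, the code below lists which pairs in that four-word list share a run of 1 or 2 consecutive characters. It assumes the "shared run of N consecutive characters" reading that emerges elsewhere in the thread. `shares_chunk` and `%pairs` are names invented for illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my @words = qw/ A1 1A AA 11 /;

# True if some $len-character substring of $y also occurs in $x.
sub shares_chunk {
    my ( $x, $y, $len ) = @_;
    for my $offset ( 0 .. length( $y ) - $len ) {
        return 1 if index( $x, substr( $y, $offset, $len ) ) >= 0;
    }
    return 0;
}

my %pairs;    # chunk length => list of "X ~ Y" strings
for my $len ( 1, 2 ) {
    $pairs{$len} = [];
    for my $i ( 0 .. $#words ) {
        for my $j ( $i + 1 .. $#words ) {
            push @{ $pairs{$len} }, "$words[$i] ~ $words[$j]"
                if shares_chunk( $words[$i], $words[$j], $len );
        }
    }
    printf "length %d: %s\n", $len, join( ', ', @{ $pairs{$len} } ) || '(none)';
}
```

Under that reading, length 1 pairs every word with every other except AA with 11, and length 2 pairs nothing, since no two words are identical.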

Rob






Re: Does Regex help in this case ?

2003-09-06 Thread John W. Krahn
Li Ngok Lam wrote:
 
 Hi all,

Hello,

 That's about the whole story, but I'll make it short.
 For example, I have a list like this :
 
 123ABCDEF456
 123456
 654WXYZ321
 987654321
 ABCDEF123456
 WXYZ321
 
 By user's INTEGER input , I will have to find how many similar
 patterns are matched within the list according to certain chars (user's
 input ) :
 
 For example, I input '3', then I will get the result like this :
 
 Res1: 123ABCDEF456 is similar to 123456
 Res2: 123ABCDEF456 is similar to ABCDEF123456
 Res3: 654WXYZ321 is similar to 987654321
 Res4: 654WXYZ321 is similar to WXYZ321
 
 In case , if a pattern match happens, then the elem in list will not
 be shown again even another match happens. Okay, thaz my
 homework for how to deal with the output.
 
 The question I want to ask is how to tell ( or is this a good starting
 point ) the regex to compare the patterns freely ? So I can get
 654WXYZ321 match 987654321 and also match WXYZ321 ?
 
 I hope I can explain my question well.

I'm not sure exactly what you want but maybe this will give you some ideas:

#!/usr/bin/perl
use warnings;
use strict;

my @data = qw(
123ABCDEF456
123456
654WXYZ321
987654321
ABCDEF123456
WXYZ321
);

for my $x ( @data ) {
    for my $y ( @data ) {
        # Skip self-comparison and any $y longer than $x.
        next if $x eq $y or length( $x ) < length( $y );
        # Count the characters of $x that occur anywhere in $y.
        my $count = () = $x =~ /[\Q$y\E]/g;
        my $perc = ( $count / length $x ) * 100;
        printf "%-12s %-12s  %2d %2d  %6.2f %%\n", $x, $y, length $x, $count, $perc;
    }
}

__END__

Produces this output:

123ABCDEF456 123456        12  6   50.00 %
123ABCDEF456 654WXYZ321    12  6   50.00 %
123ABCDEF456 987654321     12  6   50.00 %
123ABCDEF456 ABCDEF123456  12 12  100.00 %
123ABCDEF456 WXYZ321       12  3   25.00 %
654WXYZ321   123456        10  6   60.00 %
654WXYZ321   987654321     10  6   60.00 %
654WXYZ321   WXYZ321       10  7   70.00 %
987654321    123456         9  6   66.67 %
987654321    WXYZ321        9  3   33.33 %
ABCDEF123456 123ABCDEF456  12 12  100.00 %
ABCDEF123456 123456        12  6   50.00 %
ABCDEF123456 654WXYZ321    12  6   50.00 %
ABCDEF123456 987654321     12  6   50.00 %
ABCDEF123456 WXYZ321       12  3   25.00 %
WXYZ321      123456         7  3   42.86 %
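
The counting line in the script above packs two idioms into one statement, so here is a small sketch of each in isolation. The sample strings are invented for illustration and deliberately contain `-` and `]`, which the thread's data does not.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $x = 'abc-]';
my $y = 'a-c]';    # made-up input containing regex metacharacters

# \Q...\E applies quotemeta to $y, so '-' and ']' become literal members
# of the character class instead of a range operator and a class terminator.
my $class = quotemeta $y;

# A list assignment evaluated in scalar context yields the number of
# elements on its right-hand side, so "= () =" counts the /g matches.
my $count = () = $x =~ /[\Q$y\E]/g;

# The same count, spelled out with a temporary array.
my @hits = $x =~ /[\Q$y\E]/g;

printf "class [%s] matched %d chars: %s\n", $class, $count, join ' ', @hits;
```

Here the class matches `a`, `c`, `-` and `]` but not `b`, giving a count of 4. Without `\Q`, the pattern would be parsed as the range `[a-c]` followed by a literal `]`, which is a different match altogether.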



John
-- 
use Perl;
program
fulfillment
