RE: Searching a binary file for a specific sequence of hex values?

2002-11-18 Thread Carl Jolley
On Mon, 18 Nov 2002, Peter Guzis wrote:

> You're right on the first issue.  I need to slow down in my rush to be the
> first to reply :P
>
> The search string will NOT be truncated because of these lines:
>
>   my $search_length = length ($search);
>   my $chunk_size = $search_length > CHUNK_SIZE ? $search_length : CHUNK_SIZE;
>
>
> -----Original Message-----
> From: Carl Jolley [mailto:[EMAIL PROTECTED]]
> Sent: Sunday, November 17, 2002 7:35 PM
> To: Peter Guzis
> Cc: Perl Win32 Users (E-mail)
> Subject: RE: Searching a binary file for a specific sequence of hex
> values?
>
>
> Shouldn't the index function be:
>
> $idx = index "$last_chunk$chunk", $search;
>
> instead of
>
> $idx = index "$chunk$last_chunk", $search;
>
> Somehow appending the last several (i.e. match string length) bytes of the
> previous chunk to the end of the next chunk would not seem to allow
> a match on the search string if it was truncated due to the length of
> the chunk.
>

The actual string to be matched in the file is what I was talking about.
Regardless of the size of the chunk, the actual matching string _can_
be truncated due to the size of the chunk, depending on the location of
the matching string in the file. For example, this happens if the chunk
is 1024 bytes long and the matching string begins at location 1020 and is
longer than 4 bytes, or if the chunk size is 12 and the matching string
begins at location 8 and is more than 4 characters long.
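
A minimal sketch of one way to handle a match that straddles a chunk boundary, under a few assumptions: the helper name find_all_offsets is made up for this example, it uses Perl's buffered read rather than sysread, and it reports every occurrence rather than one per chunk. The idea is to keep the last (search length - 1) bytes of what has been read and put them in front of the next chunk before calling index.

```perl
use strict;
use warnings;
use File::Temp 'tempfile';

# Hypothetical helper (not from the thread): return the byte offsets of every
# occurrence of $needle in the file $path, reading $chunk_size bytes at a time.
sub find_all_offsets {
    my ($path, $needle, $chunk_size) = @_;
    die "empty search string\n" unless length $needle;
    my $keep = length($needle) - 1;    # most bytes a split match can leave behind
    $chunk_size = length $needle if $chunk_size < length $needle;

    open my $fh, '<:raw', $path or die "Could not read $path: $!\n";

    my @offsets;
    my $tail = '';    # last $keep bytes of the data read so far
    my $base = 0;     # file offset of the first byte of $tail
    while (1) {
        my $got = read $fh, my $chunk, $chunk_size;
        die "Read error on $path: $!\n" unless defined $got;
        last if $got == 0;

        my $window = $tail . $chunk;   # previous tail FIRST, then the new data
        my $pos = 0;
        while (($pos = index $window, $needle, $pos) > -1) {
            push @offsets, $base + $pos;
            $pos++;
        }

        # Carry only the last $keep bytes into the next round. A full match
        # cannot fit inside them, so nothing is reported twice.
        my $drop = length($window) - $keep;
        $drop = 0 if $drop < 0;
        $base += $drop;
        $tail = substr $window, $drop;
    }
    close $fh;
    return @offsets;
}

# The second case above: 12-byte chunks, a match starting at offset 8 that is
# longer than 4 bytes. The sample data is invented for the demonstration.
my ($tmp, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} ('.' x 8) . 'NEEDLE' . ('.' x 10);
close $tmp;
print join(',', find_all_offsets($tmp_path, 'NEEDLE', 12)), "\n";   # prints 8
```

Because the carried tail is always shorter than the search string, memory use stays at roughly one chunk plus the search length, whatever the file size.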

 [EMAIL PROTECTED] 
 All opinions are my own and not necessarily those of my employer 

___
Perl-Win32-Users mailing list
[EMAIL PROTECTED]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs



RE: Searching a binary file for a specific sequence of hex values?

2002-11-18 Thread Peter Guzis
You're right on the first issue.  I need to slow down in my rush to be the
first to reply :P

The search string will NOT be truncated because of these lines:

  my $search_length = length ($search);
  my $chunk_size = $search_length > CHUNK_SIZE ? $search_length : CHUNK_SIZE;

Peter Guzis
Web Administrator, Sr.
ENCAD, Inc.
- A Kodak Company
email: [EMAIL PROTECTED]
www.encad.com 

-----Original Message-----
From: Carl Jolley [mailto:[EMAIL PROTECTED]]
Sent: Sunday, November 17, 2002 7:35 PM
To: Peter Guzis
Cc: Perl Win32 Users (E-mail)
Subject: RE: Searching a binary file for a specific sequence of hex
values?


Shouldn't the index function be:

$idx = index "$last_chunk$chunk", $search;

instead of

$idx = index "$chunk$last_chunk", $search;

Somehow appending the last several (i.e. match string length) bytes of the
previous chunk to the end of the next chunk would not seem to allow
a match on the search string if it was truncated due to the length of
the chunk.
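
As a small illustration of why the order matters, here is a sketch with made-up strings standing in for two consecutive chunks; only the two index calls come from the thread.

```perl
use strict;
use warnings;

my $search     = 'perl';
my $last_chunk = 'xxxxxxpe';   # end of the previous chunk (invented data)
my $chunk      = 'rlyyyyyy';   # start of the next chunk (invented data)

# Previous data first: the two halves of the match are adjacent again.
my $found_prev_first = index "$last_chunk$chunk", $search;   # 6

# New data first: "pe" now comes after "rl", so nothing matches.
my $found_new_first = index "$chunk$last_chunk", $search;    # -1

print "last+chunk: $found_prev_first, chunk+last: $found_new_first\n";
```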

 [EMAIL PROTECTED] 
 All opinions are my own and not necessarily those of my employer 





RE: Searching a binary file for a specific sequence of hex values?

2002-11-15 Thread Peter Guzis
Try the code below.  For a search string you can specify either hex codes
(e.g. '70 65 72 6C' or '7065726C') or binary data (e.g. 'perl').

---

use strict;
use Fcntl 'O_RDONLY';

use constant CHUNK_SIZE => 4096;

SearchBinary ('searchstring', 'path/to/file.ext');

sub SearchBinary {

  my $search = shift;
  my $file = shift;
  my ($chunk, $last_chunk, $chunks_read, $idx);
  die "expected: SearchBinary (\$search, \$file)\n" unless length $search && length $file;

  # convert hex code to binary data

  if ($search =~ /^(?:[0-9A-F]{2}\s*)+$/i) {

    $search =~ s/\s+//g;
    $search = join '', (map { chr(hex $_) } $search =~ /../g);

  }

  my $search_length = length ($search);
  my $chunk_size = $search_length > CHUNK_SIZE ? $search_length : CHUNK_SIZE;
  die "File '$file' does not exist\n" unless -f $file;
  my $file_size = -s $file;
  sysopen BIN, $file, O_RDONLY or die "Could not read $file: $!\n";
  binmode BIN;

  while ($chunks_read * $chunk_size < $file_size) {

    sysread BIN, $chunk, $chunk_size;
    $idx = index "$chunk$last_chunk", $search;

    if ($idx > -1) {

      printf "Found string at position %d\n", $chunks_read * $chunk_size + $idx;

    }

    $last_chunk = substr $chunk, $chunk_size - $search_length, $search_length - 1;
    $chunks_read++;

  }

  close BIN;

}
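
The hex-to-binary step can also be written with pack. The sketch below is one possible variant, not part of the posted script: the helper name hex_to_bytes is made up, and it keeps the same "pairs of hex digits with optional whitespace" test as the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn '70 65 72 6C' or '7065726C' into raw bytes.
# Anything that does not look like pairs of hex digits is returned unchanged.
sub hex_to_bytes {
    my ($search) = @_;
    return $search unless $search =~ /^(?:[0-9A-F]{2}\s*)+$/i;
    $search =~ s/\s+//g;          # /g removes every run of whitespace
    return pack 'H*', $search;    # two hex digits become one byte
}

print hex_to_bytes('70 65 72 6C'), "\n";        # prints perl
print hex_to_bytes('7065726C'), "\n";           # prints perl
print hex_to_bytes('perl'), "\n";               # prints perl (passed through)
printf "%vd\n", hex_to_bytes('FF D8 FF E0');    # prints 255.216.255.224
```

One caveat of this kind of auto-detection: a plain-text search string made only of hex digits, such as 'cafe', is treated as hex codes, so a caller who means the literal text would need a separate way to say so.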

Peter Guzis
Web Administrator, Sr.
ENCAD, Inc.
- A Kodak Company
email: [EMAIL PROTECTED]
www.encad.com 

-----Original Message-----
From: Thad Schultz [mailto:[EMAIL PROTECTED]]
Sent: Friday, November 15, 2002 6:04 AM
To: Perl Win32 Users (E-mail)
Subject: Searching a binary file for a specific sequence of hex values?


What are the best ways to search a binary file for a specific sequence of
hex values?  The sequence that I'm looking for is: FF D8 FF E0 00 10 4A 46
49 46 00.  The files that I'm searching are 14K bytes in size.  I suppose
the easiest way would be slurp the whole file in at once and then search for
my sequence in some string variable.  But what if my files were huge?  Do I
read in one character at a time until I find an FF and then I look for the
D8 and then the rest of the hex values resetting my search if I fail to find
the next value in the sequence?  Or do I read in 1K of data at a time and
search that for my sequence?  Could some of you who have been down this road
before point me in the right direction?  I'm not asking you to write my code
for me (unless you want to).  I'm just looking for pointers and suggestions.

Thanks!

Thad Schultz
EDA Librarian / Sys Admin
Woodward Industrial Controls 
[EMAIL PROTECTED]
ph (970)498-3570
fax (970)498-3077
www.woodward.com



***
The information in this e-mail is confidential and intended solely for the
individual or entity to whom it is addressed. If you have received this
e-mail in error please notify the sender by return e-mail, delete this
e-mail, and refrain from any disclosure or action based on the information.




Searching a binary file for a specific sequence of hex values?

2002-11-15 Thread Thad Schultz
What are the best ways to search a binary file for a specific sequence of
hex values?  The sequence that I'm looking for is: FF D8 FF E0 00 10 4A 46
49 46 00.  The files that I'm searching are 14K bytes in size.  I suppose
the easiest way would be slurp the whole file in at once and then search for
my sequence in some string variable.  But what if my files were huge?  Do I
read in one character at a time until I find an FF and then I look for the
D8 and then the rest of the hex values resetting my search if I fail to find
the next value in the sequence?  Or do I read in 1K of data at a time and
search that for my sequence?  Could some of you who have been down this road
before point me in the right direction?  I'm not asking you to write my code
for me (unless you want to).  I'm just looking for pointers and suggestions.
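
For files of about 14K, the slurp approach mentioned above could look like the sketch below. The helper name find_in_small_file is invented for the example, and the file path is taken from the command line; only the byte sequence comes from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: read the whole file into memory and return the offset
# of the first occurrence of $needle, or -1 if it does not occur.
sub find_in_small_file {
    my ($path, $needle) = @_;
    open my $fh, '<:raw', $path or die "Could not read $path: $!\n";
    my $data = do { local $/; <$fh> };   # undefined $/ means "read to end of file"
    close $fh;
    $data = '' unless defined $data;
    return index $data, $needle;
}

# The sequence from the question: FF D8 FF E0 00 10 4A 46 49 46 00
my $jfif = pack 'H*', 'FFD8FFE000104A46494600';

# Usage: perl script.pl path/to/file
if (@ARGV) {
    my $pos = find_in_small_file($ARGV[0], $jfif);
    print $pos > -1 ? "Found at offset $pos\n" : "Not found\n";
}
```

This trades memory for simplicity: the whole file is held in one string, which is fine at this size but is the reason chunked reading comes up for huge files.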

Thanks!

Thad Schultz
EDA Librarian / Sys Admin
Woodward Industrial Controls 
[EMAIL PROTECTED]
ph (970)498-3570
fax (970)498-3077
www.woodward.com



