Re: [Boston.pm] script to "normalize" output of Windows dir command

2005-09-23 Thread Ben Tilly
On 9/23/05, Tolkin, Steve <[EMAIL PROTECTED]> wrote:
> I do have a port of Unix find on my current Windows machine.
> But I do not have that on the machine I back up to (my wife's), so I
> would need to install that, and its dependencies, which makes me
> reluctant to take that approach.

Are you reinventing the rsync wheel?

(Yeah, I know.  Getting the flags right can be a pain.)

Cheers,
Ben
 
___
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm


Re: [Boston.pm] script to "normalize" output of Windows dir command

2005-09-23 Thread Kripa Sundar
Dear Steve,

> Also date and time are combined into three fields, but the third is
> either time or year.  This makes it harder to process.  I would actually
> prefer time in seconds since the start of the Unix epoch.

IMHO, File::Find and stat() should solve your problems.

The following is a starting point for what you are looking for.

==\/BEGIN=\/==
% perl -wMwarnings -Mstrict -MFile::Find \
-le 'sub wanted {my $mod = localtime((stat)[9]); \
  my $size = -s _; print "$File::Find::name|$size|$mod";} \
  File::Find::find(\&wanted, "tmp")'
tmp|4096|Mon Jul 11 15:45:10 2005
tmp/apply|0|Fri Aug 27 13:31:26 2004
tmp/bogus|0|Fri Sep 10 14:24:49 2004
tmp/apple|0|Fri Aug 27 13:31:26 2004
tmp/xblend.pl,v|28810|Thu Jul 17 14:16:34 2003
tmp/bous|0|Fri Oct  8 10:39:19 2004
tmp/t|2880|Mon Jul 11 15:43:41 2005
tmp/t2|2880|Mon Jul 11 15:45:10 2005
%
==/\=END==/\==
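
For the pipe-delimited format Steve asked for, the one-liner above might be expanded along the following lines. This is only a sketch: the field order, the helper names split_ext and format_entry, and the extra trailing field holding the raw epoch seconds are assumptions of this example, not something anyone in the thread posted.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(basename dirname);
use POSIX qw(strftime);

# Split "foo.bar.txt" into ("foo.bar", "txt"); names without a dot
# (or with only a leading dot) get an empty extension.
sub split_ext {
    my ($name) = @_;
    return $name =~ /^(.+)\.([^.]*)$/ ? ($1, $2) : ($name, '');
}

# Build one pipe-delimited record:
# dir|name|ext|File-or-Dir|size|yyyy-mm-dd|hh:mm|epoch-seconds
sub format_entry {
    my ($path, $is_dir, $size, $mtime) = @_;
    my ($base, $ext) = split_ext(basename($path));
    my $stamp = strftime('%Y-%m-%d|%H:%M', localtime $mtime);
    return join '|', dirname($path), $base, $ext,
        ($is_dir ? 'Dir' : 'File'), $size, $stamp, $mtime;
}

# Walk one tree and print a record for every file and directory in it.
sub list_tree {
    my ($top) = @_;
    find({
        no_chdir => 1,
        wanted   => sub {
            my @st = stat $File::Find::name or return;  # skip unreadable entries
            my $is_dir = -d _;                          # reuses the stat above
            my $rec = format_entry($File::Find::name, $is_dir, $st[7], $st[9]);
            print "$rec\n";
        },
    }, $top);
}

# Only runs when a directory is named on the command line.
if (!caller && @ARGV) {
    list_tree($ARGV[0]);
}
```

Run as "perl listtree.pl tmp" (the script name is made up). Element 9 of stat() is already the modification time in seconds since the epoch, so the last field can be sorted or compared numerically without parsing the date.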

peace, || What can one hour achieve?
--{kr.pA}  || http://www.workanhour.com/
--
Kid, n.: A noise with dirt on it.
 


Re: [Boston.pm] script to "normalize" output of Windows dir command

2005-09-23 Thread Tom Metro
Jeremy Muhlich wrote:
> Also, diff -r might be helpful. ...

I'd strongly second that recommendation. I often use diff on Windows to 
verify file systems, such as burned CDs. (And prior to a diff port being 
available, I had a home brew script written in Perl that compared the 
checksum of files in two similarly structured file systems.)

As suggested you'll want to use --brief (or -q), so the command line 
would be something like:

diff -r --brief --binary dir1 dir2

You can find a copy of diff ported for Windows in "Unxutils," which is a 
collection of natively (no Cygwin libraries needed) ported GNU 
utilities. http://unxutils.sourceforge.net/

More importantly, this will give you a more meaningful comparison than 
simply looking at directory listings, because diff compares the file 
contents rather than just names, sizes, and dates.
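
For a machine with Perl but no diff port, a checksum comparison of two similarly structured trees could look roughly like the sketch below. It is not the home-brew script mentioned above; the function names checksums and compare_trees, the report wording, and the choice of MD5 are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use Digest::MD5;

# Map each regular file under $top (keyed by its path relative to $top)
# to the MD5 checksum of its contents.
sub checksums {
    my ($top) = @_;
    my %sum;
    find({
        no_chdir => 1,
        wanted   => sub {
            return unless -f $File::Find::name;
            open my $fh, '<', $File::Find::name or return;
            binmode $fh;                       # matters on Windows
            my $rel = File::Spec->abs2rel($File::Find::name, $top);
            $sum{$rel} = Digest::MD5->new->addfile($fh)->hexdigest;
            close $fh;
        },
    }, $top);
    return \%sum;
}

# Return report lines for files that exist in only one tree
# or whose contents differ between the two.
sub compare_trees {
    my ($dir1, $dir2) = @_;
    my $left  = checksums($dir1);
    my $right = checksums($dir2);
    my @report;
    for my $rel (sort keys %$left) {
        if (!exists $right->{$rel}) {
            push @report, "Only in $dir1: $rel";
        }
        elsif ($left->{$rel} ne $right->{$rel}) {
            push @report, "Files differ: $rel";
        }
    }
    for my $rel (sort keys %$right) {
        push @report, "Only in $dir2: $rel" unless exists $left->{$rel};
    }
    return @report;
}

# Only runs when two directories are named on the command line.
if (!caller && @ARGV == 2) {
    print "$_\n" for compare_trees(@ARGV);
}
```

Like diff -r --brief, this reads every file in both trees, so it will be slow over a network share.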


Steve Tolkin wrote:
> I do have a port of Unix find on my current Windows machine.

If you're in the midst of file recovery, I wouldn't recommend installing 
one, either. A better solution would be to share the machine's drive over 
the network, and run the comparison on the machine where you have the 
secondary copy of the files. It'll be slow, but it should be the least 
intrusive approach.

  -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: https://www.linkedin.com/e/fps/3452158/
 


Re: [Boston.pm] script to "normalize" output of Windows dir command

2005-09-23 Thread John Abreau
On Fri, 23 Sep 2005, Tolkin, Steve wrote:

> Here are a few lines from the output of
> \bin\find -print -ls
> 
>  945730 drwxr-xr-x   6 a071046  Administ        0 Sep 21 15:05 ./ant
> ./ant/bin
>  951240 drwxr-xr-x   2 a071046  Administ        0 Sep 21 15:05
> ./ant/bin
> ./ant/bin/ant
>  951283 -rwxr-xr-x   1 a071046  Administ 5140 Apr 16  2003
> ./ant/bin/ant
> ./ant/bin/ant.bat
> 
> Note each file is on two lines.  Probably that is the default for -ls.

Nope. It does that because you told it to. You told find that you wanted 
a -ls listing, and you also told it -print to just print the filename. If 
you did the -ls without the unnecessary -print, you'd just get one line 
per file. 

> Also date and time are combined into three fields, but the third is
> either time or year.  This makes it harder to process.  I would actually
> prefer time in seconds since the start of the Unix epoch.

That's the normal behavior of ls. I'm not sure offhand if there's a good 
alternative. 

> Also there is no easy way to distinguish Files from Directories except
> by further parsing of the permissions string, e.g. drwxr-xr-x.

If you only want one or the other, you can use -type:

find . -type f -ls    # f == files
find . -type d -ls    # d == directories

-- 
John Abreau / Executive Director, Boston Linux & Unix
ICQ 28611923 / AIM abreauj / JABBER [EMAIL PROTECTED] / YAHOO abreauj
Email [EMAIL PROTECTED] / WWW http://www.abreau.net / PGP-Key-ID 0xD5C7B5D9
PGP-Key-Fingerprint 72 FB 39 4F 3C 3B D6 5B E0 C8 5A 6E F1 2C BE 99

 


Re: [Boston.pm] script to "normalize" output of Windows dir command

2005-09-23 Thread Tolkin, Steve
I do have a port of Unix find on my current Windows machine.
But I do not have that on the machine I back up to (my wife's), so I
would need to install that, and its dependencies, which makes me
reluctant to take that approach.

I, like many people, have had problems with find, but I thought I would
try your suggestion.  There are "quirks" with the time reporting, and
probably other issues I have forgotten.  I do not know exactly how to
set the argument to -printf and it is not explained in the help (shown
below).  If you send an example I would try that.

Here are a few lines from the output of
\bin\find -print -ls

 945730 drwxr-xr-x   6 a071046  Administ        0 Sep 21 15:05 ./ant
./ant/bin
 951240 drwxr-xr-x   2 a071046  Administ        0 Sep 21 15:05
./ant/bin
./ant/bin/ant
 951283 -rwxr-xr-x   1 a071046  Administ 5140 Apr 16  2003
./ant/bin/ant
./ant/bin/ant.bat

Note each file is on two lines.  Probably that is the default for -ls.
Also date and time are combined into three fields, but the third is
either time or year.  This makes it harder to process.  I would actually
prefer time in seconds since the start of the Unix epoch.
Also there is no easy way to distinguish Files from Directories except
by further parsing of the permissions string, e.g. drwxr-xr-x.

Here is the help.  I cannot figure out how to suppress certain useless
fields e.g. inode and owner, nor put output on one line, etc.  

C:\foo>\bin\find -help
Usage: /bin/find [path...] [expression]
default path is the current directory; default expression is -print
expression may consist of:
operators (decreasing precedence; -and is implicit where no others are
given):
  ( EXPR ) ! EXPR -not EXPR EXPR1 -a EXPR2 EXPR1 -and EXPR2
  EXPR1 -o EXPR2 EXPR1 -or EXPR2 EXPR1 , EXPR2
options (always true): -daystart -depth -follow --help
  -maxdepth LEVELS -mindepth LEVELS -mount -noleaf --version -xdev
tests (N can be +N or -N or N): -amin N -anewer FILE -atime N -cmin N
  -cnewer FILE -ctime N -empty -false -fstype TYPE -gid N -group
NAME
  -ilname PATTERN -iname PATTERN -inum N -ipath PATTERN -iregex
PATTERN
  -links N -lname PATTERN -mmin N -mtime N -name PATTERN -newer FILE
  -nouser -nogroup -path PATTERN -perm [+-]MODE -regex PATTERN
  -size N[bckw] -true -type [bcdpfls] -uid N -used N -user NAME
  -xtype [bcdpfls]
actions: -exec COMMAND ; -fprint FILE -fprint0 FILE -fprintf FILE FORMAT
  -ok COMMAND ; -print -print0 -printf FORMAT -prune -ls

Thanks for the suggestion, but it is probably faster to write the perl
than to use find.
Steve


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeremy Muhlich
Sent: Friday, September 23, 2005 12:19 PM
To: boston-pm@mail.pm.org
Subject: Re: [Boston.pm] script to "normalize" output of Windows dir
command


How about the unix "find" command, with the -printf option?  You can get
it through cygwin.  Taking find's output (even without -printf) from two
directories and diffing it has gotten me through most of these sorts of
problems.

Also, diff -r might be helpful.  (possibly with the --brief option as
well)


 -- Jeremy


On Fri, 2005-09-23 at 11:55 -0400, Tolkin, Steve wrote:
> Summary:
> I would like a perl script that converts the output of the Windows dir
> command so that each line has the same format, including the directory

> C:\_from_laptop\AAA BBB_files|abc||File|123|2003-04-14|10:21
> C:\_from_laptop\AAA BBB_files|empty.jpg|txt|Dir|0|2003-04-14|23:00


 


Re: [Boston.pm] script to "normalize" output of Windows dir command

2005-09-23 Thread Jeremy Muhlich
How about the unix "find" command, with the -printf option?  You can get
it through cygwin.  Taking find's output (even without -printf) from two
directories and diffing it has gotten me through most of these sorts of
problems.

Also, diff -r might be helpful.  (possibly with the --brief option as
well)


 -- Jeremy


On Fri, 2005-09-23 at 11:55 -0400, Tolkin, Steve wrote:
> Summary:
> I would like a perl script that converts the output of the Windows dir
> command so that each line has the same format, including the directory

> C:\_from_laptop\AAA BBB_files|abc||File|123|2003-04-14|10:21
> C:\_from_laptop\AAA BBB_files|empty.jpg|txt|Dir|0|2003-04-14|23:00


 


[Boston.pm] script to "normalize" output of Windows dir command

2005-09-23 Thread Tolkin, Steve
Summary:
I would like a perl script that converts the output of the Windows dir
command so that each line has the same format, including the directory
it is in, and its extension.  The date and time should use a format that
can be sorted as a string, e.g. yyyy-mm-dd and a 24-hour clock.
I think pipe delimited would work best, as the pipe character | cannot
appear in a file name, and that would let me sort the output, and/or
load it into a database.

Details:
I could probably write this in an hour but laziness is a virtue, and if
someone has got one already that will probably be better anyway.
I want to translate lines like this:

 Directory of C:\_from_laptop\AAA BBB_files

04/14/2003  10:21 AM   123 abc
04/14/2003  11:00 PM 0 empty.jpg.txt

To lines something like this.  Note that I moved the file name and
extension sooner, so that the natural sort is by directory and file
name, and a sort on the last two fields is by time.  (I have a port of
Unix sort in my c:\bin\ directory that I can use.)

C:\_from_laptop\AAA BBB_files|abc||File|123|2003-04-14|10:21
C:\_from_laptop\AAA BBB_files|empty.jpg|txt|Dir|0|2003-04-14|23:00


None of it is tricky.  You just need to remember what Directory line you
saw last, convert the date and time fields, insert either File or Dir
depending on its type, and write out each line that comes from a file or
dir (except skip all the . and .. dirs).  Note that a file named
foo.bar.txt has a name of foo.bar and extension of txt.  Some files can
have no extension, and some directories do have an extension.
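
The steps above can be sketched as a small Perl filter. Treat it as a starting point rather than a finished tool: the function names and the state hash are inventions of this example, it assumes the US mm/dd/yyyy date and 12-hour clock shown in the sample lines, and it relies on dir marking directories with <DIR> in place of a size. Other markers (such as <JUNCTION> on newer Windows versions) are not handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split "foo.bar.txt" into ("foo.bar", "txt"); no dot means no extension.
sub split_ext {
    my ($name) = @_;
    return $name =~ /^(.+)\.([^.]*)$/ ? ($1, $2) : ($name, '');
}

# Convert ("04/14/2003", "11:00 PM") to ("2003-04-14", "23:00").
sub sortable_stamp {
    my ($date, $time) = @_;
    my ($mon, $day, $year) = split m{/}, $date;
    my ($hour, $min, $ampm) = $time =~ /^(\d+):(\d+) ([AP])M$/;
    $hour = 0 if $hour == 12;       # 12:xx AM is 00:xx, 12:xx PM stays 12:xx
    $hour += 12 if $ampm eq 'P';
    return ("$year-$mon-$day", sprintf('%02d:%s', $hour, $min));
}

# Turn one line of "dir /S /-C /4" output into a pipe-delimited record.
# $state->{dir} remembers the most recent "Directory of" line.
# Returns nothing for lines that do not describe a file or directory.
sub normalize_line {
    my ($line, $state) = @_;
    if ($line =~ /^\s*Directory of (.+?)\s*$/) {
        $state->{dir} = $1;
        return;
    }
    # date, time, then either <DIR> or a size in bytes, then the name
    my ($date, $time, $what, $name) = $line =~
        m{^(\d\d/\d\d/\d{4})\s+(\d\d:\d\d [AP]M)\s+(<DIR>|\d+)\s+(.+?)\s*$}
        or return;                  # boilerplate and totals lines
    return if $name eq '.' || $name eq '..';

    my ($base, $ext) = split_ext($name);
    my $is_dir = $what eq '<DIR>';
    my ($ymd, $hm) = sortable_stamp($date, $time);
    return join '|', $state->{dir}, $base, $ext,
        ($is_dir ? 'Dir' : 'File'), ($is_dir ? 0 : $what), $ymd, $hm;
}

# Only runs when input files are named on the command line.
if (!caller && @ARGV) {
    my %state = (dir => '');
    while (my $line = <>) {
        chomp $line;
        my $rec = normalize_line($line, \%state);
        print "$rec\n" if defined $rec;
    }
}
```

Something like "perl normdir.pl dir.txt >dir_norm.txt" (both file names are placeholders) would then produce the normalized listing. A line such as "04/14/2003  10:21 AM   123 abc" under "Directory of C:\_from_laptop\AAA BBB_files" comes out as C:\_from_laptop\AAA BBB_files|abc||File|123|2003-04-14|10:21.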

Here is an excerpt of the output.  (Because it is an excerpt the totals
for Files and Bytes are not right.)
Note that there are a few lines of boilerplate at the beginning which
can be ignored, and a few lines at the end which can be ignored (or used
as a sanity check on the totals.)  Note that a file might not have an
extension, that a file or directory can be empty, can have white space
and strange characters in its name.

 Volume in drive C has no label.
 Volume Serial Number is A898-B50D

 Directory of C:\_from_laptop

01/23/2005  08:37 AM    <DIR>          .
01/23/2005  08:37 AM    <DIR>          ..
04/14/2003  01:46 PM    <DIR>          _from_c
02/06/2001  01:34 PM             15618 0101.txt
02/06/2001  01:34 PM             15618 abc
04/14/2003  10:22 AM             32451 AAA BBB.htm
01/17/2005  09:53 AM    <DIR>          AAA BBB_files
04/04/2000  06:14 PM             27648 acm_pubform.doc
01/17/2005  09:53 AM    <DIR>          acrobat
01/17/2005  09:54 AM    <DIR>          address
08/17/2004  10:04 AM                 0 zzz
             650 File(s)       92010877 bytes

 Directory of C:\_from_laptop\AAA BBB_files

01/17/2005  09:53 AM    <DIR>          .
01/17/2005  09:53 AM    <DIR>          ..
04/14/2003  10:21 AM              1045 abc
04/14/2003  10:21 AM                 0 empty.jpg.txt
04/14/2003  10:22 AM             32451 AAA BBB CCC.htm
01/17/2005  09:53 AM    <DIR>          AAA BBB_CCC_files
04/14/2003  10:21 AM                43 spacer.gif
              11 File(s)          37476 bytes

 Directory of C:\_from_laptop\AAA BBB CCC_files

01/17/2005  09:53 AM    <DIR>          .
01/17/2005  09:53 AM    <DIR>          ..
               0 File(s)              0 bytes

     Total Files Listed:
          245909 File(s)    28969650933 bytes
          154376 Dir(s)     31272304640 bytes free


Background:
My laptop died a few days ago.  The process to recover files and
directories from it seems to have missed lots of files.  I have a
directory on another machine that I have been backing up to.  I want to
find out which files are missing.  I have run dir on the backed up
machine, and will run dir on the new machine, and then diff the outputs.
The diff will work best if each line in the file has the same format
and includes the full directory path.
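
Once both listings are in a pipe-delimited form with directory, name, and extension as the first three fields, the missing files could also be picked out directly instead of reading a diff. The sketch below assumes exactly that field order, and assumes the two trees sit under different root directories that have to be stripped before comparing; record_key and missing_records are hypothetical names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Key a record on directory (minus the tree's root prefix), name, and
# extension, lowercased because Windows file names are case-insensitive.
sub record_key {
    my ($record, $root) = @_;
    my ($dir, $name, $ext) = split /\|/, $record;
    $dir =~ s/^\Q$root\E//i;
    return lc join '|', $dir, $name, $ext;
}

# Return the records from @$backup that have no counterpart in @$recovered.
sub missing_records {
    my ($backup, $backup_root, $recovered, $recovered_root) = @_;
    my %seen;
    $seen{ record_key($_, $recovered_root) } = 1 for @$recovered;
    return grep { !$seen{ record_key($_, $backup_root) } } @$backup;
}

# Read a listing file into a list of chomped lines.
sub read_listing {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    chomp(my @lines = <$fh>);
    close $fh;
    return \@lines;
}

# Usage: missing.pl backup.txt BACKUP_ROOT recovered.txt RECOVERED_ROOT
if (!caller && @ARGV == 4) {
    my ($bfile, $broot, $rfile, $rroot) = @ARGV;
    print "$_\n"
        for missing_records(read_listing($bfile), $broot,
                            read_listing($rfile), $rroot);
}
```

Sizes and dates are deliberately left out of the key, so a file that was recovered with a new timestamp still counts as present.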

P.S. Here is the command I ran in a DOS box (aka command prompt window
etc.) from my Windows XP machine.

dir >dir.txt c:\_from_laptop /-C /ON /S /TW /4

The /-C means suppress the thousand separator in the size, /ON means
order by name, /S means recurse into subdirectories, /TW means show the
last time it was written, and /4 means show 4 digit years.



Thanks,
Steve
 