Re: [Boston.pm] script to "normalize" output of Windows dir command
On 9/23/05, Tolkin, Steve <[EMAIL PROTECTED]> wrote:
> I do have a port of Unix find on my current Windows machine.
> But I do not have that on the machine I back up to (my wife's), so I
> would need to install that, and its dependencies, which makes me
> reluctant to take that approach.

Are you reinventing the rsync wheel? (Yeah, I know. Getting the flags right can be a pain.)

Cheers,
Ben
___
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm
Re: [Boston.pm] script to "normalize" output of Windows dir command
Dear Steve,

> Also date and time are combined into three fields, but the third is
> either time or year. This makes it harder to process. I would actually
> prefer time in seconds since the start of the Unix epoch.

IMHO, File::Find and stat() should solve your problems. The following is a starting point for what you are looking for.

==\/BEGIN=\/==
% perl -wMwarnings -Mstrict -MFile::Find \
  -le 'sub wanted {my $mod = localtime((stat)[9]); \
  my $size = -s _; print "$File::Find::name|$size|$mod";} \
  File::Find::find(\&wanted, "tmp")'
tmp|4096|Mon Jul 11 15:45:10 2005
tmp/apply|0|Fri Aug 27 13:31:26 2004
tmp/bogus|0|Fri Sep 10 14:24:49 2004
tmp/apple|0|Fri Aug 27 13:31:26 2004
tmp/xblend.pl,v|28810|Thu Jul 17 14:16:34 2003
tmp/bous|0|Fri Oct  8 10:39:19 2004
tmp/t|2880|Mon Jul 11 15:43:41 2005
tmp/t2|2880|Mon Jul 11 15:45:10 2005
%
==/\=END==/\==

peace,      || What can one hour achieve?
--{kr.pA}   || http://www.workanhour.com/
-- 
Kid, n.: A noise with dirt on it.
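Since the request was for timestamps in seconds since the epoch and for an explicit File/Dir marker, here is a minimal sketch that extends the one-liner above into a small script. The sub name list_tree and the field order path|type|size|mtime are assumptions made for illustration, not something proposed in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Return one "path|type|size|mtime" record per entry under $top.
# mtime is the raw stat() value: seconds since the Unix epoch.
# (list_tree and this field order are hypothetical choices.)
sub list_tree {
    my ($top) = @_;
    my @records;
    find(sub {
        my @st = lstat($_) or return;          # skip entries that cannot be examined
        my $type = -d _ ? 'Dir' : 'File';      # "_" reuses the lstat() result
        my $size = $type eq 'Dir' ? 0 : $st[7];
        push @records, join '|', $File::Find::name, $type, $size, $st[9];
    }, $top);
    return @records;
}

# List every directory named on the command line; do nothing without arguments.
print "$_\n" for map { list_tree($_) } @ARGV;
```

Run it as "perl list_tree.pl tmp" (the script name is arbitrary). Because it needs only core modules, nothing beyond Perl itself has to be installed on the second machine.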
Re: [Boston.pm] script to "normalize" output of Windows dir command
Jeremy Muhlich wrote:
> Also, diff -r might be helpful. ...

I'd strongly second that recommendation. I often use diff on Windows to verify file systems, such as burned CDs. (And prior to a diff port being available, I had a home-brew script written in Perl that compared the checksums of files in two similarly structured file systems.)

As suggested, you'll want to use --brief (or -q), so the command line would be something like:

  diff -r --brief --binary dir1 dir2

You can find a copy of diff ported for Windows in "Unxutils," which is a collection of natively ported GNU utilities (no Cygwin libraries needed).

  http://unxutils.sourceforge.net/

More importantly, this will give you a more meaningful comparison than simply looking at directory listings.

Steve Tolkin wrote:
> I do have a port of Unix find on my current Windows machine.

If you're in the midst of file recovery, I wouldn't recommend installing one, either. A better solution would be to share the machine's drive over the network, and run the comparison on the machine where you have the secondary copy of the files. It'll be slow, but it should be the least intrusive approach.

 -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: https://www.linkedin.com/e/fps/3452158/
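For readers without a diff port, the checksum idea mentioned above can be sketched roughly as follows. This is not Tom's script, which was never posted; the sub names checksum_tree and compare_trees are hypothetical, and MD5 is assumed only because Digest::MD5 ships with Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

# Map each regular file under $top to its MD5 hex digest,
# keyed by the path relative to $top. (Hypothetical helper.)
sub checksum_tree {
    my ($top) = @_;
    my %sum;
    find({ no_chdir => 1, wanted => sub {
        return unless -f $_;
        open my $fh, '<', $_ or return;        # unreadable files are skipped
        binmode $fh;
        (my $rel = $_) =~ s{^\Q$top\E[\\/]*}{};
        $sum{$rel} = Digest::MD5->new->addfile($fh)->hexdigest;
        close $fh;
    } }, $top);
    return \%sum;
}

# Report files present in only one tree, and files whose contents differ.
sub compare_trees {
    my ($left, $right) = @_;
    my ($l, $r) = (checksum_tree($left), checksum_tree($right));
    my @report;
    for my $path (sort keys %$l) {
        if (!exists $r->{$path})            { push @report, "only in $left: $path" }
        elsif ($l->{$path} ne $r->{$path})  { push @report, "differs: $path" }
    }
    push @report, map { "only in $right: $_" }
                  grep { !exists $l->{$_} } sort keys %$r;
    return @report;
}

# Compare two directories given on the command line.
if (@ARGV == 2) {
    print "$_\n" for compare_trees(@ARGV);
}
```

Reading every byte of both trees is slow over a network share, so a listing-based comparison may still be the better first pass.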
Re: [Boston.pm] script to "normalize" output of Windows dir command
On Fri, 23 Sep 2005, Tolkin, Steve wrote:

> Here are a few lines from the output of
> \bin\find -print -ls
>
> 945730 drwxr-xr-x 6 a071046 Administ 0 Sep 21 15:05 ./ant
> ./ant/bin
> 951240 drwxr-xr-x 2 a071046 Administ 0 Sep 21 15:05
> ./ant/bin
> ./ant/bin/ant
> 951283 -rwxr-xr-x 1 a071046 Administ 5140 Apr 16 2003
> ./ant/bin/ant
> ./ant/bin/ant.bat
>
> Note each file is on two lines. Probably that is the default for -ls.

Nope. It does that because you told it to. You told find that you wanted a -ls listing, and you also told it -print to just print the filename. If you did the -ls without the unnecessary -print, you'd just get one line per file.

> Also date and time are combined into three fields, but the third is
> either time or year. This makes it harder to process. I would actually
> prefer time in seconds since the start of the Unix epoch.

That's the normal behavior of ls. I'm not sure offhand if there's a good alternative.

> Also there is no easy way to distinguish Files from Directories except
> by further parsing of the permissions string, e.g. drwxr-xr-x.

If you only want one or the other, you can use -type:

    find . -type f -ls      f == files
    find . -type d -ls      d == directories

-- 
John Abreau / Executive Director, Boston Linux & Unix
ICQ 28611923 / AIM abreauj / JABBER [EMAIL PROTECTED] / YAHOO abreauj
Email [EMAIL PROTECTED] / WWW http://www.abreau.net / PGP-Key-ID 0xD5C7B5D9
PGP-Key-Fingerprint 72 FB 39 4F 3C 3B D6 5B E0 C8 5A 6E F1 2C BE 99
Re: [Boston.pm] script to "normalize" output of Windows dir command
I do have a port of Unix find on my current Windows machine. But I do not have that on the machine I back up to (my wife's), so I would need to install that, and its dependencies, which makes me reluctant to take that approach.

I, like many people, have had problems with find, but I thought I would try your suggestion. There are "quirks" with the time reporting, and probably other issues I have forgotten. I do not know exactly how to set the argument to -printf, and it is not explained in the help (shown below). If you send an example I will try it.

Here are a few lines from the output of
\bin\find -print -ls

945730 drwxr-xr-x 6 a071046 Administ 0 Sep 21 15:05 ./ant
./ant/bin
951240 drwxr-xr-x 2 a071046 Administ 0 Sep 21 15:05 ./ant/bin
./ant/bin/ant
951283 -rwxr-xr-x 1 a071046 Administ 5140 Apr 16 2003 ./ant/bin/ant
./ant/bin/ant.bat

Note each file is on two lines. Probably that is the default for -ls.

Also date and time are combined into three fields, but the third is either time or year. This makes it harder to process. I would actually prefer time in seconds since the start of the Unix epoch.

Also there is no easy way to distinguish Files from Directories except by further parsing of the permissions string, e.g. drwxr-xr-x.

Here is the help. I cannot figure out how to suppress certain useless fields, e.g. inode and owner, nor how to put the output on one line, etc.

C:\foo>\bin\find -help
Usage: /bin/find [path...] [expression]
default path is the current directory; default expression is -print
expression may consist of:
operators (decreasing precedence; -and is implicit where no others are given):
      ( EXPR ) ! EXPR -not EXPR EXPR1 -a EXPR2 EXPR1 -and EXPR2
      EXPR1 -o EXPR2 EXPR1 -or EXPR2 EXPR1 , EXPR2
options (always true): -daystart -depth -follow --help
      -maxdepth LEVELS -mindepth LEVELS -mount -noleaf --version -xdev
tests (N can be +N or -N or N): -amin N -anewer FILE -atime N -cmin N
      -cnewer FILE -ctime N -empty -false -fstype TYPE -gid N -group NAME
      -ilname PATTERN -iname PATTERN -inum N -ipath PATTERN -iregex PATTERN
      -links N -lname PATTERN -mmin N -mtime N -name PATTERN -newer FILE
      -nouser -nogroup -path PATTERN -perm [+-]MODE -regex PATTERN
      -size N[bckw] -true -type [bcdpfls] -uid N -used N -user NAME
      -xtype [bcdpfls]
actions: -exec COMMAND ; -fprint FILE -fprint0 FILE -fprintf FILE FORMAT
      -ok COMMAND ; -print -print0 -printf FORMAT -prune -ls

Thanks for the suggestion, but it is probably faster to write the perl than to use find.

Steve

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeremy Muhlich
Sent: Friday, September 23, 2005 12:19 PM
To: boston-pm@mail.pm.org
Subject: Re: [Boston.pm] script to "normalize" output of Windows dir command

How about the unix "find" command, with the -printf option? You can get it through cygwin. Taking find's output (even without -printf) from two directories and diffing it has gotten me through most of these sorts of problems.

Also, diff -r might be helpful (possibly with the --brief option as well).

 -- Jeremy

On Fri, 2005-09-23 at 11:55 -0400, Tolkin, Steve wrote:
> Summary:
> I would like a perl script that converts the output of the Windows dir
> command so that each line has the same format, including the directory
> C:\_from_laptop\AAA BBB_files|abc||File|123|2003-04-14|10:21
> C:\_from_laptop\AAA BBB_files|empty.jpg|txt|Dir|0|2003-04-14|23:00
Re: [Boston.pm] script to "normalize" output of Windows dir command
How about the unix "find" command, with the -printf option? You can get it through cygwin. Taking find's output (even without -printf) from two directories and diffing it has gotten me through most of these sorts of problems.

Also, diff -r might be helpful (possibly with the --brief option as well).

 -- Jeremy

On Fri, 2005-09-23 at 11:55 -0400, Tolkin, Steve wrote:
> Summary:
> I would like a perl script that converts the output of the Windows dir
> command so that each line has the same format, including the directory
> C:\_from_laptop\AAA BBB_files|abc||File|123|2003-04-14|10:21
> C:\_from_laptop\AAA BBB_files|empty.jpg|txt|Dir|0|2003-04-14|23:00
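The "list both trees, then diff the listings" approach described above can also be done in Perl when no diff port is at hand. The following is only a sketch: read_lines and missing_from are hypothetical names, and it assumes both listings were generated relative to the top of each tree, so that identical files yield identical lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read a listing file into a list of lines without trailing newlines.
sub read_lines {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    chomp(my @lines = <$fh>);
    close $fh;
    return @lines;
}

# Return the lines of @$old that do not appear in @$new,
# preserving the order in which they occur in @$old.
sub missing_from {
    my ($old, $new) = @_;
    my %seen;
    @seen{@$new} = ();
    return grep { !exists $seen{$_} } @$old;
}

# Usage: perl missing.pl backup-listing.txt recovered-listing.txt
if (@ARGV == 2) {
    my @old = read_lines($ARGV[0]);
    my @new = read_lines($ARGV[1]);
    print "$_\n" for missing_from(\@old, \@new);
}
```

Unlike diff, this ignores line order, so the two listings do not need to be sorted the same way. If sizes or timestamps are part of each line, a changed file shows up as missing; compare on the path field alone if that is not wanted.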
[Boston.pm] script to "normalize" output of Windows dir command
Summary:
I would like a perl script that converts the output of the Windows dir command so that each line has the same format, including the directory it is in, and its extension. The date and time should use a format that can be sorted as a string, e.g. yyyy-mm-dd and a 24 hour clock. I think pipe delimited would work best, as the pipe character | cannot appear in a file name, and that would let me sort the output, and/or load it into a database.

Details:
I could probably write this in an hour, but laziness is a virtue, and if someone has one already, that will probably be better anyway.

I want to translate lines like this:

 Directory of C:\_from_laptop\AAA BBB_files
04/14/2003  10:21 AM               123 abc
04/14/2003  11:00 PM                 0 empty.jpg.txt

To lines something like this. Note that I moved the file name and extension sooner, so that the natural sort is by directory and file name, and a sort on the last two fields is by time. (I have a port of Unix sort in my c:\bin\ directory that I can use.)

C:\_from_laptop\AAA BBB_files|abc||File|123|2003-04-14|10:21
C:\_from_laptop\AAA BBB_files|empty.jpg|txt|Dir|0|2003-04-14|23:00

None of it is tricky. You just need to remember what Directory line you saw last, convert the date and time fields, insert either File or Dir depending on its type, and write out each line that comes from a file or dir (except skip all the . and .. dirs). Note that a file named foo.bar.txt has a name of foo.bar and extension of txt. Some files can have no extension, and some directories do have an extension.

Here is an excerpt of the output. (Because it is an excerpt, the totals for Files and Bytes are not right.) Note that there are a few lines of boilerplate at the beginning which can be ignored, and a few lines at the end which can be ignored (or used as a sanity check on the totals). Note that a file might not have an extension, and that a file or directory can be empty and can have white space and strange characters in its name.

 Volume in drive C has no label.
 Volume Serial Number is A898-B50D

 Directory of C:\_from_laptop

01/23/2005  08:37 AM    <DIR>          .
01/23/2005  08:37 AM    <DIR>          ..
04/14/2003  01:46 PM    <DIR>          _from_c
02/06/2001  01:34 PM             15618 0101.txt
02/06/2001  01:34 PM             15618 abc
04/14/2003  10:22 AM             32451 AAA BBB.htm
01/17/2005  09:53 AM    <DIR>          AAA BBB_files
04/04/2000  06:14 PM             27648 acm_pubform.doc
01/17/2005  09:53 AM    <DIR>          acrobat
01/17/2005  09:54 AM    <DIR>          address
08/17/2004  10:04 AM                 0 zzz
             650 File(s)       92010877 bytes

 Directory of C:\_from_laptop\AAA BBB_files

01/17/2005  09:53 AM    <DIR>          .
01/17/2005  09:53 AM    <DIR>          ..
04/14/2003  10:21 AM              1045 abc
04/14/2003  10:21 AM                 0 empty.jpg.txt
04/14/2003  10:22 AM             32451 AAA BBB CCC.htm
01/17/2005  09:53 AM    <DIR>          AAA BBB_CCC_files
04/14/2003  10:21 AM                43 spacer.gif
              11 File(s)          37476 bytes

 Directory of C:\_from_laptop\AAA BBB CCC_files

01/17/2005  09:53 AM    <DIR>          .
01/17/2005  09:53 AM    <DIR>          ..
               0 File(s)              0 bytes

     Total Files Listed:
          245909 File(s)    28969650933 bytes
          154376 Dir(s)     31272304640 bytes free

Background:
My laptop died a few days ago. The process to recover files and directories from it seems to have missed lots of files. I have a directory on another machine that I have been backing up to. I want to find out which files are missing. I have run dir on the backed up machine, and will run dir on the new machine, and then diff the outputs. The diff will work best if each line in the file has the same format, and includes the full directory path.

P.S. Here is the command I ran in a DOS box (aka command prompt window etc.) from my Windows XP machine.

dir >dir.txt c:\_from_laptop /-C /ON /S /TW /4

The /-C means suppress the thousand separator in the size, /ON means order by name, /S means recurse into subdirectories, /TW means show the last time it was written, and /4 means show 4 digit years.

Thanks,
Steve
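The steps described above (remember the last Directory line, convert the date and time, tag each entry as File or Dir, skip . and ..) can be sketched as follows. This is one possible reading of the request, not a script from the thread. It assumes US-style mm/dd/yyyy dates, a 12-hour clock with AM/PM, sizes without thousands separators (the /-C switch), and a literal <DIR> marker on directory lines. The sub name normalize_dir_listing is hypothetical, and directories are given a size of 0.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One entry line of "dir /-C /4" output (assumed layout, see the text above).
my $ENTRY = qr{
    ^(\d\d)/(\d\d)/(\d{4}) \s+      # date as mm/dd/yyyy
    (\d\d):(\d\d) \s+ ([AP])M \s+   # 12-hour time
    (<DIR>|\d+) \s+                 # directory marker or size in bytes
    (.+?) \s* $                     # name, which may contain spaces
}x;

# Turn raw dir output lines into records of the form
#   directory|name|extension|File-or-Dir|size|yyyy-mm-dd|hh:mm
sub normalize_dir_listing {
    my @lines = @_;
    my $dir = '';
    my @out;
    for my $line (@lines) {
        $line =~ s/\r?\n\z//;
        # Remember the most recent "Directory of ..." header.
        if ($line =~ /^\s*Directory of (.+?)\s*$/) {
            $dir = $1;
            next;
        }
        # Anything that is not an entry line (boilerplate, totals) is ignored.
        next unless $line =~ $ENTRY;
        my ($mon, $day, $year, $hour, $min, $ampm, $size, $name)
            = ($1, $2, $3, $4, $5, $6, $7, $8);
        next if $name eq '.' or $name eq '..';
        # Convert to a 24-hour clock: 12 AM is 00, 12 PM stays 12.
        $hour = $hour % 12;
        $hour += 12 if $ampm eq 'P';
        my $type = $size eq '<DIR>' ? 'Dir' : 'File';
        $size = 0 if $type eq 'Dir';
        # The extension is whatever follows the last dot, if there is one.
        my ($base, $ext) = $name =~ /^(.+)\.([^.]*)$/ ? ($1, $2) : ($name, '');
        push @out, join '|', $dir, $base, $ext, $type, $size,
            "$year-$mon-$day", sprintf('%02d:%02d', $hour, $min);
    }
    return @out;
}

# Small demonstration using lines from the excerpt above.
my @sample = split /\n/, <<'END';
 Directory of C:\_from_laptop\AAA BBB_files

01/17/2005  09:53 AM    <DIR>          .
04/14/2003  10:21 AM              1045 abc
04/14/2003  11:00 PM                 0 empty.jpg.txt
01/17/2005  09:53 AM    <DIR>          AAA BBB_CCC_files
              11 File(s)          37476 bytes
END
print "$_\n" for normalize_dir_listing(@sample);
```

To process a real listing, replace the demonstration with "print "$_\n" for normalize_dir_listing(<>);" and run the script as "perl normalize.pl dir.txt" (the script name is arbitrary). A name that starts with a dot and contains no other dot is treated as having no extension.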