UNIX utilities in Perl
Hi, I am not sure if this is the right list for this question. I suspect it belongs on a list about CPAN, but the CPAN site doesn't appear to mention one. I am working on Windows, and find I keep writing very short Perl scripts to implement UNIX utilities. I call them zap_grep, zap_wc, zap_unique, zap_diff, zap_sort. My versions are very crude compared with the UNIX originals, and I wondered if somebody had already gone through this exercise. A search of CPAN came up with "prep" (Perl grep), "wc", and "uniq". Are these modules collected together anywhere? Should they be? What additional UNIX utilities would be useful? (Above, I suggest diff and sort.) What would be *really* useful is a method to pipe filters together. Regards, Martin ___ Perl-Win32-Users mailing list [EMAIL PROTECTED] To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
How remove newlines?
Hi,

A nice easy one for a Friday afternoon. I have a variable containing text in which I wish to convert embedded newlines to spaces, and I wish to do this in a way that is portable. I have come up with:

    $tempValue =~ s/\r\n/ /g;
    $tempValue =~ s/\n/ /g;

I believe this pair of statements will work on both Windows and UNIX, and with text originating on both Windows and UNIX. I welcome improvements. Also, how do I accommodate Macs?

Many thanks,
Martin
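One possible answer to the Mac question, offered as a sketch rather than a tested recommendation: classic Mac OS text uses a bare carriage return as its line ending, so a single substitution that lists CRLF, CR, and LF covers all three conventions. The helper name `newlines_to_spaces` is hypothetical and not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn every line ending,
# whatever its origin, into a single space.
sub newlines_to_spaces {
    my ($text) = @_;
    # \015 is CR and \012 is LF. CRLF is listed first so that a Windows
    # line ending becomes one space rather than two.
    $text =~ s/\015\012|\015|\012/ /g;
    return $text;
}

print newlines_to_spaces("one\015\012two\012three\015four"), "\n";
```

The octal escapes are used instead of `\r` and `\n` so that the pattern means the same bytes on every platform.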
Re: Date format patterns
"$Bill Luebkert" <[EMAIL PROTECTED]> wrote:
> Jim Hill wrote:
>> I see where you're going, I think, but the problem is that I can't predict the date format which other users will retrieve from their registries. A lot of variations can be generated using different separators, eg dd-mm- and dd/mm/, and different orders, eg mm/dd/yy and /mm/dd, and the solution needs to be able to accommodate them all.
> ...
> If you can't predict the date format, I don't see how you can convert it into a standard format. Date::Manip or Date::Calc may have more of a chance, but it would still be a guess if the format is unknown. You would at least have to know which field is which - the delimiters can be handled with a regex.

I have to do this all the time. Bill is correct that it is impossible to know whether the single date 06/07/1999 means July 06 or June 07. However, given a set of such dates, and assuming they are consistent, it is possible to use heuristic rules (such as 07/22/1999 has to be mm/dd/, so everything else is). If Jim can extract a set of dates from the registry then he might be able to do this.

The other thought which struck me was whether it would be possible to use stuff in the registry to tell the locale, and then to use the locale to predict the date format.

Best of luck,
Martin
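The heuristic described above can be sketched roughly as follows. This is an illustration only: it assumes numeric dates with the year last, one of `-`, `/`, or `.` as the separator, and a set of dates that all share one format. The function name `guess_order` and its return values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: look at a consistent set of dates and decide whether
# the first field is the month ('mdy') or the day ('dmy'). Returns nothing
# when no date in the set settles the question.
sub guess_order {
    my @dates = @_;
    my ($first_max, $second_max) = (0, 0);
    for my $date (@dates) {
        my ($first, $second) =
            $date =~ m{^\s*(\d{1,2})[-/.](\d{1,2})[-/.]\d+\s*$}
            or next;    # skip anything that does not look like a date
        $first_max  = $first  if $first  > $first_max;
        $second_max = $second if $second > $second_max;
    }
    # A field that ever exceeds 12 cannot be the month.
    return 'mdy' if $second_max > 12 && $first_max <= 12;
    return 'dmy' if $first_max > 12 && $second_max <= 12;
    return;
}

my $order = guess_order('06/07/1999', '07/22/1999');
print defined $order ? $order : 'ambiguous', "\n";
```

If every date in the set has both fields at 12 or below, the set stays ambiguous and some other clue, such as the locale, is still needed.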
RE: [PHP] people/projects looking for developers...
"bruce" <[EMAIL PROTECTED]> wrote:
> interesting... for the most part, people have responded with contract resource sites... (guru.com/elance.com/etc)
> no one has mentioned any kind of site specifically geared towards people who want to come together to kind of build applications. ala the old garage/basement type of process where you get a few guys together with a few sales guys, and they build/start to sell the app, and create a biz...
> so my question i guess, is why the hell isn't there more of an interest in this kind of atmosphere.. have people just been too dam* burned too many times, or are people really just satisfied with their jobs

What you seem to have in mind is technology-driven development. This was fine during the dot.com boom, but that went bust. Venture capital companies will no longer fund a good idea; good ideas are ten a penny. They fund only good business propositions, and these are market led. That is to say, find out what product will sell, get a group of customers to sign up, and then develop it.

Try reading "A Good Hard Kick in the Ass: Basic Training for Entrepreneurs" by Rob Adams. The author used to work for venture capital companies, and now runs a consultancy preparing start-ups for their initial encounter with a VC company. He describes the process as going out and finding people in pain. You then develop a product to remove their pain (for a price).

This is off topic, so maybe continue it off-line.

Regards,
Martin
Re: simple regex question
"bruce" <[EMAIL PROTECTED]> wrote:
> hi.. a simple/basic/embarrassingly simple one... i have:
>     $foo = "foo.txt"
> i simply want to separate on the "." ie
>     $foo =~ /([^.]+).txt/
>     $ans = $1
> this doesn't seem to get $ans = 'foo'

Try:

    use File::Basename;
    #
    my $foo = "foo.txt";
    my $ans = fileparse($foo, ('.txt'));
    print "$ans\n";

Regards,
Martin
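For comparison, here is a minimal sketch of both routes side by side. It assumes the input is a bare file name such as "foo.txt". In the regex, the dot before "txt" is escaped so that it matches only a literal "."; in the `fileparse` call, the suffix pattern is the one shown in the File::Basename documentation. The variable names are illustrative.

```perl
use strict;
use warnings;
use File::Basename;

my $foo = "foo.txt";

# Regex route: capture everything before a literal ".txt" at the end.
my ($stem) = $foo =~ /^([^.]+)\.txt$/;

# File::Basename route: in list context fileparse returns the name,
# the directory part, and whatever suffix the pattern matched.
my ($name, $dir, $suffix) = fileparse($foo, qr/\.[^.]*/);

print "regex: $stem\n";
print "fileparse: $name (suffix $suffix)\n";
```

The `fileparse` version works for any extension, while the regex as written accepts only ".txt".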
Re: Reading Locked PDF files
"ashish srivastava" <[EMAIL PROTECTED]> wrote:
> Hi, Has anybody read a .pdf file from a perl script? I am trying to use PDF::Extract module, but not able to get the o/p. Please let me know if any one has written similar scripts?

I tried to do this in Perl, gave up, and used "pdftotext.exe -layout" from the free Swish-e package, http://swish-e.org/

Having converted the PDF file to text, I then used Perl to unpick it.

"If at first you don't succeed, try, try again. Then quit. No use being a damn fool about it." [Attributed to W C Fields]

Regards,
Martin
Re: problems with text encoding
"R. Sánchez" <[EMAIL PROTECTED]> wrote:
> Hi group. I've installed ActivePerl on win XP and I'm having some problems. I've tried documentation at activestate but found nothing on this topic.
> When a variable contains a non-english character (for ex. á [that's á if you don't see it]) and the script prints that variable to the screen, I get a lousy character, as if the cmd shell didn't support this character. But I can type it directly in a shell prompt and it shows ok.
> I've tried playing with locale (use locale;) and perl uses Spanish_Spain.1252 (es_es), so those characters should print out. Windows locale is also set to spanish.

I don't understand what "perl uses Spanish_Spain.1252 (es_es)" means.

Internally, Perl uses the UTF-8 character encoding. As far as I can tell, you have Windows XP set up to use the Windows-1252 character encoding; this is almost the same as ISO-8859-1 (the two differ only in the range 0x80 to 0x9F). So, your problem is how to convert from ISO-8859-1 (also called Latin-1) to UTF-8. Look on CPAN http://search.cpan.org/ for "Encode" and then try using:

    $string = decode("iso-8859-1", $octets);

Regards,
Martin
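A slightly fuller sketch of the Encode round trip follows. It rests on two assumptions that the thread does not confirm: that the script's data arrives as Windows-1252 bytes, and that the cmd shell displays code page 850. The `decode` and `encode` functions and the encoding names are part of the Encode module; the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "á" as the single byte it occupies in Windows-1252 (and ISO-8859-1).
my $octets = "\xE1";

# Bytes in, Perl character string out.
my $string = decode("cp1252", $octets);

# Character string out to specific byte encodings.
my $utf8 = encode("UTF-8", $string);   # two bytes
my $oem  = encode("cp850", $string);   # one byte, assuming a cp850 console

printf "UTF-8 bytes: %s\n", join ' ', map { sprintf '%02X', ord } split //, $utf8;
printf "cp850 bytes: %s\n", join ' ', map { sprintf '%02X', ord } split //, $oem;
```

If the console really does use a different code page from the rest of Windows, encoding the output for that code page before printing is one way to get the expected glyph.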
Re: a regex to remove characters...
"bruce" <[EMAIL PROTECTED]> wrote:
> hi... need to know if there's a quick regex to remove everything after a given character. ie
>     $foo = http://ww.rr.bb/cat/dog/index.html
> or
>     $foo = http://ww.rr.bb/cat/dog/ball/abc.html
> i'd like to know how to remove everything after the last "/"...

Not a regex, but try:

    #!/usr/bin/perl
    #
    use File::Basename;
    #
    use strict;
    use warnings;
    #
    my $foo = "http://ww.rr.bb/cat/dog/index.html";
    my (undef, $directory, undef) = fileparse($foo);
    print $directory;
    #
    exit(0);

Regards,
Martin
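Since the question asked for a regex, here is one possible version as a sketch. It simply deletes the run of non-slash characters at the end of the string, so it treats the URL as plain text and knows nothing about query strings or fragments. The helper name `strip_after_last_slash` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: drop everything after the last "/".
sub strip_after_last_slash {
    my ($url) = @_;
    # [^/]* matches the trailing run of non-slash characters (possibly
    # empty), and \z pins the match to the very end of the string.
    $url =~ s{[^/]*\z}{};
    return $url;
}

print strip_after_last_slash("http://ww.rr.bb/cat/dog/index.html"), "\n";
print strip_after_last_slash("http://ww.rr.bb/cat/dog/ball/abc.html"), "\n";
```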
RE: converting dates..
"Perl Developer" <[EMAIL PROTECTED]> wrote:
> You can always use Date::Format. If you need to convert it from the string first ("June 02, 2004"), you might want to look into Date::Manip -- namely the ParseDate function. Here is the URL on cpan:
> Date::Manip (ParseDate function) http://search.cpan.org/~sbeck/DateManip-5.42a/Manip.pod#ROUTINES
> Date::Format http://search.cpan.org/~gbarr/TimeDate-1.16/lib/Date/Format.pm
> HTH, Chris Snyder
>
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of George Gallen
> Sent: Thursday, June 10, 2004 12:31 PM
> To: Perl Win32
> Subject: converting dates..
>
> I have a date that looks like: June 02, 2004
> Is there a perl function that will convert it to 06/02/2004 or 0602004 ? or if not a module that converts.
> Thanks George

I needed to do this the other day. It was so simple it never occurred to me to use a module. In general, using modules is preferable. Here's the code I used (use at your own risk):

    sub parseDate
    {
        my ($date) = @_;
        #
        if ( my ($tempMonth, $day, $year) =
                 $date =~ m/^\s*(\w+)\s+(\d\d?),\s+(\d\d\d?\d?)\s*$/ )
        {
            $day = sprintf("%02d", $day);  # ensure day is two digits
            # Ensure year is four digits
            if ($year <= 1900)
            {
                if ($year > 5) { $year += 1900; }
                else           { $year += 2000; }
            }
            my $month = &convertMonth($tempMonth);
            #
            my $oldDate = $date;
            $date = $year . $month . $day;
            print "$0: WARNING: Date parsed from \"", $oldDate, "\" to \"", $date, "\"\n";
        }
        #
        return $date;
    }

    sub convertMonth
    {
        my ($month) = @_;
        #
        my $number;
        # /i = ignore case
        if    ($month =~ m/^January$/i)   { $number = "01"; }
        elsif ($month =~ m/^February$/i)  { $number = "02"; }
        elsif ($month =~ m/^March$/i)     { $number = "03"; }
        elsif ($month =~ m/^April$/i)     { $number = "04"; }
        elsif ($month =~ m/^May$/i)       { $number = "05"; }
        elsif ($month =~ m/^June$/i)      { $number = "06"; }
        elsif ($month =~ m/^July$/i)      { $number = "07"; }
        elsif ($month =~ m/^August$/i)    { $number = "08"; }
        elsif ($month =~ m/^September$/i) { $number = "09"; }
        elsif ($month =~ m/^October$/i)   { $number = "10"; }
        elsif ($month =~ m/^November$/i)  { $number = "11"; }
        elsif ($month =~ m/^December$/i)  { $number = "12"; }
        else                              { $number = $month; }
        #
        return $number;
    }

Regards,
Martin
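A more compact variant, shown only as a sketch: it replaces the chain of month tests with a hash lookup and produces the 06/02/2004 form that George asked for. It assumes full English month names and a four-digit year; the names `%MONTH_NUMBER` and `long_date_to_numeric` are hypothetical and not from the thread.

```perl
use strict;
use warnings;

# Lower-cased month name => month number, built with a hash slice.
my %MONTH_NUMBER;
@MONTH_NUMBER{qw(january february march april may june july
                 august september october november december)} = (1 .. 12);

# Hypothetical helper: "June 02, 2004" => "06/02/2004".
# Returns nothing if the text is not in that shape.
sub long_date_to_numeric {
    my ($date) = @_;
    my ($name, $day, $year) =
        $date =~ m/^\s*([A-Za-z]+)\s+(\d{1,2}),\s*(\d{4})\s*$/
        or return;
    my $month = $MONTH_NUMBER{lc $name} or return;   # unknown month name
    return sprintf("%02d/%02d/%04d", $month, $day, $year);
}

print long_date_to_numeric("June 02, 2004"), "\n";
```

For anything beyond this one fixed layout, the modules mentioned earlier in the thread remain the safer choice.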
Re: perl/spider/crawling question...
"bruce" <[EMAIL PROTECTED]> wrote:
> hi... we're looking at creating a project/app to extract information from university websites. we know we can write a separate individual perl app/script for each school which would crawl/parse/extract the information we need. however, we'd rather not write a unique perl script for each school if there is a better/more efficient way.
> anybody have any good suggestions, preferably with code samples!!
> thanks for any help/assistance/pointers/etc...

If you unleash a spider, don't forget to build in support for avoiding pages when requested to do so. Visit: http://www.robotstxt.org/wc/exclusion.html

Also, a good place to start would be The Web Robots FAQ at http://www.robotstxt.org/wc/faq.html

This suggests a book or three on spiders.

Regards,
Martin
Re: undef example
"Rob Dixon" <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
>>     &closeFtpConn ( $MVSFTP );
> Don't use the ampersand notation unless you specifically need it. If you don't know what it does differently then you don't need it!
>     closeFtpConn($MVSFTP);
> is the syntax to use.

Why? To quote "Programming Perl": The official name of a subroutine includes the & prefix. A subroutine may be called using the prefix, but the & is usually optional.

When used with parentheses, the & does nothing differently. Including it makes clear that the sub is user-defined.

Regards,
Martin
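One case where the ampersand does change behaviour is a call written without parentheses: `&name;` hands the caller's current `@_` to the called sub, whereas `name()` passes an empty list. (The ampersand form also bypasses any prototype on the sub.) The sketch below demonstrates the first point; all sub names are illustrative.

```perl
use strict;
use warnings;

# Reports how many arguments it received.
sub count_args { return scalar @_; }

# "&count_args;" with no parentheses passes this sub's own @_ along.
sub call_with_bare_ampersand { return &count_args; }

# "count_args()" passes an empty argument list.
sub call_with_parentheses { return count_args(); }

print call_with_bare_ampersand(1, 2, 3), "\n";   # prints 3
print call_with_parentheses(1, 2, 3), "\n";      # prints 0
```

For a call such as the one in the thread, which does supply parentheses and an explicit argument, the two spellings behave the same unless the sub has a prototype.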
searching a file for keywords
Craig Cardimon <[EMAIL PROTECTED]> wrote:
> I'm searching a text file for keywords. These keywords are stored in an array. First, I read the file into a scalar variable. Then, I search for each keyword, cycling through the array in an outer foreach loop:
>     foreach $keyword (@keywords)
> The keywords should be in all caps as a heading -- all by themselves on a line. I have noticed they can be either centered, or right or left justified. I thought this would do the trick:
>     if($wholefile =~ /^\s*$keyword\s*$/)
> Start and end of line: ^ and $
> Zero or more whitespace on either side: \s*
> But it doesn't work. Won't pull up anything. I'm doing something dumb here. This works, but is not as specific as I would like:
>     if($wholefile =~ m/$keyword/)
> Any suggestions?

Try (untested):

    if($wholefile =~ /^\W*$keyword\W*$/)

If you don't have whitespace fore and aft then maybe you have some other nonword character(s).

Regards,
Martin
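Another possible cause is worth checking, given that the whole file is held in one scalar: without the `/m` modifier, `^` and `$` anchor only at the start and end of the entire string, not at each line. The sketch below adds `/m` and also quotes the keyword with `\Q...\E` in case it contains regex metacharacters. The helper name `has_heading` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $keyword sits alone on some line of
# $wholefile, with optional whitespace on either side.
sub has_heading {
    my ($wholefile, $keyword) = @_;
    # /m lets ^ and $ match at every line boundary inside the string.
    return $wholefile =~ /^\s*\Q$keyword\E\s*$/m ? 1 : 0;
}

my $text = "Intro paragraph.\n      SUMMARY   \nThe SUMMARY follows.\n";
print has_heading($text, "SUMMARY") ? "found\n" : "not found\n";
```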
Re: help needed
Jeff Griffiths <[EMAIL PROTECTED]> wrote:
> ...
> Here is a link to a free online book targeted at new Perl developers: http://learn.perl.org/library/beginning_perl/

Here is another free online book: http://www.ebb.org/PickingUpPerl/

I like Picking Up Perl because the book is available both in separate chapters and as a single lump.

Regards,
Martin
small text files give empty socket?
michael higgins <[EMAIL PROTECTED]> wrote:
> Hey, all -- I've been trying to write a small http server. It seems to work fine, but for one big problem I just can't get around. I'm hoping someone will know why this happens:
> When I read a _small_ text file, I can't get it to print to the socket, and/or receive on the other end. Here's the snip to illustrate. Again, any files print to the socket and are read fine, but for _small_ text files. Small binary files are fine, but no way of reading the file and printing to the socket has worked for the small text (html, etc.) files.
> -- mike higgins

You might want to try:

    use FileHandle;
    STDOUT->autoflush(1);

From the perlvar page: Note that STDOUT will typically be line buffered if output is to the terminal and block buffered otherwise. Setting this variable is useful primarily when you are outputting to a pipe, such as when you are running a Perl script under rsh and want to see the output as it's happening. This has no effect on input buffering. (Mnemonic: when you want your pipes to be piping hot.)

Regards,
Martin
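The same `autoflush` method can be applied to any filehandle, not only STDOUT. In the sketch below a temporary file stands in for the socket, which is an assumption made purely so that the example runs anywhere; the point is that a short write reaches its destination before the handle is closed.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# A temporary file stands in for the socket in this sketch.
my ($out, $path) = tempfile(UNLINK => 1);

# Flush after every print instead of waiting for the buffer to fill.
$out->autoflush(1);

print {$out} "short\n";

# Because of autoflush, the data is already on disk before close().
my $size_before_close = -s $path;
print "bytes written before close: $size_before_close\n";

close($out) or die "can't close $path: $!";
```

Without the `autoflush` call, a write this small would normally sit in the buffer until the handle is closed or the buffer fills.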
RE to split on every other comma
Hi,

I am trying to convert "Smith, A.B., Jones, C.D. and Doe, E.F." into:

    "Smith, A.B."
    "Jones, C.D."
    "Doe, E.F."

That is to say, I need to split on every other ','. I have code which works, but there must be a more elegant way. Here is what I have:

    #! /usr/local/bin/perl
    #
    use strict;
    use warnings;
    use diagnostics;
    #
    my @originators = ();
    my $cellValue = "Smith, A.B., Jones, C.D. and Doe, E.F.";
    # This could also be "Smith, A.B., Jones, C.D., Doe, E.F."
    #
    my @tempArray = split(m/,| and /, $cellValue);  # split on ',' or ' and '
    # Unfortunately, this splits too finely, so knit back together again
    for (my $loop=0; $loop<@tempArray; $loop+=2)
    {
        $originators[$loop / 2] = $tempArray[$loop] . "," . $tempArray[$loop + 1];
        ###print STDERR "tempArray[$loop] = \"", $tempArray[$loop], "\"\n";  ### DEBUG ###
        ###print STDERR "tempArray[$loop+1] = \"", $tempArray[$loop+1], "\"\n";  ### DEBUG ###
        print STDERR "originators[$loop/2] = \"", $originators[$loop/2], "\"\n";  ### DEBUG ###
    }
    #
    exit(0);

I strip leading and trailing white space elsewhere, so that is not a problem.

Many thanks for any help you can give.

Regards,
Martin
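One alternative, sketched here under the assumption that every entry has exactly the form "Surname, Initials": rather than splitting and rejoining, match each surname-and-initials pair directly with a global match. The helper name `split_originators` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return each "Surname, Initials" pair as one element.
sub split_originators {
    my ($cellValue) = @_;
    # Capture: surname, a comma, then as little as possible up to the next
    # separator, which is a comma, the word "and", or the end of the string.
    return $cellValue =~ /([^,]+,[^,]+?)(?:,\s*|\s+and\s+|$)/g;
}

print "\"$_\"\n" for split_originators("Smith, A.B., Jones, C.D. and Doe, E.F.");
```

The same pattern handles the all-commas form "Smith, A.B., Jones, C.D., Doe, E.F." because the separator alternation accepts either a comma or " and ".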
Re: Perl-Win32-Users Digest, Vol 2, Issue 14
"Eric Edwards" <[EMAIL PROTECTED]> wrote:
>> Hello list,
>> I added a -w and -e and both return false.
>> But the file is created and out there.
>> When I change the format to STDOUT it prints the format to the screen just as it should. But it won't write the format to the file because it can't find the file.
>> Any suggestions?
>> Thanks much!
>> Eric
>> #!/usr/bin/perl -w
>> use strict;
>> use warnings;
>> my ($name, $address, $city, $state, $zip);
>> my $filename = "c:\\labels.txt";
>> open(ADDRESSLABEL, ">>c:\\labels.txt") or
>>    die "can't create";
>> open(ADDRESSES, "c:\\perl_prgm\\addresses.txt") or
>>    die "cannot open addresses";
>> die "cam't find" if -e $filename;
>> while (<ADDRESSES>) {
>>    chomp;
>>    ($name, $address, $city, $state, $zip) = split(/:/);
>>    write ADDRESSLABEL;
>> }

Haven't tried this, but try closing the output file.

    close(ADDRESSLABEL) or die "can't close ADDRESSLABEL: $!";

>> print "Done!";
>> exit;

Regards,
Martin
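Two details in the quoted script may be worth a second look, in addition to the missing close. First, `die ... if -e $filename` stops the script when the file does exist, which looks like the opposite of the intended check. Second, including `$!` in each message shows why an open failed. The sketch below illustrates both points using a temporary directory and a made-up label line, so the paths and data are assumptions, not Eric's.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# A scratch directory stands in for c:\ in this sketch.
my $dir      = tempdir(CLEANUP => 1);
my $filename = File::Spec->catfile($dir, 'labels.txt');

# Three-argument open with a lexical handle; $! reports the reason on failure.
open(my $label_fh, '>>', $filename) or die "can't create $filename: $!";

# "unless -e" dies only when the file is missing.
die "can't find $filename" unless -e $filename;

print {$label_fh} "A. Person, 1 Example Street\n";   # made-up label text

# Closing flushes the buffered output to disk.
close($label_fh) or die "can't close $filename: $!";

my $size = -s $filename;
print "wrote $size bytes\n";
```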
Simple text editor under Windows
Hi,

I don't want to start a huge flame war with this question, so please exercise restraint.

I have just started to use Perl under Windows NT. I have been using Wordpad to edit my perl scripts simply because it is there (and Notepad doesn't have search and replace). My problem is that perl error messages say "error at line 187", and if I try to count down that far then I get lost.

So, what is a nice simple editor that will allow me to jump to line 187? Note the word "simple"; if Wordpad can be made to do this then that would be the preferred solution. I could use MS Word (Edit->Find->Go To->Line) but this seems like overkill.

My background is UNIX under which I used "vi" to edit text files. I can probably find a port of "vi" to Windows NT, but wondered what others use.

Many thanks for any help you can give.

Regards,
Martin
searching for ppms
Randy Kobes <[EMAIL PROTECTED]> wrote:
> Hi, As part of a redesign of our CPAN search engine at http://cpan.uwinnipeg.ca/htdocs/cpan-search.html, information reported on modules and distributions searched for now includes that of available Win32 ppm packages. Right now it includes packages on the ActiveState, crazyinsomniac, and theoryx5 repositories (both 6xx and 8xx builds) -

Please, is the URL I put into PPM for crazyinsomniac build 8xx the following?

    http://crazyinsomniac.perlmonk.org/perl/ppm/5.8/

Still struggling a little with PPM.

Many thanks,
Martin
Re: installing modules on Unix server
Howard Owen <[EMAIL PROTECTED]> wrote:
> perl -MThe::Module::Name
>
> On Fri, 2004-02-27 at 11:55, Lori wrote:
>> My main question: Is there a way to tell if I already have these modules before I attempt to install them ?

    PPM query *

I have not used Perl under UNIX, so this may not work.

Regards,
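The same check can be made from inside a script, which works the same way on UNIX and Windows. The sketch below tries to load the module inside `eval` and reports whether that succeeded; the helper name `module_installed` is hypothetical, and the second module name is deliberately one that should not exist.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded.
sub module_installed {
    my ($module) = @_;
    # require wants a file path, so turn Some::Module into Some/Module.pm.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('File::Basename', 'No::Such::Module::Hopefully') {
    printf "%-28s %s\n", $module,
        module_installed($module) ? 'installed' : 'not found';
}
```

Note that this loads the module as a side effect, so it tests that the module compiles as well as that it is present.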
Re: Pipe and redirect STDOUT
Sui Ming Louie wrote:
> I have run similar batch files on NT4.0. They worked OK. At one point, I used to think that redirection from batch files was not possible. Then I got some insight from reading the perl2bat.bat file included with Perl.

I asked the original question, so I thought I would try to summarize progress to date. First, many thanks to everyone who has responded.

One poster suggested the problem might be that typing "test.pl" invoked a BAT file, and that my problem was redirection from BAT files in general. There is no "perl.bat" on my system, so this is not the problem. Also, redirection from BAT files appears to work as expected. (Also, also, there is no "perl2bat.bat" file on my system.)

Another poster suggested trying "perl test.pl" and, indeed, redirection using this form of invocation works fine. This is a valid workaround and, although my problem has not been solved, my problem is no longer a problem.

Martin Leese wrote:
> Hi, I know this is a really dumb question (yes, there are dumb questions), but I am stuck. Why can I neither pipe the output from a perl script to "more" nor redirect it to a file? This is under Windows NT. The following script:
>     #! /usr/local/bin/perl
>     #
>     use strict;
>     use warnings;
>     ###use diagnostics;
>     #
>     print STDOUT "Hello me";
>     #
>     exit(0);
> produces "Hello me" on the screen with the command "test.pl". However, the command "test.pl |more" produces a blank line, and the command "test.pl >test.txt" produces an empty file. Why?
> As the more intelligent amongst you will have guessed, I am not a Windows person.
> Also, while you are here, what is the difference between "use warnings;" and "use diagnostics;"?
> I am running perl, v5.8.2 built for MSWin32-x86-multi-thread, Binary build 808 provided by ActiveState Corp. under Windows NT.
> Many thanks for any help you can give.
> Regards, Martin

Thanks again to everyone who responded,
Martin