> >On Thu, May 30, 2002 at 09:05:39AM +0100, Alex McLintock wrote:
> >> PS There is a new Manning book "graphics programming with perl"
> >>
> >> http://news.diversebooks.com/article.pl?sid=02/05/30/0754246
> >>
> >> Shall we ask for a review copy?

>On 30/05/2002 at 10:06 +0100, Paul Makepeace wrote:
> >No, instead Manning collectively should be lowered onto a rough-hewn
> >wooden spike and sacrificially burnt to be set as an example to other
> >would-be book publishing spammers who harvest addresses from www.pm.org.


In Manning's defence, in this case I explicitly asked for their press
releases as editor of http://news.DiverseBooks.com/

I know that they have asked various people to contribute to some Perl
books they are writing, but that is separate from this instance. Their
only crime so far is to use a Perl website to find Perl programmers and
then email them. Hardly a hanging offence....

PS
http://news.diversebooks.com/article.pl?sid=02/05/27/1147220


Title: Data Munging With Perl
Author: Dave Cross
Publisher: Manning
Format: Paperback
Pages: 283
ISBN: 1930110006
Price: 36.95 USD
Reviewer: Alex McLintock

It might be worth explaining "Data Munging". My mind translates this as
"Data Massaging". I assume the term "munging" comes from the "Mung bean",
which is used in Chinese cookery by being squeezed into a paste and
reformed into a variety of dumplings and cakes.

Although I call myself a Java programmer nowadays, there are many tasks
for which Perl is ideally suited. I use it for quick one-off CGI scripts,
process control on my Unix boxes, and of course data file manipulation,
or "munging".

These are tasks where I don't go through the full software engineering
process of requirements analysis, design, and specifications. I often just
throw something together by writing Perl-like pseudocode, which only takes
a short while to turn into running code. The proof of the pudding is that
it runs and is easily testable.

So what sort of things will this book help you improve? It starts off by
pointing out some "best practice" rules. These include telling the reader
to decouple the input, munging, and output processes, and to use the Unix
filter model. Fine. The person reading this may not be a full-time
programmer or a full-time Unix sysadmin, so these skills do need to be
covered. However, I would have thought Unix tools so important that there
should have been more investigation of tools like "sort", "uniq" and so on.
I suppose the defence against such criticism is that you don't need the
programs "sort" or "uniq" if you can write them yourself in Perl.

Chapter three starts with sorting. This is the sort of thing I first used
Perl to do way back in 1995. Sorting is hard, and choosing and building
your sort keys is just part of the problem. The data sets I was dealing
with quite often required temporary files, and these weren't discussed
much. It seemed as if every example could be done in memory.
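
To make the decoupling and sort-key points concrete, here is a rough
sketch of my own (not an example taken from the book). The
"name<TAB>score" record layout and the sub names read_records,
munge_records and write_records are made up for illustration, and it
assumes the whole data set fits in memory, which is exactly the
limitation mentioned above.

```perl
use strict;
use warnings;

# Input stage: turn raw lines into records.
# Assumed (hypothetical) layout: "name<TAB>score".
sub read_records {
    my @lines = @_;
    my @records;
    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*$/;    # skip blank lines
        my ($name, $score) = split /\t/, $line;
        push @records, { name => $name, score => $score };
    }
    return @records;
}

# Munging stage: sort by score (highest first), then by name ignoring case.
# The sort key is built once per record, used for the comparison, and then
# thrown away again (decorate, sort, undecorate).
sub munge_records {
    my @records = @_;
    return map  { $_->[2] }
           sort { $b->[0] <=> $a->[0] or $a->[1] cmp $b->[1] }
           map  { [ $_->{score}, lc $_->{name}, $_ ] }
           @records;
}

# Output stage: format records back into lines.
sub write_records {
    return map { "$_->{name}\t$_->{score}\n" } @_;
}

# In-memory sample data; a real filter would pass <STDIN> instead.
my @sample = ("Bob\t85\n", "alice\t90\n", "carol\t90\n");
print write_records(munge_records(read_records(@sample)));
# prints alice, carol, Bob (one "name<TAB>score" line each)
```

Because each stage only sees a list and returns a list, the sort could be
swapped for some other munging step without touching the input or output
code.
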
Much of the rest of the chapter seemed to be a random selection of possibly
useful topics: DBI for accessing databases, Data::Dumper for serialising
data structures in Perl, and writing short Perl scripts on the command line.

Chapter Four is all the good stuff in Perl: those pattern-matching
facilities which look like so much line noise. Dave Cross makes an
admirable effort at steering people away from regular expressions when
they are not needed by pointing out some of the other useful facilities.
I think this sort of book will sink or swim based on how good its regex
sections are, and how accessible they are to beginners. The example of how
to use the x modifier to allow whitespace, and thus comments, is very good.
Why don't we see examples like this more often?
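
For anyone who hasn't met the x modifier, here is a small sketch of my own
(again, not the book's example). The "YYYY-MM-DD LEVEL message" log line
layout and the parse_log_line name are assumptions made purely for
illustration.

```perl
use strict;
use warnings;

# With /x, whitespace in the pattern is ignored and "#" starts a comment,
# so the regex can be laid out and documented like ordinary code.
my $log_line = qr{
    ^
    (\d{4}) - (\d{2}) - (\d{2})   # date: year, month, day
    \s+
    ([A-Z]+)                      # severity level, such as ERROR
    \s+
    (.*)                          # the rest of the line is the message
    $
}x;

# Returns a hash reference on a match, or nothing if the line doesn't fit.
sub parse_log_line {
    my ($line) = @_;
    my ($year, $month, $day, $level, $message) = $line =~ $log_line
        or return;
    return { date => "$year-$month-$day", level => $level, message => $message };
}

my $entry = parse_log_line("2002-05-30 ERROR disk full");
print "$entry->{level} on $entry->{date}: $entry->{message}\n";
# prints "ERROR on 2002-05-30: disk full"
```

Written on one line, the same pattern would be
/^(\d{4})-(\d{2})-(\d{2})\s+([A-Z]+)\s+(.*)$/, which is where the "line
noise" reputation comes from.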

Chapter Five concerns unstructured data, which is what I deal with most of
all. It wasn't as detailed as I would have liked. It seemed to skip the
process of turning unstructured data into structured data, which is a very
difficult task.

Chapters six, seven, and eight look at more and more structured data. This
is where the meat of the problem lies. Tab-delimited, comma-separated, and
binary formats are all here. The last few chapters then cover the popular
topics of HTML and XML.
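
As a taste of the simplest of these formats, here is a minimal sketch of
reading tab-delimited data with a header row. It is my own illustration,
not the book's code, and the parse_tsv name and the column names are
hypothetical. Comma-separated data with quoted fields is harder than it
looks, and there I would reach for a proper parser such as the Text::CSV
module from CPAN rather than split.

```perl
use strict;
use warnings;

# Parse tab-delimited text whose first line names the columns.
# Returns a list of hash references, one per data line.
sub parse_tsv {
    my @lines = @_;
    chomp @lines;
    my @columns = split /\t/, shift @lines;
    my @rows;
    for my $line (@lines) {
        next unless length $line;             # skip empty lines
        my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
        my %row;
        @row{@columns} = @fields;             # hash slice: column name => value
        push @rows, \%row;
    }
    return @rows;
}

my @rows = parse_tsv(
    "title\tauthor\tpages\n",
    "Data Munging With Perl\tDave Cross\t283\n",
);
print "$rows[0]{title} by $rows[0]{author}, $rows[0]{pages} pages\n";
```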

I'll be passing this book on to one of my reviewers who is a beginner with
Perl. Let's see how well she copes with the book.


Alex






Openweb Analysts Ltd, London: Software For Complex Websites 
http://www.OWAL.co.uk/
Free Consultancy for London Companies thinking of Open Source Software.

