Re: [WWWOFFLE-Users] Alternative WWWOFFLE implementation

Paul A. Rombouts Sat, 28 Sep 2002 13:47:23 -0700

"Andrew M. Bishop" wrote:

> > > 1) Some changes will definitely not be included.
> > >
> > > These are for example the changes that stop it compiling on anything
> > > except Linux.
...
> > There is no such thing as absolute portability, you can only be portable to a
> > certain degree. GNU portability is all I strive for, because I simply don't have
> > any intention of running my code on any other system. (On the other hand, I
> > don't want to be restricted to Linux, so if anyone detects a change that can
> > only run on Linux, but not on another GNU system, I'd appreciate it very much
> > if that was pointed out to me.)
> 
> I may be missing something, but there are only two GNU systems aren't
> there; GNU/Linux and the Hurd?  (No arguments about Linux not being a
> GNU system please).


OK, if you mean a "pure" GNU OS, then I guess only GNU running on a Hurd
kernel qualifies.
But that's of course not what I meant. As Martin Bähr remarked, gcc and other
GNU tools are available on many UNIX-like systems besides Linux.

And while it may not always be possible to link against glibc, I don't see this
as a valid reason to always prefer a solution using only common and portable
functions to one that is clearly superior, but uses glibc extensions.
After all, the glibc source is freely available. It's usually possible to take
the part you need and to include it in your own code.

An example of this is the use of strcat(), which may be common and portable, but
almost never provides a satisfactory solution.
For a good explanation why this is so, I suggest you read the glibc info pages.
It's usually possible to construct far more sensible solutions using e.g.
stpcpy() or mempcpy(). From your point of view using these functions is
unacceptable, because not every C library provides them.

But why should you necessarily dumb down your code to make it portable?
Another possibility is to use a configure script to detect whether these
functions are available, and include alternative versions guarded by
C-preprocessor conditionals in your code. 
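
A sketch of what such a guard might look like. The macro name HAVE_STPCPY and
the wrapper name compat_stpcpy() are assumptions for this example; a real
configure script may use other names.

```c
#include <string.h>

/* Assumption: the configure script defines HAVE_STPCPY (for instance in a
   generated config.h) when the C library provides stpcpy(). */
#ifdef HAVE_STPCPY
#define compat_stpcpy(dst, src) stpcpy((dst), (src))
#else
/* Fallback: copy src to dst, including the terminating '\0', and return a
   pointer to that terminator, which is what stpcpy() does. */
static char *compat_stpcpy(char *dst, const char *src)
{
    size_t len = strlen(src);

    memcpy(dst, src, len + 1);
    return dst + len;
}
#endif
```

The rest of the code then calls compat_stpcpy() everywhere and does not need
to know which version it got.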


> > I don't like the way that striving for portability restricts you to a limited
> > set of common tools. Some GNU extensions make great sense to me, and enable
> > solutions that are (much) more elegant. And finding elegant solutions is for me
> > one of the most rewarding aspects of coding. After all, I'm hacking WWWOFFLE to
> > scratch my own itch; nobody is paying me to do it.
> 
> Nobody pays me for it either, I also do it for the enjoyment and the
> knowledge that other people like and use my programs.  One of the
> prices for this is to have to use some non-elegant features to
> work-around other UNIX systems.  This adds to the fun and I have
> learnt a lot about writing portable code.

I guess everybody has his own notion of fun.
I prefer a top-down style of programming. This means that I prefer to think of
an algorithm first in the most abstract terms possible, perhaps writing a
version in pseudo code, and then trying to implement it using the tools that
happen to be available to me. I strive to arrive at an implementation that
matches the abstract notions I had in my mind as closely as possible. This
usually gives me much more confidence that the resulting code is correct.
Portability will always be an afterthought.
This is because I think it's easier to make a correct program portable, than to
make a portable program correct.


> > Some of the macros I wrote could be replaced by functions. Some are butt ugly.
> > These are valid criticisms. But I fail to see what makes
> >
> >   char *copy=malloc(strlen(string)+1);
> >   strcpy(copy,string);
> >
> > so much more desirable than
> >
> >   char *copy=strdup(string);
> 
> This is the point (problem) about different coding styles, we will
> never agree.  You disagree with the code above like I use and I
> disagree about (for example) the strcmp_litbeg() macro that you have
> introduced.  I don't see why you need to use a separate macro
> 
>     strcmp_litbeg(foo,"bar")
> 
> rather than a more optimal
> 
>     strncmp(foo,"bar",sizeof("bar")-1)
> 
> which is only slightly better than
> 
>     strncmp(foo,"bar",3)
> 
> which is what I use.

I don't know what you mean by "more optimal". Using an optimizing compiler that
does constant folding, all three examples will result in exactly the same
object code.
The reason I prefer strcmp_litbeg() is because it's more obvious to me that the
resulting code is correct.
Being only human, I'm not very good at counting characters, I'd much rather let
a compiler do that.
For instance, to me it's not obvious at a glance that 

  strncmp("WWWOFFLE Must be online or autodial to fetch",line,44)

is correct.
In fact if you have dozens and dozens of lines like these there will more
likely than not be a mistake in one of them.
And once you have got them all correct, a mistake is very easily re-introduced
when you change one of the strings and forget to adjust the character count.
Casting this in the form of your second example would result in a very ugly

  strncmp("WWWOFFLE Must be online or autodial to fetch",line,
          sizeof("WWWOFFLE Must be online or autodial to fetch")-1)

It's slightly easier to see that this is correct than the version you used in
your code, but you still have to scan both the string literals to check that
they are identical.
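
For reference, a macro along these lines could be defined roughly as follows.
This is only a sketch of one possible definition, not necessarily the one in
the patch.

```c
#include <string.h>

/* One possible definition, given here as an illustration only.
   The empty literals around `lit` rely on string-literal concatenation, so
   the macro fails to compile when `lit` is not a string literal. That way
   sizeof(lit) always measures an array and never a pointer. */
#define strcmp_litbeg(str, lit) \
    strncmp((str), "" lit "", sizeof(lit) - 1)
```

With this, the test above becomes
strcmp_litbeg(line,"WWWOFFLE Must be online or autodial to fetch"), and the
string appears only once, with the length computed by the compiler.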


> > > 2) Some changes should be included.
> > >
> > > These are usually obvious to spot.  Some are spelling mistakes in
> > > source code or obvious programming errors.  Anything that is obvious
> > > in this sense should be included, the problem is finding it. [...]
> > > There is [...] no way that I am going to read through [a huge patch]
> > > to find the changes that fit into category number 2 [...].
> >
> > I've reported some of these small errors to you in the past. The problem is that
> > it takes time to write neat little e-mails reporting every little error
> > separately, time that I would rather spend coding.
> 
> I believe that all of these have been included in WWWOFFLE.  Are there
> any that you have reported that I ignored?

No, as far as I remember you haven't ignored any of the errors I have reported
that fit into this category.
But that's not my point. It typically takes less time to fix a bug, than to
explain it in writing. So, in the time that it takes me to fix say 20 minor
bugs, I can only fix *and* report maybe 10 of them.

I find increasingly I can't be bothered to report them, or even forget that I
have fixed them.
I'm not claiming right now that I know how to make the process of getting these
types of bugs fixed in the main code more efficient.

You can see the same problem in the Linux kernel. Often the kernels that are
shipped with various distributions have important bugfixes that unfortunately
never seem to end up in the code you can get from kernel.org.


> > > 3) Some changes that might be included.
> ...
> > > Finally I will be taking the approach that Linus Torvalds takes with
> > > patches to the Linux kernel.  There is no way that I will apply a huge
> > > patch of over 670 kB (uncompressed).
> 
> > I never expected that my patch or any sizeable part of it would end up in the
> > main WWWOFFLE code, this was not my reason for publishing it. The changes that
> > I've made to the WWWOFFLE code were simply to satisfy my own personal needs.
> > And I am pleased with the result: I find that the modified version runs
> > much better and is much more useful than the original version.
> 
> I understood that you did not intend me to include the whole thing.  I
> was only answering the question that was asked of me; which was if I
> would be including it.

Yes, but your remarks read like a categorical rejection of the work I had done,
as if by "poisoning" the code with non-portable extensions, I had precluded the
possibility of it having any redeeming features.

Some people seem to like some of the ideas I have implemented.
I think you could have adopted a more constructive attitude, instead of trying
to lay down strict rules for everybody who wishes to contribute.
Linus Torvalds is an extremely busy man, who gets contributions from hundreds of
people, so he can afford to offend some of them.
I don't get the impression there are that many contributing to the WWWOFFLE
project yet.


> > I've succeeded in getting a few features into the main code: the
> > request-redirection option, the CGI-interface and unlimited number of *'s in
> > patterns are the most important. But overall, my experience with this has been a
> > negative one: for instance I put a lot of work into making the patch for the
> > CGI-interface, but the way it ended up in the main code was not very useful to
> > me.
> 
> Is there anything specific about your CGI patch that I included (with
> some changes) that you dislike?  I thought that there was only one
> part of the functionality that I removed (the Location header built-in
> redirection).

Of course it's not my own patch that I dislike, rather the way you left out some
parts and changed others.
For one thing I was really disappointed that you hadn't got rid of the size
field in the Header struct. It's almost impossible to keep the value of the size
field correct; the way you do it is much too error-prone.
In fact you don't seem to trust its value yourself, considering that you've
replaced head->size with strlen(head) in src/wwwoffles.c version 2.7e.

Furthermore the CGI-specification explicitly allows the output of a CGI-script
to start with a valid HTTP status-line, but your code completely ignores it.
I use some scripts that pass on the headers retrieved from a remote server, so
it's really irritating when WWWOFFLE mangles the status-line.
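
As a rough sketch of the kind of check I mean, the server could test whether
the first line of the script's output looks like a status-line before deciding
how to treat it. The function name parse_status_line() is hypothetical and the
check is deliberately loose; it is not taken from the WWWOFFLE code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if `line` looks like an HTTP status-line such as
   "HTTP/1.0 404 Not Found", return the status code so the caller can pass
   the line on unchanged. Otherwise return -1. This is a loose check, not a
   full parser. */
int parse_status_line(const char *line)
{
    int major, minor, status;

    if (strncmp(line, "HTTP/", 5) != 0)
        return -1;
    if (sscanf(line + 5, "%d.%d %d", &major, &minor, &status) != 3)
        return -1;
    if (status < 100 || status > 599)
        return -1;

    return status;
}
```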

Thirdly, the CGI-specification states that some types of redirection should be
handled by the server internally. Instead you always let the browser handle
these redirections, because you're afraid of an infinite recursion in the
WWWOFFLE server. But this doesn't solve the problem, because some browsers
(older versions of Netscape, for instance) will happily follow endless
redirections. If you really want to solve this problem, it's possible to do this
without breaking the CGI-specification. I've simply used a counter in the
WWWOFFLE code to check that the number of iterations does not exceed a certain
limit. (This was not included in my original CGI-patch, I only thought of it
later.)
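
The idea can be sketched as below. All names here (MAX_INTERNAL_REDIRECTS,
handler_fn, follow_redirects() and the two toy handlers) are invented for the
illustration, and the limit of 10 is arbitrary; the real code is organised
differently.

```c
#include <stddef.h>

#define MAX_INTERNAL_REDIRECTS 10   /* arbitrary limit chosen for this sketch */

/* Hypothetical callback: handle `url` and return the redirect target when
   the server should follow the redirect itself, or NULL when the response
   is final. */
typedef const char *(*handler_fn)(const char *url);

/* Follow internal redirects, giving up once more than
   MAX_INTERNAL_REDIRECTS would be needed.
   Returns the final URL, or NULL if the limit was exceeded. */
const char *follow_redirects(const char *url, handler_fn handle)
{
    int count;

    for (count = 0; count <= MAX_INTERNAL_REDIRECTS; count++) {
        const char *next = handle(url);

        if (next == NULL)
            return url;          /* final response reached */
        url = next;
    }
    return NULL;                 /* probably a redirect loop */
}

/* Two toy handlers, for illustration only. */
const char *always_final(const char *url) { (void)url; return NULL; }
const char *always_redirect(const char *url) { return url; }
```

A handler that keeps redirecting, like always_redirect(), makes
follow_redirects() return NULL instead of recursing forever, and the server
can then answer with an error.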

Finally I don't understand why you replaced the pipe I used for providing the
input to the CGI-script by a temporary file, and added an extra pipe to handle the
output of the script. Your code parses the headers output by the CGI-script
twice instead of only once like my code did. This all seems unnecessarily
inefficient to me.

-- 
Paul A. Rombouts <[EMAIL PROTECTED]>

My alternative WWWOFFLE implementation page:
  http://www.phys.uu.nl/~rombouts/wwwoffle.html
