Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-30 Thread Craig Sanders
On Sat, 28 Jun 1997, Jim Pick wrote:

> > You are proposing that a web-server is supposed to search
> > through the .html code it serves and replace all links referring to
> > .html.gz with .html links?
>
> dwww does this - it's not trivial. This is definitely not the job of a
> web server.

I agree 100%.

A web server should NOT mess with the content.

If we do this then we will make it difficult (or impossible) to serve .gz
compressed files from Debian-based web servers.

remember that not all .gz files are compressed documentation that needs
to be decompressed on the fly. e.g. you put the linux kernel source on
your web server for anyone to download. Do you really want apache or
whatever to decompress a 6+MB .tar.gz file on-the-fly while sending it
out to someone?

> So here's my stand:
> 
> - let's munch up the links to point to ".html.gz" files.  Ugly, I know,
>   and a bit of work, but then we don't need to force people to install a
>   web server.  I think it's pretty important that we don't force people
>   to run stuff they don't want.

no, that will make it a pain to download files which are already
compressed.  The "smarts" should be in the web browser.

> - we should compress html, because lots of people (like me) are using
>   Debian on machines with almost no hard drive.
> 
> - Lynx and Netscape work with the compressed links (correct me if I'm wrong),
>   and we could use a web server/dwww combination to allow other browsers
>   to work too.

I think that the correct place for this translation is in the web browser
(as many others have suggested).

modify lynx, mosaic, chimera, etc. so that:

 1.  when a NON-.gz link is selected, try to fetch it.

 2.  if it fails, try to fetch the .gz version and decompress it.

 3.  if that fails too, report an error to the user.

 4.  if the link was pointing to a .gz file then DO NOTHING TO IT.  If I
 download a compressed file from the net, then I want to save it as
 a compressed file.  I certainly don't want my browser mangling the
 file for me.

point 2 should probably be restricted to localhost or `hostname -d`.  i
don't know.  will have to think some more about it.  a rough sketch of
the fallback logic is below.

It may also be worthwhile doing this for file managers like mc, git,
tkdesk. It would definitely be worthwhile to use less's LESSOPEN
(lesspipe) feature to do this -- see the sketch below.
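
a minimal lesspipe sketch, assuming less's LESSOPEN input-preprocessor
hook (the script path is made up; hook it up with
LESSOPEN="|/usr/local/bin/lesspipe.sh %s"):

#!/bin/sh
# decompress .gz files transparently for less; printing nothing for
# other files tells less to open them normally
case "$1" in
*.gz)   gzip -dc "$1" 2>/dev/null ;;
esac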


we can't patch netscape, but that's not our problem. people using
netscape will just have to use dwww (which should be the preferred way
of browsing debian docs anyway).



I can't wait for the Mnemonic project to get off the ground...an
extensible, modular, freeware web browser written using the Gimp's GTK
widget set: Heaven! 

I'm getting really tired of netscape 4.0b5 crashing when I do Really
Bad Things like click on a page to select a link or foolishly try to
use its drag and drop feature (it worked in 4.0b3, crashes almost 100%
of the time under 4.0b5), or leave a netscape window idle for a while
and have it just die for no apparent reason. Netscape is becoming yet
another example of why free software is better than commercial/non-free.

craig

--
craig sanders
networking consultant  Available for casual or contract
temporary autonomous zone  system administration tasks.


Re: Re^4: fixhrefgz - tool for converting anchors to gzipped files

1997-06-29 Thread Alex Yukhimets
> On 29.06.97, aqy6633 # is5.nyu.edu wrote to Marco Budde ...
> 
> Hi Alex!
> 
> AY> > Right, but do all WWW servers offer this feature? We can't force the
> AY> > user to install a specific server.
> AY> Why not? This could be a part of the Debian documentation system.
> 
> Because no admin would like to have two httpds on his system: one for
> our documentation and one for the other.
> 

Why? If the server is started from inetd it doesn't eat up any resources
while not in use. And another option would be just to have two (or more)
document roots. It is not difficult to configure httpd this way, but the
configuration differs between servers, so changing the web server would
then be a non-trivial task.

> AY> The only restriction would be to run it on an unconventional port
> AY> and preferably from inetd.
> 
> Again, why should we use a WWW server? This is always slower than
> reading the files directly from disk.
> 
> cu, Marco

Yes, but as you mentioned in your e-mail to Christoph, we can't patch
Netscape to convert .gz files on the fly -- not to mention that if we
implement this using a web server, it will also be possible to browse
the documents from a different host.

Thanks.

Alex Y.

> 
> --
> Uni: [EMAIL PROTECTED]  Fido: 2:240/5202.15
> Mailbox: [EMAIL PROTECTED]  http://www.tu-harburg.de/~semb2204/
> 
> 


-- 
Alexander Yukhimets   [EMAIL PROTECTED]
http://pages.nyu.edu/~aqy6633/





Re^2: fixhrefgz - tool for converting anchors to gzipped files

1997-06-29 Thread Marco Budde
On 29.06.97, clameter # waterf.org wrote ...

Hi Christoph!

CL> There is no need to munch up any links. The web-browser should simply check
CL> if a .gz file exists when the file referenced by the link cannot be found,
CL> and decompress the file with the tacked-on .gz on the fly. That is the way
CL> the servers work.

Could you please tell us how we can patch, for example, Netscape to
behave like this?

cu, Marco

--
Uni: [EMAIL PROTECTED]  Fido: 2:240/5202.15
Mailbox: [EMAIL PROTECTED]  http://www.tu-harburg.de/~semb2204/





Re^2: fixhrefgz - tool for converting anchors to gzipped files

1997-06-29 Thread Marco Budde
On 28.06.97, clameter # waterf.org wrote ...

Hi Christoph!

CL> It will still serve the .html file (now uncompressed) containing .html.gz
CL> links which are not understood by web-browsers outside of the Debian realm.

Maybe we could use the following:

1.) Change the links inside the documents to .html.gz (offline) to
allow browsing the documents without a WWW server.
2.) Tell the WWW servers to uncompress .html.gz files on-the-fly
if the browser requests .html.gz files. The server delivers
the uncompressed file with MIME type text/html under the
.html filename.

  browser                 server
  -------                 ------
  req. foo.html.gz   ->
                     <-   please req. foo.html
  req. foo.html      ->
                     <-   uncompress foo.html.gz
                          on-the-fly, send it
                          as text/html


I think this should be possible using Apache's Redirect feature; the
decompression half amounts to a tiny CGI, sketched below.
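
(A minimal sketch of that CGI, assuming the server is configured to run
it for .html requests whose file only exists as .html.gz. The
`zcat-html' name is made up; PATH_TRANSLATED is the standard CGI
variable for the requested file's path:)

#!/bin/sh
# zcat-html: send the gzipped counterpart of the requested page,
# decompressed on the fly, as plain text/html
echo "Content-type: text/html"
echo
gzip -dc "${PATH_TRANSLATED}.gz"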

cu, Marco

--
Uni: [EMAIL PROTECTED]  Fido: 2:240/5202.15
Mailbox: [EMAIL PROTECTED]  http://www.tu-harburg.de/~semb2204/





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-29 Thread Christoph Lameter
On Sat, 28 Jun 1997, Jim Pick wrote:

>So here's my stand:
>
>- let's munch up the links to point to ".html.gz" files.  Ugly, I know,
>  and a bit of work, but then we don't need to force people to install a
>  web server.  I think it's pretty important that we don't force people
>  to run stuff they don't want.

There is no need to munch up any links. The web-browser should simply check
if a .gz file exists when the file referenced by the link cannot be found,
and decompress the file with the tacked-on .gz on the fly. That is the way
the servers work.

--- +++ --- +++ --- +++ --- +++ --- +++ --- +++ --- +++ ---




Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Nicolás Lichtmaier
On Sat, 28 Jun 1997, Christian Schwarz wrote:

> Why? The files are called ".html.gz" in the file system. Thus, these links
> are valid. We only have to implement on-the-fly decompression on some web
> servers. (This functionality could be useful for others, too, so we could
> forward our patches to the upstream maintainers of the web servers as
> well.)

 So..

---
GET http://localhost/hello.html.gz
[...]
Content-Type: text/html

[uncompressed HTML]
---

 This is non-standard... the file on the HD exists, httpd is supposed to
send it as-is, and using the suffix `.html.gz' for every piece of
uncompressed HTML documentation would be strange, or even annoying for a
user trying to `save as' the file in W95.

 I think that Christoph's idea is the elegant way of doing this. The www
server could even be just something like...


#!/bin/bash
# minimal inetd HTTP responder: serve the requested file, falling
# back to the .gz version (decompressed on the fly) if needed
read req
read                            # swallow the next line of the request
req=${req#GET }
req=${req% HTTP*}
if [ -r "$req" ]; then
    echo HTTP/1.0 200 OK
    echo Content-type: text/html
    echo
    cat "$req"
elif [ -r "$req.gz" ]; then
    echo HTTP/1.0 200 OK
    echo Content-type: text/html
    echo
    zcat "$req.gz"
else
    echo HTTP/1.0 404 Not found
    echo Content-type: text/html
    echo
    echo "Can't find $req here!"
fi
-
 (with `debdoc stream tcp nowait nobody /usr/sbin/tcpd /usr/sbin/in.debdoc'
in /etc/inetd.conf)

 This is only for testing, but it works fast! A VERY small C program can do
this safely...
 And connections to that service could be restricted by default to the
local machine...

-- 
Nicolás Lichtmaier.-





Re: Re^2: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Alex Yukhimets
> Hi Christoph!
> 
> CL> 200MHz Pentiums are the standard fare today. And I am running
> CL> the boa webserver for example on some low-memory 486DX66s with
> 
> I'm using a 486/100 and a 486SL/33. In my opinion we should avoid using  
> the server to uncompress the files. We should find another solution.
> 
> CL> excellent performance. Boa serves directly from disk unless
> CL> there is the need to gunzip something.
> 
> Right, but do all WWW servers offer this feature? We can't force the user
> to install a specific server.

Why not? This could be a part of the Debian documentation system.
The only restriction would be to run it on an unconventional port
and preferably from inetd.

Alex Y.
> 
> CL> The big issue here is that you want to change an existing
> CL> very public API (http protocol) to include compression which
> CL> may be a big hassle to install on many platforms and so far
> CL> has not been an issue on the more popular platforms such as
> CL> Win95 or other Unixes.
> 
> I don't see the problem. The online help system should be designed for the
> Debian users. If a Windows, Mac, etc. user wants to use the help system he
> has to install gzip as a helper application.
> 
> But maybe we could build a compressed file system using gzip and the
> loopback device?
> 
> cu, Marco
> 
> --
> Uni: [EMAIL PROTECTED]  Fido: 2:240/5202.15
> Mailbox: [EMAIL PROTECTED]  http://www.tu-harburg.de/~semb2204/
-- 
Alexander Yukhimets   [EMAIL PROTECTED]
http://pages.nyu.edu/~aqy6633/





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Jason Gunthorpe

On Sat, 28 Jun 1997, Richard Kaszeta wrote:

> >Exactly. So there is no problem when using web-servers.
> 
> Umm, yes, there is.  I don't want a server running on my machine for
> *security* reasons (and one of the places I put debian machines has a
> site policy against running http servers).  I have enough problems
> with security, denial of service attacks, etc that I don't need to
> aggravate the situation by placing dozens of extra http servers on my
> net.

Just a thought, but it would be possible to make the webserver bind to
the localhost address only (or restrict it with the TCP wrapper, as
sketched below). To the outside world it would be exactly the same as
if there were no server running at all.
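
(With the inetd/tcpd setup suggested elsewhere in the thread, the TCP
wrapper gives much the same effect; a sketch, borrowing the
hypothetical `in.debdoc' service name from Nicolás's example:)

# /etc/hosts.allow -- permit only the local machine
in.debdoc: 127.0.0.1
# /etc/hosts.deny -- refuse everyone else
in.debdoc: ALL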

Jason





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Federico Di Gregorio

> > Err... emmh... THIRD time I write this to the list:
> > 
> > if you ask boa for the file foo.html and it does not exist,
> > boa looks for foo.html.gz and if THAT exists boa DECOMPRESSES it
> > and serves you the uncompressed version, as if foo.html existed!
> > (The browser thinks it has just loaded foo.html, no references to
> > foo.html.gz!)
> 
> Sorry, but this does _not_ answer my question. The question is: does "boa"
> uncompress the file if "foo.html.gz" is requested (and exists)?

I am sorry I misunderstood your question... I'll try this out with both
xemacs w3 and netscape 4.0b5 right away...

w3 -- went bad, it asked me for a file name and saved an *unzipped* copy
  of the file to it...

netscape -- the same

Mmmm... this sounds pretty logical to me, if you ask for a compressed file
you want to save it... even if the server unzips it for you... snort!

Sorry again,
Federico

PS - If you don't change the links in the HTML source all goes fine,
you'll obtain what you are looking for... I cast my vote against
the modification of every foo.html link into foo.html.gz.
**
* Federico Di Gregorio   |  /  the number you dialled is *
* [EMAIL PROTECTED]   | / -1   imaginary... please, rotate*
*|/  the phone PI/2 and try again!   *
**






Re^2: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Marco Budde
On 28.06.97, clameter # miriam.fuller.edu wrote ...

Hi Christoph!

CL> 200MHz Pentiums are the standard fare today. And I am running
CL> the boa webserver for example on some low-memory 486DX66s with

I'm using a 486/100 and a 486SL/33. In my opinion we should avoid using  
the server to uncompress the files. We should find another solution.

CL> excellent performance. Boa serves directly from disk unless
CL> there is the need to gunzip something.

Right, but do all WWW servers offer this feature? We can't force the user
to install a specific server.

CL> The big issue here is that you want to change an existing
CL> very public API (http protocol) to include compression which
CL> may be a big hassle to install on many platforms and so far
CL> has not been an issue on the more popular platforms such as
CL> Win95 or other Unixes.

I don't see the problem. The online help system should be designed for the
Debian users. If a Windows, Mac, etc. user wants to use the help system he
has to install gzip as a helper application.

But maybe we could build a compressed file system using gzip and the
loopback device?

cu, Marco

--
Uni: [EMAIL PROTECTED]  Fido: 2:240/5202.15
Mailbox: [EMAIL PROTECTED]  http://www.tu-harburg.de/~semb2204/





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Richard Kaszeta
>Exactly. So there is no problem when using web-servers.

Umm, yes, there is.  I don't want a server running on my machine for
*security* reasons (and one of the places I put debian machines has a
site policy against running http servers).  I have enough problems
with security, denial of service attacks, etc that I don't need to
aggravate the situation by placing dozens of extra http servers on my
net.

And I shouldn't have to have one when all the documentation is locally
available on the disk.

Whatever our solution, it shouldn't require a web server to operate.

-- 
Richard W Kaszeta   Graduate Student/Sysadmin
[EMAIL PROTECTED]   University of MN, ME Dept
http://www.menet.umn.edu/~kaszeta





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Jim Pick

[ I hate to wade into this, but  ]

> >However, as you surely know, this does not work without web server, since
> >the browsers are not looking for "foo.html.gz" if "foo.html" is
> >referenced.
> 
> Yes. But if you change the references then the web-servers will no longer
> do on-the-fly decompression. They will serve the links as .gz, which is not
> universally supported by web-browsers not under Debian's control.

For cases where people want to use a web browser that doesn't grok gzip,
we could use dwww (I think).
 
> >Thus, we are considering changing the "href's" to "foo.html.gz" and fix
> >the browsers, where possible, to uncompress the file on-the-fly. If the
> >browser cannot be fixed (for example, if we don't have the source code) we
> >could probably offer a simple web server (e.g. boa) to do this
> >automatically.
> 
> Please think about this.
> 
> You are proposing that a web-server is supposed to search through
> the .html code it serves and replace all links referring to .html.gz
> with .html links?

dwww does this - it's not trivial.  This is definitely not the job of
a web server.
 
So here's my stand:

- let's munch up the links to point to ".html.gz" files.  Ugly, I know,
  and a bit of work, but then we don't need to force people to install a
  web server.  I think it's pretty important that we don't force people
  to run stuff they don't want.

- we should compress html, because lots of people (like me) are using
  Debian on machines with almost no hard drive.

- Lynx and Netscape work with the compressed links (correct me if I'm wrong),
  and we could use a web server/dwww combination to allow other browsers
  to work too.

- all the documentation isn't going to be HTML anyway - just "book-like"
  stuff.  So what's the big deal?  No need to start a flame-war.

- the other option would be to leave the HTML fully uncompressed, which
  would be easiest

Cheers,

 - Jim






Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Christian Schwarz
On Sat, 28 Jun 1997, Christoph Lameter wrote:

> On Sat, 28 Jun 1997, Christian Schwarz wrote:
> 
> >
> >Hi!
> >
> >Christoph, please tell us why using "fixhrefgz" on "html.gz" files does
> >not work with our web servers.
> 
> Please read the other posts.

Please answer my questions! I haven't found an answer elsewhere.

> >As far as I have understood, these web servers are so intelligent that if
> >a file "foo.html" is referenced, but only "foo.html.gz" is found, they
> >uncompress the file on-the-fly and pass the decompressed version to the
> >web browser.
> 
> Exactly. So there is no problem when using web-servers.
> 
> >However, as you surely know, this does not work without web server, since
> >the browsers are not looking for "foo.html.gz" if "foo.html" is
> >referenced.
> 
> Yes. But if you change the references then the web-servers will no longer
> do on-the-fly decompression. They will serve the links as .gz, which is not
> universally supported by web-browsers not under Debian's control.

But most of the web browsers can easily be fixed. Since "boa" is really
very small and already supports on-the-fly decompression, we can include
it even in the base system so everyone out there has it installed. It can
be started on a port other than 80, I think, so it doesn't conflict with
other web servers.

> >Thus, we are considering changing the "href's" to "foo.html.gz" and fix
> >the browsers, where possible, to uncompress the file on-the-fly. If the
> >browser cannot be fixed (for example, if we don't have the source code) we
> >could probably offer a simple web server (e.g. boa) to do this
> >automatically.
> 
> Please think about this.
> 
> You are proposing that a web-server is supposed to search through
> the .html code it serves and replace all links referring to .html.gz
> with .html links?

No. The links are adapted from ".html" to ".html.gz" where necessary by
the _maintainer_ when the ".deb" is created. We have a Perl tool to do
this. (I posted it here yesterday.)

> >But why can't "boa" be extended to uncompress "foo.html.gz" on-the-fly
> >when _this_ file is requested, just as if "foo.html" had been
> >requested and that file did not exist?
> 
> It can certainly do this, but the links are the problem.
> 
> It will still serve the .html file (now uncompressed) containing .html.gz
> links which are not understood by web-browsers outside of the Debian realm.

Why? The files are called ".html.gz" in the file system. Thus, these links
are valid. We only have to implement on-the-fly decompression on some web
servers. (This functionality could be useful for others, too, so we could
forward our patches to the upstream maintainers of the web servers as
well.)

Christoph, I take your objection seriously; I don't want to include
"technical nonsense" in our policy manual. So please explain to us what
difficulties you see.


Thanks,

Chris

--  Christian Schwarz
   [EMAIL PROTECTED], [EMAIL PROTECTED]
  [EMAIL PROTECTED], [EMAIL PROTECTED]
   
PGP-fp: 8F 61 EB 6D CF 23 CA D7  34 05 14 5C C8 DC 22 BA
  
 CS Software goes online! Visit our new home page at
 http://www.schwarz-online.com





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Christoph Lameter
On Sat, 28 Jun 1997, Christian Schwarz wrote:

>
>Hi!
>
>Christoph, please tell us why using "fixhrefgz" on "html.gz" files does
>not work with our web servers.

Please read the other posts.

>As far as I have understood, these web servers are so intelligent that if
>a file "foo.html" is referenced, but only "foo.html.gz" is found, they
>uncompress the file on-the-fly and pass the decompressed version to the
>web browser.

Exactly. So there is no problem when using web-servers.

>However, as you surely know, this does not work without web server, since
>the browsers are not looking for "foo.html.gz" if "foo.html" is
>referenced.

Yes. But if you change the references then the web-servers will no longer
do on-the-fly decompression. They will serve the links as .gz, which is not
universally supported by web-browsers not under Debian's control.

>Thus, we are considering changing the "href's" to "foo.html.gz" and fix
>the browsers, where possible, to uncompress the file on-the-fly. If the
>browser cannot be fixed (for example, if we don't have the source code) we
>could probably offer a simple web server (e.g. boa) to do this
>automatically.

Please think about this.

You are proposing that a web-server is supposed to search through
the .html code it serves and replace all links referring to .html.gz
with .html links?

>But why can't "boa" be extended to uncompress "foo.html.gz" on-the-fly
>when _this_ file is requested, just as if "foo.html" had been
>requested and that file did not exist?

It can certainly do this, but the links are the problem.

It will still serve the .html file (now uncompressed) containing .html.gz
links which are not understood by web-browsers outside of the Debian realm.

--- +++ --- +++ --- +++ --- +++ --- +++ --- +++ --- +++ ---





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread branden
On Fri, 27 Jun 1997, Christoph Lameter wrote:

I can sort of see both sides of this argument.

I agree that rewriting html documents to say ".html.gz" in their hrefs is
bad.

[Lars Wirzenius wrote:]
> : Being able to read documentation directly without running
> : a web server is very important.
> 
> So far I cannot discern why.

I can, and it's a good point.

In case the machine is in single-user mode, for instance. You seldom
need the docs more than when the box is at init level 1.

I have two suggestions, and they're both going to take some work. If
someone wants to organize this, I can contribute what feeble coding
skills I have to it, because I think it's important.

1) We need to make sure boa (since it looks like that's what we'll be
using) is as close to bulletproof as it can get. It needs to stay small
and fast, and it also needs to be clean, efficient, and secure. It needs
to be as ready to do its job as telnetd. (Is boa hooked into inetd or
does it run on its own? If it's a non-forking daemon, why can't it be
grafted into inetd? Answer these questions gently, folks -- I haven't
looked into the guts of inetd.) This way we don't have to screw with
rewriting HTML files.

2) We need to hack some .gz sophistication into the `file:' handling code
of lynx. (I.e., you feed it a filename, it looks -- can't find it?
look for filename.gz and run gzip as a filter -- still can't find it?
smack the user around.) This is probably slower and less preferred, but
necessary for single-user mode. Lynx is already smart enough to gunzip
files that have a .gz extension. This doesn't sound very hard, so if
people think it's a good idea, I'll have a go at modifying the source to
do this myself. Perhaps it should look for a certain flag in its config
file before behaving this way, which we can call "debian_gz_flamewar" or
something. The lookup order is sketched below.
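
(The lookup order, as a shell sketch -- lynx itself is C, so this is
only the logic, not a patch:)

#!/bin/sh
# proposed handling for a local document $1
f="$1"
if [ -r "$f" ]; then
    cat "$f"                    # found as-is
elif [ -r "$f.gz" ]; then
    gzip -dc "$f.gz"            # fall back to .gz, gzip as the filter
else
    echo "$f: not found (even as .gz)" >&2    # smack the user around
    exit 1
fi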

Anyway, I know what happens when you step in between quarrelling giants, so
I think I'll don both a hardhat and an asbestos suit.

--
G. Branden Robinson |  It was a typical net.exercise -- a
Purdue University   |  screaming mob pounding on a greasy spot
[EMAIL PROTECTED]  |  on the pavement, where used to lie the
http://www.ecn.purdue.edu/~branden/ |  carcass of a dead horse.






Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Fernando
Lars Wirzenius wrote:

> 
> We have the rather unpleasant situation that reading documentation
> requires a web server. That's a problem. Fixing it requires changing
> the .html files.
> 

I have boa installed on a 386 SX-25 and I hardly notice any overhead --
I mean for doing the same thing I would do with just a browser, but with
more flexibility.

I seriously suggest that you try it and then decide whether a web server
is such a problem.

Regards,
Fernando





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Christian Schwarz

Hi!

Christoph, please tell us why using "fixhrefgz" on "html.gz" files does
not work with our web servers.

As far as I have understood, these web servers are so intelligent that if
a file "foo.html" is referenced, but only "foo.html.gz" is found, they
uncompress the file on-the-fly and pass the decompressed version to the
web browser.

However, as you surely know, this does not work without web server, since
the browsers are not looking for "foo.html.gz" if "foo.html" is
referenced.

Thus, we are considering changing the "href's" to "foo.html.gz" and fix
the browsers, where possible, to uncompress the file on-the-fly. If the
browser cannot be fixed (for example, if we don't have the source code) we
could probably offer a simple web server (e.g. boa) to do this
automatically.

But why can't "boa" be extended to uncompress "foo.html.gz" on-the-fly
when _this_ file is requested, just as if "foo.html" had been
requested and that file did not exist?


Thanks,

Chris

--
Christian Schwarz
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED]
PGP-fp: 8F 61 EB 6D CF 23 CA D7  34 05 14 5C C8 DC 22 BA
http://fatman.mathematik.tu-muenchen.de/~schwarz/
  "THE DUCK STAYS OUTSIDE!"





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Christoph Lameter
: (Anyway, even fast machines may be unable to run a web server,
: if they need to be secure. Running extra daemons is insecure.)

Running a web-browser on the machine may also be insecure. If
you run an 8 MB binary on the machine anyway, what kind of issue
could 150K for a webserver be?

: > The big issue here is that you want to change an existing
: > very public API (http protocol) to include compression which

: I'm sorry, but you don't know what you're talking about.

I definitely do.

: When a web browser requests the documentation via HTTP, it
: will be uncompressed by the web server. This already works
: fine. No problem at all.

Yes, and I would like to keep it that way. As I understand it, you
are proposing to make serving documentation through the webserver
very problematic.

: When a web browser reads the documentation directly from disk,
: it needs to understand how to identify .html.gz as HTML, and how
: to uncompress it, so that it can display it.  This works (except
: for some versions of Netscape, and that might be fixable). The
: only problem is that the documentation contains links to other
: files. The browsers are not (yet?) capable of understanding
: that when the link is to foo.html, and foo.html doesn't exist,
: that they should look for foo.html.gz instead. Because of this,
: the link must be modified to point at foo.html.gz directly.

If you change the links in order to accommodate the web browsers running
on the machine itself, then that will change the files that the webserver
can serve to the outside world. They are the same files, right? Or
is there a separate /usr/doc of HTML code for web browsers running on
the machine?

Which means in turn that the webservers will be forced by the proposed
approach to serve links with .gz suffixes. This is what I object to
strongly, because many web-browsers will not be able to handle those
suffixes without special configuration. The user-friendliness of the
documentation is gone.

: Being able to read documentation directly without running
: a web server is very important.

So far I cannot discern why.

-- 
--- +++ --- +++ --- +++ --- +++ --- +++ --- +++ --- +++ ---
Please always CC me when replying to posts on mailing lists.





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Christoph Lameter
: Christoph Lameter:
: > Web browsers are small. Don't think instantly of Apache.

: I assume you meant web servers. They may be small, but they
: make things slow. Unacceptably slow, unless you have a fast
: machine.

200MHz Pentiums are the standard fare today. And I am running
the boa webserver for example on some low-memory 486DX66s with
excellent performance. Boa serves directly from disk unless 
there is the need to gunzip something.

The big issue here is that you want to change an existing
very public API (http protocol) to include compression which
may be a big hassle to install on many platforms and so far
has not been an issue on the more popular platforms such as 
Win95 or other Unixes.

-- 
--- +++ --- +++ --- +++ --- +++ --- +++ --- +++ --- +++ ---
Please always CC me when replying to posts on mailing lists.





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread Christian Schwarz

Thanks Lars for the tool.

I wrote exactly the same thing in Perl (on your request!) some time ago. I
have attached it to this mail.

I don't know which version is better. It looks like Lars' implementation
has hard-coded a lot of HTML tags for processing. Mine is based on Perl's
HTML::Parser class and is thus independent of any specific HTML tags.


Thanks,

Chris

--  Christian Schwarz
 [EMAIL PROTECTED], [EMAIL PROTECTED],
Debian is looking [EMAIL PROTECTED], [EMAIL PROTECTED]
for a logo! Have a
look at our drafts PGP-fp: 8F 61 EB 6D CF 23 CA D7  34 05 14 5C C8 DC 22 BA
at http://fatman.mathematik.tu-muenchen.de/~schwarz/debian-logo/
#!/usr/bin/perl
#
# fixhtmlgz 0.2
# Copyright (c) 1997 by Christian Schwarz <[EMAIL PROTECTED]>
# May be distributed under GPL 2.
#

# Specification:
#
# Currently, we have a problem with compressed HTML: we can access
# compressed HTML fine, but links don't work very well. The problem
# is that the link says "foo.html", and the actual file is
# "foo.html.gz",
# and the browsers and servers aren't intelligent enough to handle
# this invisibly. This means that we can't install compressed HTML, if
# it contains links.
# 
# We need a program that can be run on uncompressed HTML, which converts
# local links to the compressed versions of the files. Usage would
# be something like:
# 
# fixhtmlgz file.html ...
# 
# - read file.html
# - read file.html
# - for each link <a href="foo.html">, if foo.html exists,
#   convert the link to <a href="foo.html.gz"> instead
# - otherwise, do not modify the link
# - output is either to file.html.fixed or file.html (replace
#   original with modified version)
#
# Changes:
#  v0.2:
# - now handles gzipped files
# - parse .html and .htm files
# - changed replacing rule: change href to refer to the
#   file as it actually exists. Example:
#   <a href="foo.html"> will only be converted to
#   foo.html.gz, if this file exists, and not if
#   foo.html exists.
# 

package Parser; #---
require HTML::Parser;
@ISA = qw(HTML::Parser);

sub declaration {
  my ($self, $decl) = @_;
  print ::OUT "<!$decl>";
}

sub start {
  my ($self, $tag, $attr, $attrseq, $origtext) = @_;

  if ($tag eq 'a' and $$attr{'href'}) {
    $href = $$attr{'href'};
    # only touch local (or file:) URLs; split off an optional scheme
    $type = ($href =~ s/^(\S+:)//o) ? $1 : '';
    if ($type eq '' or $type =~ /^file:/i) {
      # split off an optional #anchor
      $anchor = ($href =~ s/(\#.*)$//o) ? $1 : '';
      #print "href: ($type,$href,$anchor)\n";
      if (($href =~ /\.html?$/) and -f "$href.gz") {
        # append `.gz' -- only if the compressed file actually exists
        $$attr{'href'} = "$type$href.gz$anchor";
        # rebuild origtext from the (modified) attributes
        $origtext = '<a '
          . join(' ', map { "$_=\"$$attr{$_}\"" } @$attrseq) . '>';
      }
    }
  }
  print ::OUT $origtext;
}

sub end {
  my ($self, $tag) = @_;
  print ::OUT "</$tag>";
}

sub text {
  my ($self, $text) = @_;
  print ::OUT "$text";
}

sub comment {
  my ($self, $comment) = @_;
  print ::OUT "<!--$comment-->";
}

#

package main;

if ($#ARGV == -1) {
  print "usage: fixhtmlgz  ...\n";
  exit 1;
}

$p = Parser->new;

while ($filename = shift) {
  if ( ! -f $filename ) {
print "error: file $filename not found, skipping.\n";
next;
  }

  $output = "$filename.fixed";
  open(OUT,">$output") or die "cannot open output file $output: $!";

  $p->parse_file($filename);

  close(OUT);

  rename($filename,"$filename.bak") or die "cannot rename $filename: $!";
  rename($output,$filename) or die "cannot rename $output: $!";
}

exit 0;



Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-28 Thread branden
On Fri, 27 Jun 1997, Martin Schulze wrote:

> On Jun 27, Christian Schwarz wrote
> 
> > I wrote exactly the same thing in Perl (on your request!) some time ago. I
> > have attached it to this mail.
> > 
> > I don't know which version is better. It looks like Lars' implementation
> > has hard-coded a lot of HTML tags for processing. Mine is based on Perl's
> > HTML::Parser class and is thus independent of any specific HTML tags.
> 
> "better"?  I'm not sure if this is important.  If the tool should
> be used on every system, we should use the perl tool.  We cannot
> recommend another high-level interpreter (python) - perl should be enough.
> 
> > # Currently, we have a problem with compressed HTML: we can access
> > # compressed HTML fine, but links don't work very well. The problem
> > # is that the link says "foo.html", and the actual file is
> > # "foo.html.gz",
> > # and the browsers and servers aren't intelligent enough to handle
> > # this invisibly. This means that we can't install compressed HTML, if
> > # it contains links.
> 
> Wouldn't it be a cool project if we improved all the browsers Debian
> uses to handle this and gave the code back to the upstream
> release?  I like the idea.

Yes, but...

On Thu, 26 Jun 1997, Federico Di Gregorio wrote:
> 
> From: Federico Di Gregorio <[EMAIL PROTECTED]>
> To: Debian Development 
> 
> [...]
> > > Right, but typing xxx.html.gz will work! We can write a little sed script
> > > to change the links from xxx.html to xxx.html.gz inside the documents.
> > 
> > What do the popular http daemons do about this?  I think a good solution
> > would be:
> > 
> > For every .html request that comes in (or perhaps for any request in
> > general), look for a file fitting the traditional spec.
> > 
> > If that fails, look for a .gz version of that file in the same directory.
> > 
> > If that fails, return the usual 404 error.
> > 
> > Does anything already implement this?  If not, why not?
> 
> boa supports this. It even decompresses the .gz file on the fly.
> I just tried to access http://localhost/doc/HOWTO/INFO-SHEET and
> it works!
> 
> boa is also very small and really fast. I think it should be the default
> httpd of choice for a small debian system.

I haven't tested this myself, but it looks like the ideal solution,
because:

1) You can compress documentation, like we want to, without having to hack
or modify it, and
2) It permits document maintainers to gunzip the html file for
modifications, and uses the uncompressed one in case they forget to (or
don't want to) re-compress it.

--
G. Branden Robinson |   Kissing girls is a goodness.  It is a
Purdue University   |  growing closer.  It beats the hell out
[EMAIL PROTECTED]  |  of card games.
http://www.ecn.purdue.edu/~branden/ |-- Robert Heinlein






Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-27 Thread Martin Schulze
On Jun 27, Christian Schwarz wrote

> I wrote exactly the same thing in Perl (on your request!) some time ago. I
> have attached it to this mail.
> 
> I don't know which version is better. It looks like Lars' implementation
> has hard-coded a lot of HTML tags for processing. Mine is based on Perl's
> HTML::Parser class and is thus independent of any specific HTML tags.

"better"?  I'm not sure if this is important.  If the tool should
be used on every system, we should use the perl tool.  We cannot
recommend another high-level interpreter (python) - perl should be enough.

> # Currently, we have a problem with compressed HTML: we can access
> # compressed HTML fine, but links don't work very well. The problem
> # is that the link says "foo.html", and the actual file is
> # "foo.html.gz",
> # and the browsers and servers aren't intelligent enough to handle
> # this invisibly. This means that we can't install compressed HTML, if
> # it contains links.

Wouldn't it be a cool project if we improved all the browsers Debian
uses to handle this and gave the code back to the upstream
release?  I like the idea.

Regards... Joey

-- 
Individual Network e.V._/ OrgaTech
[EMAIL PROTECTED]_/  [EMAIL PROTECTED]
Geschaeftszeit: Di+Mi+Fr, 15-18 Uhr  _/Tel: (0441) 9808556





Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-27 Thread Christoph Lameter
On Sat, 28 Jun 1997, Lars Wirzenius wrote:

>Christoph Lameter:
>> This was discussed half a year ago and the webservers were fitted
>> with on the fly decompression for .gz files.
>
>For the umpteenth time, that DOES NOT HELP WHEN THE USER IS READING
>THE FILES DIRECTLY, NOT VIA A WEB SERVER.

Web browsers are small. Don't think instantly of Apache. And keep calm,
please.

I run boa for that purpose on some machines and it's really good. I can
read compressed docs without dwww.

>> What dwww does is already unnecessary. Changing the content of .html
>> files might lead to problems with web browsers.
>
>dwww doesn't change content.

I did not intend to say that dwww does. The conversion program intends
to, though, and will cause a mess with the browsers.

>> Not all platforms have gzip available by default.
>
>Then they lose, unless they go via a web server that uncompresses
>things. It's more important that things work under Debian than
>under, say, OS-9.

So the beginning user with his straight-out-of-the-box Win95 loses when
trying to access Debian documentation? Debian already has a name for
user-hostility with dpkg. You want to set up new barriers for newbies?
You want me to run around campus installing gzip on 300 machines because
those users are not able to? Granted, it's rare that those guys will read
technical docs, but quite a few use pine.

>> Please do not do this. We do not have any problems here and you are about to 
>> create some.
>
>We have the rather unpleasant situation that reading documentation
>requires a web server. That's a problem. Fixing it requires changing
>the .html files.

What is so unpleasant about a small webserver being part of the
standard set of packages?

--- +++ --- +++ --- +++ --- +++ --- +++ --- +++ --- +++ ---




Re: fixhrefgz - tool for converting anchors to gzipped files

1997-06-27 Thread Christoph Lameter
This was discussed half a year ago, and the webservers were fitted
with on-the-fly decompression for .gz files. What dwww does is already
unnecessary. Changing the content of .html files might lead to problems
with web browsers. Not all platforms have gzip available by default.

Please do not do this. We do not have any problems here and you are about to 
create some.



In article <[EMAIL PROTECTED]> you wrote:

: [ Please don't Cc: public replies to me. ]

: During the recent thread on providing documentation in HTML,
: the need to compress it was pointed out. The compression itself
: is a trivial application of find, xargs, and gzip (or just gzip,
: of course), but that changes the files, so that links within
: the documentation break.

: Things work if you read the documentation through dwww, since
: dwww gives you foo.html.gz, if it exists and foo.html doesn't
: exist. That doesn't help if you browse the filesystem directly,
: and not via dwww and a web server.

: I hacked together a Python program that converts the links
: in the files themselves. It is attached.

: I've tried it with one of my own packages (sex), and it seems
: to work. Browsing the filesystem directly works, if the browser
: can handle gzipped files. Lynx works; Netscape 3.01 doesn't
: work, but I seem to recall that an earlier version did work.
: Someone familiar with mailcap might be able to get Netscape
: to work as well.

: Comments?

: -- 
: Please read  before mailing me.
: Please don't Cc: public replies to me.


: [attachment: fixhrefgz]

: #!/usr/bin/python

: """Convert local links in HTML documents to/from gzipped documents.

: Usage: fixhrefgz [-hzu] [--help] [--gzip] [--gunzip] [file ...]

: This program will convert links to local documents so that they
: point at the version compressed with gzip. Before conversion, an
: anchor tag might look like this:

:   <a href="foo.html">foo</a>

: After conversion, it will look like this:

:   <a href="foo.html.gz">foo</a>

: This allows one to compress HTML files. All other tags are
: unchanged by this program (except for case conversion).

: Lars Wirzenius, [EMAIL PROTECTED]

: """

: import formatter, htmllib, sys, urlparse, getopt, StringIO

: def gzip_mangler(path):
:     if path[-5:] == ".html" or path[-4:] == ".htm":
:         path = path + ".gz"
:     return path

: def gunzip_mangler(path):
:     if path[-8:] == ".html.gz" or path[-7:] == ".htm.gz":
:         path = path[:-3]
:     return path

: mangler = gzip_mangler

: class ParseAndCat(htmllib.HTMLParser):
:     def __init__(self, formatter, verbose=0):
:         htmllib.HTMLParser.__init__(self, formatter, verbose)
:         self.nofill = 1

:     def anchor_bgn(self, href, name, type):
:         parts = urlparse.urlparse(href)
:         if not parts[0] and not parts[1]:
:             path = parts[2]
:             path = mangler(path)
:             parts = (parts[0], parts[1], path,
:                      parts[3], parts[4], parts[5])
:             href = urlparse.urlunparse(parts)
:
:         s = '<A HREF="%s">' % href
:         self.formatter.add_literal_data(s)

:     def anchor_end(self):
:         self.formatter.add_literal_data('</A>')

:     def handle_image(self, src, alt, ismap, align, width, height):
:         s = '<IMG SRC="%s" ALT="%s">' % (src, alt)
:         self.formatter.add_literal_data(s)

:     def _format_tag(self, tag, attrs):
:         s = '<' + tag
:         for attr, value in attrs:
:             if value:
:                 s = s + (' %s="%s"' % (attr, value))
:             else:
:                 s = s + (' %s' % attr)
:         s = s + '>'
:         self.formatter.add_literal_data(s)

:     def start_html(self, attrs):  self._format_tag('HTML', attrs)
:     def end_html(self):           self._format_tag('/HTML', [])

:     def start_head(self, attrs):  self._format_tag('HEAD', attrs)
:     def end_head(self):           self._format_tag('/HEAD', [])

:     def start_body(self, attrs):  self._format_tag('BODY', attrs)
:     def end_body(self):           self._format_tag('/BODY', [])

:     def start_title(self, attrs): self._format_tag('TITLE', attrs)
:     def end_title(self):          self._format_tag('/TITLE', [])

:     def do_base(self, attrs):     self._format_tag('BASE', attrs)
:     def do_isindex(self, attrs):  self._format_tag('ISINDEX', attrs)
:     def do_link(self, attrs):     self._format_tag('LINK', attrs)
: