Re: http Mirrors

1998-01-30 Thread Tim Sailer
On Fri, Jan 30, 1998 at 03:35:55PM -0700, Jason Gunthorpe wrote:
> 
> Could we perhaps ask our FTP mirrors to put the archive up for http as
> well as FTP? 
> 
> HTTP is much easier to work with when dealing with firewalls and I intend
> to have support for it in Deity, trouble is right now there are a minimal
> number of HTTP servers that offer debian :|

I'll have llug.sep.bnl.gov up with http tonight.

Tim

-- 
   ><
   >> Tim Sailer   ><  Coastal Internet, Inc.  <<
   >> Network and Systems Operations   ><  PO Box 671  <<
   >> http://www.buoy.com  ><  Ridge, NY 11961 <<
   >> [EMAIL PROTECTED] ><  (516) 476-3031  <<
   ><


--
TO UNSUBSCRIBE FROM THIS MAILING LIST: e-mail the word "unsubscribe" to
[EMAIL PROTECTED] . 
Trouble?  e-mail to [EMAIL PROTECTED] .


http Mirrors

1998-01-30 Thread Jason Gunthorpe

Could we perhaps ask our FTP mirrors to put the archive up for http as
well as FTP? 

HTTP is much easier to work with when dealing with firewalls and I intend
to have support for it in Deity, trouble is right now there are a minimal
number of HTTP servers that offer debian :|

Thanks,
Jason




Re: content negotiation for language in web pages

1998-01-30 Thread Tommi Virtanen
On Fri, Jan 30, 1998 at 03:26:23PM -0500, James A.Treacy wrote:
> This does almost exactly what we want although I was hoping to avoid 
> type-maps.
> One thing I'd prefer is if the server could be convinced to use the
> type-map even if foo.html exists. That way, all mirrors could use ftp or rsync
> to mirror the pages. Servers not using CN would simply serve foo.html (really
> foo.en.html) - exactly what we want. Users could still access the other pages
> by the links at the bottom.
>
> Mirroring the human-written pages by wget is acceptable, but wget isn't very
> efficient so it is not a good idea to use it on the rest of the archive
> (Packages, Lists-Archives and Bugs). Also, I don't like the idea of forcing
> anyone not using CN to mirror using wget.

I see you don't like mirroring with HTTP. All right.
The only way around it I can see right now is to create a
second tree for non-CN mirrors with links from .html ->
.en.html. A tree full of symlinks. Yack.
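
A minimal sketch of how such a second tree could be generated, assuming
the pages are named foo.en.html, foo.de.html and so on. The function
name build_link_tree and the directory names www and www-nocn are
hypothetical, chosen only for illustration:

```python
import os
import tempfile
from pathlib import Path


def build_link_tree(source, target, default_lang="en"):
    """Create target/<name>.html symlinks pointing at source/<name>.<lang>.html.

    Only pages in the default language get a link, so a non-CN mirror of
    the target tree serves exactly one language.
    """
    suffix = f".{default_lang}.html"
    created = []
    for page in sorted(Path(source).rglob(f"*{suffix}")):
        relative = page.relative_to(source)
        # foo.en.html -> foo.html, keeping the same subdirectory layout
        link = Path(target) / relative.parent / (page.name[: -len(suffix)] + ".html")
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink() or link.exists():
            link.unlink()
        # Use a relative link target so the two trees can be moved together.
        link.symlink_to(os.path.relpath(page, link.parent))
        created.append(link)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source, target = Path(tmp, "www"), Path(tmp, "www-nocn")
        source.mkdir()
        (source / "foo.en.html").write_text("English page\n")
        (source / "foo.de.html").write_text("German page\n")
        for link in build_link_tree(source, target):
            print(link.name, "->", os.readlink(link))
```

Whether a mirror copies the links as links or flattens them into files
depends on the mirroring tool, so this only covers the master side.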

Umm. The question is now - is this good enough, or do we
try to think of something even better? We haven't even
included mod_rewrite yet;)

Note that again, on the CN side, all would be swell with urls
referencing just foo -- but then again it would break non-CN
mirrors. This is getting ugly.

I'll do some testing with .html.s tomorrow. Maybe something
is possible..

> There are some people who feel content negotiation should be used for every
> page (I'm not sure why. Once CN has you using your preferred language, what
> difference does it make). I'd like their comments on this scheme (repeated 
> below).

Actually I'm indifferent about that, I just dislike urls
containing the language. I actually do most of mine
without the .html..

-- 
[EMAIL PROTECTED] - it's a valid address w/o spam | +358-50-5124907
f u cn rd ths, thn u cn rd perl 2 | rm -rf / && echo bye-bye. |   --tv




Re: content negotiation for language in web pages

1998-01-30 Thread James A . Treacy
This does almost exactly what we want although I was hoping to avoid type-maps.
One thing I'd prefer is if the server could be convinced to use the
type-map even if foo.html exists. That way, all mirrors could use ftp or rsync
to mirror the pages. Servers not using CN would simply serve foo.html (really
foo.en.html) - exactly what we want. Users could still access the other pages
by the links at the bottom.

Mirroring the human-written pages by wget is acceptable, but wget isn't very
efficient so it is not a good idea to use it on the rest of the archive
(Packages, Lists-Archives and Bugs). Also, I don't like the idea of forcing
anyone not using CN to mirror using wget.

There are some people who feel content negotiation should be used for every
page (I'm not sure why. Once CN has you using your preferred language, what
difference does it make). I'd like their comments on this scheme (repeated 
below).

BTW, apache behaves strangely if foo.html exists under this arrangement. If it
is a file, it is served (as the doc says). If it is a link, you get '403
Forbidden'. If CN is not turned on then the link works fine (as it should).

- Jay

> # foo.html.var
> URI: foo.en.html
> Content-type: text/html
> Content-language: en
> 
> URI: foo.de.html
> Content-type: text/html
> Content-language: de
> 
> # bar.html.var
> URI: bar.en.html
> Content-type: text/html
> Content-language: en
> 
> URI: bar.de.html
> Content-type: text/html
> Content-language: de
> 
> # bar.de.html
> Germanfoo
> [auto][de]
> 
> # bar.en.html
> Englishfoo
> [auto][de]
> 
> # foo.de.html
> Germanbar
> [auto][de]
> 
> # foo.en.html
> Englishbar
> [auto][de]
> 




Re: content negotiation for language in web pages

1998-01-30 Thread Tommi Virtanen
On Fri, Jan 30, 1998 at 01:12:06AM -0500, James A.Treacy wrote:
> Aaargh. This content negotiation (CN) has some annoying quirks.
> After re-reading the apache manual it appears we don't have a lot
> of choice.

Don't give up hope yet. How about creating type-maps for
all the files? With these files:

# foo.html.var
URI: foo.en.html
Content-type: text/html
Content-language: en

URI: foo.de.html
Content-type: text/html
Content-language: de

# bar.html.var
URI: bar.en.html
Content-type: text/html
Content-language: en

URI: bar.de.html
Content-type: text/html
Content-language: de

# bar.de.html
Germanfoo
[auto][de]

# bar.en.html
Englishfoo
[auto][de]

# foo.de.html
Germanbar
[auto][de]

# foo.en.html
Englishbar
[auto][de]

My test pages at www.havoc.fi/z/foo.html seem to work
just fine! And at www.havoc.fi/zmirror/foo.html is a mirror
made with wget --mirror, it just causes "auto" to be "en".
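
Writing the maps by hand would get tedious, so here is a rough sketch
of generating them from the file names. It assumes every variant is
named <base>.<two-letter-language>.html and that the map is called
<base>.html.var as in the listing above; the function name
type_map_entries is hypothetical:

```python
import re
from collections import defaultdict

# Matches names such as foo.en.html and captures the base and the language.
VARIANT = re.compile(r"^(?P<base>.+)\.(?P<lang>[a-z]{2})\.html$")


def type_map_entries(filenames):
    """Group language variants by base name and return {map_name: map_text}."""
    variants = defaultdict(list)
    for name in sorted(filenames):
        match = VARIANT.match(name)
        if match:  # anything that is not a language variant is ignored
            variants[match["base"]].append((match["lang"], name))

    maps = {}
    for base, entries in variants.items():
        blocks = [
            f"URI: {name}\nContent-type: text/html\nContent-language: {lang}\n"
            for lang, name in entries
        ]
        # Variants inside one map are separated by a blank line.
        maps[f"{base}.html.var"] = "\n".join(blocks)
    return maps


if __name__ == "__main__":
    names = ["foo.en.html", "foo.de.html", "bar.en.html", "bar.de.html"]
    for map_name, text in type_map_entries(names).items():
        print(f"# {map_name}")
        print(text)
```

The server still has to be configured to treat the .var files as
type-maps, which this sketch does not touch.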

The only problem is if we want content-negotiating mirrors,
they must mirror the pages by ftp -- so that the maps get
mirrored too. I don't think that is too bad, as those mirrors will
have to check their configs for type-maps etc..

Any problems here?
-- 
[EMAIL PROTECTED] - it's a valid address w/o spam | +358-50-5124907
f u cn rd ths, thn u cn rd perl 2 | rm -rf / && echo bye-bye. |   --tv




Re: content negotiation for language in web pages

1998-01-30 Thread Tommi Virtanen
On Fri, Jan 30, 1998 at 01:12:06AM -0500, James A.Treacy wrote:
> This makes it quite difficult to have pages that work with both servers that
> do and do not support CN. Pages can do the following:
> 1. pages reference foo. Works great on a CN server. Doesn't work at all on a
>    non-CN server.
> 2. pages reference foo.html . In all cases foo.html will be served if it
>    exists. If it doesn't exist, a CN server will look for files with the
>    language extensions and a non-CN server will fail.
> 3. pages reference foo.html. . Works on all servers but obviates the
>    need for CN.

Please note that number 3 does not work unless the server sets the
content-type of .html. to text/html. Otherwise it will be
text/plain and.. we don't want that. So unless CN is possible, the files
should either be separated to dirs or named ..html
-- 
[EMAIL PROTECTED] - it's a valid address w/o spam | +358-50-5124907
f u cn rd ths, thn u cn rd perl 2 | rm -rf / && echo bye-bye. |   --tv




Re: content negotiation for language in web pages

1998-01-30 Thread James A . Treacy
Aaargh. This content negotiation (CN) has some annoying quirks.
After re-reading the apache manual it appears we don't have a lot
of choice.

From the Apache manual:
> The effect of MultiViews is as follows: if the server receives a request for
> /some/dir/foo, if /some/dir has MultiViews enabled, and /some/dir/foo does not
> exist, then the server reads the directory looking for files named foo.*, and
> effectively fakes up a type map which names all those files, assigning them
> the same media types and content-encodings it would have if the client had
> asked for one of them by name. It then chooses the best match to the client's
> requirements, and forwards them along.

This makes it quite difficult to have pages that work with both servers that
do and do not support CN. Pages can do the following:
1. pages reference foo. Works great on a CN server. Doesn't work at all on a
   non-CN server.
2. pages reference foo.html . In all cases foo.html will be served if it exists.
   If it doesn't exist, a CN server will look for files with the language
   extensions and a non-CN server will fail.
3. pages reference foo.html. . Works on all servers but obviates the need
   for CN.
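
The three cases above can be modelled with a toy lookup. This is only
a sketch of the behaviour described in the manual excerpt, not of
Apache's real selection algorithm. It assumes variants are named
foo.html.en and foo.html.de (the index.html.de style mentioned
elsewhere in the thread), and the function name resolve is
hypothetical:

```python
def resolve(requested, files, cn_enabled, preferred_langs=("en",)):
    """Return the file a server would serve for `requested`, or None on failure.

    files           -- names present in the directory
    cn_enabled      -- True for a server with MultiViews-style negotiation
    preferred_langs -- the client's languages, most preferred first
    """
    # An existing file is always served as-is, with or without CN.
    if requested in files:
        return requested
    # A non-CN server has nothing else to try.
    if not cn_enabled:
        return None
    # A CN server looks for requested.* and picks by language preference.
    candidates = sorted(f for f in files if f.startswith(requested + "."))
    for lang in preferred_langs:
        for name in candidates:
            if lang in name.split(".")[1:]:
                return name
    # No variant in an acceptable language; the real server's fallback
    # behaviour is more involved and is not modelled here.
    return None


if __name__ == "__main__":
    files = {"foo.html.en", "foo.html.de"}
    for link in ("foo", "foo.html", "foo.html.de"):
        for cn in (True, False):
            print(link, "CN" if cn else "non-CN", "->",
                  resolve(link, files, cn, ("de", "en")))
```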

It seems everyone agrees that the first method (pages served using CN) is the
best solution. Essentially, this means the decision on the route we take is in
the hands of the mirror administrators. If they all agree to set up CN, then
we'll get to do this properly. If not, we'll be stuck doing the translations
the old-fashioned way (method 3 or separate directories).

I'll contact the administrators and find out if they'll all set it up.

- Jay




Re: content negotiation for language in web pages

1998-01-30 Thread James A . Treacy
>   Pardon me for being dense, but it is not any worse than the
>  current situation, is it? I mean, we have the default language at the
>  moment. Though we can't mandate content negotiation on mirror
>  servers, we can suggest that they do so; don't most modern servers
>  support CN quite easily? 
> 
>   And in any case, it does seem like CN is the way to go, and
>  even mirror servers that do not support it yet shall gradually get to
>  supporting it.
> 
Exactly. That's why I suggested the 3rd version. It allows the pages
to work on both servers that do and do not support CN.

- Jay




Re: content negotiation for language in web pages

1998-01-30 Thread Manoj Srivastava
Hi,
>>"James" == James A Treacy <[EMAIL PROTECTED]> writes:

James> The big problem with this is that it all hinges on every server
James> supporting content negotiation (CN from here on in). We don't
James> control the mirrors so it's a big problem. Without content
James> negotiation, you run into a problem in deciding how to write
James> links. If you just write .html, then CN does the right
James> thing. Without CN, you can only get one language.

Pardon me for being dense, but it is not any worse than the
 current situation, is it? I mean, we have the default language at the
 moment. Though we can't mandate content negotiation on mirror
 servers, we can suggest that they do so; don't most modern servers
 support CN quite easily? 

And in any case, it does seem like CN is the way to go, and
 even mirror servers that do not support it yet shall gradually get to
 supporting it.

manoj

-- 
 The likelihood of anything happening is in direct proportion to the
 amount of trouble it will cause if it does happen.  -- Sam W. Warren
Manoj Srivastava  <[EMAIL PROTECTED]> 
Key C7261095 fingerprint = CB D9 F4 12 68 07 E4 05  CC 2D 27 12 1D F5 E8 6E




Re: content negotiation for language in web pages

1998-01-30 Thread James A . Treacy
I've moved this to debian-www where it should have been in the first place.

> > Mirrors that don't support content negotiation would be stuck serving
> > in one language (the pages would be set up to default to English).
> 
> Not necessarily  true. I mean, the English part. For example, if the
> German mirror doesn't support negotiation, under the previous scheme, it
> can only mirror the German directories. (Flattening symlinks, of course)
> 
> > It has the benefit of supporting partial translations. If a
> 
> Yes, that's why I used this in the first place. It works great.
> 
> > Also, if a browser doesn't know about content negotiation or the user
> > hasn't configured it to use their preferred language (and the default
> > is usually English), the user will get English docs.
> 
> This again may not be always true. If the browser doesn't support content
> negotiation, it has an internal list (at least Apache does). It knows what
> language to serve by default.
> 
The big problem with this is that it all hinges on every server supporting
content negotiation (CN from here on in). We don't control the mirrors so
it's a big problem. Without content negotiation, you run into a problem
in deciding how to write links. If you just write .html, then CN does
the right thing. Without CN, you can only get one language.

> > 3. Similar to 2, but each language references the pages in its language,
> >e.g. index.html.de would reference vendors.html.de . At the main
> 
> ugly, ugly, ugly. It's a nightmare to maintain. Plus, the server has to be
> reconfigured to understand that html.en is text/html, and that is not
> always possible because of the "extra" dot.
> 
I don't see why it's ugly. It's a compromise for an imperfect world. It's
not any more difficult to maintain than any of the other methods.
Also, a server that doesn't understand content negotiation doesn't need to
worry about html.en as all the English files would have a link from .html to
.html.en .

> >page the user would get a language (either by content negotiation
> >or by explicitly choosing the language by using one of the cross-links)
> >and all links followed after that would be in that language.
> >Someone jumping into a different page would have no idea other languages
> >existed.
> 
> With the setup I presented, this can be solved in this way:
> 
> http://www.debian.org/lang-1 reads DocumentRoot.lang-1 and it DOESN'T do
> content negotiation. The other languages are treated in the same way.
> http://www.debian.org/ reads DocumentRoot and it DOES content negotiation.
> 
> Drawback: you have to remember to use relative links only, that is, <A
> HREF="/dir/document.html"> is not allowed. You have to use <A
> HREF="../../dir/document.html">. This almost always limits the usefulness
> of server generated footer and headers that contain links.
> 
This all supposes that we have some control over the mirrors. Many of the
mirror administrators have no time for this sort of thing. We are lucky
that they have convinced their superiors to donate space on the machines.

BTW, the entire web pages use relative links. Works great.
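
For anyone converting a site-absolute link by hand, a small sketch of
the arithmetic. The function name relative_href and the page paths are
hypothetical:

```python
import posixpath


def relative_href(from_page, to_page):
    """Rewrite a site-absolute link target relative to the page containing it.

    Both arguments are paths from the document root, e.g. /a/b/page.html.
    """
    start = posixpath.dirname(from_page) or "/"
    return posixpath.relpath(to_page, start)


if __name__ == "__main__":
    # A page two directories deep linking to /dir/document.html
    print(relative_href("/a/b/page.html", "/dir/document.html"))
    # A page in the document root linking to the same target
    print(relative_href("/index.html", "/dir/document.html"))
```

Because the result never starts with a slash, the same pages work
whether the tree is served from the document root or from a
per-language subdirectory.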

> I really think content-negotiation is the way to go, considering that it's
> something that can be configured on a server by server basis. For example,
> www.es.debian.org (the mirror in Spain, not the server in Spanish that
> someone else proposed) can be configured to provide documents in Spanish
> by default. www.it.debian.org provides documents in Italian,
> www.us.debian.org in English, and so on.
> 
Of course CN is the way to go. At the same time it is important,
given the structure of Debian, that we also make the pages accessible in
all languages even if CN isn't available. That's why number 3 was proposed.
When you catch up on the thread, you'll see a few changes have been
proposed which should make this work quite well.

> The problem I saw, and still see, is search engines are stupid enough not
> to know about content-negotiation (well, I complained, and someone at
> Altavista emailed me saying they were considering that, maybe they have
> implemented it by now). For example, http://www.debian.org/ may appear in
> search engines only in English, but when the user gets there it suddenly
> starts speaking German (because the browser asks for "de fr en", for
> example). For me, that's really nice, but others may not think so. That's
> the other reason I came up with the DocumentRoot.lang thing. 
> 
As for setting up searching, each file should say what language it is in
(say with a meta tag). Searches will check for this tag so only the language(s)
specified will be returned. I've used htdig and glimpse and found that they both
had annoying limitations. It looks like glimpse has been improved in the
last 6 months so I will take a look at it again.
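
A rough sketch of the tag check, independent of any particular search
engine. It assumes the language is declared with a
<meta http-equiv="Content-Language" content="..."> tag, which is one
possible convention and not something settled in this thread. The
names LanguageTagParser, page_language and filter_by_language are
hypothetical:

```python
from html.parser import HTMLParser


class LanguageTagParser(HTMLParser):
    """Remember the first Content-Language meta tag seen in a page."""

    def __init__(self):
        super().__init__()
        self.language = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.language is not None:
            return
        # HTMLParser already lowercases tag and attribute names.
        attributes = {key: (value or "") for key, value in attrs}
        if attributes.get("http-equiv", "").lower() == "content-language":
            self.language = attributes.get("content", "").strip().lower() or None


def page_language(html_text):
    """Return the declared language of a page, or None if it has no tag."""
    parser = LanguageTagParser()
    parser.feed(html_text)
    return parser.language


def filter_by_language(pages, wanted):
    """Return the names of pages whose declared language is in `wanted`."""
    return sorted(name for name, text in pages.items()
                  if page_language(text) in wanted)


if __name__ == "__main__":
    pages = {
        "foo.en.html": '<META HTTP-EQUIV="Content-Language" CONTENT="en">',
        "foo.de.html": '<meta http-equiv="content-language" content="de">',
    }
    print(filter_by_language(pages, {"de"}))
```

Pages without the tag are left out of every language-restricted
search in this sketch; a real setup might prefer to treat them as the
default language instead.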

- Jay

