[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2014-02-07 Thread Launchpad Bug Tracker
[Expired for apache2 (Ubuntu) because there has been no activity for 60
days.]

** Changed in: apache2 (Ubuntu)
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to apache2 in Ubuntu.
https://bugs.launchpad.net/bugs/1258546

Title:
  Apache2 defaults to the wrong character set, it should be UTF-8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/1258546/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2013-12-09 Thread Robie Basak
Thank you for taking the time to report this bug and helping to make
Ubuntu better.

I have checked both Precise and Trusty, and can find no windows-1252
default that you refer to. I used wget -S to see the headers returned
by the Apache server, and it did not specify a character set.

Could you please provide exact steps to reproduce this bug on a freshly
installed Ubuntu Server system, and detail what you are expecting and
what happens on your system instead? Please use wget -S or similar to
demonstrate that the problem is actually with Apache, and not with a
client, or the web page source or similar.

Without a test case, there isn't enough information here for a developer
to confirm this issue is a bug, or to begin working on it, so I am
marking this bug Incomplete for now.

If you can provide exact steps so that a developer can reproduce the
original problem, then please add them to this bug and change the status
back to New.

** Changed in: apache2 (Ubuntu)
   Status: New => Incomplete



[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2013-12-09 Thread Lars Noodén
I can do the Server system, too, but right now the steps I have followed
to get the problem are:

1. install Ubuntu 12.04 desktop, or Lubuntu 14.04devel desktop (it occurs on both)
2. install Apache2, leaving default configuration settings
3. load an html page from the server in a browser (in 12.04 or 14.04devel)
4. check page info regarding Encoding

Adding AddDefaultCharset utf-8 to the configuration file makes the problem go 
away.
But could this be a problem with the browser anyway?

$ wget -S http://xx.yy.zz.aa
--2013-12-09 14:38:34--  http://xx.yy.zz.aa/
Connecting to xx.yy.zz.aa:80... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Date: Mon, 09 Dec 2013 12:38:34 GMT
  Server: Apache/2.2.22 (Ubuntu)
  Last-Modified: Sat, 07 Dec 2013 14:39:28 GMT
  ETag: 222742-b1-4ecf2bae66f2c
  Accept-Ranges: bytes
  Content-Length: 177
  Vary: Accept-Encoding
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: text/html
Length: 177 [text/html]

$ wget -S http://xx.yy.zz.bb
--2013-12-09 14:39:46--  http://xx.yy.zz.bb/
Connecting to xx.yy.zz.bb:80... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Date: Mon, 09 Dec 2013 12:39:46 GMT
  Server: Apache/2.4.6 (Ubuntu)
  Last-Modified: Mon, 25 Nov 2013 16:12:19 GMT
  ETag: b1-4ec02a0e06c9c
  Accept-Ranges: bytes
  Content-Length: 177
  Vary: Accept-Encoding
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: text/html
Length: 177 [text/html]



[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2013-12-09 Thread Lars Noodén
The one browser is Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:25.0) 
Gecko/20100101 Firefox/25.0
HTTP_ACCEPT Headers : 
text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 gzip, deflate 
en,en-us;q=0.7,sv;q=0.3

The other is: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:25.0) Gecko/20100101 
Firefox/25.0
HTTP_ACCEPT Headers : text/html, */* gzip, deflate en-US,en;q=0.5



[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2013-12-09 Thread Lars Noodén
I've done a fresh installation from the ubuntu-12.04.3-server-i386.iso
image and installed Apache2.  The Firefox web browser still shows that
the pages being served are encoded in windows-1252 instead of UTF-8,
which is what the locale is set to, or ISO-8859 which would be the old
standard.



[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2013-12-09 Thread Robie Basak
I believe browsers typically try to guess. If Apache serves a page that
doesn't have any non-ASCII characters in it, then browsers can guess,
and windows-1252 would still be correct, since the document was a
strict subset of this charset.
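
The point about ASCII-only pages can be checked with a minimal sketch. The sample page bytes below are a made-up stand-in, not the file from this report; the only assumption is that the page contains no non-ASCII bytes.

```python
# Hypothetical ASCII-only page body (not taken from the reporter's server).
ascii_page = b"<html><body><h1>Hello</h1></body></html>"

# Decode the same bytes under both candidate encodings.
as_cp1252 = ascii_page.decode("cp1252")  # Python's codec name for windows-1252
as_utf8 = ascii_page.decode("utf-8")

# For pure ASCII input the two decodings are identical, so a browser's
# guess of windows-1252 cannot be told apart from UTF-8 by looking at the page.
same_text = as_cp1252 == as_utf8
print(same_text)
```

The two encodings only diverge once the body contains bytes above 0x7F.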

What happens if you serve a UTF-8 encoded file? What does the browser do
then?

If you want Apache to assume that everything in /var/www is UTF-8 by
default, and explicitly set that in every response, then I can
understand such a request, but I think it needs to be coordinated with
the Debian packaging, perhaps also including upstream's view on a
suitable default.



[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2013-12-09 Thread Lars Noodén
If I serve a UTF-8 encoded file *AND* set the default myself in Apache,
then everything is fine.  If the default encoding is left alone, Apache
serves it up as windows-1252 and then UTF-8 encoded letters come out
as garbage like this: åäöÅÄÖéÉ
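
As a rough illustration of the garbling described here, the following sketch encodes the same letters as UTF-8 and then decodes the bytes as windows-1252, which is what a client does if it guesses that encoding. It makes no claim about which side picked the encoding; it only shows the effect of the mismatch.

```python
original = "åäöÅÄÖéÉ"

# What a UTF-8 encoded file contains on disk and on the wire.
utf8_bytes = original.encode("utf-8")

# What a reader sees if those bytes are interpreted as windows-1252:
# each two-byte UTF-8 sequence turns into two unrelated characters.
misread = utf8_bytes.decode("cp1252")
print(misread)

# The damage is reversible as long as the bytes themselves are untouched.
roundtrip = misread.encode("cp1252").decode("utf-8")
print(roundtrip == original)
```

Because the bytes are intact, labelling the response correctly (or having the client pick UTF-8) is enough to restore the text.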

As seen from the browser HTTP_ACCEPT Headers, it seems to be the web
server making the choice.

Apache has a default encoding.  It should be a standard one, UTF-8 or
ISO-8859; having non-standard windows-1252 in the default configuration
just makes a mess.  It's easy to fix by adding AddDefaultCharset to the
configuration.  However, it would be great if Apache worked with
non-English languages out of the box, especially when the locale is set so.



[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2013-12-09 Thread Robie Basak
> If the default encoding is left alone, Apache serves it up as
> windows-1252 and then UTF-8 encoded letters come out as garbage like
> this: åäöÅÄÖéÉ

I do not see this behaviour:

root@trusty:/var/www# xxd test.txt 
000: 5363 6872 c3b6 6469 6e67 6572 2773 2043  Schr..dinger's C
010: 6174 0a  at.
root@trusty:/var/www# wget -S -O/dev/null http://localhost/test.txt
--2013-12-09 15:26:28--  http://localhost/test.txt
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Date: Mon, 09 Dec 2013 15:26:28 GMT
  Server: Apache/2.4.6 (Ubuntu)
  Last-Modified: Mon, 09 Dec 2013 12:19:37 GMT
  ETag: 13-4ed1902654840
  Accept-Ranges: bytes
  Content-Length: 19
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: text/plain
Length: 19 [text/plain]
Saving to: ‘/dev/null’

100%[=]
19  --.-K/s   in 0s

2013-12-09 15:26:28 (1.52 MB/s) - ‘/dev/null’ saved [19/19]

root@trusty:/var/www#

Here, Apache is just not setting an encoding. It never claims
windows-1252.
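
For reference, the bytes in the xxd dump above can be checked independently of any server or browser. This is only a sketch: the hex string is copied from the dump, and the variable names are illustrative.

```python
# Hex bytes copied from the xxd output of test.txt shown above.
hex_dump = "5363 6872 c3b6 6469 6e67 6572 2773 2043 6174 0a"

body = bytes.fromhex(hex_dump)

# The byte count should match the Content-Length header in the response.
print(len(body))

# The bytes decode cleanly as UTF-8; c3 b6 is the two-byte sequence for "ö".
text = body.decode("utf-8")
print(repr(text))
```

The file is therefore valid UTF-8, and its 19 bytes match the Content-Length that Apache reported.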

> Apache has a default encoding.

As you can see from the headers, this does not appear to be true. I can
understand that perhaps it does in other circumstances that I haven't
been able to test. If this is true, please can you provide steps to
reproduce?

> It's easy to fix by AddDefaultCharset to the configuration. However,
> it would be great if Apache worked with non-English languages out of the
> box, especially when the locale is set so.

I appreciate that there is a case to perhaps provide a default
AddDefaultCharset that matches the system locale, but unfortunately it's
not simple since the system locale may not match the encoding of the
files you expect to serve from /var/www. This is a tricky issue, and one
I think would be better addressed in Debian or upstream than for Ubuntu
to diverge from Debian and upstream on this.



[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2013-12-09 Thread Lars Noodén
If wget is not seeing the wrong encoding then it may be a problem with
Firefox instead.

However, the steps to reproduce are

1. install Ubuntu 12.04 desktop, or Lubuntu 14.04devel desktop (it occurs on both)
2. install Apache2, leaving default configuration settings
3. load an html page from the server in Firefox (in 12.04 or 14.04devel)
4. check page info regarding Encoding with ctrl-i



[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2013-12-09 Thread Robie Basak
Sorry, your test case involving Firefox isn't sufficient to determine
validity of a bug in Apache. What is Apache actually sending to Firefox
in your case?



[Bug 1258546] Re: Apache2 defaults to the wrong character set, it should be UTF-8

2013-12-09 Thread Lars Noodén
It looks like the problem is Firefox then.  If no default is set, then
Apache sends wget 'Content-Type: text/html'.  If the default is set to
utf-8, then Apache sends wget 'Content-Type: text/html; charset=utf-8'
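
The difference between the two header values can be made explicit with a small sketch using Python's standard email package to parse the Content-Type value. The helper name charset_of is hypothetical; the two header strings are the ones quoted above.

```python
from email.message import Message


def charset_of(content_type):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# Without AddDefaultCharset: no charset is declared, so the client must guess.
print(charset_of("text/html"))                 # None

# With AddDefaultCharset utf-8: the server states the encoding explicitly.
print(charset_of("text/html; charset=utf-8"))  # utf-8
```

A result of None means the server declared nothing, which is consistent with the encoding shown in Firefox being the browser's own guess.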
