Hello.
Paul Hampson:
The email address isn't important, since
that has to be a subset of ASCII anyway.
Are the Unicode-encoded domain names
supported in (modern) browsers only?
I can surf to http://.pl/ (with, e.g., Firefox) - can I send mail to
[EMAIL PROTECTED], or should I always use the
On Dec 11, Shot (Piotr Szotkowski) [EMAIL PROTECTED] wrote:
I can surf to http://?.pl/ (with, e.g., Firefox) - can I send mail to
[EMAIL PROTECTED], or should I always use the [EMAIL PROTECTED] equivalent, as
the Unicode in domain names is restricted to WWW only?
It depends on your MUA. With
On Sat, 11 Dec 2004 16:08:12 +0100, Shot (Piotr Szotkowski) wrote:
Hello.
Paul Hampson:
The email address isn't important, since
that has to be a subset of ASCII anyway.
Are the Unicode-encoded domain names
supported in (modern) browsers only?
I can surf to http://.pl/ (with,
On Sat, Dec 11, 2004 at 04:08:12PM +0100, Shot (Piotr Szotkowski) wrote:
Hello.
Paul Hampson:
The email address isn't important, since
that has to be a subset of ASCII anyway.
Are the Unicode-encoded domain names
supported in (modern) browsers only?
I can surf to http://.pl/ (with,
On Tue, Dec 07, 2004 at 05:56:54PM +, Thaddeus H. Black wrote:
But yes, non-ASCII Latin-1 chars should not be given
special status over the national chars found in other
languages spoken by project members. Debian should be
using either ASCII, or Unicode; standardizing on
Latin-1
It is one thing spiritedly to argue a point against
friends and allies. It is another to be obstinate. I
do not wish the latter, and I admit that I am both
outnumbered and outreasoned today. Please permit me
without malice to conform my position, which now might
be stated as follows.
Unicode
[Roger Leigh]
I've been using Debian with UTF-8 only locales for over 12 months
now. I now consider it fine for general use, with respect to
terminal and application support. Unlike a couple of years ago, most
things work perfectly.
Some apps like 'screen' do not just configure themselves
On Tuesday 07 December 2004 00.19, Roger Leigh wrote:
I think going to UTF-8 as the default locale charmap for all locales
is a feasible goal for etch, as is recoding everything to UTF-8 (where
it makes sense).
Yep.
My biggest problem right now is 'lpr sometextfile' to a postscript printer
* Roger Leigh ([EMAIL PROTECTED]) [041207 00:40]:
I think going to UTF-8 as the default locale charmap for all locales
is a feasible goal for etch, as is recoding everything to UTF-8 (where
it makes sense).
"feasible goal" and "etch" are the magic words, I think: I agree on that,
but I don't want
I look at the screen, and there is Roger Leigh writing to me:
- No UTF-8 console keymaps
- Some broken libraries e.g. GTK+ 1.2 [obsolete]
- I can't paste UTF-8 into emacs (perhaps a problem in my .emacs)
- mc making mess with its frames
Maciek
--
M.Sc. Maciej Dems [EMAIL PROTECTED]
07.12.2004 13:33 +0100 Maciej Dems (-):
I look at the screen, and there is Roger Leigh writing to me:
- No UTF-8 console keymaps
- Some broken libraries e.g. GTK+ 1.2 [obsolete]
- I can't paste UTF-8 into emacs (perhaps a problem in my .emacs)
- mc making mess with its frames
Add dselect and
On Tuesday 07 December 2004 12:44 am, Peter Samuelson wrote:
Defining the character set as utf-8 means that any non-unicode
capable application is going to have issues, yes.
Postulate an app that is ignorant of character sets - we'll call it
aptitude. Fixing it to make it accept utf-8 and
On Tuesday 07 December 2004 10:17 am, Daniel Burrows wrote:
complex replacement string class
Admittedly, complex might (hypothetically) be a bit of an exaggeration.
:P
Daniel
--
/--- Daniel Burrows [EMAIL PROTECTED] --\
| You are in a maze of
On Tue, Dec 07, 2004 at 10:17:17AM -0500, Daniel Burrows wrote:
On Tuesday 07 December 2004 12:44 am, Peter Samuelson wrote:
And if the app already deals with charset conversions but assumes
iso-8859-1 input, then it's trivial to fix it to assume utf-8 input.
This is not true.
Daniel Burrows [EMAIL PROTECTED] wrote:
iso-8859-1 is an 8-bit charset, while Unicode is a 32-bit [0] charset.
Storing and manipulating iso-8859-1 strings requires no changes to internal
datatypes (only conversions for input and output); storing and manipulating
Unicode means
On Tuesday 07 December 2004 10:40 am, Richard Atterer wrote:
No, you do not have to do this. You can keep working with char, the
changes when switching to UTF-8 will mostly have to deal with the fact that
one Unicode character is represented by more than one char. This means that
you need to
Steve Langasek writes,
... most of the letters you listed here are specific
to the IPA, which would have no use at all in a
control file as they're not part of the writing system
of any natural language.
Ok.
Encodings and charsets are distinct concepts. Just
because the file is specified
On Dec 07, Thaddeus H. Black [EMAIL PROTECTED] wrote:
UTF-8 is neat, but I do not really like Unicode (you may
Actually you do not even understand it, because this sentence is
meaningless.
--
ciao, |
Marco | [9639 coubl1Ib61SmA]
[Thaddeus H. Black]
UTF-8 is neat, but I do not really like Unicode (you may
[Marco d'Itri]
Actually you do not even understand it, because this sentence is
meaningless.
Perhaps he is aware of the difference between Unicode and ISO-10646?
UTF-8 is an encoding of ISO-10646.
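The distinction drawn here between the character set and its encodings can be illustrated with a minimal Python sketch. The helper name `code_point_and_encodings` is hypothetical, and the sample character is the "stricken Polish L" mentioned elsewhere in the thread; the point is only that one code point has several byte serializations, UTF-8 being one of them.

```python
def code_point_and_encodings(ch):
    """Return the code point of one character and a few of its encodings."""
    return {
        # The code point is an abstract number assigned by the character set.
        "code_point": f"U+{ord(ch):04X}",
        # Each encoding serializes that same number as different bytes.
        "utf-8": ch.encode("utf-8"),
        "utf-16-be": ch.encode("utf-16-be"),
        "utf-32-be": ch.encode("utf-32-be"),
    }


# LATIN SMALL LETTER L WITH STROKE
info = code_point_and_encodings("\u0142")
for name, value in info.items():
    print(name, value)
```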
Thaddeus H. Black wrote:
However, the typical roster of skills one masters in contributing
broadly to Debian development is already awesome: C, C++, CPP, Make,
Perl, Python, Autoconf, CVS, Shell, Glibc, System calls, /proc, IPC,
sockets, Sed, Awk, Vi, Emacs, locales, Libdb, GnuPG, Readline,
On Sunday 05 December 2004 20.11, Goswin von Brederlow wrote:
Any parser that accepts 8-bit non-ASCII chars
will accept UTF-8 then. What remains is just making the UTF-8 chars
visually correct then.
And make sure that, where character strings are modified, the multibyte
sequences are counted
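The concern about counting multibyte sequences can be sketched in Python as follows. This is only an illustration under the assumption that the input is valid UTF-8; the function names `byte_and_char_lengths` and `truncate_utf8` are hypothetical and not taken from any tool discussed in the thread.

```python
def byte_and_char_lengths(text):
    """Return (number of UTF-8 bytes, number of characters) for text."""
    return len(text.encode("utf-8")), len(text)


def truncate_utf8(data, max_bytes):
    """Cut valid UTF-8 data to at most max_bytes without splitting a sequence."""
    if len(data) <= max_bytes:
        return data
    end = max_bytes
    # Continuation bytes look like 10xxxxxx. If the cut would land on one,
    # back up until the cut falls on the start of a character.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


word = "d\u00e9cembre"  # eight characters, nine bytes in UTF-8
print(byte_and_char_lengths(word))
print(truncate_utf8(word.encode("utf-8"), 2))
```

A byte-oriented program that truncates at a fixed byte offset would otherwise leave half of the two-byte sequence behind, which is the kind of breakage the message warns about.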
Daniel Burrows [EMAIL PROTECTED] writes:
On Sunday 05 December 2004 03:32 pm, Jose Carlos Garcia Sogo wrote:
Would Peter permit me a mild dissent? I prefer Latin-1. Reason: I can
recognize and distinguish Latin-1 characters, even when I do not always
understand the words they spell.
I would not disagree with Peter or Daniel. They are
right in my view. However, consider the following
Unicode characters:
025A LATIN SMALL LETTER SCHWA WITH HOOK
025E LATIN SMALL LETTER CLOSED REVERSED OPEN E
0261 LATIN SMALL LETTER SCRIPT G
0264 LATIN SMALL LETTER RAMS HORN
0267
Thaddeus H. Black wrote:
025A LATIN SMALL LETTER SCHWA WITH HOOK
025E LATIN SMALL LETTER CLOSED REVERSED OPEN E
0261 LATIN SMALL LETTER SCRIPT G
0264 LATIN SMALL LETTER RAMS HORN
0267 LATIN SMALL LETTER HENG WITH HOOK
027A LATIN SMALL LETTER TURNED R WITH LONG LEG
027F LATIN SMALL LETTER
Thaddeus H. Black [EMAIL PROTECTED] wrote:
We are not speaking of a stricken Polish L, a
double-accented Magyar O, or a euro sign. We are
speaking of... well, to tell the truth I have no idea
what these letters are. Have you? More to the point,
should you and I learn to recognize such
Andreas Barth [EMAIL PROTECTED] writes:
Though I agree on your last statement (and please, remember, I'm from
Germany where non-ASCII-characters are also in common use), I still
consider that UTF-8-not-ASCII has not finally reached ok, but it's on
On Mon, Dec 06, 2004 at 06:58:10PM +, Thaddeus H. Black wrote:
I would not disagree with Peter or Daniel. They are
right in my view. However, consider the following
Unicode characters:
025A LATIN SMALL LETTER SCHWA WITH HOOK
025E LATIN SMALL LETTER CLOSED REVERSED OPEN E
0261
On Mon, Dec 06, 2004 at 06:53:42PM -0800, Steve Langasek [EMAIL PROTECTED]
wrote:
But yes, non-ASCII Latin-1 chars should not be given special status over
the national chars found in other languages spoken by project members.
Debian should be using either ASCII, or Unicode; standardizing on
On Tue, Dec 07, 2004 at 12:04:56PM +0900, Mike Hommey wrote:
On Mon, Dec 06, 2004 at 06:53:42PM -0800, Steve Langasek [EMAIL PROTECTED]
wrote:
But yes, non-ASCII Latin-1 chars should not be given special status over
the national chars found in other languages spoken by project members.
On Mon, Dec 06, 2004 at 07:10:21PM -0800, Steve Langasek [EMAIL PROTECTED]
wrote:
On Tue, Dec 07, 2004 at 12:04:56PM +0900, Mike Hommey wrote:
On Mon, Dec 06, 2004 at 06:53:42PM -0800, Steve Langasek [EMAIL
PROTECTED] wrote:
But yes, non-ASCII Latin-1 chars should not be given special
[Matthew Garrett]
Defining the character set as utf-8 means that any non-unicode
capable application is going to have issues, yes.
Postulate an app that is ignorant of character sets - we'll call it
aptitude. Fixing it to make it accept utf-8 and spit out the correct
encoding for its LC_CTYPE
We seem to be moving to a de facto standard of UTF-8 for non-ASCII
characters in debian/control files. This is not specified in Policy
[1], but for hopefully obvious reasons, consistency is a Good Thing,
and UTF-8 seems to be the best solution for this sort of thing.
In my sid control files, I
[Peter Samuelson]
I suggest that the affected source packages[3] be run through the
command 'iconv -f ORIGINAL_CHARSET -t utf-8' as soon as convenient.
Ehhh, I see I have already ruined my credibility by pasting the wrong
source package list. The real list is much shorter.
Apologies,
Peter
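For readers without iconv at hand, the suggested recoding step can be approximated in Python. This is a rough sketch, not the procedure the thread settled on: the helper names are hypothetical, and the original charset still has to be known by the person doing the conversion, exactly as with the ORIGINAL_CHARSET placeholder above.

```python
def is_valid_utf8(raw):
    """Report whether raw bytes already decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def recode_to_utf8(raw, original_charset):
    """Decode raw bytes from original_charset and re-encode them as UTF-8."""
    return raw.decode(original_charset).encode("utf-8")


# A Latin-1 encoded name, as it might appear in a Maintainer field.
latin1_field = b"Jos\xe9"
if not is_valid_utf8(latin1_field):
    print(recode_to_utf8(latin1_field, "iso-8859-1"))
```

Checking for valid UTF-8 first avoids converting a file twice, which would turn each non-ASCII character into two garbage characters.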
[Peter Samuelson]
We seem to be moving to a de facto standard of UTF-8 for non-ASCII
characters in debian/control files. This is not specified in Policy
[1], but for hopefully obvious reasons, consistency is a Good Thing,
and UTF-8 seems to be the best solution for this sort of thing.
Some
* Petter Reinholdtsen ([EMAIL PROTECTED]) [041205 11:30]:
[Peter Samuelson]
We seem to be moving to a de facto standard of UTF-8 for non-ASCII
characters in debian/control files. This is not specified in Policy
[1], but for hopefully obvious reasons, consistency is a Good Thing,
and
On Sunday 05 December 2004 at 11:43 +0100, Andreas Barth wrote:
I think most of us agree that non-UTF-8-characters are not a good idea
(please note that UTF-8 characters are a superset of ASCII). For some
places (like package names), I think most of us even agree that only
ASCII-characters
* Josselin Mouette ([EMAIL PROTECTED]) [041205 13:05]:
On Sunday 05 December 2004 at 11:43 +0100, Andreas Barth wrote:
I think most of us agree that non-UTF-8-characters are not a good idea
(please note that UTF-8 characters are a superset of ASCII). For some
places (like package names),
On Sun, Dec 05, 2004 at 01:01:16PM +0100, Josselin Mouette wrote:
Many of us have names that can't be written using ASCII.
Well, they usually can be transliterated, can't they?
Transliterating is somewhat of a kludge (and I think in most cases UTF-8 is a
much better solution); OTOH I'd rapidly
On Dec 05, Peter Samuelson [EMAIL PROTECTED] wrote:
Would people support a mass bug at minor severity?
Make it normal.
--
ciao, |
Marco | [9589 inOGrPyJFNKhM]
On Dec 05, Steinar H. Gunderson [EMAIL PROTECTED] wrote:
Transliterating is somewhat of a kludge (and I think in most cases UTF-8 is a
much better solution); OTOH I'd rapidly get confused in the list of Japanese
maintainers if their names weren't transliterated.
This is a different issue: in
[Steinar H. Gunderson]
Transliterating is somewhat of a kludge (and I think in most cases
UTF-8 is a much better solution); OTOH I'd rapidly get confused in
the list of Japanese maintainers if their names weren't
transliterated.
I think it's a valid choice for a maintainer who natively
[Marco d'Itri]
Would people support a mass bug at minor severity?
Make it normal.
Given that Policy recommends debian/changelog to be utf-8, coupled with
the observation (which I had not thought of) that various tools may
require a maintainer's name in debian/control and debian/changelog to
[Peter Samuelson]
I suggest that the affected source packages[3] be run through the
command 'iconv -f ORIGINAL_CHARSET -t utf-8' as soon as convenient.
No, as you noticed this list is short and can be processed in a more
elegant manner, e.g. sympa description uses a no-break space where a
Josselin Mouette [EMAIL PROTECTED] writes:
On Sunday 05 December 2004 at 11:43 +0100, Andreas Barth wrote:
I think most of us agree that non-UTF-8-characters are not a good idea
(please note that UTF-8 characters are a superset of ASCII). For some
places (like package names), I think most
On Sun, Dec 05, 2004 at 06:40:52PM +0100, Goswin von Brederlow wrote:
On that note, how likely is it to hit a UTF-8 character encoding that
contains a '\n'? Any non UTF-8 aware parser would assume a new line
has started and get parse errors.
0% likely, guaranteed.
UTF-8 is *designed* to be
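The guarantee being described can be checked with a small Python sketch: every byte of a multibyte UTF-8 sequence has its high bit set, so none of them can equal the ASCII newline byte 0x0A. The helper name `bytes_of_multibyte_chars` and the sample field text are made up for illustration.

```python
def bytes_of_multibyte_chars(text):
    """Yield every byte that belongs to a multibyte UTF-8 sequence in text."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1:
            yield from encoded


# Hypothetical control-file line mixing Latin, Polish and Japanese characters.
sample = "Maintainer: Jos\u00e9 \u0141ukasz \u65e5\u672c\u8a9e\n"

# All bytes of multibyte sequences are 0x80 or above, so none is "\n" (0x0A).
print(all(b >= 0x80 for b in bytes_of_multibyte_chars(sample)))

# A byte-oriented parser therefore sees exactly the real line breaks.
print(sample.encode("utf-8").count(b"\n") == sample.count("\n"))
```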
Bart Schuller [EMAIL PROTECTED] writes:
On Sun, Dec 05, 2004 at 06:40:52PM +0100, Goswin von Brederlow wrote:
On that note, how likely is it to hit a UTF-8 character encoding that
contains a '\n'? Any non UTF-8 aware parser would assume a new line
has started and get parse errors.
0%
On Sun, Dec 05, 2004 at 06:40:52PM +0100, Goswin von Brederlow wrote:
On that note, how likely is it to hit a UTF-8 character encoding that
contains a '\n'? Any non UTF-8 aware parser would assume a new line
has started and get parse errors.
That's no problem. The only problem you have with
Peter Samuelson writes,
We seem to be moving to a de facto standard of UTF-8 for non-ASCII
characters in debian/control files. This is not specified in Policy
[1], but for hopefully obvious reasons, consistency is a Good Thing,
and UTF-8 seems to be the best solution for this sort of thing.
On Sun, 05-12-2004 at 20:16 +, Thaddeus H. Black wrote:
Peter Samuelson writes,
We seem to be moving to a de facto standard of UTF-8 for non-ASCII
characters in debian/control files. This is not specified in Policy
[1], but for hopefully obvious reasons, consistency is a Good
On Sunday 05 December 2004 03:32 pm, Jose Carlos Garcia Sogo wrote:
Would Peter permit me a mild dissent? I prefer Latin-1. Reason: I can
recognize and distinguish Latin-1 characters, even when I do not always
understand the words they spell. Recognizing and distinguishing the
On Sun, Dec 05, 2004 at 04:42:24PM -0500, Daniel Burrows wrote:
On Sunday 05 December 2004 03:32 pm, Jose Carlos Garcia Sogo wrote:
Would Peter permit me a mild dissent? I prefer Latin-1. Reason: I can
recognize and distinguish Latin-1 characters, even when I do not always
understand
On Mon, Dec 06, 2004 at 09:54:36AM +1100, Paul Hampson [EMAIL PROTECTED]
wrote:
Isn't there a proposal around for
Description#en: English text
Description#ja: Japanese text
And you'd advocate writing the English text in latin1 and the Japanese
text in euc-jp?
Let's make it clear: 1 text
On Monday 06 December 2004 at 09:26 +0900, Mike Hommey wrote:
On Mon, Dec 06, 2004 at 09:54:36AM +1100, Paul Hampson [EMAIL PROTECTED]
wrote:
Isn't there a proposal around for
Description#en: English text
Description#ja: Japanese text
And you'd advocate to write the English text in
[Thaddeus H. Black]
Would Peter permit me a mild dissent? I prefer Latin-1.
Dissents are fine. (:
The reason to go with UTF-8 is for consistency. Tools that wish to
render text onto the screen ought to be able to depend on knowing the
encoding that text is in. See below for why I (and many
Thaddeus H. Black wrote:
I do not deny that Latin-1 represents all the languages I can read, and
that this fact may color my view. Nevertheless to me a source written
in Chinese is effectively non-free. It might as well be a compiled
binary blob.
So Emacs is effectively non-free, because I
On Sun, Dec 05, 2004 at 09:32:00PM +0100, Jose Carlos Garcia Sogo wrote:
But the only field in UTF8 should be Maintainer, and that field should
have (IMHO) also a roman transliteration of the name, if you don't use a
latin charset (Greek, Arabic, Japanese, Chinese...)
The transliterated field
On Mon, Dec 06, 2004 at 09:26:57AM +0900, Mike Hommey wrote:
On Mon, Dec 06, 2004 at 09:54:36AM +1100, Paul Hampson [EMAIL PROTECTED]
wrote:
Isn't there a proposal around for
Description#en: English text
Description#ja: Japanese text
And you'd advocate to write the English text in
On Mon, Dec 06, 2004 at 01:40:27AM +, Andrew Suffield wrote:
On Sun, Dec 05, 2004 at 09:32:00PM +0100, Jose Carlos Garcia Sogo wrote:
But the only field in UTF8 should be Maintainer, and that field should
have (IMHO) also a roman transliteration of the name, if you don't use a
latin