Re: Endianness of data files in MultiArch (was: Please test gzip -9n - related to dpkg with multiarch support)

2012-02-13 Thread Aron Xu
On Sun, Feb 12, 2012 at 08:00, Carsten Hey cars...@debian.org wrote:
 * Aron Xu [2012-02-09 01:22 +0800]:
 Some packages ship data files whose endianness matters, and many of
 them are large enough to be worth splitting into a separate arch:all
 package if endianness were not a concern. ...

 Debian Policy, beginning of section 5.6.8:
 | Depending on context and the control file used, the Architecture field
 | can include the following sets of values:
 |  * A unique single word identifying a Debian machine architecture as
 |    described in Architecture specification strings, Section 11.1.
 |  * An architecture wildcard identifying a set of Debian machine
 |    architectures, see Architecture wildcards, Section 11.1.1. any
 |    matches all Debian machine architectures and is the most frequently
 |    used.
 |  * all, which indicates an architecture-independent package.
 |  * source, which indicates a source package.

 Possible addition to solve your problem:
   * littleendian[1], which indicates a package that is installable on
     all little endian architectures.
   * bigendian[1], which indicates a package that is installable on
     all big endian architectures.


I agree this would help a lot, and the names could be shortened to le
and be. There would still be file collisions if a maintainer does not
install the files in different paths, but that is another story.

debian-policy people, would you like to take up this idea? What are the
steps to make this (possibly) happen?


-- 
Regards,
Aron Xu


--
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: 
http://lists.debian.org/CAMr=8w6zX=2_u9qgswi-sno+axxskr4fpk4au1n7cbxtkjd...@mail.gmail.com



Re: Endianness of data files in MultiArch (was: Please test gzip -9n - related to dpkg with multiarch support)

2012-02-11 Thread Carsten Hey
* Aron Xu [2012-02-09 01:22 +0800]:
 Some packages ship data files whose endianness matters, and many of
 them are large enough to be worth splitting into a separate arch:all
 package if endianness were not a concern. ...

Debian Policy, beginning of section 5.6.8:
| Depending on context and the control file used, the Architecture field
| can include the following sets of values:
|  * A unique single word identifying a Debian machine architecture as
|    described in Architecture specification strings, Section 11.1.
|  * An architecture wildcard identifying a set of Debian machine
|    architectures, see Architecture wildcards, Section 11.1.1. any
|    matches all Debian machine architectures and is the most frequently
|    used.
|  * all, which indicates an architecture-independent package.
|  * source, which indicates a source package.

Possible addition to solve your problem:
   * littleendian[1], which indicates a package that is installable on
     all little endian architectures.
   * bigendian[1], which indicates a package that is installable on
     all big endian architectures.

The following paragraph could then read (changes are marked in
a wdiff-like format):
| In the main debian/control file in the source package, this field may
| contain the special value all, the special architecture wildcard{+s+}
| any{+ or endian (which matches littleendian and bigendian)+}, or
| a list of specific and wildcard architectures separated by spaces. If
| all{+, endian+} or any appears, that value must be the entire contents
| of the field. Most packages will use either all or any.



 [1] The dash before "endian", which would make the name more readable,
 is omitted so that the resulting architecture wildcards (see Debian
 Policy, section 11.1.1) are more consistent with the existing ones.


Archive: http://lists.debian.org/2012021216.ga17...@furrball.stateful.de



Endianness of data files in MultiArch (was: Please test gzip -9n - related to dpkg with multiarch support)

2012-02-08 Thread Aron Xu
I want to raise the issue of endianness in data files. This is a
suggestion rather than a report of a flaw: I would like to explore
whether the current situation can be improved while Multi-Arch is being
implemented in Debian.

Some packages ship data files whose endianness matters, and many of
them are large enough to be worth splitting into a separate arch:all
package if endianness were not a concern. AFAIK some maintainers are
not aware of endianness issues in their packages and have simply
ignored them (I am not sure how many, but any such case that is
discovered should lead to an RC bug). It would be great to have some
mechanism for handling this kind of problem in Debian, to avoid forcing
such data into arch:any packages.


-- 
Regards,
Aron Xu


Archive: 
http://lists.debian.org/CAMr=8w494xg1bwj3lr5rqnjrgrcung-e6igqb+xt6bdygpr...@mail.gmail.com



Re: Endianness of data files in MultiArch (was: Please test gzip -9n - related to dpkg with multiarch support)

2012-02-08 Thread Simon McVittie
On 08/02/12 17:22, Aron Xu wrote:
 Some packages ship data files whose endianness matters, and many of
 them are large enough to be worth splitting into a separate arch:all
 package if endianness were not a concern. AFAIK some maintainers are
 not aware of endianness issues in their packages and have simply
 ignored them (I am not sure how many, but any such case that is
 discovered should lead to an RC bug).

Hopefully Jakub Wilk's automatic checks for conflicting files
http://people.debian.org/~jwilk/multi-arch/ will already be picking
this up, in cases where the less-used-endianness architectures aren't
broken already.

If the less-used-endianness architectures are already broken, that's
also a bug (potentially an RC one), just like code that compiles but
doesn't work on a particular endianness due to other assumptions - and
if nobody has noticed it yet, presumably the package doesn't have any
users (or regression tests) on those architectures.

 It would be great to have some mechanism for handling this kind of
 problem in Debian, to avoid forcing such data into arch:any packages.

If the right endianness is critical: libfoo:i386 Depends:
libfoo-data-le, libfoo:powerpc Depends: libfoo-data-be, both data
packages arch:all, data files in /usr/share/foo/le and /usr/share/foo/be
respectively?

Or just make sure the data has an endianness marker, and enhance the
reading package to do the right byteswapping based on the endianness
marker - e.g. this has been discussed for gettext, which ended up just
writing out the same endianness on all platforms. Many formats
(particularly those that originated on Windows) are always
little-endian, and big-endian platforms reading them just take the minor
performance hit; formats that respect network byte order have the
opposite situation.
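
A minimal sketch of the endianness-marker approach, assuming a made-up
file layout: a 32-bit magic number, a 32-bit count, then that many
unsigned 32-bit values. MAGIC and read_table() are hypothetical names for
this example and do not describe gettext or any other real format.

```python
import struct

MAGIC = 0x1A2B3C4D  # hypothetical marker; its byte-swapped form differs


def read_table(blob):
    """Decode a table written in either byte order, using the marker."""
    if struct.unpack_from("<I", blob, 0)[0] == MAGIC:
        prefix = "<"  # file was written little-endian
    elif struct.unpack_from(">I", blob, 0)[0] == MAGIC:
        prefix = ">"  # file was written big-endian
    else:
        raise ValueError("magic number not recognised")
    (count,) = struct.unpack_from(prefix + "I", blob, 4)
    # struct swaps bytes as needed, regardless of the host's native order.
    return list(struct.unpack_from(prefix + "%dI" % count, blob, 8))
```

The reader pays for the swap only when the file's byte order differs from
the host's, which is the minor performance hit mentioned above.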

S


Archive: http://lists.debian.org/4f32b26f.8050...@debian.org



Re: Endianness of data files in MultiArch (was: Please test gzip -9n - related to dpkg with multiarch support)

2012-02-08 Thread Aron Xu
On Thu, Feb 9, 2012 at 01:35, Simon McVittie s...@debian.org wrote:
 On 08/02/12 17:22, Aron Xu wrote:
 Some packages ship data files whose endianness matters, and many of
 them are large enough to be worth splitting into a separate arch:all
 package if endianness were not a concern. AFAIK some maintainers are
 not aware of endianness issues in their packages and have simply
 ignored them (I am not sure how many, but any such case that is
 discovered should lead to an RC bug).

 Hopefully Jakub Wilk's automatic checks for conflicting files
 http://people.debian.org/~jwilk/multi-arch/ will already be picking
 this up, in cases where the less-used-endianness architectures aren't
 broken already.

 If the less-used-endianness architectures are already broken, that's
 also a bug (potentially an RC one), just like code that compiles but
 doesn't work on a particular endianness due to other assumptions - and
 if nobody has noticed it yet, presumably the package doesn't have any
 users (or regression tests) on those architectures.


Or some of them simply gave up because it is a less-used architecture.

 It would be great to have some mechanism for handling this kind of
 problem in Debian, to avoid forcing such data into arch:any packages.

 If the right endianness is critical: libfoo:i386 Depends:
 libfoo-data-le, libfoo:powerpc Depends: libfoo-data-be, both data
 packages arch:all, data files in /usr/share/foo/le and /usr/share/foo/be
 respectively?


This does not look very nice, because we would need to maintain a list
of architectures in debian/control, and the package is potentially
broken whenever a new architecture is added.

Also, arch:all packages are usually built by the uploading DD on a
single architecture, mostly amd64 or i386 today. How can the uploader
generate big-endian data files without access to such a machine? Adding
an option to the data generator/parser so that it can generate be/le
data on any architecture does not seem to be a reasonable approach.
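
For reference, this is roughly what such a generator option amounts to
for a trivial fixed-layout format. It is only a sketch: build_table() and
the layout (a 32-bit count followed by unsigned 32-bit values) are
invented for this example, and a real upstream generator may be far
harder to adapt.

```python
import struct


def build_table(values, byte_order):
    """Serialize a count plus unsigned 32-bit values in the given byte order."""
    if byte_order not in ("little", "big"):
        raise ValueError("byte_order must be 'little' or 'big'")
    prefix = "<" if byte_order == "little" else ">"
    # An explicit "<" or ">" prefix makes struct ignore the host's native order,
    # so both variants can be produced on the same build machine.
    return struct.pack(prefix + "I%dI" % len(values), len(values), *values)


le_blob = build_table([1, 258], "little")
be_blob = build_table([1, 258], "big")
```

Both blobs come out of a single run on one machine; whether this is
acceptable depends on how much of the upstream generator assumes native
byte order.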

 Or just make sure the data has an endianness marker, and enhance the
 reading package to do the right byteswapping based on the endianness
 marker - e.g. this has been discussed for gettext, which ended up just
 writing out the same endianness on all platforms. Many formats
 (particularly those that originated on Windows) are always
 little-endian, and big-endian platforms reading them just take the minor
 performance hit; formats that respect network byte order have the
 opposite situation.


This holds for widely used applications and formats, such as gettext or
image formats, that are designed to behave this way. On the other hand,
there are upstreams who do not want to accept such a change, mainly
because of the added complexity and the performance impact.

Currently I use arch:any for data files that are not affected by
multiarch, i.e. packages marked neither same nor foreign. For
endianness-critical data that a library needs in order to work, I have
to install it into /usr/lib/<triplet>/$package/data/ and mark the
package as Multi-Arch: same. This is sufficient to avoid breakage, but
again it consumes a lot of space on the mirrors.

I thought about something like /usr/share/$package/data/{be,le} in an
arch:all package, but that does not appear to be a reasonable solution
because we would need to modify the data generator/parser.
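
The parser-side change for such a layout would be small in the simplest
case, as in the sketch below. data_dir() is a hypothetical helper for this
example only; the directory names follow the {be,le} layout mentioned
above.

```python
import posixpath
import sys


def data_dir(package, byteorder=sys.byteorder, root="/usr/share"):
    """Return the per-endianness data directory for a package."""
    if byteorder not in ("little", "big"):
        raise ValueError("byteorder must be 'little' or 'big'")
    suffix = "le" if byteorder == "little" else "be"
    return posixpath.join(root, package, "data", suffix)
```

sys.byteorder reports the byte order of the running interpreter, so the
default picks the directory matching the host. Generating both sets of
files remains the harder part.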

-- 
Regards,
Aron Xu


Archive: 
http://lists.debian.org/CAMr=8w6s+itap8usgjaqf86mffypaop+qjodetjhdyumb7a...@mail.gmail.com