2011/4/5 Jeff Johnson <n3...@mac.com>:
>
> On Apr 5, 2011, at 3:49 PM, Per Øyvind Karlsen wrote:
>
>> 2011/4/5 Jeff Johnson <n3...@mac.com>:
>>> No way Jose!
>>>
>>> rpmbuild (and *.rpm metadata) can NOT have any encoding
>>> assumed.
>>>
>>> Encoding is for DISPLAY, not for octets.
>>>
>>> Put unicode into package metadata at your own peril.
>>>
>>> Meanwhile -- without a means to specify encoding in metadata --
>>> rpm in C has *ONLY* 8 bit clean octets and the usual conventions
>>> for NUL terminated strings.
>>>
>>> Until there's a well defined means of specifying encoding for all
>>> tag strings -- and that's a fundamental design change to *.rpm packaging
>>> that likely will NEVER happen -- the problem simply CANNOT be fixed to
>>> meet naive luser expectations, and all attempts to "fix" anything
>>> are just doomed.
>>>
>>> C has octets, not utf8, and rpmdb strings are _NOT_ based on LC_ALL
>>> and other i18n/l10n conventions.
>>>
>>> You can of course put whatever garbage you wish into strings that
>>> will be stored as keys in an rpmdb, subject to all the usual
>>> GIGO conventions distros wish to inflict upon their customers.
>> Okay, my mistake anyways, I was looking into an issue with unicode strings,
>> then I specified the wrong locale when testing. I notice now that with a
>> properly specified locale, it accepts unicode characters.
>>
>
> The test is way way feeble, but once the expectation starts, well
> there's nothing to do but solve the problem "correctly".
>
> What is broken -- by design -- is that *.spec recipes have multiple
> encodings, not a per-file encoding. And all hell starts to break
> loose in *.rpm packages when retrievals using keys pick up a per-key encoding.
>
> You tell me how to look up all possible encodings from a database without
> specifically tying an encoding to every possible tag.
>
>> Still though, using '%description -l', descriptions disappear.. :|
>>
>
> C permits octets, not encodings.
> All possible encodings fit into octets
> with NUL terminated strings. The only thing that saves %description
> (which doesn't belong in *.rpm packages, another design issue that I don't
> feel like arguing about because you simply will NOT like the answer
> of specifying possibly hundreds of properly encoded %description's in
> a single *.spec using the full-blown form of the 4-tuple used for
> encoding on a per-tag basis. Package metadata will simply explode
> for no known purpose.
>
> And RPM_I18NSTRING_TYPE has been on death row all of this century, and
> is carried along solely because PLD and a few other distros *still*
> insist on inserting translations into *.spec recipes directly.
>
> A data type that is sometimes an array, and sometimes a scalar dependent
> on the context of interpretation just isn't a useful data type.
>
> Nor is there any known/modern reason why all possible encodings MUST be
> carried in each and every package header in the year 2011. There's specspo
> and other means of %description et al distribution that are far far
> superior to RPM_I18NSTRING_TYPE pulled in from *.spec recipes. This was
> _NOT_ true back in 1998 when RPM_I18NSTRING_TYPE was devised.

Hm, okay, so something better obviously needs to be done.
For what currently exists, though, is it supposed to be broken or...?

--
Regards,
Per Øyvind
______________________________________________________________________
RPM Package Manager http://rpm5.org
Developer Communication List rpm-devel@rpm5.org
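
A minimal sketch of the "encoding is for DISPLAY, not for octets" point made in the thread. It is written in Python rather than rpm's C, and it does not touch any rpm API: the `display` helper and the sample string are illustrative assumptions only. It shows that one and the same octet string renders differently depending on which encoding the reader assumes, which is why a stored string without an attached encoding cannot be displayed reliably.

```python
# Illustrative only: `display` is a hypothetical helper, not part of rpm.

# Octets as they might sit in a header string: just bytes, no encoding tag.
raw = "Øyvind".encode("utf-8")


def display(octets: bytes, encoding: str) -> str:
    """Decode octets for display under an *assumed* encoding."""
    return octets.decode(encoding, errors="replace")


# The storage layer sees the same octets in both cases.
as_utf8 = display(raw, "utf-8")      # reader assumes UTF-8
as_latin1 = display(raw, "latin-1")  # reader assumes ISO-8859-1

print(len(raw), repr(raw))
print(repr(as_utf8))
print(repr(as_latin1))

# The octets survive a round trip untouched; only the rendering differed.
assert as_latin1.encode("latin-1") == raw
```

The octets are 8-bit clean and contain no NUL byte, so they fit the C string conventions described above; what is missing is any record of which of the two renderings was intended.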