Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-03 Thread Rob Landley

On 03/01/2013 11:33:46 AM, Antonio Diaz Diaz wrote:

Dear Denys.
The mistake here would be to reject lzip...


You deny the busybox maintainer's reality, and substitute your own!


The current situation looks pretty simple:
lzip and xz are roughly the same feature-wise,


The only feature for which lzip and xz are roughly the same is
compression speed/size. Sadly it seems the only feature ever
tested/cared for by most users.


Gee, I wonder why?

Stop and think, what is any compression code in busybox _for_? The only  
reason to have xz in busybox at all is because there are a lot of  
existing tar.xz files out there. It is an existing, deployed file  
format which busybox wants to be compatible with.


You're saying that you've got a new super compression format called arj  
or zoo or stuffit or binhex or whatever it is, and you'd very much like  
to shoehorn it into busybox in hopes of getting it wider adoption.


Denys said no. You're getting huffy about it. I await the flounce.


Therefore, in their real-world use, Busybox users will need
to unpack *xz* files. Such as kernel tarballs from kernel.org,
distribution .rpms with internally-xz'ed cpio archives,
and many other things.


This sees users as consumers. What about the users who want to create
their own compressed files?


They might want to do so in a format that people they send it to would  
previously have heard of. Given how bad an ambassador you are for your  
preferred choice, I'm guessing lzma ain't ever gonna be it.


Not counting that any Busybox user wanting to check the integrity of
files will avoid xz files anyway. Kernel tarballs are also distributed
in bzip2 format.


Great, so we've got this compression thing covered. So we don't need  
your new format, ever, for any reason, at all. Good to know.



You still have a way in, though. You have prepared _compression_
support too. That is something xz embedded doesn't provide.
Anyone who wants to _create_ a .xz file using bbox is potentially
your client.


I think there is a misunderstanding here. I am not seeking clients.
I am trying to be the change I wish to see in the world.


No, you're trying to make busybox be the change you see in the world,  
by leveraging the installed base of an established project to promote  
your agenda, and doing so _OVER_ the maintainer's objections.


If the change you wish to make in the world is annoying people, you're  
doing great.


Hijacking a mailing list thread about a bug to promote an alternate  
_incompatible_ implementation is not even potentially the same as  
addressing the bug. It's not "look, this other code has a bug, I win!"
That's not how it works. I've been working to replace busybox with  
toybox for years and I still occasionally submit bug reports (and  
fixes!) here.


Rob
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-03 Thread Rob Landley

On 03/01/2013 02:41:39 PM, Matias A. Fonzo wrote:

  There are people who like to have a full compressor/decompressor
  in Busybox, performing better than gzip/bzip2.

 xz compressor then.

Precisely: doesn't adding the compressor imply adding more code?
More code than expected, I guess...


Adding more code is the price you pay for adding more functionality.  
Busybox can't have _no_ code and do anything useful. So the question is  
whether you get functionality worth the size and complexity penalty of
the code you choose to include.


If you've got support for one end of an existing format, there's an  
argument for supporting the other end because that code has to exist  
somewhere for your code to be useful, and busybox tends to avoid  
external dependencies where possible. This is similar to "if we're
going to have patch, it should handle applying hunks at offsets,
because otherwise people have to rip it out and replace it with a real
patch to do anything useful." And if we have patch, we should have diff
that can generate those... Arguing that our find should do -xdev is  
this class of argument: we already have code in this space and this is  
part of the feature set that people actually use with that code, and  
it's not too much code or complexity to be worth adding.


That argument isn't there for adding support for a _new_ format. When  
you open a new can of worms you can make arguments based on use cases  
or user bases, but Denys's argument was about where you draw the line  
based on what busybox has already got.


Rob


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-03 Thread Harald Becker
Hi Rob !

Hijacking a mailing list thread about a bug to promote an alternate
_incompatible_ implementation is not even potentially the same as
addressing the bug. It's not "look, this other code has a bug, I win!"
That's not how it works. I've been working to replace busybox with
toybox for years and I still occasionally submit bug reports (and
fixes!) here.

Now you are going too far! Nobody hijacked the thread, and if someone
did, it was my mistake. Because of the running thread about a
decompressor bug, I hooked in and tried to give a push to an older
announcement from Antonio about lzip. That started this horrible
discussion, as far as I know.

I do not want to go further here. Denys said no. I would still like to
(quickly) have a better compressor statically linked into the Busybox
binary, at least until an xz compressor for Busybox is available. I
accept Denys's decision, but would like to ask Antonio to keep providing
patches that add the lzip compressor/decompressor to Busybox. That way
those people who want it can add lzip to their Busybox. The only thing I
want to request from Denys is a link from the Busybox web site to
Antonio's site where he provides the patches (tiny utilities section?),
to have a known starting location in case someone loses the link to
Antonio's site.

--
Harald


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Antonio Diaz Diaz

Hello Roy.

Roy wrote:
I wonder, why not add .lzma support to lunzip (like pdlzip does)
so that decompress_unlzma can be deprecated?


If .lzma support is going to be added to some other applet, I guess it
should be to unxz, because it is the full xz that supports .lzma,
not the full lzip.


Pdlzip is a hack whose main purpose is providing a public domain 
implementation of lzip to those who can't distribute GPL software. It 
also decompresses .lzma files just as a side effect of using the LZMA 
SDK from Igor Pavlov.



Regards,
Antonio.


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Denys Vlasenko
Hi Antonio,

On Thu, Feb 28, 2013 at 11:20 PM, Antonio Diaz Diaz
ant_d...@teleline.es wrote:
 text+data  text+rodata  rwdata  bss  filename
      2928         2928       0    0  archival/libarchive/decompress_lunzip.o


 So basically, lzip lost the race wrt adoption. xz is used more widely.
 Kernel tarballs are .xz, not .lz.

 It depends on where you get your kernels from:

 http://linux-libre.fsfla.org/pub/linux-libre/releases/3.8-gnu/

https://www.kernel.org/pub/linux/kernel/v3.x/

Please understand my position.

It's not about preferring xz over lzip, or other way around.

Maintaining three copies of LZMA (de)compressors with
virtually identical performance would be a mistake.

The current situation looks pretty simple:
lzip and xz are roughly the same feature-wise,
but xz (fairly or not) managed to get much more widely adopted
in current Linux distributions.

Therefore, in their real-world use, Busybox users will need
to unpack *xz* files. Such as kernel tarballs from kernel.org,
distribution .rpms with internally-xz'ed cpio archives,
and many other things.

Therefore I don't see sufficient reason to add .lzip
decompression support to bbox.

You still have a way in, though. You have prepared _compression_
support too. That is something xz embedded doesn't provide.
Anyone who wants to _create_ a .xz file using bbox is potentially
your client.

Unfortunately, there won't be many people interested
in creating .lzip files. If you can change your
code so that it produces valid .xz files (even if they are
stupid in a sense that they are merely LZMA chunks w/o
LZMA2 improvements), then I will take it.

-- 
vda


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Denys Vlasenko
On Fri, Mar 1, 2013 at 4:18 PM, Harald Becker ra...@gmx.de wrote:
 On 01-03-2013 15:51 Denys Vlasenko vda.li...@googlemail.com wrote:
Please understand my position.

Maintaining three copies of LZMA (de)compressors with
virtually identical performance would be a mistake.

 You are still right, but there is one big difference: lzip has a
 compressor in Busybox not only a decompressor. lzma and xz are only
 decompressors and it is handy to have same available in cases where you
 hit one of those files and you do not have access to the full package.

What percentage of bbox users would want to produce .lzip files?
It isn't a widely used format.

bbox didn't even have a bzip2 compressor for a long time.

 Beside this I prefer lzip due to its full implementation in Busybox ...
 or are you going to add an xz compressor in Busybox?

Yes, adding xz compressor is a good idea.

 And in addition, if you do not like to have all those decompressors in
 your Busybox binary, you can disable your dislikes in the config.

This does not remove the need to maintain the code.
More code = more bugs.
Rarely used code = bugs stay unfixed for a longer time.

 There are people who like to have a full compressor/decompressor
 in Busybox, performing better than gzip/bzip2.

xz compressor then.

-- 
vda


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Harald Becker
Hi Denys !

On 01-03-2013 18:03 Denys Vlasenko vda.li...@googlemail.com wrote:

Yes, adding xz compressor is a good idea.

xz compressor then.

Fine! When is it available? Is anyone actively working on it? lzip is
there and works. SCNR

--
Harald


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Antonio Diaz Diaz

Dear Denys.

Denys Vlasenko wrote:

Maintaining three copies of LZMA (de)compressors with
virtually identical performance would be a mistake.


This can be easily solved by removing the deprecated one, as others are 
doing. The mistake here would be to reject lzip just to keep the 
deprecated lzma-alone.




The current situation looks pretty simple:
lzip and xz are roughly the same feature-wise,


The only feature for which lzip and xz are roughly the same is 
compression speed/size. Sadly it seems the only feature ever 
tested/cared for by most users.




Therefore, in their real-world use, Busybox users will need
to unpack *xz* files. Such as kernel tarballs from kernel.org,
distribution .rpms with internally-xz'ed cpio archives,
and many other things.


This sees users as consumers. What about the users who want to create 
their own compressed files?


Not counting that any Busybox user wanting to check the integrity of 
files will avoid xz files anyway. Kernel tarballs are also distributed 
in bzip2 format.




Therefore I don't see sufficient reason to add .lzip
decompression support to bbox.


All right. If you ever change your mind, just ask me for an updated 
patch. :-)




You still have a way in, though. You have prepared _compression_
support too. That is something xz embedded doesn't provide.
Anyone who wants to _create_ a .xz file using bbox is potentially
your client.


I think there is a misunderstanding here. I am not seeking clients. I
am trying to be the change I wish to see in the world. I would rather
see my work rejected than cause harm to humankind by working on a
project I consider a mistake.


The lzip applet already produces full lzip files, not dumbed-down files
like the ones a hypothetical xz applet could produce.



Regards,
Antonio.



Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Denys Vlasenko
On Fri, Mar 1, 2013 at 6:33 PM, Antonio Diaz Diaz ant_d...@teleline.es wrote:
 You still have a way in, though. You have prepared _compression_
 support too. That is something xz embedded doesn't provide.
 Anyone who wants to _create_ a .xz file using bbox is potentially
 your client.


 I think there is a misunderstanding here. I am not seeking clients. I am
 trying to be the change I wish to see in the world.

(1) Why do you want the world to stop using .xz and start using .lzip?
Apart from "xz has a fatal flaw - it is not designed/written by me".

(2) What are the chances of this happening?

 I would rather see my work rejected than cause harm
 to humankind by working on a project I consider a mistake.

A possibly suboptimal choice of the prevalent LZMA compressor
is way down on the list of dangers to humankind.

-- 
vda


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Laurent Bercot
 Why talk about rejected work, or mistakes ? Alternatives are a good thing.
 Also, as useful and widespread as Busybox is, it doesn't have to be the
be-all, end-all of embedded software; it doesn't have to package
*everything* a user might need.

 As I see it, Busybox exists to provide low-resource-consuming (be it
disk space or RAM) implementations of existing utilities - especially
GNU utilities, that are traditionally feature-oriented instead of
embedded-friendly.
 But if some software is already small and easy to use in restricted
environments, why would Busybox have to integrate it ? If the original
lzip utility doesn't require a nuclear plant to run, then making a
Busybox version seems redundant - embedded users who want to use lzip
can simply install the original !

 Getting *everything* into Busybox - one binary to rule them all -
smells a bit too much like systemd. Do we want to go there ?

 Instead, Denys, or whoever maintains the busybox.net website, there is a
tinyutils.html page that is way out of date, and that seems precisely
made to list utilities that might benefit embedded users, *additionally*
to Busybox. I would very much like my own execline (in the scripting
language section) and s6, and even s6-linux-utils (mainly for the s6-devd
netlink utility), to appear there. If the original lzip qualifies, it
could certainly be listed there too, as well as other utilities I'm
not thinking of atm. Less work for busybox, same benefits for the
community.

 If I can help concretely, I will be happy to.

-- 
 Laurent


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Antonio Diaz Diaz

Denys Vlasenko wrote:

I think there is a misunderstanding here. I am not seeking clients. I am
trying to be the change I wish to see in the world.


(1) Why do you want the world to stop using .xz and start using .lzip?
Apart from "xz has a fatal flaw - it is not designed/written by me".


This is the most gratuitous insult I have ever received. Even more so 
given that the motivation of this thread was a real flaw in xz.


BTW, it was Gandhi who said, "You must be the change you wish to
see in the world."




(2) What are the chances of this happening?


Unless a better algorithm is discovered, 100%. You can not fool all of 
the people all of the time.




A possibly suboptimal choice of the prevalent LZMA compressor
is way down on the list of dangers to humankind.


Certainly, but it was also Gandhi who said, "Whatever you do will be
insignificant, but it is very important that you do it."



Regards,
Antonio.



Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Denys Vlasenko
On Fri, Mar 1, 2013 at 9:09 PM, Antonio Diaz Diaz ant_d...@teleline.es wrote:
 Denys Vlasenko wrote:

 I think there is a misunderstanding here. I am not seeking clients. I
 am trying to be the change I wish to see in the world.

 (1) Why do you want the world to stop using .xz and start using .lzip?
 Apart from "xz has a fatal flaw - it is not designed/written by me".

 This is the most gratuitous insult I have ever received.

In fact, many coders (including me) are susceptible to this effect:
they like their code. I guess it's human nature.

 Even more so given that the motivation of this thread was a real flaw in xz.

I do not see any flaws in xz. I just reread its specification
and it doesn't sound bad (although I'd do a few things differently).

-- 
vda


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Matias A. Fonzo
On Fri, 1 Mar 2013 18:03:55 +0100
Denys Vlasenko vda.li...@googlemail.com wrote:
 On Fri, Mar 1, 2013 at 4:18 PM, Harald Becker ra...@gmx.de wrote:
  On 01-03-2013 15:51 Denys Vlasenko vda.li...@googlemail.com wrote:
 Please understand my position.
 
 Maintaining three copies of LZMA (de)compressors with
 virtually identical performance would be a mistake.
 
  You are still right, but there is one big difference: lzip has a
  compressor in Busybox not only a decompressor. lzma and xz are only
  decompressors and it is handy to have same available in cases where
  you hit one of those files and you do not have access to the full
  package.
 
 What percentage of bbox users would want to produce .lzip files?

How can we know that?

 It isn't a widely used format.

With this reasoning (nothing personal), what chance do the good
alternatives out there have?

(xz is not more popular (or widely used) than gzip or bzip2).
 
 bbox didn't have even bzip2 compressor for a long time.

Hmm... what about the memory usage?
 
  Beside this I prefer lzip due to its full implementation in
  Busybox ... or are you going to add an xz compressor in Busybox?
 
 Yes, adding xz compressor is a good idea.
 
  And in addition, if you do not like to have all those decompressors
  in your Busybox binary, you can disable your dislikes in the config.
 
 This does not remove the need to maintain the code.
 More code = more bugs.
 Rarely used code = bugs stay unfixed for a longer time.
 
  There are people who like to have a full compressor/decompressor
  in Busybox, performing better than gzip/bzip2.
 
 xz compressor then.
 

Precisely: doesn't adding the compressor imply adding more code?
More code than expected, I guess...



Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Matias A. Fonzo
On Fri, 1 Mar 2013 21:50:44 +0100
Denys Vlasenko vda.li...@googlemail.com wrote:
 On Fri, Mar 1, 2013 at 9:41 PM, Matias A. Fonzo s...@dragora.org
 wrote:
  What percentage of bbox users would want to produce .lzip files?
 
  How to know it?
 
 
 
  It isn't a widely used format.
 
  With this reasoning (nothing personal), what chance do the good
  alternatives out there have?
 
  (xz is not more popular (or widely used) than gzip or bzip2).
 
 LZMA-based compressors give a better, and slower, compression
 than bzip2. It is not unexpected that with faster processors,
 we reached the point when people can use it without excessive
 time penalty.
 
 Kernel is released in .xz tarballs (in addition to .bz2).
 Distributions are using xz-compressed .rpms.

I prefer to download tarballs in bzip2 format, (if there's no other
option between xz or bzip2). At least, bzip2 provides a recovery
tool. ;-)

By the way -- RPM has lzip support[1]:

[1] http://www.rpm.org/ticket/839
 
 These are cold hard facts. I don't invent them.
 Try googling for kernel tarballs in .lzip. Or any tarballs
 in .lzip for that matter. Sure, I found them... *eventually*.
 
 Busybox has no xz compression support, but it inevitably
 will be requested. (As it has happened with bzip2).
 And if by that time it has lzip, it will end up
 having *two* LZMA compressors, one widely used
 and another much less known. I don't think having
 that extra baggage would be useful.
 

Was this criterion applied to sysvinit vs. runit, too? :-)

One can choose.


Regards,
Matias


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-03-01 Thread Laurent Bercot
 Hi Harald,


 1) I like to create small self contained initramfs systems. Systems
 included in a single kernel image. So all you need to run the system is
 that single kernel image and a boot loader to start. In those initramfs
 systems i like to have only a minimum of statically linked binaries
 with a maximum of flexibility. At best there is only one binary
 containing all the utilities and the special application binaries.

 Oh, I do not deny there are good points in favour of having one
single binary; your use case is one of them. I personally build my
systems from bits and pieces, busybox being a tool among others, and
agree that the more pieces, the more work for me.
 You have to weigh the costs: my approach undeniably means more work for
the administrator, but is also more flexible. The day you need a utility
that is not included in Busybox, you will have to pack it by hand too,
so my point is that Busybox including stuff is *convenient* more than
*required*.


 2) Comparing a utility collection box like Busybox with a tool like
 systemd really smells bad. They are two completely different things.

 The comparison was obviously over-the-top and provocative, and I'm glad
it elicited a reaction from you - but I wanted to poke at Denys, who
shares my dislike of systemd for the same reasons ;)

 However, I think the underlying question about Busybox's policy needs
to be addressed. If Busybox starts including things that are already
small and embeddable to begin with (and I think it has already started
going down this path with runit), then it becomes a one-stop-shop, a
kind of Linux distribution, and like every distribution, sooner or later
it will have to include the whole world. I would much rather have it
stick to providing replacements for standard utilities that really need
rewriting, along with a collection of links to other small, high-quality
utilities - do one job and do it well, as the Unix philosophy says; be a
part of the community instead of trying to be the whole community, which
is exactly the same kind of hubris systemd (as well as most distributions,
really) is suffering from.

-- 
 Laurent


Re: XZ embedded bug unpacking linux-3.8.tar.xz (was: Re: tar: short read on linux-3.8.tar.xz)

2013-02-28 Thread Harald Becker
Hi Denys !

What I'm saying is that the bbox project would like to have (ideally)
_one_ LZMA decoder. Unpacking the compressed stream from two formats
isn't a terribly difficult thing.

You are right and I don't want to start another format war. In addition
to this there is one issue which makes me jump on lzip. It is not
only an LZMA decompressor; it also has a compressor counterpart in
Busybox, a compressor which gives better results than bzip2. And there
is no small xz compressor available for Busybox, or is there any light
on the horizon?

And what about lzop? It is another compressor that is not very widely
used. Why can't we have the lzip compressor in Busybox? Those who
dislike that format may disable it in the configuration.

... but once again: You are right, it would be very nice to have a
single LZMA decompressor to uncompress lzma, lzip and xz streams, if
this is possible.

... just as it would be nice to have a single zcat able to detect the
format and decompress ANY compressed stream (falling back to the
behavior of cat if uncompressed data is given to zcat), plus a
generalized uncompress that runs zcat on a file into a temp file, then
renames temp and mangles the name extensions (like gunzip).
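
A minimal sketch of the format-detecting zcat described above, written in Python purely for illustration (a Busybox applet would be C). It sniffs the leading magic bytes and falls back to cat-like pass-through. The names detect_format and zcat_bytes are hypothetical. Python's standard library has no lzip decoder, so .lz input is only recognized here, not decompressed, and legacy .lzma files carry no magic number, so they are not detected at all.

```python
import bz2
import gzip
import lzma

# Leading magic bytes of each container format.
MAGIC = [
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"LZIP", "lzip"),
]


def detect_format(head: bytes) -> str:
    """Return the format name for the given leading bytes, or "raw"."""
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return "raw"


def zcat_bytes(data: bytes) -> bytes:
    """Decompress data if its format is recognized, else return it unchanged."""
    fmt = detect_format(data[:6])
    if fmt == "gzip":
        return gzip.decompress(data)
    if fmt == "bzip2":
        return bz2.decompress(data)
    if fmt == "xz":
        return lzma.decompress(data, format=lzma.FORMAT_XZ)
    if fmt == "lzip":
        # Assumption: no lzip decoder is available in this sketch.
        raise NotImplementedError("lzip detected, but no decoder in the stdlib")
    return data  # behave like cat for uncompressed input
```

Sniffing only works if the first few bytes can be inspected before decoding starts, which is easy on regular files and needs buffering on pipes.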

--
Harald


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-28 Thread Antonio Diaz Diaz

Harald Becker wrote:

... just as it would be nice to have a single zcat able to detect the
format and decompress ANY compressed stream (falling back to operation
of cat, if uncompressed data given to zcat).


I guess you don't know zutils.

http://www.nongnu.org/zutils/zutils.html


Regards,
Antonio.


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-28 Thread Denys Vlasenko
On Thu, Feb 28, 2013 at 8:54 AM, Michael Tokarev m...@tls.msk.ru wrote:
 For some reason I haven't heard of lzip at all until now.

Yes. That's the problem, maybe the main one:
xz people won on this front hands down,
even if technically lzip is better.

-- 
vda


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-28 Thread Harald Becker
Hi Antonio !

On 28-02-2013 14:34 Antonio Diaz Diaz ant_d...@teleline.es wrote:
Harald Becker wrote:
 ... just as it would be nice to have a single zcat able to detect the
 format and decompress ANY compressed stream (falling back to
 operation of cat, if uncompressed data given to zcat).

Please read as: ... have a single zcat ... AS A BUSYBOX APPLET ...

I guess you don't know zutils.
http://www.nongnu.org/zutils/zutils.html

On my systems there is a script at /usr/local/zcat. It works well on
regular files, which is the 99.9% case I need. A zcat applet as part
of Busybox would simplify things, especially on small systems and on
rescue images.

IMO a widespread general zcat (maybe as part of GNU) would be better
than separate decompressors for every compression format ... but for a
little fly it is difficult to move the whole world.

--
Harald


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-28 Thread Antonio Diaz Diaz

Hello Harald.

Harald Becker wrote:

Please read as: ... have a single zcat ... AS A BUSYBOX APPLET ...


I see. Well, perhaps the zcat from zutils could be adapted to Busybox. :-)



IMO a widespread general zcat (maybe as part of GNU) would be better
than separate decompressors for every compression format ... but for a
little fly it is difficult to move the whole world.


Zutils was not accepted in GNU because the names conflict with those in 
gzip. Tell me about how difficult it is for a little fly to move the 
whole world. :-)



Regards,
Antonio.


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-28 Thread Denys Vlasenko
On Thu, Feb 28, 2013 at 5:53 PM, Antonio Diaz Diaz ant_d...@teleline.es wrote:
 You didn't try lzip but plzip, which is beta software. And of course,
 parallel versions of lzip or xz compress less than standard versions because
 they split data in blocks before compressing it.

 But even so there is something wrong with your test. Maybe your C++ compiler
 produces slower executables than the C compiler, or you used an old version
 of plzip or lzlib... I have just retried compressing gcc-4.7.2.tar (just in
 case) and on my single-processor machine, plzip (using the default
 compression level) is faster (6:16) than both lzip (6:37) and xz (7:32), just
 as expected.

 Why is this expected? Because both lzip and plzip use a default value for
 --match-length smaller than the equivalent option in xz (36 vs 64), and
 plzip sees a smaller effective dictionary size because it splits the input
 data in blocks.
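
For readers who want to try the effect described above: in liblzma the match-length setting is called nice_len, and Python's lzma module exposes it together with dict_size through a filter chain. The sketch below is only an illustration under assumptions (the helper name xz_compress and the sample data are made up); it shows how to set the two parameters and is not a benchmark.

```python
import lzma


def xz_compress(data: bytes, nice_len: int, dict_size: int) -> bytes:
    """Compress to .xz with an explicit match length and dictionary size."""
    filters = [{
        "id": lzma.FILTER_LZMA2,
        "preset": 6,             # start from the default preset
        "nice_len": nice_len,    # match length, e.g. 36 or 64
        "dict_size": dict_size,  # dictionary size in bytes
    }]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


# Hypothetical sample input; substitute a real tarball to compare settings.
sample = b"busybox lzip xz " * 4096
small = xz_compress(sample, nice_len=36, dict_size=1 << 20)
large = xz_compress(sample, nice_len=64, dict_size=1 << 23)
```

Comparing len(small) with len(large), and timing both calls on the same input, gives a rough feel for the trade-off on a given machine.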

A parallel compression benchmark isn't interesting: the task is trivially
parallelizable, so the speedup will be nearly exactly proportional
to the number of CPUs.

Compare one-thread xz against one-thread lzip.


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-28 Thread Antonio Diaz Diaz

Denys Vlasenko wrote:

On Thu, Feb 28, 2013 at 8:54 AM, Michael Tokarev wrote:


For some reason I haven't heard of lzip at all until now.


Yes. That's the problem, maybe the main one:
xz people won on this front hands down,
even if technically lzip is better.


Lzip is not going to disappear. Maybe a small group of influential 
people already familiarized with lzma-utils helped xz to gain a head 
start, but lzip is much more in line with what is expected from a 
compressor in unix-like systems.


Think about it, and remember the problem that began this thread. Which
LZMA compressor do you think is better for Busybox users: one that
behaves essentially like gzip and bzip2, or one that users of small
systems will probably never be able to make full use of?



Regards,
Antonio.



Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-28 Thread Antonio Diaz Diaz

Denys Vlasenko wrote:

Why? *.lzma was deprecated some time ago


Because someone submitted the code:


I also submitted the code[1]. :-)

[1] http://lists.busybox.net/pipermail/busybox/2012-December/078750.html



In fact, it is surprisingly small:

archival/libarchive:
   text    data     bss     dec     hex filename
   2827       0       0    2827     b0b decompress_unlzma.o
   7277       0       0    7277    1c6d decompress_unxz.o
   2743       0       0    2743     ab7 decompress_bunzip2.o
   5270       0       0    5270    1496 decompress_gunzip.o


Just the same size as lunzip minus the header and integrity checks:

text+data  text+rodata  rwdata  bss  filename
     2928         2928       0    0  archival/libarchive/decompress_lunzip.o




So basically, lzip lost the race wrt adoption. xz is used more widely.
Kernel tarballs are .xz, not .lz.


It depends on where you get your kernels from:

http://linux-libre.fsfla.org/pub/linux-libre/releases/3.8-gnu/

BTW, xz files are a bit smaller than lz files in the above directory, 
but you need twice the memory to decompress them. Of course lzip can 
achieve the same (or better) compression if you add -s 64MiB to the
command line.




What I'm saying is that the bbox project would like to have (ideally)
_one_ LZMA decoder. Unpacking the compressed stream from two formats
isn't a terribly difficult thing.


But xz is not (only) a LZMA encoder, therefore no LZMA decoder will ever 
be able to decode xz streams.




But can lzip decompress unxz *stream*?


If we have learned something from this thread, it is that not even all
unxz's can decompress all xz streams, much less verify their integrity.



Regards,
Antonio.


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-27 Thread Denys Vlasenko
On Mon, Feb 25, 2013 at 12:05 PM, Lasse Collin lasse.col...@tukaani.org wrote:
 liblzma in XZ Utils has a flag to decode concatenated streams to make
 it a bit easier to handle such files. I would prefer to not include
 such a flag in XZ Embedded, since I think in most embedded situations
 (boot loaders, kernels etc.) such a flag is useless. Busybox is an
 exception to this.

 Below is a patch to add support for concatenated .xz streams. It also
 handles possible padding (sequence of zero-bytes) between the streams.
 It probably has room for improvement, but it should be a useful starting
 point.

Applied, thanks!
-- 
vda


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-27 Thread Denys Vlasenko
On Mon, Feb 25, 2013 at 12:05 PM, Lasse Collin lasse.col...@tukaani.org wrote:
 By the way, since Busybox' copy of XZ Embedded hasn't been updated
 since unxz was added, this bug fix is missing from Busybox:

 
 http://git.tukaani.org/?p=xz-embedded.git;a=commitdiff;h=4cec51e1be4797a4bd8b266a1d34cabd7fdb79fd

 There is also the following bug fix but I think it doesn't affect
 Busybox' unxz:

 
 http://git.tukaani.org/?p=xz-embedded.git;a=commitdiff;h=9690fe69dc97eb2e7fe2804e4448a5278cde5411

I incorporated these and a few other changes, thanks!

-- 
vda


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-27 Thread Antonio Diaz Diaz

Hello Denys et al.

Denys Vlasenko wrote:

On Mon, Feb 25, 2013 at 7:20 PM, Matias A. Fonzo s...@dragora.org wrote:

Can lzip be considered for inclusion in busybox?

[...]

Matias, sure, this can be done.

But bbox already has *two* LZMA decompressors.
Feels wrong, doesn't it?


It certainly feels wrong, but those two are in reality the same one, 
which suffered a radical and bumpy transformation. (Do you remember 
lzma-4.42, which had a format incompatible with both lzma-4.32 and xz?)


Some people think that adding lzma support to GNU tools was a mistake. I 
think that adding xz support was simply the continuation of the same 
mistake.


As lzma is legacy software, I guess it will eventually be removed from 
Busybox, just as it is being removed from GNU packages: "The deprecated 
'lzma' compression format for distribution archives has been removed, in 
favor of 'xz' and 'lzip'"[1].


[1] http://lists.gnu.org/archive/html/automake/2012-04/msg00060.html

The xz decompressor included in Busybox is not able to decompress all 
valid xz files because it only understands the xz-embedded subset of the 
xz format. Therefore, any user wanting to decompress or check the 
integrity of real xz files needs to install the full xz!


None of the other formats (bzip2, gzip, lzip) have this problem (the 
lunzip proposed for Busybox is able to decompress and check any .lz 
file, even those produced by plzip, the parallel version of lzip). And 
it can only get worse for xz, because "it is possible and even somewhat 
likely that new features will be added in the future which old programs 
won't support"[2].


[2] http://www.mail-archive.com/xz-devel@tukaani.org/msg00059.html



In the long run it would be a nightmare to have two
or more LZMA (de)compressors in common use on Linux.


Agreed.



What happened between lzip and xz? Are they incompatible?
On what level? File format, or compression stream format too?


The history in a nutshell: In 2008, Antonio Diaz released lzip, which 
uses a proper container format with checksums and magic numbers instead 
of the raw LZMA data stream, providing a complete Unix-style solution 
for using LZMA. Nevertheless, LZMA Utils was extended to have similar 
features and then renamed to XZ Utils[3].


[3] http://en.wikipedia.org/wiki/Lzip

Lzip and xz are totally incompatible. Lzip uses the same stream format 
as .lzma files, just with a proper header and trailer. Xz is a complex 
container format derived from 7-zip (or at least inspired by it) and 
without any resemblance to the old .lzma format.
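
To make the file-format level of the incompatibility concrete, here is a minimal sketch that tells the containers apart by their leading magic bytes. The enum and the function name guess_format are hypothetical, invented for this illustration, and are not taken from Busybox, lzip or XZ Utils. Legacy .lzma files have no magic number, so a check like this cannot recognize them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum comp_format { FMT_UNKNOWN, FMT_GZIP, FMT_BZIP2, FMT_LZIP, FMT_XZ };

/* Guess the container format from the first bytes of a file.
 * Legacy .lzma files carry no magic number, so they end up
 * as FMT_UNKNOWN here. */
static enum comp_format guess_format(const unsigned char *buf, size_t len)
{
	static const unsigned char xz_magic[6] = { 0xFD, '7', 'z', 'X', 'Z', 0x00 };

	if (len >= 6 && memcmp(buf, xz_magic, 6) == 0)
		return FMT_XZ;
	if (len >= 4 && memcmp(buf, "LZIP", 4) == 0)
		return FMT_LZIP;
	if (len >= 3 && memcmp(buf, "BZh", 3) == 0)
		return FMT_BZIP2;
	if (len >= 2 && buf[0] == 0x1F && buf[1] == 0x8B)
		return FMT_GZIP;
	return FMT_UNKNOWN;
}
```

A caller would read the first few bytes of the input and dispatch to the matching decompressor. A match on the magic bytes only identifies the container and says nothing about whether a given decoder supports every feature used inside it.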


Lzip is a compressor, just like gzip and bzip2.

Xz is much more complex than that. Even the stripped-down version of 
unxz included in Busybox is already larger than any of the other 
decompressors.


IMHO all this leaves lzip as the LZMA compressor most suitable for 
Unix-like systems in general, and for Busybox in particular.



Best regards,
Antonio.



Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-27 Thread Michael Tokarev
28.02.2013 04:22, Antonio Diaz Diaz wrote:
[]
 The history in a nutshell: In 2008, Antonio Diaz released lzip, which uses a 
 proper container format with checksums and magic numbers instead of the raw 
 LZMA data stream, providing a complete Unix-style solution for using LZMA. 
 Nevertheless, LZMA Utils was extended to have similar features and then 
 renamed to XZ Utils[3].

Oh.  I remember that around 2008 (or a bit before) the kernel folks discussed
which format to use for kernel.org archives and leaned towards lzma, and I
pointed out that it does not have any checksums.  I guess that was a starting
point for xz and lzip.

For some reason I haven't heard of lzip at all until now.  I remember that when
xz came out, I looked at it and noticed its complexity and lack of a stable
format, exactly as you describe, but that didn't ring any bells for me and
eventually it became a widely known and accepted format.

So, I became curious how lzip behaves.  And I immediately gave it a very quick
try.

CPU: AMD AthlonII X2 260, 3.2GHz (2 cores)
file: 1 GiB (1073741824 bytes), an image of a small Linux virtual machine.

 .lz:  273684804, real 11m53.112s, user 20m30.563s
 .xz:  266670056, real 11m8.190s,  user 10m45.835s

This is the default compression level.

WOW.  So 2-thread plzip uses about TWO TIMES the CPU time of single-threaded xz
when compressing, which makes plzip on 2 cores only about as fast as xz in
wall-clock time.  The .lz result is also slightly larger.
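
For anyone who wants the ratios behind that statement, here is a small sketch that derives them from the figures quoted above. All helper names are made up for this example. It adds no new measurements; it only does the arithmetic on the sizes and timings already posted.

```c
#include <stdio.h>

/* Figures copied from the measurements in the message above. */
static const double orig_bytes = 1073741824.0;
static const double lz_bytes = 273684804.0;
static const double xz_bytes = 266670056.0;

/* Convert a "XmY.YYYs" timing as printed by time(1) into seconds. */
static double to_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* By how many percent is a larger than b? */
static double percent_larger(double a, double b)
{
	return (a - b) / b * 100.0;
}

/* Print the derived ratios; call this from main() to see them. */
static void print_summary(void)
{
	double lz_user = to_seconds(20, 30.563), xz_user = to_seconds(10, 45.835);
	double lz_real = to_seconds(11, 53.112), xz_real = to_seconds(11, 8.190);

	printf("compressed size: .lz %.1f%%, .xz %.1f%% of original\n",
	       lz_bytes / orig_bytes * 100.0, xz_bytes / orig_bytes * 100.0);
	printf(".lz is %.2f%% larger than .xz\n",
	       percent_larger(lz_bytes, xz_bytes));
	printf("user time ratio lz/xz: %.2f\n", lz_user / xz_user);
	printf("real time ratio lz/xz: %.2f\n", lz_real / xz_real);
}
```

The "two times" figure refers to user (CPU) time. The wall-clock times of the two runs differ by well under ten percent.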

Are you sure the stream and compression algorithm are the same? :)

Thanks,

/mjt


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-26 Thread John Spencer

On 02/26/2013 07:43 AM, Michael Tokarev wrote:

26.02.2013 03:21, John Spencer wrote:

[ quoting the full mail of lasse since it didn't make its way into the bb 
maillist yet ]


Additionally there has been a discussion and attempts to cook up a
patch in Debian, see http://bugs.debian.org/686502 , which I submitted
as a bug to busybox bugzilla -- https://bugs.busybox.net/show_bug.cgi?id=5804 .
Cc'ing the Debian bugreport.  I like the below patch better :)


the patches for busybox 1.20.2 are available in this commit
https://github.com/rofl0r/sabotage/commit/c03ddd39878473939bda6b574bc8854c533b4b00


(so that you dont have to backport them yourselves again)

i.e.
https://raw.github.com/rofl0r/sabotage/c03ddd39878473939bda6b574bc8854c533b4b00/KEEP/busybox-xz-bugfix1.patch
https://raw.github.com/rofl0r/sabotage/c03ddd39878473939bda6b574bc8854c533b4b00/KEEP/busybox-xz-bugfix2.patch
https://raw.github.com/rofl0r/sabotage/c03ddd39878473939bda6b574bc8854c533b4b00/KEEP/busybox-xz-bugfix3.patch



/mjt



--JS


Re: XZ embedded bug unpacking linux-3.8.tar.xz (was: Re: tar: short read on linux-3.8.tar.xz)

2013-02-25 Thread Matias A. Fonzo
On Mon, 25 Feb 2013 07:14:28 +0100
Denys Vlasenko vda.li...@googlemail.com wrote:
 [CC'ing XZ embedded author]
 
 On Sunday 24 February 2013 22:37, John Spencer wrote:
   http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.8.tar.xz
  
   using busybox 1.20.2 and xz 5.0.3 or xz 5.0.4:
  
   $ tar xf linux-3.8.tar.xz
  
   i get: short read and exit status 1.
   however the data seems to be there (at least partial).
  
  the culprit is the file linux-3.8/drivers/media/tuners/mt2063.c
  
  after doing xzcat linux-3.8.tar.xz > linux-3.8.tar , that file is 
  truncated after 4096*2+512 bytes.
  
  xzcat is from busybox (not from xz, as i assumed earlier)
  
  the .tar file is truncated at this point as well, it is only 200
  MB, but with xzcat from xz package, it is > 500 MB.
 
 Apparently XZ embedded has a bug :(
 Not only our in-tree one, but the latest git of it is buggy too:
 
 $ git clone http://git.tukaani.org/xz-embedded.git
 $ cd xz-embedded/userspace
 $ make
 $ ./xzminidec < /tmp/linux-3.8.tar.xz | wc -c
 ./xzminidec: Unsupported check; not verifying file integrity
 ..working for some time...
 201330688
 
 (xzminidec doesn't crash: exit code is zero).
 
 The peculiar thing is that 201330688 is exactly 0x0c001000.
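
A quick way to check that arithmetic, and to see what the hex value looks like when split at the MiB boundary, is sketched below. The helper names are hypothetical. The split is only an observation about the number and does not by itself explain where the output stops.

```c
/* Byte count reported by `wc -c` in the message above. */
static const unsigned long truncated_size = 201330688UL;

/* Number of whole MiB (2^20 bytes) contained in n. */
static unsigned long whole_mib(unsigned long n)
{
	return n >> 20;
}

/* Bytes left over after removing the whole MiB from n. */
static unsigned long rest_bytes(unsigned long n)
{
	return n & ((1UL << 20) - 1);
}
```

With these helpers, 0x0c001000 comes out as 192 whole MiB plus 4096 bytes.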

Lack of integrity checking, (de)compressor.

Can lzip be considered for inclusion in busybox?

[1] http://lzip.nongnu.org
[2] http://en.wikipedia.org/wiki/Lzip
[3] http://lists.busybox.net/pipermail/busybox/2012-December/078750.html
[4] http://ur1.ca/810mp



Re: XZ embedded bug unpacking linux-3.8.tar.xz (was: Re: tar: short read on linux-3.8.tar.xz)

2013-02-25 Thread Harald Becker
Hi !

Can lzip be considered for inclusion in busybox?

I'm also interested to get that lzip into Busybox. I used the provided
patch on current snapshot and it works fine for me.

--
Harald


Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-25 Thread John Spencer
[ quoting the full mail of lasse since it didn't make its way into the bb 
maillist yet ]


On 02/25/2013 12:05 PM, Lasse Collin wrote:

On 2013-02-25 Denys Vlasenko wrote:

On Sunday 24 February 2013 22:37, John Spencer wrote:

http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.8.tar.xz

using busybox 1.20.2 and xz 5.0.3 or xz 5.0.4:

$ tar xf linux-3.8.tar.xz

i get: short read and exit status 1.
however the data seems to be there (at least partial).


the culprit is the file linux-3.8/drivers/media/tuners/mt2063.c

after doing xzcat linux-3.8.tar.xz > linux-3.8.tar , that file is
truncated after 4096*2+512 bytes.

xzcat is from busybox (not from xz, as i assumed earlier)

the .tar file is truncated at this point as well, it is only 200
MB, but with xzcat from xz package, it is > 500 MB.


Apparently XZ embedded has a bug :(
Not only our in-tree one, but the latest git of it is buggy too:

$ git clone http://git.tukaani.org/xz-embedded.git
$ cd xz-embedded/userspace
$ make
$ ./xzminidec < /tmp/linux-3.8.tar.xz | wc -c
./xzminidec: Unsupported check; not verifying file integrity
..working for some time...
201330688

(xzminidec doesn't crash: exit code is zero).

The peculiar thing is that 201330688 is exactly 0x0c001000.


linux-3.8.tar.xz from kernel.org has three concatenated .xz streams.
You can see this with e.g. xz -l or xz -lvv. At least pxz creates
such .xz files. Such files are valid and fine.

xzminidec is a limited example program. It doesn't support concatenated
streams. This is mentioned in the comment in the beginning of
xzminidec.c. One may argue if it is a bug or a feature, but at least
the limitation has been documented.

Busybox' xzcat lacks support for concatenated .xz streams. For
comparison, Busybox' zcat and bzcat do support concatenated streams.

 $ echo foo | gzip > test.gz
 $ echo bar | gzip >> test.gz
 $ busybox zcat test.gz
 foo
 bar

 $ echo foo | xz > test.xz
 $ echo bar | xz >> test.xz
 $ busybox xzcat test.xz
 foo

liblzma in XZ Utils has a flag to decode concatenated streams to make
it a bit easier to handle such files. I would prefer to not include
such a flag in XZ Embedded, since I think in most embedded situations
(boot loaders, kernels etc.) such a flag is useless. Busybox is an
exception to this.

Below is a patch to add support for concatenated .xz streams. It also
handles possible padding (sequence of zero-bytes) between the streams.
It probably has room for improvement, but it should be a useful starting
point.

diff --git a/archival/libarchive/decompress_unxz.c b/archival/libarchive/decompress_unxz.c
index 79b48a1..5ebbd28 100644
--- a/archival/libarchive/decompress_unxz.c
+++ b/archival/libarchive/decompress_unxz.c
@@ -86,8 +86,40 @@ unpack_xz_stream(transformer_aux_data_t *aux, int src_fd, int dst_fd)
 			IF_DESKTOP(total += iobuf.out_pos;)
 			iobuf.out_pos = 0;
 		}
-		if (r == XZ_STREAM_END) {
-			break;
+		while (r == XZ_STREAM_END) {
+			/* Handle concatenated .xz Streams including possible
+			 * Stream Padding.
+			 */
+			if (iobuf.in_pos == iobuf.in_size) {
+				int rd = safe_read(src_fd, membuf, BUFSIZ);
+				if (rd < 0) {
+					bb_error_msg(bb_msg_read_error);
+					total = -1;
+					goto out;
+				}
+				if (rd == 0)
+					goto out;
+
+				iobuf.in_size = rd;
+				iobuf.in_pos = 0;
+			}
+
+			/* Stream Padding must always be a multiple of four
+			 * bytes to preserve four-byte alignment. To keep the
+			 * code slightly smaller, we aren't as strict here as
+			 * the .xz spec requires. We just skip all zero-bytes
+			 * without checking the alignment and thus can accept
+			 * files that aren't valid, e.g. the XZ Utils test
+			 * files bad-0pad-empty.xz and bad-0catpad-empty.xz.
+			 */
+			while (iobuf.in_pos < iobuf.in_size) {
+				if (membuf[iobuf.in_pos] != 0) {
+					xz_dec_reset(state);
+					r = XZ_OK;
+					break;
+				}
+				++iobuf.in_pos;
+			}
 		}
 		if (r != XZ_OK && r != XZ_UNSUPPORTED_CHECK) {
 			bb_error_msg("corrupted data");
@@ -95,6 +127,8 @@ unpack_xz_stream(transformer_aux_data_t *aux, 
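
The comment in the patch says that it skips zero bytes without checking the four-byte alignment that it attributes to the .xz spec. For comparison, a stricter check could look roughly like the sketch below. The helper skip_stream_padding is hypothetical and is not part of the patch, XZ Embedded or Busybox. It also assumes the whole padding run sits in one buffer, whereas the patch refills the buffer while skipping, so a real version would have to carry the count across reads.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Skips zero bytes in buf starting at *pos and advances *pos past them.
 * Returns 0 if the number of skipped bytes is a multiple of four
 * (the alignment rule quoted in the patch comment), -1 otherwise.
 * Assumption: the padding does not continue beyond buf[size - 1]. */
static int skip_stream_padding(const unsigned char *buf, size_t size,
			       size_t *pos)
{
	size_t start = *pos;

	while (*pos < size && buf[*pos] == 0)
		++*pos;

	return ((*pos - start) % 4 == 0) ? 0 : -1;
}
```

The trade-off is the one the patch comment describes: the lenient loop is slightly smaller but accepts files such as bad-0pad-empty.xz, while a check like this would reject misaligned padding at the cost of a little extra code and state.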

Re: XZ embedded bug unpacking linux-3.8.tar.xz

2013-02-25 Thread Michael Tokarev
26.02.2013 03:21, John Spencer wrote:
 [ quoting the full mail of lasse since it didn't make its way into the bb 
 maillist yet ]

Additionally there has been a discussion and attempts to cook up a
patch in Debian, see http://bugs.debian.org/686502 , which I submitted
as a bug to busybox bugzilla -- https://bugs.busybox.net/show_bug.cgi?id=5804 .
Cc'ing the Debian bugreport.  I like the below patch better :)

/mjt

 On 02/25/2013 12:05 PM, Lasse Collin wrote:
 On 2013-02-25 Denys Vlasenko wrote:
 On Sunday 24 February 2013 22:37, John Spencer wrote:
 http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.8.tar.xz

 using busybox 1.20.2 and xz 5.0.3 or xz 5.0.4:

 $ tar xf linux-3.8.tar.xz

 i get: short read and exit status 1.
 however the data seems to be there (at least partial).

 the culprit is the file linux-3.8/drivers/media/tuners/mt2063.c

 after doing xzcat linux-3.8.tar.xz > linux-3.8.tar , that file is
 truncated after 4096*2+512 bytes.

 xzcat is from busybox (not from xz, as i assumed earlier)

 the .tar file is truncated at this point as well, it is only 200
 MB, but with xzcat from xz package, it is > 500 MB.

 Apparently XZ embedded has a bug :(
 Not only our in-tree one, but the latest git of it is buggy too:

 $ git clone http://git.tukaani.org/xz-embedded.git
 $ cd xz-embedded/userspace
 $ make
 $ ./xzminidec < /tmp/linux-3.8.tar.xz | wc -c
 ./xzminidec: Unsupported check; not verifying file integrity
 ..working for some time...
 201330688

 (xzminidec doesn't crash: exit code is zero).

 The peculiar thing is that 201330688 is exactly 0x0c001000.

 linux-3.8.tar.xz from kernel.org has three concatenated .xz streams.
 You can see this with e.g. xz -l or xz -lvv. At least pxz creates
 such .xz files. Such files are valid and fine.

 xzminidec is a limited example program. It doesn't support concatenated
 streams. This is mentioned in the comment in the beginning of
 xzminidec.c. One may argue if it is a bug or a feature, but at least
 the limitation has been documented.

 Busybox' xzcat lacks support for concatenated .xz streams. For
 comparison, Busybox' zcat and bzcat do support concatenated streams.

  $ echo foo | gzip > test.gz
  $ echo bar | gzip >> test.gz
  $ busybox zcat test.gz
  foo
  bar

  $ echo foo | xz > test.xz
  $ echo bar | xz >> test.xz
  $ busybox xzcat test.xz
  foo

 liblzma in XZ Utils has a flag to decode concatenated streams to make
 it a bit easier to handle such files. I would prefer to not include
 such a flag in XZ Embedded, since I think in most embedded situations
 (boot loaders, kernels etc.) such a flag is useless. Busybox is an
 exception to this.

 Below is a patch to add support for concatenated .xz streams. It also
 handles possible padding (sequence of zero-bytes) between the streams.
 It probably has room for improvement, but it should be a useful starting
 point.

 diff --git a/archival/libarchive/decompress_unxz.c b/archival/libarchive/decompress_unxz.c
 index 79b48a1..5ebbd28 100644
 --- a/archival/libarchive/decompress_unxz.c
 +++ b/archival/libarchive/decompress_unxz.c
 @@ -86,8 +86,40 @@ unpack_xz_stream(transformer_aux_data_t *aux, int src_fd, int dst_fd)
  			IF_DESKTOP(total += iobuf.out_pos;)
  			iobuf.out_pos = 0;
  		}
 -		if (r == XZ_STREAM_END) {
 -			break;
 +		while (r == XZ_STREAM_END) {
 +			/* Handle concatenated .xz Streams including possible
 +			 * Stream Padding.
 +			 */
 +			if (iobuf.in_pos == iobuf.in_size) {
 +				int rd = safe_read(src_fd, membuf, BUFSIZ);
 +				if (rd < 0) {
 +					bb_error_msg(bb_msg_read_error);
 +					total = -1;
 +					goto out;
 +				}
 +				if (rd == 0)
 +					goto out;
 +
 +				iobuf.in_size = rd;
 +				iobuf.in_pos = 0;
 +			}
 +
 +			/* Stream Padding must always be a multiple of four
 +			 * bytes to preserve four-byte alignment. To keep the
 +			 * code slightly smaller, we aren't as strict here as
 +			 * the .xz spec requires. We just skip all zero-bytes
 +			 * without checking the alignment and thus can accept
 +			 * files that aren't valid, e.g. the XZ Utils test
 +			 * files bad-0pad-empty.xz and bad-0catpad-empty.xz.
 +			 */
 +			while (iobuf.in_pos < iobuf.in_size) {
 +				if (membuf[iobuf.in_pos] != 0) {
 +					xz_dec_reset(state);
 +					r = XZ_OK;
 +					break;
 +				}
 +				++iobuf.in_pos;
 +			}
  		}
  		if (r != XZ_OK && r != XZ_UNSUPPORTED_CHECK) {
  			bb_error_msg("corrupted data");
 @@ -95,6 +127,8 @@ unpack_xz_stream(transformer_aux_data_t *aux, int src_fd, int dst_fd)
  			break;
  		}
  	}
 +
 +out: