Laurence Parry wrote...

> Perhaps, though the SWF format does not make it easy . . .

Thanks a lot for your input, and sorry for the long delay. I'll try to
find a solution that covers at least the vast majority of the files that
are out there. Frankly speaking, file(1) cannot be perfect and never
will be. But we can at least aim for it.

> == FWS ==
(...)

> On the plus side, "FrameSize RECT always has Xmin and Ymin value of 0." So
> we could create 31 cases depending on the value of the ninth octet equating
> to a particular bitmask and then check for 0-values for Xmin and Ymin [which
> vary in length and, for Ymin, position, depending on their length].
> 
> In other words, in this particular case, we check in bitstream order for:
> [01011|00000000000|xxxxxxxxxxx|0000000000]
> [mask |   Xmin   |   Xmax   |   Ymin  ]
> 
> I foresee lots of & and ^, unfortunately. But it should be possible. Could
> short-cut it a bit, since for all but the 1- and 2-bit cases, the rest of
> the ninth octet must be 0 in order to match Xmin, so it's not necessary to
> mask the ninth octet to match the first five bits.

That is something to work on. Most notably, a mask length of six or more
requires that the following octet has a value of 0x1f at most, i.e. it
is non-printable. This leaves six cases to examine, which is feasible.
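
To illustrate the bitstream check discussed above, here is a rough
Python sketch (not file(1) magic syntax). It assumes the RECT starts at
the ninth octet, that its first five bits give the field length, and
that the fields are packed most significant bit first, as in the quoted
description. The names read_bits and fws_rect_plausible are made up for
this example.

```python
def read_bits(data: bytes, bit_offset: int, count: int) -> int:
    """Read `count` bits, MSB first, starting at absolute bit `bit_offset`."""
    value = 0
    for i in range(bit_offset, bit_offset + count):
        bit = (data[i // 8] >> (7 - i % 8)) & 1
        value = (value << 1) | bit
    return value


def fws_rect_plausible(data: bytes) -> bool:
    """Heuristic: signature is FWS and the FrameSize RECT has Xmin == Ymin == 0."""
    if len(data) < 9 or data[:3] != b"FWS":
        return False
    start = 8 * 8                      # RECT assumed to begin at the ninth octet
    nbits = read_bits(data, start, 5)  # length of each of the four fields
    if len(data) * 8 < start + 5 + 4 * nbits:
        return False                   # not enough bytes to hold the whole RECT
    xmin = read_bits(data, start + 5, nbits)
    ymin = read_bits(data, start + 5 + 2 * nbits, nbits)
    return xmin == 0 and ymin == 0
```

A real magic entry cannot loop like this, so it would have to spell out
the short field lengths one by one; the sketch only shows what those
entries would be testing.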


> == ZLIB  (CWS) ==

> CM (compression method) nibble is always 8, and the CINFO (compression info)
> nibble which defines the base-2 logarithm of the LZ77 window size, minus
> eight, must be 7 or below. In all the files I have examined, it is 7;
> however it could theoretically be something else. This means the ninth byte
> of a CWS file is 0xN8 , where N <= 7; and commonly it is 0x78 ('x'). [Note:
> it is perfectly possible for an uncompressed FWS file to have an 0x78 in the
> 9th position.]

You brought back old memories. I remember having to detect compressed
files before; it might have been git's pack files. However, this is one
of the places where I'd sacrifice perfection for a solution that is good
enough for most cases.
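
For reference, the nibble test quoted above could look roughly like this
in Python. The function name is invented for the example. The last check
is not from the quoted text: it is the FCHECK rule from RFC 1950, which
requires the first two zlib bytes, read as a big-endian 16-bit number,
to be a multiple of 31. Treat the whole thing as a "good enough"
heuristic rather than a proof that the data is a valid zlib stream.

```python
def cws_zlib_header_plausible(data: bytes) -> bool:
    """Heuristic: signature is CWS and octets 9 and 10 look like a zlib header."""
    if len(data) < 10 or data[:3] != b"CWS":
        return False
    cmf, flg = data[8], data[9]
    cm = cmf & 0x0F   # compression method, must be 8 (deflate)
    cinfo = cmf >> 4  # log2(window size) - 8, must be 7 or below
    if cm != 8 or cinfo > 7:
        return False
    # Addition beyond the quoted text: RFC 1950 FCHECK constraint.
    return (cmf * 256 + flg) % 31 == 0
```

The common case 0x78 0x9c passes all three checks, since 0x789c is
divisible by 31.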

> == LZMA (ZWS) ==

> I don't have any of these SWF files to hand, but the specification above
> notes that LZMA Utils only creates files with lc/lp/pb values 3/0/2. This
> would correspond to a properties byte of 0x5d (9th octet). There is also a
> little-endian dictionary size and a file length, which may be all FF if it
> is unknown. For comparison, one bare .lzma file looks like this:
> 
> 00000000  5d 00 00 80 00 ff ff ff  ff ff ff ff ff 00 16 e9
> |]...............|

So we'll have to guess here anyway. For all three I'll try to come up
with something suitable within the next few hours (uploads targeting
stretch should be done by tomorrow). Upstreaming them will be my job,
too.
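
As a worked example of the numbers quoted above, this Python sketch
decodes the first 13 bytes of a bare .lzma file like the one in the hex
dump: one properties byte, a little-endian 32-bit dictionary size, and a
little-endian 64-bit length. The properties byte is (pb * 5 + lp) * 9 +
lc, so 3/0/2 gives 0x5d. The function name is made up, and where these
fields sit inside a ZWS file is deliberately left out, since no sample
was at hand.

```python
import struct


def parse_lzma_alone_header(header: bytes) -> dict:
    """Decode the 13-byte header of a bare .lzma file."""
    if len(header) < 13:
        raise ValueError("need at least 13 bytes")
    # "<" disables padding: 1-byte props, 4-byte dict size, 8-byte length.
    props, dict_size, size = struct.unpack_from("<BIQ", header, 0)
    if props >= 9 * 5 * 5:
        raise ValueError("properties byte out of range")
    return {
        "lc": props % 9,
        "lp": (props // 9) % 5,
        "pb": props // 45,
        "dict_size": dict_size,
        # All-FF length means the uncompressed size is unknown.
        "size": None if size == 0xFFFFFFFFFFFFFFFF else size,
    }
```

Applied to the dump above, this yields lc/lp/pb of 3/0/2, a dictionary
size of 0x800000 (8 MiB) and an unknown length.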

> Perhaps it's possible to delegate to the LZMA and ZLIB magic to test this?

I'll keep that in mind. It might require a major change in file's
architecture.

    Christoph
