Perhaps, though the SWF format does not make it easy . . .

== FWS ==
https://www.adobe.com/content/dam/Adobe/en/devnet/swf/pdf/swf-file-format-spec.pdf
(See Appendix A for another walkthrough)

All byte-aligned integer values are in little-endian byte order, but bit fields are packed in big-endian bit order (most significant bit first) within bytes. Signed integers use ordinary two's-complement arithmetic, including sign-extension.

To start with, FrameSize is a RECT - a variable-length structure starting with an _unsigned_ five-bit value determining how many bits the other four _signed_ bit-values (Xmin, Xmax, Ymin, Ymax) each have. If it starts 01011 in bitstream order, the next eleven bits are Xmin, and so on.
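
To make the layout concrete, here is a rough Python sketch of a RECT decoder. The function name `read_rect` and the default offset of 8 (the byte right after Signature, Version and Length in an uncompressed file) are my own choices for illustration, not anything from the spec or from libmagic.

```python
def read_rect(data: bytes, offset: int = 8):
    """Decode a RECT at byte `offset`; returns (nbits, xmin, xmax, ymin, ymax)."""
    nbits = data[offset] >> 3              # first five bits, unsigned
    total_bits = 5 + 4 * nbits
    nbytes = (total_bits + 7) // 8         # RECT is padded to a byte boundary
    # Read the whole RECT as one big-endian integer so that bits come out
    # in bitstream order, then drop the padding bits at the end.
    value = int.from_bytes(data[offset:offset + nbytes], "big")
    value >>= nbytes * 8 - total_bits
    fields = []
    for i in range(4):                     # Xmin, Xmax, Ymin, Ymax
        raw = (value >> ((3 - i) * nbits)) & ((1 << nbits) - 1)
        if nbits and raw & (1 << (nbits - 1)):
            raw -= 1 << nbits              # two's-complement sign extension
        fields.append(raw)
    return (nbits, *fields)


# Hand-built example: a 550x400 pixel stage is 11000x8000 twips, 15 bits per field.
sample = b"FWS\x0a" + bytes(4) + bytes.fromhex("7800055f00000fa000")
print(read_rect(sample))
```

The Length bytes in `sample` are just zeroed placeholders; only the nine RECT bytes matter here.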

On the plus side, "FrameSize RECT always has Xmin and Ymin value of 0." So we could create 31 cases, one per possible field width, each matching the ninth octet against a particular bitmask and then checking for 0-values for Xmin and Ymin [which vary in length and, for Ymin, position, depending on the field width].

In other words, in this particular case, we check in bitstream order for:
[01011|00000000000|xxxxxxxxxxx|00000000000]
[mask |   Xmin    |   Xmax    |   Ymin    ]

I foresee lots of & and ^, unfortunately. But it should be possible. Could short-cut it a bit, since for all but the 1- and 2-bit cases, the rest of the ninth octet must be 0 in order to match Xmin, so it's not necessary to mask the ninth octet to match the first five bits.
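
As a sanity check on the 31 cases, the per-width patterns can be generated rather than written by hand. This is only a Python sketch of the idea; `rect_zero_origin_pattern` and `matches` are hypothetical helper names, and translating the resulting byte masks into actual magic entries is left open.

```python
def rect_zero_origin_pattern(nbits: int):
    """Return (mask, value) byte strings matching a RECT with Xmin == Ymin == 0."""
    total_bits = 5 + 3 * nbits             # width field, Xmin, Xmax, Ymin
    nbytes = (total_bits + 7) // 8
    field = (1 << nbits) - 1
    # Stream order: [5-bit width | Xmin | Xmax | Ymin]; Xmax is a don't-care.
    mask = (0b11111 << (3 * nbits)) | (field << (2 * nbits)) | field
    value = nbits << (3 * nbits)           # Xmin and Ymin bits are all zero
    pad = nbytes * 8 - total_bits          # left-align to a byte boundary
    return ((mask << pad).to_bytes(nbytes, "big"),
            (value << pad).to_bytes(nbytes, "big"))


def matches(data: bytes, nbits: int, offset: int = 8) -> bool:
    """True if data[offset:] looks like a RECT of this width with a zero origin."""
    mask, value = rect_zero_origin_pattern(nbits)
    window = data[offset:offset + len(mask)]
    return len(window) == len(mask) and all(
        (b & m) == v for b, m, v in zip(window, mask, value))


# Same hand-built 550x400 header bytes as a typical 15-bit case.
sample = b"FWS\x0a" + bytes(4) + bytes.fromhex("7800055f00000fa000")
print(matches(sample, 15), matches(sample, 11))
```

For widths of 3 and above the first mask byte comes out as 0xFF, which is the short-cut described above: the ninth octet can simply be compared against the width shifted left by three.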

FrameRate and FrameCount might be useful, too. Note that as integers, they are byte-aligned, with zero-padding at the end of the preceding RECT if necessary.
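
Because the RECT is padded to a whole number of bytes, the position of the two following fields depends only on the five-bit width. A small Python sketch, assuming FrameRate is the 8.8 fixed-point UI16 and FrameCount the plain UI16 that the spec describes; `frame_rate_and_count` is a made-up name:

```python
import struct


def frame_rate_and_count(data: bytes, rect_offset: int = 8):
    """Return (frames per second, frame count) from an uncompressed header."""
    nbits = data[rect_offset] >> 3
    rect_len = (5 + 4 * nbits + 7) // 8    # RECT length in bytes, padding included
    pos = rect_offset + rect_len
    # Little-endian 8.8 fixed point: fraction byte first, then whole part.
    fraction, whole, count = struct.unpack_from("<BBH", data, pos)
    return whole + fraction / 256, count


# Hand-built header: 15-bit RECT, 24 fps, 1 frame.
sample = (b"FWS\x0a" + bytes(4) + bytes.fromhex("7800055f00000fa000")
          + bytes.fromhex("0018") + bytes.fromhex("0100"))
print(frame_rate_and_count(sample))
```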

--

There is another problem: those octets are only guaranteed to be available for FWS. In the case of CWS or ZWS, the files are compressed after Length with ZLIB (introduced in SWF 6) or LZMA (SWF 13) respectively.
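
Outside of libmagic, getting at those octets for a CWS file is straightforward; the point is only that a magic pattern cannot do this step. A Python sketch using the standard `zlib` module, with `swf_body` as a hypothetical helper name. ZWS is deliberately not handled, since I have no sample to check the layout against.

```python
import zlib


def swf_body(data: bytes) -> bytes:
    """Return the bytes that follow the 8-byte Signature/Version/Length prefix,
    decompressed if necessary, so that FrameSize starts at index 0."""
    signature = data[:3]
    if signature == b"FWS":
        return data[8:]
    if signature == b"CWS":
        return zlib.decompress(data[8:])   # everything after Length is a zlib stream
    if signature == b"ZWS":
        raise NotImplementedError("LZMA-compressed SWF is not handled in this sketch")
    raise ValueError("not an SWF signature")


# Round trip with a made-up body to show the two supported cases agree.
body = bytes.fromhex("7800055f00000fa000") + bytes.fromhex("00180100")
length = (8 + len(body)).to_bytes(4, "little")
fws = b"FWS\x0a" + length + body
cws = b"CWS\x0a" + length + zlib.compress(body)
print(swf_body(fws) == swf_body(cws))
```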

The file in question was CWS, and I understand this to be the default option in current versions of Adobe software, which are also the ones most likely to be saving files in the latest versions. Reviewing an assortment of the latest SWF files uploaded to our website, the division is 60%/40% CWS/FWS.

The compressed length relates to the actual length of the file, but I don't think libmagic can use that. However, the files must be in the corresponding compressed formats, which have their own headers that may be of use.

== ZLIB (CWS) ==
https://www.ietf.org/rfc/rfc1950.txt
The CM (compression method) nibble is always 8, and the CINFO (compression info) nibble, which defines the base-2 logarithm of the LZ77 window size, minus eight, must be 7 or below. In all the files I have examined, it is 7; however, it could theoretically be something else. This means the ninth byte of a CWS file is 0xN8, where N <= 7; commonly it is 0x78 ('x'). [Note: it is perfectly possible for an uncompressed FWS file to have 0x78 in the ninth position.]

The flag octet after it is commonly 0x9C ('œ') but this is not guaranteed; I have also seen 0xDA ('Ú') and various other values may be expected, so I would not rely on it. Beyond that is the possible dictionary ID and then compressed data.
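
Rather than matching specific flag values, RFC 1950 gives one property that ties the two octets together: the low five bits of the flag octet are a check value chosen so that the pair, read as a big-endian 16-bit number, is a multiple of 31. A Python sketch of that test; `looks_like_zlib_header` is a made-up name, and whether libmagic can express a modulo test on a 16-bit value is something I have not checked.

```python
import zlib


def looks_like_zlib_header(cmf: int, flg: int) -> bool:
    """Check the two-octet zlib header rules from RFC 1950."""
    cm = cmf & 0x0F                        # compression method, low nibble
    cinfo = cmf >> 4                       # log2(window size) - 8, high nibble
    return cm == 8 and cinfo <= 7 and ((cmf << 8) | flg) % 31 == 0


print(looks_like_zlib_header(0x78, 0x9C))  # the common case
print(looks_like_zlib_header(0x78, 0xDA))  # also seen in the wild
print(looks_like_zlib_header(0x78, 0x00))  # fails the multiple-of-31 check

# Whatever the local zlib emits should pass as well.
header = zlib.compress(b"example")[:2]
print(looks_like_zlib_header(header[0], header[1]))
```

This narrows the false positives from uncompressed files that happen to have 0x78 in the ninth position, without hard-coding any one flag value.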

== LZMA (ZWS) ==
http://www.7-zip.org/a/lzma-specification.7z
with a summary at
https://svn.python.org/projects/external/xz-5.0.3/doc/lzma-file-format.txt

I don't have any of these SWF files to hand, but the specification above notes that LZMA Utils only creates files with lc/lp/pb values 3/0/2. This would correspond to a properties byte of 0x5d (9th octet). There is also a little-endian dictionary size and a file length, which may be all FF if it is unknown. For comparison, one bare .lzma file looks like this:

00000000 5d 00 00 80 00 ff ff ff ff ff ff ff ff 00 16 e9 |]...............|
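
Decoding those first thirteen bytes by hand, per the .lzma format summary linked above: the properties byte is (pb * 5 + lp) * 9 + lc, followed by a 32-bit dictionary size and a 64-bit uncompressed size. A Python sketch, with `parse_lzma_alone_header` as a hypothetical name:

```python
import struct


def parse_lzma_alone_header(header: bytes) -> dict:
    """Split the 13-byte .lzma header into its fields."""
    props, dict_size, size = struct.unpack_from("<BIQ", header, 0)
    if props >= 9 * 5 * 5:
        raise ValueError("properties byte out of range")
    return {
        "lc": props % 9,
        "lp": (props // 9) % 5,
        "pb": props // 45,
        "dict_size": dict_size,
        # All FF means the uncompressed size is unknown.
        "uncompressed_size": None if size == 0xFFFFFFFFFFFFFFFF else size,
    }


dump = bytes.fromhex("5d00008000ffffffffffffffff0016e9")
print(parse_lzma_alone_header(dump))
```

For the dump above this gives lc/lp/pb of 3/0/2, an 8 MiB dictionary and an unknown uncompressed size.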

But it is technically possible to create a valid LZMA stream with other property bytes, and presumably these would be valid SWF files as well. Perhaps it's possible to delegate to the LZMA and ZLIB magic to test this?

--
Laurence "GreenReaper" Parry
http://greenreaper.co.uk - https://inkbunny.net
