https://issues.apache.org/bugzilla/show_bug.cgi?id=52949

--- Comment #3 from Barry Lagerweij <[email protected]> ---
Created attachment 30052
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=30052&action=edit
Class which can extract macro source code

Since POI does not provide access to this, I've written a class which allows
you to extract the source code as text.

The two attached classes can be used together with POI (I've tested with 3.8
and 3.9) to process the xl/vbaProject.bin (for OOXML) or the XLS file and
retrieve the sources.

The RLEDecompressingInputStream is an InputStream which can be used to
decompress the chunks as described in the MS-OVBA specification. It wraps
around a compressed input stream (usually a DocumentInputStream from POIFS)
and decompresses on the fly to conserve memory.
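
To illustrate the chunk format that such a stream has to handle, here is a
minimal sketch of the decompression steps as I read them in MS-OVBA. It is
not the attached class: the name OvbaDecompressSketch is made up for this
example, and it decompresses a whole byte array at once instead of streaming
chunk by chunk. Validation of malformed input is only rudimentary.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of POI or of the attachment.
class OvbaDecompressSketch {

    // Decompresses a complete MS-OVBA compressed container held in memory.
    static byte[] decompress(byte[] in) {
        if (in.length == 0 || in[0] != 0x01) {
            throw new IllegalArgumentException("missing signature byte 0x01");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 1;
        while (pos + 2 <= in.length) {
            // Little-endian chunk header: bits 0-11 = chunk size minus 3,
            // bit 15 = 1 if the chunk data is compressed.
            int header = (in[pos] & 0xFF) | ((in[pos + 1] & 0xFF) << 8);
            int chunkEnd = Math.min(in.length, pos + (header & 0x0FFF) + 3);
            boolean compressed = (header & 0x8000) != 0;
            pos += 2;
            if (!compressed) {
                // Raw chunk: the data is copied through unchanged.
                out.write(in, pos, chunkEnd - pos);
                pos = chunkEnd;
                continue;
            }
            byte[] chunk = new byte[4096]; // a chunk decompresses to at most 4096 bytes
            int len = 0;
            while (pos < chunkEnd) {
                // Each flag byte describes up to 8 tokens, lowest bit first.
                int flags = in[pos++] & 0xFF;
                for (int bit = 0; bit < 8 && pos < chunkEnd; bit++) {
                    if ((flags & (1 << bit)) == 0) {
                        if (len >= chunk.length) {
                            throw new IllegalArgumentException("chunk too large");
                        }
                        chunk[len++] = in[pos++]; // literal byte
                        continue;
                    }
                    if (pos + 1 >= chunkEnd) {
                        throw new IllegalArgumentException("truncated copy token");
                    }
                    int token = (in[pos] & 0xFF) | ((in[pos + 1] & 0xFF) << 8);
                    pos += 2;
                    // The offset/length bit split depends on how many bytes
                    // of this chunk have been decompressed so far.
                    int bitCount = 4;
                    while ((1 << bitCount) < len) {
                        bitCount++;
                    }
                    int length = (token & (0xFFFF >> bitCount)) + 3;
                    int offset = (token >> (16 - bitCount)) + 1;
                    if (offset > len || len + length > chunk.length) {
                        throw new IllegalArgumentException("invalid copy token");
                    }
                    // Copy byte by byte: source and target may overlap.
                    for (int i = 0; i < length; i++) {
                        chunk[len] = chunk[len - offset];
                        len++;
                    }
                }
            }
            out.write(chunk, 0, len);
        }
        return out.toByteArray();
    }
}
```

A streaming variant would keep only the current 4096-byte chunk buffer and
refill it from the wrapped stream whenever the caller has consumed it, which
is where the memory saving comes from.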

The VBAMacroExtractor processes the OLE binary stream records, records the
CodePage (in order to convert byte arrays to Strings) and stores the
ModuleOffset. This offset specifies the location in the module stream where
the source code starts. The VBAMacroExtractor has been written to
automatically detect XLSM or XLS, and uses POIFSReader to process the file
only once and conserve memory.
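
As a rough illustration of the record handling (again a sketch, not the
attached VBAMacroExtractor), the following scans an already decompressed
"dir" stream for the CodePage and ModuleOffset records. The class name
DirStreamScanner and its fields are invented for this example. The record
ids and the id/size/data layout are taken from my reading of MS-OVBA and
should be checked against the specification; the PROJECTVERSION record is
special-cased because its size field does not cover all of its data.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI or of the attachment.
class DirStreamScanner {
    // Record ids as assumed from MS-OVBA.
    static final int PROJECTCODEPAGE = 0x0003;
    static final int PROJECTVERSION = 0x0009;
    static final int MODULEOFFSET = 0x0031;

    int codePage = -1;                                   // -1 until a CodePage record is seen
    final List<Long> moduleOffsets = new ArrayList<>();  // one entry per module, in file order

    // Walks the records of a decompressed dir stream.
    static DirStreamScanner scan(byte[] dir) {
        DirStreamScanner result = new DirStreamScanner();
        ByteBuffer buf = ByteBuffer.wrap(dir).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 6) {
            int id = buf.getShort() & 0xFFFF;
            long size = buf.getInt() & 0xFFFFFFFFL;
            if (id == PROJECTVERSION) {
                size = 6; // 4-byte major + 2-byte minor follow the reserved field
            }
            if (size > buf.remaining()) {
                break; // truncated or unexpected data: stop scanning
            }
            int start = buf.position();
            if (id == PROJECTCODEPAGE && size == 2) {
                result.codePage = buf.getShort(start) & 0xFFFF;
            } else if (id == MODULEOFFSET && size == 4) {
                result.moduleOffsets.add(buf.getInt(start) & 0xFFFFFFFFL);
            }
            buf.position(start + (int) size); // skip to the next record
        }
        return result;
    }
}
```

With the code page known, the decompressed module bytes from the recorded
offset onwards can be turned into text. For a Windows code page such as 1252,
Charset.forName("windows-1252") is one way to obtain a matching charset.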

It might be worthwhile to enhance the POI workbook with classes which provide
access to the VBA modules; see Andrey Yesyev's contributions to the poi-dev
mailing list.

I hope it's useful. Feel free to use the sources under the Apache License 2.0.
