Hi, I've been looking for a way to extract the source code of VBA modules and macros using POI. Since POI does not provide access to this, I've written classes that extract the source code as text.
The two attached classes can be used together with POI (I've tested with 3.8 and 3.9) to process the vbaProject.bin (for OOXML) or an XLS file and retrieve the sources.

The RLEDecompressingInputStream is an InputStream which decompresses the chunks as described in the MS-OVBA specification. It wraps a compressed input stream (usually a DocumentInputStream from POIFS) and decompresses on the fly to conserve memory.

The VBAMacroExtractor processes the OLE binary stream records, records the CodePage (needed to convert byte arrays to Strings) and stores the ModuleOffset. This offset specifies the location in the module stream where the source code starts. The VBAMacroExtractor automatically detects XLSM or XLS, and uses POIFSReader so that the file is processed only once, again to conserve memory.

It might be worthwhile to enhance the POI workbook with classes which provide access to the VBA modules; see Andrey Yesyev's contributions to the Nabble mailing list.

I hope it's useful. Feel free to use the sources under the Apache 2 license.

With kind regards,
Barry
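
For illustration, here is a minimal sketch of the chunk decompression from the MS-OVBA specification that the message above refers to. It is not the attached RLEDecompressingInputStream: the class name OvbaDecompressSketch is hypothetical, it works on a whole byte array instead of streaming, and it assumes well-formed input (no validation of chunk signatures or token bounds).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of POI or of the attached sources.
final class OvbaDecompressSketch {

    // Decompresses a complete MS-OVBA compressed container held in memory.
    static byte[] decompress(byte[] in) {
        if (in.length == 0 || in[0] != 0x01) {
            throw new IllegalArgumentException("missing 0x01 container signature");
        }
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        int pos = 1;
        while (pos + 2 <= in.length) {
            // Little-endian chunk header: bits 0-11 size, bit 15 "compressed" flag.
            int header = (in[pos] & 0xFF) | ((in[pos + 1] & 0xFF) << 8);
            pos += 2;
            // Size field holds (total chunk size - 3); the payload excludes the 2 header bytes.
            int dataLen = (header & 0x0FFF) + 1;
            int end = Math.min(pos + dataLen, in.length);
            if ((header & 0x8000) == 0) {
                // Raw chunk: copy the payload unchanged.
                result.write(in, pos, end - pos);
                pos = end;
                continue;
            }
            byte[] chunk = new byte[4096]; // a chunk decompresses to at most 4096 bytes
            int len = 0;
            while (pos < end) {
                int flags = in[pos++] & 0xFF; // one flag bit per token, lowest bit first
                for (int bit = 0; bit < 8 && pos < end; bit++) {
                    if ((flags & (1 << bit)) == 0) {
                        chunk[len++] = in[pos++]; // literal byte
                    } else {
                        int token = (in[pos] & 0xFF) | ((in[pos + 1] & 0xFF) << 8);
                        pos += 2;
                        // The offset/length split depends on how much of the chunk is decoded.
                        int bitCount = 4;
                        while ((1 << bitCount) < len) {
                            bitCount++;
                        }
                        int length = (token & (0xFFFF >> bitCount)) + 3;
                        int offset = (token >> (16 - bitCount)) + 1;
                        // Byte-by-byte copy, because source and target may overlap.
                        for (int i = 0; i < length; i++) {
                            chunk[len] = chunk[len - offset];
                            len++;
                        }
                    }
                }
            }
            result.write(chunk, 0, len);
        }
        return result.toByteArray();
    }

    public static void main(String[] args) {
        // Hand-built container: literals 'a','b','c', then one copy token (offset 3, length 6).
        byte[] sample = {0x01, 0x05, (byte) 0xB0, 0x08, 0x61, 0x62, 0x63, 0x03, 0x20};
        System.out.println(new String(decompress(sample), StandardCharsets.US_ASCII)); // prints "abcabcabc"
    }
}
```

A streaming version, as in the attached class, would keep only the current 4096-byte chunk buffer and refill it whenever read() runs past its end. The resulting bytes still have to be converted to a String using the CodePage recorded from the project records.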
