Hi,

POI will have a WMF module (org.apache.poi.hwmf.*) in the next beta. Looking over the govdocs collection, the embedded WMFs there might contain interesting information for Tika.
Although my main goal is to integrate the rendering into Common SL, it shouldn't be too laborious to provide something for Tika afterwards. Should the output be part of the embedding document, e.g. a ppt, or does it make sense to crawl over the various extensions and extract that metadata separately? (I haven't checked how the parsers are called, so this might be nonsense ...)

Andi