This looks like an area for a new feature in both Tika and POI. I've only
looked very briefly into the POI libraries, and I may have missed how to
extract text from autoshapes. I'll open an issue in both projects.
-----Original Message-----
From: Hiroshi Tatsumi
This is one way to access the underlying CTShape that contains the text:
XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(f));
XSSFSheet sheet = wb.getSheetAt(0);
XSSFDrawing drawing = sheet.createDrawingPatriarch();
for (XSSFShape shape :
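
Until the libraries support this directly, a stopgap that avoids the POI API entirely may be worth considering. The sketch below is not the CTShape approach shown above: it treats the .xlsx file as a plain zip archive and collects the DrawingML text runs (the a:t elements) from the drawing parts. The class and method names are hypothetical, and it assumes the drawing parts sit under xl/drawings/ with a .xml extension, which is where Excel and POI usually write them but which the package format does not guarantee. It needs Java 9 or later for readAllBytes().

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Tika or POI.
class DrawingTextSketch {
    // Namespace of the DrawingML "a:" prefix used for text runs.
    static final String DRAWINGML_NS =
            "http://schemas.openxmlformats.org/drawingml/2006/main";

    /** Returns the content of every a:t element in a single drawing part. */
    static List<String> textRuns(InputStream drawingXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the namespace lookup below
        Document doc = factory.newDocumentBuilder().parse(drawingXml);
        NodeList runs = doc.getElementsByTagNameNS(DRAWINGML_NS, "t");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < runs.getLength(); i++) {
            out.add(runs.item(i).getTextContent());
        }
        return out;
    }

    /** Scans an .xlsx stream and gathers text runs from all drawing parts. */
    static List<String> fromXlsx(InputStream xlsx) throws Exception {
        List<String> out = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Assumed location of drawing parts; .rels and .vml files are skipped
                // because they do not end in ".xml".
                if (name.startsWith("xl/drawings/") && name.endsWith(".xml")) {
                    // Buffer the entry first so the XML parser cannot close the zip stream.
                    byte[] bytes = zip.readAllBytes();
                    out.addAll(textRuns(new ByteArrayInputStream(bytes)));
                }
            }
        }
        return out;
    }
}
```

Calling DrawingTextSketch.fromXlsx(new FileInputStream(f)) would return the runs in file order. Note that this also picks up text from any other DrawingML element in the drawing part, not only autoshapes, and it does not tell you which sheet a shape belongs to.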
Thank you for your reply. I really appreciate it.
This is a high priority for me because we use Solr, and our customer wants
to search the text of autoshapes in Excel 2007+ files.
I've been investigating the Tika source code and trying to fix this.
I understand that I can extract text from