On Fri, 1 Aug 2014, Allison, Timothy B. wrote:
> I found one regression in the handling of an xlsx file:
> http://digitalcorpora.org/corp/nps/files/govdocs1/598/598948.xlsx
>
> Tika 1.6 w/ POI 3.11 Beta 1 is not extracting the comments in this file, whereas Tika 1.5 (and Tika 1.6 w/ POI 3.10-Final) did extract the comments. This suggests that the issue is with POI, but I haven't had a chance to dig in, and unfortunately, I don't think I will have a chance until Monday.

Tika used to look up cell comments manually in the xlsx extractor, but that logic has now been moved into the POI xlsx event handler. My hunch is that something isn't quite right there, so that's probably the place to look and to write a unit test for!

Nick

