Tim,

I have extracted the pptx file containing the Prague footer. I want
to write a unit test for POI that finds the Prague string so I can
figure out why Prague was not found in the Tika regression test using
POI 3.15 beta 3 but was found using POI 3.15 beta 1.

Could you point me to the Tika code that generated the potential
regressions zip file in TIKA-2013, or to the POI class or method that
Tika uses to extract the text from a document?

Also, is the pptx file shareable and ASL 2.0 licensed so that it can
be included as part of POI's unit test suite?

On Fri, Aug 12, 2016 at 6:52 PM, Javen O'Neal <javenon...@gmail.com> wrote:
> On Aug 12, 2016 11:39, "Allison, Timothy B." <talli...@mitre.org> wrote:
>>...the two potential content regressions may be caused by something at the
>> Tika level.  If anyone has time to take a look, that'd be great.
>
> I can take a look this weekend.
>
> Did you use the same Tika code with different POI versions for these tests
> (so that we can attribute the change in behavior to a POI commit, regardless
> of whether the bug is in Tika or POI)?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org
For additional commands, e-mail: dev-h...@poi.apache.org
