I think we should address TIKA-712 before we release 0.10. I don't think we should hold the release; instead, we should just turn off the new functionality (extracting text from master slides) for the time being, until we work out how to fix it more correctly, because right now it always extracts boilerplate text from the master slide onto each slide. I.e., put Tika back to what it did before any of the TIKA-712 commits, for the 0.10 release.
I've made some progress trying to understand what we can use in the OOXML format to avoid extracting the boilerplate while keeping what the user actually edited, but I'm not done yet, and I think what's committed is worse than the original issue...

Thoughts?

Mike McCandless

http://blog.mikemccandless.com

On Wed, Sep 21, 2011 at 11:02 PM, Mattmann, Chris A (388J) <[email protected]> wrote:
> Hey Jukka,
>
> If everyone is cool with me doing it over the weekend, I'll bust it out,
> no worries. Thanks for getting the RC all prepped up and
> thanks to everyone for the hard work.
>
> Cheers,
> Chris
>
> On Sep 21, 2011, at 11:19 AM, Jukka Zitting wrote:
>
>> Hi,
>>
>> On Wed, Sep 21, 2011 at 2:28 PM, Christian Göller <[email protected]>
>> wrote:
>>> can anyone tell me if there is a date for the next TIKA release 1.0 or 0.10?
>>
>> As discussed in the other thread, we seem to have a rough consensus to
>> make a 0.10 release pretty soon while we work on perfecting things for
>> the 1.0 release.
>>
>> I think the trunk is pretty much ready to be released already, so I'd
>> suggest we cut the release already this week, for example over the
>> weekend. Chris, do you want to take care of it? I should also have
>> some spare cycles to cut the release if needed.
>>
>> BR,
>>
>> Jukka Zitting
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: [email protected]
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
