RE: 1.7 release? | potential blocker?
All,

I think I may have found a problem with the interaction of OutlookPSTParser with AutoDetectParser that I'd want to fix before 1.7. If you use the AutoDetectParser instead of the OutlookPSTParser() in OutlookPSTParserTest:

    // OutlookPSTParser pstParser = new OutlookPSTParser();
    Parser pstParser = new AutoDetectParser();

I'm seeing this exception:

    org.apache.tika.exception.TikaException: Failed to close temporary resources
        at org.apache.tika.io.TemporaryResources.dispose(TemporaryResources.java:152)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:127)

Are others seeing this? I'll try to dig into this today, might not get to it until tomorrow.

Best,
Tim

-----Original Message-----
From: Tyler Palsulich [mailto:tpalsul...@gmail.com]
Sent: Monday, December 22, 2014 1:58 PM
To: dev@tika.apache.org
Subject: Re: 1.7 release?

Hi All,

Nick added the temporary fix for TIKA-1445 and made the POI updates for TIKA-1469 (thanks!). And, I'll volunteer to be the Release Manager for 1.7! :) I'll start the process this weekend or a couple days into the new year.

Cheers,
Tyler

On Dec 18, 2014 9:45 PM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote:

+1

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++

-----Original Message-----
From: Tyler Palsulich tpalsul...@gmail.com
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Thursday, December 18, 2014 at 9:15 PM
To: dev@tika.apache.org dev@tika.apache.org
Subject: Re: 1.7 release?

I'm OK with trying the fix in 1.8 (or 1.7 if people feel strongly). As Nick just recommended, I'll try adding metadata extraction to Tesseract soon, then adding the extensible solution in 1.8.

Tyler

On Thu, Dec 18, 2014 at 11:58 PM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote:

I haven’t tried my hand at it - been super busy. tyler if you have a chance go for it, I think that’s the remaining blocker.

-----Original Message-----
From: Tyler Palsulich tpalsul...@gmail.com
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Thursday, December 18, 2014 at 12:54 PM
To: dev@tika.apache.org dev@tika.apache.org
Subject: Re: 1.7 release?

Hi All,

It's been a few months, so I just want to follow up on this thread. We've resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7 (TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445? Has anyone tried their hand at the suggested (significant) fix? Are there any other issues someone would like to fit in?

Cheers,
Tyler

[0] - https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel

On Tue, Oct 28, 2014 at 1:46 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote:

Thanks Tim saw your patch and am looking now.

-----Original Message-----
From: Allison, Timothy B. talli...@mitre.org
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Monday, October 27, 2014 at 12:30 PM
To: dev@tika.apache.org dev@tika.apache.org
Subject: RE: 1.7 release
Re: 1.7 release? | potential blocker?
Works for me. I got stalled midway through the process of getting RC#1 out (authentication issues). But, going to try to finish it right now (best way to upload to dist.apache.org? http://www.apache.org/dev/release.html#upload-scp each file?). I won't send a VOTE for RC#1, though -- I'll wait for Tim's patch then send an RC#2. Sound good?

Tyler
Re: 1.7 release? | potential blocker?
On Mon, 5 Jan 2015, Tyler Palsulich wrote:
> Works for me. I got stalled midway through the process of getting RC#1 out (authentication issues). But, going to try to finish it right now (best way to upload to dist.apache.org?

That's a svn checkout.

For the RC, assuming it's the same process as for Apache POI, you check out https://dist.apache.org/repos/dist/dev/tika and put the files there.

Then, if the vote passes, you svn mv them to https://dist.apache.org/repos/dist/release/tika/ + upload things to Maven Central.

Nick
Re: 1.7 release? | potential blocker?
Thanks, Nick! You were right. OK -- Technically, RC#1 is up at https://dist.apache.org/repos/dist/dev/tika/.

> Should I also patch the rc1 branch or will you re-branch from trunk?

I'll re-branch.

Tyler
RE: 1.7 release? | potential blocker?
I'll patch trunk tonight (with null check, of course :)). Should I also patch the rc1 branch or will you re-branch from trunk?
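The thread does not show the patch itself, nor does it say what caused "Failed to close temporary resources". As a rough illustration only, the sketch below shows the general pattern such an error comes from: a holder that closes registered resources on dispose and reports any close failure as a single exception, plus a null guard of the kind Tim mentions. The class `TempResourcesSketch` and its methods are hypothetical stand-ins written against the JDK alone; this is not the code of org.apache.tika.io.TemporaryResources.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a temporary-resource holder; not Tika's implementation. */
final class TempResourcesSketch {
    private final List<Closeable> resources = new ArrayList<>();

    /** Registers a resource. The null guard keeps dispose() from failing on a missing resource. */
    void add(Closeable resource) {
        if (resource != null) {
            resources.add(resource);
        }
    }

    /** Closes everything in reverse order and reports all failures as one exception. */
    void dispose() throws IOException {
        List<IOException> failures = new ArrayList<>();
        for (int i = resources.size() - 1; i >= 0; i--) {
            try {
                resources.get(i).close();
            } catch (IOException e) {
                failures.add(e);
            }
        }
        resources.clear();
        if (!failures.isEmpty()) {
            IOException summary = new IOException("Failed to close temporary resources");
            for (IOException failure : failures) {
                summary.addSuppressed(failure);
            }
            throw summary;
        }
    }

    public static void main(String[] args) throws IOException {
        TempResourcesSketch tmp = new TempResourcesSketch();
        tmp.add(null);                                // ignored by the null guard
        tmp.add(() -> System.out.println("closed"));  // closes cleanly
        tmp.dispose();
    }
}
```

Collecting the individual failures as suppressed exceptions keeps the original causes visible when the summary exception is logged, which helps when only the top-level message shows up in a test run.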
Re: 1.7 release?
On 22 Dec 2014, at 18:57, Tyler Palsulich tpalsul...@gmail.com wrote:
> Hi All, Nick added the temporary fix for TIKA-1445 and made the POI updates for TIKA-1469 (thanks!). And, I'll volunteer to be the Release Manager for 1.7! :) I'll start the process this weekend or a couple days into the new year.

Nice one Tyler!

Cheers,
Dave
Re: 1.7 release?
Hi All,

Nick added the temporary fix for TIKA-1445 and made the POI updates for TIKA-1469 (thanks!). And, I'll volunteer to be the Release Manager for 1.7! :) I'll start the process this weekend or a couple days into the new year.

Cheers,
Tyler
Re: 1.7 release?
WOOO HOO! Go Tyler go! :0) Merry Christmas bud.

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
Re: 1.7 release?
+1 for going. Many thanks to Tyler, and to Nick for taking on the POI upgrade. So many Christmas gifts in advance or just after :-)

Merry Christmas to all
Re: 1.7 release?
Hi All, It's been a few months, so I just want to follow up on this thread. We've resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7 (TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445? Has anyone tried their hand at the suggested (significant) fix? Are there any other issues someone would like to fit in? Cheers, Tyler [0] - https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel On Tue, Oct 28, 2014 at 1:46 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: Thanks Tim saw your patch and am looking now. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Allison, Timothy B. talli...@mitre.org Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Monday, October 27, 2014 at 12:30 PM To: dev@tika.apache.org dev@tika.apache.org Subject: RE: 1.7 release? Sounds good. As long as the default behavior remains the same, I'm happy. I'm going to play with a combination of your patch and Tyler's and see what the ramifications are for embedded docs. To confirm, the OCR integration is fantastic. Thank you and Tyler! Best, Tim -Original Message- From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov] Sent: Friday, October 24, 2014 5:36 PM To: dev@tika.apache.org Subject: Re: 1.7 release? Hey Tim, What do you think about my existing patch for 1445? For example to just call all the parsers? I thought I was seeing behavior that was slow because of that, but it turned out to be Tesseract and my machine at the time? 
I think my patch for 1445 may be enough, and we should get the metadata I think? Thoughts? I honestly think we need to deliver Tesseract in 1.7. We're close. I'll even take it upon myself to try and experiment with the idea of multiple parsers being called. I think a simple solution to the metadata key conflict issue is simply to have a policy to add values (by default) and replace if a property is set in ParseContext. Some simple updates to CompositeParser would allow this. Thoughts? Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Allison, Timothy B. talli...@mitre.org Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Friday, October 24, 2014 at 2:24 PM To: dev@tika.apache.org dev@tika.apache.org Subject: RE: 1.7 release? Sorry for coming late to the game on the implications of TIKA-1445. I don't want to hold up the release of 1.7. However, would it be possible to return to the legacy default behavior of extracting metadata from images? We can then document on the OCR parser page on the wiki that you need to install Tesseract _and_ make a change in the parser/mime config file. If you want this new capability, it will take a small bit of work until we solve TIKA-1445. I worry that the current behavior of 1.7 would be surprising to most non-dev users (well, even to at least one dev :) ). Cheers, Tim From: Oleg Tikhonov [olegtikho...@gmail.com] Sent: Friday, October 24, 2014 2:24 PM To: dev@tika.apache.org Subject: Re: 1.7 release? Hi Tyler, don't mention. Cheers, Oleg On Oct 24, 2014 8:02 PM, Tyler Palsulich tpalsul...@gmail.com wrote: Thank you for the help, Oleg! I just resolved TIKA-1422. 
So, are there any other issues anyone would like to resolve before a new release? Thanks, Tyler On Tue, Oct 21, 2014 at 2:42 AM, Oleg Tikhonov olegtikho...@gmail.com wrote: Sorry!!! On Tue, Oct 21, 2014 at 9:37 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: Thanks Oleg, will try tomorrow for me (Los Angeles time)
Re: 1.7 release?
Hi,

it might be worth waiting until POI 3.11-FINAL is released so that the TIKA release does not depend on a beta version. It's due on Sunday, corrects a lot of old office parsing and just needs the patch in TIKA-1469 to work properly.

Regards
Thomas

2014-12-18 21:54 GMT+01:00 Tyler Palsulich tpalsul...@gmail.com:

Hi All,

It's been a few months, so I just want to follow up on this thread. We've resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7 (TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445? Has anyone tried their hand at the suggested (significant) fix? Are there any other issues someone would like to fit in?

Cheers,
Tyler

[0] - https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel
Re: 1.7 release?
I haven’t tried my hand at it - been super busy. Tyler, if you have a chance, go for it; I think that’s the remaining blocker.

Chris Mattmann, Ph.D.

-Original Message-
From: Tyler Palsulich tpalsul...@gmail.com
Reply-To: dev@tika.apache.org
Date: Thursday, December 18, 2014 at 12:54 PM
To: dev@tika.apache.org
Subject: Re: 1.7 release?

Hi All,

It's been a few months, so I just want to follow up on this thread. We've resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7 (TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445? Has anyone tried their hand at the suggested (significant) fix? Are there any other issues someone would like to fit in?

Cheers,
Tyler

[0] - https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel
Re: 1.7 release?
I'm OK with trying the fix in 1.8 (or 1.7 if people feel strongly). As Nick just recommended, I'll try adding metadata extraction to Tesseract soon, then adding the extensible solution in 1.8.

Tyler

On Thu, Dec 18, 2014 at 11:58 PM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote:

I haven’t tried my hand at it - been super busy. Tyler, if you have a chance, go for it; I think that’s the remaining blocker.
Re: 1.7 release?
Thanks Tim saw your patch and am looking now.

Chris Mattmann, Ph.D.

-Original Message-
From: Allison, Timothy B. talli...@mitre.org
Reply-To: dev@tika.apache.org
Date: Monday, October 27, 2014 at 12:30 PM
To: dev@tika.apache.org
Subject: RE: 1.7 release?

Sounds good. As long as the default behavior remains the same, I'm happy. I'm going to play with a combination of your patch and Tyler's and see what the ramifications are for embedded docs. To confirm, the OCR integration is fantastic. Thank you and Tyler!

Best,
Tim

-Original Message-
From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
Sent: Friday, October 24, 2014 5:36 PM
To: dev@tika.apache.org
Subject: Re: 1.7 release?

Hey Tim,

What do you think about my existing patch for 1445? For example to just call all the parsers? I thought I was seeing behavior that was slow because of that, but it turned out to be Tesseract and my machine at the time? I think my patch for 1445 may be enough, and we should get the metadata I think? Thoughts?

I honestly think we need to deliver Tesseract in 1.7. We're close. I'll even take it upon myself to try and experiment with the idea of multiple parsers being called. I think a simple solution to the metadata key conflict issue is simply to have a policy to add values (by default) and replace if a property is set in ParseContext. Some simple updates to CompositeParser would allow this. Thoughts?

Cheers,
Chris
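For anyone following the "add by default, replace if a property is set" idea quoted above, here is a minimal sketch of what such a policy could look like. It is not the patch attached to TIKA-1445 and it does not use Tika classes: metadata is modelled as a plain map of key to list of values, and the names MetadataMergeSketch, MergePolicy and merge are hypothetical, invented only for this illustration. In the proposal the policy would presumably be looked up from the ParseContext; here it is just passed in as an argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MetadataMergeSketch {

    /** Hypothetical flag; ADD is the default, REPLACE is the opt-in override. */
    public enum MergePolicy { ADD, REPLACE }

    /** Merges the values produced by one parser into the accumulated metadata. */
    public static void merge(Map<String, List<String>> target,
                             Map<String, List<String>> fromParser,
                             MergePolicy policy) {
        for (Map.Entry<String, List<String>> e : fromParser.entrySet()) {
            if (policy == MergePolicy.REPLACE) {
                // Later parser wins: discard whatever was stored under this key.
                target.put(e.getKey(), new ArrayList<>(e.getValue()));
            } else {
                // Default: keep earlier values and append the new ones.
                target.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                      .addAll(e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        // Illustrative values only; the key and values are made up.
        Map<String, List<String>> imageParser = new LinkedHashMap<>();
        imageParser.put("comment", Arrays.asList("from-image-parser"));
        Map<String, List<String>> ocrParser = new LinkedHashMap<>();
        ocrParser.put("comment", Arrays.asList("from-ocr-parser"));

        Map<String, List<String>> metadata = new LinkedHashMap<>();
        merge(metadata, imageParser, MergePolicy.ADD);
        merge(metadata, ocrParser, MergePolicy.ADD);
        System.out.println(metadata); // {comment=[from-image-parser, from-ocr-parser]}

        merge(metadata, ocrParser, MergePolicy.REPLACE);
        System.out.println(metadata); // {comment=[from-ocr-parser]}
    }
}
```

With ADD as the default, calling several parsers on the same document never loses a value silently; REPLACE is the explicit opt-in for callers who want the last parser to win.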
Re: 1.7 release?
Hi Tyler, don't mention.

Cheers,
Oleg

On Oct 24, 2014 8:02 PM, Tyler Palsulich tpalsul...@gmail.com wrote:

Thank you for the help, Oleg! I just resolved TIKA-1422. So, are there any other issues anyone would like to resolve before a new release?

Thanks,
Tyler
RE: 1.7 release?
Sorry for coming late to the game on the implications of TIKA-1445. I don't want to hold up the release of 1.7. However, would it be possible to return to the legacy default behavior of extracting metadata from images? We can then document on the OCR parser page on the wiki that you need to install Tesseract _and_ make a change in the parser/mime config file. If you want this new capability, it will take a small bit of work until we solve TIKA-1445. I worry that the current behavior of 1.7 would be surprising to most non-dev users (well, even to at least one dev :) ).

Cheers,
Tim

From: Oleg Tikhonov [olegtikho...@gmail.com]
Sent: Friday, October 24, 2014 2:24 PM
To: dev@tika.apache.org
Subject: Re: 1.7 release?

Hi Tyler, don't mention.

Cheers,
Oleg
Re: 1.7 release?
Taken. Thanks. in progress ...

On Tue, Oct 21, 2014 at 8:54 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote:

Trunk is the current checkout/branch: http://svn.apache.org/repos/asf/tika/trunk

-Original Message-
From: Oleg Tikhonov olegtikho...@gmail.com
Reply-To: dev@tika.apache.org
Date: Monday, October 20, 2014 at 10:16 PM
To: dev@tika.apache.org
Subject: Re: 1.7 release?

Hi, I can try this on. What is a trunk?

Thanks,
Oleg
Re: 1.7 release?
Please try the newest patch.

Cheers,
Oleg

On Tue, Oct 21, 2014 at 9:08 AM, Oleg Tikhonov olegtikho...@gmail.com wrote:

Taken. Thanks. in progress ...
Re: 1.7 release?
Thanks Oleg, will try tomorrow for me (Los Angeles time)!

Chris Mattmann, Ph.D.

-Original Message-
From: Oleg Tikhonov o...@apache.org
Reply-To: dev@tika.apache.org
Date: Monday, October 20, 2014 at 11:20 PM
To: dev@tika.apache.org
Subject: Re: 1.7 release?

Please try the newest patch.

Cheers,
Oleg
Re: 1.7 release?
Sorry!!!

On Tue, Oct 21, 2014 at 9:37 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote:

Thanks Oleg, will try tomorrow for me (Los Angeles time)!
Re: 1.7 release?
Hmm any idea why this is failing on Windows? Tyler P. and I were talking the other day - maybe we shouldn't run the tests from TIKA-1422 unless Tesseract is installed? Thoughts? ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Hong-Thai Nguyen thaicha...@gmail.com Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Thursday, October 16, 2014 at 2:03 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: 1.7 release? Hi Andrzej, We are impatient for 1.7 release too. I'm having compiling problem of TIKA-1422 on me. If anyone can build successfully on Windows, I have no objection to release 1.7 Thanks, On Thu, Oct 16, 2014 at 10:51 AM, Andrzej Białecki a...@getopt.org wrote: Hi, Any news on the 1.7 release? or at least a 1.6.1 release that includes the fix for broken ODF parsing... --- Best regards, Andrzej Bialecki -- -- Hong-Thai
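The suggestion above, skipping the TIKA-1422 tests when Tesseract is not installed, could be sketched roughly as follows. This is only an illustration and not the actual Tika test code: the class name `TesseractCheck` and the methods `isOnPath` and `isTesseractInstalled` are made up for this example, and it assumes the binary is called `tesseract` (or `tesseract.exe` on Windows) and is found through the `PATH` environment variable.

```java
import java.io.File;

// Hypothetical helper, not part of Tika: detects whether a tesseract
// executable is reachable through the PATH environment variable.
class TesseractCheck {

    // Returns true if an executable file with the given name exists in one
    // of the directories listed in pathValue (a PATH-style string).
    static boolean isOnPath(String executable, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return false;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            // On Windows the binary normally carries an .exe suffix,
            // so both spellings are checked.
            File plain = new File(dir, executable);
            File withExe = new File(dir, executable + ".exe");
            if ((plain.isFile() && plain.canExecute())
                    || (withExe.isFile() && withExe.canExecute())) {
                return true;
            }
        }
        return false;
    }

    // Assumption: the binary is named "tesseract" and is found via PATH.
    static boolean isTesseractInstalled() {
        return isOnPath("tesseract", System.getenv("PATH"));
    }

    public static void main(String[] args) {
        if (!isTesseractInstalled()) {
            System.out.println("tesseract not found on PATH; OCR tests would be skipped");
            return;
        }
        System.out.println("tesseract found on PATH; OCR tests would run");
    }
}
```

In a JUnit 4 test the same check could guard the test body with `org.junit.Assume.assumeTrue(TesseractCheck.isTesseractInstalled())`, so that the test is reported as skipped rather than failed on machines without Tesseract.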
Re: 1.7 release?
Hi, I can try this out. What is trunk? Thanks, Oleg On Tue, Oct 21, 2014 at 6:21 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: Hmm any idea why this is failing on Windows? Tyler P. and I were talking the other day - maybe we shouldn't run the tests from TIKA-1422 unless Tesseract is installed? Thoughts? ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Hong-Thai Nguyen thaicha...@gmail.com Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Thursday, October 16, 2014 at 2:03 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: 1.7 release? Hi Andrzej, We are impatient for 1.7 release too. I'm having compiling problem of TIKA-1422 on me. If anyone can build successfully on Windows, I have no objection to release 1.7 Thanks, On Thu, Oct 16, 2014 at 10:51 AM, Andrzej Białecki a...@getopt.org wrote: Hi, Any news on the 1.7 release? or at least a 1.6.1 release that includes the fix for broken ODF parsing... --- Best regards, Andrzej Bialecki -- -- Hong-Thai
Re: 1.7 release?
Trunk is the current checkout/branch: http://svn.apache.org/repos/asf/tika/trunk ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Oleg Tikhonov olegtikho...@gmail.com Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Monday, October 20, 2014 at 10:16 PM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: 1.7 release? Hi, I can try this on. What is a trunk? Thanks, Oleg On Tue, Oct 21, 2014 at 6:21 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: Hmm any idea why this is failing on Windows? Tyler P. and I were talking the other day - maybe we shouldn't run the tests from TIKA-1422 unless Tesseract is installed? Thoughts? ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Hong-Thai Nguyen thaicha...@gmail.com Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Thursday, October 16, 2014 at 2:03 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: 1.7 release? Hi Andrzej, We are impatient for 1.7 release too. I'm having compiling problem of TIKA-1422 on me. If anyone can build successfully on Windows, I have no objection to release 1.7 Thanks, On Thu, Oct 16, 2014 at 10:51 AM, Andrzej Białecki a...@getopt.org wrote: Hi, Any news on the 1.7 release? or at least a 1.6.1 release that includes the fix for broken ODF parsing... 
--- Best regards, Andrzej Bialecki -- -- Hong-Thai
Re: 1.7 release?
Hi Andrzej, We are impatient for the 1.7 release too. I'm having a problem compiling TIKA-1422 on my side. If anyone can build successfully on Windows, I have no objection to releasing 1.7. Thanks, On Thu, Oct 16, 2014 at 10:51 AM, Andrzej Białecki a...@getopt.org wrote: Hi, Any news on the 1.7 release? or at least a 1.6.1 release that includes the fix for broken ODF parsing... --- Best regards, Andrzej Bialecki -- -- Hong-Thai