RE: 1.7 release? | potential blocker?

2015-01-05 Thread Allison, Timothy B.
All,

I think I may have found a problem with the interaction of OutlookPSTParser 
with AutoDetectParser that I'd want to fix before 1.7.

If you use the AutoDetectParser instead of the OutlookPSTParser() in 
OutlookPSTParserTest:

//   OutlookPSTParser pstParser = new OutlookPSTParser();
Parser pstParser = new AutoDetectParser();

I'm seeing this exception:

org.apache.tika.exception.TikaException: Failed to close temporary resources
    at org.apache.tika.io.TemporaryResources.dispose(TemporaryResources.java:152)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:127)
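
To make the shape of this failure concrete, here is a minimal, stdlib-only sketch of a holder that closes registered resources on dispose and wraps any failure. It is an illustration under assumptions, not Tika's code: the class name TempResourcesSketch, its methods, and the use of IllegalStateException as a stand-in for TikaException are all hypothetical, and the thread does not state the actual root cause.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a temporary-resource holder; NOT Tika's TemporaryResources.
class TempResourcesSketch implements Closeable {
    // The most recently registered resource is closed first.
    private final Deque<Closeable> resources = new ArrayDeque<>();

    void add(Closeable resource) {
        resources.push(resource);
    }

    // Closes every registered resource and remembers the first failure,
    // so one bad resource does not prevent the others from being closed.
    @Override
    public void close() throws IOException {
        IOException first = null;
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    // Wraps a close failure in a higher-level exception. IllegalStateException
    // is an assumption used here in place of TikaException.
    void dispose() {
        try {
            close();
        } catch (IOException e) {
            throw new IllegalStateException("Failed to close temporary resources", e);
        }
    }
}
```

In this model, a single resource that cannot be closed (for example, because something else still holds it) is enough to turn an otherwise successful parse into an exception at dispose time.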

Are others seeing this?

I'll try to dig into this today, but I might not get to it until tomorrow.

Best,

Tim



-Original Message-
From: Tyler Palsulich [mailto:tpalsul...@gmail.com] 
Sent: Monday, December 22, 2014 1:58 PM
To: dev@tika.apache.org
Subject: Re: 1.7 release?

Hi All,

Nick added the temporary fix for TIKA-1445 and made the POI updates for
TIKA-1469 (thanks!). And, I'll volunteer to be the Release Manager for 1.7!
:)

I'll start the process this weekend or a couple days into the new year.

Cheers,
Tyler

Re: 1.7 release? | potential blocker?

2015-01-05 Thread Tyler Palsulich
Works for me. I got stalled midway through the process of getting RC#1 out
(authentication issues), but I'm going to try to finish it right now. What is
the best way to upload to dist.apache.org? Should I scp each file, as described
at http://www.apache.org/dev/release.html#upload-scp? I won't send a VOTE for
RC#1, though -- I'll wait for Tim's patch and then send an RC#2.

Sound good?

Tyler

On Mon, Jan 5, 2015 at 8:09 AM, Allison, Timothy B. talli...@mitre.org
wrote:

 All,

 I think I may have found a problem with the interaction of
 OutlookPSTParser with AutoDetectParser that I'd want to fix before 1.7.

 If you use the AutoDetectParser instead of the OutlookPSTParser() in
 OutlookPSTParserTest:

 //   OutlookPSTParser pstParser = new OutlookPSTParser();
 Parser pstParser = new AutoDetectParser();

 I'm seeing this exception:

 org.apache.tika.exception.TikaException: Failed to close temporary resources
     at org.apache.tika.io.TemporaryResources.dispose(TemporaryResources.java:152)
     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:127)

 Are others seeing this?

 I'll try to dig into this today, might not get to it until tomorrow.

 Best,

 Tim




Re: 1.7 release? | potential blocker?

2015-01-05 Thread Nick Burch

On Mon, 5 Jan 2015, Tyler Palsulich wrote:

Works for me. I got stalled midway through the process of getting RC#1 out
(authentication issues). But, going to try to finish it right now (best way
to upload to dist.apache.org?


That's an svn checkout.

For the RC, assuming it's the same process as for Apache POI, you check out 
https://dist.apache.org/repos/dist/dev/tika and put the files there.


Then, if the vote passes, you svn mv them to 
https://dist.apache.org/repos/dist/release/tika/ and upload things to Maven 
Central.


Nick


Re: 1.7 release? | potential blocker?

2015-01-05 Thread Tyler Palsulich
Thanks, Nick! You were right. OK -- Technically, RC#1 is up at
https://dist.apache.org/repos/dist/dev/tika/.

 Should I also patch the rc1 branch or will you re-branch from trunk?
I'll re-branch.

Tyler

On Mon, Jan 5, 2015 at 12:03 PM, Allison, Timothy B. talli...@mitre.org
wrote:

 I'll patch trunk tonight (with null check, of course :)).  Should I also
 patch the rc1 branch or will you re-branch from trunk?


RE: 1.7 release? | potential blocker?

2015-01-05 Thread Allison, Timothy B.
I'll patch trunk tonight (with null check, of course :)).  Should I also patch 
the rc1 branch or will you re-branch from trunk?
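
As an illustration of what a null check in cleanup code can buy, here is a small stdlib-only sketch. It is not the patch itself: NullSafeCloseSketch, closeIfOpened, and parseLike are hypothetical names, and the assumption is simply that a handle may never have been opened by the time the finally block runs.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper; none of these names come from the Tika code base.
class NullSafeCloseSketch {
    // Closes the resource only if it was actually opened.
    // Returns true when close() was called, false when there was nothing to close.
    static boolean closeIfOpened(Closeable resource) throws IOException {
        if (resource == null) {
            return false;
        }
        resource.close();
        return true;
    }

    // Simulates a parse step whose open may fail before the handle is assigned.
    static String parseLike(boolean openFails) throws IOException {
        Closeable handle = null;
        try {
            if (openFails) {
                throw new IOException("could not open");
            }
            handle = () -> { };
            return "parsed";
        } finally {
            // Without the guard, handle.close() would throw a NullPointerException
            // here and mask the original IOException.
            closeIfOpened(handle);
        }
    }
}
```

With the guard in place, a failed open surfaces as the original IOException rather than as a NullPointerException thrown from the cleanup path.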

-Original Message-
From: Tyler Palsulich [mailto:tpalsul...@gmail.com] 
Sent: Monday, January 05, 2015 11:38 AM
To: dev@tika.apache.org
Subject: Re: 1.7 release? | potential blocker?

Works for me. I got stalled midway through the process of getting RC#1 out
(authentication issues). But, going to try to finish it right now (best way
to upload to dist.apache.org?
http://www.apache.org/dev/release.html#upload-scp each file?). I won't send
a VOTE for RC#1, though -- I'll wait for Tim's patch then send an RC#2.

Sound good?

Tyler


Re: 1.7 release?

2015-01-02 Thread David Meikle

 On 22 Dec 2014, at 18:57, Tyler Palsulich tpalsul...@gmail.com wrote:
 
 Hi All,
 
 Nick added the temporary fix for TIKA-1445 and made the POI updates for
 TIKA-1469 (thanks!). And, I'll volunteer to be the Release Manager for 1.7!
 :)
 
 I'll start the process this weekend or a couple days into the new year.

Nice one Tyler!

Cheers,
Dave

Re: 1.7 release?

2014-12-22 Thread Tyler Palsulich
Hi All,

Nick added the temporary fix for TIKA-1445 and made the POI updates for
TIKA-1469 (thanks!). And, I'll volunteer to be the Release Manager for 1.7!
:)

I'll start the process this weekend or a couple days into the new year.

Cheers,
Tyler
On Dec 18, 2014 9:45 PM, Mattmann, Chris A (3980) 
chris.a.mattm...@jpl.nasa.gov wrote:

 +1

 ++
 Chris Mattmann, Ph.D.
 Chief Architect
 Instrument Software and Science Data Systems Section (398)
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 168-519, Mailstop: 168-527
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Associate Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++






 -Original Message-
 From: Tyler Palsulich tpalsul...@gmail.com
 Reply-To: dev@tika.apache.org dev@tika.apache.org
 Date: Thursday, December 18, 2014 at 9:15 PM
 To: dev@tika.apache.org dev@tika.apache.org
 Subject: Re: 1.7 release?

 I'm OK with trying the fix in 1.8 (or 1.7 if people feel strongly). As Nick
 just recommended, I'll try adding metadata extraction to Tesseract soon,
 then adding the extensible solution in 1.8.
 
 Tyler
 
 On Thu, Dec 18, 2014 at 11:58 PM, Mattmann, Chris A (3980) 
 chris.a.mattm...@jpl.nasa.gov wrote:
 
  I haven’t tried my hand at it - been super busy. tyler if you have a
  chance go for it, I think that’s the remaining blocker.
 
  ++
  Chris Mattmann, Ph.D.
  Chief Architect
  Instrument Software and Science Data Systems Section (398)
  NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
  Office: 168-519, Mailstop: 168-527
  Email: chris.a.mattm...@nasa.gov
  WWW:  http://sunset.usc.edu/~mattmann/
  ++
  Adjunct Associate Professor, Computer Science Department
  University of Southern California, Los Angeles, CA 90089 USA
  ++
 
 
 
 
 
 
  -Original Message-
  From: Tyler Palsulich tpalsul...@gmail.com
  Reply-To: dev@tika.apache.org dev@tika.apache.org
  Date: Thursday, December 18, 2014 at 12:54 PM
  To: dev@tika.apache.org dev@tika.apache.org
  Subject: Re: 1.7 release?
 
  Hi All,
  
  It's been a few months, so I just want to follow up on this thread. We've
  resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7
  (TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445?
  Has anyone tried their hand at the suggested (significant) fix?
  
  Are there any other issues someone would like to fit in?
  
  Cheers,
  Tyler
  
  [0] -
  https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel
  
  On Tue, Oct 28, 2014 at 1:46 AM, Mattmann, Chris A (3980) 
  chris.a.mattm...@jpl.nasa.gov wrote:
  
   Thanks Tim, saw your patch and am looking now.
  
   ++
   Chris Mattmann, Ph.D.
   Chief Architect
   Instrument Software and Science Data Systems Section (398)
   NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
   Office: 168-519, Mailstop: 168-527
   Email: chris.a.mattm...@nasa.gov
   WWW:  http://sunset.usc.edu/~mattmann/
   ++
   Adjunct Associate Professor, Computer Science Department
   University of Southern California, Los Angeles, CA 90089 USA
   ++
  
  
  
  
  
  
   -Original Message-
   From: Allison, Timothy B. talli...@mitre.org
   Reply-To: dev@tika.apache.org dev@tika.apache.org
   Date: Monday, October 27, 2014 at 12:30 PM
   To: dev@tika.apache.org dev@tika.apache.org
   Subject: RE: 1.7 release?
  
   Sounds good.  As long as the default behavior remains the same, I'm
   happy.  I'm going to play with a combination of your patch and Tyler's
   and see what the ramifications are for embedded docs.
   
   To confirm, the OCR integration is fantastic.  Thank you and Tyler!
   
   
   Best,
   
  Tim
   
   -Original Message-
   From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
   Sent: Friday, October 24, 2014 5:36 PM
   To: dev@tika.apache.org
   Subject: Re: 1.7 release?
   
   Hey Tim,
   
   What do you think about my existing patch for 1445? For example to
   just call all the parsers? I thought I was seeing behavior that was
   slow because of that, but it turned out to be Tesseract and my machine
   at the time?
   
   I think my patch for 1445 may be enough, and we should get the metadata
   I think? Thoughts?
   
   I honestly think we need

Re: 1.7 release?

2014-12-22 Thread Mattmann, Chris A (3980)
WOOO HOO! Go Tyler go! :0) Merry Christmas bud.

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: Tyler Palsulich tpalsul...@gmail.com
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Monday, December 22, 2014 at 10:57 AM
To: dev@tika.apache.org dev@tika.apache.org
Subject: Re: 1.7 release?

Hi All,

Nick added the temporary fix for TIKA-1445 and made the POI updates for
TIKA-1469 (thanks!). And, I'll volunteer to be the Release Manager for
1.7!
:)

I'll start the process this weekend or a couple days into the new year.

Cheers,
Tyler

Re: 1.7 release?

2014-12-22 Thread Thomas Ledoux
+1 for going.
Many thanks to Tyler, and to Nick for taking on the POI upgrade.

So many Christmas gifts, in advance or just after :-)

Merry Christmas to all

2014-12-22 19:59 GMT+01:00 Mattmann, Chris A (3980) 
chris.a.mattm...@jpl.nasa.gov:

 WOOO HOO! Go Tyler go! :0) Merry Christmas bud.

 ++
 Chris Mattmann, Ph.D.
 Chief Architect
 Instrument Software and Science Data Systems Section (398)
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 168-519, Mailstop: 168-527
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Associate Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++







Re: 1.7 release?

2014-12-18 Thread Tyler Palsulich
Hi All,

It's been a few months, so I just want to follow up on this thread. We've
resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7
(TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445?
Has anyone tried their hand at the suggested (significant) fix?

Are there any other issues someone would like to fit in?

Cheers,
Tyler

[0] -
https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel

On Tue, Oct 28, 2014 at 1:46 AM, Mattmann, Chris A (3980) 
chris.a.mattm...@jpl.nasa.gov wrote:

 Thanks, Tim. I saw your patch and am looking now.

 ++
 Chris Mattmann, Ph.D.
 Chief Architect
 Instrument Software and Science Data Systems Section (398)
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 168-519, Mailstop: 168-527
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Associate Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++







Re: 1.7 release?

2014-12-18 Thread Thomas Ledoux
Hi, it might be worth waiting until POI 3.11-FINAL is released so that the
Tika release does not depend on a beta version. It's due on Sunday, corrects
a lot of old Office parsing issues, and just needs the patch in TIKA-1469 to
work properly.

Regards
  Thomas

2014-12-18 21:54 GMT+01:00 Tyler Palsulich tpalsul...@gmail.com:

 Hi All,

 It's been a few months, so I just want to follow up on this thread. We've
 resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7
 (TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445?
 Has anyone tried their hand at the suggested (significant) fix?

 Are there any other issues someone would like to fit in?

 Cheers,
 Tyler

 [0] -

 https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel


Re: 1.7 release?

2014-12-18 Thread Mattmann, Chris A (3980)
I haven’t tried my hand at it; I’ve been super busy. Tyler, if you have a
chance, go for it. I think that’s the remaining blocker.

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: Tyler Palsulich tpalsul...@gmail.com
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Thursday, December 18, 2014 at 12:54 PM
To: dev@tika.apache.org dev@tika.apache.org
Subject: Re: 1.7 release?

Hi All,

It's been a few months, so I just want to follow up on this thread. We've
resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7
(TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445?
Has anyone tried their hand at the suggested (significant) fix?

Are there any other issues someone would like to fit in?

Cheers,
Tyler

[0] -
https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel


Re: 1.7 release?

2014-12-18 Thread Tyler Palsulich
I'm OK with trying the fix in 1.8 (or 1.7 if people feel strongly). As Nick
just recommended, I'll try adding metadata extraction to Tesseract soon,
then adding the extensible solution in 1.8.

Tyler

On Thu, Dec 18, 2014 at 11:58 PM, Mattmann, Chris A (3980) 
chris.a.mattm...@jpl.nasa.gov wrote:

 I haven’t tried my hand at it; I’ve been super busy. Tyler, if you have a
 chance, go for it. I think that’s the remaining blocker.

 ++
 Chris Mattmann, Ph.D.
 Chief Architect
 Instrument Software and Science Data Systems Section (398)
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 168-519, Mailstop: 168-527
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Associate Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++






 -Original Message-
 From: Tyler Palsulich tpalsul...@gmail.com
 Reply-To: dev@tika.apache.org dev@tika.apache.org
 Date: Thursday, December 18, 2014 at 12:54 PM
 To: dev@tika.apache.org dev@tika.apache.org
 Subject: Re: 1.7 release?

 Hi All,
 
 It's been a few months, so I just want to follow up on this thread. We've
 resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7
 (TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445?
 Has anyone tried their hand at the suggested (significant) fix?
 
 Are there any other issues someone would like to fit in?
 
 Cheers,
 Tyler
 
 [0] -
 
 https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel
 

Re: 1.7 release?

2014-10-27 Thread Mattmann, Chris A (3980)
Thanks, Tim. I saw your patch and am looking now.

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: Allison, Timothy B. talli...@mitre.org
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Monday, October 27, 2014 at 12:30 PM
To: dev@tika.apache.org dev@tika.apache.org
Subject: RE: 1.7 release?

Sounds good.  As long as the default behavior remains the same, I'm
happy.  I'm going to play with a combination of your patch and Tyler's
and see what the ramifications are for embedded docs.

To confirm, the OCR integration is fantastic.  Thank you and Tyler!


Best,

   Tim

-Original Message-
From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
Sent: Friday, October 24, 2014 5:36 PM
To: dev@tika.apache.org
Subject: Re: 1.7 release?

Hey Tim,

What do you think about my existing patch for TIKA-1445, i.e. just calling
all the parsers? I thought that was what made things slow, but the
slowness turned out to come from Tesseract and my machine at the time.

I think my patch for TIKA-1445 may be enough, and we should then get the
metadata. Thoughts?

I honestly think we need to deliver Tesseract in 1.7. We're close. I'll
even take it upon myself to experiment with the idea of multiple
parsers being called. I think a simple solution to the metadata key
conflict issue is to have a policy that adds values by default and
replaces them if a property is set in ParseContext. Some simple updates to
CompositeParser would allow this.
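
The add-by-default, replace-on-request policy described above could be
sketched roughly as follows. This is a minimal illustration in plain Java,
not Tika code: MetadataMergeSketch, Policy, and merge are hypothetical names,
a Map<String, List<String>> stands in for Tika's Metadata, and the idea that
the policy would be selected through ParseContext comes from the message
itself rather than from any existing API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MetadataMergeSketch {

    /** Hypothetical policy; in the proposal it would be chosen via ParseContext. */
    public enum Policy { ADD, REPLACE }

    /**
     * Folds the metadata produced by one parser into the accumulated metadata.
     * ADD appends to any existing values for a key; REPLACE overwrites them.
     */
    public static void merge(Map<String, List<String>> accumulated,
                             Map<String, List<String>> fromParser,
                             Policy policy) {
        for (Map.Entry<String, List<String>> entry : fromParser.entrySet()) {
            if (policy == Policy.REPLACE) {
                accumulated.put(entry.getKey(), new ArrayList<>(entry.getValue()));
            } else {
                accumulated.computeIfAbsent(entry.getKey(), k -> new ArrayList<>())
                           .addAll(entry.getValue());
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for metadata already collected by an image parser.
        Map<String, List<String>> accumulated = new LinkedHashMap<>();
        accumulated.put("Content-Type", new ArrayList<>(Arrays.asList("image/png")));

        // Stand-in for metadata reported by a second (e.g. OCR) parser.
        Map<String, List<String>> fromOcr = new LinkedHashMap<>();
        fromOcr.put("Content-Type", Arrays.asList("text/plain"));

        // Default policy: both values are kept under the same key.
        merge(accumulated, fromOcr, Policy.ADD);
        System.out.println(accumulated);
    }
}
```

With ADD as the default, no parser silently discards what an earlier parser
reported; REPLACE is there for callers who explicitly ask for last-writer-wins.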

Thoughts?

Cheers,
Chris


++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: Allison, Timothy B. talli...@mitre.org
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Friday, October 24, 2014 at 2:24 PM
To: dev@tika.apache.org dev@tika.apache.org
Subject: RE: 1.7 release?

Sorry for coming late to the game on the implications of TIKA-1445.  I
don't want to hold up the release of 1.7.

However, would it be possible to return to the legacy default behavior of
extracting metadata from images?

We can then document on the OCR parser page on the wiki that you need to
install Tesseract _and_ make a change in the parser/mime config file. If
you want this new capability, it will take a small bit of work until we
solve TIKA-1445.

I worry that the current behavior of 1.7 would be surprising to most
non-dev users (well, even to at least one dev :) ).

Cheers,
  
  Tim



Re: 1.7 release?

2014-10-24 Thread Oleg Tikhonov
Hi Tyler,
don't mention it.

Cheers,
Oleg
On Oct 24, 2014 8:02 PM, Tyler Palsulich tpalsul...@gmail.com wrote:

 Thank you for the help, Oleg! I just resolved TIKA-1422. So, are there any
 other issues anyone would like to resolve before a new release?

 Thanks,
 Tyler

 On Tue, Oct 21, 2014 at 2:42 AM, Oleg Tikhonov olegtikho...@gmail.com
 wrote:

  Sorry!!!
 
  On Tue, Oct 21, 2014 at 9:37 AM, Mattmann, Chris A (3980) 
  chris.a.mattm...@jpl.nasa.gov wrote:
 
   Thanks Oleg, will try tomorrow, Los Angeles time!
  
   ++
   Chris Mattmann, Ph.D.
   Chief Architect
   Instrument Software and Science Data Systems Section (398)
   NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
   Office: 168-519, Mailstop: 168-527
   Email: chris.a.mattm...@nasa.gov
   WWW:  http://sunset.usc.edu/~mattmann/
   ++
   Adjunct Associate Professor, Computer Science Department
   University of Southern California, Los Angeles, CA 90089 USA
   ++
  
  
  
  
  
  
   -Original Message-
   From: Oleg Tikhonov o...@apache.org
   Reply-To: dev@tika.apache.org dev@tika.apache.org
   Date: Monday, October 20, 2014 at 11:20 PM
   To: dev@tika.apache.org dev@tika.apache.org
   Subject: Re: 1.7 release?
  
   Please give the newest patch a try.
   Cheers,
   Oleg
   
   On Tue, Oct 21, 2014 at 9:08 AM, Oleg Tikhonov olegtikho...@gmail.com
   wrote:
   
Taken. Thanks. In progress...
   
On Tue, Oct 21, 2014 at 8:54 AM, Mattmann, Chris A (3980) 
chris.a.mattm...@jpl.nasa.gov wrote:
   
Trunk is the current checkout/branch:
   
http://svn.apache.org/repos/asf/tika/trunk
   
   
++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
   
   
   
   
   
   
-Original Message-
From: Oleg Tikhonov olegtikho...@gmail.com
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Monday, October 20, 2014 at 10:16 PM
To: dev@tika.apache.org dev@tika.apache.org
Subject: Re: 1.7 release?
   
Hi, I can give this a try.
What is the trunk?


Thanks,
Oleg

On Tue, Oct 21, 2014 at 6:21 AM, Mattmann, Chris A (3980) 
chris.a.mattm...@jpl.nasa.gov wrote:

 Hmm any idea why this is failing on Windows? Tyler P. and
 I were talking the other day - maybe we shouldn't run the
 tests from TIKA-1422 unless Tesseract is installed? Thoughts?


 ++
 Chris Mattmann, Ph.D.
 Chief Architect
 Instrument Software and Science Data Systems Section (398)
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 168-519, Mailstop: 168-527
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/

 ++
 Adjunct Associate Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA

 ++






 -Original Message-
 From: Hong-Thai Nguyen thaicha...@gmail.com
 Reply-To: dev@tika.apache.org dev@tika.apache.org
 Date: Thursday, October 16, 2014 at 2:03 AM
 To: dev@tika.apache.org dev@tika.apache.org
 Subject: Re: 1.7 release?

 Hi Andrzej,
 
 We are impatient for the 1.7 release too.
 I'm having a problem compiling TIKA-1422 on my machine. If anyone can build
 successfully on Windows, I have no objection to releasing 1.7.
 
 Thanks,
 
 On Thu, Oct 16, 2014 at 10:51 AM, Andrzej Białecki a...@getopt.org
 wrote:
 
  Hi,
 
  Any news on the 1.7 release? Or at least a 1.6.1 release that includes the
  fix for broken ODF parsing...
 
  ---
  Best regards,
 
  Andrzej Bialecki
 
 
 
 
 --
 --
 Hong-Thai


   
   
   
  
  
 



RE: 1.7 release?

2014-10-24 Thread Allison, Timothy B.
Sorry for coming late to the game on the implications of TIKA-1445.  I don't 
want to hold up the release of 1.7.  

However, would it be possible to return to the legacy default behavior of 
extracting metadata from images?  

We can then document on the OCR parser page on the wiki that you need to 
install Tesseract _and_ make a change in the parser/mime config file. If you 
want this new capability, it will take a small bit of work until we solve 
TIKA-1445.

I worry that the current behavior of 1.7 would be surprising to most non-dev 
users (well, even to at least one dev :) ).

Cheers,
  
  Tim



Re: 1.7 release?

2014-10-21 Thread Oleg Tikhonov
Taken. Thanks. In progress...

On Tue, Oct 21, 2014 at 8:54 AM, Mattmann, Chris A (3980) 
chris.a.mattm...@jpl.nasa.gov wrote:

 Trunk is the current checkout/branch:

 http://svn.apache.org/repos/asf/tika/trunk


 ++
 Chris Mattmann, Ph.D.
 Chief Architect
 Instrument Software and Science Data Systems Section (398)
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 168-519, Mailstop: 168-527
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Associate Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++






 -Original Message-
 From: Oleg Tikhonov olegtikho...@gmail.com
 Reply-To: dev@tika.apache.org dev@tika.apache.org
 Date: Monday, October 20, 2014 at 10:16 PM
 To: dev@tika.apache.org dev@tika.apache.org
 Subject: Re: 1.7 release?

 Hi, I can give this a try.
 What is the trunk?
 
 
 Thanks,
 Oleg
 
 On Tue, Oct 21, 2014 at 6:21 AM, Mattmann, Chris A (3980) 
 chris.a.mattm...@jpl.nasa.gov wrote:
 
  Hmm, any idea why this is failing on Windows? Tyler P. and
  I were talking the other day: maybe we shouldn't run the
  tests from TIKA-1422 unless Tesseract is installed? Thoughts?
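
One way to act on the suggestion above is to have the OCR tests check for the
Tesseract binary first and skip themselves when it is missing. The sketch
below is plain Java and only an assumption about how such a check might look:
TesseractCheckSketch, isOnPath, and tesseractAvailable are hypothetical names,
and it assumes the executable is called "tesseract" (or "tesseract.exe" on
Windows) and is reachable through PATH.

```java
import java.io.File;

public class TesseractCheckSketch {

    /**
     * Returns true if an executable with the given name (optionally with an
     * ".exe" suffix) exists in one of the directories listed in pathValue.
     */
    public static boolean isOnPath(String executable, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return false;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File plain = new File(dir, executable);
            File withExe = new File(dir, executable + ".exe");
            if ((plain.isFile() && plain.canExecute())
                    || (withExe.isFile() && withExe.canExecute())) {
                return true;
            }
        }
        return false;
    }

    /** Assumption: Tesseract is installed if "tesseract" is found on PATH. */
    public static boolean tesseractAvailable() {
        return isOnPath("tesseract", System.getenv("PATH"));
    }

    public static void main(String[] args) {
        System.out.println("Tesseract found on PATH: " + tesseractAvailable());
    }
}
```

In a JUnit 4 test, calling
org.junit.Assume.assumeTrue(TesseractCheckSketch.tesseractAvailable()) at the
top of a test method would mark the test as skipped rather than failed on
machines without Tesseract.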
 
  ++
  Chris Mattmann, Ph.D.
  Chief Architect
  Instrument Software and Science Data Systems Section (398)
  NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
  Office: 168-519, Mailstop: 168-527
  Email: chris.a.mattm...@nasa.gov
  WWW:  http://sunset.usc.edu/~mattmann/
  ++
  Adjunct Associate Professor, Computer Science Department
  University of Southern California, Los Angeles, CA 90089 USA
  ++
 
 
 
 
 
 
  -Original Message-
  From: Hong-Thai Nguyen thaicha...@gmail.com
  Reply-To: dev@tika.apache.org dev@tika.apache.org
  Date: Thursday, October 16, 2014 at 2:03 AM
  To: dev@tika.apache.org dev@tika.apache.org
  Subject: Re: 1.7 release?
 
  Hi Andrzej,
  
  We are impatient for 1.7 release too.
  I'm having compiling problem of TIKA-1422 on me. If anyone can build
  successfully on Windows, I have no objection to release 1.7
  
  Thanks,
  
  On Thu, Oct 16, 2014 at 10:51 AM, Andrzej Białecki a...@getopt.org
 wrote:
  
   Hi,
  
   Any news on the 1.7 release? or at least a 1.6.1 release that
 includes
  the
   fix for broken ODF parsing...
  
   ---
   Best regards,
  
   Andrzej Bialecki
  
  
  
  
  --
  --
  Hong-Thai
 
 




Re: 1.7 release?

2014-10-21 Thread Oleg Tikhonov
Please give the newest patch a try.
Cheers,
Oleg

On Tue, Oct 21, 2014 at 9:08 AM, Oleg Tikhonov olegtikho...@gmail.com
wrote:

 Taken, thanks. In progress ...

Re: 1.7 release?

2014-10-21 Thread Mattmann, Chris A (3980)
Thanks Oleg, I will try it tomorrow, Los Angeles time!

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: Oleg Tikhonov o...@apache.org
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Monday, October 20, 2014 at 11:20 PM
To: dev@tika.apache.org dev@tika.apache.org
Subject: Re: 1.7 release?

Please give the newest patch a try.
Cheers,
Oleg

Re: 1.7 release?

2014-10-21 Thread Oleg Tikhonov
Sorry!!!

On Tue, Oct 21, 2014 at 9:37 AM, Mattmann, Chris A (3980) 
chris.a.mattm...@jpl.nasa.gov wrote:

 Thanks Oleg, I will try it tomorrow, Los Angeles time!

Re: 1.7 release?

2014-10-20 Thread Mattmann, Chris A (3980)
Hmm, any idea why this is failing on Windows? Tyler P. and
I were talking the other day - maybe we shouldn't run the
tests from TIKA-1422 unless Tesseract is installed? Thoughts?
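
A minimal sketch of the "skip unless Tesseract is installed" idea, assuming
the check is simply whether a tesseract binary can be found on the PATH. The
class and method names (TesseractCheck, isOnPath) are hypothetical and not
part of Tika. In a JUnit 4 test, the boolean could be passed to
org.junit.Assume.assumeTrue so that the test is reported as skipped rather
than failed.

```java
import java.io.File;

public class TesseractCheck {

    // Hypothetical helper (not a Tika API): returns true if a file named
    // `name` (or `name`.exe, for Windows) exists in any directory listed
    // in `pathEnv`, which is expected to look like the PATH variable.
    public static boolean isOnPath(String name, String pathEnv) {
        if (pathEnv == null || pathEnv.isEmpty()) {
            return false;
        }
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows.
        for (String dir : pathEnv.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String candidate : new String[] {name, name + ".exe"}) {
                if (new File(dir, candidate).isFile()) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean available = isOnPath("tesseract", System.getenv("PATH"));
        // A test class would skip its OCR tests when this is false.
        System.out.println(available ? "run OCR tests" : "skip OCR tests");
    }
}
```

This only checks that a file with the right name exists; it does not verify
that the binary actually runs, so a stricter guard might launch it and
inspect the exit code.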

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: Hong-Thai Nguyen thaicha...@gmail.com
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Thursday, October 16, 2014 at 2:03 AM
To: dev@tika.apache.org dev@tika.apache.org
Subject: Re: 1.7 release?

Hi Andrzej,

We are impatient for the 1.7 release too.
I'm having a problem compiling TIKA-1422 on my side. If anyone can build
successfully on Windows, I have no objection to releasing 1.7.

Thanks,


-- 
--
Hong-Thai



Re: 1.7 release?

2014-10-20 Thread Oleg Tikhonov
Hi, I can try this one.
What is trunk?


Thanks,
Oleg

On Tue, Oct 21, 2014 at 6:21 AM, Mattmann, Chris A (3980) 
chris.a.mattm...@jpl.nasa.gov wrote:

 Hmm any idea why this is failing on Windows? Tyler P. and
 I were talking the other day - maybe we shouldn't run the
 tests from TIKA-1422 unless Tesseract is installed? Thoughts?


Re: 1.7 release?

2014-10-20 Thread Mattmann, Chris A (3980)
Trunk is the current checkout/branch:

http://svn.apache.org/repos/asf/tika/trunk


++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: Oleg Tikhonov olegtikho...@gmail.com
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Monday, October 20, 2014 at 10:16 PM
To: dev@tika.apache.org dev@tika.apache.org
Subject: Re: 1.7 release?

Hi, I can try this one.
What is trunk?


Thanks,
Oleg


Re: 1.7 release?

2014-10-16 Thread Hong-Thai Nguyen
Hi Andrzej,

We are impatient for the 1.7 release too.
I'm having a problem compiling TIKA-1422 on my side. If anyone can build
successfully on Windows, I have no objection to releasing 1.7.

Thanks,

On Thu, Oct 16, 2014 at 10:51 AM, Andrzej Białecki a...@getopt.org wrote:

 Hi,

 Any news on the 1.7 release? Or at least a 1.6.1 release that includes the
 fix for broken ODF parsing…

 ---
 Best regards,

 Andrzej Bialecki




-- 
--
Hong-Thai