Re: [VOTE] Apache Tika 1.1 release rc #1

2012-03-07 Thread Mattmann, Chris A (388J)
Hey Ken, Sorry about that! Forgot to include the link to the staged Maven2 repo, here: https://repository.apache.org/content/repositories/orgapachetika-066/ There ya go. Cheers, Chris On Mar 7, 2012, at 4:36 PM, Ken Krugler wrote: > Hi Chris, > > On Mar 7, 2012, at 1:35pm, Mattmann, Chris A

Re: [VOTE] Apache Tika 1.1 release rc #1

2012-03-07 Thread Ken Krugler
Hi Chris, Built/tested/installed fine on Mac OS X 10.7.3. Switched Bixo to use Tika 1.1, and Bixo built/passed all tests. +1 -- Ken On Mar 7, 2012, at 1:35pm, Mattmann, Chris A (388J) wrote: > Hi Folks, > > A candidate for the Tika 1.1 release is available at: > > http://people.apache.org/~

Re: [VOTE] Apache Tika 1.1 release rc #1

2012-03-07 Thread Ken Krugler
Hi Chris, On Mar 7, 2012, at 1:35pm, Mattmann, Chris A (388J) wrote: > Hi Folks, > > A candidate for the Tika 1.1 release is available at: > > http://people.apache.org/~mattmann/apache-tika-1.1/rc1/ I'm curious why you've got just the tika-app-1.1.jar (plus release sources), and not any of t

Re: [VOTE] Apache Tika 1.1 release rc #1

2012-03-07 Thread Zabrane Mickael
Hi guys, Congrats on the v1.1 rc1. Compiles fine for me (OS X Lion 10.7.3 + OS X Snow Leopard 10.6.8). All tests passed. +1 Regards, Zabrane On Mar 7, 2012, at 10:35 PM, Mattmann, Chris A (388J) wrote: > Hi Folks, > > A candidate for the Tika 1.1 release is available at: > > http://people.ap

[jira] [Updated] (TIKA-859) DublinCore Metadata Keys Should be Prefixed and Property Objects

2012-03-07 Thread Ray Gauss II (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Gauss II updated TIKA-859: -- Attachment: (was: dublincore-prefixed-patch.diff) > DublinCore Metadata Keys Should be Prefixed a

[jira] [Updated] (TIKA-859) DublinCore Metadata Keys Should be Prefixed and Property Objects

2012-03-07 Thread Ray Gauss II (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Gauss II updated TIKA-859: -- Attachment: dublincore-prefixed-and-updated-references-parsers-patch dublincore-prefixed-a

[VOTE] Apache Tika 1.1 release rc #1

2012-03-07 Thread Mattmann, Chris A (388J)
Hi Folks, A candidate for the Tika 1.1 release is available at: http://people.apache.org/~mattmann/apache-tika-1.1/rc1/ The release candidate is a zip archive of the sources in: http://svn.apache.org/repos/asf/tika/tags/1.1/ The SHA1 checksum of the archive is d3185bb22fa3c7318488838989af
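
For readers who want to check a downloaded release archive against the SHA1 value quoted in a vote mail, a minimal sketch follows. It assumes Python's standard `hashlib`; the function names and the local file path are hypothetical, and the published digest has to be copied from the full mail because it is cut off in the snippet above.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare a file's SHA-1 with a published value, ignoring case and surrounding whitespace."""
    return sha1_of_file(path) == published_hex.strip().lower()
```

A caller would pass the path of the downloaded source archive and the full checksum string from the announcement; a `False` result means the download should not be trusted.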

[jira] [Updated] (TIKA-870) Allow to use call parseToString with a additional parameter of MaxStringLength, so it can be changed per call

2012-03-07 Thread Michael McCandless (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated TIKA-870: Attachment: TIKA-870.patch Patch, with the sample code plus a test case. The test case faile

[jira] [Commented] (TIKA-870) Allow to use call parseToString with a additional parameter of MaxStringLength, so it can be changed per call

2012-03-07 Thread Michael McCandless (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224643#comment-13224643 ] Michael McCandless commented on TIKA-870: - I think this makes sense.

[jira] [Assigned] (TIKA-870) Allow to use call parseToString with a additional parameter of MaxStringLength, so it can be changed per call

2012-03-07 Thread Michael McCandless (Assigned) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned TIKA-870: --- Assignee: Michael McCandless > Allow to use call parseToString with a additional pa

[jira] [Created] (TIKA-870) Allow to use call parseToString with a additional parameter of MaxStringLength, so it can be changed per call

2012-03-07 Thread Shay Banon (Created) (JIRA)
Allow to use call parseToString with a additional parameter of MaxStringLength, so it can be changed per call - Key: TIKA-870 URL: https://issues.apache.or
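
The request in this issue, a length limit that can be overridden on a single call instead of only on the shared instance, can be illustrated with a small language-neutral sketch. This is not Tika's Java API: the class `TextExtractor`, its default of 100,000 characters, and the convention that a negative limit means "no limit" are all assumptions made for illustration.

```python
class TextExtractor:
    """Hypothetical stand-in for a text-extraction facade; not Tika's API."""

    def __init__(self, max_string_length=100_000):
        # Instance-wide default, used when a call does not supply its own limit.
        self.max_string_length = max_string_length

    def parse_to_string(self, chunks, max_string_length=None):
        """Join text chunks, truncating at the per-call limit if one is given.

        `chunks` is any iterable of strings standing in for parser output.
        A negative limit is treated as "no limit" (an assumption of this sketch).
        """
        limit = self.max_string_length if max_string_length is None else max_string_length
        if limit < 0:
            return "".join(chunks)
        pieces = []
        remaining = limit
        for chunk in chunks:
            if remaining <= 0:
                break
            pieces.append(chunk[:remaining])
            remaining -= len(pieces[-1])
        return "".join(pieces)
```

Because the override is an argument and not a setter, one call can use a different limit without changing the limit seen by other callers of the same instance.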

[jira] [Updated] (TIKA-869) IdentityHtmlMapper.mapSafeElement() needs to return lower-cased incoming name

2012-03-07 Thread Ken Krugler (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ken Krugler updated TIKA-869: - Attachment: TIKA-869.patch > IdentityHtmlMapper.mapSafeElement() needs to return lower-cased incoming n

[jira] [Created] (TIKA-869) IdentityHtmlMapper.mapSafeElement() needs to return lower-cased incoming name

2012-03-07 Thread Ken Krugler (Created) (JIRA)
IdentityHtmlMapper.mapSafeElement() needs to return lower-cased incoming name - Key: TIKA-869 URL: https://issues.apache.org/jira/browse/TIKA-869 Project: Tika Issue
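
The behaviour asked for in the issue title can be sketched in a few lines. This is a Python illustration only, not the Java class: `IdentityMapperSketch` and its snake_case method name are hypothetical, and the sketch assumes that an identity mapper should pass every element through while normalising the name's case.

```python
class IdentityMapperSketch:
    """Hypothetical identity mapper: keeps every element, but lower-cases its name."""

    def map_safe_element(self, name):
        # Return the incoming name unchanged except for case, so that
        # "DIV", "Div" and "div" all map to the same element name.
        return name.lower()
```

With this in place, callers that compare mapped names against lower-case constants see consistent results regardless of how the source markup was capitalised.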

buildbot failure in ASF Buildbot on tika-trunk

2012-03-07 Thread buildbot
The Buildbot has detected a new failure on builder tika-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/751 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp: [

[jira] [Updated] (TIKA-774) ExifTool Parser

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-774: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > ExifT

[jira] [Updated] (TIKA-859) DublinCore Metadata Keys Should be Prefixed and Property Objects

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-859: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Dubli

[jira] [Updated] (TIKA-593) Tika network server

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-593: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Tika

[jira] [Updated] (TIKA-842) IPTC Properties Should be Defined Completely and Independently of the Drew Library

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-842: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > IPTC

[jira] [Updated] (TIKA-754) Automatic line break insertion (BR element) instead of '\n' in XHTMLContentHandler

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-754: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Autom

[jira] [Updated] (TIKA-775) Embed Capabilities

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-775: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Embed

[jira] [Updated] (TIKA-539) Encoding detection is too biased by encoding in meta tag

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-539: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Encod

[jira] [Updated] (TIKA-820) Locator is unset for HTML parser

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-820: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Locat

[jira] [Updated] (TIKA-758) Address TODOs when we upgrade to next PDFBox release

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-758: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Addre

[jira] [Updated] (TIKA-747) Ogg Vorbis and FLAC Parsers

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-747: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Ogg V

[jira] [Updated] (TIKA-776) ExifTool Embedder

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-776: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > ExifT

[jira] [Updated] (TIKA-819) Make Option to Exclude Embedded Files' Text for Text Content

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-819: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Make

[jira] [Updated] (TIKA-816) (XLS/XLSX) Improperly formatted date/time in text content.

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-816: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > (XLS/

[jira] [Updated] (TIKA-605) Tika GDAL parser

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-605: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Tika

[jira] [Updated] (TIKA-715) Some parsers produce non-well-formed XHTML SAX events

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-715: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Some

[jira] [Updated] (TIKA-817) (PPT/PPTX) Missing date/time in text content.

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-817: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > (PPT/

[jira] [Updated] (TIKA-861) Parse links in PDF

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-861: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Parse

[jira] [Updated] (TIKA-757) Address TODOs when we upgrade to next POI release (3.8 beta 5)

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-757: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > Addre

[jira] [Updated] (TIKA-868) TXT parser does not honour the specified encoding

2012-03-07 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-868: --- Fix Version/s: (was: 1.1) 1.2 - push out to 1.2 > TXT p