tika-parsers maven dependencies (commons-logging)

2011-10-18 Thread gross
that uses tika. -- Regards, Konstantin Gribov aka gross.

Re: Tika 0.9 integration in Solr 3.3.0

2011-08-19 Thread Tom Gross
- Development mailing list archive at Nabble.com. -- Author of the book Plone 3 Multimedia - http://amzn.to/dtrp0C Tom Gross email.@toms-projekte.de skype.tom_gross web.http://toms-projekte.de blog...http://blog.toms-projekte.de

Re: Tika 0.9 integration in Solr 3.3.0

2011-08-19 Thread Tom Gross
-tp3267799p3268030.html Sent from the Apache Tika - Development mailing list archive at Nabble.com. -- Author of the book Plone 3 Multimedia - http://amzn.to/dtrp0C Tom Gross email.@toms-projekte.de skype.tom_gross web.http://toms-projekte.de blog...http

Re: Failing word parse with tika 0.9

2011-06-23 Thread Tom Gross
Upgrading to poi 3.8beta3 fixed the issue. Thanks Nick! On 06/23/2011 07:24 PM, Nick Burch wrote: On Thu, 23 Jun 2011, Tom Gross wrote: which tika 0.9 can't parse. It fails with: Caused by: java.lang.NullPointerException at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP

[jira] [Commented] (TIKA-573) MimeType.getExtension()

2011-05-18 Thread Henning Gross (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035265#comment-13035265 ] Henning Gross commented on TIKA-573: I have a use case for getting all known valid

[jira] [Created] (TIKA-661) MimeType class does contain a String with accessor named Extension. This should be a List<String> Extensions due to several reasons.

2011-05-18 Thread Henning Gross (JIRA)
URL: https://issues.apache.org/jira/browse/TIKA-661 Project: Tika Issue Type: Bug Components: mime Affects Versions: 0.9 Reporter: Henning Gross The javadoc for the method suggests that it will return the preferred extension

[jira] [Updated] (TIKA-661) MimeType class does contain a String with accessor named Extension. This should be a List<String> Extensions due to several reasons.

2011-05-18 Thread Henning Gross (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henning Gross updated TIKA-661: --- Attachment: MimeType.getExtensionsPatch.txt Patch that adds getExtensions() and addExtension() as well

[jira] [Commented] (TIKA-573) MimeType.getExtension()

2011-05-18 Thread Henning Gross (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035463#comment-13035463 ] Henning Gross commented on TIKA-573: Well the consumer is a portlet which allows uploads

[jira] [Commented] (TIKA-573) MimeType.getExtension()

2011-05-18 Thread Henning Gross (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035465#comment-13035465 ] Henning Gross commented on TIKA-573: https://issues.apache.org/jira/browse/TIKA-661

[jira] Created: (TIKA-574) Support for IBM866 (CP866) encoding in TXTParser

2010-12-16 Thread gross (JIRA)
Affects Versions: 0.8 Environment: GNU/Linux 2.6.35-23, openjdk6 Reporter: gross Priority: Minor Fix For: 0.9, 1.0, 0.8 Attachments: tika-0.8-cp866.patch There's no recognizer for CP866 (DOS Russian encoding) in Tika yet. -- This message

[jira] Updated: (TIKA-574) Support for IBM866 (CP866) encoding in TXTParser

2010-12-16 Thread gross (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gross updated TIKA-574: --- Attachment: tika-0.8-cp866.patch I've used ngrams from cp1251 and wrote a custom byteMap. All Russian letters, used

Adding cp866 (dos) encoding support.

2010-12-14 Thread gross
. Is it enough to add support for this encoding? Is it useful for the community? And, if so, what should I do to contribute such an addition? *Best regards*, gross aka Kostya Gribov.