OOPS -- my mistake, text/plain issues

2010-11-10 Thread qubit
Greetings. I am afraid I owe you an apology -- I went to make some mods to tika for the app we are working on, and that got me into the code for text/plain translation to xhtml. For some reason -- I could have sworn it didn't work before -- I thought the translation of special characters wasn

Re: tika and plain text -- bug or feature?

2010-11-10 Thread qubit
I don't know if my mail is getting filtered and translated by mail. It is not coming out right and I am afraid you may not understand what I am saying. Please review the source of my message rather than the rendering, which is wrong. My mailer went and translated all the html so what I'm seei

RE: tika and plain text -- bug or feature?

2010-11-10 Thread Jukka Zitting
Hi, From: qubit [mailto:lauraea...@yahoo.com] > Then perhaps I am in the wrong place in the code... or I am still not > understanding all that sax is doing. The translation needs to be done > however because you are essentially outputting a plain text file as if > it were xhtml with only header a

Re: tika and plain text -- bug or feature?

2010-11-10 Thread qubit
Greetings and thanks for your reply. I'll reply to excerpted fragments. <<- Please avoid cross-posting between dev@ and u...@. Responding only on dev@, as this is mostly related to Tika internals. ->> Sorry about that. I will send the rest of the mail on this thread only to dev. <<- > However

RE: tika and plain text -- bug or feature?

2010-11-10 Thread Jukka Zitting
Hi, Please avoid cross-posting between dev@ and u...@. Responding only on dev@, as this is mostly related to Tika internals. From: qubit [mailto:lauraea...@yahoo.com] > First, it appears that the code in TextParser.java thinks it is > dealing with a file in plain text (isn't that the same as tex

tika and plain text -- bug or feature?

2010-11-10 Thread qubit
Greetings. I have been sifting through the code in TextParser.java and the various content handlers it invokes, and I have some questions. First, it appears that the code in TextParser.java thinks it is dealing with a file in plain text (isn't that the same as text/plain ?) However it is output

Re: ReviewBoard instance

2010-11-10 Thread Chris Mattmann
+1 Sent from my Verizon Wireless BlackBerry -Original Message- From: Jukka Zitting Date: Wed, 10 Nov 2010 11:23:10 To: dev@tika.apache.org Reply-To: "dev@tika.apache.org" Subject: Re: ReviewBoard instance Hi, On Tue, Oct 26, 2010 at 4:52 PM, Mattmann, Chris A (388J) wrote: > Gav from

[jira] Commented: (TIKA-482) Refactor image and jpeg parsers for access to MetadataExtractor API

2010-11-10 Thread Staffan Olsson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930730#action_12930730 ] Staffan Olsson commented on TIKA-482: Yes, it's all meant for Apache license > Refactor

[jira] Commented: (TIKA-392) RTF parser smashes words together in subsequent table cells

2010-11-10 Thread Thiago Souza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930726#action_12930726 ] Thiago Souza commented on TIKA-392: This extra space is being added in case of words with

Re: ReviewBoard instance

2010-11-10 Thread Jukka Zitting
Hi, On Tue, Oct 26, 2010 at 4:52 PM, Mattmann, Chris A (388J) wrote: > Gav from infra@ set up a ReviewBoard instance for Apache [1]. I've never > used it before but I thought I'd request an account on it for Tika [2] > regardless, so if folks want to use it, they can. I wonder if we could set it

[jira] Commented: (TIKA-482) Refactor image and jpeg parsers for access to MetadataExtractor API

2010-11-10 Thread Jukka Zitting (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930703#action_12930703 ] Jukka Zitting commented on TIKA-482: Staffan, I notice that some of the files are missing

buildbot success in ASF Buildbot on tika-trunk

2010-11-10 Thread buildbot
The Buildbot has detected a restored build of tika-trunk on ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/190 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: Build Source Stamp: [branch tika/trunk] 1033548

[jira] Commented: (TIKA-482) Refactor image and jpeg parsers for access to MetadataExtractor API

2010-11-10 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930631#action_12930631 ] Nick Burch commented on TIKA-482: Sorry for the delay in finally reviewing this, now committ

buildbot failure in ASF Buildbot on tika-trunk

2010-11-10 Thread buildbot
The Buildbot has detected a new failure of tika-trunk on ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/189 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: Build Source Stamp: [branch tika/trunk] 1033546 Bla