[jira] [Created] (TIKA-1090) Improve Java Documentation for Apache Tika Metadata

2013-02-27 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1090: -- Summary: Improve Java Documentation for Apache Tika Metadata Key: TIKA-1090 URL: https://issues.apache.org/jira/browse/TIKA-1090 Project: Tika

[jira] [Updated] (TIKA-1090) Improve Java Documentation for Apache Tika Metadata

2013-02-27 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1090: --- Attachment: TIKA-1090.patch The attached patch cleans up (and satisfies) my initial

[DISCUSS] Integrate Apache Any23 into Apache Tika

2013-10-18 Thread Lewis John Mcgibbney
Hi Tika Devs/PMC, This thread is aimed at recognizing common ground shared by Any23 and Tika in an attempt to possibly integrate Any23 into Tika. First, however, it will serve a purpose for me to put this into context and also provide some rationale behind this initiative. It is my understanding

[VOTE] Release Apache ANY23 0.9.0

2013-10-28 Thread Lewis John Mcgibbney
Hi Everyone, (Hi dev@tika, I hope you don't mind me cross-pollinating this thread) This thread is a formal VOTE to release Apache Any23 0.9.0. In this release cycle we solved 11 issues: http://s.apache.org/6l1 Git source tag (86daaf897513efce88573c10e06b71d5eb36ad1a): http://s.apache.org/ORc

Re: [VOTE] Release Apache ANY23 0.9.0

2013-10-28 Thread Lewis John Mcgibbney
Sorry folks the KEYS file resides in http://apache.org/dist/any23/KEYS Thanks Lewis On Mon, Oct 28, 2013 at 10:49 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi Everyone, (Hi dev@tika, I hope you don't mind me cross-pollinating this thread) This thread is a formal VOTE

Re: [VOTE] Release Apache ANY23 0.9.0

2013-10-31 Thread Lewis John Mcgibbney
Hi Giovanni, Thank you for the feedback. On Thu, Oct 31, 2013 at 6:41 PM, dev-digest-h...@tika.apache.org wrote: Re: [VOTE] Release Apache ANY23 0.9.0 10080 by: Lewis John Mcgibbney 10081 by: S.L 10082 by: Giovanni Tummarello my shallow understanding from

[ANNOUNCE] Apache Any23 0.9.0 Release

2013-11-03 Thread Lewis John Mcgibbney
Good Evening All, The Any23 PMC are proud to announce the release of Any23 0.9.0. Anything To Triples (Any23) is a library, a web service and a command line tool that extracts structured data in RDF format from a variety of Web documents. Currently it supports the following input formats: *

Re: [DISCUSS] Integrate Apache Any23 into Apache Tika

2013-11-04 Thread Lewis John Mcgibbney
to be quite handy. Ken, Julien and Chris, please see my replies in line. I've posted them as I received them within the digest email. Thanks Lewis On Sat, Oct 19, 2013 at 8:31 PM, dev-digest-h...@tika.apache.org wrote: [DISCUSS] Integrate Apache Any23 into Apache Tika 10047 by: Lewis John

[jira] [Commented] (TIKA-994) Type Detection Fault

2013-12-12 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846526#comment-13846526 ] Lewis John McGibbney commented on TIKA-994: --- Is the compiled code you've attached

[jira] [Created] (TIKA-1207) Parent task for integration of Any23 into Tika

2013-12-12 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1207: -- Summary: Parent task for integration of Any23 into Tika Key: TIKA-1207 URL: https://issues.apache.org/jira/browse/TIKA-1207 Project: Tika Issue

[jira] [Updated] (TIKA-1207) Parent task for integration of Any23 into Tika

2013-12-12 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1207: --- Attachment: the_initiative.txt This attachment creates a justification/argument

[jira] [Created] (TIKA-1208) Migrate Any23 mime contributions to Tika

2013-12-12 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1208: -- Summary: Migrate Any23 mime contributions to Tika Key: TIKA-1208 URL: https://issues.apache.org/jira/browse/TIKA-1208 Project: Tika Issue Type

Initial work on Any23 proposal migration to Tika

2013-12-12 Thread Lewis John Mcgibbney
Hi Folks, I managed to put some time in to the proposal document we promised a while back. Right now there is lots of background (which I think is equally as important as the migration itself) and I have identified the first area which work can begin on e.g. mime/mediatype detection. I opened

[jira] [Commented] (TIKA-1208) Migrate Any23 mime contributions to Tika

2013-12-12 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846966#comment-13846966 ] Lewis John McGibbney commented on TIKA-1208: Nice [~p_ansell] thank you

[jira] [Commented] (TIKA-1208) Migrate Any23 mime contributions to Tika

2013-12-13 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847729#comment-13847729 ] Lewis John McGibbney commented on TIKA-1208: I've started work on this one

[jira] [Commented] (TIKA-1208) Migrate Any23 mime contributions to Tika

2013-12-13 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847734#comment-13847734 ] Lewis John McGibbney commented on TIKA-1208: Hey [~p_ansell], can you please

[jira] [Created] (TIKA-1209) Upgrade Tika tests to JUnit 4.X

2013-12-13 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1209: -- Summary: Upgrade Tika tests to JUnit 4.X Key: TIKA-1209 URL: https://issues.apache.org/jira/browse/TIKA-1209 Project: Tika Issue Type

Support for marks in InputStream passed to Tika.detect

2013-12-13 Thread Lewis John Mcgibbney
Hi, I am wondering whether the concept of 'purifying' [0][1] is something which may be of interest to the detect API in Tika. Basically we have an interface which defines some logic which should be performed prior to MIMEType detection taking place. The only implementation we have right now is a

[jira] [Created] (TIKA-1210) Address tika-parsers o.a.t.mime.TestMimeTypes TODO: Need a test flash file

2013-12-14 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1210: -- Summary: Address tika-parsers o.a.t.mime.TestMimeTypes TODO: Need a test flash file Key: TIKA-1210 URL: https://issues.apache.org/jira/browse/TIKA-1210

[jira] [Updated] (TIKA-1209) Upgrade Tika tests to JUnit 4.X

2013-12-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1209: --- Attachment: TIKA-1209.patch Patch for trunk. Many test files are changed

[jira] [Commented] (TIKA-1209) Upgrade Tika tests to JUnit 4.X

2013-12-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848488#comment-13848488 ] Lewis John McGibbney commented on TIKA-1209: Thanks [~kkrugler] Upgrade Tika

[jira] [Commented] (TIKA-1209) Upgrade Tika tests to JUnit 4.X

2013-12-15 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848570#comment-13848570 ] Lewis John McGibbney commented on TIKA-1209: Hi [~kkrugler], just some more

[jira] [Commented] (TIKA-1209) Upgrade Tika tests to JUnit 4.X

2013-12-19 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13853233#comment-13853233 ] Lewis John McGibbney commented on TIKA-1209: Nice job [~kkrugler

[jira] [Updated] (TIKA-1210) Address tika-parsers o.a.t.mime.TestMimeTypes TODO: Need a test flash file

2013-12-20 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1210: --- Attachment: test3.swf test2.swf test1.swf

[jira] [Updated] (TIKA-1208) Migrate Any23 mime contributions to Tika

2014-01-08 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1208: --- Attachment: TIKA-1208.patch Hi [~p_ansell], I have been working on a patch

[jira] [Comment Edited] (TIKA-1208) Migrate Any23 mime contributions to Tika

2014-01-08 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866080#comment-13866080 ] Lewis John McGibbney edited comment on TIKA-1208 at 1/9/14 12:28 AM

[jira] [Commented] (TIKA-1208) Migrate Any23 mime contributions to Tika

2014-01-08 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866116#comment-13866116 ] Lewis John McGibbney commented on TIKA-1208: OK Peter, let's work

[jira] [Comment Edited] (TIKA-1208) Migrate Any23 mime contributions to Tika

2014-01-08 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866116#comment-13866116 ] Lewis John McGibbney edited comment on TIKA-1208 at 1/9/14 12:39 AM

[jira] [Updated] (TIKA-1219) Add .svn to .gitignore

2014-01-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1219: --- Attachment: TIKA-1219.patch Patch for trunk. I've generated this with --no-prefix so

[jira] [Created] (TIKA-1219) Add .svn to .gitignore

2014-01-14 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1219: -- Summary: Add .svn to .gitignore Key: TIKA-1219 URL: https://issues.apache.org/jira/browse/TIKA-1219 Project: Tika Issue Type: Improvement

[jira] [Commented] (TIKA-1219) Add .svn to .gitignore

2014-01-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870836#comment-13870836 ] Lewis John McGibbney commented on TIKA-1219: I am always working with Tika

[jira] [Comment Edited] (TIKA-1219) Add .svn to .gitignore

2014-01-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870836#comment-13870836 ] Lewis John McGibbney edited comment on TIKA-1219 at 1/14/14 3:58 PM

[jira] [Commented] (TIKA-1219) Add .svn to .gitignore

2014-01-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870846#comment-13870846 ] Lewis John McGibbney commented on TIKA-1219: OK doke. Feel free to close

[jira] [Created] (TIKA-1220) Parser implementration for IFC files

2014-01-14 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1220: -- Summary: Parser implementration for IFC files Key: TIKA-1220 URL: https://issues.apache.org/jira/browse/TIKA-1220 Project: Tika Issue Type: New

[jira] [Updated] (TIKA-1220) Parser implementration for IFC files

2014-01-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1220: --- Attachment: 2012-03-23-Duplex-Programming.ifc Sample .ifc data model Parser

[jira] [Commented] (TIKA-1220) Parser implementration for IFC files

2014-01-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13871365#comment-13871365 ] Lewis John McGibbney commented on TIKA-1220: Hi [~gagravarr], bq. However

Tika VM Service

2014-04-08 Thread Lewis John Mcgibbney
Hi Folks, I would like to propose that we get a Tika service up and running on a VM. Tika users can do ad hoc parsing, etc and can do this based on possibly stable nightly SNAPSHOTs or alternatively based on the most recent stable release. Preferably, the service should provide a list of parsers

[jira] [Created] (TIKA-1272) tika-server version is incorrectly defined

2014-04-11 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1272: -- Summary: tika-server version is incorrectly defined Key: TIKA-1272 URL: https://issues.apache.org/jira/browse/TIKA-1272 Project: Tika Issue

[jira] [Updated] (TIKA-1272) tika-server version is incorrectly defined

2014-04-11 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1272: --- Attachment: TIKA-1272.patch Patch for trunk. tika-server version is incorrectly

[jira] [Created] (TIKA-1273) old tika-server jar artifact contains no manifest so not able to invoke from shell

2014-04-11 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1273: -- Summary: old tika-server jar artifact contains no manifest so not able to invoke from shell Key: TIKA-1273 URL: https://issues.apache.org/jira/browse/TIKA-1273

[jira] [Commented] (TIKA-1269) Self-hosted documentation for the JAX-RS Server

2014-04-12 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967583#comment-13967583 ] Lewis John McGibbney commented on TIKA-1269: Hi [~sergey_beryozkin] I also

[jira] [Commented] (TIKA-1269) Self-hosted documentation for the JAX-RS Server

2014-04-12 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967584#comment-13967584 ] Lewis John McGibbney commented on TIKA-1269: I've just taken a look

[jira] [Commented] (TIKA-1269) Self-hosted documentation for the JAX-RS Server

2014-04-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13968215#comment-13968215 ] Lewis John McGibbney commented on TIKA-1269: An alternative to miredot. http

[jira] [Commented] (TIKA-1287) Update NetCDF .jar file on Maven Central

2014-05-06 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13990649#comment-13990649 ] Lewis John McGibbney commented on TIKA-1287: This is not so much a bug

[jira] [Commented] (TIKA-1287) Update NetCDF .jar file on Maven Central

2014-05-13 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996417#comment-13996417 ] Lewis John McGibbney commented on TIKA-1287: +1... if it's required then one

Re: tika install fail on os x 10.9.2

2014-05-14 Thread Lewis John Mcgibbney
Hi Annie, On Mon, May 12, 2014 at 12:01 PM, dev-digest-h...@tika.apache.org wrote: snip... testiBooksParser(org.apache.tika.parser.ibooks.iBooksParserTest): Premature end of file. Tests run: 506, Failures: 0, Errors: 1, Skipped: 1 snip... Java version: 1.8.0_05, vendor: Oracle

[DISCUSS] Nightly Jenkins Builds for Trunk

2014-05-14 Thread Lewis John Mcgibbney
Hi Folks, Right now in Jenkins (builds.apache.org) we don't seem to have a Tika project directory which contains the trunk build... it is just a free standing project buried under the mountain of jobs currently running on that box. We also don't build Tika nightly... in fact AFAICT it has been

Re: [DISCUSS] Nightly Jenkins Builds for Trunk

2014-05-16 Thread Lewis John Mcgibbney
Hi Nick/Others, Please see link below for Tika trunk build on Oracle JDKs (latest) 6 and 7 respectively. We also have a now deprecated Tika trunk build which was doing zilch... we also have a currently disabled job configured to run with Oracle JDK8 (latest) when this becomes available to build

[jira] [Commented] (TIKA-1272) tika-server version is incorrectly defined

2014-05-19 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001725#comment-14001725 ] Lewis John McGibbney commented on TIKA-1272: Well, this issue is kinda strange

[jira] [Commented] (TIKA-894) Add webapp mode for Tika Server, simplifies deployment

2014-05-20 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003553#comment-14003553 ] Lewis John McGibbney commented on TIKA-894: --- Can someone assign this to me and I

[jira] [Closed] (TIKA-1272) tika-server version is incorrectly defined

2014-05-20 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed TIKA-1272. -- Thank you for checking this one over [~sergey_beryozkin]. Let's move on to bigger and better

Re: [IMPORTANT] INFRA-7751 - Create a VM for Apache Tika

2014-05-21 Thread Lewis John Mcgibbney
at 7:59 AM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi dev@, JanI from Infra has nearly provisioned us with a brand spanking new Ubuntu VM which we can use for the Tika service YAY Some things he requires first though... * - what is the external name used by users

[jira] [Created] (TIKA-1306) ClassCastException WARN [main] (COSDocument.java:303) - java.lang.ClassCastException: org.apache.pdfbox.cos.COSString cannot be cast to org.apache.pdfbox.cos.COSName in o

2014-05-21 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1306: -- Summary: ClassCastException WARN [main] (COSDocument.java:303) - java.lang.ClassCastException: org.apache.pdfbox.cos.COSString cannot be cast to org.apache.pdfbox.cos.COSName

[jira] [Commented] (TIKA-1258) Update NetCDF dependency

2014-06-02 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14015531#comment-14015531 ] Lewis John McGibbney commented on TIKA-1258: Someone can commit this patch

Re: Hello

2014-06-02 Thread Lewis John Mcgibbney
Hi Tyler, On Fri, May 30, 2014 at 8:55 AM, dev-digest-h...@tika.apache.org wrote: Thanks, Tim! I'm more of an IntelliJ guy myself. IDEA has a feature where you can check out a project directly from Subversion, which works pretty well. Eclipse also has this feature. Just for the heads up.

[jira] [Commented] (TIKA-1319) Translation

2014-06-04 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14017830#comment-14017830 ] Lewis John McGibbney commented on TIKA-1319: I think this is a nice (and pretty

[jira] [Commented] (TIKA-1303) Parsing Html page (not well formed) containing two title tags results in metadata (title) to be overwritten

2014-06-06 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020019#comment-14020019 ] Lewis John McGibbney commented on TIKA-1303: Hi [~hakram], can you possibly

[jira] [Commented] (TIKA-1303) Parsing Html page (not well formed) containing two title tags results in metadata (title) to be overwritten

2014-06-09 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025254#comment-14025254 ] Lewis John McGibbney commented on TIKA-1303: The code is integrated into trunk

Re: Timezone issue with TTF parser?

2014-06-09 Thread Lewis John Mcgibbney
+1 Can reproduce On Mon, Jun 9, 2014 at 11:41 AM, dev-digest-h...@tika.apache.org wrote: Subject: Re: Timezone issue with TTF parser? +1 Having the same issue. That test passed for me before the update. I'm on Pacific time, for what it's worth.

Re: Working on a new Translation plugin using Joshua

2014-06-18 Thread Lewis John Mcgibbney
Nice Chris. On Tue, Jun 17, 2014 at 5:59 PM, dev-digest-h...@tika.apache.org wrote: In the meanwhile I should have a review board patch up soon too for the JoshuaTranslator. I'll keep my eyes peeled for this one. Thanks Lewis

[jira] [Commented] (TIKA-1302) Let's run Tika against a large batch of docs nightly

2014-06-26 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044891#comment-14044891 ] Lewis John McGibbney commented on TIKA-1302: I would love to work

Re: Expected output

2014-06-28 Thread Lewis John Mcgibbney
Hi Kevin, On Fri, Jun 27, 2014 at 7:56 AM, dev-digest-h...@tika.apache.org wrote: Subject: Expected output Hello everyone. I have a question about the expected output for tika. I am working on integrating my python application with tika-server. One of the test files for unit test

Re: NASA's OCO-2 mission instrument processing system: OODT and Tika inside!

2014-07-03 Thread Lewis John Mcgibbney
Dynamite. Now to the next one http://snap.Jpl.NASA.gov On Jul 3, 2014 11:26 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: Hey Guys, Just as an FYI: the NASA OCO-2 mission successfully launched yesterday July 2, 2014, and is now in its in-orbit checkout phase and slowly

Re: NASA's OCO-2 mission instrument processing system: OODT and Tika inside!

2014-07-03 Thread Lewis John Mcgibbney
Bloody predictive text. Sorry folks. On Thu, Jul 3, 2014 at 12:03 PM, Ramirez, Paul M (398J) paul.m.rami...@jpl.nasa.gov wrote: Yep SMAP is using OODT big time in their ground data system. Minor correction http://smap.jpl.nasa.gov --Paul Ramirez On Jul 3, 2014, at 9:00 AM, Lewis John

Miredot License Key for Apache Tika Project

2014-07-18 Thread lewis john mcgibbney
Good Afternoon Sir/Madam, On behalf of the Apache Tika [0] project I am writing to enquire about the possibility of using Miredot to build, represent and communicate our REST API documentation to the large community of users and developers who use Tika in a plethora of applications across the

Re: Miredot License Key for Apache Tika Project

2014-07-22 Thread Lewis John Mcgibbney
Hi Yves, Thanks for the feedback on this one. On Fri, Jul 18, 2014 at 11:38 PM, Yves Vandewoude yves.vandewo...@miredot.com wrote: Due to the open source nature of Apache Tika, you are indeed granted free licence key(s) if you wish to use MireDot to document your REST API documentation. If

Re: Miredot License Key for Apache Tika Project

2014-07-23 Thread Lewis John Mcgibbney
. Kind Regards, Yves Lewis John Mcgibbney schreef op 22/07/2014 22:54: Hi Yves, Thanks for the feedback on this one. On Fri, Jul 18, 2014 at 11:38 PM, Yves Vandewoude yves.vandewo...@miredot.com wrote: Due to the open source nature of Apache Tika, you are indeed granted free licence

[jira] [Updated] (TIKA-1269) Self-hosted documentation for the JAX-RS Server

2014-07-23 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1269: --- Attachment: TIKA-1269-miredot.patch Patch for enabling recent free license from

[jira] [Commented] (TIKA-1269) Self-hosted documentation for the JAX-RS Server

2014-07-23 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072368#comment-14072368 ] Lewis John McGibbney commented on TIKA-1269: Hi [~gagravarr] yeah I have

[jira] [Commented] (TIKA-1269) Self-hosted documentation for the JAX-RS Server

2014-07-24 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073562#comment-14073562 ] Lewis John McGibbney commented on TIKA-1269: Yep I am on it right now. Patch

[jira] [Commented] (TIKA-1287) Update NetCDF .jar file on Maven Central

2014-07-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080203#comment-14080203 ] Lewis John McGibbney commented on TIKA-1287: [~annieburgess] bq. Please let me

Re: [VOTE] Release Apache Tika 1.6 RC #2

2014-09-03 Thread Lewis John Mcgibbney
Hi Folks, OK I started this two days ago... here I finish up. On Mon, Sep 1, 2014 at 9:39 AM, dev-digest-h...@tika.apache.org wrote: A candidate for the Tika 1.6 release is available at: http://people.apache.org/~mattmann/apache-tika-1.6/rc2/ So I check out all artifacts and all are

Re: [ANNOUNCE] Apache Tika 1.6 release

2014-09-08 Thread Lewis John Mcgibbney
Brilliant :) On Sun, Sep 7, 2014 at 1:28 PM, dev-digest-h...@tika.apache.org wrote: dev Digest 7 Sep 2014 20:28:53 - Issue 1052 Topics (messages 12880 through 12886) [ANNOUNCE] Apache Tika 1.6 release 12880 by: Chris Mattmann

Re: MediaTypeRegistry normalize query

2014-09-08 Thread Lewis John Mcgibbney
Hi Tom, On Sun, Sep 7, 2014 at 1:28 PM, dev-digest-h...@tika.apache.org wrote: now when parsing HTML files these days Tika adds the charset attribute to the string. Is this behaviour consistent with other MimeTypes? I would have thought the normalize call was designed to remove this

[jira] [Commented] (TIKA-1268) Extract images from PDF documents

2014-09-09 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127173#comment-14127173 ] Lewis John McGibbney commented on TIKA-1268: Was there ever a patch

[jira] [Commented] (TIKA-1268) Extract images from PDF documents

2014-09-10 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128534#comment-14128534 ] Lewis John McGibbney commented on TIKA-1268: They sure do it [~talli

[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-09-22 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143167#comment-14143167 ] Lewis John McGibbney commented on TIKA-1423: [~vinegh] do you have any

[jira] [Created] (TIKA-1425) Automatic batching of Microsoft service calls

2014-09-22 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1425: -- Summary: Automatic batching of Microsoft service calls Key: TIKA-1425 URL: https://issues.apache.org/jira/browse/TIKA-1425 Project: Tika Issue

[jira] [Updated] (TIKA-1425) Automatic batching of Microsoft service calls

2014-09-22 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1425: --- Description: Right now when I use the following code I get the stack trace

[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-09-22 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14144024#comment-14144024 ] Lewis John McGibbney commented on TIKA-1423: Excellent -- *Lewis* Build

[jira] [Commented] (TIKA-1220) Parser implementration for IFC files

2014-09-27 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14150810#comment-14150810 ] Lewis John McGibbney commented on TIKA-1220: Can you assign this to me please

Re: Apache Tika - JSON?

2014-09-28 Thread Lewis John Mcgibbney
Hi Vineet, On Sun, Sep 28, 2014 at 1:21 AM, dev-digest-h...@tika.apache.org wrote: I was wondering if there is any built-in parser to help in conversion from XHTML to JSON. My research showed that there is one named org.apache.io.json which has just one method implemented. Also, I tried GJSON

[jira] [Commented] (TIKA-1220) Parser implementration for IFC files

2014-09-28 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14151209#comment-14151209 ] Lewis John McGibbney commented on TIKA-1220: Dynamite [~davemeikle] some man

[jira] [Updated] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1423: --- Attachment: NLDAS_FORA0125_H.A20130112.1200.002.grb Here you go [~vinegh] Build

Tesseract OCR always activeated parser for images

2014-10-06 Thread Lewis John Mcgibbney
Hi Folks, Now, once I install Tesseract, it is run for every image I pass through Tika server or Tika app. This is not okay as it does not give me the type of MD I am looking for. This is just a note to folks, to say that AFAIK you would need to unregister the parser from [0] then rebuild

[jira] [Created] (TIKA-1438) PhoneExtractingContentHandler to not add individual MD entries for individual phone numbers

2014-10-08 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1438: -- Summary: PhoneExtractingContentHandler to not add individual MD entries for individual phone numbers Key: TIKA-1438 URL: https://issues.apache.org/jira/browse/TIKA

[jira] [Updated] (TIKA-1438) PhoneExtractingContentHandler to not add individual MD entries for individual phone numbers

2014-10-08 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1438: --- Attachment: TIKA-1438.patch Patch for trunk, actively validating it right now

[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-21 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178036#comment-14178036 ] Lewis John McGibbney commented on TIKA-1423: Hi [~vinegh] how is this coming

Re: Tika 1.6 update in Maven Central?

2014-10-21 Thread Lewis John Mcgibbney
Hi Chris, On Mon, Oct 20, 2014 at 11:37 PM, dev-digest-h...@tika.apache.org wrote: We do need to make a 1.7 release. I'd like to get TIKA-1422 fully working on Windows first. Any one of the other devs having things we should get into 1.7? I would very much like to see

[jira] [Assigned] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-21 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned TIKA-1423: -- Assignee: Lewis John McGibbney Build a parser to extract data from GRIB

[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-23 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182208#comment-14182208 ] Lewis John McGibbney commented on TIKA-1423: p.s. do you have a patch against

[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-23 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182204#comment-14182204 ] Lewis John McGibbney commented on TIKA-1423: Output looks fantastic, can you

[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-27 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185612#comment-14185612 ] Lewis John McGibbney commented on TIKA-1423: Hi [~vinegh], if you are working

[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191256#comment-14191256 ] Lewis John McGibbney commented on TIKA-1423: Hi [~vinegh], if you can submit

[jira] [Commented] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-31 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191943#comment-14191943 ] Lewis John McGibbney commented on TIKA-1423: Hi [~vinegh] I've commented

[jira] [Created] (TIKA-1465) Implement extraction of non-global variables from netCDF3 and netCDF4

2014-11-03 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created TIKA-1465: -- Summary: Implement extraction of non-global variables from netCDF3 and netCDF4 Key: TIKA-1465 URL: https://issues.apache.org/jira/browse/TIKA-1465

Re: Eclipse Mylyn Plugin uses Apache Tika!

2014-11-14 Thread Lewis John Mcgibbney
Lovely On Fri, Nov 14, 2014 at 5:38 AM, dev-digest-h...@tika.apache.org wrote: Hey Guys, Looks like Tika is now used in the Eclipse Mylyn plugin! http://planet.eclipse.org/planet/ http://blog.resheim.net/2014/11/epub-tools-in-mylyn-3-12.html That is super cool to know that every time

[jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser

2014-11-18 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217100#comment-14217100 ] Lewis John McGibbney commented on TIKA-1445: We can run many extractors against

[jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser

2014-11-18 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217407#comment-14217407 ] Lewis John McGibbney commented on TIKA-1445: OK so in Any23, if we were to take

Kill Buildbot Builds

2014-12-02 Thread Lewis John Mcgibbney
Hi Folks, Wanted to poll the dev@ list and see if the Buildbot builds at ci.apache.org are required? We have nightly and also hourly polling builds for Tika trunk against JDK 1.6 and 1.7. Failures and unstable builds are shadowed to dev@, so AFAIAC we are *covered* for CI builds including SNAPSHOT
