[jira] [Updated] (TIKA-1610) CBOR Parser and detection improvement

2015-04-21 Thread Luke sh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke sh updated TIKA-1610: -- Attachment: cbor_tika.mimetypes.xml.jpg rfc_cbor.jpg CBOR Parser and detection improvement

[jira] [Updated] (TIKA-1610) CBOR Parser and detection improvement

2015-04-21 Thread Luke sh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke sh updated TIKA-1610: -- Description: CBOR is a data format whose design goals include the possibility of extremely small code size,

[jira] [Created] (TIKA-1610) CBOR Parser and detection improvement

2015-04-21 Thread Luke sh (JIRA)
Luke sh created TIKA-1610: - Summary: CBOR Parser and detection improvement Key: TIKA-1610 URL: https://issues.apache.org/jira/browse/TIKA-1610 Project: Tika Issue Type: New Feature

[jira] [Updated] (TIKA-1610) CBOR Parser and detection improvement

2015-04-21 Thread Luke sh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke sh updated TIKA-1610: -- Description: CBOR is a data format whose design goals include the possibility of extremely small code size,

[jira] [Updated] (TIKA-1610) CBOR Parser and detection improvement

2015-04-21 Thread Luke sh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke sh updated TIKA-1610: -- Description: CBOR is a data format whose design goals include the possibility of extremely small code size,

[jira] [Updated] (TIKA-1610) CBOR Parser and detection improvement

2015-04-21 Thread Luke sh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke sh updated TIKA-1610: -- Description: CBOR is a data format whose design goals include the possibility of extremely small code size,

[jira] [Updated] (TIKA-1610) CBOR Parser and detection improvement

2015-04-21 Thread Luke sh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke sh updated TIKA-1610: -- Attachment: 142440269.html cbor file dumped by the nutch tool. CBOR Parser and detection improvement

[jira] [Updated] (TIKA-1610) CBOR Parser and detection [improvement]

2015-04-21 Thread Luke sh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke sh updated TIKA-1610: -- Summary: CBOR Parser and detection [improvement] (was: CBOR Parser and detection improvement) CBOR Parser and

[jira] [Resolved] (TIKA-1611) Allow RecursiveParserWrapper to catch exceptions from embedded documents

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1611. --- Resolution: Fixed r1675159. Nothing like testing to see behavior, rather than assumptions. :( Allow

[jira] [Updated] (TIKA-1611) Allow RecursiveParserWrapper to catch exceptions from embedded documents

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1611: -- Description: While parsing embedded documents, currently, if a parser hits an

[jira] [Commented] (TIKA-1612) Exceptions getting image data in PPT files

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505335#comment-14505335 ] Tim Allison commented on TIKA-1612: --- Not sure how we want to fix this. To make this

[jira] [Commented] (TIKA-1611) Allow RecursiveParserWrapper to catch exceptions from embedded documents

2015-04-21 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505358#comment-14505358 ] Hudson commented on TIKA-1611: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #639 (See

[jira] [Commented] (TIKA-879) Detection problem: message/rfc822 file is detected as text/plain.

2015-04-21 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505368#comment-14505368 ] Luis Filipe Nassif commented on TIKA-879: - Yes, thank you very much for testing with

[jira] [Commented] (TIKA-879) Detection problem: message/rfc822 file is detected as text/plain.

2015-04-21 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505367#comment-14505367 ] Luis Filipe Nassif commented on TIKA-879: - Yes, thank you very much for testing with

[jira] [Updated] (TIKA-1611) Allow RecursiveParserWrapper to catch exceptions from embedded documents

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1611: -- Description: While parsing embedded documents, currently, if a parser hits an Exception, the Exception

NUTCH-1994 and UCAR Dependencies

2015-04-21 Thread Lewis John Mcgibbney
Hi Folks, Whilst addressing NUTCH-1994, I've experienced a dependency problem (related to unpublished artifacts on Maven Central) which I am working through right now. When making the upgrade in Nutch, I get the following [ivy:resolve] -- artifact edu.ucar#udunits;4.5.5!udunits.jar:

Detection problem: Parsing scientific source codes for geoscientists

2015-04-21 Thread Oh, Ji-Hyun (329F-Affiliate)
Hi Tika friends, I am currently engaged in a project funded by National Science Foundation. Our goal is to develop a research-friendly environment where geoscientists, like me, can easily find source codes they need. According to a survey, scientists spend a considerable amount of their time

[jira] [Commented] (TIKA-1601) Integrate Jackcess to handle MSAccess files

2015-04-21 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505377#comment-14505377 ] Luis Filipe Nassif commented on TIKA-1601: -- Great! Give me 3 more days to submit

[jira] [Created] (TIKA-1612) Exceptions getting image data in PPT files

2015-04-21 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1612: - Summary: Exceptions getting image data in PPT files Key: TIKA-1612 URL: https://issues.apache.org/jira/browse/TIKA-1612 Project: Tika Issue Type: Bug

[jira] [Commented] (TIKA-1532) DIF Parser

2015-04-21 Thread Konstantin Gribov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504904#comment-14504904 ] Konstantin Gribov commented on TIKA-1532: - {{text/\*+xml}} is quite unusual type.

[jira] [Commented] (TIKA-1513) Add mime detection and parsing for dbf files

2015-04-21 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505057#comment-14505057 ] Luis Filipe Nassif commented on TIKA-1513: -- No, I did not give a try to 0x03. How

[jira] [Commented] (TIKA-1501) Fix the disabled Tika Bundle OSGi related unit tests

2015-04-21 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505051#comment-14505051 ] Hudson commented on TIKA-1501: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #638 (See

[jira] [Updated] (TIKA-1607) Introduce new arbitrary object key/values data structure for persitsence of Tika Metadata

2015-04-21 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1607: --- Summary: Introduce new arbitrary object key/values data structure for persitsence of

[jira] [Updated] (TIKA-1608) RuntimeException on extracting text from Word 97-2004 Document

2015-04-21 Thread Jeremy B. Merrill (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy B. Merrill updated TIKA-1608: Attachment: 1534-attachment.doc document failing under this bug RuntimeException on

[jira] [Commented] (TIKA-1608) RuntimeException on extracting text from Word 97-2004 Document

2015-04-21 Thread Jeremy B. Merrill (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505102#comment-14505102 ] Jeremy B. Merrill commented on TIKA-1608: - POI bug:

[jira] [Commented] (TIKA-1315) Basic list support in WordExtractor

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505008#comment-14505008 ] Tim Allison commented on TIKA-1315: --- Ha. Ok, but your patch is really well done. Let me

[jira] [Commented] (TIKA-1513) Add mime detection and parsing for dbf files

2015-04-21 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504996#comment-14504996 ] Luis Filipe Nassif commented on TIKA-1513: -- Hi Tim, I am ok with 1) and 2). But I

[jira] [Commented] (TIKA-1607) Introduce new HashMap<String, Object> data structure for persitsence of Tika Metadata

2015-04-21 Thread Ray Gauss II (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505054#comment-14505054 ] Ray Gauss II commented on TIKA-1607: We've had a few discussions on structured metadata

[jira] [Closed] (TIKA-1554) Improve EMF file detection

2015-04-21 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luis Filipe Nassif closed TIKA-1554. Resolution: Fixed Fix Version/s: 1.8 Resolved in r4608ff5. Thanks. Improve EMF file

[jira] [Commented] (TIKA-1608) RuntimeException on extracting text from Word 97-2004 Document

2015-04-21 Thread Jeremy B. Merrill (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505093#comment-14505093 ] Jeremy B. Merrill commented on TIKA-1608: - Hi Tim, I added the document. I'm

[jira] [Commented] (TIKA-1513) Add mime detection and parsing for dbf files

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505092#comment-14505092 ] Tim Allison commented on TIKA-1513: --- Completely agree. Only 2,386 files. This is the

[jira] [Updated] (TIKA-1608) RuntimeException on extracting text from Word 97-2004 Document

2015-04-21 Thread Jeremy B. Merrill (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy B. Merrill updated TIKA-1608: Description: Extracting text from the Word 97-2004 document attached here fails with the

[jira] [Commented] (TIKA-1607) Introduce new HashMap<String, Object> data structure for persitsence of Tika Metadata

2015-04-21 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504999#comment-14504999 ] Sergey Beryozkin commented on TIKA-1607: Hi, IMHO it indeed makes sense to keep

[jira] [Resolved] (TIKA-1501) Fix the disabled Tika Bundle OSGi related unit tests

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1501. --- Resolution: Fixed Fix Version/s: 1.9 r1675121. Thank you, [~bobpaulin]! Fix the disabled

[jira] [Created] (TIKA-1611) Allow RecursiveParserWrapper to catch exceptions from embedded documents

2015-04-21 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1611: - Summary: Allow RecursiveParserWrapper to catch exceptions from embedded documents Key: TIKA-1611 URL: https://issues.apache.org/jira/browse/TIKA-1611 Project: Tika

[jira] [Commented] (TIKA-1315) Basic list support in WordExtractor

2015-04-21 Thread Moritz Dorka (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505004#comment-14505004 ] Moritz Dorka commented on TIKA-1315: Well, the original patch by Filip is essentially

[jira] [Commented] (TIKA-1513) Add mime detection and parsing for dbf files

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505006#comment-14505006 ] Tim Allison commented on TIKA-1513: --- Y, I was concerned by that generally. Are you

[jira] [Commented] (TIKA-1513) Add mime detection and parsing for dbf files

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504951#comment-14504951 ] Tim Allison commented on TIKA-1513: --- From govdocs1, it looks like first byte of 0X03 is a

Re: [ANNOUNCE] Apache Tika 1.8 Released

2015-04-21 Thread Mattmann, Chris A (3980)
Yay thanks Tyler! ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email:

[jira] [Commented] (TIKA-1315) Basic list support in WordExtractor

2015-04-21 Thread Moritz Dorka (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505042#comment-14505042 ] Moritz Dorka commented on TIKA-1315: I believe I could speed up the process by

[jira] [Commented] (TIKA-1608) RuntimeException on extracting text from Word 97-2004 Document

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504871#comment-14504871 ] Tim Allison commented on TIKA-1608: --- [~jeremybmerrill], thank you for raising this issue.

RE: [ANNOUNCE] Apache Tika 1.8 Released

2015-04-21 Thread Allison, Timothy B.
Thank you, Tyler! -Original Message- From: Tyler Palsulich [mailto:tpalsul...@apache.org] Sent: Monday, April 20, 2015 5:09 PM To: dev@tika.apache.org; u...@tika.apache.org; annou...@apache.org Subject: [ANNOUNCE] Apache Tika 1.8 Released The Apache Tika project is pleased to announce

[jira] [Commented] (TIKA-1295) Make some Dublin Core items multi-valued

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504884#comment-14504884 ] Tim Allison commented on TIKA-1295: --- [~lewismc], +1 to adding potential for hierarchical

[jira] [Commented] (TIKA-1608) RuntimeException on extracting text from Word 97-2004 Document

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505113#comment-14505113 ] Tim Allison commented on TIKA-1608: --- In govdocs1, there are 24 of these: {noformat}

[jira] [Commented] (TIKA-879) Detection problem: message/rfc822 file is detected as text/plain.

2015-04-21 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505132#comment-14505132 ] Luis Filipe Nassif commented on TIKA-879: - Maybe we could keep the original magics

[jira] [Commented] (TIKA-1554) Improve EMF file detection

2015-04-21 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505172#comment-14505172 ] Luis Filipe Nassif commented on TIKA-1554: -- Actually r1667661 Improve EMF file

[jira] [Commented] (TIKA-1608) RuntimeException on extracting text from Word 97-2004 Document

2015-04-21 Thread Jeremy B. Merrill (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505178#comment-14505178 ] Jeremy B. Merrill commented on TIKA-1608: - It's the only one I've found so far out

Re: NUTCH-1994 and UCAR Dependencies

2015-04-21 Thread Tyler Palsulich
Hi Lewis, I also tried upgrading Tika in Nutch. But, ran into the same issue (but, udunits is found, as expected): [ivy:retrieve] :: [ivy:retrieve] :: UNRESOLVED DEPENDENCIES :: [ivy:retrieve]

[GitHub] tika pull request: add entry for cbor glob extension in the tika-m...

2015-04-21 Thread LukeLiush
GitHub user LukeLiush opened a pull request: https://github.com/apache/tika/pull/42 add entry for cbor glob extension in the tika-mimetypes.xml You can merge this pull request into a Git repository by running: $ git pull https://github.com/LukeLiush/tika cborExtension

[jira] [Commented] (TIKA-1601) Integrate Jackcess to handle MSAccess files

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505633#comment-14505633 ] Tim Allison commented on TIKA-1601: --- I don't. That's half the fun of a patch, right. :)

[jira] [Commented] (TIKA-1513) Add mime detection and parsing for dbf files

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506214#comment-14506214 ] Tim Allison commented on TIKA-1513: --- In looking at

[jira] [Assigned] (TIKA-1610) CBOR Parser and detection [improvement]

2015-04-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned TIKA-1610: --- Assignee: Chris A. Mattmann CBOR Parser and detection [improvement]

[jira] [Commented] (TIKA-1610) CBOR Parser and detection [improvement]

2015-04-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506359#comment-14506359 ] Chris A. Mattmann commented on TIKA-1610: - Applied Pull request #42 thanks

[GitHub] tika pull request: add entry for cbor glob extension in the tika-m...

2015-04-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/tika/pull/42 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

Re: NUTCH-1994 and UCAR Dependencies

2015-04-21 Thread Mattmann, Chris A (3980)
Thanks Lewis! ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email:

Re: [memex-jpl] this week action from luke

2015-04-21 Thread Chris Mattmann
Thanks Luke. So I guess all I was asking was could you try it out. Thanks for the lesson in the RFC. Cheers, Chris Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Luke hanson311...@gmail.com Date: Wednesday, April 22, 2015 at 1:46 AM To:

[jira] [Commented] (TIKA-1610) CBOR Parser and detection [improvement]

2015-04-21 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506414#comment-14506414 ] Hudson commented on TIKA-1610: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #640 (See

RE: [memex-jpl] this week action from luke

2015-04-21 Thread Luke
Hi professor, I think it highly depends on the content being read by tika, e.g. if there is a sequence of bytes in the file that is being read and is the same as one or more of mime types being defined in our tika-mimes.xml, I guess that tika will put those types in its estimation list,
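The idea in the message above, that every type whose magic bytes appear in the content becomes a candidate, can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not Tika's implementation: the `MAGICS` table, its priority values, and the `candidate_types` function are hypothetical names made up for this sketch. The only format facts assumed are the CBOR self-describe tag bytes (0xD9 0xD9 0xF7, RFC 7049) and the usual PDF and ZIP signatures.

```python
from typing import List, Tuple

# Hypothetical magic table (NOT the contents of tika-mimetypes.xml):
# (mime type, offset, signature bytes, priority)
MAGICS: List[Tuple[str, int, bytes, int]] = [
    ("application/cbor", 0, b"\xd9\xd9\xf7", 50),  # CBOR self-describe tag
    ("application/pdf", 0, b"%PDF-", 50),
    ("application/zip", 0, b"PK\x03\x04", 40),
]


def candidate_types(data: bytes) -> List[str]:
    """Return every type whose signature matches, highest priority first."""
    matches = [
        (priority, mime)
        for mime, offset, sig, priority in MAGICS
        if data[offset:offset + len(sig)] == sig
    ]
    # Stable sort: entries with equal priority keep their table order.
    matches.sort(key=lambda m: -m[0])
    # Fall back to a generic binary type when nothing matches.
    return [mime for _, mime in matches] or ["application/octet-stream"]
```

With this table, a buffer that starts with the three CBOR tag bytes yields `application/cbor` as its only candidate, and a buffer with no known signature falls back to the generic type.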

Re: Detection problem: Parsing scientific source codes for geoscientists

2015-04-21 Thread Nick Burch
On Tue, 21 Apr 2015, Oh, Ji-Hyun (329F-Affiliate) wrote: For the first step, I listed the file formats that are widely used in climate science. FORTRAN (.f, .f90, .f77) Python (.py) R (.R) Matlab (.m) GrADS (Grid Analysis and Display System) (.gs) NCL (NCAR Command Language) (.ncl) IDL

Re: NUTCH-1994 and UCAR Dependencies

2015-04-21 Thread Lewis John Mcgibbney
Hi Folks, OK, so the final part of this jigsaw is as follows I've requested a staging area [0] on Sonatype OSSRH to release the MIT licensed 3rd party bzip2 artifacts. I had to Mavenize the project. I will submit this patch to the bzip2 project and hopefully they will pull it in. If not then I

Re: Detection problem: Parsing scientific source codes for geoscientists

2015-04-21 Thread Lewis John Mcgibbney
Hi Ji-Hyun, On Tue, Apr 21, 2015 at 4:15 PM, dev-digest-h...@tika.apache.org wrote: FORTRAN (.f, .f90, .f77) Python (.py) R (.R) Matlab (.m) GrADS (Grid Analysis and Display System) (.gs) NCL (NCAR Command Language) (.ncl) IDL (Interactive Data Language) (.pro) NICE list I checked

Re: NUTCH-1994 and UCAR Dependencies

2015-04-21 Thread Lewis John Mcgibbney
Hi Folks, Update On Tue, Apr 21, 2015 at 10:50 AM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: [ivy:resolve] :: [ivy:resolve] :: edu.ucar#jj2000;5.2: not found [ivy:resolve] :: edu.ucar#udunits;4.5.5: not found

Re: NUTCH-1994 and UCAR Dependencies

2015-04-21 Thread Lewis John Mcgibbney
Patch for Mavenizing the bzip2 project https://code.google.com/p/jbzip2/issues/detail?id=3 Lewis On Tue, Apr 21, 2015 at 4:14 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi Folks, OK, so the final part of this jigsaw is as follows I've requested a staging area [0] on Sonatype

[jira] [Commented] (TIKA-879) Detection problem: message/rfc822 file is detected as text/plain.

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505269#comment-14505269 ] Tim Allison commented on TIKA-879: -- Y, will do. Results probably tomorrow. Detection

[jira] [Comment Edited] (TIKA-879) Detection problem: message/rfc822 file is detected as text/plain.

2015-04-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505269#comment-14505269 ] Tim Allison edited comment on TIKA-879 at 4/21/15 5:04 PM: --- Y,