[tika] 01/05: Add a test .sas7bdat file with labels, and generate the columnar/tabular test file in a few more formats

2018-05-10 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit d0324f8e4fa70fce67d56dc70f611f5535fe229b Author: Nick Burch AuthorDate: Wed May 9 18:19:34 2018 +0100 Add a test

[tika] 05/05: Remaining values to check

2018-05-10 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit cfd62569a8f6bf79ba5d15bb3f4063d49347c7fd Author: Nick Burch AuthorDate: Thu May 10 15:41:16 2018 +0100 Remaining

[tika] 03/05: CSV assert as best we can (no dedicated parser), start on XLS and SAS7BDAT consistency tests

2018-05-10 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 1d7a113cdecf64a97a349d8ff74cad1ecd9127d3 Author: Nick Burch AuthorDate: Thu May 10 13:48:03 2018 +0100 CSV assert as

[tika] 04/05: Check header contents, check data rows count, add XLSX test

2018-05-10 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 7f89db35d066e6c4ae35490c5bad67d376e5365e Author: Nick Burch AuthorDate: Thu May 10 15:13:43 2018 +0100 Check header

[tika] branch master updated: Handle .epub files using .htm rather than .html extensions for the embedded contents (TIKA-1288)

2018-05-09 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new a0ffec1 Handle .epub files using .htm rather than

[tika] branch master updated: Add explicit constructors to the ENVI parser, to allow config dump/load to work

2018-05-04 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new 856cf35 Add explicit constructors to the ENVI

[tika] 01/02: Clean up imports

2018-05-03 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 7d76112903777e909943fba2b50d9689801c6af9 Author: Nick Burch AuthorDate: Thu May 3 21:25:38 2018 +0100 Clean up imports

[tika] 02/02: Stub a unit test for TIKA-2641

2018-05-03 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit d4719f63ffb381dbbfc53e667379389cb26593c1 Author: Nick Burch AuthorDate: Thu May 3 21:56:07 2018 +0100 Stub a unit test

[tika] branch master updated (90720ae -> d4719f6)

2018-05-03 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from 90720ae SAS7BDAT html tests new 7d76112 Clean up imports new d4719f6 Stub a unit test for TIKA-2641 The 2

[tika] 02/02: SAS7BDAT html tests

2018-05-03 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 90720aed2836da7114f6495d61e12fd9af01d4fc Author: Nick Burch AuthorDate: Thu May 3 18:58:14 2018 +0100 SAS7BDAT html

[tika] 01/02: More SAS7BDAT metadata

2018-05-03 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 79f313d85ee558a2ba2e61477ec669ab0281dcb6 Author: Nick Burch AuthorDate: Thu May 3 16:52:27 2018 +0100 More SAS7BDAT

[tika] branch master updated (fb1a85b -> 90720ae)

2018-05-03 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from fb1a85b Merge pull request #235 from lewismc/TIKA-2639 new 79f313d More SAS7BDAT metadata new 90720ae

[tika] branch master updated (95d967b -> fb1a85b)

2018-05-02 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from 95d967b TIKA-2636 ENVI Header metadata fields can span more than one line add 48d2650 TIKA-2639 Update

[tika] 01/01: Merge pull request #235 from lewismc/TIKA-2639

2018-05-02 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit fb1a85ba1e5df0da997468b1da5c274a46011a2e Merge: 95d967b 48d2650 Author: Gagravarr AuthorDate: Wed May 2 11:41:23 2018 +0100

[tika] branch master updated: Some SAS7BDAT metadata and unit testing

2018-04-27 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new 15bcd74 Some SAS7BDAT metadata and unit testing

[tika] 02/06: Depend on Parso for SAS7BDAT support

2018-04-26 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit d036e95c8ee3e41272545efd58810a9bd8fa16c3 Author: Nick Burch AuthorDate: Thu Apr 26 22:24:10 2018 +0100 Depend on Parso

[tika] 04/06: TIKA-2462 Initial parser for SAS7BDAT files powered by Parso (now ASLv2). Still to do: Metadata, Unit Tests, Consistency with similar format tests

2018-04-26 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 754fb4c93b7229abc3512168228df2c269fb5274 Author: Nick Burch AuthorDate: Thu Apr 26 23:43:16 2018 +0100 TIKA-2462

[tika] branch master updated (84d64a7 -> 1520197)

2018-04-26 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from 84d64a7 For now, if there's a network problem grabbing dl4j's model, skip the test silently. Do the same thi

[tika] 05/06: XHTML improvements

2018-04-26 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 53b61739b57a6d8e7db081708b3acd00535f8e99 Author: Nick Burch AuthorDate: Fri Apr 27 00:06:21 2018 +0100 XHTML

[tika] 06/06: Changelog update

2018-04-26 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 152019715856d4f6f565850674e96100420f9359 Author: Nick Burch AuthorDate: Fri Apr 27 00:07:05 2018 +0100 Changelog

[tika] 01/06: Test Columnar files - SAS7BDAT and CSV (other spreadsheet+DB formats still required)

2018-04-26 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit c3f75fdd62503d95ab7bfbb74a1bde7d65cce44d Author: Nick Burch AuthorDate: Thu Apr 26 18:37:14 2018 +0100 Test Columnar

[tika] 03/06: Add parso to the OSGi bundle

2018-04-26 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 6562431ed9b75c821e68ce3e6675712ed0680724 Author: Nick Burch AuthorDate: Thu Apr 26 22:47:17 2018 +0100 Add parso to

svn commit: r1830103 - in /tika/site/src/site/apt/1.19: ./ configuring.apt detection.apt formats.apt gettingstarted.apt index.apt parser.apt parser_guide.apt

2018-04-25 Thread nick
Author: nick Date: Wed Apr 25 17:11:54 2018 New Revision: 1830103 URL: http://svn.apache.org/viewvc?rev=1830103&view=rev Log: As per the release guide, start the examples and format pages for 1.19 Added: tika/site/src/site/apt/1.19/ - copied from r1830102, tika/site/src/site/apt/

[tika] branch master updated: Serialise the details of multiple parsers

2018-04-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new c05c524 Serialise the details of multiple parsers

[tika] branch master updated (c8b9b44 -> 742e60e)

2018-04-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from c8b9b44 TIKA-2626 add aa3134d Begin implementing the Supplemental and Fallback parsers from https

[tika] branch multiple-parsers updated (26391d2 -> 742e60e)

2018-04-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git. from 26391d2 Ignore expected warnings, mark TODOs done add ee9e4f4 TIKA-2579 and TIKA-2607: Upgrade PDFBox

[tika] 01/01: Merge branch 'master' of https://github.com/apache/tika into multiple-parsers

2018-04-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 742e60e20363aa66db992df4d1cae26b18060f45 Merge: 26391d2 c8b9b44 Author: Nick Burch AuthorDate: Sun Apr 8 13:15:22

[tika] 02/05: Pass the params to the composite parser constructors

2018-04-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit be246f1d395007efc26e6e1648ed6dc2912c2efa Author: Nick Burch AuthorDate: Sun Apr 8 10:43:05 2018 +0100 Pass

[tika] branch multiple-parsers updated (d909b77 -> 26391d2)

2018-04-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git. from d909b77 Preserve old fallback decorator behaviour with explicit mime types override new 4665561 Support

[tika] 04/05: TikaConfig loading of Multiple Parsers with Policy

2018-04-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit f412d56218ffe21f65be4640aaf435c9896b3d16 Author: Nick Burch AuthorDate: Sun Apr 8 13:08:19 2018 +0100

[tika] 03/05: MultipleParser constructor that accepts Params

2018-04-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit dda50b6a82dcf45d38dc5a89cb60cc5ca82a01f6 Author: Nick Burch AuthorDate: Sun Apr 8 13:00:23 2018 +0100

[tika] 01/05: Support loading well known enum params

2018-04-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 4665561c883d3eab2a34d55edab64bb97977b4e1 Author: Nick Burch AuthorDate: Sun Apr 8 10:37:21 2018 +0100

[tika] 05/05: Ignore expected warnings, mark TODOs done

2018-04-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 26391d292df6ac001f812ab252659de30114bf8c Author: Nick Burch AuthorDate: Sun Apr 8 13:09:49 2018 +0100 Ignore

[tika] 02/05: Replace the old experimental Fallback ParserDecorator code with a call to the new FallbackParser

2018-04-04 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 4e7bdaf5b1cb6964075007a1f437b3d6fd08d139 Author: Nick Burch AuthorDate: Wed Apr 4 08:11:13 2018 +0100

[tika] 03/05: Mark the ContentHandlerFactory override parse method as still experimental, pending more feedback and a decision on returning a list of created handlers

2018-04-04 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 6fde37491d70ceefdc1181d7fa5d2d40be2720e4 Author: Nick Burch AuthorDate: Wed Apr 4 08:14:21 2018 +0100 Mark

[tika] 01/05: List of parsers can just be a Collection, does not require a list

2018-04-04 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit a830e20f544bb06bc5747cdf585f63243432677b Author: Nick Burch AuthorDate: Wed Apr 4 08:10:35 2018 +0100 List

[tika] 04/05: Fix XML to be valid

2018-04-04 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit a55a1de2017a7def07e2219d8ef0a718b62f622d Author: Nick Burch AuthorDate: Wed Apr 4 08:49:18 2018 +0100 Fix

[tika] branch multiple-parsers updated (54477aa -> d909b77)

2018-04-04 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git. from 54477aa Simplify stream resetting logic by using new ParserUtil methods for it new a830e20 List of

[tika] 05/05: Preserve old fallback decorator behaviour with explicit mime types override

2018-04-04 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit d909b7763689a70e0422f43282e4ee67978cd471 Author: Nick Burch AuthorDate: Wed Apr 4 08:49:43 2018 +0100

[tika] 02/02: Simplify stream resetting logic by using new ParserUtil methods for it

2018-03-21 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 54477aa3904944ca964e1f70188ac69b9f994042 Author: Nick Burch AuthorDate: Wed Mar 21 08:40:23 2018 +

[tika] branch multiple-parsers updated (6514a00 -> 54477aa)

2018-03-21 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git. from 6514a00 Merge commit '682c38d' into multiple-parsers new 5bd8b28 ParserUtils methods for ha

[tika] 01/02: ParserUtils methods for handling the reset/re-read of the stream

2018-03-21 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 5bd8b281c932d39aac4b6d7babe0732f853ed2ae Author: Nick Burch AuthorDate: Wed Mar 21 08:37:17 2018 +

[tika] branch master updated (1df10c3 -> 682c38d)

2018-03-21 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from 1df10c3 TIKA-2604 -- properly escape (or not) class path in windows and linux environments. add 682c38d TIKA

[tika] branch multiple-parsers updated (f80fc23 -> 6514a00)

2018-03-21 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git. from f80fc23 All obvious places that need changing have, alias back in the original name for compatibility

[tika] 01/01: Merge commit '682c38d' into multiple-parsers

2018-03-21 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 6514a007940840ef83b896c05a165aa45c9d4a6e Merge: f80fc23 682c38d Author: Nick Burch AuthorDate: Wed Mar 21 08:25

[tika] branch multiple-parsers updated: All obvious places that need changing have, alias back in the original name for compatibility

2018-03-19 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/multiple-parsers by this push: new f80fc23 All obvious places

[tika] 01/02: Remove un-used reference

2018-03-19 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit f1f30673fea0d08f138488154c27f2c2205dd14f Author: Nick Burch AuthorDate: Mon Mar 19 08:55:43 2018 +

[tika] 02/02: Fix test references to embedded exception property definition

2018-03-19 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 50f8591c6c87b756d6d8881c95f9794c51ffa240 Author: Nick Burch AuthorDate: Mon Mar 19 10:42:01 2018 + Fix

[tika] branch multiple-parsers updated (12a98b6 -> 50f8591)

2018-03-19 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git. from 12a98b6 Keep all implemented and unit test new f1f3067 Remove un-used reference new 50f8591 Fix

[tika] 01/04: Correct Metadata merging by policy, and get (incomplete) unit tests to pass

2018-03-14 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit d181c3c05a70163597dd40a9e1b140303ec2739f Author: Nick Burch AuthorDate: Wed Mar 14 16:53:26 2018 +

[tika] 03/04: Optionally use a new Handler for each Parser, if a factory was given

2018-03-14 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit c989d2f6aedcdf5375719ea1abe6ea136fc0e92c Author: Nick Burch AuthorDate: Wed Mar 14 17:28:07 2018 +

[tika] 02/04: Further unit tests

2018-03-14 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit e82f7ce5743536b406fd6971016f6f048f42b3f5 Author: Nick Burch AuthorDate: Wed Mar 14 17:15:27 2018 +

[tika] branch multiple-parsers updated (6a39214 -> 12a98b6)

2018-03-14 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git. from 6a39214 Some (currently failing) Supplemental Parser tests new d181c3c Correct Metadata merging by

[tika] 04/04: Keep all implemented and unit test

2018-03-14 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 12a98b63babc8515177d6f0e3df17ae8912142ee Author: Nick Burch AuthorDate: Wed Mar 14 17:35:01 2018 + Keep

[tika] 03/03: Some (currently failing) Supplemental Parser tests

2018-03-14 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 6a39214cc8303d393d0c5c288a973398d25a94c3 Author: Nick Burch AuthorDate: Wed Mar 14 07:01:36 2018 + Some

[tika] 02/03: Give parserCompleted the ParseContext, use that to pass around for the pick-best-text case what charsets to try next and what text we got from them

2018-03-14 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 62b02b0af4bb9260dc9417b5537144a0744fa55a Author: Nick Burch AuthorDate: Wed Mar 14 06:42:30 2018 + Give

[tika] branch multiple-parsers updated (348bfb9 -> 6a39214)

2018-03-14 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git. from 348bfb9 More metadata handling between parsers, start on unit testing new 819898f Start on a multiple

[tika] 01/03: Start on a multiple parser that would try several text encodings, pick the best and use that, to ensure it would be possible

2018-03-14 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 819898fbde33384844ebc6b2caa4e6c6986463cf Author: Nick Burch AuthorDate: Wed Mar 14 06:28:12 2018 + Start

[tika] 07/13: Pull common "Real Parser" identification logic out to utils

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit d229ab6f666cde8b007f568b13001a2c780ff477 Author: Nick Burch AuthorDate: Tue Mar 13 15:10:16 2018 + Pull

[tika] 08/13: Use utils for recording details of the parser used

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit f4a926ca94c50a6158891c7746e725cd720a2faa Author: Nick Burch AuthorDate: Tue Mar 13 15:13:19 2018 + Use

[tika] 02/13: Add TODOs for code to be shared/copied with other areas

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 62cf6f6cb3539ffbdb2886ff5485a997b0fe6773 Author: Nick Burch AuthorDate: Tue Mar 13 07:17:41 2018 + Add

[tika] 05/13: Prepare to track metadata between parsers

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 427417c5d17f1e03724f3e6ded64779bf7366677 Author: Nick Burch AuthorDate: Tue Mar 13 15:04:43 2018 +

[tika] 04/13: Pull out deep Metadata clone to a utils method for re-use

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit d5a06ba6d17b0846cfc58b2e3c0a3df6abc31b0c Author: Nick Burch AuthorDate: Tue Mar 13 15:02:31 2018 + Pull

[tika] 01/13: Name sample config files based on issue number

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 217a9cef62eae3bfdc23882f4483a00baea259fb Author: Nick Burch AuthorDate: Tue Mar 13 07:15:11 2018 + Name

[tika] 10/13: TODO updates, enforce allowed policies

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 9be93c6bef2eabfb5ea93f60549762a2510b2dce Author: Nick Burch AuthorDate: Tue Mar 13 17:03:50 2018 + TODO

[tika] 13/13: More metadata handling between parsers, start on unit testing

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 348bfb9be46036833bbfda38c1912c9bf9eeb06e Author: Nick Burch AuthorDate: Tue Mar 13 18:15:14 2018 + More

[tika] 12/13: Implement some metadata policies for merging values from multiple parsers

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit ee60f5e8ac4002cb6a296adc24cbcb7183cb1f8e Author: Nick Burch AuthorDate: Tue Mar 13 17:43:30 2018 +

[tika] 11/13: Bring over stream reset logic from ParserDecorator and update comments

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 82f6f5f6068d72b2afcb6c47840b9124554afdbf Author: Nick Burch AuthorDate: Tue Mar 13 17:12:34 2018 + Bring

[tika] 03/13: Ignore vim temp files

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 3555745fcbb6a8601dcd1af27a6a9ab07fa40250 Author: Nick Burch AuthorDate: Tue Mar 13 14:54:02 2018 +

[tika] branch multiple-parsers updated (bc8a75e -> 348bfb9)

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git. from bc8a75e Sample fallback and supplemental config files based on https://wiki.apache.org/tika

[tika] 09/13: Move logic for recording embedded parser failures in the metadata to utils, and use for multiple parsers

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit 97b97b345b49b7dd510af560598e6d1ab7baf28c Author: Nick Burch AuthorDate: Tue Mar 13 15:24:41 2018 + Move

[tika] 06/13: Fix exception handling

2018-03-13 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch multiple-parsers in repository https://gitbox.apache.org/repos/asf/tika.git commit c3897db807970e7eb39c87840e4e040713eb759c Author: Nick Burch AuthorDate: Tue Mar 13 15:06:42 2018 + Fix

[tika] branch master updated: Changelog update

2018-02-07 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new e0544ef Changelog update e0544ef is described

[tika] branch master updated: TIKA-2567 Make the Matlab single+no output function magic a bit more specific, to avoid false positives with JavaScript

2018-02-07 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new dbf35b6 TIKA-2567 Make the Matlab single+no output

[tika] 01/04: TIKA-2554 Separate out Makefile from text/plain to a specific subtype

2018-01-25 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit db75e85fc9cf0d5f2c25f7eae2ff8deb59611b00 Author: Nick Burch AuthorDate: Thu Jan 25 14:42:28 2018 + TIKA-2554

[tika] 02/04: TIKA-2554 Separate out Config formats from text/plain to a specific subtype

2018-01-25 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 1ba30ef32fee372566790650e9ab8a36bc9ab807 Author: Nick Burch AuthorDate: Thu Jan 25 14:46:24 2018 + TIKA-2554

[tika] 03/04: Another now-expected difference from HTTPD

2018-01-25 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 9b00c9300bb4a207b34daad4789bb389c7a6fc8b Author: Nick Burch AuthorDate: Thu Jan 25 15:12:18 2018 + Another now

[tika] 04/04: Resync with http://www.apache.org/dev/svn-eol-style.txt , adding new plain-text extensions from there

2018-01-25 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit d72ae53d2e1c767c1e5c6d150bb87b33829d10f0 Author: Nick Burch AuthorDate: Thu Jan 25 15:18:24 2018 + Resync with

[tika] branch master updated (3ce43ad -> d72ae53)

2018-01-25 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from 3ce43ad clean up test dependencies in tika-nlp new db75e85 TIKA-2554 Separate out Makefile from text/plain to a

[tika] branch master updated: Note on Java 7, and suggest new users just download the binaries

2018-01-23 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new e67f220 Note on Java 7, and suggest new users just

[tika] branch master updated (cadbc40 -> 7f6072c)

2018-01-10 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from cadbc40 AC3 magic detection tests add 6a398bd fix for TIKA-1191 contributed by BenRomberg new 7f6072c Merge

[tika] 01/01: Merge branch 'BenRomberg-TIKA-1191'

2018-01-10 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 7f6072ca26dccf1d791c426d4c32068535ddae8a Merge: cadbc40 6a398bd Author: Nick Burch AuthorDate: Thu Jan 11 06:46:21 2018

[tika] 03/04: Changelog update

2017-12-23 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 982003761bdadfc1dcf32b105800d59e5b622c83 Author: Nick Burch AuthorDate: Sat Dec 23 14:11:18 2017 + Changelog

[tika] 01/04: Test AC3 and EAC3 files, produced by ffmpeg from testWAV.wav

2017-12-23 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 1bddca4c382c40dc12a4f3ad674ecec08f9a4347 Author: Nick Burch AuthorDate: Sat Dec 23 14:05:26 2017 + Test AC3 and

[tika] 02/04: Mime magic for AC3 and EAC3 files

2017-12-23 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 10cd2055b43c6983aa4e7d95da53680355f39bac Author: Nick Burch AuthorDate: Sat Dec 23 14:05:46 2017 +0000 Mime magic for

[tika] branch master updated (700b38a -> cadbc40)

2017-12-23 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from 700b38a TIKA-1141 Few more well-known JS library headers new 1bddca4 Test AC3 and EAC3 files, produced by ffmpeg

[tika] 04/04: AC3 magic detection tests

2017-12-23 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit cadbc405519e5adbada1ddb6d2d4beff1f953072 Author: Nick Burch AuthorDate: Sat Dec 23 14:13:48 2017 +0000 AC3 magic

[tika] branch master updated: TIKA-1141 Few more well-known JS library headers

2017-12-21 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new 700b38a TIKA-1141 Few more well-known JS library

[tika] branch master updated: TIKA-1141 - There is no unique magic for JavaScript files, no matter how much we might like there to be... However, to avoid mis-detection, for a few common JS libraries

2017-12-21 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new 9769959 TIKA-1141 - There is no unique magic for

[tika] branch master updated: TIKA-2531 Unit test to ensure that, for Encrypted RAR files which we do not yet support, a helpful EncryptedDocumentException is thrown

2017-12-18 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new c2c73e9 TIKA-2531 Unit test to ensure that, for

[tika] 02/03: Have the iWorks 13 parser set the content type on the metadata if possible, otherwise remains no-op

2017-10-18 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 5c7547bac9208082920859a5040a8b9fa31da642 Author: Nick Burch AuthorDate: Wed Oct 18 14:59:35 2017 +0100 Have the iWorks

[tika] 03/03: Add notes on why we can't get the Numbers or Pages type just yet - need to call out to another library or decode the Document.iwa snappy stream ourselves

2017-10-18 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 0d92bc862c3c344d65d3f6c260b0f5ea4c389fc0 Author: Nick Burch AuthorDate: Wed Oct 18 15:50:59 2017 +0100 Add notes on

[tika] branch master updated (ad23d84 -> 0d92bc8)

2017-10-18 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from ad23d84 TIKA-2469 -- narrow mime detection for ms-owner files and add detection for nls files. new 17e4b66 A

[tika] 01/03: A dummy parser unit test for iWorks 13

2017-10-18 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 17e4b66410a3bca6c749dda8c49bdb41f3d1b609 Author: Nick Burch AuthorDate: Wed Oct 18 14:31:34 2017 +0100 A dummy parser

[tika] branch master updated: TIKA-2473 PCX and DCX mime magic and detection unit tests

2017-10-06 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new 450ab4b TIKA-2473 PCX and DCX mime magic and

[tika] branch master updated: Add test PCX and DCX files, generated by ImageMagick from the Test PNG file TIKA-2473

2017-10-06 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new abfca01 Add test PCX and DCX files, generated by

[tika] branch master updated (5b57ae4 -> 21c0f37)

2017-09-07 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from 5b57ae4 Merge branch 'gsoc17' add 21c0f37 PicturesSource has been copied to Apache POI, mark the class

[tika] 01/02: TIKA-2447 Inspired by the patch from Jan Burkhardt, do not bother fetching+keeping data from PSD sections we ignore

2017-08-24 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 587e4ae5b0a87e01315156115c8b88d056036f96 Author: Nick Burch AuthorDate: Thu Aug 24 17:09:35 2017 +0100 TIKA-2447

[tika] 02/02: Changelog

2017-08-24 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 930e677dcd4f228d7bcb4f233c1d2a58930ab5d3 Author: Nick Burch AuthorDate: Thu Aug 24 17:10:06 2017 +0100 Changelog

[tika] branch master updated (2c54f93 -> 930e677)

2017-08-24 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from 2c54f93 Changes update new 587e4ae TIKA-2447 Inspired by the patch from Jan Burkhardt, do not bother fetching
