[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098412#comment-15098412
 ] 

Tilman Hausherr commented on TIKA-1830:
---

Another possibility is that the change I mentioned has different implications 
depending on which JDK is used. Btw, these files don't produce errors with the 
non-sequential parser.

> Upgrade to PDFBox 1.8.11 when available
> ---
>
> Key: TIKA-1830
> URL: https://issues.apache.org/jira/browse/TIKA-1830
> Project: Tika
>  Issue Type: Improvement
>Reporter: Tim Allison
> Attachments: reports_pdfbox_1_8_11-rc1.zip
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096866#comment-15096866
 ] 

Tilman Hausherr edited comment on TIKA-1830 at 1/14/16 5:05 PM:


I can't reproduce the difference for the file 074531.pdf. ExtractText returns 
identical results, which makes me doubt the entire test :-(

(edit: also 362980.pdf, 058103.pdf, and 760707.pdf )

I can reproduce the difference for 290377.pdf; this is because of a change in 
decompression (rev 1709182) that tries to squeeze as much as possible from 
corrupt streams.

There may be some differences due to a bugfix related to "article beads". This 
will mean improved results for files with correct beads, but worse results for 
files where bead rectangles are incorrect.





[jira] [Comment Edited] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098465#comment-15098465
 ] 

Tim Allison edited comment on TIKA-1830 at 1/14/16 5:50 PM:


Y, 074531.pdf has uncovered a Tika issue.  I can reproduce the exception with 
{{Tika.getInputStream()}}, but there is no problem if I call {{new 
FileInputStream}} or {{Files.newInputStream()}}.

Were there any changes to stream manipulation...mark/reset etc in 1.8.11 vs 
1.8.10? 
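For readers unfamiliar with the mark/reset behaviour the question refers to, here is a minimal JDK-only sketch. It does not reproduce the Tika or PDFBox code paths and nothing in this thread establishes that mark/reset is the cause; the class and method names (`MarkResetCheck`, `rereadsAfterReset`, `compareFileStreams`) are hypothetical. It only illustrates that a plain `FileInputStream` does not support mark/reset while a `BufferedInputStream` wrapper does, which is one way two streams over the same file can behave differently for a parser.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class MarkResetCheck {

    /** Reads up to n bytes, stopping early only at end of stream. */
    static byte[] readUpTo(InputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int off = 0;
        while (off < n) {
            int r = in.read(buf, off, n - off);
            if (r < 0) {
                break;
            }
            off += r;
        }
        return Arrays.copyOf(buf, off);
    }

    /** Returns true only if the stream supports mark/reset and re-reads the same bytes. */
    static boolean rereadsAfterReset(InputStream in, int n) throws IOException {
        if (!in.markSupported()) {
            return false;
        }
        in.mark(n);
        byte[] first = readUpTo(in, n);
        in.reset();
        byte[] second = readUpTo(in, n);
        return Arrays.equals(first, second);
    }

    /** Returns { result for a plain FileInputStream, result for a buffered wrapper }. */
    static boolean[] compareFileStreams() {
        try {
            Path tmp = Files.createTempFile("markreset", ".bin");
            try {
                // Arbitrary sample bytes; not a real PDF.
                Files.write(tmp, "%PDF-1.4 sample bytes".getBytes(StandardCharsets.US_ASCII));
                boolean plain;
                boolean buffered;
                try (InputStream in = new FileInputStream(tmp.toFile())) {
                    plain = rereadsAfterReset(in, 8);
                }
                try (InputStream in = new BufferedInputStream(new FileInputStream(tmp.toFile()))) {
                    buffered = rereadsAfterReset(in, 8);
                }
                return new boolean[] { plain, buffered };
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = compareFileStreams();
        System.out.println("FileInputStream re-reads after reset: " + r[0]);
        System.out.println("BufferedInputStream re-reads after reset: " + r[1]);
    }
}
```

A check like this can help rule stream wrapping in or out when the same file parses through one stream type but fails through another.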

I confirmed that {{Tika.getInputStream}} works with 1.8.10 but not 
1.8.11...interesting...

Also confirmed that this problem does not happen in 1.8.11 with the 
NonSequentialParser...only with the classic parser.





[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098515#comment-15098515
 ] 

Tim Allison commented on TIKA-1830:
---

Doh. Right.  Thank you.



[jira] [Comment Edited] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098465#comment-15098465
 ] 

Tim Allison edited comment on TIKA-1830 at 1/14/16 5:37 PM:


Y, 074531.pdf has uncovered a Tika issue.  I can reproduce the exception with 
{{Tika.getInputStream()}}, but there is no problem if I call {{new 
FileInputStream}} or {{Files.newInputStream()}}.

Were there any changes to stream manipulation...mark/reset etc in 1.8.11 vs 
1.8.10? 

I confirmed that {{Tika.getInputStream}} works with 1.8.10 but not 
1.8.11...interesting...





[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098465#comment-15098465
 ] 

Tim Allison commented on TIKA-1830:
---

Y, 074531.pdf has uncovered a Tika issue.  I can reproduce the exception with 
Tika.getInputStream(), but there is no problem if I call new FileInputStream or 
Files.newInputStream().



[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098503#comment-15098503
 ] 

Tilman Hausherr commented on TIKA-1830:
---

Not that, but the change I mentioned
https://svn.apache.org/viewvc?view=revision=date=1709182
may play a role.



RE: Tika questions on StackOverflow

2016-01-14 Thread Nick Burch

On Wed, 13 Jan 2016, Allison, Timothy B. wrote:

> Are there other consumer lists we should be following?  Elastic Search?


I think Elastic Search only has a forum-type thingy; this should probably 
let you see Tika posts there (not that frequent)

https://discuss.elastic.co/search?q=tika%20category%3A6%20order%3Alatest

Otherwise Alfresco, Nutch and StormCrawler are probably the next biggest 
open source users, I guess?


Nick


[jira] [Commented] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules

2016-01-14 Thread Nick Burch (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097884#comment-15097884
 ] 

Nick Burch commented on TIKA-1824:
--

Tika already supports using a custom classloader for loading parser + detector 
classes + spi files - 
http://tika.apache.org/1.11/api/org/apache/tika/config/TikaConfig.html#TikaConfig%28java.lang.ClassLoader%29

> Tika 2.0 -  Create Initial Parser Modules
> -
>
> Key: TIKA-1824
> URL: https://issues.apache.org/jira/browse/TIKA-1824
> Project: Tika
>  Issue Type: Improvement
>Affects Versions: 2.0
>Reporter: Bob Paulin
>Assignee: Bob Paulin
>
> Create initial break down of parser modules.





[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098112#comment-15098112
 ] 

Tim Allison commented on TIKA-1830:
---

Argh...I'll rerun the 1.8.10 batch and see what we get.



[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098427#comment-15098427
 ] 

Tim Allison commented on TIKA-1830:
---

I just tested casting a null object that started life as a null String, and it 
seems not to throw an NPE.
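A minimal sketch of that kind of check, using only the JDK. `Token` is a hypothetical stand-in, not a PDFBox class, and this does not reproduce the code at BaseParser.java:1077 or explain the stack trace in this issue. It only shows the general Java behaviour: casting a null reference succeeds, and an NPE appears only when the null result is later dereferenced.

```java
import java.util.ArrayList;
import java.util.List;

public class NullCastCheck {

    /** Hypothetical stand-in for the element type; not a PDFBox class. */
    static class Token {
        private final int value;

        Token(int value) {
            this.value = value;
        }

        int intValue() {
            return value;
        }
    }

    /** Casting a null reference to another reference type yields null without throwing. */
    static boolean castOfNullSucceeds() {
        Object o = (String) null; // starts life as a null String
        Token t = (Token) o;      // neither ClassCastException nor NPE
        return t == null;
    }

    /** Removing a null element and casting it is fine; dereferencing the result is not. */
    static boolean dereferenceAfterCastThrowsNpe() {
        List<Object> po = new ArrayList<>();
        po.add(null);
        Token number = (Token) po.remove(po.size() - 1); // the cast itself is safe
        try {
            number.intValue(); // the NPE is raised here, not at the cast
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("cast of null succeeds: " + castOfNullSucceeds());
        System.out.println("dereference after cast throws NPE: " + dereferenceAfterCastThrowsNpe());
    }
}
```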

This is probably a Tika issue.  I can replicate the exception via our 
commandline but the file works fine in our GUI...bizarre...  More digging...



[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098393#comment-15098393
 ] 

Tim Allison commented on TIKA-1830:
---

Finished the rerun...and the results look the same.

Question: On PDFBOX-3193, you've set affected versions to 1.8.10 and 1.8.11.  
Are you sure that that affects 1.8.10?  The discovery of that wouldn't have 
happened unless I was actually running 1.8.11. 

In 1.8.10, 074531.pdf has ~30k words.  When I run 1.8.11 as a unit test within 
our PDFParser wrapper, I also get ~30k words.  However, when I rerun our batch 
wrapper around 1.8.11 on this file, I get the same exception in a rerun as I 
did in the original run (reported in the reports attached yesterday).

The exception is:

{noformat}
java.lang.NullPointerException
at org.apache.pdfbox.pdfparser.BaseParser.parseCOSArray(BaseParser.java:1077)
at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:1275)
at org.apache.pdfbox.pdfparser.BaseParser.parseCOSArray(BaseParser.java:1066)
at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:276)
at org.apache.pdfbox.pdfparser.PDFStreamParser.access$000(PDFStreamParser.java:49)
at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:193)
at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:205)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:256)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:236)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:216)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:471)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:395)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:354)
at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:148)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:148)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
{noformat}

I get the same exception when I run this in our batch code with 1 consumer or 
10 consumers...so it isn't a multithreading issue...will dig some more.

As a side note, I thought I wasn't comparing contents if there was an exception 
in one of the files...I need to fix my SQL to make sure this is the case.




[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098401#comment-15098401
 ] 

Tilman Hausherr commented on TIKA-1830:
---

{quote}
On PDFBOX-3193, you've set affected versions to 1.8.10 and 1.8.11. Are you sure 
that that affects 1.8.10? The discovery of that wouldn't have happened unless I 
was actually running 1.8.11. 
{quote}
Indeed, sorry.



[jira] [Comment Edited] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098401#comment-15098401
 ] 

Tilman Hausherr edited comment on TIKA-1830 at 1/14/16 5:02 PM:


{quote}
On PDFBOX-3193, you've set affected versions to 1.8.10 and 1.8.11. Are you sure 
that that affects 1.8.10? The discovery of that wouldn't have happened unless I 
was actually running 1.8.11. 
{quote}
Indeed, sorry. Fixed.





[jira] [Commented] (TIKA-1830) Upgrade to PDFBox 1.8.11 when available

2016-01-14 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098418#comment-15098418
 ] 

Tilman Hausherr commented on TIKA-1830:
---

The line at {{BaseParser.java:1077}} is
{code}
COSInteger number = (COSInteger)po.remove( po.size() -1 );
{code}
{{po}} is never null; it is created earlier. Or would there be an NPE if 
{{po.remove}} returns null?



WMF extraction

2016-01-14 Thread Andreas Beeker
Hi,

POI will have a WMF module (org.apache.poi.hwmf.*) in the next beta.
Looking over the govdocs collection, those embedded WMFs might contain
interesting information for TIKA.

Although my main goal is to integrate the rendering for common sl,
it shouldn't be too laborious to provide something afterwards.

Should the output be part of the embedding document, e.g. ppt, or
does it make sense to crawl over various extensions and extract those metadata
separately?
(I haven't checked how the parsers are called, so this might be nonsense ...)

Andi


[GitHub] tika pull request: tika_2.x

2016-01-14 Thread kulkarniachyut
GitHub user kulkarniachyut opened a pull request:

https://github.com/apache/tika/pull/70

tika_2.x

test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/tika 2.x

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tika/pull/70.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #70


commit 64f606e1d5154154673df62e2067e35ee5026087
Author: Bob Paulin 
Date:   2015-12-01T03:47:16Z

Created 2.x Branch

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1717371 
13f79535-47bb-0310-9956-ffa450edef68

commit 1034ba7605d37bc80839282852f3569ee08346f3
Author: Bob Paulin 
Date:   2015-12-01T04:08:25Z

Moved Versions to 2.0-SNAPSHOT

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1717374 
13f79535-47bb-0310-9956-ffa450edef68

commit 04bd7e34d26b25f5108cab4683acc2b60d96b848
Author: Nick Burch 
Date:   2015-12-01T23:58:32Z

Change the default LoadErrorHandler for Tika 2.x to be warn (TIKA-1805)

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1717557 
13f79535-47bb-0310-9956-ffa450edef68

commit f49f155a2bd5688a5c88b18bf19d9b7a2c9bd1de
Author: Nick Burch 
Date:   2015-12-02T00:33:37Z

Change what CLIRR checks against - we expect breakages vs Tika Core 1.0, 
that is why it is 2.0!

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1717559 
13f79535-47bb-0310-9956-ffa450edef68

commit cd50cd1843f12c1fc08af26d4ba5cf6d19c3452f
Author: Nick Burch 
Date:   2015-12-02T00:33:41Z

TIKA-1805 Notify via LoadErrorHandler if DefaultParser or DefaultDetector 
could not find any implementations of their service classes

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1717560 
13f79535-47bb-0310-9956-ffa450edef68

commit 5ab8e2b3e5fb66d5f1c38c6536742dfbe71d564d
Author: Bob Paulin 
Date:   2015-12-03T23:48:49Z

TIKA-1807 - Adding PAX-Exam to parent to allow standard test framework 
versions.

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1717881 
13f79535-47bb-0310-9956-ffa450edef68

commit f28173035134ebba1e1696cfb225f53b258e8af6
Author: Bob Paulin 
Date:   2015-12-13T19:34:17Z

TIKA-1809 Enhanced Tika OSGi Service with test for core bundle.

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1719820 
13f79535-47bb-0310-9956-ffa450edef68

commit cc4a2bb1a924336d0fd7f79013bcab64588c8d13
Author: Bob Paulin 
Date:   2015-12-13T19:35:33Z

TIKA-1810 - Tika Parser Module Parent POM

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1719822 
13f79535-47bb-0310-9956-ffa450edef68

commit db9c6c28985b468af9ee8128e6402f9842f88048
Author: Bob Paulin 
Date:   2015-12-13T19:37:44Z

TIKA-1811 - Tika Multimedia Module

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1719824 
13f79535-47bb-0310-9956-ffa450edef68

commit 19c8259be0fbb38e2953a97ecf300da76bb2bfab
Author: Bob Paulin 
Date:   2015-12-13T19:40:03Z

TIKA-1811 - Fixed ignores for svn.

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1719825 
13f79535-47bb-0310-9956-ffa450edef68

commit b3c979bf9f29880882a3be25b69962f833cbf002
Author: Bob Paulin 
Date:   2015-12-14T01:32:23Z

TIKA-1810 - Added tika parser modules to the parent pom

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1719854 
13f79535-47bb-0310-9956-ffa450edef68

commit 00b2b9e97ace292aa41cc95203b2306549694a71
Author: Bob Paulin 
Date:   2015-12-28T23:10:16Z

TIKA-1818 - Decouple test documents from parsers so they can be reused.

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1722027 
13f79535-47bb-0310-9956-ffa450edef68

commit a43670685274490d2ff7424e8ae0c3ea1bd29c93
Author: Bob Paulin 
Date:   2015-12-28T23:14:03Z

TIKA-1818 - Added ignores to test project.

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1722028 
13f79535-47bb-0310-9956-ffa450edef68

commit 016c52fdaaf7250f6b762e8356447598bcd873a4
Author: Bob Paulin 
Date:   2015-12-28T23:22:46Z

TIKA-1812 - Moving multimedia sources to module.

git-svn-id: https://svn.apache.org/repos/asf/tika/branches/2.x@1722029 
13f79535-47bb-0310-9956-ffa450edef68

commit f7109c58b744beca7b579920369a0588def6dde9
Author: Bob Paulin 
Date:   2015-12-28T23:26:00Z

TIKA-1812 - Copying the multimedia module classes back into tika-parsers 
with the maven shade plugin.  This will allow creation of an uber jar.

git-svn-id: 

[GitHub] tika pull request: tika_2.x

2016-01-14 Thread kulkarniachyut
GitHub user kulkarniachyut closed the pull request at:

https://github.com/apache/tika/pull/70


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---