Tika Api consumes given stream

2014-11-12 Thread Runomu
I use the Apache Tika bundle dependency in a project to determine MIME types
for files. Due to some issues we have to detect the type from an InputStream.
Detection is supposed to be guaranteed to mark and reset the given
InputStream. Tika-Bundle includes the core and parser APIs and uses
POIFSContainerDetector, ZipContainerDetector, OggDetector, MimeTypes and Magic
for detection. I have been debugging for 3 hours, and all of the detectors
mark and reset the stream after detection. I did it in the following way.

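import org.apache.tika.config.TikaConfig;
import org.apache.tika.io.TikaInputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MediaType;

// 'in' is the caller's InputStream; 'tikaDetector' (an
// org.apache.tika.detect.Detector), 'mimeTypes' and 'logger' are declared
// elsewhere in the enclosing class.
TikaInputStream tis = null;
try {
    TikaConfig config = new TikaConfig();
    tikaDetector = config.getDetector();
    // Wrap the caller's stream for detection.
    tis = TikaInputStream.get(in);
    MediaType mediaType = tikaDetector.detect(tis, new Metadata());

    if (mediaType != null) {
        String[] types = mediaType.toString().split(",");

        for (int i = 0; i < types.length; i++) {
            mimeTypes.add(new MimeType(types[i]));
        }
    }

} catch (Exception e) {
    logger.error("Mime Type for given Stream could not be resolved: ", e);
}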

But the stream is consumed. Does anyone know how to determine MIME types
without consuming the stream?
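
One possible explanation, offered as an assumption rather than something
confirmed in this thread: the detectors mark and reset the wrapper (tis), not
the original stream (in). A buffering wrapper has to pull bytes out of the
underlying stream to fill its own buffer, so after reset() the wrapper is
rewound while the underlying stream stays advanced. The JDK-only sketch below
reproduces that effect with BufferedInputStream standing in for the wrapper;
the class and method names are hypothetical and no Tika API is involved.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class WrapperMarkResetDemo {

    /** Reads up to n bytes through the wrapper, then rewinds the wrapper. */
    static byte[] peek(InputStream wrapper, int n) throws IOException {
        wrapper.mark(n);
        byte[] head = new byte[n];
        int read = wrapper.read(head, 0, n);
        wrapper.reset(); // rewinds the wrapper only, not the stream it wraps
        return Arrays.copyOf(head, Math.max(read, 0));
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "%PDF-1.4 example payload".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new ByteArrayInputStream(data);   // the caller's stream
        InputStream wrapper = new BufferedInputStream(in); // stand-in for the wrapper

        byte[] head = peek(wrapper, 4);
        System.out.println("peeked: " + new String(head, StandardCharsets.US_ASCII));
        // The wrapper filled its buffer from 'in', so 'in' has moved forward
        // even though the wrapper itself was reset.
        System.out.println("bytes left in original stream: " + in.available());
        System.out.println("bytes left in wrapper: " + wrapper.available());
    }
}
```

If this is what is happening here, reading from tis instead of in after
detection should still see the complete content.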








--
View this message in context: 
http://lucene.472066.n3.nabble.com/Tika-Api-consumes-given-stream-tp4168960.html
Sent from the Apache Tika - Development mailing list archive at Nabble.com.


[jira] [Commented] (TIKA-1471) OOM with corrupt PDF file

2014-11-12 Thread Alan Burlison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208383#comment-14208383
 ] 

Alan Burlison commented on TIKA-1471:
-

Running a separate indexer JVM would be safer but up until now I haven't had 
anything that causes fatal errors. I already have to spawn ps2ascii 
(ghostscript) sub-processes for Postscript files as PDFBox doesn't cope with 
some of the older ones in the corpus and the impact on indexing time is 
significant, so I want to do as much as possible from within the same JVM.

bq. I wonder if PDFBOX-2200/TIKA-1424 is the culprit for the memory leak you 
mention.

Adding the workaround from TIKA-1424 (calling 
org.apache.pdfbox.pdmodel.font.PDFont.clearResources) does seem to help a bit, 
but I'm wary of calling a static method that affects global state while 
multiple threads are running. I'm therefore just going to call it at the end of 
each index run. Runs are normally incremental, so only the initial index build 
reads the whole corpus. Although memory usage is approximately 4 GB after a 
full reindex, I can just restart the appserver if necessary.

Thanks for the helpful hints and tips :-)

> OOM with corrupt PDF file
> -------------------------
>
>                 Key: TIKA-1471
>                 URL: https://issues.apache.org/jira/browse/TIKA-1471
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.6
>         Environment: Linux, JVM 1.8.0_25-b17, 64-bit
>            Reporter: Alan Burlison
>            Priority: Blocker
>             Fix For: 1.7
>
>
> Use of PDFBox 1.8.6 by Tika 1.6 is causing OOM errors with corrupt PDF files, 
> due to a bug in PDFBox, see PDFBOX-2493. This makes Tika 1.6 unusable from 
> inside a long-running webapp and I've had to revert to Tika 1.5. Although 1.5 
> also throws errors with the corrupt file it does not cause OOM errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TIKA-1471) OOM with corrupt PDF file

2014-11-12 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208100#comment-14208100
 ] 

Tim Allison commented on TIKA-1471:
---

Ah, thank you for sharing this use case. The first step for tika-batch is disk 
to disk, but if there are other common use cases, we should add those (a more 
robust tika-server, for example). I've found that a separate JVM for Tika alone 
(despite the added storage) is the most robust way to handle large batches of 
potentially dangerous files; keep Tika in a separate JVM from the indexer or 
the next step in processing.

Right, I had forgotten to mention memory leaks as one of the things integrators 
have to deal with.  Thank you.

I wonder if PDFBOX-2200/TIKA-1424 is the culprit for the memory leak you 
mention.



[jira] [Commented] (TIKA-1471) OOM with corrupt PDF file

2014-11-12 Thread Alan Burlison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208089#comment-14208089
 ] 

Alan Burlison commented on TIKA-1471:
-

In my case I'm using Tika to extract text from a corpus of around 350,000 
documents, many of which are attachments to emails that I'm in turn handling 
with JavaMail. I therefore don't have an on-disk representation of many of the 
documents, so doing all the processing inside the same JVM makes life a little 
easier. To keep performance reasonable I'm also using a thread pool, with each 
thread holding a Tika instance that is reused for many (tens of thousands of) 
documents. During a full re-index, memory use creeps inexorably upwards, but as 
I destroy the thread pool after each indexing run the memory is reclaimed. I'm 
guessing that one or more of the components that Tika uses is a bit tardy in 
releasing memory.
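
A minimal sketch of the arrangement described above, assuming a fixed-size
pool that is created for one indexing run and shut down afterwards. Extractor
and ExtractorFactory are hypothetical stand-ins for a per-thread Tika
instance; this is not the reporter's actual code. Once the pool's threads
terminate, their thread-local extractors become unreachable and can be
garbage collected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class PerRunPoolSketch {

    /** Hypothetical stand-in for a per-thread Tika instance. */
    interface Extractor {
        String extract(byte[] document) throws Exception;
    }

    /** Hypothetical factory; a real one might construct a new Tika facade. */
    interface ExtractorFactory {
        Extractor create();
    }

    /** Runs one indexing pass with a pool that lives only for this pass. */
    static List<String> indexRun(List<byte[]> documents, ExtractorFactory factory, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        // One extractor per worker thread, created lazily and reused for every
        // document that thread handles.
        ThreadLocal<Extractor> perThread = ThreadLocal.withInitial(factory::create);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (byte[] doc : documents) {
                pending.add(pool.submit(() -> perThread.get().extract(doc)));
            }
            List<String> texts = new ArrayList<>();
            for (Future<String> f : pending) {
                texts.add(f.get()); // results come back in submission order
            }
            return texts;
        } finally {
            // Destroying the pool ends its threads and releases their extractors.
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }

    public static void main(String[] args) throws Exception {
        List<byte[]> docs = Arrays.asList("a".getBytes(), "bb".getBytes());
        // Toy extractor that only reports the document length.
        System.out.println(indexRun(docs, () -> d -> "len=" + d.length, 2));
    }
}
```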



[jira] [Updated] (TIKA-1469) Upgrade to POI 3.11-beta3 when available

2014-11-12 Thread Tim Allison (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated TIKA-1469:
--
Attachment: Upgrade_to_poi-3_11-beta3v1.patch

This doesn't fix the bundle issues, but this should be a good start.

[~gagravarr], do we need to add a dependency on ooxml-security?

Any changes I'd make to the bundle would be dangerous given my lack of OSGi 
knowledge, but if the solution is to make optional anything that it can't find, 
then these work:
{noformat}

  org.apache.jcp.xml.dsig.internal.dom;resolution:=optional,

  org.apache.xml.security;resolution:=optional,
  org.apache.xml.security.c14n;resolution:=optional,
  org.apache.xml.security.utils;resolution:=optional,


  org.bouncycastle.cert;resolution:=optional,
  org.bouncycastle.cert.jcajce;resolution:=optional,
  org.bouncycastle.cert.ocsp;resolution:=optional,
  org.bouncycastle.cms.bc;resolution:=optional,
  org.bouncycastle.operator;resolution:=optional,
  org.bouncycastle.operator.bc;resolution:=optional,
  org.bouncycastle.tsp;resolution:=optional,
{noformat}

> Upgrade to POI 3.11-beta3 when available
> ----------------------------------------
>
>                 Key: TIKA-1469
>                 URL: https://issues.apache.org/jira/browse/TIKA-1469
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>         Attachments: Upgrade_to_poi-3_11-beta3v1.patch
>
>






[jira] [Comment Edited] (TIKA-1446) CHM parser : wrong decompression of aligned blocks

2014-11-12 Thread Hong-Thai Nguyen (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208079#comment-14208079
 ] 

Hong-Thai Nguyen edited comment on TIKA-1446 at 11/12/14 2:38 PM:
--

Hi [~binhawking], I've merged your contribution and run a title comparison 
before and after the merge on a local corpus of CHM files.
Before the merge only one file failed; after the merge, 10 files fail. I've 
pushed the failing CHM files under _test-documents/chm_, along with a test case 
that checks them, to: https://github.com/thaichat04/tika
I also did some clean-up.

Any chance you could have another look?


was (Author: thaichat04):
Hi [~binhawking], I've merged your pull request and run a title comparison 
before and after the merge on a local corpus of CHM files.
Before the merge only one file failed; after the merge, 10 files fail. I've 
pushed the failing CHM files under _test-documents/chm_, along with a test case 
that checks them, to: https://github.com/thaichat04/tika
I also did some clean-up.

Any chance you could have another look?

> CHM parser : wrong decompression of aligned blocks
> --------------------------------------------------
>
>                 Key: TIKA-1446
>                 URL: https://issues.apache.org/jira/browse/TIKA-1446
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.7
>            Reporter: Bin Hawking
>            Priority: Critical
>         Attachments: chm.zip
>
>
> If an embedded file contains aligned blocks, the parser outputs garbled or 
> empty text for that file.
> I have fixed it myself by correcting decompressAlignedBlock() and its 
> preparation methods. The bug is mostly due to misuse of the main tree, align 
> tree and length tree, and some of the trees are built incorrectly.





[jira] [Commented] (TIKA-1446) CHM parser : wrong decompression of aligned blocks

2014-11-12 Thread Hong-Thai Nguyen (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208079#comment-14208079
 ] 

Hong-Thai Nguyen commented on TIKA-1446:


Hi [~binhawking], I've merged your pull request and run a title comparison 
before and after the merge on a local corpus of CHM files.
Before the merge only one file failed; after the merge, 10 files fail. I've 
pushed the failing CHM files under _test-documents/chm_, along with a test case 
that checks them, to: https://github.com/thaichat04/tika
I also did some clean-up.

Any chance you could have another look?



[jira] [Comment Edited] (TIKA-1471) OOM with corrupt PDF file

2014-11-12 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208008#comment-14208008
 ] 

Tim Allison edited comment on TIKA-1471 at 11/12/14 1:00 PM:
-

From the discussion on PDFBOX-2493, this looks to be solved by PDFBox 1.8.7, 
which we're now using in trunk.

Thank you, [~alanbur], for reporting this issue on both Tika and PDFBox.  We 
need to fix these serious errors as they are discovered.  

At this point, code that uses Tika needs to be able to handle regular 
exceptions, OOM errors and permanent hangs...these catastrophic errors will 
happen...rarely...but they do happen.  

Use of the ForkParser and tika server can help avoid some of these issues, and 
on TIKA-1330, we're working to develop a robust wrapper around Tika that can 
handle these types of problems so that every integrator doesn't have to 
reinvent the wheel.
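
As an illustration of the kind of defensive handling described above, here is
a JDK-only sketch that runs a parse task on a worker thread with a timeout and
reports whether it succeeded, failed or hung. The class, enum and method names
are hypothetical, and the Callable stands in for whatever Tika call the
integrator makes. It is only a partial measure: interruption is a request that
a hung parser may ignore, and a JVM that has hit an OutOfMemoryError may not
recover, which is why process isolation (ForkParser, tika server) is the more
robust option.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class GuardedCallSketch {

    enum Outcome { OK, FAILED, TIMED_OUT }

    static final class Result {
        final Outcome outcome;
        final String text; // null unless outcome is OK

        Result(Outcome outcome, String text) {
            this.outcome = outcome;
            this.text = text;
        }
    }

    /** Runs the task on its own daemon thread and waits at most timeoutMillis. */
    static Result runGuarded(Callable<String> parseTask, long timeoutMillis)
            throws InterruptedException {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "guarded-parse");
            t.setDaemon(true); // a hung task must not keep the JVM alive
            return t;
        });
        Future<String> future = worker.submit(parseTask);
        try {
            return new Result(Outcome.OK, future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // best effort: interrupts the worker thread
            return new Result(Outcome.TIMED_OUT, null);
        } catch (ExecutionException e) {
            // Wraps anything thrown inside the task, including Errors such as
            // OutOfMemoryError.
            return new Result(Outcome.FAILED, null);
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runGuarded(() -> "extracted text", 1000).outcome);
        System.out.println(runGuarded(() -> { throw new IllegalStateException("corrupt"); }, 1000).outcome);
        System.out.println(runGuarded(() -> { Thread.sleep(5000); return "late"; }, 100).outcome);
    }
}
```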




was (Author: talli...@mitre.org):
From the discussion on PDFBOX-2493, this looks to be solved by PDFBox 1.8.8.  
I'll leave this open until we upgrade.

Thank you, [~alanbur], for reporting this issue on both Tika and PDFBox.  We 
need to fix these serious errors as they are discovered.  

At this point, code that uses Tika needs to be able to handle regular 
exceptions, OOM errors and permanent hangs...these catastrophic errors will 
happen...rarely...but they do happen.  

Use of the ForkParser and tika server can help avoid some of these issues, and 
on TIKA-1330, we're working to develop a robust wrapper around Tika that can 
handle these types of problems so that every integrator doesn't have to 
reinvent the wheel.





[jira] [Updated] (TIKA-1471) OOM with corrupt PDF file

2014-11-12 Thread Tim Allison (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated TIKA-1471:
--
Fix Version/s: 1.7



[jira] [Commented] (TIKA-1471) OOM with corrupt PDF file

2014-11-12 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208008#comment-14208008
 ] 

Tim Allison commented on TIKA-1471:
---

From the discussion on PDFBOX-2493, this looks to be solved by PDFBox 1.8.8.  
I'll leave this open until we upgrade.

Thank you, [~alanbur], for reporting this issue on both Tika and PDFBox.  We 
need to fix these serious errors as they are discovered.  

At this point, code that uses Tika needs to be able to handle regular 
exceptions, OOM errors and permanent hangs...these catastrophic errors will 
happen...rarely...but they do happen.  

Use of the ForkParser and tika server can help avoid some of these issues, and 
on TIKA-1330, we're working to develop a robust wrapper around Tika that can 
handle these types of problems so that every integrator doesn't have to 
reinvent the wheel.





[jira] [Commented] (TIKA-1472) Warning on Tika Server startup - Failed to load class "org.slf4j.impl.StaticLoggerBinder"

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207974#comment-14207974
 ] 

Hudson commented on TIKA-1472:
--

SUCCESS: Integrated in tika-trunk-jdk1.6 #288 (See 
[https://builds.apache.org/job/tika-trunk-jdk1.6/288/])
Fix for TIKA-1472 Warning on Tika Server startup - Failed to load class 
org.slf4j.impl.StaticLoggerBinder contributed by Konstantin Gribov 
 this closes #22. (mattmann: 
http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1638761)
* /tika/trunk/CHANGES.txt
* /tika/trunk/tika-server/pom.xml


> Warning on Tika Server startup - Failed to load class 
> "org.slf4j.impl.StaticLoggerBinder"
> ------------------------------------------------------
>
>                 Key: TIKA-1472
>                 URL: https://issues.apache.org/jira/browse/TIKA-1472
>             Project: Tika
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 1.6
>         Environment: Windows 8, JDK 1.8, Maven 3.2.3
>            Reporter: Darya Arbuzova
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 1.7
>
>         Attachments: 0001-Added-slf4j-jcl-impl-to-tika-server-deps.patch
>
>
> Hello!
> I want to use Apache Tika in server mode.
> I downloaded {{tika-server-1.6.jar}} from 
> http://mirror.vorboss.net/apache/tika/
> When I try to start the server, I get
> {{SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".}}
> So I went to the link you direct me to 
> (http://www.slf4j.org/codes.html#StaticLoggerBinder) and downloaded the other 
> slf4j {{jar}} files, but what next? I can't put them on the class path, since 
> I don't have a project. I can't change the dependencies in {{pom.xml}} for 
> the same reason. What should I do?
> I tried downloading the whole source code, but couldn't build it using Maven, 
> and still haven't figured out why. For the previous discussion, see:
> https://issues.apache.org/jira/browse/TIKA-1470
> Thank you!
> Best regards,
> Darya Arbuzova





[jira] [Commented] (TIKA-1472) Warning on Tika Server startup - Failed to load class "org.slf4j.impl.StaticLoggerBinder"

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207956#comment-14207956
 ] 

Hudson commented on TIKA-1472:
--

SUCCESS: Integrated in tika-trunk-jdk1.7 #308 (See 
[https://builds.apache.org/job/tika-trunk-jdk1.7/308/])
Fix for TIKA-1472 Warning on Tika Server startup - Failed to load class 
org.slf4j.impl.StaticLoggerBinder contributed by Konstantin Gribov 
 this closes #22. (mattmann: 
http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1638761)
* /tika/trunk/CHANGES.txt
* /tika/trunk/tika-server/pom.xml




[jira] [Resolved] (TIKA-1472) Warning on Tika Server startup - Failed to load class "org.slf4j.impl.StaticLoggerBinder"

2014-11-12 Thread Chris A. Mattmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann resolved TIKA-1472.
-
   Resolution: Fixed
Fix Version/s: 1.7

- merged pull request #22 into master in r1638761. Thanks to Konstantin Gribov 
 for the patch!



[jira] [Commented] (TIKA-1472) Warning on Tika Server startup - Failed to load class "org.slf4j.impl.StaticLoggerBinder"

2014-11-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207940#comment-14207940
 ] 

ASF GitHub Bot commented on TIKA-1472:
--

Github user asfgit closed the pull request at:

https://github.com/apache/tika/pull/22




[GitHub] tika pull request: Added slf4j-jcl impl to tika-server deps.

2014-11-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/tika/pull/22


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (TIKA-1472) Warning on Tika Server startup - Failed to load class "org.slf4j.impl.StaticLoggerBinder"

2014-11-12 Thread Chris A. Mattmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann reassigned TIKA-1472:
---

Assignee: Chris A. Mattmann



[jira] [Commented] (TIKA-1446) CHM parser : wrong decompression of aligned blocks

2014-11-12 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207922#comment-14207922
 ] 

Chris A. Mattmann commented on TIKA-1446:
-

Hi guys, what is the status on this? Is this ready to be merged?
