[ https://issues.apache.org/jira/browse/TIKA-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056108#comment-17056108 ]
Tilman Hausherr commented on TIKA-3065:
---------------------------------------

I suggest you save whatever you get from that InputStream into a file, and then check whether its content is the same as the original file.

> Not able to parse the document with inline image
> ------------------------------------------------
>
>                 Key: TIKA-3065
>                 URL: https://issues.apache.org/jira/browse/TIKA-3065
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.23
>            Reporter: suchendra
>            Priority: Major
>         Attachments: 2.pdf
>
>
> I am using Apache Tika in my project to detect the file extension. I am using
> Scala for development.
> Below is the code:
> {code:java}
> // code placeholder
> val sinkIs = StreamConverters.asInputStream(60 seconds)
> val is = data.runWith(sinkIs)
> val tikaStream = TikaInputStream.get(is)
> val mediaType = new Tika().detect(tikaStream, fileName)
> val extension = MimeTypes.getDefaultMimeTypes.forName(mediaType)
> print("extension: " + extension.getExtension)
> {code}
> I have an API which uploads the files and returns the extension in the response.
> I have attached the file I used to upload. But I am facing the problem with
> all types of documents that contain an image. Basically, it gets stuck in
> POICSContainerDetector's detect method.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
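
A minimal sketch of the check suggested in the comment above, using only JDK classes: write the stream to a file, then compare that file byte for byte with the original upload. The names `StreamDump`, `dump`, and `sameContent` are hypothetical and are not part of Tika or of the reporter's project. The sketch assumes both files are small enough to read into memory.

```scala
import java.io.InputStream
import java.nio.file.{Files, Path, StandardCopyOption}
import java.util.Arrays

// Hypothetical helper, for debugging only.
object StreamDump {

  /** Copies everything readable from `in` into `target`, closes the
    * stream, and returns the number of bytes written. */
  def dump(in: InputStream, target: Path): Long =
    try Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING)
    finally in.close()

  /** True when both files contain exactly the same bytes.
    * Reads both files fully into memory. */
  def sameContent(a: Path, b: Path): Boolean =
    Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b))
}
```

In the reporter's snippet, this would mean passing `is` to `StreamDump.dump(is, somePath)` instead of to `TikaInputStream.get`, then comparing the result with the attached `2.pdf`. A dump that is truncated or differs from the original would suggest the problem lies in how the stream is produced rather than in the detector, though that is only one possible reading.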