[jira] [Commented] (TIKA-3019) [9.8] [CVE-2019-17571] [tika-app] [1.23]

2020-01-10 Thread Konstantin Gribov (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013000#comment-17013000
 ] 

Konstantin Gribov commented on TIKA-3019:
-

[~tallison], there actually seem to be two issues for downstream users who
depend on tika-app/server/eval and use log4j 1.2.x: logging backend
configuration, and direct use of the log4j 1.x API (e.g. LogManager). As I
don't use the log4j logging backend, I may be overlooking something.

It's unlikely that downstream folks depend on tika-app/server, so I'd
say that if we encounter someone really using it that way, we advise them to
update or to use the log4j-1.2-api module (the 1.2.x bridge to the 2.x API).
If they don't use the *internal* API it should be ok. See
https://logging.apache.org/log4j/2.x/manual/migration.html and
https://logging.apache.org/log4j/2.x/manual/compatibility.html. Most likely we
will break programmatic configuration in this case (for example, someone using
their own main class with the -q/-v parameters).

As for the configuration side, a downstream user could set the
{{log4j1.compatibility}} system property to keep using old configs, but there
are some caveats (such as a custom appender that depends on the log4j 1.2
implementation). Again, recommending that they either update or downgrade to
1.2.x, as [~kkrugler] said, with a clear warning about the CVE, is all we
can do here, I guess.
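
To illustrate the configuration side, here is a minimal sketch of how a downstream user might launch the app with the compatibility switch turned on. It only assembles a command line. The class name, the jar name, and the config location are hypothetical, and passing the old config through {{log4j.configuration}} is my reading of the migration guide, so check it against the guide before relying on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: assembles a JVM command line that turns on
// Log4j 2's support for Log4j 1.x configuration files.
final class Log4j1CompatLauncher {

    static List<String> buildCommand(String jarPath,
                                     String log4j1ConfigUri,
                                     List<String> appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        // The system property named in the comment above.
        cmd.add("-Dlog4j1.compatibility=true");
        if (log4j1ConfigUri != null) {
            // Assumption: the 1.x config location is passed via the
            // Log4j 1 property "log4j.configuration".
            cmd.add("-Dlog4j.configuration=" + log4j1ConfigUri);
        }
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.addAll(appArgs);
        return cmd;
    }

    public static void main(String[] args) {
        // Jar name and config path are placeholders for illustration.
        List<String> cmd = buildCommand(
                "tika-app.jar",
                "file:conf/log4j.properties",
                Arrays.asList("-v"));
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list can be handed to {{ProcessBuilder}}. A custom appender written against the 1.2 implementation may still fail to load in this mode, which is the caveat mentioned above.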

Also, it seems this vulnerability in SocketServer only affects those who
accept logging events via TCP from other services. I can't think of such a
use for tika-app/server off the top of my head. Most likely we
aren't affected by this CVE at all.
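
For anyone who wants to see whether the flagged class is even reachable in their deployment, a rough check could look like the sketch below. The class name {{SocketServerCheck}} is hypothetical. The only fact taken from the report is the path org/apache/log4j/net/SocketServer.class. Finding the class on the classpath does not by itself mean anything is listening for log events.

```java
// Hypothetical check: reports whether the Log4j 1.x SocketServer class
// named in the scanner report can be loaded from the current classpath.
final class SocketServerCheck {

    static final String LOG4J1_SOCKET_SERVER =
            "org.apache.log4j.net.SocketServer";

    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: locate the class without running its
            // static initializers.
            Class.forName(className, false,
                    SocketServerCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(LOG4J1_SOCKET_SERVER + " on classpath: "
                + isOnClasspath(LOG4J1_SOCKET_SERVER));
    }
}
```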

My vote is for migration to 2.x and pointing users to aforementioned 
migration/compatibility guides.

> [9.8] [CVE-2019-17571] [tika-app] [1.23]
> 
>
> Key: TIKA-3019
> URL: https://issues.apache.org/jira/browse/TIKA-3019
> Project: Tika
>  Issue Type: Bug
>  Components: tika-batch
>Affects Versions: 1.23
>Reporter: Aman Mishra
>Priority: Major
>
> *Description :*
> *Severity :* Sonatype CVSS 3: 9.8, CVE CVSS 2.0: 0.0
> *Weakness :* Sonatype CWE: 502
> *Source :* National Vulnerability Database
> *Categories :* Data
> *Description from CVE :* Included in Log4j 1.2 is a SocketServer class that 
> is vulnerable to deserialization of untrusted data which can be exploited to 
> remotely execute arbitrary code when combined with a deserialization gadget 
> when listening to untrusted network traffic for log data. This affects Log4j 
> versions up to 1.2 up to 1.2.17.
> *Explanation :* The log4j:log4j package is vulnerable to Remote Code 
> Execution [RCE] due to Deserialization of Untrusted Data. The 
> configureHierarchy and genericHierarchy methods in SocketServer.class do not 
> verify if the file at a given file path contains any untrusted objects prior 
> to deserializing them. A remote attacker can exploit this vulnerability by 
> providing a path to crafted files, which result in arbitrary code execution 
> when deserialized.
> NOTE: Starting with version[s] 2.x, log4j:log4j was relocated to 
> org.apache.logging.log4j:log4j-core. A variation of this vulnerability exists 
> in org.apache.logging.log4j:log4j-core as CVE-2017-5645, in versions up to 
> but excluding 2.8.2.
> *Detection :* The application is vulnerable by using this component.
> *Recommendation :* Starting with version[s] 2.x, log4j:log4j was relocated to 
> org.apache.logging.log4j:log4j-core. A variation of this vulnerability exists 
> in org.apache.logging.log4j:log4j-core as CVE-2017-5645, in versions up to 
> but excluding 2.8.2. Therefore, it is recommended to upgrade to 
> org.apache.logging.log4j:log4j-core version[s] 2.8.2 and above. For 
> log4j:log4j 1.x versions however, a fix does not exist.
> *Root Cause :* tika-app-1.23.jar org/apache/log4j/net/SocketServer.class : [,]
> *Advisories :* Project: [https://bugzilla.redhat.com/show_bug.cgi?id=1785616]
> *CVSS Details :* Sonatype CVSS 3: 9.8, CVSS Vector: 
> CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TIKA-2310) Try to order chapters in epub correctly

2020-01-10 Thread Alexey Zhukov (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012593#comment-17012593
 ] 

Alexey Zhukov commented on TIKA-2310:
-

The OPF file is processed correctly, but the EpubParser implementation presumes
that spine contents are placed in htm and html files only (see
EpubParser.java:282) and ignores those with a different type. However, it looks
like the EPUB specification
([link|https://www.w3.org/publishing/epub3/epub-spec.html#dfn-epub-content-document])
does allow file extensions that differ from htm/html, so there may be
epub files (see attached) that can't be parsed correctly.

[^Dzhordzh_Oruell_1984_en_.epub]
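
One possible direction, as a rough sketch only and not the current EpubParser code: pick spine items by the media type declared in the OPF manifest rather than by file extension. All class, field, and method names below are hypothetical, and the sample hrefs are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: order content documents by the OPF spine and select
// them by manifest media type instead of by file extension.
final class SpineOrder {

    // One <item> from the OPF manifest (only the fields needed here).
    static final class ManifestItem {
        final String href;
        final String mediaType;

        ManifestItem(String href, String mediaType) {
            this.href = href;
            this.mediaType = mediaType;
        }
    }

    // EPUB content documents are XHTML or SVG, whatever the file is called.
    static boolean isContentDocument(String mediaType) {
        return "application/xhtml+xml".equals(mediaType)
                || "image/svg+xml".equals(mediaType);
    }

    // Returns hrefs in spine order, skipping idrefs missing from the
    // manifest and items that are not content documents.
    static List<String> contentDocumentsInSpineOrder(
            Map<String, ManifestItem> manifest, List<String> spineIdrefs) {
        List<String> hrefs = new ArrayList<>();
        for (String idref : spineIdrefs) {
            ManifestItem item = manifest.get(idref);
            if (item != null && isContentDocument(item.mediaType)) {
                hrefs.add(item.href);
            }
        }
        return hrefs;
    }
}
```

With a manifest entry such as href "text/ch2.xml" and media type application/xhtml+xml, the item would be kept even though its extension is neither htm nor html.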

> Try to order chapters in epub correctly
> ---
>
> Key: TIKA-2310
> URL: https://issues.apache.org/jira/browse/TIKA-2310
> Project: Tika
>  Issue Type: Bug
>Reporter: Tim Allison
>Assignee: Tim Allison
>Priority: Minor
> Fix For: 1.21
>
> Attachments: Dzhordzh_Oruell_1984_en_.epub
>
>
> [~johanvanderknijff] recently pointed out on twitter that our Epub parser 
> doesn't handle chapters in the right order.  We should try to fix our parser 
> so that the output is in the correct order.
> Epub is new to me, but it looks like we can scrape the order out of 
> content.opf(?).
> This would require dumping the stream to a ZipFile for direct access to zip 
> entries, but we require that of ooxml...





[jira] [Updated] (TIKA-2310) Try to order chapters in epub correctly

2020-01-10 Thread Alexey Zhukov (Jira)


 [ 
https://issues.apache.org/jira/browse/TIKA-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Zhukov updated TIKA-2310:

Attachment: Dzhordzh_Oruell_1984_en_.epub



