ppkarwasz commented on PR #4144: URL: https://github.com/apache/logging-log4j2/pull/4144#issuecomment-4680432972
Hi @vy, > * Adding a new dependency for a feature that is used by a majority of the user base (i.e., XML-based configuration) can break fat-JAR setups. I do not yet see the failure mode. An uber-JAR absorbs its dependencies by design, so one more compile-scope dependency is shaded in like any other. > * We will effective be replacing 10 LoC (which is working and bug-free) in `log4j-core` with an external dependency. > * This doesn't really bring any compelling benefit either to users, or to maintainers. I would push back on "bug-free". The current block: https://github.com/apache/logging-log4j2/blob/6beea3feb0e5ec59ed85db14075bb7837b0a968f/log4j-core/src/main/java/org/apache/logging/log4j/core/config/xml/XmlConfiguration.java#L191-L197 blocks external fetches only if the factory supports the features we set, and swallows the exception when it does not, so on an unsupported parser we silently get no hardening and never learn about it. Two more things are hard to justify: - we disable external entities but never enable `FEATURE_SECURE_PROCESSING` (Billion Laughs stays open), and - we set `setExpandEntityReferences(false)`, which has nothing to do with external fetching and at worst changes our DOM, since entities then arrive as `Entity` rather than `Text` nodes. These are all hardenings (the configuration is trusted), but in this case we should choose whether we want to make the parser safe from known XML-based attacks or don't do anything at all. The benefits for maintainers: - Independence from the concrete `DocumentBuilderFactory` chosen by `ServiceLoader` at runtime: the library applies the features each implementation actually supports, not a fixed list. - A `DocumentBuilder` protected against Billion Laughs and XXE/SSRF by library contract, with the entity-attack tests maintained upstream rather than here. - One audited place to point SAST tools and AI reviewers at. This does not make findings vanish on its own (as the CodeQL comment above shows, an intraprocedural scanner cannot see hardening behind a factory method, and may even flag it more), but the rebuttal and any suppression then have a single home instead of being re-argued per call site. If consensus is that the dependency is not worth it I will keep the current code, but "10 bug-free lines" is not the right baseline. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
