RE: [log4j] Improving log4j security

2023-10-12 Thread Klebanov, Vladimir
Hi Volkan,

It's not just about exchanging data between systems - that is just one 
particular instance of a larger problem. If you use a pattern layout for _any_ 
reason, it is currently extremely inconvenient to configure securely. If you 
use a structured layout, again for any reason, it's still inconvenient to 
configure securely, though indeed less so than a pattern layout. My 
understanding is that not everyone can, will, or should always use a structured 
layout over a pattern layout. For entertainment, I have collected some layout 
statistics, which I include below.

For the pattern layout case, I have prototyped improved encoders that can be 
used with log4j. The code has already been shared with you, though it will 
obviously need (lots of) discussion. I am happy to continue discussing the 
topic / working on the code with anyone who finds it worthwhile.

Thanks,
Vladimir

Statistics: The dataset is certainly debatable, but it's the best one I have. 
Out of the top 1000 starred Java repositories on GitHub, 89 contain a file 
log4j2.xml with at least one element matching .*Layout. Out of these 89 repos, 
every single one defines at least one pattern layout. Only two repos out of 89 
define a layout that is not a pattern layout: one repo a JSONLayout and one a 
StackdriverLayout. 


-Original Message-
From: Volkan Yazıcı  
Sent: Wednesday, 11 October 2023 11:32
To: dev@logging.apache.org
Subject: Re: [log4j] Improving log4j security

Your use case sounds to me as follows: "I want to use `PatternLayout` for
exchanging data between two systems and ... [it is insecure.]" (Please
correct me if I am wrong.) My answer is: "Don't".

`PatternLayout` is not designed to be machine-readable. If I am not
mistaken, there is not even a standard format for stack traces. Consider
ones generated from exceptions containing messages with newline characters.
How are you gonna deal with parsing those? Or thread names, custom levels,
custom markers, etc. with a newline? My point is, don't use `PatternLayout`
for exchanging data between systems. For that purpose, we recommend using
structured layouts, e.g., `JsonTemplateLayout`. ELK, Splunk, Datadog,
NewRelic, etc. all accept JSON.

In conclusion, I recommend that you use JTL for publishing logs to other
systems. If you have `PatternLayout` [encoder?] enhancements that we can
incorporate in a backward-compatible way, please share.



RE: [log4j] Improving log4j security

2023-10-10 Thread Klebanov, Vladimir
Hi Volkan,

Let me try to clarify. The goal/use case is not to log as an HTML document. We 
are assuming a typical text-based log here. Yet, in practice, the logs will be 
processed by a variety of systems, including web-based ones, which may have 
various vulnerabilities. These vulnerabilities can be exploited by attackers if 
they can use the log-producing application to inject various strings into the 
log.

(At this point, I would like to refer to the context paragraph of my previous 
message.)

Here is an example scenario spelled out. An application uses log4j to produce a 
text log, while logging the username supplied by the user in every login 
attempt. The log is ingested into Splunk (or ELK), as it often is. An attacker 
can try to log in with the username 

RE: [log4j] Improving log4j security

2023-10-09 Thread Klebanov, Vladimir
Thanks, Piotr. I don't know what happened to your replies (maybe the spam 
filter dropped them), but I am happy that we recovered from that now.

Log injections are definitely security issues, but if you prefer to talk about 
them in the open, I will follow suit.

For context: a log injection occurs when an application logs user-supplied data 
(which is often the case). An attacker can exploit log injection to forge log 
records and impede forensics or exploit potential vulnerabilities in 
log-processing systems. There is a variety of string classes that attackers can 
try to inject, including newlines, ANSI sequences, Unicode direction markers, 
Unicode homographs, JavaScript, PHP, etc.
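To make the forging case concrete, here is a minimal sketch in plain Java. It does not use log4j at all: `formatLine` is a hypothetical stand-in for any line-oriented text layout, and the username value is invented for illustration. It only shows how a newline in user-supplied data turns one log call into two apparent records.

```java
class LogForgingDemo {
    // Hypothetical stand-in for a line-oriented text layout: "<level> <message>".
    static String formatLine(String level, String message) {
        return level + " " + message;
    }

    // The application logs the user-supplied name without any encoding.
    static String loginFailure(String username) {
        return formatLine("WARN", "Login failed for user " + username);
    }

    public static void main(String[] args) {
        // Illustrative attacker input: a newline followed by a fake record.
        String attack = "alice\nINFO Login succeeded for user admin";
        // The output spans two lines; the second one looks like a genuine record.
        System.out.println(loginFailure(attack));
    }
}
```

A line-based log reader cannot tell the second line apart from a record the application actually emitted.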

Ideally, applications defend against log injection attacks by encoding (aka 
escaping) user-supplied data before logging. The specific encoding depends on 
the desired level of protection. URL-encoding, for instance, would protect 
against all of the above-mentioned attack classes, but weaker encodings may 
sometimes be acceptable as well.
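As a rough illustration of encoding before logging, the sketch below uses the JDK's `java.net.URLEncoder` (form-style URL-encoding, Java 10+ for the `Charset` overload). The class and method names are hypothetical, and this is not a log4j converter, only the idea applied at the call site.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class EncodeBeforeLogging {
    // Encodes everything except ASCII letters, digits, and . - * _
    // (a space becomes '+'), so newlines, ESC, and markup characters
    // never reach the log verbatim.
    static String encode(String userSupplied) {
        return URLEncoder.encode(userSupplied, StandardCharsets.UTF_8);
    }

    // Only the user-supplied part is encoded; the fixed text stays readable.
    static String loginFailure(String username) {
        return "WARN Login failed for user " + encode(username);
    }

    public static void main(String[] args) {
        // The record stays on a single line: the newline is emitted as %0A.
        System.out.println(loginFailure("alice\nINFO Login succeeded for user admin"));
        // ANSI escape sequences and HTML are neutralized as well.
        System.out.println(encode("\u001b[31m<script>"));
    }
}
```

The trade-off is readability: legitimate names with spaces or non-ASCII characters are encoded too, which is why a weaker encoding may be preferred in some settings.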

A natural place to implement encoding is in the pattern layout configuration. 
Some encoding pattern converters are already available in log4j, but there are 
still gaps that I would like to help fill. I think there are roughly three of 
them:

1. The documentation should more prominently explain the issue. Today, most 
users would probably think that the following layout is HTML-safe, while it's 
not:


2. The HTML encoder is not always sufficient. I would like to see the addition 
of a stricter one, such as a URL-encoder.

3. The current encoders encode all structured data (like the complete exception 
stacktrace) and not just the injection-prone parts (i.e., the exception 
message). This means I cannot replace the insecure layout above with the secure 
layout



without changing how logs are parsed (as the stack frames will not be separated 
by newlines anymore).
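The idea behind point 3 can be sketched in plain Java as follows. This is not the PoC mentioned in this message and not a log4j API; `render` is a hypothetical helper that encodes only the exception message and leaves the stack frames on separate lines. Causes and suppressed exceptions are ignored for brevity.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SelectiveEncoding {
    // Renders a throwable so that only the injection-prone part (the message)
    // is URL-encoded; class name and frames are emitted unchanged.
    static String render(Throwable t) {
        StringBuilder sb = new StringBuilder(t.getClass().getName());
        String message = t.getMessage();
        if (message != null) {
            sb.append(": ").append(URLEncoder.encode(message, StandardCharsets.UTF_8));
        }
        // One frame per line, as parsers of text logs typically expect.
        for (StackTraceElement frame : t.getStackTrace()) {
            sb.append('\n').append("\tat ").append(frame);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative message containing an injected newline.
        Throwable t = new IllegalStateException("bad input\nINFO forged record");
        System.out.println(render(t));
    }
}
```

With this shape, the first line carries the encoded message and every subsequent line is a stack frame, so existing line-based parsing keeps working.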

I have created a PoC implementation of an improved encoder, but I would 
obviously need help to make it production-ready. Is anyone here interested in that? 
Questions and comments are welcome as well.

Thanks,
Vladimir


-Original Message-
From: Piotr P. Karwasz  
Sent: Thursday, 5 October 2023 22:06
To: dev@logging.apache.org; Klebanov, Vladimir 
Subject: Re: [log4j] Improving log4j security

Hi Vladimir,

On Thu, 5 Oct 2023 at 21:47, Klebanov, Vladimir
 wrote:
> I would like to contribute some code in order to make log4j usage more 
> secure. I have now sent two emails to the log4j security team but did not 
> receive a response. Is anybody here interested? How can we discuss this 
> further?

Both times (10 Aug 2023, 23:19 and 29 Aug 2023, 20:49) we sent an
answer to your address at sap.com.

Anyway the general consensus was that the issue with generating HTML
using PatternLayout does not constitute a security problem and you can
discuss it on this mailing list or file an issue on GitHub.

Piotr


[log4j] Improving log4j security

2023-10-05 Thread Klebanov, Vladimir
Hello,

I would like to contribute some code in order to make log4j usage more secure. 
I have now sent two emails to the log4j security team but did not receive a 
response. Is anybody here interested? How can we discuss this further?

Thanks,
Vladimir