Re: Logstash and Filebeat guaranteed delivery

2023-11-30 Thread Ralph Goers
Volkan, 

Notice that neither of the links you have provided uses the term “guaranteed 
delivery”. That is because that is not really what they are providing. In 
addition, notice that Logstash says “Input plugins that do not use a 
request-response protocol cannot be protected from data loss”, and “Data may be 
lost if an abnormal shutdown occurs before the checkpoint file has been 
committed”. Note that Flume’s FileChannel does not face the second issue, while 
the first would also be a problem for Flume if it used a source that doesn’t 
support acknowledgements. However, Log4j’s FlumeAppender always gets acks.

To make this clearer let me review the architecture for my implementation again.

First, the phone system maintains a list of IP addresses that can handle Radius 
accounting records. We host 2 Flume servers in the same data center as the 
phone system and configure the phone system with their IP addresses. The Radius 
records are sent to those Flume servers, which accept them with our custom 
Radius Source. That converts them to JSON and passes the JSON to the File 
Channel. Once the File Channel has written them to disk, the source responds 
back to the phone system with an ACK that the record was received. If the 
record is not processed quickly enough (I believe the limit is 100ms), the 
phone system will try a different IP address, assuming the record couldn’t be 
delivered by the first server. Another thread reads the records from the File 
Channel and sends them to a Flume in a different data center for processing. 
This follows the same pattern. The Avro Sink serializes the record and sends it 
to the target Flume. That Flume writes the record to a File Channel and the 
Avro Source responds with an ACK that the record was received, at which point 
the originating Flume removes it from its File Channel.
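To make the first hop concrete, the agent can be sketched in a standard Flume 
properties file roughly like this (the Radius source class name, paths, and 
ports are placeholders, not our production values):

    agent1.sources = radius
    agent1.channels = filech
    agent1.sinks = avro-out

    # Custom source: ACKs the phone system only after the channel transaction commits
    agent1.sources.radius.type = com.example.flume.RadiusSource
    agent1.sources.radius.channels = filech

    # Durable File Channel: events survive an agent restart
    agent1.channels.filech.type = file
    agent1.channels.filech.checkpointDir = /var/flume/checkpoint
    agent1.channels.filech.dataDirs = /var/flume/data

    # Avro sink forwards to the Flume agent in the other data center;
    # the event leaves the local File Channel only after the remote ACK
    agent1.sinks.avro-out.type = avro
    agent1.sinks.avro-out.channel = filech
    agent1.sinks.avro-out.hostname = flume-remote.example.internal
    agent1.sinks.avro-out.port = 4141

The receiving agent mirrors this with an Avro Source in front of its own File 
Channel.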

If you will notice, the application itself knows that delivery is guaranteed 
because it gets an ACK saying so. Because of this, Filebeat cannot possibly 
implement guaranteed delivery: the application has to assume that once it 
writes to a file or to System.out delivery is guaranteed, which really cannot 
be true.

As for using Google Cloud, that would defeat the whole point. If your data 
center has lost contact with the outside world it won’t be able to reach 
Google Cloud.

While Redis would work, it would require (a) an application component that 
interacts with Redis, such as a Redis Appender (which we don’t have), (b) a 
Redis deployment, and (c) a Logstash instance (or some other Redis consumer) to 
forward the events. It is a lot simpler to configure Flume than to do all of 
that.
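For comparison, the Logstash half of that Redis pipeline would look roughly 
like the following (hosts, key, and index are illustrative placeholders):

    input {
      redis {
        host      => "redis.example.internal"
        data_type => "list"
        key       => "app-logs"          # the list the Redis appender pushes to
      }
    }
    output {
      elasticsearch {
        hosts => ["https://es.example.internal:9200"]
        index => "app-logs-%{+YYYY.MM.dd}"
      }
    }

and that is on top of running Redis itself and writing or adopting a Redis 
appender, which is the operational overhead I am comparing against a single 
Flume agent.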

Ralph


> On Nov 30, 2023, at 4:32 AM, Volkan Yazıcı  wrote:
> 
> Ralph, could you elaborate on your response, please? AFAIK, Logstash and 
> Filebeat provide guaranteed delivery, if configured correctly. As a matter of 
> fact, they have docs (here and here) explaining how to do it – actually, there 
> are several ways to do it. What makes you think they don't provide 
> guaranteed delivery?
> 
> I have implemented two different types of logging pipelines with guaranteed 
> delivery:
> • Using a Google Cloud BigQuery appender
> • Using a Redis appender (Redis queue is ingested to Elasticsearch 
> through Logstash)
> I want to learn where I can potentially violate the delivery guarantee.
> 
> On Thu, Nov 30, 2023 at 5:54 AM Ralph Goers  
> wrote:
> Fluentbit, Fluentd, Logstash, and Filebeat are the main tools used for log 
> forwarding. While they all have some amount of pluggability, none of them are as 
> flexible as Flume. In addition, as I have mentioned before, none of them 
> provide guaranteed delivery so I would never recommend them for forwarding 
> audit logs.



Re: [Flume] Integration with OpenTelemetry

2023-11-30 Thread Christian Grobmeier



On Thu, Nov 30, 2023, at 18:04, Matt Sicker wrote:
> Oh yes, I’d still love to see how we can adapt the Log4j plugin system 
> here! And yeah, this would likely make sense as its own repository. 
> I’ll make a proposal about the OTel stuff before working on any code.

Please do. I would also like to learn more about this stuff, and maybe we can 
make Chainsaw receive Flume messages or OTel data too!

> —
> Matt Sicker
>
>> On Nov 29, 2023, at 22:54, Ralph Goers  wrote:
>> 
>> This is a great post Matt!
>> 
>> Fluentbit, Fluentd, Logstash, and Filebeat are the main tools used for log 
>> forwarding. While they all have some amount of pluggability, none of them are 
>> as flexible as Flume. In addition, as I have mentioned before, none of them 
>> provide guaranteed delivery so I would never recommend them for forwarding 
>> audit logs. 
>> 
>> I have also previously explained my use case for using Flume, which is for 
>> forwarding Call Detail Records that start off as records in the Radius 
>> protocol [1] across data centers, which also requires guaranteed delivery. I 
>> wouldn’t be able to use any of those other tools to do that without 
>> significant modification.
>> 
>> I am all for supporting standards. If you can outline what you are proposing 
>> on a Confluence page I would wholeheartedly support it.
>> 
>> As you probably know, I started work on separating out things that I don’t 
>> consider to be “core” to Flume into separate repos. That work is only half 
>> completed. I would suggest that you consider whether what you are proposing 
>> also be in its own repo. As it is, the CI for Flume fails because the 
>> generated logs are exceeding the available disk space. In addition, the 
>> build takes a long time. 
>> 
>> Also, I have never really been a big fan of the configuration mechanism 
>> Flume uses. I was able to somewhat bypass it by implementing support for 
>> Spring Boot, but it would be great if the Log4j Plugin system could somehow 
>> be used to simplify configuring Flume for those who don’t want to use Spring 
>> boot. I know that is right up your alley.
>> 
>> Ralph
>> 
>> 
>> 1. https://networkradius.com/doc/current/introduction/RADIUS.html
>> 
>>> On Nov 29, 2023, at 5:32 PM, Matt Sicker  wrote:
>>> 
>>> One of the main reasons why I supported Flume joining this PMC was that I 
>>> noticed it has significant overlap with projects in the observability space 
>>> despite not being advertised as such. For example, the project FluentBit is 
>>> extremely similar to Flume, but its main purpose is for collecting, 
>>> processing, forwarding, etc., logs, metrics, and traces (i.e., 
>>> observability data). FluentBit is not the only thing in this space, though 
>>> it seems to be fairly popular. These sorts of tools are used for ultimately 
>>> publishing observability data to one or more observability tools like 
>>> Prometheus, Splunk, Jaeger, Grafana, etc., and with a unified collector and 
>>> processor, it becomes possible to publish all your observability data into 
>>> one tool rather than three or more disparate tools (and the added 
>>> operational costs of storing tons of duplicated log data from three or more 
>>> methods of generating log data).
>>> 
>>> A project at the CNCF, OpenTelemetry, has become the sort of de facto 
>>> standard for interoperability in this space. In particular, they’ve 
>>> published the OTLP specification 
>>>  for general telemetry data 
>>> delivery and the OpenTelemetry specification 
>>>  for various common APIs. While 
>>> I’m still researching in this space, I think it would be useful for Flume 
>>> to integrate with some of these APIs and SDKs (while other parts might be 
>>> more relevant in our logging libraries instead). There is also the Open 
>>> Agent Management Protocol  
>>> which is still in beta status that might also be relevant here (and 
>>> potentially relevant in the logging libraries).
>>> 
>>> Supporting common standards for our projects seems like a useful thing to 
>>> do, and despite the popularity of some existing solutions there, I believe 
>>> there is plenty of space for us to contribute. I also think that this can 
>>> provide opportunity for the various components in this PMC to interoperate 
>>> as these specs are fairly language neutral with some sample versions in 
>>> many different languages.
>> 
>>


Re: [log4j] Upgrade `2.x` compiler baseline to Java 17

2023-11-30 Thread Matt Sicker
No, only that it requires Java 17 to build. It still targets the Java 8 release 
profile.
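In Maven terms the split is roughly this (a minimal sketch, not the actual 
Log4j build configuration): compile on a JDK 17 toolchain but emit Java 8 
bytecode through the compiler's release option.

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <!-- building needs JDK 17+, but the classes stay usable on Java 8 -->
        <release>8</release>
      </configuration>
    </plugin>

So 2.23.0 should still run on Java 8; only building it requires the newer JDK.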

> On Nov 30, 2023, at 11:15 AM, Gary Gregory  wrote:
> 
> Nice! That means that 2.23.0 will require Java 17 at runtime right?
> 
> Gary
> 
> 
> On Thu, Nov 30, 2023, 11:05 AM Volkan Yazıcı  wrote:
> 
>> Heads up! #2021  bumps
>> the `2.x` baseline to Java 17. Everything works locally. If CI agrees too,
>> I will merge it tomorrow and start porting to `main`.
>> 



Re: [log4j] Upgrade `2.x` compiler baseline to Java 17

2023-11-30 Thread Gary Gregory
Nice! That means that 2.23.0 will require Java 17 at runtime right?

Gary


On Thu, Nov 30, 2023, 11:05 AM Volkan Yazıcı  wrote:

> Heads up! #2021  bumps
> the `2.x` baseline to Java 17. Everything works locally. If CI agrees too,
> I will merge it tomorrow and start porting to `main`.
>


Re: [log4j] Upgrade `2.x` compiler baseline to Java 17

2023-11-30 Thread Piotr P. Karwasz
Hi Matt,

On Thu, 30 Nov 2023 at 18:09, Matt Sicker  wrote:
>
> Sounds great! We even finally updated Spinnaker this week to build on Java 
> 17, so great timing.

As far as I have seen, most Commons projects build on JDK 21. That might, 
however, be too high for us, since tests using the SecurityManager start to fail.

Piotr


Re: [log4j] Upgrade `2.x` compiler baseline to Java 17

2023-11-30 Thread Matt Sicker
Sounds great! We even finally updated Spinnaker this week to build on Java 17, 
so great timing.
—
Matt Sicker

> On Nov 30, 2023, at 10:03, Volkan Yazıcı  wrote:
> 
> Heads up! #2021  bumps
> the `2.x` baseline to Java 17. Everything works locally. If CI agrees too,
> I will merge it tomorrow and start porting to `main`.



Re: [Flume] Integration with OpenTelemetry

2023-11-30 Thread Matt Sicker
Oh yes, I’d still love to see how we can adapt the Log4j plugin system here! 
And yeah, this would likely make sense as its own repository. I’ll make a 
proposal about the OTel stuff before working on any code.
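
In the meantime, here is a rough sketch of what an OTLP-forwarding Flume sink 
could look like, assuming the stock Flume Sink API and the OpenTelemetry Java 
log API (the OTel class names are from the public SDK; everything else, 
including the sink itself, is illustrative rather than a design proposal):

    import io.opentelemetry.api.GlobalOpenTelemetry;
    import io.opentelemetry.api.logs.Logger;
    import io.opentelemetry.api.logs.Severity;
    import org.apache.flume.Channel;
    import org.apache.flume.Context;
    import org.apache.flume.Event;
    import org.apache.flume.EventDeliveryException;
    import org.apache.flume.Transaction;
    import org.apache.flume.conf.Configurable;
    import org.apache.flume.sink.AbstractSink;

    import java.nio.charset.StandardCharsets;

    /** Illustrative only: forwards Flume events as OpenTelemetry log records. */
    public class OtlpSink extends AbstractSink implements Configurable {

        private Logger otelLogger;

        @Override
        public void configure(Context context) {
            // A real sink would build its own OpenTelemetry SDK instance here
            // (OTLP exporter + batch processor) from the Flume context.
            otelLogger = GlobalOpenTelemetry.get().getLogsBridge().get("flume-otlp-sink");
        }

        @Override
        public Status process() throws EventDeliveryException {
            Channel channel = getChannel();
            Transaction txn = channel.getTransaction();
            txn.begin();
            try {
                Event event = channel.take();
                if (event == null) {
                    txn.commit();
                    return Status.BACKOFF;       // nothing queued right now
                }
                otelLogger.logRecordBuilder()
                        .setSeverity(Severity.INFO)
                        .setBody(new String(event.getBody(), StandardCharsets.UTF_8))
                        .emit();                 // hands the record to the configured exporter
                txn.commit();                    // the event leaves the channel only after this
                return Status.READY;
            } catch (Throwable t) {
                txn.rollback();                  // keep the event for a retry
                throw new EventDeliveryException("OTLP forwarding failed", t);
            } finally {
                txn.close();
            }
        }
    }

One caveat relevant to the guaranteed-delivery discussion elsewhere in this 
thread: emit() is asynchronous when a batching processor is used, so a sink 
that needs end-to-end acknowledgement would have to flush and check the export 
result before committing the channel transaction.
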
—
Matt Sicker

> On Nov 29, 2023, at 22:54, Ralph Goers  wrote:
> 
> This is a great post Matt!
> 
> Fluentbit, Fluentd, Logstash, and Filebeat are the main tools used for log 
> forwarding. While they all have some amount of pluggability, none of them are as 
> flexible as Flume. In addition, as I have mentioned before, none of them 
> provide guaranteed delivery so I would never recommend them for forwarding 
> audit logs. 
> 
> I have also previously explained my use case for using Flume, which is for 
> forwarding Call Detail Records that start off as records in the Radius 
> protocol [1] across data centers, which also requires guaranteed delivery. I 
> wouldn’t be able to use any of those other tools to do that without 
> significant modification.
> 
> I am all for supporting standards. If you can outline what you are proposing 
> on a Confluence page I would wholeheartedly support it.
> 
> As you probably know, I started work on separating out things that I don’t 
> consider to be “core” to Flume into separate repos. That work is only half 
> completed. I would suggest that you consider whether what you are proposing 
> also be in its own repo. As it is, the CI for Flume fails because the 
> generated logs are exceeding the available disk space. In addition, the build 
> takes a long time. 
> 
> Also, I have never really been a big fan of the configuration mechanism Flume 
> uses. I was able to somewhat bypass it by implementing support for Spring 
> Boot, but it would be great if the Log4j Plugin system could somehow be used 
> to simplify configuring Flume for those who don’t want to use Spring boot. I 
> know that is right up your alley.
> 
> Ralph
> 
> 
> 1. https://networkradius.com/doc/current/introduction/RADIUS.html
> 
>> On Nov 29, 2023, at 5:32 PM, Matt Sicker  wrote:
>> 
>> One of the main reasons why I supported Flume joining this PMC was that I 
>> noticed it has significant overlap with projects in the observability space 
>> despite not being advertised as such. For example, the project FluentBit is 
>> extremely similar to Flume, but its main purpose is for collecting, 
>> processing, forwarding, etc., logs, metrics, and traces (i.e., observability 
>> data). FluentBit is not the only thing in this space, though it seems to be 
>> fairly popular. These sorts of tools are used for ultimately publishing 
>> observability data to one or more observability tools like Prometheus, 
>> Splunk, Jaeger, Grafana, etc., and with a unified collector and processor, 
>> it becomes possible to publish all your observability data into one tool 
>> rather than three or more disparate tools (and the added operational costs 
>> of storing tons of duplicated log data from three or more methods of 
>> generating log data).
>> 
>> A project at the CNCF, OpenTelemetry, has become the sort of de facto 
>> standard for interoperability in this space. In particular, they’ve 
>> published the OTLP specification  
>> for general telemetry data delivery and the OpenTelemetry specification 
>>  for various common APIs. While 
>> I’m still researching in this space, I think it would be useful for Flume to 
>> integrate with some of these APIs and SDKs (while other parts might be more 
>> relevant in our logging libraries instead). There is also the Open Agent 
>> Management Protocol  which is 
>> still in beta status that might also be relevant here (and potentially 
>> relevant in the logging libraries).
>> 
>> Supporting common standards for our projects seems like a useful thing to 
>> do, and despite the popularity of some existing solutions there, I believe 
>> there is plenty of space for us to contribute. I also think that this can 
>> provide opportunity for the various components in this PMC to interoperate 
>> as these specs are fairly language neutral with some sample versions in many 
>> different languages.
> 
> 



Re: [log4j] Upgrade `2.x` compiler baseline to Java 17

2023-11-30 Thread Piotr P. Karwasz
Hi Volkan,

On Thu, 30 Nov 2023 at 17:05, Volkan Yazıcı  wrote:
>
> Heads up! #2021  bumps
> the `2.x` baseline to Java 17. Everything works locally. If CI agrees too,
> I will merge it tomorrow and start porting to `main`.

Nice job.

This allows you to start a chain reaction of fixed issues:
logging-log4j2#1851 [1] followed by logging-parent#62 [2], which
should greatly simplify the BND configuration. Basically, all our deps
(like Jackson) that are JPMS named modules but keep their
`module-info.class` descriptor under `META-INF/versions` (i.e.
multi-release JARs) require a configuration override with BND 6.4.x;
with BND 7.x they no longer do.

Piotr

[1] https://github.com/apache/logging-log4j2/issues/1851
[2] https://github.com/apache/logging-parent/issues/62


[log4j] Upgrade `2.x` compiler baseline to Java 17

2023-11-30 Thread Volkan Yazıcı
Heads up! #2021  bumps
the `2.x` baseline to Java 17. Everything works locally. If CI agrees too,
I will merge it tomorrow and start porting to `main`.


Logstash and Filebeat guaranteed delivery

2023-11-30 Thread Volkan Yazıcı
Ralph, could you elaborate on your response, please? AFAIK, Logstash and
Filebeat provide guaranteed delivery, if configured correctly. As a matter
of fact, they have docs (here and here) explaining how to do it – actually,
there are several ways to do it.
What makes you think they don't provide guaranteed delivery?

I have implemented two different types of logging pipelines with guaranteed
delivery:

   1. Using a Google Cloud BigQuery appender
   2. Using a Redis appender (Redis queue is ingested to Elasticsearch
   through Logstash)

I want to learn where I can potentially violate the delivery guarantee.

On Thu, Nov 30, 2023 at 5:54 AM Ralph Goers 
wrote:

> Fluentbit, Fluentd, Logstash, and Filebeat are the main tools used for log
> forwarding. While they all have some amount of pluggability, none of them are
> as flexible as Flume. In addition, as I have mentioned before, none of them
> provide guaranteed delivery so I would never recommend them for forwarding
> audit logs.
>


Re: [Flume] Integration with OpenTelemetry

2023-11-30 Thread Gary Gregory
A key feature for me is guaranteed delivery, which is why I sometimes use JMS
with Log4j.
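For reference, a minimal log4j2.xml sketch of that kind of setup (the JNDI 
names, broker URL, and ActiveMQ context factory are illustrative placeholders, 
not a recommended configuration):

    <Configuration>
      <Appenders>
        <JMS name="jms"
             factoryName="org.apache.activemq.jndi.ActiveMQInitialContextFactory"
             providerURL="tcp://broker.example.internal:61616"
             factoryBindingName="ConnectionFactory"
             destinationBindingName="auditQueue">
          <JsonLayout/>
        </JMS>
        <Console name="console"/>
      </Appenders>
      <Loggers>
        <Logger name="audit" level="info" additivity="false">
          <AppenderRef ref="jms"/>
        </Logger>
        <Root level="warn">
          <AppenderRef ref="console"/>
        </Root>
      </Loggers>
    </Configuration>

Note that on current 2.x releases the JMS Appender only works when
log4j2.enableJndiJms=true is set, and the delivery guarantee ultimately comes
from the broker's persistence and acknowledgement settings rather than from
Log4j itself.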

Gary

On Wed, Nov 29, 2023, 11:54 PM Ralph Goers 
wrote:

> This is a great post Matt!
>
> Fluentbit, Fluentd, Logstash, and Filebeat are the main tools used for log
> forwarding. While they all have some amount of pluggability, none of them are
> as flexible as Flume. In addition, as I have mentioned before, none of them
> provide guaranteed delivery so I would never recommend them for forwarding
> audit logs.
>
> I have also previously explained my use case for using Flume, which is for
> forwarding Call Detail Records that start off as records in the Radius
> protocol [1] across data centers, which also requires guaranteed delivery.
> I wouldn’t be able to use any of those other tools to do that without
> significant modification.
>
> I am all for supporting standards. If you can outline what you are
> proposing on a Confluence page I would wholeheartedly support it.
>
> As you probably know, I started work on separating out things that I don’t
> consider to be “core” to Flume into separate repos. That work is only half
> completed. I would suggest that you consider whether what you are proposing
> also be in its own repo. As it is, the CI for Flume fails because the
> generated logs are exceeding the available disk space. In addition, the
> build takes a long time.
>
> Also, I have never really been a big fan of the configuration mechanism
> Flume uses. I was able to somewhat bypass it by implementing support for
> Spring Boot, but it would be great if the Log4j Plugin system could somehow
> be used to simplify configuring Flume for those who don’t want to use
> Spring boot. I know that is right up your alley.
>
> Ralph
>
>
> 1. https://networkradius.com/doc/current/introduction/RADIUS.html
>
> > On Nov 29, 2023, at 5:32 PM, Matt Sicker  wrote:
> >
> > One of the main reasons why I supported Flume joining this PMC was that
> I noticed it has significant overlap with projects in the observability
> space despite not being advertised as such. For example, the project
> FluentBit is extremely similar to Flume, but its main purpose is for
> collecting, processing, forwarding, etc., logs, metrics, and traces (i.e.,
> observability data). FluentBit is not the only thing in this space, though
> it seems to be fairly popular. These sorts of tools are used for ultimately
> publishing observability data to one or more observability tools like
> Prometheus, Splunk, Jaeger, Grafana, etc., and with a unified collector and
> processor, it becomes possible to publish all your observability data into
> one tool rather than three or more disparate tools (and the added
> operational costs of storing tons of duplicated log data from three or more
> methods of generating log data).
> >
> > A project at the CNCF, OpenTelemetry, has become the sort of de facto
> standard for interoperability in this space. In particular, they’ve
> published the OTLP specification <
> https://opentelemetry.io/docs/specs/otlp/> for general telemetry data
> delivery and the OpenTelemetry specification <
> https://opentelemetry.io/docs/specs/otel/> for various common APIs. While
> I’m still researching in this space, I think it would be useful for Flume
> to integrate with some of these APIs and SDKs (while other parts might be
> more relevant in our logging libraries instead). There is also the Open
> Agent Management Protocol 
> which is still in beta status that might also be relevant here (and
> potentially relevant in the logging libraries).
> >
> > Supporting common standards for our projects seems like a useful thing
> to do, and despite the popularity of some existing solutions there, I
> believe there is plenty of space for us to contribute. I also think that
> this can provide opportunity for the various components in this PMC to
> interoperate as these specs are fairly language neutral with some sample
> versions in many different languages.
>
>
>