gsoundar opened a new issue, #4115:
URL: https://github.com/apache/logging-log4j2/issues/4115
## Feature Request
### Description
Add a new `log4j-iceberg` module that provides an `IcebergAppender` plugin
for writing log events as Parquet-backed rows in an Apache Iceberg table. This
enables structured, columnar log storage with time-travel, schema evolution,
and partition pruning capabilities out of the box.
### Motivation
Modern observability pipelines increasingly rely on data lake formats
(Iceberg, Delta, Hudi) for log analytics due to their advantages over flat
files:
- **Columnar storage** (Parquet) enables efficient analytical queries over
large log volumes
- **Partition pruning** by date allows fast time-range scans without full
table reads
- **Schema evolution** means log schemas can be extended without rewriting
history
- **Time travel** enables querying historical log state at any snapshot
- **Catalog integration** (REST, Hive, AWS Glue) provides unified metadata
management
Log4j already supports structured output to databases (JDBC, Cassandra,
MongoDB) and message systems (Kafka, JMS). An Iceberg appender fills the gap
for the data lake ecosystem.
### Proposed Implementation
A new `log4j-iceberg` module with:
- `IcebergAppender` — Log4j plugin (`<Iceberg>`) that buffers events and
flushes them as Parquet data files
- `IcebergManager` — Manages catalog lifecycle, table creation, buffered
writes, and commit retry
- Table partitioned by `event_date` (day granularity)
- Schema validation on startup when loading existing tables
- Configurable catalog properties for S3 credentials, REST auth, etc.
- Exponential backoff retry on commit conflicts
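
The retry item above could look roughly like the following. This is a minimal sketch, not the code from the related PR: `CommitRetry`, `CommitConflictException`, and all parameter names are hypothetical. A real implementation would presumably catch Iceberg's own `org.apache.iceberg.exceptions.CommitFailedException` instead of the stand-in exception used here.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of the proposed module. */
final class CommitRetry {

    /** Stand-in for a commit-conflict error raised by the table commit. */
    static final class CommitConflictException extends RuntimeException {
        CommitConflictException(String message) {
            super(message);
        }
    }

    private CommitRetry() {}

    /** Delay before retry number {@code attempt} (0-based): base doubled per attempt, capped. */
    static long backoffMillis(int attempt, long baseDelayMs, long maxDelayMs) {
        long delay = baseDelayMs;
        for (int i = 0; i < attempt && delay < maxDelayMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMs);
    }

    /** Runs {@code commit}, retrying on conflicts up to {@code maxAttempts} total attempts. */
    static <T> T commitWithRetry(Supplier<T> commit, int maxAttempts,
                                 long baseDelayMs, long maxDelayMs) {
        for (int attempt = 0; ; attempt++) {
            try {
                return commit.get();
            } catch (CommitConflictException e) {
                if (attempt + 1 >= maxAttempts) {
                    throw e; // out of attempts: surface the conflict to the caller
                }
                long delay = backoffMillis(attempt, baseDelayMs, maxDelayMs);
                // Random jitter in [delay/2, delay] so concurrent writers do not retry in lockstep.
                long sleepMs = ThreadLocalRandom.current().nextLong(delay / 2, delay + 1);
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry commit", ie);
                }
            }
        }
    }
}
```

Capping the delay keeps a long run of conflicts from stalling the appender indefinitely. The jitter is an addition of this sketch and is not mentioned in the proposal.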
### Configuration Example
```xml
<Iceberg name="IcebergAppender"
         catalogName="my_catalog"
         catalogImpl="rest"
         catalogUri="http://localhost:8181"
         catalogWarehouse="s3://my-bucket/warehouse"
         tableNamespace="logs"
         tableName="app_logs"
         batchSize="1000"
         flushIntervalSeconds="30">
  <CatalogProperties>
    <Property name="s3.access-key-id">AKIA...</Property>
    <Property name="s3.secret-access-key">secret</Property>
  </CatalogProperties>
</Iceberg>
```
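
To illustrate how `batchSize` and `flushIntervalSeconds` might interact, here is a minimal sketch of a buffer that flushes when either threshold is reached. `EventBatcher` and its method names are hypothetical and are not taken from the PR. The sink stands in for whatever writes a Parquet data file and commits it, and timestamps are passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical buffer for illustration; flushes on size or elapsed time. */
final class EventBatcher<T> {
    private final int batchSize;
    private final long flushIntervalMillis;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();
    private long lastFlushMillis;

    EventBatcher(int batchSize, long flushIntervalMillis, long nowMillis, Consumer<List<T>> sink) {
        this.batchSize = batchSize;
        this.flushIntervalMillis = flushIntervalMillis;
        this.lastFlushMillis = nowMillis;
        this.sink = sink;
    }

    /** Buffers one event and flushes if the batch is full or the interval has elapsed. */
    synchronized void add(T event, long nowMillis) {
        buffer.add(event);
        boolean full = buffer.size() >= batchSize;
        boolean due = nowMillis - lastFlushMillis >= flushIntervalMillis;
        if (full || due) {
            flush(nowMillis);
        }
    }

    /** Hands the buffered events to the sink as one batch and resets the interval timer. */
    synchronized void flush(long nowMillis) {
        lastFlushMillis = nowMillis;
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = new ArrayList<>(buffer); // copy so the sink owns its own list
        buffer.clear();
        sink.accept(batch);
    }
}
```

This sketch only checks the interval when an event arrives. An appender would likely also need a timer thread calling `flush` during idle periods, and a final flush on shutdown so that buffered events are not lost.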
### Dependencies
- Apache Iceberg 1.10.1
- Apache Parquet 1.16.0
- Hadoop 3.4.1
### Related PR
- #4104