[ https://issues.apache.org/jira/browse/SAMZA-388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini updated SAMZA-388:
----------------------------------

    Attachment: SAMZA-388-0.patch

Attaching patch. RB at:

https://reviews.apache.org/r/25039/

# Forced checkpoint manager's producer to disable compression.
# Forced checkpoint manager to turn on log compaction when creating a 
checkpoint topic.
# Wrote a test to validate that the topic is created with compaction.

Defaulted to a 25 MB segment size for log-compacted checkpoint topics, which 
should be enough for checkpoint topics with many messages. We shouldn't have to 
worry too much about file handles on the broker, since there should be only a 
couple of segments per checkpoint topic.
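
For readers without the patch handy, here is a rough sketch of the settings 
described above. It is not the code from the patch. The class and method names 
are hypothetical, it reads "25 meg" as 25 MiB, and it assumes the Kafka 
topic-level settings cleanup.policy and segment.bytes plus the old (0.8) 
producer's compression.codec setting (the newer Java producer calls this 
compression.type instead).

```java
import java.util.Properties;

// Hypothetical illustration only; names are not taken from the Samza patch.
class CheckpointTopicConfigSketch {
    // Assumption: "25 meg" is interpreted as 25 MiB (26214400 bytes).
    static final int SEGMENT_BYTES = 25 * 1024 * 1024;

    /** Topic-level settings for a log-compacted checkpoint topic. */
    static Properties checkpointTopicConfig() {
        Properties p = new Properties();
        p.setProperty("cleanup.policy", "compact");
        p.setProperty("segment.bytes", Integer.toString(SEGMENT_BYTES));
        return p;
    }

    /** Copies the user's producer settings, then forces compression off. */
    static Properties checkpointProducerConfig(Properties userConfig) {
        Properties p = new Properties();
        for (String name : userConfig.stringPropertyNames()) {
            p.setProperty(name, userConfig.getProperty(name));
        }
        // Applied last so it wins over whatever codec the job configured.
        p.setProperty("compression.codec", "none");
        return p;
    }

    public static void main(String[] args) {
        Properties user = new Properties();
        user.setProperty("compression.codec", "gzip");
        System.out.println(checkpointTopicConfig());
        System.out.println(checkpointProducerConfig(user));
    }
}
```

The override is applied after copying the user's settings, so a job that 
configures gzip or snappy for its other producers would still write 
uncompressed checkpoint messages.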

> Log compaction on checkpoint topics fails with compression
> ----------------------------------------------------------
>
>                 Key: SAMZA-388
>                 URL: https://issues.apache.org/jira/browse/SAMZA-388
>             Project: Samza
>          Issue Type: Bug
>          Components: kafka
>    Affects Versions: 0.8.0
>            Reporter: Chris Riccomini
>         Attachments: SAMZA-388-0.patch
>
>
> I have a job that has 10,000+ partitions that it's consuming from. After 
> SAMZA-123, it's been switched to use the GroupBySystemStreamPartition 
> strategy, which means it's got 10,000+ tasks, and thus, 10,000+ checkpoint 
> messages being sent every minute.
> To keep the checkpoint topic from getting too large, we enabled log 
> compaction on the Kafka topic, but we discovered that the topic then grew to 
> be very large. This behavior was triggered because we were sending compressed 
> messages to the Kafka checkpoint topic.
> Based on KAFKA-1374, it appears that we can't use compressed checkpoint 
> topics with log compaction.
> I'm mostly opening this ticket as a placeholder for KAFKA-1374. Once the 
> ticket is resolved, we can update the Samza code to default the checkpoint 
> topics to be log compacted (with a small segment size), and not worry about 
> the compression anymore.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
