Re: Turning on compaction - can I rolling restart?

2018-04-16 Thread Xin Li
Hey Sean,

You can do a rolling restart. After each broker comes back, run a describe on the 
topic to check that it has rejoined the ISR for its partitions, then continue with the next one.
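For example, after each restart, something like this (topic name and ZooKeeper host are placeholders):

kafka-topics.sh --describe --zookeeper <zk-host>:2181 --topic <your-topic>

Wait until the restarted broker shows up again in the Isr column for all of its partitions before moving on.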

Best,
Xin

On 13.04.18, 21:09, "McNealy, Sean"  wrote:

I have a cluster of 3 kafka brokers running 0.10.2 that have been running 
for a while with the property “log.cleaner.enable=false” and I would like to 
turn compaction on.

Is it safe to do this using a rolling restart? Will the cluster stay 
healthy with some brokers running with the compaction property turned on and 
some still off? Will some partitions start compacting and some not yet? Can 
partitions stay in sync?

Or does this property require stopping every broker simultaneously, making the 
change, and then starting them all again?

Thanks,

Sean





Re: Kafka/zookeeper logs in every command

2018-01-23 Thread Xin Li
Look for log4j.properties in the config directories.
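For the command-line tools that is usually config/tools-log4j.properties; something along these lines keeps the console quiet (a sketch only — the exact appender names may differ in your copy):

log4j.rootLogger=WARN, stderr
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.Target=System.err
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
log4j.appender.stderr.layout.ConversionPattern=[%d] %p %m (%c)%n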

Best,
Xin
On 23.01.18, 16:58, "José Ribeiro"  wrote:

Good morning.


I have a problem with Kafka logs showing up in my command output.

When I started working with Kafka, the output was normal. For example:


kafka-topics.sh --list --zookeeper localhost:2181

returned

test


Now, with the same command line, I get this:


SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/apache-hive-2.3.2-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/opt/kafka_2.11-1.0.0/libs/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
explanation.
SLF4J: Actual binding is of type 
[org.apache.logging.slf4j.Log4jLoggerFactory]
2018-01-23T11:26:53,964 INFO [ZkClient-EventThread-12-localhost:2181] 
org.I0Itec.zkclient.ZkEventThread - Starting ZkClient event thread.
2018-01-23T11:26:53,973 INFO [main] org.apache.zookeeper.ZooKeeper - Client 
environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2018-01-23T11:26:53,974 INFO [main] org.apache.zookeeper.ZooKeeper - Client 
environment:host.name=ubuntu
2018-01-23T11:26:53,974 INFO [main] org.apache.zookeeper.ZooKeeper - Client 
environment:java.version=1.8.0_151
2018-01-23T11:26:53,974 INFO [main] org.apache.zookeeper.ZooKeeper - Client 
environment:java.vendor=Oracle Corporation
2018-01-23T11:26:53,975 INFO [main] org.apache.zookeeper.ZooKeeper - Client 
environment:java.home=/usr/lib/jvm/java-8-oracle/jre
2018-01-23T11:26:53,975 INFO [main] org.apache.zookeeper.ZooKeeper - Client 
environment:java.class.path=:/opt/hadoop-2.9.0/lib/*:.:/opt/apache-hive-2.3.2-bin/lib/aether-util-0.9.0.M2.jar:/opt/apache-hive-2.3.2-bin/lib/datanucleus-core-4.1.17.jar:/opt/apache-hive-2.3.2-bin/lib/guice-servlet-4.1.0.jar:/opt/apache-hive-2.3.2-bin/lib/maven-model-builder-3.1.1.jar:/opt/apache-hive-2.3.2-bin/lib/jackson-jaxrs-1.9.13.jar:/opt/apache-hive-2.3.2-bin/lib/metrics-jvm-3.1.0.jar:/opt/apache-hive-2.3.2-bin/lib/httpclient-4.4.jar:/opt/apache-hive-2.3.2-bin/lib/antlr-runtime-3.5.2.jar:/opt/apache-hive-2.3.2-bin/lib/avatica-metrics-1.8.0.jar:/opt/apache-hive-2.3.2-bin/lib/hbase-server-1.1.1.jar:/opt/apache-hive-2.3.2-bin/lib/druid-common-0.9.2.jar:/opt/apache-hive-2.3.2-bin/lib/commons-lang-2.6.jar:/opt/apache-hive-2.3.2-bin/lib/datanucleus-rdbms-4.1.19.jar:/opt/apache-hive-2.3.2-bin/lib/netty-all-4.0.52.Final.jar:/opt/apache-hive-2.3.2-bin/lib/hive-accumulo-handler-2.3.2.jar:/opt/apache-hive-2.3.2-bin/lib/log4j-core-2.6.2.jar:/opt/apache-hive-2.3.2-bin/lib/bonecp-0.8.0.RELEASE.jar:/opt/apache-hive-2.3.2-bin/lib/org.abego.treelayout.core-1.0.1.jar:/opt/apache-hive-2.3.2-bin/lib/accumulo-core-1.6.0.jar:/opt/apache-hive-2.3.2-bin/lib/geronimo-jaspic_1.0_spec-1.0.jar:/opt/apache-hive-2.3.2-bin/lib/validation-api-1.1.0.Final.jar:/opt/apache-hive-2.3.2-bin/lib/jetty-6.1.26.jar:/opt/apache-hive-2.3.2-bin/lib/calcite-core-1.10.0.jar:/opt/apache-hive-2.3.2-bin/lib/hive-hplsql-2.3.2.jar:/opt/apache-hive-2.3.2-bin/lib/javax.servlet-3.0.0.v201112011016.jar:/opt/apache-hive-2.3.2-bin/lib/jsp-api-2.1-6.1.14.jar:/opt/apache-hive-2.3.2-bin/lib/regexp-1.3.jar:/opt/apache-hive-2.3.2-bin/lib/jackson-dataformat-smile-2.4.6.jar:/opt/apache-hive-2.3.2-bin/lib/jackson-datatype-guava-2.4.6.jar:/opt/apache-hive-2.3.2-bin/lib/transaction-api-1.1.jar:/opt/apache-hive-2.3.2-bin/lib/wagon-provider-api-2.4.jar:/opt/apache-hive-2.3.2-bin/lib/hive-shims-common-2.3.2.jar:/opt/apache-hive-2.3.2-bin/lib/hibernate-validator-5.1.3.Final.jar:/opt/apache-hive-2.3.2-bin/lib/servlet-api-2.5-6.1.14.jar:/opt/apache-hive-2.3.2-bin/lib/maven-scm-api-1.4.jar:/opt/apache-hive-2.3.2-bin/lib/druid-hdfs-storage-0.9.2.jar:/opt/apache-hive-2.3.2-bin/lib/config-magic-0.9.jar:/opt/apache-hive-2.3.2-bin/lib/commons-dbcp2-2.0.1.jar:/opt/apache-hive-2.3.2-bin/lib/curator-x-discovery-2.11.0.jar:/opt/apache-hive-2.3.2-bin/lib/asm-3.1.jar:/opt/apache-hive-2.3.2-bin/lib/spymemcached-2.11.7.jar:/opt/apache-hive-2.3.2-bin/lib/metrics-json-3.1.0.jar:/opt/apache-hive-2.3.2-bin/lib/log4j-web-2.6.2.jar:/opt/apache-hive-2.3.2-bin/lib/commons-compress-1.9.jar:/opt/apache-hive-2.3.2-bin/lib/hive-vector-code-gen-2.3.2.jar:/opt/apache-hive-2.3.2-bin/lib/google-http-client-jackson2-1.15.0-rc.jar:/opt/apache-hive-2.3.2-bin/lib/jetty-util-6.1.26.jar:/opt/apache-hive-2.3.2-bin/lib/hive-cli-2.3.2.jar:/opt/apache-hive-2.3.2-bin/lib/commons-el-1.0.jar:/opt/apache-hive-2.3.2-bin/lib/hive-service-2.3.2.jar:/opt/apache-hive-2.3.2-bin/lib/jersey-server-1.14.jar:/opt/apache-hive-2.3.2-bin/lib/jackson-jaxrs-base-2.4.6.jar:/opt/apache-hive-2.3.2-bin/lib/log4j-1.2-api-2.6.2.jar:/opt/apache-hive-2.3.2-bin/lib/hive-shims-0.23-2.3.2.jar:/opt/apache-hive-2.3.2-bin/lib/compress-lzf-1.0.3.jar:/opt/apache-hive-2.3

Re: How to always consume from latest offset in kafka-streams

2018-01-22 Thread Xin Li
This?
consumer.auto.offset.reset = latest
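(Note this only applies when the group has no committed offsets yet. In a plain properties file for the Streams app, the consumer-prefixed form would look roughly like this — application.id and bootstrap.servers are placeholders:)

application.id=my-streams-app
bootstrap.servers=localhost:9092
consumer.auto.offset.reset=latest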

Best,
Xin

On 19.01.18, 19:34, "Saloni Vithalani"  wrote:

Our requirement is that if a kafka-streams app is consuming a
partition, it should start its consumption from the latest offset of that
partition.

This seems doable using

streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest")

Now, let's say that, using the above configuration, the kafka-streams app started
consuming data from the latest offset of a partition. After some time, the
app crashes. When the app comes back up, we want it to consume data from
the latest offset of that partition, instead of from where it left off.

But I can't find anything that helps achieve this using the kafka-streams API.

P.S. We are using kafka-1.0.0.
Saloni Vithalani
Developer
Email salo...@thoughtworks.com
Telephone +91 8552889571 <8552889571>
[image: ThoughtWorks]






Re: Best practice for publishing byte messages to Kafka

2018-01-11 Thread Xin Li
Protobuf?
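(If you do stay with raw bytes rather than a schema format, the stock setup discussed below is just the built-in serializers, roughly:)

key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer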

On 11.01.18, 01:33, "Ali Nazemian"  wrote:

Oops, I was mistaken. I meant serialization of an object as a byte array
in the first place!

On Wed, Jan 10, 2018 at 3:20 PM, Thunder Stumpges <
thunder.stump...@gmail.com> wrote:

> Byte Array is essentially "serialized" already, isn't it? I mean the message
> itself is sent as a byte array, so the default byte array serializer is as
> "efficient" as it gets, as it's just sending your byte array through as the
> message... there's no serialization happening.
> -Thunder
>
> On Tue, Jan 9, 2018 at 8:17 PM Ali Nazemian  wrote:
>
> > Thanks, Matt. Have you done any benchmarking to see how using different
> > Serializers may impact throughput/latency?
> >
> > Regards,
> > Ali
> >
> > On Wed, Jan 10, 2018 at 7:55 AM, Matt Farmer  wrote:
> >
> > > We use the default byte array serializer provided with Kafka and it
> works
> > > great for us.
> > >
> > > > On Jan 9, 2018, at 8:12 AM, Ali Nazemian 
> > wrote:
> > > >
> > > > Hi All,
> > > >
> > > > I was wondering whether there is any best practice/recommendation 
for
> > > > publishing byte messages to Kafka. Is there any specific Serializer
> > that
> > > is
> > > > recommended for this matter?
> > > >
> > > > Cheers,
> > > > Ali
> > >
> > >
> >
> >
> > --
> > A.Nazemian
> >
>



-- 
A.Nazemian




Re: Need feedbacks to consumer if hangs due to some __consumer_offsets partitions failed

2018-01-10 Thread Xin Li
Hey,

I think you can do it in two ways:

1. Manually increase the replication factor of __consumer_offsets (see the sketch below)
2. In the server.properties file, change default.replication.factor (for the offsets topic specifically, offsets.topic.replication.factor; this only affects topics created after the change)
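A rough sketch of option 1 (broker ids 0 and 1 assumed, only partition 0 shown — a real reassignment file would list all 50 __consumer_offsets partitions):

{"version":1,"partitions":[
  {"topic":"__consumer_offsets","partition":0,"replicas":[0,1]}
]}

kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file increase-offsets-rf.json --execute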
Best,
Xin

On 09.01.18, 07:46, "1095193...@qq.com" <1095193...@qq.com> wrote:

Hi,
My Kafka cluster has two brokers (0 and 1), and it created the __consumer_offsets 
topic with 50 partitions. When broker 0 failed, half of those partitions failed 
(because the __consumer_offsets topic has a replication factor of 1).
As a consequence, my KafkaConsumer sometimes works fine and sometimes hangs with 
no output. My consumer generates a new group id on each startup. On startup it 
requests the group coordinator for that group id. When the group id is routed to 
a healthy partition, the consumer works fine; when it is routed to a failed 
partition, it hangs with a GROUP_COORDINATOR_NOT_AVAILABLE response. I hope the 
broker can give feedback to the consumer when it hangs because some 
__consumer_offsets partitions have failed; that would help users improve the 
experience and debug such issues faster.



1095193...@qq.com




Re: Consumer Offsets Being Deleted by Broker

2017-12-10 Thread Xin Li
Hey,

I think what you're looking for is offsets.retention.minutes in the server-side (broker) 
configuration.
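For example, in server.properties (10080 minutes = 7 days is just an illustrative value; the default in 1.0.0 is 1440, i.e. one day):

offsets.retention.minutes=10080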

Best,
Xin

On 10.12.17, 07:13, "M. Musheer"  wrote:

Hi,
We are using the kafka_2.11-1.0.0 Kafka build with the default offset-related
configurations.
Issue:
Consumer offsets are being deleted, and we are not using auto-commit on the
consumer side.
Is there any configuration we need to add for consumer offset retention?

Please help us.



Thanks,
Musheer




Re: Log Compaction Not Picking up Topic

2017-10-25 Thread Xin Li
Hey Elmar,
The only thing you need to do is upgrade;
Kafka tracks the cleaned offsets in the cleaner-offset-checkpoint file.
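(That file lives in each log directory; roughly, it is a version line, an entry count, and then one "topic partition offset" line per cleaned partition — the topic names below are placeholders:)

0
2
my-topic 0 123456
my-topic 1 98765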

Best,
Xin


On 25.10.17, 12:34, "Elmar Weber"  wrote:

Hi,

thanks, I'll give it a try, we run on Kubernetes so it's not a big issue 
to replicate the whole env including data.

One question I'd have left:
- How can I force a re-compaction over the whole topic? I guess the log
   cleaner has marked everything so far as not cleanable, so how will it
   recheck the whole log?

Best,
Elmar




On 10/25/2017 12:29 PM, Jan Filipiak wrote:
> Hi,
> 
> Unfortunately there is nothing trivial you could do here.
> Without upgrading your Kafkas you can only bounce the partition back and
> forth between brokers so they compact while they are still small.
> 
> With upgrading you could also just cherry-pick this very commit or put a
> log statement in to verify.
> 
> Given the log sizes you're dealing with, I am very confident that this is
> your issue.
> 
> Best Jan
> 
> 
> On 25.10.2017 12:21, Elmar Weber wrote:
>> Hi,
>>
>> On 10/25/2017 12:15 PM, Xin Li wrote:
>> > I think that is a bug, and it should be fixed by this task:
>> > https://issues.apache.org/jira/browse/KAFKA-6030.
>> > We experienced it in our Kafka cluster; we just checked out the
>> > 0.11.0.2 version and built it ourselves.
>>
>> Thanks for the hint. As it looks like a calculation issue, would it be
>> possible to verify this by manually changing the clean ratio or some
>> other settings?
>>
>> Best,
>> Elmar
>>
> 
> 





Re: Log Compaction Not Picking up Topic

2017-10-25 Thread Xin Li
Hey, 
Because of the overflow, the dirty-ratio calculation comes out negative, and I guess 
the upgrade is the one-time fix for good.
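For context, the cleaner roughly ranks logs by

    dirty ratio = dirty bytes / (clean bytes + dirty bytes)

and only compacts a log once that exceeds min.cleanable.dirty.ratio (0.5 in your config); with the KAFKA-6030 overflow the computed ratio comes out negative, so the log never qualifies.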

And we've been running that build for quite a while; so far so good.

Best,
Xin


On 25.10.17, 12:21, "Elmar Weber"  wrote:

Hi,
    
On 10/25/2017 12:15 PM, Xin Li wrote:
> I think that is a bug, and it should be fixed by this task:
> https://issues.apache.org/jira/browse/KAFKA-6030.
> We experienced it in our Kafka cluster; we just checked out the
> 0.11.0.2 version and built it ourselves.

Thanks for the hint. As it looks like a calculation issue, would it be 
possible to verify this by manually changing the clean ratio or some 
other settings?

Best,
Elmar





Re: Log Compaction Not Picking up Topic

2017-10-25 Thread Xin Li
Hey,
I think that is a bug, and it should be fixed by this task:
https://issues.apache.org/jira/browse/KAFKA-6030.
We experienced it in our Kafka cluster; we just checked out the 0.11.0.2 version 
and built it ourselves.
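Roughly, building from source looks like this (a sketch only — the exact branch/tag and Gradle bootstrap steps may differ, see the project README; at the time this meant the 0.11.0 branch, and the fix later shipped in the 0.11.0.2 release):

git clone https://github.com/apache/kafka.git
cd kafka
git checkout 0.11.0.2
./gradlew releaseTarGz   # release tarball ends up under core/build/distributions/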

Best,
Xin


On 25.10.17, 12:04, "Manikumar"  wrote:

any errors in log cleaner logs?
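(The cleaner writes to its own file by default, so something along these lines should surface problems — the path is an assumption, adjust to your installation and log4j setup:)

grep -iE "error|exception" /opt/kafka/logs/log-cleaner.log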

On Wed, Oct 25, 2017 at 3:12 PM, Elmar Weber  wrote:

> Hello,
>
> I'm having trouble getting Kafka to compact a topic. It's over 300GB and
> has enough segments to warrant cleaning. It should only be about 40 GB
> (there is a copy in a db that is unique on the key). Below are the configs
> we have (default broker) and topic override.
>
>
> Is there something I'm missing about which setting overrides which, or is
> something else still configured wrongly?
>
> retention.ms and delete.retention.ms I set manually on the topic after
> creation, and some segments have been created already.
>
> Kafka version 0.11
>
> Server Defaults for new segments of the topic:
>
> The settings used when a new log was created for the topic:
>
> {compression.type -> producer, message.format.version -> 0.11.0-IV2,
> file.delete.delay.ms -> 6, max.message.bytes -> 2097152,
> min.compaction.lag.ms -> 0, message.timestamp.type -> CreateTime,
> min.insync.replicas -> 1, segment.jitter.ms -> 0, preallocate -> false,
> min.cleanable.dirty.ratio -> 0.5, index.interval.bytes -> 4096,
> unclean.leader.election.enable -> false, retention.bytes -> -1,
> delete.retention.ms -> 8640, cleanup.policy -> compact, flush.ms ->
> 9223372036854775807, segment.ms -> 60480, segment.bytes ->
> 1073741824, retention.ms -> -1, message.timestamp.difference.max.ms ->
> 9223372036854775807, segment.index.bytes -> 10485760, flush.messages ->
> 9223372036854775807}
>
> Topic Overrides (overridden after creation).
>
> {retention.ms=360, delete.retention.ms=360,
> max.message.bytes=10485760, cleanup.policy=compact}
>
>
>
> The full server startup config:
>
> advertised.host.name = null
> advertised.listeners = null
> advertised.port = null
> alter.config.policy.class.name = null
> authorizer.class.name =
> auto.create.topics.enable = false
> auto.leader.rebalance.enable = true
> background.threads = 10
> broker.id = 1
> broker.id.generation.enable = true
> broker.rack = europe-west1-c
> compression.type = producer
> connections.max.idle.ms = 60
> controlled.shutdown.enable = true
> controlled.shutdown.max.retries = 3
> controlled.shutdown.retry.backoff.ms = 5000
> controller.socket.timeout.ms = 3
> create.topic.policy.class.name = null
> default.replication.factor = 1
> delete.records.purgatory.purge.interval.requests = 1
> delete.topic.enable = true
> fetch.purgatory.purge.interval.requests = 1000
> group.initial.rebalance.delay.ms = 0
> group.max.session.timeout.ms = 30
> group.min.session.timeout.ms = 6000
> host.name =
> inter.broker.listener.name = null
> inter.broker.protocol.version = 0.11.0-IV2
> leader.imbalance.check.interval.seconds = 300
> leader.imbalance.per.broker.percentage = 10
> listener.security.protocol.map = SSL:SSL,SASL_PLAINTEXT:SASL_PL
> AINTEXT,TRACE:TRACE,SASL_SSL:SASL_SSL,PLAINTEXT:PLAINTEXT
> listeners = null
> log.cleaner.backoff.ms = 15000
> log.cleaner.dedupe.buffer.size = 134217728
> log.cleaner.delete.retention.ms = 8640
> log.cleaner.enable = true
> log.cleaner.io.buffer.load.factor = 0.9
> log.cleaner.io.buffer.size = 524288
> log.cleaner.io.max.bytes.per.second = 1.7976931348623157E308
> log.cleaner.min.cleanable.ratio = 0.5
> log.cleaner.min.compaction.lag.ms = 0
> log.cleaner.threads = 1
> l