Kafka error handling

2018-02-15 Thread Maria Pilar
Hi everyone,

I'm designing the error handling for my Kafka streaming API, and I would
like to know of any documentation or best practices that I can read.
Basically I'm creating some error topics for failed messages, plus retry
topics.
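
To make it concrete, the pattern I have in mind is roughly the sketch
below (the class, topic and header names are examples of mine, not an
established API):

import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Sketch of the error-topic idea: when processing a record fails, publish
// it unchanged to an error topic and keep the failure reason in a header,
// so a separate job can retry or inspect it later. All names are examples.
public class ErrorTopicForwarder {

    private final KafkaProducer<String, String> producer;
    private final String errorTopic;

    public ErrorTopicForwarder(KafkaProducer<String, String> producer,
                               String errorTopic) {
        this.producer = producer;
        this.errorTopic = errorTopic;
    }

    public void forward(ConsumerRecord<String, String> failed, Exception cause) {
        ProducerRecord<String, String> out =
            new ProducerRecord<>(errorTopic, failed.key(), failed.value());
        // Keep the failure reason with the message for later inspection.
        out.headers().add("error.reason",
            String.valueOf(cause.getMessage()).getBytes(StandardCharsets.UTF_8));
        producer.send(out);
    }
}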

Any suggestions?

Thanks


Consumer group

2018-02-05 Thread Maria Pilar
Hi everyone,

I have designed a Kafka API solution that defines some topics; these
topics are produced by another API connected to a Cassandra data model.
I need to preserve the logical, sequential order of the events, meaning I
need to guarantee that create, update and delete events reach the consumer
in the order they happened.

At the beginning we only had one consumer, so the solution was one-to-one:
one entity to one topic/partition, which guarantees the order, as the best
practice explains.

However, now we have a new consumer for which the message order matters as
well. The problem is that if I create consumer groups for my different
consumers and add more partitions to each topic, I can no longer guarantee
that sequential order.

So I'm thinking of creating a different topic for each consumer group, but
I'm not sure about the performance implications or other issues I might run
into.
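
The alternative I am weighing is to keep one single-partition topic and
simply add a second consumer group, since as far as I understand each
group.id receives its own copy of every event, in the same order. A minimal
sketch of that setup (topic and group names are examples of mine):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Sketch: run this twice with different group.id values ("group-a",
// "group-b"). Each group receives every event independently, and with a
// single partition both see the same total order. Names are examples only.
public class TwoGroupsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", args.length > 0 ? args[0] : "group-a");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("customer-events"));
            while (true) {
                for (ConsumerRecord<String, String> r : consumer.poll(Duration.ofSeconds(1))) {
                    System.out.printf("group=%s offset=%d key=%s%n",
                        props.get("group.id"), r.offset(), r.key());
                }
            }
        }
    }
}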

Any suggestions or feedback, please?

Thanks


Choose the number of partitions/topics

2018-01-29 Thread Maria Pilar
Hi everyone,

I have designed an integration between two systems through our Kafka
streaming API, and the requirements make it unclear how to choose the
number of partitions/topics properly.

This is the use case:

My producer will send 28 different types of events, so I have decided to
create 28 topics.

The maximum size of one message will be 4,096 bytes, and the total volume
will be 2,469.888 MB/day.

The retention will be 2 days.

By default I'm thinking of one partition which, according to Confluent's
recommendation, can handle about 10 MB/second.
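
As a sanity check on those numbers, assuming the volume is spread evenly
over the day:

    2,469.888 MB/day / 86,400 s/day ~ 0.029 MB/s in total across the 28 topics

so raw throughput alone would be far below what one partition can handle.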

However, the requirement for the consumer is minimal latency (sub 3
seconds), so I'm thinking of creating more partitions per topic to
parallelize consumption and achieve the throughput.

Do you know the best practice or formula to define this properly?
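
The only rule of thumb I have found so far is the one from the Confluent
blog,

    partitions >= max(T/P, T/C)

where T is the target throughput and P and C are the measured per-partition
producer and consumer throughput, but I am not sure how to factor the
sub-3-second latency requirement into it.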

Thanks


Best practices Partition Key

2018-01-25 Thread Maria Pilar
Hi everyone,

I'm trying to understand the best practice for defining the partition key.
I have defined some topics that are related to entities in the Cassandra
data model; the relationship is one-to-one, one entity - one topic, because
I need to ensure the proper ordering of the events. I have also created a
single partition per topic to guarantee it.

If I use Kafka like a datastore and search through the records, I
understand it could be a best practice to use the Cassandra partition key
(e.g. Customer ID) as the partition key in Kafka.
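
To illustrate, a minimal producer sketch (topic name, key and payload are
examples of mine); as far as I know the default partitioner hashes the key,
so all events for one customer land on the same partition, which keeps them
in order:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Sketch: use the Cassandra partition key (e.g. Customer ID) as the Kafka
// message key. The default partitioner hashes the key, so every event for
// the same customer lands on the same partition and keeps its order.
// Topic name, key and payload are examples only.
public class KeyedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String customerId = "customer-42";
            producer.send(new ProducerRecord<>("customer-events", customerId,
                "{\"type\":\"UPDATE\",\"customerId\":\"" + customerId + "\"}"));
        }
    }
}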

Any comments, please?

Thanks


Error sink parse Json

2018-01-18 Thread Maria Pilar
Hi everyone,

I'm trying to send events from a Kafka topic to Google Cloud Pub/Sub
through Kafka Connect.

This is the sink connector JSON configuration I’ve used:



{
  "name": "CPSConnector",
  "config": {
    "connector.class": "com.google.pubsub.kafka.sink.CloudPubSubSinkConnector",
    "tasks.max": "1",
    "topics": "STREAM-CUSTOMER-ACCOUNTS",
    "cps.topic": "test",
    "cps.project": "test-dev",
    "maxBufferSize": "10"
  }
}


and this is the converter configuration I have in the properties file
(basically the defaults):




key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true

internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false





but I’m getting this stack trace in the Connect log:



"org.apache.kafka.connect.errors.DataException: Converting byte[] to Kafka
Connect data failed due to serialization error: \n\tat
org.apache.kafka.connect.json.JsonConverter.toConnectData(JsonConverter.java:304)\n\tat
org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:453)\n\tat
org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:287)\n\tat
org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:198)\n\tat
org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:166)\n\tat
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)\n\tat
org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)\n\tat
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat
java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat
java.lang.Thread.run(Thread.java:748)\nCaused by:
org.apache.kafka.common.errors.SerializationException:
com.fasterxml.jackson.core.JsonParseException: Unrecognized token
'ac0e69cb': was expecting ('true', 'false' or 'null')\n at [Source:
(byte[])\"ac0e69cb-5cab-4134-90c1-91ac70ce8b11\"; line: 1, column:
10]\nCaused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized
token 'ac0e69cb': was expecting ('true', 'false' or 'null')\n at [Source:
(byte[])\"ac0e69cb-5cab-4134-90c1-91ac70ce8b11\"; line: 1, column:
10]\n\tat
com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1798)\n\tat
com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:673)\n\tat
com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3527)\n\tat
com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleUnexpectedValue(UTF8StreamJsonParser.java:2622)\n\tat
com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:826)\n\tat
com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:723)\n\tat
com.fasterxml.jackson.databind.ObjectMapper._readTreeAndClose(ObjectMapper.java:4030)\n\tat
com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2559)\n\tat
org.apache.kafka.connect.json.JsonDeserializer.deserialize(JsonDeserializer.java:50)\n\tat
org.apache.kafka.connect.json.JsonConverter.toConnectData(JsonConverter.java:302)\n\tat
org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:453)\n\tat
org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:287)\n\tat
org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:198)\n\tat
org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:166)\n\tat
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)\n\tat
org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)\n\tat
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat
java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat
java.lang.Thread.run(Thread.java:748)\n"




I'm not sure what configuration is needed to be able to send the events.
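
From the stack trace it looks like the record key is a plain UUID string
("ac0e69cb-5cab-4134-90c1-91ac70ce8b11") rather than JSON, so the
JsonConverter with schemas enabled cannot parse it. Would it be enough to
change only the key converter, something like this (just my guess)?

key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true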


Thanks


One type of event per topic?

2018-01-18 Thread Maria Pilar
Hi everyone,

I'm working on the topic configuration for an integration between an API
and a data platform. We have created a topic for each entity that needs to
be integrated into the data warehouse.


My question, and I hope you can help me with it, is this: each entity will
have different types of events, for example create entity or update entity,
and I'm not sure whether to create one topic per event type, or to share
different event types within one topic, although I think that option adds
more complexity.
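
For the shared-topic option, what I imagine is tagging each message with
its event type and dispatching on it in the consumer; a rough sketch (the
header name and type values are examples of mine):

import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.header.Header;

// Sketch for the shared-topic option: each record carries its event type
// in a header and the consumer dispatches on it. Names are examples only.
public class EventDispatcher {
    public void dispatch(ConsumerRecord<String, String> record) {
        Header typeHeader = record.headers().lastHeader("event.type");
        String type = (typeHeader == null) ? "UNKNOWN"
            : new String(typeHeader.value(), StandardCharsets.UTF_8);
        switch (type) {
            case "ENTITY_CREATED":
                // handle create
                break;
            case "ENTITY_UPDATED":
                // handle update
                break;
            default:
                // unknown type: log it or send it to an error topic
                break;
        }
    }
}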

Thanks a lot
Maria