Artemis source connector into Apache Kafka

2018-05-30 Thread meow licous
Hi,


I am trying to write a source connector from Apache Artemis into Kafka.
Here a `Source Partition` doesn't make much sense, since you can't keep track
of a position in Artemis; you can only `Ack` or `Nack` a message.

Is Connect only for sources that support reading from a specific point, like
a file position, a DB timestamp, etc.?

Is it possible to copy these messages into Kafka in a lossless manner?
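
To make the question concrete, here is roughly the shape I have in mind (only a
sketch; ArtemisSession/ArtemisMessage are stand-ins for whatever Artemis client
API ends up being used, not a working connector): the task returns records with
no source partition/offset, and only acks the Artemis message from
commitRecord(), which (as I understand it) Connect calls after the record has
been written to Kafka.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

public class ArtemisSourceTask extends SourceTask {

    private ArtemisSession session;   // hypothetical wrapper around the Artemis consumer
    private String kafkaTopic;

    @Override
    public void start(Map<String, String> props) {
        kafkaTopic = props.get("kafka.topic");
        session = ArtemisSession.connect(props);   // assumption: connect + subscribe here
    }

    @Override
    public List<SourceRecord> poll() throws InterruptedException {
        // Hand messages to Connect without any source partition/offset to track.
        return session.receive().stream()          // assumed blocking receive of ArtemisMessage
                .map(m -> new SourceRecord(null, null, kafkaTopic,
                        Schema.BYTES_SCHEMA, m.body()))
                .collect(Collectors.toList());
    }

    @Override
    public void commitRecord(SourceRecord record) {
        // Called once this record has been written to Kafka; only ack in Artemis here,
        // so unacked messages are redelivered (at-least-once rather than exactly-once).
        session.ack(record);   // assumption: the session maps the record back to its message
    }

    @Override
    public void stop() {
        if (session != null) session.close();
    }

    @Override
    public String version() {
        return "0.1-sketch";
    }
}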

- Meow


Re: Question regarding offset commits by Consumers - docs

2018-05-30 Thread M. Manna
That's what I was after! Thanks a lot.

On 30 May 2018 at 17:29, Emmett Butler  wrote:

> From the docs:
>
> > The brokers periodically compact the offsets topic since it only needs to
> > maintain the most recent offset commit per partition
>
> Offsets are maintained per partition per consumer group, so it doesn't
> matter which member of a consumer group is reading a given partition - the
> offset will remain consistent.
>
>
> On Wed, May 30, 2018 at 9:23 AM, M. Manna  wrote:
>
> > Hello,
> >
> > I was trying to remember from the docs (it's been a while) how the last
> > committed offsets work, i.e. whether they are tracked on a per-consumer-group
> > basis or something else. This is specific to when auto.offset.reset is
> > set to "earliest"/"latest" and the last committed offset is determined.
> >
> > For example:
> >
> > 1) C0, C1, C2 are subscribed to topics t0 and t1 - each topic has 3
> > partitions:
> > 2) Using range assignment - C0 [t0p0,t1p0] C1[t0p1,t1p1] and
> C2[t0p2,t1p2]
> > 3) Now C2 dies, leaves, or "deactivates" - a rebalance occurs
> > 4) After the rebalance - C0[t0p0, t0p1, t1p0, t1p1] and C1[t0p2, t1p2] are
> > the new assignments
> >
> > Assuming that C0, C1, C2 are all part of the *same* consumer group, and C2
> > successfully committed offset #6 (for t0p2 and t1p2) before leaving, will #6
> > be considered the latest committed offset for that consumer group? In other
> > words, after that partition is reassigned to an existing consumer in the
> > group, will that consumer acknowledge #6 as the latest offset?
> >
> > If someone could quote the docs, it'll be appreciated. Meanwhile, I will
> > try to figure out something from the code if possible.
> >
> > Thanks,
> >
>
>
>
> --
> Emmett Butler | Senior Software Engineer
>


Question regarding offset commits by Consumers - docs

2018-05-30 Thread M. Manna
Hello,

I was trying to remember from the docs (it's been a while) how the last
committed offsets work, i.e. whether they are tracked on a per-consumer-group
basis or something else. This is specific to when auto.offset.reset is set to
"earliest"/"latest" and the last committed offset is determined.

For example:

1) C0, C1, C2 are subscribed to topics t0 and t1 - each topic has 3
partitions:
2) Using range assignment - C0 [t0p0,t1p0] C1[t0p1,t1p1] and C2[t0p2,t1p2]
3) Now C2 dies, leaves, or "deactivates" - a rebalance occurs
4) After the rebalance - C0[t0p0, t0p1, t1p0, t1p1] and C1[t0p2, t1p2] are the
new assignments

Assuming that C0, C1, C2 are all part of the *same* consumer group, and C2
successfully committed offset #6 (for t0p2 and t1p2) before leaving, will #6
be considered the latest committed offset for that consumer group? In other
words, after that partition is reassigned to an existing consumer in the
group, will that consumer acknowledge #6 as the latest offset?

If someone could quote the docs, it'll be appreciated. Meanwhile, I will
try to figure out something from the code if possible.

Thanks,


Re: Question regarding offset commits by Consumers - docs

2018-05-30 Thread Emmett Butler
From the docs:

> The brokers periodically compact the offsets topic since it only needs to
> maintain the most recent offset commit per partition

Offsets are maintained per partition per consumer group, so it doesn't
matter which member of a consumer group is reading a given partition - the
offset will remain consistent.
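
If it helps to sanity check, here is a minimal sketch (broker address, group id
and topic are placeholders, not anything from your setup): any member of the
group can read back the group's committed offset for a partition with
committed(), regardless of which consumer originally committed it.

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CommittedOffsetCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                 // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Committed offsets are stored per (group, topic, partition), so whichever
            // consumer is assigned t0-2 after the rebalance sees the value C2 committed.
            OffsetAndMetadata committed = consumer.committed(new TopicPartition("t0", 2));
            System.out.println("last committed offset for t0-2: "
                    + (committed == null ? "none" : committed.offset()));
        }
    }
}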


On Wed, May 30, 2018 at 9:23 AM, M. Manna  wrote:

> Hello,
>
> I was trying to remember from the docs (it's been a while) how the last
> committed offsets work, i.e. whether they are tracked on a per-consumer-group
> basis or something else. This is specific to when auto.offset.reset is set to
> "earliest"/"latest" and the last committed offset is determined.
>
>
> For example:
>
> 1) C0, C1, C2 are subscribed to topics t0 and t1 - each topic has 3
> partitions:
> 2) Using range assignment - C0 [t0p0,t1p0] C1[t0p1,t1p1] and C2[t0p2,t1p2]
> 3) Now C2 dies, leaves, or "deactivates" - a rebalance occurs
> 4) After the rebalance - C0[t0p0, t0p1, t1p0, t1p1] and C1[t0p2, t1p2] are the
> new assignments
>
> Assuming that C0, C1, C2 are all part of the *same* consumer group, and C2
> successfully committed offset #6 (for t0p2 and t1p2) before leaving, will #6
> be considered the latest committed offset for that consumer group? In other
> words, after that partition is reassigned to an existing consumer in the
> group, will that consumer acknowledge #6 as the latest offset?
>
> If someone could quote the docs, it'll be appreciated. Meanwhile, I will
> try to figure out something from the code if possible.
>
> Thanks,
>



-- 
Emmett Butler | Senior Software Engineer



Kafka Connect Fails - java.lang.OutOfMemoryError: Direct buffer memory

2018-05-30 Thread Bunty Ray
We enabled SSL on Kafka Connect, and after that we are seeing the
OutOfMemoryError below.

[2018-05-30 13:15:36,006] ERROR Task rxrdah-hdfs-sink-uat-55 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
java.lang.OutOfMemoryError: Direct buffer memory
    at java.nio.Bits.reserveMemory(Bits.java:693)
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
    at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174)
    at sun.nio.ch.IOUtil.read(IOUtil.java:195)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at org.apache.kafka.common.network.PlaintextTransportLayer.read(PlaintextTransportLayer.java:109)
    at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:101)
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:75)
    at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:203)
    at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:167)
    at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:381)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:326)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:433)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:232)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:208)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:184)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:214)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:200)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.pollConsumer(WorkerSinkTask.java:366)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:246)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


Re: retention.ms not honored for topic

2018-05-30 Thread Thomas Hays
I tried setting segment.ms=360 and retention.ms=6 and the
segments were still not deleting after several hours. Also, this is a
high volume topic, so log.segment.bytes=1073741824 is easily met
within an hour.

What seems to have worked was to shut down all Kafka brokers, manually
delete all data for that topic across all brokers, and restart the
brokers. Since then they have been deleting data according to the
retention.ms setting. I am now monitoring the logs and will slowly
increase retention.ms back to the desired 2 days. We have run Kafka in
this config for three years and this is the first time this has
happened. No idea what triggered the issue, and no errors/warnings were
in the Kafka logs.
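
For anyone scripting this later: the topic-level override Shantanu suggests
below can also be applied through the AdminClient instead of kafka-topics.sh
(that needs a 0.11+ broker, so not the 0.10.2.1 cluster above). A rough
sketch; the broker address and values are placeholders, not our production
settings:

import java.util.Arrays;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class SetTopicRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "elk");
            // retention.ms: how long closed segments are kept.
            // segment.ms: how often the active segment is rolled so its messages
            // become eligible for deletion.
            Config overrides = new Config(Arrays.asList(
                    new ConfigEntry("retention.ms", "172800000"),   // placeholder: 48 h
                    new ConfigEntry("segment.ms", "3600000")));     // placeholder: 1 h
            // Note: alterConfigs is not incremental; any topic override not listed
            // here is reset to the broker default.
            admin.alterConfigs(Collections.singletonMap(topic, overrides)).all().get();
        }
    }
}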


On Wed, May 30, 2018 at 12:50 AM, Shantanu Deshmukh
 wrote:
> Hey,
>
> You should try setting a topic-level config by running: kafka-topics.sh --alter
> --topic <topic> --config <key>=<value> --zookeeper <zookeeper-host:port>
>
> Make sure you also set segment.ms for topics which are not that populous.
> This setting specifies the amount of time after which a new segment is rolled.
> Kafka deletes only messages that lie in a segment which is old or full; it
> doesn't touch the current (active) segment. So if we roll soon enough, the
> chances of messages in it becoming eligible for the retention.ms setting
> increase. I am not fully sure what effect a very low segment.ms value might
> have on cluster resources, as the broker might spend too much time just
> rolling many segments, so keep it at some reasonable value.
>
> On Tue, May 29, 2018 at 9:31 PM Thomas Hays  wrote:
>
>> A single topic does not appear to be honoring the retention.ms
>> setting. Three other topics (plus __consumer_offsets) on the Kafka
>> instance are deleting segments normally.
>>
>> Kafka version: 2.12-0.10.2.1
>> OS: CentOS 7
>> Java: openjdk version "1.8.0_161"
>> Zookeeper: 3.4.6
>>
>> Retention settings (from kafka-topics.sh describe): Topic:elk
>> PartitionCount:50 ReplicationFactor:2 Configs:retention.ms=720
>>
>> Other config settings from server.properties
>>
>> log.retention.hours=48
>> log.segment.bytes=1073741824
>> log.retention.check.interval.ms=30
>>
>> Looking in the data directory, I see multiple segment files older than 48
>> hours:
>>
>> -rw-r--r-- 1 root root 1073676782 May 26 20:16 004713142447.log
>> -rw-r--r-- 1 root root 1073105605 May 26 20:18 004715239774.log
>> -rw-r--r-- 1 root root 1072907965 May 26 20:20 004717450325.log
>>
>> Current date/time on server: Tue May 29 10:51:49 CDT 2018
>>
>> This issue appears on all Kafka brokers and I have tried multiple
>> rolling restarts of all Kafka brokers and the issue remains. These
>> servers stopped deleting segments for this topic on May 15. This does
>> not correlate to any known config change. I have found no
>> error/warning messages in the logs to indicate a problem.
>>
>> What am I missing? Thank you.
>>


Errors observed in performance test using kafka-producer-perf-test.sh

2018-05-30 Thread Localhost shell
Hello All,

I am trying to perform a benchmark test in our Kafka environment. I have played
with a few configurations such as request.timeout.ms, max.block.ms, and
throughput, but am not able to avoid these errors:
org.apache.kafka.common.errors.TimeoutException: The request timed out.
org.apache.kafka.common.errors.NetworkException: The server disconnected
before a response was received.
org.apache.kafka.common.errors.TimeoutException: Expiring 148 record(s) for
benchmark-6-3r-2isr-none-0: 182806 ms has passed since last append

Produce Perf Test command:
nohup sh ~/kafka/kafka_2.11-1.0.0/bin/kafka-producer-perf-test.sh --topic
benchmark-6p-3r-2isr-none --num-records 1000 --record-size 100
--throughput -1 --print-metrics --producer-props acks=all
bootstrap.servers=node1:9092,node2:9092,node3:9092 request.timeout.ms=18
max.block.ms=18 buffer.memory=1 >
~/kafka/load_test/results/6p-3r-10M-100B-t-1-ackall-rto3m-block2m-bm100m-2
2>&1
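
For context, the producer-props above are just the standard producer configs;
this is roughly what the knobs I have been tuning look like in Java (the
numeric values here are placeholders, not the exact ones from the run):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class PerfProducerConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "node1:9092,node2:9092,node3:9092");
        props.put(ProducerConfig.ACKS_CONFIG, "all");                   // wait for all in-sync replicas
        // As far as I understand, in this client version batches waiting in the
        // accumulator are expired once request.timeout.ms has passed since the last
        // append -- which is the "Expiring ... record(s)" error above.
        props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 180000);    // placeholder (3 min)
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 120000L);         // placeholder: how long send() blocks when buffer.memory is full
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 100L * 1024 * 1024); // placeholder (100 MB)
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("benchmark-6p-3r-2isr-none", "payload"));
        }
    }
}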

Cluster: 3 nodes; topic: 6 partitions, RF=3 and minISR=2.
I am monitoring the Kafka metrics using a TSDB and Grafana. I know that
disk I/O performance is bad [disk await (1.5 secs), I/O queue size, and disk
utilization metrics are high (60-75%)], but I don't see anything in the Kafka
logs that relates the slow disk I/O to the above perf errors.

I have even run the test with throughput=1000 (all other params the same) but
still get timeout exceptions.

Any suggestions to help understand the issue and fix the above errors?

--Unilocal