Hey guys,
The locking argument is correct for very small records (< 50 bytes); batching
will help here because for small records locking becomes the big bottleneck.
I think these use cases are rare but not unreasonable.
Overall I'd emphasize that the new producer is way faster at virtually all
use cases.
I must agree with @Roshan – it's hard to imagine anything more intuitive
and easy to use for atomic batching than the old sync batch API. Also, it's fast.
Coupled with a separate instance of the producer per
broker:port:topic:partition it works very well. I would be glad if it found
its way into the new producer.
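For anyone who hasn't used it, the old sync batch send looks roughly like
this (a sketch; the broker address, topic name, and sample records are
placeholders):

  import java.util.Properties
  import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

  val props = new Properties()
  props.put("metadata.broker.list", "broker1:9092") // placeholder broker
  props.put("serializer.class", "kafka.serializer.StringEncoder")
  props.put("producer.type", "sync")
  val producer = new Producer[String, String](new ProducerConfig(props))

  // send() is variadic, so the whole batch goes out in one sync call.
  val records = Seq("m1", "m2", "m3")
  producer.send(records.map(r => new KeyedMessage[String, String]("my-topic", r)): _*)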
We recently upgraded Kafka in our production cluster of 5 brokers from
0.8.0 to 0.8.2. Since then the ConsumerOffsetChecker script is unable to
fetch offsets due to
kafka.common.NotCoordinatorForConsumerException.
Note that I'm able to run the ConsumerOffsetChecker from the older version
0.8
@Ewen
No I did not use compression in my measurements.
Does increasing PartitionFetchInfo.fetchSize help?
Speaking of the Kafka API, it looks like throwing an exception would be less
confusing when fetchSize is not enough to get at least one message at the
requested offset.
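Something like the following (a sketch; the host, client id, and fetch size
are placeholders, with the topic and offset taken from your description):

  import kafka.api.FetchRequestBuilder
  import kafka.consumer.SimpleConsumer

  val consumer = new SimpleConsumer("broker1", 9092, 100000, 64 * 1024, "test-client")
  val request = new FetchRequestBuilder()
    .clientId("test-client")
    .addFetch("my-topic", 0, 123146L, 1024 * 1024) // fetchSize bumped to 1 MB
    .build()
  val response = consumer.fetch(request)

As far as I can tell, if fetchSize is smaller than the first message at the
requested offset, the fetch just returns no complete messages, which is
exactly the silent stall that an exception would make clearer.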
2015-04-28 21:12 GMT+03:00 Laran Evans :
> I’ve got a simple consumer. According to GetOf
@Joel,
If flush() works for this use case it may be an acceptable starting point
(although not as clean as a native batched sync). I am not yet clear about
some aspects of flush's batch semantics and its suitability for this mode
of operation. Allow me to explore it with you folks...
1) flush() gu
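To make this concrete, the usage I have in mind is roughly the following
(a sketch only, assuming a producer build that already exposes flush(); the
broker address, topic, and batch contents are placeholders):

  import java.util.Properties
  import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

  val props = new Properties()
  props.put("bootstrap.servers", "broker1:9092") // placeholder
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  val producer = new KafkaProducer[String, String](props)

  // Queue the whole batch, then block until everything in flight is sent.
  val batch = Seq("m1", "m2", "m3")
  val futures = batch.map(m => producer.send(new ProducerRecord[String, String]("my-topic", m)))
  producer.flush()
  futures.foreach(_.get()) // get() surfaces any per-record send error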
1. We’re using version 0.8.1.1.
2. No failures in the consumer logs
3. We’re using the ConsumerOffsetChecker to see what partitions are assigned to
the consumer group and what their offsets are. 8 of the 12 processes have each
been assigned two partitions and they’re keeping up with the topic. The
Hi,
I wonder if anyone has a good example of how to write Spark RDDs into
Kafka. Specifically, my question is whether there is an advantage to sending
a list of messages each time over sending one message at a time.
Sample code for sending one message at a time (simplified; producer setup
omitted and the topic name is a placeholder):

  dStream.foreachRDD(rdd => {
    rdd.collect().foreach(msg => producer.send(new KeyedMessage("my-topic", msg)))
  })
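And a sketch of what I mean by sending a list of messages each time (again
simplified; one producer is created per partition so it isn't serialized
with the closure, and the variadic send() takes the whole list in one call):

  import java.util.Properties
  import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

  dStream.foreachRDD(rdd => {
    rdd.foreachPartition(partition => {
      val props = new Properties()
      props.put("metadata.broker.list", "broker1:9092")
      props.put("serializer.class", "kafka.serializer.StringEncoder")
      val producer = new Producer[String, String](new ProducerConfig(props))
      // One send() for the whole partition instead of one call per message.
      val batch = partition.map(msg => new KeyedMessage[String, String]("my-topic", msg)).toSeq
      producer.send(batch: _*)
      producer.close()
    })
  })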
Ok, all of that makes sense. The only way to possibly recover from that
state is either for K2 to come back up allowing the metadata refresh to
eventually succeed or to eventually try some other node in the cluster.
Reusing the bootstrap nodes is one possibility. Another would be for the
client to
I’ve got a simple consumer. According to GetOffsetShell my offset is 209418.
But the SimpleConsumer doesn’t get any messages past offset 123146. It just
won’t pull down any messages after that offset. If I send more messages onto
the topic it still won’t pull them down. Though the offset does in
Couple of questions:
- What version of the consumer API are you using?
- Are you seeing any rebalance failures in the consumer logs?
- How do you determine that some partitions are unassigned? Just confirming
that you have partitions that are not being consumed from as opposed to
consumer threads
Kind of what you need but not quite:
Sqoop2 is capable of getting data from HDFS to Kafka.
AFAIK it doesn't support Hive queries, but feel free to open a JIRA for
Sqoop :)
Gwen
On Tue, Apr 28, 2015 at 4:09 AM, Svante Karlsson
wrote:
> What's the best way of exporting contents (avro encoded) fr
Also note that the metadata for the topic is missing. I tried creating a few
more topics and all have the same issue.
Using the Kafka console producer on the topic, I see these error messages
indicating the missing metadata:
WARN Error while fetching metadata [{TopicMetadata for topic my-topic ->
N
I’m sorry, I forgot to specify that these processes are in the same consumer
group.
Thanks,
Dave
On 4/28/15, 1:15 PM, "Aditya Auradkar" wrote:
>Hi Dave,
>
>The simple consumer doesn't do any state management across consumer instances.
>So I'm not sure how you are assigning partitions in y
Hi Dave,
The simple consumer doesn't do any state management across consumer instances.
So I'm not sure how you are assigning partitions in your application code. Did
you mean to say that you are using the high level consumer API?
Thanks,
Aditya
From: D
Yes, if you set the offset storage to Kafka, the high level consumer will
use Kafka for all offset-related operations.
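For reference, the relevant consumer properties would look roughly like this
(the ZooKeeper address and group id are placeholders):

  import java.util.Properties

  val props = new Properties()
  props.put("zookeeper.connect", "zk1:2181")
  props.put("group.id", "my-group")
  // Store offsets in Kafka rather than ZooKeeper (0.8.2+).
  props.put("offsets.storage", "kafka")
  // During a migration you can commit to both stores.
  props.put("dual.commit.enabled", "false")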
Jiangjie (Becket) Qin
On 4/27/15, 7:03 PM, "Gomathivinayagam Muthuvinayagam"
wrote:
>I am trying to commit offset request in a background thread. I am able to
>commit it so f
You can use the min.insync.replicas topic-level configuration in this case. It
must be used with acks=-1, which is a producer config.
http://kafka.apache.org/documentation.html#topic-config
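For example, on the producer side (a sketch; the broker address is a
placeholder):

  import java.util.Properties
  import org.apache.kafka.clients.producer.KafkaProducer

  val props = new Properties()
  props.put("bootstrap.servers", "broker1:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  // acks=-1 ("all"): the leader waits for all in-sync replicas before acknowledging.
  props.put("acks", "-1")
  val producer = new KafkaProducer[String, String](props)

Combined with min.insync.replicas=2 on the topic, an acknowledged write is on
at least two replicas, so losing a single broker cannot lose it.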
Aditya
From: Gomathivinayagam Muthuvinayagam [sankarm...@gmail.co
Hi Ewen,
Thanks for the response. I agree with you; in some cases we should use the
bootstrap servers.
>
> If you have logs at debug level, are you seeing this message in between the
> connection attempts:
>
> Give up sending metadata request since no node is available
>
Yes, this log came for co
Hi
I have had ZK and Kafka (kafka_2.8.0-0.8.1) set up and running nicely for a
week or so. I decided to stop the ZK and Kafka brokers and restart them, but
since stopping ZK I can't start it again! It gives me a fatal exception that
is related to one of my test topics, "multinode1partition4reptopic"!?
If I use the high level consumer API, and also use Kafka as the offset
storage, does the high level consumer API take care of everything?
Thanks & Regards,
On Tue, Apr 28, 2015 at 7:47 AM, Manoj Khangaonkar
wrote:
> Use the Simple Consumer API if you want to control which offset in a
> partition the client
Hi, I am trying to consume a 24-partition topic across 12 processes. Each
process is using the simple consumer API, and each is being assigned two
consumer threads. I have noticed when starting these processes that sometimes
some of my processes are not being assigned any partitions, and no reba
Use the Simple Consumer API if you want to control which offset in a
partition the client should read messages from.
Use the High Level Consumer if you don't care about offsets.
Offsets are per partition and stored in ZooKeeper.
regards
On Tue, Apr 28, 2015 at 4:38 AM, Gomathivinayagam Muthuv
Hi Zakee,
Thanks for your reply.
>message.send.max.retries
3
>retry.backoff.ms
100
>topic.metadata.refresh.interval.ms
600*1000
These are my properties.
Regards,
Madhukar
On Tue, Apr 28, 2015 at 3:26 AM, Zakee wrote:
> What values do you have for below properties? Or are these set to default
I am trying to set up a cluster where messages are never lost once they are
published. Say I have 3 brokers, and I configure the replication factor to
be 3 as well; if I consider the max failures to be 1, I can achieve the
above requirement. But when I post a message, how do I prevent Kafka from
accept
I have just posted the following question on Stack Overflow. Could you
answer the following questions?
I would like to use the Kafka high level consumer API, and at the same time I
would like to disable auto commit of offsets. I tried to achieve this
through the following steps.
1) auto.commit.enable
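For context, this is roughly the shape of what I'm trying (a sketch; the
ZooKeeper address and group id are placeholders):

  import java.util.Properties
  import kafka.consumer.{Consumer, ConsumerConfig}

  val props = new Properties()
  props.put("zookeeper.connect", "zk1:2181")
  props.put("group.id", "my-group")
  // Disable periodic auto-commit; offsets should advance only when I say so.
  props.put("auto.commit.enable", "false")
  val connector = Consumer.create(new ConsumerConfig(props))

  // ... consume from the streams and process messages ...

  // Commit the consumed offsets explicitly once processing succeeds.
  connector.commitOffsets()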
I may be wrong, but try to use Flume (http://flume.apache.org); I'm just not
sure if it has a Hive source.
On 28 April 2015 at 15:09, Svante Karlsson wrote:
> What's the best way of exporting contents (avro encoded) from hive queries
> to kafka?
>
> Kind of camus, the other way around
>
> best regards
What's the best way of exporting contents (avro encoded) from hive queries
to kafka?
Kind of camus, the other way around
best regards
svante