Yeah, if you want to file a JIRA and post a patch for a new option, it's
possible others would want it. Maybe something like
pre.initialize.topics=x,y,z
pre.initialize.timeout=x
The metadata fetch timeout is a bug... that behavior is inherited from
Object.wait, which defines a timeout of zero to mean infinite.
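A minimal illustration of that Object.wait behavior (plain Java, nothing
Kafka-specific; this program hangs forever by design):

    public class WaitZeroDemo {
        public static void main(String[] args) throws InterruptedException {
            final Object lock = new Object();
            synchronized (lock) {
                // Object.wait(long) defines a timeout of 0 as "wait forever",
                // not "return immediately" -- so a metadata fetch timeout of 0
                // ends up blocking indefinitely instead of failing fast.
                lock.wait(0); // never returns without a notify/interrupt
            }
        }
    }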
Also, if log.cleaner.enable is true in your broker config, that enables the
log-compaction retention strategy.
Then, for topics with the per-topic "cleanup.policy=compact" config
parameter set, Kafka will scan the topic periodically, nuking old versions
of the data with the same key and keeping only the latest value per key.
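For reference, enabling that looks roughly like this (assuming the 0.8.1+
topics tool; the topic name is made up):

    # server.properties (broker side)
    log.cleaner.enable=true

    # per-topic config, e.g. via the topics tool:
    bin/kafka-topics.sh --zookeeper localhost:2181 --alter \
        --topic my-topic --config cleanup.policy=compact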
Hi Jay,
I have implemented a wrapper around the producer to behave like I want it
to. Where it diverges from the current 0.8.2 producer is that it accepts
three new inputs (rough sketch below):
- A list of expected topics
- A timeout value to initialize metadata for those topics during producer
creation
- An option to blow up if
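The core of it is roughly this (a sketch against the 0.8.2
org.apache.kafka.clients.producer API; the class and method names are mine):

    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;

    // Hypothetical wrapper: forces metadata initialization for a known set
    // of topics at construction time instead of on the first send().
    public class PreInitializedProducer<K, V> {
        private final KafkaProducer<K, V> producer;

        public PreInitializedProducer(Properties config, List<String> topics) {
            // metadata.fetch.timeout.ms bounds how long each fetch below blocks
            this.producer = new KafkaProducer<K, V>(config);
            for (String topic : topics) {
                // partitionsFor() blocks until metadata for the topic is
                // available; if the fetch times out it throws, which gives
                // the "blow up at creation" behavior.
                producer.partitionsFor(topic);
            }
        }

        public KafkaProducer<K, V> underlying() {
            return producer;
        }
    }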
Thanks, didn't know that.
On Fri, Dec 19, 2014 at 10:39 AM, Jiangjie Qin
wrote:
>
> Hi Rajiv,
>
> You can send messages without keys. Just provide null for key.
>
> Jiangjie (Becket) Qin
>
>
> On 12/19/14, 10:14 AM, "Rajiv Kurian" wrote:
>
> >Hi all,
> >
> >I was wondering why every Produce
Hi
I would like to get some feedback on design choices with Kafka consumers.
We have an application in which a consumer reads a message, and the thread
does a number of things, including database accesses, before a message is
produced to another topic. The time between consuming and producing the
message
@Joe, Achanta is using Indian English numerals which is why it's a little
confusing. http://en.wikipedia.org/wiki/Indian_English#Numbering_system
1,00,000 [1 lakh] (Indian English) == 100,000 [1 hundred thousand] (The
rest of the world :P)
On Fri Dec 19 2014 at 9:40:29 AM Achanta Vamsi Subhash <
a
Hi folks,
I am new to both Kafka and Storm and I am having a problem getting
KafkaSpout to read data from Kafka in our three-node environment with Kafka
0.8.1.1 and Storm 0.9.3.
What is working:
- I have a Kafka producer (a Java application) to generate random strings
to a topic and I was able to run the f
Hey Paul,
I agree we should document this better.
We allow and encourage using partitions to semantically distribute data. So
unfortunately we can't just arbitrarily assign a partition (say, 0), as that
would actually give incorrect answers for any consumer that made use of the
partitioning. It is
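As a concrete example (a sketch using the new producer API; topic and keys
are made up), records with equal keys always hash to the same partition,
which is exactly what a keyed consumer may depend on:

    // Both records carry the key "user-42", so both are guaranteed to land
    // in the same partition; silently forcing keyless records to partition 0
    // would break this kind of contract for consumers that rely on it.
    ProducerRecord<String, String> a =
        new ProducerRecord<String, String>("events", "user-42", "login");
    ProducerRecord<String, String> b =
        new ProducerRecord<String, String>("events", "user-42", "logout");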
Hi Rajiv,
You can send messages without keys. Just provide null for key.
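Something like this (a minimal sketch; assumes an already-configured 0.8.2
producer and a byte[] payload):

    // A null key is legal: keyless records get spread across partitions
    // instead of being hashed to one.
    producer.send(new ProducerRecord<byte[], byte[]>("opaque-bytes", null, payload));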
Jiangjie (Becket) Qin
On 12/19/14, 10:14 AM, "Rajiv Kurian" wrote:
>Hi all,
>
>I was wondering why every ProducerRecord sent requires a serialized
>key. I am using Kafka to send opaque bytes and I am ending up crea
Hi all,
I was wondering why every ProducerRecord sent requires a serialized
key. I am using Kafka to send opaque bytes and I am ending up creating
garbage keys because I don't really have a good one.
Thanks,
Rajiv
Joe,
- Correction: it's 1,00,000 partitions.
- We can have at most one consumer per partition, not 50 per partition.
Yes, we have a hashing mechanism to support a future partition increase as
well. We override the default partitioner (rough sketch below).
- We use both Simple and HighLevel consumers depending on the cons
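Conceptually the override is something like this (a rough sketch of the 0.8
kafka.producer.Partitioner hook; the class name and bucket constant are made
up, not our exact code):

    import kafka.producer.Partitioner;
    import kafka.utils.VerifiableProperties;

    // Hash keys into a fixed bucket space larger than the current partition
    // count, so adding partitions later only remaps buckets rather than
    // changing the raw key hash.
    public class BucketingPartitioner implements Partitioner {
        private static final int NUM_BUCKETS = 1024; // fixed logical buckets

        public BucketingPartitioner(VerifiableProperties props) {
            // the 0.8 producer instantiates partitioners reflectively with a
            // VerifiableProperties argument, so this constructor is required
        }

        @Override
        public int partition(Object key, int numPartitions) {
            int bucket = Math.abs(key.hashCode() % NUM_BUCKETS);
            return bucket % numPartitions;
        }
    }

It gets wired in via the partitioner.class producer property.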
Hi Jay,
Many thanks for the info. All that makes sense, but from an API
standpoint, when something is labelled async and returns a Future, this
will be misconstrued, and developers will place async sends in critical
client-facing request/response pathways of code that should never block. If
the app
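(The trap being roughly this: in the 0.8.2 producer, send() itself can block
on the initial metadata fetch for a topic before the Future is even
returned.)

    // Looks fully async, but send() may block up to
    // metadata.fetch.timeout.ms waiting for topic metadata.
    Future<RecordMetadata> f = producer.send(record);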
Wait, how do you get 2,000 topics each with 50 partitions == 1,000,000
partitions? (2,000 x 50 = 100,000.) I think you can take what I said below
and change my 250 to 25, as I went with your result (1,000,000) and not
your arguments (2,000 x 50). And you should think of the processing as a
separate step from fetch and com
see some comments inline
On Fri, Dec 19, 2014 at 11:30 AM, Achanta Vamsi Subhash <
achanta.va...@flipkart.com> wrote:
>
> We require:
> - many topics
> - ordering of messages for every topic
>
Ordering is only guaranteed on a per-partition basis, so you might have to
pick a partition key that makes sense for
We require:
- many topics
- ordering of messages for every topic
- Consumers hit different HTTP endpoints, which may be slow (in a push
model). In case of a pull model, consumers may pull at the rate at which
they can process.
- We need parallelism to hit them with as many consumers as possible.
Hence, we currently
Technically/conceptually it is possible to have 200,000 topics, but do you
really need it like that? What do you intend to do with those messages -
i.e. how do you foresee them being processed downstream? And are those
topics really there to segregate different kinds of processing or different
ids
Yes. We need that many partitions at a maximum, as we have a central
messaging service and thousands of topics.
On Friday, December 19, 2014, nitin sharma
wrote:
> hi,
>
> Few things you have to plan for:
> a. Ensure that, from a resilience point of view, you have sufficient
> follower brokers for you
hi,
Few things you have to plan for:
a. Ensure that, from a resilience point of view, you have sufficient
follower brokers for your partitions.
b. In my testing of Kafka (50 TB/week) so far, I haven't seen much issue
with CPU utilization or memory. I had 24 CPUs and 32 GB RAM.
c. 200,000 partitions
We definitely need a retention policy of a week. Hence.
On Fri, Dec 19, 2014 at 7:40 PM, Achanta Vamsi Subhash <
achanta.va...@flipkart.com> wrote:
>
> Hi,
>
> We are using Kafka for our messaging system and we have an estimate for
> 200 TB/week in the coming months. Will it impact any performance
Hi,
We are using Kafka for our messaging system and we have an estimate of 200
TB/week in the coming months. Will it impact Kafka's performance?
PS: We will have more than 2 lakh (200,000) partitions.
--
Regards
Vamsi Subhash