Storing large blobs in S3 or HDFS and placing URIs in Kafka is the most
common solution I've seen in use.
On Tue, Oct 6, 2015 at 8:32 AM, Joel Koshy wrote:
> The best practice I think is to just put large objects in a blob store
> and have messages embed references to those
The best practice I think is to just put large objects in a blob store
and have messages embed references to those blobs. Interestingly we
ended up having to implement large-message-support at LinkedIn but for
various reasons were forced to put messages inline (i.e., against the
above
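A minimal sketch of that reference pattern, using the new Java producer
(the topic, bucket, and URI below are made up, and the blob is assumed to
have been uploaded by other code):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class BlobReferenceProducer {
        public static void main(String[] args) {
            // the large payload was already written to the blob store
            // elsewhere; this URI is a placeholder
            String blobUri = "s3://my-bucket/events/2015-10-06/event-12345.bin";

            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            // the Kafka message stays tiny: just the pointer to the blob
            producer.send(new ProducerRecord<>("large-events", "event-12345", blobUri));
            producer.close();
        }
    }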
hi!
We have a use case where we want to store ~100m keys in kafka. Is there any
problem with this approach?
I have heard from some people using kafka that it has a problem doing log
compaction with that many keys.
Another topic might have around 10 different K/V pairs for
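For what it's worth, a sketch of creating a compacted topic from Java,
assuming the 0.8.x AdminUtils signature (topic name, partition count, and
zookeeper address below are made up):

    import java.util.Properties;
    import kafka.admin.AdminUtils;
    import kafka.utils.ZKStringSerializer$;
    import org.I0Itec.zkclient.ZkClient;

    public class CreateCompactedTopic {
        public static void main(String[] args) {
            ZkClient zkClient = new ZkClient("localhost:2181", 10000, 10000,
                ZKStringSerializer$.MODULE$);
            Properties config = new Properties();
            config.put("cleanup.policy", "compact");  // keep only the latest value per key
            config.put("segment.ms", "86400000");     // roll segments so the cleaner can run
            // 50 partitions, replication factor 3 -- spread the keys out so
            // no single log has to compact all ~100m of them
            AdminUtils.createTopic(zkClient, "user-profiles", 50, 3, config);
            zkClient.close();
        }
    }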
At Lithium, we have multiple datacenters and we distcp our data across our
Hadoop clusters. We have 2 DCs in NA and 1 in EU. We have a non-redundant
direct connect from our EU cluster to one of our NA DCs. If and when this
fails, we have automatic failover to a VPN that goes over the internet. The
Thanks for the replies!
I was rather hoping not to have to implement a side channel solution. :/
If we have to do this, we may use an HBase table with a TTL matching our
topic's retention so the large objects are "gc'ed"... thoughts?
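If it helps, a sketch of the HBase side with the old HBaseAdmin API (table
and family names are made up; the TTL should match the topic's retention):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CreateBlobTable {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            HTableDescriptor table =
                new HTableDescriptor(TableName.valueOf("kafka_blobs"));
            HColumnDescriptor payload = new HColumnDescriptor("payload");
            // expire cells after 7 days, the same as the topic's retention,
            // so blobs are "gc'ed" along with the messages pointing at them
            payload.setTimeToLive(7 * 24 * 60 * 60);
            table.addFamily(payload);
            admin.createTable(table);
            admin.close();
        }
    }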
On Tue, Oct 6, 2015 at 8:45 AM, Gwen Shapira
Hi!
Is there a way to track current partition ownership when using the
high-level consumer? It looks like the rebalance callback only tells me the
partitions I'm (potentially) losing.
-Joey
On Sat, Oct 3, 2015 at 4:36 PM, Jun Rao wrote:
>
> We will update the download link in our website shortly.
>
The download page has been updated:
http://kafka.apache.org/downloads.html
Ismael
Zookeeper will have this information under /consumers//owners
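For example, with a plain ZooKeeper client (the group and topic names below
are placeholders for the elided path segments):

    import org.apache.zookeeper.ZooKeeper;

    public class ListOwners {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, null);
            String path = "/consumers/my-group/owners/my-topic";
            for (String partition : zk.getChildren(path, false)) {
                // the znode's data is the id of the owning consumer
                byte[] owner = zk.getData(path + "/" + partition, false, null);
                System.out.println("partition " + partition + " -> "
                    + new String(owner, "UTF-8"));
            }
            zk.close();
        }
    }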
On Tue, Oct 6, 2015 at 12:22 PM, Joey Echeverria wrote:
> Hi!
>
> Is there a way to track current partition ownership when using the
> high-level consumer? It looks like the rebalance callback only tells me the
>
But nothing in the API?
-Joey
On Tue, Oct 6, 2015 at 3:43 PM, Gwen Shapira wrote:
> Zookeeper will have this information under /consumers//owners
>
>
>
> On Tue, Oct 6, 2015 at 12:22 PM, Joey Echeverria wrote:
>
> > Hi!
> >
> > Is there a way to track
I don't think so. AFAIK, even the new API won't send this information to
every consumer, because in some cases it can be huge.
On Tue, Oct 6, 2015 at 1:44 PM, Joey Echeverria wrote:
> But nothing in the API?
>
> -Joey
>
> On Tue, Oct 6, 2015 at 3:43 PM, Gwen Shapira
Hello,
How do you consume a kafka topic from a remote location without a dedicated
connection? How do you protect the server?
The setup: data streams into our datacenter. We process it, and publish it
to a kafka cluster. The consumer is located in a different datacenter with
no direct
Jason,
What is the config values for your producer, especially "acks"? And what is
the replication scheme you were using on the broker side?
Guozhang
On Tue, Oct 6, 2015 at 6:25 AM, Jason Kania wrote:
> Hello,
> I am using 0.8.2.1 and getting a scenario where my send
Hello!
Can somebody explain to me how to use multiple consumers with different commit
storage...
For example, java-based consumers use kafka commit storage...
python-based consumers use zookeeper commit storage
My question is:
Is it true that when one consumer commits to kafka, the server also commit
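For reference, the relevant old-consumer settings are sketched below;
"dual.commit.enabled" exists precisely for mixed groups like this, mirroring
each commit to both stores during a migration:

    # consumer.properties (a sketch; values are examples)
    offsets.storage=kafka        # commits go to the __consumer_offsets topic
    dual.commit.enabled=true     # also mirror each commit to ZooKeeper so
                                 # zookeeper-based consumers in the group keep up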
Thanks Grant for quick reply!
I've used the AdminUtils.topicExists("__consumer_offsets") check, and even 10
seconds after Kafka broker startup the check fails.
When, on which event, does this internal topic get created? Is there some
broker config property preventing it from being created? Does one have
Hello,
At my organization we are already using kafka in a few areas, but we're
looking to expand our use and we're struggling with how best to distribute
our events on to topics.
We have on the order of 30 different kinds of events that we'd like to
distribute via kafka. We have one or two
I usually approach these questions by looking at the possible consumers.
You usually want each consumer to read from relatively few topics, use most
of the messages it receives and have fairly cohesive logic for using these
messages.
Signs that things went wrong with too few topics:
* Consumers that
You can configure "advertised.host.name" for each broker, which is the name
external consumers and producers will use to refer to the brokers.
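For example, in each broker's server.properties (hostname is an example):

    # server.properties
    advertised.host.name=broker1.example.com  # name remote clients connect to
    advertised.port=9092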
On Tue, Oct 6, 2015 at 3:31 PM, Tom Brown wrote:
> Hello,
>
> How do you consume a kafka topic from a remote location without a
Debugged, and found in KafkaApis.handleConsumerMetadataRequest that the
consumer offsets topic gets created on the first lookup of the offsets topic's
metadata, even when auto topic creation is disabled.
In that method there is the following call:
// get metadata (and create the topic if necessary)
val
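So one workaround (a sketch, assuming a throwaway consumer group is
acceptable) is to force that first lookup yourself by committing from a
kafka-offsets-storage consumer, rather than polling AdminUtils.topicExists:

    import java.util.Properties;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class ForceOffsetsTopicCreation {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181");
            props.put("group.id", "offsets-topic-bootstrap");  // hypothetical throwaway group
            props.put("offsets.storage", "kafka");
            ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
            // committing makes the consumer look up the coordinator, which
            // (per the finding above) creates __consumer_offsets as a side effect
            connector.commitOffsets();
            connector.shutdown();
        }
    }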
I really only want them for the partitions I own. The client must already know
this in order to acquire the zookeeper locks, so it could potentially execute
a callback telling me which partitions I own after a rebalance.
-Joey
On Tue, Oct 6, 2015 at 4:08 PM, Gwen Shapira wrote:
> I
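For what it's worth, the new consumer API that was in the works does expose
exactly this: a rebalance listener that hands each consumer its own
assignment, not the whole group's map (a sketch; topic and group names are
made up):

    import java.util.Collection;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class OwnershipTracker {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "my-group");
            props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Collections.singletonList("my-topic"),
                new ConsumerRebalanceListener() {
                    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                        // only this consumer's partitions after a rebalance
                        System.out.println("now owning: " + partitions);
                    }
                    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                        System.out.println("losing: " + partitions);
                    }
                });
            while (true) {
                consumer.poll(100);  // rebalance callbacks fire inside poll()
            }
        }
    }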
Hello,
I am using 0.8.2.1 and getting a scenario where my send works fine but the
subsequent call to Future.get() does not return - it hangs for at least 5
minutes. When I kill the client running the producer, I get a "Connection
reset by peer" message in server.log. I am not sure what to check to
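In the meantime, a sketch of bounding the wait instead of blocking
indefinitely (broker address and topic are placeholders; "timeout.ms" was the
0.8.2 producer setting that caps how long the broker may take to ack):

    import java.util.Properties;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class BoundedSend {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("acks", "all");          // wait for the full ISR (see Guozhang's question)
            props.put("timeout.ms", "30000");  // broker-side cap on the ack wait

            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            Future<RecordMetadata> future =
                producer.send(new ProducerRecord<>("my-topic", "key", "value"));
            try {
                // bound the wait instead of blocking indefinitely
                RecordMetadata metadata = future.get(30, TimeUnit.SECONDS);
                System.out.println("acked at offset " + metadata.offset());
            } catch (TimeoutException e) {
                System.err.println("no ack within 30s; check broker logs and ISR state");
            } finally {
                producer.close();
            }
        }
    }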