Hey Josh,
NoSQL DBs may actually be easier because they themselves generally don't
have a global order. I.e. I believe Mongo has a per-partition oplog, is
that right? Their partitions would match our partitions.
-Jay
On Wed, Mar 4, 2015 at 5:18 AM, Josh Rader jrader...@gmail.com wrote:
Thanks
Hey Xiao,
1. Nothing prevents applying transactions transactionally on the
destination side, though that is obviously more work. But I think the key
point here is that much of the time the replication is not Oracle->Oracle,
but Oracle->{W, X, Y, Z}, where W/X/Y/Z are totally heterogeneous systems
Hey Jay,
Yeah. I understood the advantage of Kafka is one-to-many. That is why I am
reading the source code of Kafka. You guys built a good product! : )
Our major concern is its message persistence. Zero data loss is a must in our
applications. Below is what I copied from the Kafka document.
Hey all, it seems that 0.8.2 has added a handful more errors to the
protocol which are not yet reflected on the wiki page [1]. Specifically,
[2] seems to indicate that codes 17-20 now have associated meanings.
My questions are:
- Which of these are exposed publicly? (for example, the existing
Hey Xiao,
Yeah I agree that without fsync you will not get durability in the case of
a power outage or other correlated failure, and likewise without
replication you won't get durability in the case of disk failure.
If each batch is fsync'd it will definitely be slower, depending on the
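For reference, per-batch fsync on the broker is driven by the flush settings; a minimal sketch using the 0.8.x property names (values are illustrative, not a recommendation):

```properties
# server.properties -- fsync after every message (durable, but slower)
log.flush.interval.messages=1
# upper bound on time between fsyncs, in ms (illustrative value)
log.flush.interval.ms=1000
```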
Yes you are right on the oplog per partition as well as that mapping well
to the Kafka partitions. I think we are making this harder than it is
based on previous attempts, where we tried to leverage something like Databus
for propagating log changes from MongoDB and Cassandra, since it requires an
SCN.
So I've got 3 kafka brokers that were started with delete.topic.enable set
to true. When they start, I can see in the logs that the property was
successfully set. The dataset in each broker is only approximately 2G (per
du). When running kafka-delete.sh with the correct arguments to delete all
of
Hi Jeff,
Are you seeing any errors in state-change.log or controller.log
after issuing the kafka-topics.sh --delete command?
Another known issue is that if you have auto.create.topics.enable =
true (this is true by default), your consumer or producer can re-create
the topic. So try
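For reference, the broker settings involved would look roughly like this (a sketch, using the 0.8.x property names):

```properties
# server.properties
delete.topic.enable=true          # allow topic deletion at all
auto.create.topics.enable=false   # stop producers/consumers from re-creating it
```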
Thanks Joe, keeping documentation in sync with KIPs does seem like a
reasonable process going forward. And I apologize for the confrontational
tone I used to end my original email, that was not called for.
In the meantime, where can I find the answers to my two actual questions?
I think I've
Hi Jeff,
The controller should have a Topic deletion thread running
coordinating the delete in the cluster, and the progress should be
logged to the controller log.
Can you look at the controller log to see what's going on?
Tim
On Wed, Mar 4, 2015 at 10:28 AM, Jeff Schroeder
Hey Evan, moving forward (so 0.8.3.0 and beyond) the release documentation
is going to match up more with specific KIP changes
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
which elaborate on things like breaking changes and major modifications
you should adopt
Hello,
I'm using the high level consumer with auto-commit disabled and a
single thread per consumer, in order to consume messages in batches.
In case of failures on the database, I'd like to stop processing,
rollback and restart from the last committed offset.
Is there a way to receive the
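A minimal sketch of the consumer configuration for this pattern, assuming the 0.8.x high-level consumer (the group and ZooKeeper values are placeholders):

```properties
# consumer.properties -- batch consumption with manual commits
auto.commit.enable=false        # commit only after the DB transaction succeeds
group.id=batch-loader           # placeholder group name
zookeeper.connect=localhost:2181
```

After a successful database commit you would call ConsumerConnector.commitOffsets(); on a failure, shutting down and reconnecting restarts consumption from the last committed offset.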
Hi Gwen,
The root cause of all io related problems seems to be file rename that
Camus does and underlying Hadoop MapR FS.
We are copying files from a user volume to a day volume (rename does a copy)
when mapper commits file to FS. Please refer to
As far as I know, you can't do that with an online
stream. You will have to reset the offsets to a particular offset in the
past to start consuming from there. Another way would be to start a separate
consumer with a different groupId.
In any case you cannot consume from a past offset
Looking around the npm repo, it looks like there is no current support for
0.8.2.
Is the only alternative to use REST/Proxy?
Thanks
Julio Castillo
NOTICE: This e-mail and any attachments to it may be privileged, confidential
or contain trade secret information and is intended only for the
I think the camus mailing list would be more suitable for this
question.
Thanks,
Joel
On Wed, Mar 04, 2015 at 11:00:51AM -0500, max square wrote:
Hi all,
I have browsed through different conversations around Camus, and bring this
as a kind-of Kafka question. I know it's not the most orthodox,
I think what you may be looking for is being discussed here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-6+-+New+reassignment+partition+logic+for+rebalancing
On Wed, Mar 04, 2015 at 12:34:30PM +0530, sunil kalva wrote:
Is there any way to automate
On Mar 3, 2015 11:57 AM, sunil kalva
This is not possible with the current high-level consumer without a
restart, but the new consumer (under development) does have support
for this.
On Wed, Mar 04, 2015 at 03:04:57PM -0500, Luiz Geovani Vier wrote:
Hello,
I'm using the high level consumer with auto-commit disabled and a
single
Also see the related tool
http://confluent.io/downloads/
Confluent is bringing the glue together for Kafka, Avro, and Camus,
though there is no clarity around support (e.g. updates of Kafka) for it
at this moment.
On Thu, Mar 5, 2015 at 8:57 AM, Joel Koshy jjkosh...@gmail.com wrote:
I think
Thanks for that info Jun.
On Tue, Mar 3, 2015 at 3:56 PM, Jun Rao j...@confluent.io wrote:
Camus only fetches from different partitions in parallel.
Thanks,
Jun
On Fri, Feb 27, 2015 at 4:24 PM, Yang tedd...@gmail.com wrote:
we have a single partition, and the topic contains 300k
Thanks, Mayuresh and Joel. Reconnecting works just fine, although it's
much more complex than just calling rollback(), so I'm looking forward
to the new version :)
-Geovani
On Wed, Mar 4, 2015 at 4:57 PM, Joel Koshy jjkosh...@gmail.com wrote:
This is not possible with the current high-level
Cool, so this is a non-issue then. To make things better we can expose
the availablePartitions() API through the Kafka producer. What do you think?
Thanks,
Mayuresh
On Tue, Mar 3, 2015 at 4:56 PM, Guozhang Wang wangg...@gmail.com wrote:
Hey Jun,
You are right. Previously I thought only in
I think libjars is not required. The Maven package command for the camus
project builds the uber jar (fat jar), which contains all the dependencies
in it. I generally run camus the following way.
hadoop jar camus-example-0.1.0-SNAPSHOT-shaded.jar
com.linkedin.camus.etl.kafka.CamusJob -P
Thanks James. This is really helpful. Another extreme edge case might be
that the single producer is sending the database log changes and the
network causes them to reach Kafka out of order. How do you prevent
something like this? I guess by relying on the SCN on the consumer side?
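One way to sketch the consumer-side SCN check discussed here (all names are illustrative, not from any Kafka API): track the last applied SCN, buffer changes that arrive early, and drop duplicates. This assumes the publisher assigns a dense, gapless sequence; real Oracle SCNs would need a different completeness check.

```python
def apply_in_order(incoming, last_scn=0):
    """Apply change records strictly in SCN order: buffer anything that
    arrives early and drop duplicates of already-applied changes."""
    pending = {}   # SCN -> payload, for changes that arrived out of order
    applied = []
    for scn, payload in incoming:
        if scn <= last_scn:
            continue  # duplicate of an already-applied change: skip it
        pending[scn] = payload
        # drain every contiguous change we are now able to apply
        while last_scn + 1 in pending:
            last_scn += 1
            applied.append(pending.pop(last_scn))
    return applied, last_scn

# reordered and duplicated input still yields the changes in SCN order
changes = [(2, "update-b"), (1, "update-a"), (3, "update-c"), (2, "update-b")]
```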
On Wed, Mar
Another thing to think about is delivery guarantees. Exactly once, at least
once, etc.
If you have a publisher that consumes from the database log and pushes out to
Kafka, and then the publisher crashes, what happens when it starts back up?
Depending on how you keep track of the database's
What branch of camus are you using? We have our own fork in which we updated
the camus dependency from the Avro snapshot of the REST Schema Repository to the
new official one you mention in github.com/schema-repo. I was not aware of a
branch on the main LinkedIn camus repo that has this.
That
Hello hello,
Results of the poll are here!
Any guesses before looking?
What % of Kafka users are on 0.8.2.x already?
What % of people are still on 0.7.x?
http://blog.sematext.com/2015/03/04/poll-results-kafka-version-distribution/
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized
Hi,
You can see the number of voters in the poll itself (view poll results link
in the poll widget).
Audience details unknown, but the poll was posted on:
* twitter - https://twitter.com/sematext/status/57050147435776
* LinkedIn - a few groups - Kafka, DevOps, and I think another larger one
*
Hello,
We use Docker for Kafka on VMs with both NAS and local disk. We mount the
volumes externally. We haven't had many problems at all, and a restart has
cleared any issue. We are on 0.8.1.
We have also started to deploy to AWS.
--
Colin
+1 612 859 6129
Skype colin.p.clark
On Mar 4,
Do you have anything on the number of voters, or an audience breakdown?
Christian
On Wed, Mar 4, 2015 at 8:08 PM, Otis Gospodnetic otis.gospodne...@gmail.com
wrote:
Hello hello,
Results of the poll are here!
Any guesses before looking?
What % of Kafka users are on 0.8.2.x already?
What %
+1. Verified quick start, unit tests.
On Tue, Mar 3, 2015 at 12:09 PM, Joe Stein joe.st...@stealth.ly wrote:
Ok, let's fix the transient test failure on trunk; agreed it's not a blocker.
+1 quick start passed, verified artifacts, updates in scala
Hi,
On Sat, Feb 28, 2015 at 9:16 AM, Gene Robichaux gene.robich...@match.com
wrote:
What is the best way to detect consumer lag?
We are running each consumer as a separate group and I am running the
ConsumerOffsetChecker to assess the partitions and the lag for each
group/consumer. I run
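For reference, the invocation would look something like this (group name is a placeholder; this is the 0.8.x tool and needs a running cluster and ZooKeeper):

```shell
bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
  --zkconnect localhost:2181 --group my-group
```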
Thanks a lot Jeff for redirecting me to the right place. :-)
Is there a tentative date for when we can get the official release with this
patch?
On 4 March 2015 at 19:42, Jeff Holoman jholo...@cloudera.com wrote:
Take a look here:
https://issues.apache.org/jira/browse/KAFKA-1865
On
Hi team,
Is there a built-in metric that can measure the end to end latency in MM?
--
Regards,
Tao
Thunder,
thanks for your reply. The hadoop job is now correctly configured (the
client was not getting the correct jars), however I am getting Avro
formatting exceptions due to the format the schema-repo server follows. I
think I will do something similar and create our own branch that uses the
Seeing around 5k msgs/s. The messages are small (average 42 bytes after
snappy compression)
On Wed, Mar 4, 2015 at 11:34 PM, Vineet Mishra clearmido...@gmail.com
wrote:
Hi Roger,
I have already enabled snappy; the throughput I mentioned is with
compression already on.
Could you mention
Hi,
On Fri, Feb 27, 2015 at 1:36 AM, James Cheng jch...@tivo.com wrote:
Hi,
I know that Netflix might be talking about Kafka on AWS at the March
meetup, but I wanted to bring up the topic anyway.
I'm sure that some people are running Kafka in AWS.
I'd say most, not some :)
Is anyone
Hi Roger,
I have already enabled snappy; the throughput I mentioned is with
compression already on.
Could you mention what throughput you are reaching?
Thanks!
On Thu, Mar 5, 2015 at 12:56 PM, Roger Hoover roger.hoo...@gmail.com
wrote:
Hi Vineet,
Try enabling compression. That
Thanks for running the poll and sharing the results!
On Wed, Mar 4, 2015 at 8:34 PM, Otis Gospodnetic otis.gospodne...@gmail.com
wrote:
Hi,
You can see the number of voters in the poll itself (view poll results link
in the poll widget).
Audience details unknown, but the poll was posted on:
Thanks Jagat for the callout!
Confluent Platform 1.0 http://confluent.io/product/ includes Camus and we
are happy to address any questions on our community mailing list
confluent-platf...@googlegroups.com.
On Wed, Mar 4, 2015 at 8:41 PM, max square max2subscr...@gmail.com wrote:
Thunder,
thank you
Hi Vineet,
Try enabling compression. That improves throughput 3-4x usually for me.
Also, you can use async mode if you're willing to trade some chance of
dropping messages for more throughput.
kafka {
codec = 'json'
broker_list = localhost:9092
topic_id = blah
On Mar 3, 2015, at 4:18 PM, Guozhang Wang wangg...@gmail.com wrote:
Additionally to Jay's recommendation, you also need to take special
care in error handling of the producer in order to preserve ordering, since
the producer uses batching and async sending. That is, if you already sent
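A sketch of the producer settings that help here (I believe these are the 0.8.2 new-producer property names; values are illustrative):

```properties
# limit in-flight requests so a retried batch cannot leapfrog earlier ones
max.in.flight.requests.per.connection=1
# with only one request in flight, retries no longer risk reordering
retries=3
```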
Hi Group,
I have started using Kafka 0.8.2 with the new producer API.
Just wanted to know if we can have explicit control over flushing the
message batch to the Kafka cluster.
Configuring batch.size will flush the messages when the batch.size is
reached for a partition.
But is there
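For reference, the batching knobs in question (a sketch; 0.8.2 new-producer names, illustrative values). There was no public flush()-style call in 0.8.2; that was tracked separately as KAFKA-1865.

```properties
# batch.size caps the bytes buffered per partition before a send
batch.size=16384
# linger.ms waits this long to fill a batch; 0 means send as soon as possible
linger.ms=5
```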
Thanks for responding.
I was creating an instance of kafka.server.KafkaServer in my code for running
some tests and this was what I referred to by an embedded broker.
The scenario you described was what was happening. In my case when I kill my
broker, it fails to send an ack. I added
Hi,
When I start a new consumer, it throws a Rebalance exception.
However I hit it only on some machines where the runtime libraries are
different.
The stack given below is what I encounter - is this a known issue?
I saw this Jira but it's not resolved so thought to confirm -
When we ran into this problem we ended up going into zookeeper and changing
the leader to point to one of the replicas, then did a force leader election.
This got the partition back online.
Original Message
From: Virendra Pratap Singh
Sent: Wednesday, March 4, 2015 2:00 AM
To: Gwen
Hi,
Using kafka-web-console:
when I run the command play start, it works fine.
I tried to register the ZooKeeper, but am getting the error below.
*java.nio.channels.ClosedChannelException*
at
Thanks everyone for your responses! These are great. It seems our case
matches closest to Jay's recommendations.
The one part that sounds a little tricky is point #5 'Include in each
message the database's transaction id, scn, or other identifier '. This is
pretty straightforward with the
Thanks guys. With unclean.leader.election.enable set to false, the issue is
fixed.
On Tue, Mar 3, 2015 at 2:50 PM, Gwen Shapira gshap...@cloudera.com wrote:
of course :)
unclean.leader.election.enable
On Mon, Mar 2, 2015 at 9:10 PM, tao xiao xiaotao...@gmail.com wrote:
How do I achieve point
Take a look here:
https://issues.apache.org/jira/browse/KAFKA-1865
On Wed, Mar 4, 2015 at 4:28 AM, Ponmani Rayar ymmu...@gmail.com wrote:
Hi Group,
I have started using Kafka 0.8.2 with the new producer API.
Just wanted to know if we can have explicit control over flushing
Hi, Josh,
That depends on how you implemented it.
Basically, Kafka can provide a good throughput only when you have multiple
partitions.
- If you have multiple consumers and multiple partitions, each consumer gets a
dedicated partition. That means you need a coordinator to ensure all the