I missed your reply. I'm using 1.5.0 and will be upgrading to 1.5.1 soon.
On Tue, Jul 10, 2018, 10:15 PM Jeff Zhang wrote:
> Which Flink version do you use?
>
>
> Garrett Barton wrote on Wed, Jul 11, 2018 at 1:09 AM:
>
Hey all,
I am running Flink in batch mode on YARN with independent jobs creating
their own clusters.
I have a flow defined that scales parallelism based on input size (to keep
overall processing time somewhat constant). Right now the flow initializes
with around 22k tasks for a flow that sets
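A minimal sketch of that kind of sizing logic, assuming the input size is
measured up front; the bytes-per-slot target and argument layout are made up
for illustration, not the actual flow:

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class ScaledBatchJob {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Assumed inputs: total input size in bytes and a target of ~512 MB
        // of input per parallel slot, so bigger inputs get more parallelism.
        long inputBytes = Long.parseLong(args[0]);
        long bytesPerSlot = 512L * 1024 * 1024;
        int parallelism = (int) Math.max(1, inputBytes / bytesPerSlot);
        env.setParallelism(parallelism);

        DataSet<String> input = env.readTextFile(args[1]);
        input.map(String::toLowerCase)     // stand-in for the real transformations
             .writeAsText(args[2]);

        env.execute("scaled-batch-job");
    }
}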
Hey all,
My jobs that I am trying to write in Flink 1.5 are failing after a few
minutes. I reckon it's because the idle task managers are shutting down, but
it seems to kill the client and the running job, which was still going on
one of the other task managers. Either way I get:
When building these kinds of tests I always just orchestrated my producers
and consumers to spit metrics out somewhere easy to collect. I've never
looked for a UI/tool to do it before.
Assuming good NTP configs (sub-ms accuracy), I would typically put timing
data into the key portion of the messages
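For example, a rough producer-side sketch (topic name and key format are made
up; the consumer subtracts the key's timestamp from its own clock on receipt
to get end-to-end latency, which is only valid with tight NTP sync):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TimedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        // Embed the send-side wall-clock time in the key so the receiving
        // side can compute per-message latency.
        String key = "sensor-42|" + System.currentTimeMillis();
        producer.send(new ProducerRecord<>("perf-test", key, "payload"));
        producer.close();
    }
}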
ol) with what offset
> ranges.
> From the code path the broker should at least maintain one empty segment
> even if all data gets truncated (at least in trunk), but I'm not sure if
> you are running an older version that may have a bug in the broker logs.
>
> Guozhang
>
position to 0, did you run
> some tools periodically to reset the offset to 0?
>
>
> Guozhang
>
> On Wed, Aug 9, 2017 at 7:16 AM, Garrett Barton <garrett.bar...@gmail.com>
> wrote:
I have a small test setup with a local zk/kafka server and a streams app
that loads sample data. The test setup is usually up for a day or two
before a new build goes out and it's blown away and loaded from scratch.
Lately I've seen that after a few hours the stream app will stop processing
and
> This one
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/KAFKA-5510
>
> Best,
> Dmitry
>
> Mon, Aug 7, 2017 at 14:22, Garrett Barton <garrett.bar...@gmail.com>:
>
Dmitry, which KIP are you referring to? I see this behavior too sometimes.
On Fri, Aug 4, 2017 at 10:25 AM, Dmitry Minkovsky
wrote:
> Thank you Matthias and Bill,
>
> Just want to confirm that my offsets *were* being committed, but I was
> being affected by
consumer groups and doesn't commit
> offsets. The offsets are checkpointed to local disk, so they won't show up
> with the ConsumerGroupCommand.
>
> That said it would be useful to see the lag, so maybe raise a JIRA for it?
>
> Thanks,
> Damian
>
> On Tue, 1 Aug 2017 at 15:0
Could you take in both topics via the same stream? Meaning don't do a Kafka
Streams join, literally just read both streams. If KStream can't do this
(dunno, haven't tried), then a simple upstream merge job could throw them
into one topic with the same partitioning scheme.
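Something like this fragment, if you go the single-stream route (topic names
are placeholders; this uses the newer StreamsBuilder API and assumes both
topics share key/value types and partitioning):

import java.util.Arrays;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

StreamsBuilder builder = new StreamsBuilder();
// One logical stream over both topics -- no join, just interleaved reads.
KStream<String, String> merged = builder.stream(Arrays.asList("products", "orders"));
merged.foreach((k, v) -> System.out.println(k + " -> " + v));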
I'd assume you would have the products
Getting good concurrency in a webapp is more than doable. Check out these
benchmarks:
https://www.techempower.com/benchmarks/#section=data-r14=ph=db
I linked to the single query one because that's closest to a single
operation like you will be doing.
I'd also note if the data delivery does not
Is your time usage correct? It sounds like you want event time, not
load/process time, which is the default unless you have a TimestampExtractor
defined somewhere upstream. Otherwise I could see far fewer events coming
out, as streams is just aggregating whatever showed up in that 10 second
window.
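For reference, a minimal extractor along those lines (the payload layout
"<epochMillis>|rest" is an assumption for illustration):

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.processor.TimestampExtractor;

public class EventTimeExtractor implements TimestampExtractor {
    @Override
    public long extract(ConsumerRecord<Object, Object> record, long previousTimestamp) {
        // Pull event time out of the payload instead of using ingest time.
        String value = (String) record.value();
        try {
            return Long.parseLong(value.split("\\|", 2)[0]);
        } catch (NumberFormatException e) {
            return previousTimestamp; // fall back on malformed payloads
        }
    }
}

It gets wired in through the streams config, e.g. the
default.timestamp.extractor property in newer releases.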
Mahendra,
Did increasing those two properties do the trick? I am running into this
exact issue testing streams out on a single Kafka instance. Yet I can
manually start a consumer and read the topics fine while it's busy doing
this dead stuff.
On Tue, May 23, 2017 at 12:30 AM, Mahendra Kariya
>>> data 1-to-1, but additionally registers a punctuation schedule. When
>>> punctuation is called, it would be required to send tombstone messages
>>> downstream (or similar) that delete windows that are older than the
>>> retention time. Sounds tricky to implement though... `tra
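A rough sketch of that shape against the newer Streams API (the state-store
scan for expired windows is elided, and the tombstone key is a stand-in):

import java.time.Duration;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.PunctuationType;

public class TombstoneTransformer implements Transformer<String, String, KeyValue<String, String>> {
    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
        // Once a minute on stream time, emit tombstones (null values) for
        // windows older than the retention time. Real code would scan a
        // state store for expired window keys; that bookkeeping is omitted.
        context.schedule(Duration.ofMinutes(1), PunctuationType.STREAM_TIME,
            timestamp -> this.context.forward("expired-window-key", null));
    }

    @Override
    public KeyValue<String, String> transform(String key, String value) {
        return KeyValue.pair(key, value); // forward data 1-to-1
    }

    @Override
    public void close() {}
}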
as you said, the downstream is aware of the
> >> statefulness of the upstream and correctly treats each record as
> >> an update)
> >> * If you want to reduce message volume further, you can break these
> >> into separate KafkaStreams instances
ay.
>
> Does this answer all your questions?
>
> (We don't document those details on purpose, because it's an internal
> design and we want the flexibility to change this if required -- thus,
> you should also not rely on "stream time" advance assumptions in your
> code.)
>
>
emit <k,(cnt,sum)>
> records (i.e., a custom data type for the value) and in mapValues() you
> compute <k,avg>.
>
> Hope this helps.
>
> -Matthias
>
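Roughly, the pattern being described looks like this (CountSum is a custom
value type; serde setup is elided for brevity):

import org.apache.kafka.streams.kstream.KGroupedStream;
import org.apache.kafka.streams.kstream.KTable;

class CountSum {
    long count;
    double sum;
}

static KTable<String, Double> average(KGroupedStream<String, Double> grouped) {
    return grouped
        .aggregate(
            CountSum::new,                      // start from (0, 0.0)
            (key, value, agg) -> {              // fold each record into (cnt, sum)
                agg.count += 1;
                agg.sum += value;
                return agg;
            })
        .mapValues(agg -> agg.sum / agg.count); // <k,(cnt,sum)> -> <k,avg>
}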
> On 5/4/17 7:36 PM, Garrett Barton wrote:
I think I have an understanding of how Kafka Streams is handling time
behind the scenes and would like someone to verify it for me. The actual
reason is I am running into behavior where I can only join two streams for
a little while, then it stops working.
Assuming a topology like this:
FEED ->
That depends on whether you're using event, processing, or ingestion time.
My understanding is that if you play a record through that is T-6, the only
way that
TimeWindows.of(TimeUnit.MINUTES.toMillis(1)).until(TimeUnit.MINUTES.toMillis(5))
would actually process that record in your window is if your
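Concretely, with the (older, pre-2.1) windowing API quoted above, and
assuming `grouped` is an already-grouped stream:

import java.util.concurrent.TimeUnit;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

// 1-minute windows, state retained for 5 minutes: a record whose event time
// is 6 minutes behind stream time lands in an already-expired window and is
// silently dropped.
KTable<Windowed<String>, Long> counts = grouped.count(
    TimeWindows.of(TimeUnit.MINUTES.toMillis(1))
               .until(TimeUnit.MINUTES.toMillis(5)));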
're talking about RocksDB
> >> On Tue, May 2, 2017 at 10:08 AM Damian Guy <damian@gmail.com>
> wrote:
> >>
> >>> Hi Garret,
> >>>
> >>> No, log.retention.hours doesn't impact compacted topics.
> >>>
> >>> Thanks,
>
Thanks Damian,
Does setting log.retention.hours have anything to do with compacted
topics? Meaning would a topic not compact now for 90 days? I am thinking
of all the internal topics that streams creates in the flow. Having to
recover through 90 days of logs would take a good while, I'd imagine.
Let's say I want to sum values over increasing window sizes of 1, 5, 15, and 60
minutes. Right now I have them running in parallel, meaning if I am
producing 1k/sec records I am consuming 4k/sec to feed each calculation.
In reality I am calculating far more than sum, and in this pattern I'm
looking at
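One shape for collapsing that into a single consumer, sketched against the
newer API (topic names, types, and the groupByKey step are all illustrative;
serdes elided):

import java.util.concurrent.TimeUnit;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KGroupedStream;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.TimeWindows;

StreamsBuilder builder = new StreamsBuilder();
KStream<String, Long> source = builder.stream("values");
KGroupedStream<String, Long> grouped = source.groupByKey();

// Read the topic once and fan out to all four window sizes in one topology,
// instead of running four apps that each re-consume the same records.
for (long minutes : new long[]{1, 5, 15, 60}) {
    grouped
        .windowedBy(TimeWindows.of(TimeUnit.MINUTES.toMillis(minutes)))
        .reduce(Long::sum)
        .toStream()
        .to("sums-" + minutes + "m");
}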
Greetings all,
I have a use case where I want to calculate some metrics against sensor
data using event time semantics (record time is event time) that I already
have. I have years of it, but for this POC I'd like to just load the last
few months so that we can start deriving trend lines now vs
Mahendra,
One possible thing I have seen that exhibits the same behavior of missing
windows of data is the configuration of the topics (internal and your own)
retention policies. I was loading data that was fairly old (weeks) and
using event time semantics as the record timestamp (custom
I am trying to write a JUnit test for working with an embedded Kafka 0.8
server, where I send and receive messages just to verify that my embedded
server, producer, and consumer wrappers work. Order of operations in the
JUnit test looks something like:
-Start zk. [own thread] (wait for init)
-Start Kafka
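In rough outline, something like the following (Curator's TestingServer for
the embedded ZK and the 0.8-era KafkaServerStartable bootstrap; ports and
paths are placeholders):

import java.util.Properties;
import kafka.server.KafkaConfig;
import kafka.server.KafkaServerStartable;
import org.apache.curator.test.TestingServer;
import org.junit.Test;

public class EmbeddedKafkaTest {

    @Test
    public void roundTrip() throws Exception {
        TestingServer zk = new TestingServer(2181);          // embedded ZK, blocks until up

        Properties brokerProps = new Properties();
        brokerProps.put("broker.id", "0");
        brokerProps.put("zookeeper.connect", zk.getConnectString());
        brokerProps.put("log.dirs", "/tmp/kafka-test-logs"); // placeholder path
        KafkaServerStartable kafka = new KafkaServerStartable(new KafkaConfig(brokerProps));
        kafka.startup();

        try {
            // ... produce a message, consume it back, assert equality ...
        } finally {
            kafka.shutdown();
            zk.close();
        }
    }
}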