How to block tests of Kafka Streams until messages processed?

2016-10-19 Thread Ali Akhtar
I'm using Kafka Streams, and I'm attempting to write integration tests for
a stream processor.

The processor listens to a topic, processes incoming messages, and writes
some data to Cassandra tables.

I'm attempting to write a test which produces some test data, and then
checks whether or not the expected data was written to Cassandra.

It looks like this:

- Step 1: Produce data in the test
- Step 2: Kafka stream gets triggered
- Step 3: Test checks whether cassandra got populated

The problem is, Step 3 is occurring before Step 2, and as a result, the
test fails as it doesn't find the data in the table.

I've resolved this by adding a Thread.sleep(2000) call after Step 1, which
ensures that Step 2 gets triggered before Step 3.

However, I'm wondering if there's a more reliable way of blocking the test
until Kafka stream processor gets triggered?

At the moment, I'm using 1 thread for the processor. If I increase that to
2 threads, will that achieve what I want?


Re: How to block tests of Kafka Streams until messages processed?

2016-10-19 Thread Eno Thereska
Hi Ali,

Any chance you could recycle some of the code we have in 
streams/src/test/java/.../streams/integration/utils? (I know we don't have it 
easily accessible in Maven, for now perhaps you could copy to your directory?)

For example there is a method there 
"IntegrationTestUtils.waitUntilMinValuesRecordsReceived" that could help. E.g., 
it is used in almost all our integration tests. One caveat: this method checks 
if values have been received in a topic, not Cassandra, so your streams test 
might have to write to a dummy output topic, as well as to Cassandra.

Let me know what you think,
Eno


> On 19 Oct 2016, at 21:37, Ali Akhtar  wrote:
> 
> I'm using Kafka Streams, and I'm attempting to write integration tests for
> a stream processor.
> 
> The processor listens to a topic, processes incoming messages, and writes
> some data to Cassandra tables.
> 
> I'm attempting to write a test which produces some test data, and then
> checks whether or not the expected data was written to Cassandra.
> 
> It looks like this:
> 
> - Step 1: Produce data in the test
> - Step 2: Kafka stream gets triggered
> - Step 3: Test checks whether cassandra got populated
> 
> The problem is, Step 3 is occurring before Step 2, and as a result, the
> test fails as it doesn't find the data in the table.
> 
> I've resolved this by adding a Thread.sleep(2000) call after Step 1, which
> ensures that Step 2 gets triggered before Step 3.
> 
> However, I'm wondering if there's a more reliable way of blocking the test
> until Kafka stream processor gets triggered?
> 
> At the moment, I'm using 1 thread for the processor. If I increase that to
> 2 threads, will that achieve what I want?



Re: How to block tests of Kafka Streams until messages processed?

2016-10-19 Thread Eno Thereska
Wanted to add that there is nothing too special about these utility functions, 
they are built using a normal consumer.

Eno

> On 19 Oct 2016, at 21:59, Eno Thereska  wrote:
> 
> Hi Ali,
> 
> Any chance you could recycle some of the code we have in 
> streams/src/test/java/.../streams/integration/utils? (I know we don't have it 
> easily accessible in Maven, for now perhaps you could copy to your directory?)
> 
> For example there is a method there 
> "IntegrationTestUtils.waitUntilMinValuesRecordsReceived" that could help. 
> E.g., it is used in almost all our integration tests. One caveat: this method 
> checks if values have been received in a topic, not Cassandra, so your 
> streams test might have to write to a dummy output topic, as well as to 
> Cassandra.
> 
> Let me know what you think,
> Eno
> 
> 
>> On 19 Oct 2016, at 21:37, Ali Akhtar  wrote:
>> 
>> I'm using Kafka Streams, and I'm attempting to write integration tests for
>> a stream processor.
>> 
>> The processor listens to a topic, processes incoming messages, and writes
>> some data to Cassandra tables.
>> 
>> I'm attempting to write a test which produces some test data, and then
>> checks whether or not the expected data was written to Cassandra.
>> 
>> It looks like this:
>> 
>> - Step 1: Produce data in the test
>> - Step 2: Kafka stream gets triggered
>> - Step 3: Test checks whether cassandra got populated
>> 
>> The problem is, Step 3 is occurring before Step 2, and as a result, the
>> test fails as it doesn't find the data in the table.
>> 
>> I've resolved this by adding a Thread.sleep(2000) call after Step 1, which
>> ensures that Step 2 gets triggered before Step 3.
>> 
>> However, I'm wondering if there's a more reliable way of blocking the test
>> until Kafka stream processor gets triggered?
>> 
>> At the moment, I'm using 1 thread for the processor. If I increase that to
>> 2 threads, will that achieve what I want?
> 



RE: How to block tests of Kafka Streams until messages processed?

2016-10-19 Thread Tauzell, Dave
For similar queue related tests we put the check in a loop.  Check every second 
until either the result is found or a timeout  happens.

-Dave

-Original Message-
From: Ali Akhtar [mailto:ali.rac...@gmail.com]
Sent: Wednesday, October 19, 2016 3:38 PM
To: users@kafka.apache.org
Subject: How to block tests of Kafka Streams until messages processed?

I'm using Kafka Streams, and I'm attempting to write integration tests for a 
stream processor.

The processor listens to a topic, processes incoming messages, and writes some 
data to Cassandra tables.

I'm attempting to write a test which produces some test data, and then checks 
whether or not the expected data was written to Cassandra.

It looks like this:

- Step 1: Produce data in the test
- Step 2: Kafka stream gets triggered
- Step 3: Test checks whether cassandra got populated

The problem is, Step 3 is occurring before Step 2, and as a result, the test 
fails as it doesn't find the data in the table.

I've resolved this by adding a Thread.sleep(2000) call after Step 1, which 
ensures that Step 2 gets triggered before Step 3.

However, I'm wondering if there's a more reliable way of blocking the test 
until Kafka stream processor gets triggered?

At the moment, I'm using 1 thread for the processor. If I increase that to
2 threads, will that achieve what I want?
This e-mail and any files transmitted with it are confidential, may contain 
sensitive information, and are intended solely for the use of the individual or 
entity to whom they are addressed. If you have received this e-mail in error, 
please notify the sender by reply e-mail immediately and destroy all copies of 
the e-mail and any attachments.


Re: How to block tests of Kafka Streams until messages processed?

2016-10-19 Thread Ali Akhtar
Yeah, I did think to use that method, but as you said, it writes to a dummy
output topic, which means I'd have to put in magic code just for the tests
to pass (the actual code writes to cassandra and not to a dummy topic).


On Thu, Oct 20, 2016 at 2:00 AM, Tauzell, Dave  wrote:

> For similar queue related tests we put the check in a loop.  Check every
> second until either the result is found or a timeout  happens.
>
> -Dave
>
> -Original Message-
> From: Ali Akhtar [mailto:ali.rac...@gmail.com]
> Sent: Wednesday, October 19, 2016 3:38 PM
> To: users@kafka.apache.org
> Subject: How to block tests of Kafka Streams until messages processed?
>
> I'm using Kafka Streams, and I'm attempting to write integration tests for
> a stream processor.
>
> The processor listens to a topic, processes incoming messages, and writes
> some data to Cassandra tables.
>
> I'm attempting to write a test which produces some test data, and then
> checks whether or not the expected data was written to Cassandra.
>
> It looks like this:
>
> - Step 1: Produce data in the test
> - Step 2: Kafka stream gets triggered
> - Step 3: Test checks whether cassandra got populated
>
> The problem is, Step 3 is occurring before Step 2, and as a result, the
> test fails as it doesn't find the data in the table.
>
> I've resolved this by adding a Thread.sleep(2000) call after Step 1, which
> ensures that Step 2 gets triggered before Step 3.
>
> However, I'm wondering if there's a more reliable way of blocking the test
> until Kafka stream processor gets triggered?
>
> At the moment, I'm using 1 thread for the processor. If I increase that to
> 2 threads, will that achieve what I want?
> This e-mail and any files transmitted with it are confidential, may
> contain sensitive information, and are intended solely for the use of the
> individual or entity to whom they are addressed. If you have received this
> e-mail in error, please notify the sender by reply e-mail immediately and
> destroy all copies of the e-mail and any attachments.
>


Re: How to block tests of Kafka Streams until messages processed?

2016-10-20 Thread Michael Noll
Ali,

my main feedback is similar to what Eno and Dave have already said.  In
your situation, options like these are what you'd currently need to do
since you are writing directly from your Kafka Stream app to Cassandra,
rather than writing from your app to Kafka and then using Kafka Connect to
ingest into Cassandra.



On Wed, Oct 19, 2016 at 11:03 PM, Ali Akhtar  wrote:

> Yeah, I did think to use that method, but as you said, it writes to a dummy
> output topic, which means I'd have to put in magic code just for the tests
> to pass (the actual code writes to cassandra and not to a dummy topic).
>
>
> On Thu, Oct 20, 2016 at 2:00 AM, Tauzell, Dave <
> dave.tauz...@surescripts.com
> > wrote:
>
> > For similar queue related tests we put the check in a loop.  Check every
> > second until either the result is found or a timeout  happens.
> >
> > -Dave
> >
> > -Original Message-
> > From: Ali Akhtar [mailto:ali.rac...@gmail.com]
> > Sent: Wednesday, October 19, 2016 3:38 PM
> > To: users@kafka.apache.org
> > Subject: How to block tests of Kafka Streams until messages processed?
> >
> > I'm using Kafka Streams, and I'm attempting to write integration tests
> for
> > a stream processor.
> >
> > The processor listens to a topic, processes incoming messages, and writes
> > some data to Cassandra tables.
> >
> > I'm attempting to write a test which produces some test data, and then
> > checks whether or not the expected data was written to Cassandra.
> >
> > It looks like this:
> >
> > - Step 1: Produce data in the test
> > - Step 2: Kafka stream gets triggered
> > - Step 3: Test checks whether cassandra got populated
> >
> > The problem is, Step 3 is occurring before Step 2, and as a result, the
> > test fails as it doesn't find the data in the table.
> >
> > I've resolved this by adding a Thread.sleep(2000) call after Step 1,
> which
> > ensures that Step 2 gets triggered before Step 3.
> >
> > However, I'm wondering if there's a more reliable way of blocking the
> test
> > until Kafka stream processor gets triggered?
> >
> > At the moment, I'm using 1 thread for the processor. If I increase that
> to
> > 2 threads, will that achieve what I want?
> > This e-mail and any files transmitted with it are confidential, may
> > contain sensitive information, and are intended solely for the use of the
> > individual or entity to whom they are addressed. If you have received
> this
> > e-mail in error, please notify the sender by reply e-mail immediately and
> > destroy all copies of the e-mail and any attachments.
> >
>


Re: How to block tests of Kafka Streams until messages processed?

2016-10-20 Thread Ali Akhtar
Michael,

Would there be any advantage to using the kafka connect method? Seems like
it'd just add an extra step of overhead?

On Thu, Oct 20, 2016 at 12:35 PM, Michael Noll  wrote:

> Ali,
>
> my main feedback is similar to what Eno and Dave have already said.  In
> your situation, options like these are what you'd currently need to do
> since you are writing directly from your Kafka Stream app to Cassandra,
> rather than writing from your app to Kafka and then using Kafka Connect to
> ingest into Cassandra.
>
>
>
> On Wed, Oct 19, 2016 at 11:03 PM, Ali Akhtar  wrote:
>
> > Yeah, I did think to use that method, but as you said, it writes to a
> dummy
> > output topic, which means I'd have to put in magic code just for the
> tests
> > to pass (the actual code writes to cassandra and not to a dummy topic).
> >
> >
> > On Thu, Oct 20, 2016 at 2:00 AM, Tauzell, Dave <
> > dave.tauz...@surescripts.com
> > > wrote:
> >
> > > For similar queue related tests we put the check in a loop.  Check
> every
> > > second until either the result is found or a timeout  happens.
> > >
> > > -Dave
> > >
> > > -----Original Message-----
> > > From: Ali Akhtar [mailto:ali.rac...@gmail.com]
> > > Sent: Wednesday, October 19, 2016 3:38 PM
> > > To: users@kafka.apache.org
> > > Subject: How to block tests of Kafka Streams until messages processed?
> > >
> > > I'm using Kafka Streams, and I'm attempting to write integration tests
> > for
> > > a stream processor.
> > >
> > > The processor listens to a topic, processes incoming messages, and
> writes
> > > some data to Cassandra tables.
> > >
> > > I'm attempting to write a test which produces some test data, and then
> > > checks whether or not the expected data was written to Cassandra.
> > >
> > > It looks like this:
> > >
> > > - Step 1: Produce data in the test
> > > - Step 2: Kafka stream gets triggered
> > > - Step 3: Test checks whether cassandra got populated
> > >
> > > The problem is, Step 3 is occurring before Step 2, and as a result, the
> > > test fails as it doesn't find the data in the table.
> > >
> > > I've resolved this by adding a Thread.sleep(2000) call after Step 1,
> > which
> > > ensures that Step 2 gets triggered before Step 3.
> > >
> > > However, I'm wondering if there's a more reliable way of blocking the
> > test
> > > until Kafka stream processor gets triggered?
> > >
> > > At the moment, I'm using 1 thread for the processor. If I increase that
> > to
> > > 2 threads, will that achieve what I want?
> > > This e-mail and any files transmitted with it are confidential, may
> > > contain sensitive information, and are intended solely for the use of
> the
> > > individual or entity to whom they are addressed. If you have received
> > this
> > > e-mail in error, please notify the sender by reply e-mail immediately
> and
> > > destroy all copies of the e-mail and any attachments.
> > >
> >
>


Re: How to block tests of Kafka Streams until messages processed?

2016-10-20 Thread Michael Noll
> Would there be any advantage to using the kafka connect method?

The advantage is to decouple the data processing (which you do in your app)
from the responsibility of making the processing results available to one
or more downstream systems, like Cassandra.

For example, what will your application (that uses Kafka Streams) do if
Cassandra is unavailable or slow?  Will you retry, and if so -- for how
long?  Retrying writes to external systems means that the time spent doing
this will not be spent on processing the next input records, thus
increasing the latency of your processing topology.  At this point you have
coupled your app to the availability of both Kafka and Cassandra.
Upgrading or doing maintenance on Cassandra will now also mean there's
potential impact on your app.



On Thu, Oct 20, 2016 at 9:39 AM, Ali Akhtar  wrote:

> Michael,
>
> Would there be any advantage to using the kafka connect method? Seems like
> it'd just add an extra step of overhead?
>
> On Thu, Oct 20, 2016 at 12:35 PM, Michael Noll 
> wrote:
>
> > Ali,
> >
> > my main feedback is similar to what Eno and Dave have already said.  In
> > your situation, options like these are what you'd currently need to do
> > since you are writing directly from your Kafka Stream app to Cassandra,
> > rather than writing from your app to Kafka and then using Kafka Connect
> to
> > ingest into Cassandra.
> >
> >
> >
> > On Wed, Oct 19, 2016 at 11:03 PM, Ali Akhtar 
> wrote:
> >
> > > Yeah, I did think to use that method, but as you said, it writes to a
> > dummy
> > > output topic, which means I'd have to put in magic code just for the
> > tests
> > > to pass (the actual code writes to cassandra and not to a dummy topic).
> > >
> > >
> > > On Thu, Oct 20, 2016 at 2:00 AM, Tauzell, Dave <
> > > dave.tauz...@surescripts.com
> > > > wrote:
> > >
> > > > For similar queue related tests we put the check in a loop.  Check
> > every
> > > > second until either the result is found or a timeout  happens.
> > > >
> > > > -Dave
> > > >
> > > > -Original Message-
> > > > From: Ali Akhtar [mailto:ali.rac...@gmail.com]
> > > > Sent: Wednesday, October 19, 2016 3:38 PM
> > > > To: users@kafka.apache.org
> > > > Subject: How to block tests of Kafka Streams until messages
> processed?
> > > >
> > > > I'm using Kafka Streams, and I'm attempting to write integration
> tests
> > > for
> > > > a stream processor.
> > > >
> > > > The processor listens to a topic, processes incoming messages, and
> > writes
> > > > some data to Cassandra tables.
> > > >
> > > > I'm attempting to write a test which produces some test data, and
> then
> > > > checks whether or not the expected data was written to Cassandra.
> > > >
> > > > It looks like this:
> > > >
> > > > - Step 1: Produce data in the test
> > > > - Step 2: Kafka stream gets triggered
> > > > - Step 3: Test checks whether cassandra got populated
> > > >
> > > > The problem is, Step 3 is occurring before Step 2, and as a result,
> the
> > > > test fails as it doesn't find the data in the table.
> > > >
> > > > I've resolved this by adding a Thread.sleep(2000) call after Step 1,
> > > which
> > > > ensures that Step 2 gets triggered before Step 3.
> > > >
> > > > However, I'm wondering if there's a more reliable way of blocking the
> > > test
> > > > until Kafka stream processor gets triggered?
> > > >
> > > > At the moment, I'm using 1 thread for the processor. If I increase
> that
> > > to
> > > > 2 threads, will that achieve what I want?
> > > > This e-mail and any files transmitted with it are confidential, may
> > > > contain sensitive information, and are intended solely for the use of
> > the
> > > > individual or entity to whom they are addressed. If you have received
> > > this
> > > > e-mail in error, please notify the sender by reply e-mail immediately
> > and
> > > > destroy all copies of the e-mail and any attachments.
> > > >
> > >
> >
>