Re: How to block tests of Kafka Streams until messages processed?
> Would there be any advantage to using the kafka connect method? The advantage is to decouple the data processing (which you do in your app) from the responsibility of making the processing results available to one or more downstream systems, like Cassandra. For example, what will your application (that uses Kafka Streams) do if Cassandra is unavailable or slow? Will you retry, and if so -- for how long? Retrying writes to external systems means that the time spent doing this will not be spent on processing the next input records, thus increasing the latency of your processing topology. At this point you have coupled your app to the availability of both Kafka and Cassandra. Upgrading or doing maintenance on Cassandra will now also mean there's potential impact on your app. On Thu, Oct 20, 2016 at 9:39 AM, Ali Akhtar wrote: > Michael, > > Would there be any advantage to using the kafka connect method? Seems like > it'd just add an extra step of overhead? > > On Thu, Oct 20, 2016 at 12:35 PM, Michael Noll > wrote: > > > Ali, > > > > my main feedback is similar to what Eno and Dave have already said. In > > your situation, options like these are what you'd currently need to do > > since you are writing directly from your Kafka Stream app to Cassandra, > > rather than writing from your app to Kafka and then using Kafka Connect > to > > ingest into Cassandra. > > > > > > > > On Wed, Oct 19, 2016 at 11:03 PM, Ali Akhtar > wrote: > > > > > Yeah, I did think to use that method, but as you said, it writes to a > > dummy > > > output topic, which means I'd have to put in magic code just for the > > tests > > > to pass (the actual code writes to cassandra and not to a dummy topic). > > > > > > > > > On Thu, Oct 20, 2016 at 2:00 AM, Tauzell, Dave < > > > dave.tauz...@surescripts.com > > > > wrote: > > > > > > > For similar queue related tests we put the check in a loop. Check > > every > > > > second until either the result is found or a timeout happens. > > > > > > > > -Dave > > > > > > > > -Original Message- > > > > From: Ali Akhtar [mailto:ali.rac...@gmail.com] > > > > Sent: Wednesday, October 19, 2016 3:38 PM > > > > To: users@kafka.apache.org > > > > Subject: How to block tests of Kafka Streams until messages > processed? > > > > > > > > I'm using Kafka Streams, and I'm attempting to write integration > tests > > > for > > > > a stream processor. > > > > > > > > The processor listens to a topic, processes incoming messages, and > > writes > > > > some data to Cassandra tables. > > > > > > > > I'm attempting to write a test which produces some test data, and > then > > > > checks whether or not the expected data was written to Cassandra. > > > > > > > > It looks like this: > > > > > > > > - Step 1: Produce data in the test > > > > - Step 2: Kafka stream gets triggered > > > > - Step 3: Test checks whether cassandra got populated > > > > > > > > The problem is, Step 3 is occurring before Step 2, and as a result, > the > > > > test fails as it doesn't find the data in the table. > > > > > > > > I've resolved this by adding a Thread.sleep(2000) call after Step 1, > > > which > > > > ensures that Step 2 gets triggered before Step 3. > > > > > > > > However, I'm wondering if there's a more reliable way of blocking the > > > test > > > > until Kafka stream processor gets triggered? > > > > > > > > At the moment, I'm using 1 thread for the processor. If I increase > that > > > to > > > > 2 threads, will that achieve what I want? > > > > This e-mail and any files transmitted with it are confidential, may > > > > contain sensitive information, and are intended solely for the use of > > the > > > > individual or entity to whom they are addressed. If you have received > > > this > > > > e-mail in error, please notify the sender by reply e-mail immediately > > and > > > > destroy all copies of the e-mail and any attachments. > > > > > > > > > >
Re: How to block tests of Kafka Streams until messages processed?
Michael, Would there be any advantage to using the kafka connect method? Seems like it'd just add an extra step of overhead? On Thu, Oct 20, 2016 at 12:35 PM, Michael Noll wrote: > Ali, > > my main feedback is similar to what Eno and Dave have already said. In > your situation, options like these are what you'd currently need to do > since you are writing directly from your Kafka Stream app to Cassandra, > rather than writing from your app to Kafka and then using Kafka Connect to > ingest into Cassandra. > > > > On Wed, Oct 19, 2016 at 11:03 PM, Ali Akhtar wrote: > > > Yeah, I did think to use that method, but as you said, it writes to a > dummy > > output topic, which means I'd have to put in magic code just for the > tests > > to pass (the actual code writes to cassandra and not to a dummy topic). > > > > > > On Thu, Oct 20, 2016 at 2:00 AM, Tauzell, Dave < > > dave.tauz...@surescripts.com > > > wrote: > > > > > For similar queue related tests we put the check in a loop. Check > every > > > second until either the result is found or a timeout happens. > > > > > > -Dave > > > > > > -Original Message- > > > From: Ali Akhtar [mailto:ali.rac...@gmail.com] > > > Sent: Wednesday, October 19, 2016 3:38 PM > > > To: users@kafka.apache.org > > > Subject: How to block tests of Kafka Streams until messages processed? > > > > > > I'm using Kafka Streams, and I'm attempting to write integration tests > > for > > > a stream processor. > > > > > > The processor listens to a topic, processes incoming messages, and > writes > > > some data to Cassandra tables. > > > > > > I'm attempting to write a test which produces some test data, and then > > > checks whether or not the expected data was written to Cassandra. > > > > > > It looks like this: > > > > > > - Step 1: Produce data in the test > > > - Step 2: Kafka stream gets triggered > > > - Step 3: Test checks whether cassandra got populated > > > > > > The problem is, Step 3 is occurring before Step 2, and as a result, the > > > test fails as it doesn't find the data in the table. > > > > > > I've resolved this by adding a Thread.sleep(2000) call after Step 1, > > which > > > ensures that Step 2 gets triggered before Step 3. > > > > > > However, I'm wondering if there's a more reliable way of blocking the > > test > > > until Kafka stream processor gets triggered? > > > > > > At the moment, I'm using 1 thread for the processor. If I increase that > > to > > > 2 threads, will that achieve what I want? > > > This e-mail and any files transmitted with it are confidential, may > > > contain sensitive information, and are intended solely for the use of > the > > > individual or entity to whom they are addressed. If you have received > > this > > > e-mail in error, please notify the sender by reply e-mail immediately > and > > > destroy all copies of the e-mail and any attachments. > > > > > >
Re: How to block tests of Kafka Streams until messages processed?
Ali, my main feedback is similar to what Eno and Dave have already said. In your situation, options like these are what you'd currently need to do since you are writing directly from your Kafka Stream app to Cassandra, rather than writing from your app to Kafka and then using Kafka Connect to ingest into Cassandra. On Wed, Oct 19, 2016 at 11:03 PM, Ali Akhtar wrote: > Yeah, I did think to use that method, but as you said, it writes to a dummy > output topic, which means I'd have to put in magic code just for the tests > to pass (the actual code writes to cassandra and not to a dummy topic). > > > On Thu, Oct 20, 2016 at 2:00 AM, Tauzell, Dave < > dave.tauz...@surescripts.com > > wrote: > > > For similar queue related tests we put the check in a loop. Check every > > second until either the result is found or a timeout happens. > > > > -Dave > > > > -Original Message- > > From: Ali Akhtar [mailto:ali.rac...@gmail.com] > > Sent: Wednesday, October 19, 2016 3:38 PM > > To: users@kafka.apache.org > > Subject: How to block tests of Kafka Streams until messages processed? > > > > I'm using Kafka Streams, and I'm attempting to write integration tests > for > > a stream processor. > > > > The processor listens to a topic, processes incoming messages, and writes > > some data to Cassandra tables. > > > > I'm attempting to write a test which produces some test data, and then > > checks whether or not the expected data was written to Cassandra. > > > > It looks like this: > > > > - Step 1: Produce data in the test > > - Step 2: Kafka stream gets triggered > > - Step 3: Test checks whether cassandra got populated > > > > The problem is, Step 3 is occurring before Step 2, and as a result, the > > test fails as it doesn't find the data in the table. > > > > I've resolved this by adding a Thread.sleep(2000) call after Step 1, > which > > ensures that Step 2 gets triggered before Step 3. > > > > However, I'm wondering if there's a more reliable way of blocking the > test > > until Kafka stream processor gets triggered? > > > > At the moment, I'm using 1 thread for the processor. If I increase that > to > > 2 threads, will that achieve what I want? > > This e-mail and any files transmitted with it are confidential, may > > contain sensitive information, and are intended solely for the use of the > > individual or entity to whom they are addressed. If you have received > this > > e-mail in error, please notify the sender by reply e-mail immediately and > > destroy all copies of the e-mail and any attachments. > > >
Re: How to block tests of Kafka Streams until messages processed?
Yeah, I did think to use that method, but as you said, it writes to a dummy output topic, which means I'd have to put in magic code just for the tests to pass (the actual code writes to cassandra and not to a dummy topic). On Thu, Oct 20, 2016 at 2:00 AM, Tauzell, Dave wrote: > For similar queue related tests we put the check in a loop. Check every > second until either the result is found or a timeout happens. > > -Dave > > -Original Message- > From: Ali Akhtar [mailto:ali.rac...@gmail.com] > Sent: Wednesday, October 19, 2016 3:38 PM > To: users@kafka.apache.org > Subject: How to block tests of Kafka Streams until messages processed? > > I'm using Kafka Streams, and I'm attempting to write integration tests for > a stream processor. > > The processor listens to a topic, processes incoming messages, and writes > some data to Cassandra tables. > > I'm attempting to write a test which produces some test data, and then > checks whether or not the expected data was written to Cassandra. > > It looks like this: > > - Step 1: Produce data in the test > - Step 2: Kafka stream gets triggered > - Step 3: Test checks whether cassandra got populated > > The problem is, Step 3 is occurring before Step 2, and as a result, the > test fails as it doesn't find the data in the table. > > I've resolved this by adding a Thread.sleep(2000) call after Step 1, which > ensures that Step 2 gets triggered before Step 3. > > However, I'm wondering if there's a more reliable way of blocking the test > until Kafka stream processor gets triggered? > > At the moment, I'm using 1 thread for the processor. If I increase that to > 2 threads, will that achieve what I want? > This e-mail and any files transmitted with it are confidential, may > contain sensitive information, and are intended solely for the use of the > individual or entity to whom they are addressed. If you have received this > e-mail in error, please notify the sender by reply e-mail immediately and > destroy all copies of the e-mail and any attachments. >
RE: How to block tests of Kafka Streams until messages processed?
For similar queue related tests we put the check in a loop. Check every second until either the result is found or a timeout happens. -Dave -Original Message- From: Ali Akhtar [mailto:ali.rac...@gmail.com] Sent: Wednesday, October 19, 2016 3:38 PM To: users@kafka.apache.org Subject: How to block tests of Kafka Streams until messages processed? I'm using Kafka Streams, and I'm attempting to write integration tests for a stream processor. The processor listens to a topic, processes incoming messages, and writes some data to Cassandra tables. I'm attempting to write a test which produces some test data, and then checks whether or not the expected data was written to Cassandra. It looks like this: - Step 1: Produce data in the test - Step 2: Kafka stream gets triggered - Step 3: Test checks whether cassandra got populated The problem is, Step 3 is occurring before Step 2, and as a result, the test fails as it doesn't find the data in the table. I've resolved this by adding a Thread.sleep(2000) call after Step 1, which ensures that Step 2 gets triggered before Step 3. However, I'm wondering if there's a more reliable way of blocking the test until Kafka stream processor gets triggered? At the moment, I'm using 1 thread for the processor. If I increase that to 2 threads, will that achieve what I want? This e-mail and any files transmitted with it are confidential, may contain sensitive information, and are intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in error, please notify the sender by reply e-mail immediately and destroy all copies of the e-mail and any attachments.
Re: How to block tests of Kafka Streams until messages processed?
Wanted to add that there is nothing too special about these utility functions, they are built using a normal consumer. Eno > On 19 Oct 2016, at 21:59, Eno Thereska wrote: > > Hi Ali, > > Any chance you could recycle some of the code we have in > streams/src/test/java/.../streams/integration/utils? (I know we don't have it > easily accessible in Maven, for now perhaps you could copy to your directory?) > > For example there is a method there > "IntegrationTestUtils.waitUntilMinValuesRecordsReceived" that could help. > E.g., it is used in almost all our integration tests. One caveat: this method > checks if values have been received in a topic, not Cassandra, so your > streams test might have to write to a dummy output topic, as well as to > Cassandra. > > Let me know what you think, > Eno > > >> On 19 Oct 2016, at 21:37, Ali Akhtar wrote: >> >> I'm using Kafka Streams, and I'm attempting to write integration tests for >> a stream processor. >> >> The processor listens to a topic, processes incoming messages, and writes >> some data to Cassandra tables. >> >> I'm attempting to write a test which produces some test data, and then >> checks whether or not the expected data was written to Cassandra. >> >> It looks like this: >> >> - Step 1: Produce data in the test >> - Step 2: Kafka stream gets triggered >> - Step 3: Test checks whether cassandra got populated >> >> The problem is, Step 3 is occurring before Step 2, and as a result, the >> test fails as it doesn't find the data in the table. >> >> I've resolved this by adding a Thread.sleep(2000) call after Step 1, which >> ensures that Step 2 gets triggered before Step 3. >> >> However, I'm wondering if there's a more reliable way of blocking the test >> until Kafka stream processor gets triggered? >> >> At the moment, I'm using 1 thread for the processor. If I increase that to >> 2 threads, will that achieve what I want? >
Re: How to block tests of Kafka Streams until messages processed?
Hi Ali, Any chance you could recycle some of the code we have in streams/src/test/java/.../streams/integration/utils? (I know we don't have it easily accessible in Maven, for now perhaps you could copy to your directory?) For example there is a method there "IntegrationTestUtils.waitUntilMinValuesRecordsReceived" that could help. E.g., it is used in almost all our integration tests. One caveat: this method checks if values have been received in a topic, not Cassandra, so your streams test might have to write to a dummy output topic, as well as to Cassandra. Let me know what you think, Eno > On 19 Oct 2016, at 21:37, Ali Akhtar wrote: > > I'm using Kafka Streams, and I'm attempting to write integration tests for > a stream processor. > > The processor listens to a topic, processes incoming messages, and writes > some data to Cassandra tables. > > I'm attempting to write a test which produces some test data, and then > checks whether or not the expected data was written to Cassandra. > > It looks like this: > > - Step 1: Produce data in the test > - Step 2: Kafka stream gets triggered > - Step 3: Test checks whether cassandra got populated > > The problem is, Step 3 is occurring before Step 2, and as a result, the > test fails as it doesn't find the data in the table. > > I've resolved this by adding a Thread.sleep(2000) call after Step 1, which > ensures that Step 2 gets triggered before Step 3. > > However, I'm wondering if there's a more reliable way of blocking the test > until Kafka stream processor gets triggered? > > At the moment, I'm using 1 thread for the processor. If I increase that to > 2 threads, will that achieve what I want?