The injunction that tuple processing should be "as fast as possible" is based
on an assumption and a fact:
1. In most cases, users want to maximize application throughput.
2. If a callback (like beginWindow(), process(), endWindow(), etc.) takes too
   long, the platform deems the operator hung and restarts it.
Neither imposes a hard constraint. If, for a particular class of
applications, it is OK to sacrifice throughput to allow some CPU-intensive
computations to occur, that is certainly possible; the constraint of (2) can be
relaxed by simply increasing the TIMEOUT_WINDOW_COUNT attribute for some or all
operators.
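
To pick a value for TIMEOUT_WINDOW_COUNT, a rough sizing sketch may help. It
assumes the timeout is roughly the window count multiplied by the streaming
window duration, and that the window is 500 ms (please verify both against
your own configuration). TimeoutSizing and its methods are hypothetical
helpers for illustration, not part of Apex:

```java
// Hypothetical helper (not an Apex class): estimates how many streaming
// windows a slow callback may span, as a starting point for
// TIMEOUT_WINDOW_COUNT.
final class TimeoutSizing {
    private TimeoutSizing() {}

    /**
     * Smallest number of windows whose combined duration covers the
     * worst-case callback time (ceiling division, at least 1).
     */
    static int minWindowCount(long worstCaseCallbackMillis, long windowMillis) {
        if (worstCaseCallbackMillis < 0 || windowMillis <= 0) {
            throw new IllegalArgumentException("times must be non-negative, window > 0");
        }
        long windows = (worstCaseCallbackMillis + windowMillis - 1) / windowMillis;
        return Math.toIntExact(Math.max(1L, windows));
    }

    /** Applies a safety factor (for example 2.0) and rounds up. */
    static int withHeadroom(int windows, double factor) {
        return (int) Math.ceil(windows * factor);
    }
}
```

For example, a callback that can take up to 90 seconds with an assumed 500 ms
window gives minWindowCount(90_000, 500) = 180 windows, or 360 with a 2x
safety factor. The result would then be set as the TIMEOUT_WINDOW_COUNT
attribute of the operator in question, in the DAG or in the application
configuration.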
Secondly, nothing prevents an operator from starting worker threads that
asynchronously perform CPU-intensive computations. Naturally, careful
synchronization will be necessary between the main and worker threads to ensure
correctness and timely delivery of results.
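
A minimal sketch of the worker-thread idea, in plain Java with no Apex
dependencies. The class AsyncWorkerSketch and its process(), endWindow() and
teardown() methods are hypothetical and only mirror the shape of the operator
callbacks; a real operator would emit the drained results on its output port
from the main thread:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical sketch: hands expensive work to a thread pool so that the
// per-tuple callback returns quickly.
final class AsyncWorkerSketch<I, O> {
    private final ExecutorService workers;
    private final Function<I, O> expensive;
    // Thread-safe hand-off from worker threads back to the main thread.
    private final BlockingQueue<O> done = new LinkedBlockingQueue<>();

    AsyncWorkerSketch(int threads, Function<I, O> expensive) {
        this.workers = Executors.newFixedThreadPool(threads);
        this.expensive = expensive;
    }

    /** Main thread, once per tuple: enqueue the work and return at once. */
    void process(I tuple) {
        // Note: an exception thrown by the computation is not reported here.
        workers.execute(() -> done.add(expensive.apply(tuple)));
    }

    /** Main thread, at window end: collect finished results without blocking. */
    List<O> endWindow() {
        List<O> out = new ArrayList<>();
        done.drainTo(out);
        return out;
    }

    /** Stop accepting work, wait for in-flight tasks, return what is left. */
    List<O> teardown(long timeoutMillis) {
        workers.shutdown();
        try {
            workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        return endWindow();
    }
}
```

Results can come back in a different order than the tuples arrived, and
possibly in a later window, so anything that depends on ordering or on window
boundaries needs extra bookkeeping beyond this sketch.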
Ram
On Friday, May 12, 2017 6:38 PM, Ananth G <[email protected]> wrote:
I guess the use cases as documented look really compelling. There might be
more comments from a code review perspective; the below is from a use case
perspective only.
I was wondering if you have any latency measurements for the tests you ran.
If the image processing calls (in the process function overridden from the
Toolkit class) are time consuming, it might not be an ideal use case for a
streaming engine? A very old "blog" (2012) talks about latencies anywhere
from tens of milliseconds to almost a second, depending on the use case and
image size. Of course, there have been hardware improvements since then and
those numbers might no longer hold, hence the question (the latencies depend
on the hardware being used as well).
This brings me to the next question, in general about Apex, to the community:
what is considered an acceptable tolerance level in terms of latencies for a
streaming compute engine like Apex? Is there a way to tune the acceptable
tolerance level depending on the use case? I keep reading on the mailing
lists that tuple processing is part of the main thread and hence
should be as fast as possible.
Regards
Ananth
> On 12 May 2017, at 9:05 pm, Aditya gholba <[email protected]> wrote:
>
> Hello,
> I have been working on an image processing library for Malhar and a few of
> the operators are ready. I would like to merge them into Malhar contrib. You
> can read about the operators and the applications I have created so far
> here.
> <https://docs.google.com/document/d/19OrqHJ_QzbuB0XZ4bzdQ9yjN2dGfDhsuMX6XUjDpqYw/edit>
>
> Link to my GitHub <https://github.com/adiv2/imIO4>
>
> All suggestions and opinions are welcome.
>
>
> Thanks,
> Aditya.