Hi, Yevhenii,

Could you clarify how you want to consume the "materialized views"? Are you
planning to access them only within the realtime analytics pipeline (i.e.
Samza), or do you want to serve them so that online applications outside
Samza can read them? In the latter case, the "materialized view" is usually
stored in a separate persistent store. At LinkedIn, the destination store
is usually a remote KV-store, and we have two patterns for loading/updating
the materialized view: a) write directly to the KV-store; b) write to Kafka
and let the remote KV-store consume the updates asynchronously. The latter
usually gives better performance and has less impact on applications
reading the "materialized view".
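
To make the difference between the two patterns concrete, here is a rough
sketch in plain Java. It does not use the Samza or Kafka APIs: KvStore, Update
and the Queue are hypothetical in-memory stand-ins for the remote KV-store, a
view update and a Kafka topic, and all method names are made up for
illustration.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Sketch only: in-memory stand-ins for the remote KV-store and a Kafka topic. */
final class MaterializedViewSketch {

    /** Hypothetical stand-in for the remote KV-store that online applications read. */
    static final class KvStore {
        private final Map<String, String> rows = new HashMap<>();
        void put(String key, String value) { rows.put(key, value); }
        String get(String key) { return rows.get(key); }
    }

    /** One update to the view: the key of the view row and its new value. */
    static final class Update {
        final String key;
        final String value;
        Update(String key, String value) { this.key = key; this.value = value; }
    }

    /** Pattern (a): the stream task writes each update straight to the KV-store. */
    static void directWrite(KvStore store, Update update) {
        store.put(update.key, update.value);
    }

    /** Pattern (b), step 1: the stream task only appends the update to an output topic. */
    static void writeToTopic(Queue<Update> topic, Update update) {
        topic.add(update);
    }

    /**
     * Pattern (b), step 2: the store side drains the topic at its own pace,
     * applying at most maxBatch updates per call. Returns the number applied.
     */
    static int drainTopic(Queue<Update> topic, KvStore store, int maxBatch) {
        int applied = 0;
        while (applied < maxBatch && !topic.isEmpty()) {
            Update u = topic.poll();
            store.put(u.key, u.value);
            applied++;
        }
        return applied;
    }

    public static void main(String[] args) {
        KvStore store = new KvStore();

        // Pattern (a): the update is visible to readers as soon as the task writes it.
        directWrite(store, new Update("user:1", "profile-v1"));
        System.out.println("after direct write: " + store.get("user:1"));

        // Pattern (b): the update sits in the topic until the store consumes it.
        Queue<Update> topic = new ArrayDeque<>();
        writeToTopic(topic, new Update("user:2", "profile-v1"));
        System.out.println("before drain: " + store.get("user:2"));
        drainTopic(topic, store, 100);
        System.out.println("after drain: " + store.get("user:2"));
    }
}
```

In the sketch, pattern (b) means readers may see a slightly stale view until
the store catches up, but the store controls how fast it applies updates
(maxBatch here), which is where the reduced impact on readers comes from.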

As for setting up a Samza application to run, it is pretty straightforward.
The Hello Samza example on the web should let you start a Samza application
within 15 min. At LinkedIn, with all the internal tools/deployment
requirements, we have a guide for users to build and deploy a Samza job in
30 min. The biggest overhead is probably not Samza itself, but setting up
Kafka and YARN.

Let us know if you run into any difficulties.

Cheers!

-Yi

On Tue, Jan 10, 2017 at 9:08 PM, Yevhenii Kurtov <yevhenii.kur...@gmail.com>
wrote:

> Hello,
>
> Recently I watched "Turning the database inside out with Apache Samza" and
> was very impressed with the "Fully precomputed cache" part, as it seems to
> hold a remedy for the exact problem that our company is currently facing.
> We build niche-specific software and are nowhere near LinkedIn or Uber,
> but the data we operate on is growing steadily.
> Like almost everybody, we normalized our database as much as possible from
> the very beginning, and now, years later, read performance is becoming
> less and less satisfactory.
>
> The idea is to feed the MySQL log into Samza and build "materialized views"
> for all the use cases we want. The part that I don't understand is where
> those "materialized views"/"caches" will be stored: in Samza itself, or
> will Samza write them back to, say, a Kafka queue, another MySQL database,
> or something else?
>
> Has anyone implemented such a scenario in production? It would be great to
> hear about your experience, as this is my first encounter with stream
> processors and I don't have any idea of the difficulties and challenges
> that introducing Apache Samza into an application stack may bring along.
>
