That makes a lot of sense to me, thanks.

Could you also point me to how implement a custom MessageCollector, for
example if I want to send messages to ActiveMQ instead of Kafka?

Thanks for your help
Max
On 5 Sep 2014 09:52, "Yan Fang" <[email protected]> wrote:

> Hi Massimiliano,
>
> From my understanding, what you want to do is to process the messages and
> then store them into, say, Cassandra. To implement this, it's not necessary
> to write MessageCollector. What you only need to do is to put the writing
> logic in the process method, see the API doc
> <
> https://samza.incubator.apache.org/learn/documentation/latest/api/overview.html
> >.
> The method is called for every message. So you can process the message and
> store it into the remote DB if you want.
>
> Assume you already tested the hello-samza
> <https://samza.incubator.apache.org/startup/hello-samza/latest/> project,
> you can have a look at the WikipediaFeedStreamTask
> <
> https://github.com/apache/incubator-samza-hello-samza/blob/master/samza-wikipedia/src/main/java/samza/examples/wikipedia/task/WikipediaFeedStreamTask.java
> >
> and
> WikipediaParserStreamTask
> <
> https://github.com/apache/incubator-samza-hello-samza/blob/master/samza-wikipedia/src/main/java/samza/examples/wikipedia/task/WikipediaParserStreamTask.java
> >
> .
> You can put your logic in the process method.
>
> In terms of writing a batch into the DB for better performance, you can
> have a batch variable (such as List, Map) in the StreamTask to store the
> processed result. Then write the results in the variable into the DB after
> certain number of messages or after certain time ( by implementing the
> Window
> <
> https://samza.incubator.apache.org/learn/documentation/latest/container/windowing.html
> >
>  interface).
>
> Hope that helps.
>
> Thanks,
>
> Fang, Yan
> [email protected]
> +1 (206) 849-4108
>
>
> On Thu, Sep 4, 2014 at 2:58 PM, Massimiliano Tomassi <
> [email protected]>
> wrote:
>
> > Hello all,
> > I was trying to figure out what's the way to implement and use a custom
> > MessageCollector. Let's say I want to send messages to a system different
> > from Kafka. How should I do that? Is there any tutorial explaining this?
> >
> > I was also thinking at the following use case, not sure if it makes sense
> > at all but here it is: let's say we receive messages, process them
> somehow
> > and then we want to store the results in a remote DB, Cassandra for
> > example. Does it make sense to create an implementation of
> MessageCollector
> > that stores to Cassandra? Also, given that performing a write for every
> > single message can be not very efficient, would it be possible to collect
> > some data and then write them to Cassandra as a single batch operation?
> >
> > I hope to have explained myself decently...and I hope to receive some
> > suggestions.
> >
> > All the best.
> > Max
> >
> >
> > --
> > ------------------------------------------------
> > Massimiliano Tomassi
> > ------------------------------------------------
> > e-mail: [email protected]
> > ------------------------------------------------
> >
>

Reply via email to