Hi all,

I have built a simple project that integrates Kite SDK Morphlines with
Storm.
The main idea is to give users the opportunity to create a simple
topology that can run configurable ETL.

Details:
- source code:
https://github.com/qiozas/sourcevirtues-samples/tree/master/sv-etl-storm-morphlines
- blog post:
http://sourcevirtues.com/2016/01/04/configurable-etl-processing-using-apache-storm-and-kite-sdk-morphlines/
- Morphlines: http://kitesdk.org/docs/current/morphlines/

The main bolt is MorphlinesBolt. It has 3 configurable sections:
a) Morphlines configuration file (which may optionally include Java code).
b) Mapper from the incoming Tuple to the Morphlines execution input.
c) (Optional) Mapper from the Morphlines output to a new Tuple.
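
To make the three sections concrete, here is a rough plain-Java sketch of
how they could fit together. It is only an illustration, not the code from
the repository: it uses no Storm or Kite classes, a Tuple and a Morphlines
record are both modelled as a Map, the Morphlines execution is replaced by
a simple function, and every name in it (ConfigurableEtlSketch,
InputMapper, OutputMapper, demo) is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.function.UnaryOperator;

// Illustration only: all names are hypothetical, not the real MorphlinesBolt API.
final class ConfigurableEtlSketch {

    /** (b) Maps an incoming tuple (modelled as a Map) to the ETL input record. */
    interface InputMapper {
        Map<String, Object> toRecord(Map<String, Object> tuple);
    }

    /** (c) Optional: maps the ETL output record to the fields of a new tuple. */
    interface OutputMapper {
        Map<String, Object> toTuple(Map<String, Object> record);
    }

    // (a) Stand-in for the transformation a Morphlines configuration would define.
    private final UnaryOperator<Map<String, Object>> etl;
    private final InputMapper inputMapper;
    private final OutputMapper outputMapper; // null means nothing is emitted

    ConfigurableEtlSketch(UnaryOperator<Map<String, Object>> etl,
                          InputMapper inputMapper,
                          OutputMapper outputMapper) {
        this.etl = etl;
        this.inputMapper = inputMapper;
        this.outputMapper = outputMapper;
    }

    /** Runs mapper (b), then the ETL step (a), then mapper (c) if one is set. */
    Optional<Map<String, Object>> process(Map<String, Object> tuple) {
        Map<String, Object> result = etl.apply(inputMapper.toRecord(tuple));
        return outputMapper == null
                ? Optional.empty()
                : Optional.of(outputMapper.toTuple(result));
    }

    /** Example wiring: trim and upper-case the "message" field of a tuple. */
    static ConfigurableEtlSketch demo() {
        return new ConfigurableEtlSketch(
                record -> {
                    Map<String, Object> out = new LinkedHashMap<>(record);
                    String raw = String.valueOf(record.get("raw"));
                    out.put("clean", raw.trim().toUpperCase(Locale.ROOT));
                    return out;
                },
                tuple -> Map.of("raw", tuple.get("message")),
                record -> Map.of("clean", record.get("clean")));
    }

    public static void main(String[] args) {
        System.out.println(demo().process(Map.of("message", "  hello storm ")));
    }
}
```

Passing null as the output mapper corresponds to the optional case in (c):
the sketch then processes the tuple but returns nothing to emit.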

If you find this implementation useful, I can fork your repository and
submit this code as a pull request, following your contribution procedure.

The current implementation is not final, and you may find bugs or things
that need changing, but I will keep working on it. A Trident version is not
yet implemented, as I would like to first deliver a stable version for pure
Storm and then add a new User Story for Trident users.

I find the idea of configurable ETL very useful, so please tell me whether
you like this concept or not, so that I can contribute this code :)
Using Morphlines together with Flux seems a valid approach for
configurable ETL.

As this is my first email to the dev mailing list, I would like to say a
big thank you to all of you for this great streaming engine. I think Storm
is great because of its simplicity. You can build so many things using this
generic engine. Great job!!!

Regards,
Adrianos Dadis.
