Next milestone of Storm SQL

Jungtaek Lim Thu, 20 Oct 2016 23:13:30 -0700

Hi devs,

Since the current milestone of Storm SQL is nearly done (all issues are
either resolved or under review), I'd like to share the next milestone and
hear your feedback.


Please refer to the current milestone and the upcoming work for Storm SQL:
https://cwiki.apache.org/confluence/display/STORM/Milestone+of+Storm+SQL

*Work planned for the next milestone*

- Support more input / output formats
  - Avro
  - CSV
- Add socket as an input and output data source (for testing purposes only,
no security)
- Automatic parallelism for input data source
  - Kafka
- Replace script evaluation in runtime with full code generation

*How will this milestone help users play with Storm SQL?*

- Users can also read/write messages from/to Avro and CSV.
  - Only JSON is supported now.
- Users can test their topology with a socket (easily with 'nc'), with no
need to set up a test Kafka topic.
- Storm SQL will automatically set the parallelism hint when the input
source provides partition information.
  - 'PARALLELISM' can override this value, so a manual parallelism hint is
always applied.
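
To illustrate the precedence described above (a manual 'PARALLELISM' value
wins, otherwise the partition count reported by the input source is used),
here is a minimal sketch. The class and method names are hypothetical and
not part of Storm SQL, and falling back to 1 when neither value is known is
an assumption made only for this example.

```java
public class ParallelismHintSketch {

    /**
     * Hypothetical helper, not an actual Storm SQL API.
     *
     * @param manualHint     value of 'PARALLELISM' if the user set one, else null
     * @param partitionCount partition count reported by the source, else null
     * @return the parallelism hint to apply
     */
    public static int resolveParallelism(Integer manualHint, Integer partitionCount) {
        if (manualHint != null) {
            // A manual hint always overrides the automatic value.
            return manualHint;
        }
        if (partitionCount != null) {
            // Automatic parallelism: follow the source's partition count.
            return partitionCount;
        }
        // Assumption for this sketch: default to 1 when nothing is known.
        return 1;
    }

    public static void main(String[] args) {
        System.out.println(resolveParallelism(null, 8)); // automatic: 8
        System.out.println(resolveParallelism(3, 8));    // manual override: 3
        System.out.println(resolveParallelism(null, null)); // assumed default: 1
    }
}
```

For a Kafka source, the partition count would presumably come from the
topic's metadata; how Storm SQL obtains it is not specified here.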

*How will this milestone improve Storm SQL internally?*

- Code generation will eliminate the overhead of evaluating code blocks at
runtime, making operators faster.

More data sources and more scalar functions may be added in this milestone
if contributors are interested in growing Storm SQL.
This list doesn't restrict the area of contribution. Direct contributions
to issues in the next milestone are also appreciated! Please let me know so
that we can talk about dividing the work.

I expect this milestone to take about one to two months, unless something
unexpected happens. It may be targeted to 1.2.0 (if 1.1.0 is released
earlier).

What do you think about the next milestone?

Thanks,
Jungtaek Lim (HeartSaVioR)
