[jira] [Commented] (S2GRAPH-15) S2Lambda, speed and batch layers of the lambda architecture

Minseok Kim (JIRA) Mon, 28 Mar 2016 01:51:40 -0700

    [ 
https://issues.apache.org/jira/browse/S2GRAPH-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213974#comment-15213974
 ]


Minseok Kim commented on S2GRAPH-15:
------------------------------------

As I wrote in http://markmail.org/message/6ykv7uxhreo2bkmm,
I would like to break this issue down into the following sub-tasks:

  1. Configurable Spark launcher, like PredictionIO
  2. Resumable Kafka stream (at-least-once or exactly-once)
  3. Highly available, fault-tolerant scheduler, using something such as
Marathon or Chronos.
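
To make sub-task 2 more concrete, here is a minimal sketch of a resumable
consumer. It is only an illustration under assumptions: no real Kafka client
is used, a plain Python list stands in for a topic partition, and OffsetStore,
run_once and fail_at are hypothetical names, not part of S2Graph. The offset is
committed only after the sink write, so a crash replays the uncommitted batch
(at-least-once), and the sink is keyed by offset so that a replay overwrites
rather than duplicates (effectively exactly-once for this sink).

```python
from typing import Dict, List, Optional


class OffsetStore:
    """Hypothetical durable checkpoint; a dict stands in for real storage."""

    def __init__(self) -> None:
        self._committed: Dict[str, int] = {}

    def load(self, topic: str) -> int:
        # Next offset to read; 0 when nothing has been committed yet.
        return self._committed.get(topic, 0)

    def commit(self, topic: str, next_offset: int) -> None:
        self._committed[topic] = next_offset


def run_once(topic: str, log: List[str], store: OffsetStore,
             sink: Dict[int, str], batch_size: int = 2,
             fail_at: Optional[int] = None) -> None:
    """Resume from the last committed offset and consume to the end of log."""
    offset = store.load(topic)
    while offset < len(log):
        batch = list(enumerate(log[offset:offset + batch_size], start=offset))
        for off, record in batch:
            if fail_at is not None and off == fail_at:
                raise RuntimeError("simulated crash before commit")
            # Keyed by offset: reprocessing a record is idempotent.
            sink[off] = record
        offset += len(batch)
        # Commit only after the whole batch reached the sink.
        store.commit(topic, offset)


if __name__ == "__main__":
    log = ["a", "b", "c", "d", "e"]
    store, sink = OffsetStore(), {}
    try:
        run_once("wal", log, store, sink, fail_at=3)  # crashes mid-batch
    except RuntimeError:
        pass
    run_once("wal", log, store, sink)  # resumes from the committed offset
    print(store.load("wal"), sorted(sink.items()))
```

In the second run the record at offset 2 is processed again, which is the
at-least-once part; the offset-keyed sink is what keeps the result free of
duplicates.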


> S2Lambda, speed and batch layers of the lambda architecture
> -----------------------------------------------------------
>
>                 Key: S2GRAPH-15
>                 URL: https://issues.apache.org/jira/browse/S2GRAPH-15
>             Project: S2Graph
>          Issue Type: New Feature
>            Reporter: Minseok Kim
>              Labels: features
>         Attachments: s2lambda.001.png
>
>
> h4. Background
> From the lambda architecture point of view, S2Graph provides a great 
> real-time view with its serving layer on HBase.
> The input stream coming from the REST API is stored in HBase, and it can be 
> served by graph queries in real time.
> The stream, which is a write-ahead log, is also written to Kafka, which 
> allows us to do a lot of things. 
> There are several works (or sub-projects) that use this stream.
>   * S2Counter - computes real-time counts by combinations of 
> properties, using the Kafka stream directly.
>   * WalToHdfs - Kafka stream to the incremental view
>   * S2ML - runs machine learning algorithms on the incremental view.
>   * …
> h4. S2Lambda
> Because the above works were developed separately, they use different 
> Spark versions and contain duplicated code.
> This makes builds difficult and hinders code reuse.
> S2Lambda should be designed to solve this problem by providing a general 
> framework for the speed and batch layers.
> IMHO, first, a JSON-formatted job description should be designed to be 
> compatible with both the speed and batch layers;
> then S2Lambda should be implemented to correspond to it.
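
As a rough illustration of the JSON job description idea quoted above, the
sketch below runs one job description through a batch runner and a
micro-batch ("speed") runner and gets the same counts from both. Everything
here is an assumption made for the example: the field names (name, processors,
type, field, equals), the two processor types and the run_* functions are
hypothetical and are not an S2Lambda format or API. Plain Python lists stand
in for HDFS data and the Kafka stream.

```python
import json
from collections import Counter
from typing import Iterable, List

# Hypothetical job description; the schema is invented for this sketch.
JOB_JSON = """
{
  "name": "clicks_per_user",
  "processors": [
    {"type": "filter", "field": "label", "equals": "click"},
    {"type": "count_by", "field": "user"}
  ]
}
"""


def run_job(job: dict, records: Iterable[dict]) -> Counter:
    """Apply the job's processors, in order, to one collection of records."""
    rows = list(records)
    counts: Counter = Counter()
    for step in job["processors"]:
        if step["type"] == "filter":
            rows = [r for r in rows if r.get(step["field"]) == step["equals"]]
        elif step["type"] == "count_by":
            counts = Counter(r[step["field"]] for r in rows)
        else:
            raise ValueError(f"unknown processor type: {step['type']}")
    return counts


def run_batch(job: dict, records: List[dict]) -> Counter:
    """Batch layer: process the whole data set at once."""
    return run_job(job, records)


def run_speed(job: dict, micro_batches: Iterable[List[dict]]) -> Counter:
    """Speed layer: process micro-batches and merge the partial counts."""
    total: Counter = Counter()
    for micro_batch in micro_batches:
        total += run_job(job, micro_batch)
    return total


if __name__ == "__main__":
    job = json.loads(JOB_JSON)
    events = [
        {"label": "click", "user": "u1"},
        {"label": "view", "user": "u1"},
        {"label": "click", "user": "u2"},
        {"label": "click", "user": "u1"},
    ]
    print(run_batch(job, events))
    print(run_speed(job, [events[:2], events[2:]]))
```

The point of the sketch is only that the description itself does not mention
batch or streaming; the runner decides how the records arrive.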



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
