Hi -

This is very interesting. I’ve been thinking about using Heron for this 
functionality.

An Admin API for configuring the functions on live Executors, and a way to 
specify a unique return value Topic, both need discussion. I would also like to chain Functions.
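
To make the chaining idea concrete, here is a rough Python sketch. Nothing in it comes from the PIP: the in-memory `topics` dict stands in for real Pulsar topics, and `run_function` and `chain` are hypothetical names used only for illustration. The point is just that each Function writes to its own return Topic and the next Function reads from it.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory stand-in for Pulsar topics: topic name -> messages.
Topics = Dict[str, List[str]]


def run_function(fn: Callable[[str], str], topics: Topics,
                 input_topic: str, return_topic: str) -> None:
    """Apply fn to every message on input_topic and append results to return_topic."""
    if input_topic == return_topic:
        # Assumption: a Function must not consume its own output.
        raise ValueError("return topic must differ from input topic")
    results = topics.setdefault(return_topic, [])
    for message in topics.get(input_topic, []):
        results.append(fn(message))


def chain(stages: List[Tuple[Callable[[str], str], str]],
          topics: Topics, source_topic: str) -> str:
    """Run stages in order; each stage reads the previous stage's return topic."""
    current = source_topic
    for fn, return_topic in stages:
        run_function(fn, topics, current, return_topic)
        current = return_topic
    return current  # name of the last return topic in the chain
```

Requiring the return Topic to differ from the input Topic is my own assumption; it keeps a Function from consuming its own output.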

I think Functions will need Profiles to include metadata for parallelism, 
memory, configuration, etc. 
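
As a rough sketch of what I mean by a Profile (the class name, field names, and defaults below are all hypothetical, not part of the PIP):

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class FunctionProfile:
    """Hypothetical per-Function metadata; every field name is an assumption."""
    name: str
    parallelism: int = 1          # number of instances to run
    memory_mb: int = 256          # memory budget per instance
    config: Dict[str, str] = field(default_factory=dict)  # user-supplied settings

    def __post_init__(self) -> None:
        # Reject obviously invalid values before a Function is registered.
        if self.parallelism < 1:
            raise ValueError("parallelism must be at least 1")
        if self.memory_mb <= 0:
            raise ValueError("memory_mb must be positive")
```

An Admin API could then accept something like `FunctionProfile("wordcount", parallelism=4)` when registering or reconfiguring a Function.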

Regards,
Dave


> On Feb 20, 2018, at 4:05 PM, Sanjeev Kulkarni <sanjee...@gmail.com> wrote:
> 
> https://github.com/apache/incubator-pulsar/wiki/PIP-15:-Pulsar-Functions
> 
> -------
> 
> * **Status**: Proposal
> * **Author**: Sanjeev Kulkarni/Sijie Guo/Jerry Peng - Streamlio
> * **Pull Request**: See Below
> * **Mailing List discussion**:
> 
> Motivation
> 
> There has been a renewed interest from users in lightweight computing
> frameworks. By lightweight, they typically mean:
> 
>  1. They are not compute systems that need to be installed/run/monitored.
>  Thus they are much more ops light. Some of them are offered as pure
>  SaaS (like AWS Lambda) while others are integrated with message queues
>  (like KStreams).
>  2. Their interface should be as simple as it gets. Typically it takes
>  the form of a function/subroutine, which is the basic compute block in most
>  programming languages. The API must also be multi-language capable.
>  3. The deployment models should be flexible. Users should be able to run
>  these functions using their favorite management tools, or they can run them
>  with the brokers.
> 
> The aim of all of these would be to dramatically increase the pace of
> experimentation/dev productivity. They also fit in the event driven
> architecture that most companies are moving towards where data is
> constantly arriving. The aim is for users to run simple functions against
> arriving data and not really worry about mastering the complicated
> API/semantics as well as managing/monitoring a complex compute infra.
> 
> A message queue like Pulsar sits at the heart of any event driven
> architecture. Data coming in from all sources typically lands in the
> message bus first. Thus, if Pulsar (or a Pulsar extension) can
> register/run simple user functions, it could go a long way toward
> driving Pulsar adoption. Users could just deploy Pulsar and instantly have
> a very flexible way of doing basic computation.
> 
> This document outlines the goals/design of what we want in such a system
> and how it can be built into Pulsar.
> 
> Goals
> 
>  1. Simplest possible programmability: This is the overarching goal.
>  Anyone with the ability to write a function in a supported language should
>  be able to get productive in a matter of minutes.
>  2. Multi Language Capability:- We should provide the API in at least the
>  most popular languages, Java/Scala/Python/Go/JavaScript.
>  3. Flexible runtime deployment:- Users should be able to run these
>  functions as a simple process using their favorite management tools. They
>  should also be able to submit their functions to be run in a Pulsar cluster.
>  4. Built in State Management:- Computations should be allowed to keep
>  state across computations. The system should take care of persisting this
>  state in a robust manner. Basic operations like incrBy/get/put/update
>  are a must. This dramatically simplifies the architecture for
>  the developer.
>  5. Queryable State:- The state written by a function should be queryable
>  using standard REST APIs.
>  6. Automatic Load Balancing:- The Managed runtime should take care of
>  assigning workers to the functions.
>  7. Scale Up/Down:- Users should be able to scale up/down the number of
>  function instances in the managed runtime.
>  8. Flexible Invocation:- Thread based, process based and docker based
>  invocation should be supported for running each function.
>  9. Metrics:- Basic metrics like events processed per second, failures,
>  latency, etc. should be made available on a per function basis. Users should
>  also be able to publish their own metrics.
>  10. REST interface:- Function control should use a REST protocol for
>  the widest adoption.
>  11. Library/CLI:- Simple libraries in all supported languages should
>  exist. A basic CLI should also be provided for register/list/query/stats
>  and other admin activities.
> 
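
As a rough illustration of goals 1 and 4 above, here is how small a stateful Function could be. This is only a sketch under my own assumptions, not the API the PIP proposes: `InMemoryState` is a hypothetical stand-in for the managed state store, with method names modeled on the incrBy/get/put operations listed in goal 4.

```python
from typing import Dict


class InMemoryState:
    """Hypothetical stand-in for the managed, persisted state store."""

    def __init__(self) -> None:
        self._values: Dict[str, int] = {}

    def get(self, key: str) -> int:
        # Missing keys read as 0 so counters need no initialization.
        return self._values.get(key, 0)

    def put(self, key: str, value: int) -> None:
        self._values[key] = value

    def incr_by(self, key: str, amount: int) -> int:
        self._values[key] = self.get(key) + amount
        return self._values[key]


def count_words(message: str, state: InMemoryState) -> None:
    """The user-written Function: count each whitespace-separated word."""
    for word in message.split():
        state.incr_by(word, 1)
```

In a real deployment the runtime, not the user, would supply the state object and persist it; the user would write only `count_words`.
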
> More details on the PIP page.
> Thanks!
