Hi Sanjeev - I have read the PIP more carefully on my computer (rather than iPhone).
Process Runtime in which each instance is run as a process.
Docker Runtime in which each instance is run as a docker container.
Threaded Runtime in which each instance is run as a thread. This type is applicable only to Java instance since Pulsar Functions framework itself is written in Java.

I’m interested in knowing a bit more about the Runtime API for these three types. How much of the PIP exists in code?

Best Regards,
Dave

> On Feb 20, 2018, at 7:33 PM, Sanjeev Kulkarni <sanjee...@gmail.com> wrote:
>
> Hi Dave,
> Chaining functions is certainly on the roadmap. The PIP document briefly
> talks about at-least two ways of doing it, but it probably requires another
> PIP by itself at a later stage.
> Wrt parallelism, for functions managed by Pulsar cluster, parallelism can
> be provided at submission time. For functions that will be run as a simple
> process, the parallelism should be managed by the user.
> WRT cpu/memory and other configuration, the aim of the inbuilt Pulsar
> cluster is to keep it simple by just doing some simple distribution across
> multiple workers. The aim is not to replicate features that are already
> present in full-fledged schedulers like Mesos/Yarn/K8. If one needs
> memory/cpu bounds for a function, the ideal way to do that would be to run
> them on one of these full-blown schedulers. We could provide an easier
> path for users to run these functions onto these schedulers by providing
> launch templates.
> Hope that helps.
>
> On Tue, Feb 20, 2018 at 6:08 PM, Dave Fisher <dave2w...@comcast.net> wrote:
>
>> Hi -
>>
>> This is very interesting. I’ve been thinking about using Heron for this
>> functionality.
>>
>> An Admin API for configuring the functions on live Executors and
>> specifying a unique return value Topic need discussion. I would also like
>> to chain Functions.
>>
>> I think Functions will need Profiles to include metadata for parallelism,
>> memory, configuration, etc.
>>
>> Regards,
>> Dave
>>
>> Sent from my iPhone
>>
>>> On Feb 20, 2018, at 4:05 PM, Sanjeev Kulkarni <sanjee...@gmail.com> wrote:
>>>
>>> https://github.com/apache/incubator-pulsar/wiki/PIP-15:-Pulsar-Functions
>>>
>>> -------
>>>
>>> * **Status**: Proposal
>>> * **Author**: Sanjeev Kulkarni/Sijie Guo/Jerry Peng - Streamlio
>>> * **Pull Request**: See Below
>>> * **Mailing List discussion**:
>>>
>>> Motivation
>>>
>>> There has been a renewed interest from users in lightweight computing
>>> frameworks. Typical things what they mean by lightweight is:
>>>
>>> 1. They are not compute systems that need to be installed/run/monitored.
>>> Thus they are much more ops light. Some of them are offered as pure
>>> SaaS(like AWS Lambda) while others are integrated with message queues(like
>>> KStreams)
>>> 2. Their interface should be as simple as it gets. Typically it takes
>>> the form of a function/subroutine that is the basic compute block in most
>>> programming languages. And API must be multi-language capable.
>>> 3. The deployment models should be flexible. Users should be able to run
>>> these functions using their favorite management tools, or they can run them
>>> with the brokers.
>>>
>>> The aim of all of these would be to dramatically increase the pace of
>>> experimentation/dev productivity. They also fit in the event driven
>>> architecture that most companies are moving towards where data is
>>> constantly arriving. The aim is for users to run simple functions against
>>> arriving data and not really worry about mastering the complicated
>>> API/semantics as well as managing/monitoring a complex compute infra.
>>>
>>> A message queue like Pulsar sits at the heart of any event driven
>>> architecture. Data coming in from all sources typically lands in the
>>> message bus first. Thus if Pulsar(or a Pulsar extension) has this feature
>>> of being able to register/run simple user functions, it could be a long way
>>> to drive Pulsar adoption. Users could just deploy Pulsar and instantly have
>>> a very flexible way of doing basic computation.
>>>
>>> This document outlines the goals/design of what we want in such a system
>>> and how they can be built into Pulsar.
>>> <https://github.com/apache/incubator-pulsar/wiki/PIP-15:-Pulsar-Functions#goals>
>>> Goals
>>>
>>> 1. Simplest possible programmability: This is the overarching goal.
>>> Anyone with the ability to write a function in a supported language should
>>> be able to get productive in matter of minutes.
>>> 2. Multi Language Capability:- We should provide the API in at-least the
>>> most popular languages, Java/Scala/Python/Go/JavaScript.
>>> 3. Flexible runtime deployment:- User should be able to run these
>>> functions as a simple process using their favorite management tools. They
>>> should also be able to submit their functions to be run in a Pulsar cluster.
>>> 4. Built in State Management:- Computations should be allowed to keep
>>> state across computations. The system should take care of persisting this
>>> state in a robust manner. Basic things like incrBy/get/put/update
>>> functionality is a must. This dramatically simplifies the architecture for
>>> the developer.
>>> 5. Queryable State:- The state written by a function should be queryable
>>> using standard rest apis.
>>> 6. Automatic Load Balancing:- The Managed runtime should take care of
>>> assigning workers to the functions.
>>> 7. Scale Up/Down:- Users should be able to scale up/down the number of
>>> function instances in the managed runtime.
>>> 8. Flexible Invocation:- Thread based, process based and docker based
>>> invocation should be supported for running each function.
>>> 9. Metrics:- Basic metrics like events processed per second, failures,
>>> latency etc should be made available on a per function basis. Users should
>>> also be able to publish their own metrics
>>> 10. REST interface:- Function control should be using REST protocol to
>>> have the widest adoption.
>>> 11. Library/CLI:- Simple Libraries in all supported languages should
>>> exist. Also should come with basic CLI to register/list/query/stats and
>>> other admin activities.
>>>
>>> More details on the PIP page.
>>> Thanks!
>>
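
As a rough illustration of goals 1 and 4 in the quoted PIP (a plain function as the unit of compute, with incrBy/get style state), here is a minimal Python sketch. The names `InMemoryState`, `count_words` and `run_instance` are hypothetical and are not the Pulsar Functions API; the in-memory dictionary only stands in for the persisted state store the PIP describes, and the loop stands in for a single instance consuming from a topic.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryState:
    """Hypothetical stand-in for the managed state store (not a Pulsar class)."""

    def __init__(self) -> None:
        self._counters: Dict[str, int] = defaultdict(int)

    def incr_by(self, key: str, amount: int) -> int:
        # Add `amount` to the counter for `key` and return the new value.
        self._counters[key] += amount
        return self._counters[key]

    def get(self, key: str) -> int:
        # Return the current counter value, or 0 if the key was never written.
        return self._counters.get(key, 0)


def count_words(message: str, state: InMemoryState) -> str:
    """User function: one input message in, one output message out."""
    words = message.split()
    for word in words:
        state.incr_by(word.lower(), 1)
    return f"{len(words)} words"


def run_instance(
    function: Callable[[str, InMemoryState], str],
    messages: List[str],
    state: InMemoryState,
) -> List[str]:
    """Hypothetical instance loop: apply the function to each message in order."""
    return [function(message, state) for message in messages]


state = InMemoryState()
outputs = run_instance(
    count_words, ["hello pulsar", "hello functions world"], state
)
```

The point of the sketch is only that the user writes `count_words` and nothing else; where the loop runs (thread, process or docker container) and how the state is persisted would be the runtime's concern.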