Re: [DISCUSS] StreamPipes proposal

2019-11-04 Thread Christofer Dutz
Hi all,

So are there any more comments, or should we start the vote?

Chris

On 03.11.19, 11:17, "Julian Feinauer" wrote:

Hi,

it would of course be awesome to have JB on board.
And indeed, what JB suggests is the way I was also thinking of StreamPipes, 
and I have already had several discussions with Dominik.

Currently, StreamPipes is a kind of mediator between several "external" 
engines running in different processes or even on different nodes.
But especially for edge applications, I would very much welcome something that 
is more "edge"-centered and designed to run in one process (which then brings 
us back to OSGi or Karaf or something similar in the long run).

My idea would be to have this as some sort of subproject that shares 
things like a data model and other abstractions, so ideally we could peel these 
out into a "shared" core (although Dominik already expects it to be tons of 
work...).

Julian

On 02.11.19, 18:14, "Jean-Baptiste Onofré" wrote:

Thanks guys !

I'm cloning the existing codebase to dig into it a little ;)

Regards
JB

On 02/11/2019 17:39, Christofer Dutz wrote:
> Hi all,
> 
> I added him to the list.
> 
> Chris
> 
> On 02.11.19, 11:53, "Dominik Riemer" wrote:
> 
> Yes, it would be super cool to have you as a mentor, thanks!
> We'll update the list in the wiki.
> 
> Dominik
> 
> -----Original Message-----
> From: Jean-Baptiste Onofré  
> Sent: Friday, November 1, 2019 6:49 PM
> To: general@incubator.apache.org
> Subject: Re: [DISCUSS] StreamPipes proposal
> 
> Hi Dominik,
> 
> it's an interesting proposal !
> 
> It sounds like a kind of integration platform for IoT protocols (a 
specialized platform compared to frameworks like Apache Camel or NiFi).
> 
> I would be happy to be a mentor on the podling if you want !
> 
> Regards
> JB
> 
> On 01/11/2019 16:51, Dominik Riemer wrote:
> > Hi all,
> > 
> > following up on my previous mail, we would now like to start an 
open discussion on bringing StreamPipes to the Apache Incubator. StreamPipes is 
an open source self-service toolbox for analyzing (Industrial) IoT data 
streams. We are aware that one of our main challenges will be to diversify the 
developer base and we are willing (and look forward!) to work on that. 
> > 
> > The proposal can be found below and is also listed in the 
Incubator wiki: 
https://cwiki.apache.org/confluence/display/INCUBATOR/StreamPipesProposal, 
thanks @Chris Dutz for creating the page!
> > 
> > We appreciate anyone who would be willing to support us as an 
additional mentor!
> > 
> > 
> > Dominik
> > 
> > 
> > 
> > StreamPipes Proposal
> > 
> > == Abstract ==
> > StreamPipes is a self-service (Industrial) IoT toolbox to 
enable non-technical users to connect, analyze and explore (Industrial) IoT 
data streams.
> > 
> > = Proposal =
> > 
> > The goal of StreamPipes (www.streampipes.org) is to provide an 
easy-to-use toolbox for non-technical users, e.g., domain experts, to exploit 
data streams coming from (Industrial) IoT devices. Such users are provided with 
an intuitive graphical user interface with the Pipeline Editor at its core. 
Users are able to graphically model processing pipelines based on data sources 
(streams), data processors and data sinks. Data processors and sinks are 
self-contained microservices, which implement either stateful or stateless 
processing logic (e.g., trend detection or an image classifier). Their 
processing logic is implemented using one of several provided wrappers (we 
currently have wrappers for standalone/Edge-based processing, Apache Flink, 
Siddhi and working wrapper prototypes for Apache Kafka Streams and Spark; in 
the future we also plan to integrate with Apache Beam). An SDK makes it easy to 
create new pipeline elements. Pipeline elements can be installed at runtime. To 
support users in creating pipelines, an underlying semantics-based data model 
enables pipeline elements to express requirements on incoming data streams that 
need to be fulfilled, thus reducing modeling errors.

Re: [DISCUSS] StreamPipes proposal

2019-11-03 Thread Julian Feinauer
Hi,

it would of course be awesome to have JB on board.
And indeed, what JB suggests is the way I was also thinking of StreamPipes, 
and I have already had several discussions with Dominik.

Currently, StreamPipes is a kind of mediator between several "external" 
engines running in different processes or even on different nodes.
But especially for edge applications, I would very much welcome something that 
is more "edge"-centered and designed to run in one process (which then brings 
us back to OSGi or Karaf or something similar in the long run).

My idea would be to have this as some sort of subproject that shares 
things like a data model and other abstractions, so ideally we could peel these 
out into a "shared" core (although Dominik already expects it to be tons of 
work...).

Julian

On 02.11.19, 18:14, "Jean-Baptiste Onofré" wrote:

Thanks guys !

I'm cloning the existing codebase to dig into it a little ;)

Regards
JB

On 02/11/2019 17:39, Christofer Dutz wrote:
> Hi all,
> 
> I added him to the list.
> 
> Chris
> 
> On 02.11.19, 11:53, "Dominik Riemer" wrote:
> 
> Yes, it would be super cool to have you as a mentor, thanks!
> We'll update the list in the wiki.
> 
> Dominik
> 
> -----Original Message-----
> From: Jean-Baptiste Onofré  
> Sent: Friday, November 1, 2019 6:49 PM
> To: general@incubator.apache.org
> Subject: Re: [DISCUSS] StreamPipes proposal
> 
> Hi Dominik,
> 
> it's an interesting proposal !
> 
> It sounds like a kind of integration platform for IoT protocols (a 
specialized platform compared to frameworks like Apache Camel or NiFi).
> 
> I would be happy to be a mentor on the podling if you want !
> 
> Regards
> JB
> 
> On 01/11/2019 16:51, Dominik Riemer wrote:
> > Hi all,
> > 
> > following up on my previous mail, we would now like to start an open 
discussion on bringing StreamPipes to the Apache Incubator. StreamPipes is an 
open source self-service toolbox for analyzing (Industrial) IoT data streams. 
We are aware that one of our main challenges will be to diversify the developer 
base and we are willing (and look forward!) to work on that. 
> > 
> > The proposal can be found below and is also listed in the Incubator 
wiki: 
https://cwiki.apache.org/confluence/display/INCUBATOR/StreamPipesProposal, 
thanks @Chris Dutz for creating the page!
> > 
> > We appreciate anyone who would be willing to support us as an 
additional mentor!
> > 
> > 
> > Dominik
> > 
> > 
> > 
> > StreamPipes Proposal
> > 
> > == Abstract ==
> > StreamPipes is a self-service (Industrial) IoT toolbox to enable 
non-technical users to connect, analyze and explore (Industrial) IoT data 
streams.
> > 
> > = Proposal =
> > 
> > The goal of StreamPipes (www.streampipes.org) is to provide an 
easy-to-use toolbox for non-technical users, e.g., domain experts, to exploit 
data streams coming from (Industrial) IoT devices. Such users are provided with 
an intuitive graphical user interface with the Pipeline Editor at its core. 
Users are able to graphically model processing pipelines based on data sources 
(streams), data processors and data sinks. Data processors and sinks are 
self-contained microservices, which implement either stateful or stateless 
processing logic (e.g., trend detection or an image classifier). Their 
processing logic is implemented using one of several provided wrappers (we 
currently have wrappers for standalone/Edge-based processing, Apache Flink, 
Siddhi and working wrapper prototypes for Apache Kafka Streams and Spark; in 
the future we also plan to integrate with Apache Beam). An SDK makes it easy to 
create new pipeline elements. Pipeline elements can be installed at runtime. To 
support users in creating pipelines, an underlying semantics-based data model 
enables pipeline elements to express requirements on incoming data streams that 
need to be fulfilled, thus reducing modeling errors.
> > Data streams are integrated using StreamPipes Connect, which allows 
data sources (based on standard protocols, such as MQTT, Kafka, Pulsar, 
OPC-UA and further PLC4X-supported protocols) to be connected without further 
programming using a graphical wizard. Additional user-facing modules of 
StreamPipes are a Live dashboard to quickly explore IoT data streams and a 
wizard that generates code templates for new pip

Re: [DISCUSS] StreamPipes proposal

2019-11-02 Thread Jean-Baptiste Onofré
Thanks guys !

I'm cloning the existing codebase to dig into it a little ;)

Regards
JB

On 02/11/2019 17:39, Christofer Dutz wrote:
> Hi all,
> 
> I added him to the list.
> 
> Chris
> 
> On 02.11.19, 11:53, "Dominik Riemer" wrote:
> 
> Yes, it would be super cool to have you as a mentor, thanks!
> We'll update the list in the wiki.
> 
> Dominik
> 
> -----Original Message-----
> From: Jean-Baptiste Onofré  
> Sent: Friday, November 1, 2019 6:49 PM
> To: general@incubator.apache.org
> Subject: Re: [DISCUSS] StreamPipes proposal
> 
> Hi Dominik,
> 
> it's an interesting proposal !
> 
> It sounds like a kind of integration platform for IoT protocols (a specialized 
> platform compared to frameworks like Apache Camel or NiFi).
> 
> I would be happy to be a mentor on the podling if you want !
> 
> Regards
> JB
> 
> On 01/11/2019 16:51, Dominik Riemer wrote:
> > Hi all,
> > 
> > following up on my previous mail, we would now like to start an open 
> discussion on bringing StreamPipes to the Apache Incubator. StreamPipes is an 
> open source self-service toolbox for analyzing (Industrial) IoT data streams. 
> We are aware that one of our main challenges will be to diversify the 
> developer base and we are willing (and look forward!) to work on that. 
> > 
> > The proposal can be found below and is also listed in the Incubator 
> wiki: 
> https://cwiki.apache.org/confluence/display/INCUBATOR/StreamPipesProposal, 
> thanks @Chris Dutz for creating the page!
> > 
> > We appreciate anyone who would be willing to support us as an additional 
> mentor!
> > 
> > 
> > Dominik
> > 
> > 
> > 
> > StreamPipes Proposal
> > 
> > == Abstract ==
> > StreamPipes is a self-service (Industrial) IoT toolbox to enable 
> non-technical users to connect, analyze and explore (Industrial) IoT data 
> streams.
> > 
> > = Proposal =
> > 
> > The goal of StreamPipes (www.streampipes.org) is to provide an 
> easy-to-use toolbox for non-technical users, e.g., domain experts, to exploit 
> data streams coming from (Industrial) IoT devices. Such users are provided 
> with an intuitive graphical user interface with the Pipeline Editor at its 
> core. Users are able to graphically model processing pipelines based on data 
> sources (streams), data processors and data sinks. Data processors and sinks 
> are self-contained microservices, which implement either stateful or 
> stateless processing logic (e.g., trend detection or an image classifier). 
> Their processing logic is implemented using one of several provided wrappers 
> (we currently have wrappers for standalone/Edge-based processing, Apache 
> Flink, Siddhi and working wrapper prototypes for Apache Kafka Streams and 
> Spark; in the future we also plan to integrate with Apache Beam). An SDK 
> makes it easy to create new pipeline elements. Pipeline elements can be 
> installed at runtime. To support users in creating pipelines, an underlying 
> semantics-based data model enables pipeline elements to express requirements 
> on incoming data streams that need to be fulfilled, thus reducing modeling 
> errors.
> > Data streams are integrated using StreamPipes Connect, which allows data 
> sources (based on standard protocols, such as MQTT, Kafka, Pulsar, OPC-UA and 
> further PLC4X-supported protocols) to be connected without further 
> programming using a graphical wizard. Additional user-facing modules of 
> StreamPipes are a live dashboard to quickly explore IoT data streams, a 
> wizard that generates code templates for new pipeline elements, and a Pipeline 
> Element Installer used to extend the algorithm feature set at runtime.
> > 
> > === Background ===
> > StreamPipes was started in 2014 by researchers from FZI Research Center 
> for Information Technology in Karlsruhe, Germany. The original prototype was 
> funded by an EU project centered around predictive analytics for the 
> manufacturing domain. Since then, StreamPipes has been continuously improved 
> and extended with public funding, mainly from German federal ministries. In early 
> 2018, the source code was officially released under the Apache License 2.0. 
> At the same time, while we focused on bringing the research prototype to a 
> production-grade tool, the first companies started to use StreamPipes. 
> Currently, the primary goal is to widen the user and developer base. At 
> ApacheCon NA 2019, after having talked to many people from the Apache 
> Commun

Re: [DISCUSS] StreamPipes proposal

2019-11-02 Thread Christofer Dutz
Hi all,

I added him to the list.

Chris

On 02.11.19, 11:53, "Dominik Riemer" wrote:

Yes, it would be super cool to have you as a mentor, thanks!
We'll update the list in the wiki.

Dominik

-----Original Message-----
From: Jean-Baptiste Onofré  
Sent: Friday, November 1, 2019 6:49 PM
To: general@incubator.apache.org
Subject: Re: [DISCUSS] StreamPipes proposal

Hi Dominik,

it's an interesting proposal !

It sounds like a kind of integration platform for IoT protocols (a specialized 
platform compared to frameworks like Apache Camel or NiFi).

I would be happy to be a mentor on the podling if you want !

Regards
JB

On 01/11/2019 16:51, Dominik Riemer wrote:
> Hi all,
> 
> following up on my previous mail, we would now like to start an open 
discussion on bringing StreamPipes to the Apache Incubator. StreamPipes is an 
open source self-service toolbox for analyzing (Industrial) IoT data streams. 
We are aware that one of our main challenges will be to diversify the developer 
base and we are willing (and look forward!) to work on that. 
> 
> The proposal can be found below and is also listed in the Incubator wiki: 
https://cwiki.apache.org/confluence/display/INCUBATOR/StreamPipesProposal, 
thanks @Chris Dutz for creating the page!
> 
> We appreciate anyone who would be willing to support us as an additional 
mentor!
> 
> 
> Dominik
> 
> 
> 
> StreamPipes Proposal
> 
> == Abstract ==
> StreamPipes is a self-service (Industrial) IoT toolbox to enable 
non-technical users to connect, analyze and explore (Industrial) IoT data 
streams.
> 
> = Proposal =
> 
> The goal of StreamPipes (www.streampipes.org) is to provide an 
easy-to-use toolbox for non-technical users, e.g., domain experts, to exploit 
data streams coming from (Industrial) IoT devices. Such users are provided with 
an intuitive graphical user interface with the Pipeline Editor at its core. 
Users are able to graphically model processing pipelines based on data sources 
(streams), data processors and data sinks. Data processors and sinks are 
self-contained microservices, which implement either stateful or stateless 
processing logic (e.g., trend detection or an image classifier). Their 
processing logic is implemented using one of several provided wrappers (we 
currently have wrappers for standalone/Edge-based processing, Apache Flink, 
Siddhi and working wrapper prototypes for Apache Kafka Streams and Spark; in 
the future we also plan to integrate with Apache Beam). An SDK makes it easy to 
create new pipeline elements. Pipeline elements can be installed at runtime. To 
support users in creating pipelines, an underlying semantics-based data model 
enables pipeline elements to express requirements on incoming data streams that 
need to be fulfilled, thus reducing modeling errors.
> Data streams are integrated using StreamPipes Connect, which allows data 
sources (based on standard protocols, such as MQTT, Kafka, Pulsar, OPC-UA and 
further PLC4X-supported protocols) to be connected without further programming 
using a graphical wizard. Additional user-facing modules of StreamPipes are a 
live dashboard to quickly explore IoT data streams, a wizard that generates 
code templates for new pipeline elements, and a Pipeline Element Installer used 
to extend the algorithm feature set at runtime.
> 
> === Background ===
> StreamPipes was started in 2014 by researchers from FZI Research Center 
for Information Technology in Karlsruhe, Germany. The original prototype was 
funded by an EU project centered around predictive analytics for the 
manufacturing domain. Since then, StreamPipes has been continuously improved 
and extended with public funding, mainly from German federal ministries. In early 
2018, the source code was officially released under the Apache License 2.0. At 
the same time, while we focused on bringing the research prototype to a 
production-grade tool, the first companies started to use StreamPipes. 
Currently, the primary goal is to widen the user and developer base. At 
ApacheCon NA 2019, after having talked to many people from the Apache 
Community, we finally decided that we would like to bring StreamPipes to the 
Apache Incubator.
> 
> === Rationale ===
> The (Industrial) IoT domain is a highly relevant and emerging sector. 
Currently, IoT platforms are offered by many vendors ranging from SMEs up to 
large enterprises. We believe that open source alternatives are an important 
cornerstone for manufacturing companies to easily adopt data-driven decision 
making. From our point of view, StreamPipes fits very well into the existing 
(I)IoT ecosystem within the ASF, with projects such as Apache PLC4X focusing on 
connecting machine data from PLCs

RE: [DISCUSS] StreamPipes proposal

2019-11-02 Thread Dominik Riemer
Yes, it would be super cool to have you as a mentor, thanks!
We'll update the list in the wiki.

Dominik

-----Original Message-----
From: Jean-Baptiste Onofré  
Sent: Friday, November 1, 2019 6:49 PM
To: general@incubator.apache.org
Subject: Re: [DISCUSS] StreamPipes proposal

Hi Dominik,

it's an interesting proposal !

It sounds like a kind of integration platform for IoT protocols (a specialized 
platform compared to frameworks like Apache Camel or NiFi).

I would be happy to be a mentor on the podling if you want !

Regards
JB

On 01/11/2019 16:51, Dominik Riemer wrote:
> Hi all,
> 
> following up on my previous mail, we would now like to start an open discussion 
> on bringing StreamPipes to the Apache Incubator. StreamPipes is an open 
> source self-service toolbox for analyzing (Industrial) IoT data streams. We 
> are aware that one of our main challenges will be to diversify the developer 
> base and we are willing (and look forward!) to work on that. 
> 
> The proposal can be found below and is also listed in the Incubator wiki: 
> https://cwiki.apache.org/confluence/display/INCUBATOR/StreamPipesProposal, 
> thanks @Chris Dutz for creating the page!
> 
> We appreciate anyone who would be willing to support us as an additional 
> mentor!
> 
> 
> Dominik
> 
> 
> 
> StreamPipes Proposal
> 
> == Abstract ==
> StreamPipes is a self-service (Industrial) IoT toolbox to enable 
> non-technical users to connect, analyze and explore (Industrial) IoT data 
> streams.
> 
> = Proposal =
> 
> The goal of StreamPipes (www.streampipes.org) is to provide an easy-to-use 
> toolbox for non-technical users, e.g., domain experts, to exploit data 
> streams coming from (Industrial) IoT devices. Such users are provided with an 
> intuitive graphical user interface with the Pipeline Editor at its core. 
> Users are able to graphically model processing pipelines based on data 
> sources (streams), data processors and data sinks. Data processors and sinks 
> are self-contained microservices, which implement either stateful or 
> stateless processing logic (e.g., trend detection or an image classifier). 
> Their processing logic is implemented using one of several provided wrappers 
> (we currently have wrappers for standalone/Edge-based processing, Apache 
> Flink, Siddhi and working wrapper prototypes for Apache Kafka Streams and 
> Spark; in the future we also plan to integrate with Apache Beam). An SDK 
> makes it easy to create new pipeline elements. Pipeline elements can be 
> installed at runtime. To support users in creating pipelines, an underlying 
> semantics-based data model enables pipeline elements to express requirements 
> on incoming data streams that need to be fulfilled, thus reducing modeling 
> errors.
> Data streams are integrated using StreamPipes Connect, which allows data 
> sources (based on standard protocols, such as MQTT, Kafka, Pulsar, OPC-UA and 
> further PLC4X-supported protocols) to be connected without further 
> programming using a graphical wizard. Additional user-facing modules of 
> StreamPipes are a live dashboard to quickly explore IoT data streams, a 
> wizard that generates code templates for new pipeline elements, and a Pipeline 
> Element Installer used to extend the algorithm feature set at runtime.
> 
> === Background ===
> StreamPipes was started in 2014 by researchers from FZI Research Center for 
> Information Technology in Karlsruhe, Germany. The original prototype was 
> funded by an EU project centered around predictive analytics for the 
> manufacturing domain. Since then, StreamPipes has been continuously improved 
> and extended with public funding, mainly from German federal ministries. In early 
> 2018, the source code was officially released under the Apache License 2.0. 
> At the same time, while we focused on bringing the research prototype to a 
> production-grade tool, the first companies started to use StreamPipes. 
> Currently, the primary goal is to widen the user and developer base. At 
> ApacheCon NA 2019, after having talked to many people from the Apache 
> Community, we finally decided that we would like to bring StreamPipes to the 
> Apache Incubator.
> 
> === Rationale ===
> The (Industrial) IoT domain is a highly relevant and emerging sector. 
> Currently, IoT platforms are offered by many vendors ranging from SMEs up to 
> large enterprises. We believe that open source alternatives are an important 
> cornerstone for manufacturing companies to easily adopt data-driven decision 
> making. From our point of view, StreamPipes fits very well into the existing 
> (I)IoT ecosystem within the ASF, with projects such as Apache PLC4X focusing 
> on connecting machine data from PLCs, or other tools we are also using either 
> in the core of 

Re: [DISCUSS] StreamPipes proposal

2019-11-01 Thread Jean-Baptiste Onofré
Hi Dominik,

it's an interesting proposal !

It sounds like a kind of integration platform for IoT protocols (a specialized
platform compared to frameworks like Apache Camel or NiFi).

I would be happy to be a mentor on the podling if you want !

Regards
JB

On 01/11/2019 16:51, Dominik Riemer wrote:
> Hi all,
> 
> following up on my previous mail, we would now like to start an open discussion 
> on bringing StreamPipes to the Apache Incubator. StreamPipes is an open 
> source self-service toolbox for analyzing (Industrial) IoT data streams. We 
> are aware that one of our main challenges will be to diversify the developer 
> base and we are willing (and look forward!) to work on that. 
> 
> The proposal can be found below and is also listed in the Incubator wiki: 
> https://cwiki.apache.org/confluence/display/INCUBATOR/StreamPipesProposal, 
> thanks @Chris Dutz for creating the page!
> 
> We appreciate anyone who would be willing to support us as an additional 
> mentor!
> 
> 
> Dominik
> 
> 
> 
> StreamPipes Proposal
> 
> == Abstract ==
> StreamPipes is a self-service (Industrial) IoT toolbox to enable 
> non-technical users to connect, analyze and explore (Industrial) IoT data 
> streams.
> 
> = Proposal =
> 
> The goal of StreamPipes (www.streampipes.org) is to provide an easy-to-use 
> toolbox for non-technical users, e.g., domain experts, to exploit data 
> streams coming from (Industrial) IoT devices. Such users are provided with an 
> intuitive graphical user interface with the Pipeline Editor at its core. 
> Users are able to graphically model processing pipelines based on data 
> sources (streams), data processors and data sinks. Data processors and sinks 
> are self-contained microservices, which implement either stateful or 
> stateless processing logic (e.g., trend detection or an image classifier). 
> Their processing logic is implemented using one of several provided wrappers 
> (we currently have wrappers for standalone/Edge-based processing, Apache 
> Flink, Siddhi and working wrapper prototypes for Apache Kafka Streams and 
> Spark; in the future we also plan to integrate with Apache Beam). An SDK 
> makes it easy to create new pipeline elements. Pipeline elements can be 
> installed at runtime. To support users in creating pipelines, an underlying 
> semantics-based data model enables pipeline elements to express requirements 
> on incoming data streams that need to be fulfilled, thus reducing modeling 
> errors.
> Data streams are integrated using StreamPipes Connect, which allows data 
> sources (based on standard protocols, such as MQTT, Kafka, Pulsar, OPC-UA and 
> further PLC4X-supported protocols) to be connected without further 
> programming using a graphical wizard. Additional user-facing modules of 
> StreamPipes are a live dashboard to quickly explore IoT data streams, a 
> wizard that generates code templates for new pipeline elements, and a Pipeline 
> Element Installer used to extend the algorithm feature set at runtime.
> 
> === Background ===
> StreamPipes was started in 2014 by researchers from FZI Research Center for 
> Information Technology in Karlsruhe, Germany. The original prototype was 
> funded by an EU project centered around predictive analytics for the 
> manufacturing domain. Since then, StreamPipes has been continuously improved 
> and extended with public funding, mainly from German federal ministries. In early 
> 2018, the source code was officially released under the Apache License 2.0. 
> At the same time, while we focused on bringing the research prototype to a 
> production-grade tool, the first companies started to use StreamPipes. 
> Currently, the primary goal is to widen the user and developer base. At 
> ApacheCon NA 2019, after having talked to many people from the Apache 
> Community, we finally decided that we would like to bring StreamPipes to the 
> Apache Incubator.
> 
> === Rationale ===
> The (Industrial) IoT domain is a highly relevant and emerging sector. 
> Currently, IoT platforms are offered by many vendors ranging from SMEs up to 
> large enterprises. We believe that open source alternatives are an important 
> cornerstone for manufacturing companies to easily adopt data-driven decision 
> making. From our point of view, StreamPipes fits very well into the existing 
> (I)IoT ecosystem within the ASF, with projects such as Apache PLC4X focusing 
> on connecting machine data from PLCs, or other tools we are also using either 
> in the core of StreamPipes or with integrations (Apache Kafka, Apache IoTDB, 
> Apache Pulsar). StreamPipes itself focuses on enabling self-service IoT data 
> analytics for non-technical users.
> The whole StreamPipes code is currently on GitHub. To get a rough estimate of 
> the project size: 
> * streampipes: Backend and core modules, ~3300 commits
> * streampipes-ui: User Interface, ~1300 commits
> * streampipes-pipeline-elements: ~100 Pipeline Elements (data 
> processors/algorithms and sinks), ~500 commits
> * streampipes-connect

[DISCUSS] StreamPipes proposal

2019-11-01 Thread Dominik Riemer
Hi all,

following up on my previous mail, we would now like to start an open discussion on 
bringing StreamPipes to the Apache Incubator. StreamPipes is an open source 
self-service toolbox for analyzing (Industrial) IoT data streams. We are aware 
that one of our main challenges will be to diversify the developer base and we 
are willing (and look forward!) to work on that. 

The proposal can be found below and is also listed in the Incubator wiki: 
https://cwiki.apache.org/confluence/display/INCUBATOR/StreamPipesProposal, 
thanks @Chris Dutz for creating the page!

We appreciate anyone who would be willing to support us as an additional mentor!


Dominik



StreamPipes Proposal

== Abstract ==
StreamPipes is a self-service (Industrial) IoT toolbox to enable non-technical 
users to connect, analyze and explore (Industrial) IoT data streams.

= Proposal =

The goal of StreamPipes (www.streampipes.org) is to provide an easy-to-use 
toolbox for non-technical users, e.g., domain experts, to exploit data streams 
coming from (Industrial) IoT devices. Such users are provided with an intuitive 
graphical user interface with the Pipeline Editor at its core. Users are able 
to graphically model processing pipelines based on data sources (streams), data 
processors and data sinks. Data processors and sinks are self-contained 
microservices, which implement either stateful or stateless processing logic 
(e.g., trend detection or an image classifier). Their processing logic is 
implemented using one of several provided wrappers (we currently have wrappers 
for standalone/Edge-based processing, Apache Flink, Siddhi and working wrapper 
prototypes for Apache Kafka Streams and Spark; in the future we also plan to 
integrate with Apache Beam). An SDK makes it easy to create new pipeline 
elements. Pipeline elements can be installed at runtime. To support users in 
creating pipelines, an underlying semantics-based data model enables pipeline 
elements to express requirements on incoming data streams that need to be 
fulfilled, thus reducing modeling errors.
Data streams are integrated using StreamPipes Connect, which allows data 
sources (based on standard protocols, such as MQTT, Kafka, Pulsar, OPC-UA and 
further PLC4X-supported protocols) to be connected without further programming 
using a graphical wizard. Additional user-facing modules of StreamPipes are a 
live dashboard to quickly explore IoT data streams, a wizard that generates 
code templates for new pipeline elements, and a Pipeline Element Installer used 
to extend the algorithm feature set at runtime.
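
To make the idea of pipeline elements expressing requirements on incoming data 
streams a bit more concrete, here is a minimal sketch in Python. It is not the 
StreamPipes SDK: the names Stream, Element and connect are hypothetical, and 
semantic types are simplified to plain field names. It only illustrates how a 
connection between a stream and an element could be rejected at modeling time 
when a requirement is not fulfilled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    """A data stream, described by the (simplified) semantic types it carries."""
    name: str
    fields: frozenset


@dataclass(frozen=True)
class Element:
    """A hypothetical data processor or sink with requirements on its input."""
    name: str
    requires: frozenset
    produces: frozenset = frozenset()  # fields a processor appends to each event


def connect(stream: Stream, element: Element) -> Stream:
    """Check the element's requirements and return the resulting output stream."""
    missing = element.requires - stream.fields
    if missing:
        raise ValueError(
            f"{element.name} cannot consume {stream.name}: missing {sorted(missing)}"
        )
    return Stream(f"{stream.name}->{element.name}", stream.fields | element.produces)


# Illustrative data only: one sensor stream and two pipeline elements.
sensor = Stream("sensor", frozenset({"timestamp", "temperature"}))
trend = Element(
    "trend-detection",
    requires=frozenset({"timestamp", "temperature"}),
    produces=frozenset({"trend"}),
)
image = Element("image-classifier", requires=frozenset({"image"}))

out = connect(sensor, trend)  # requirements fulfilled, connection is accepted
print(sorted(out.fields))

try:
    connect(sensor, image)  # the sensor stream carries no image field
except ValueError as err:
    print(err)
```

In this sketch the check runs when two elements are connected, so an invalid 
pipeline is caught while it is being modeled rather than when it is running.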

=== Background ===
StreamPipes was started in 2014 by researchers from FZI Research Center for 
Information Technology in Karlsruhe, Germany. The original prototype was funded 
by an EU project centered around predictive analytics for the manufacturing 
domain. Since then, StreamPipes has been continuously improved and extended with 
public funding, mainly from German federal ministries. In early 2018, the source code 
was officially released under the Apache License 2.0. At the same time, while 
we focused on bringing the research prototype to a production-grade tool, the 
first companies started to use StreamPipes. Currently, the primary goal is to 
widen the user and developer base. At ApacheCon NA 2019, after having talked to 
many people from the Apache Community, we finally decided that we would like to 
bring StreamPipes to the Apache Incubator.

=== Rationale ===
The (Industrial) IoT domain is a highly relevant and emerging sector. 
Currently, IoT platforms are offered by many vendors ranging from SMEs up to 
large enterprises. We believe that open source alternatives are an important 
cornerstone for manufacturing companies to easily adopt data-driven decision 
making. From our point of view, StreamPipes fits very well into the existing 
(I)IoT ecosystem within the ASF, with projects such as Apache PLC4X focusing on 
connecting machine data from PLCs, or other tools we are also using either in 
the core of StreamPipes or with integrations (Apache Kafka, Apache IoTDB, 
Apache Pulsar). StreamPipes itself focuses on enabling self-service IoT data 
analytics for non-technical users.
The whole StreamPipes code is currently on GitHub. To get a rough estimate of 
the project size: 
* streampipes: Backend and core modules, ~3300 commits
* streampipes-ui: User Interface, ~1300 commits
* streampipes-pipeline-elements: ~100 Pipeline Elements (data 
processors/algorithms and sinks), ~500 commits
* streampipes-connect-adapters: ~20 Adapters to connect data, ~100 commits
To achieve our goal to further extend the code base with new features, new 
connectors and new algorithms and to grow both the user and developer 
community, we believe that a community-driven development process is the best 
way to further develop StreamPipes. Finally, after having talked to committers 
from various Apache IoT-related projects and participation in spontaneous 
hacking sessions and being impres