A lot of great ideas here! I think some cool processors or new controller services would make a lot of sense for GSoC: no need to have a deep knowledge of the NiFi framework to get started. Everything around provenance would certainly sound attractive from a student perspective: graph analytics and machine learning are trendy subjects.
Pierre

2018-02-28 14:33 GMT+01:00 Matt Burgess <mattyb...@apache.org>:

> Yes I do! Sorry all, I had sent the original message in haste to get
> the information out for discussion, but didn't have the time at that
> moment to share everything else, including my enthusiasm and some
> actual ideas :) Here are some I came up with, note that many may not
> be "industrial-strength" but still interesting student projects:
>
> - Anything to do with provenance. Uwe has a wonderful idea that I will
> respond to separately, but there are lots of applications and
> approaches that can make use of provenance, such as graph analytics (find
> flow bottlenecks, e.g.), machine learning (predict likelihood of
> reaching a failure connection based on attributes and/or content),
> etc.
> - An Apache Calcite adapter that can read from a NiFi Output Port.
> This probably makes more sense from a SQL Streaming perspective than
> emulating a relational DB, but is an interesting application of
> Calcite and NiFi.
> - An UpdateAttributeUsingJava processor (with a better name), this
> could use Janino to quickly evaluate Java expressions that can
> leverage attributes and perhaps all of Expression Language to perform
> more powerful functions (without needing a full scripted processor)
> - A RouteOnProbability processor, to support Monte Carlo simulations.
> User-defined properties could have values whose sum is 1 and whose
> keys become the outgoing relationship names.
> - A SampleReservoir processor, to do reservoir sampling (good for
> testing downstream flows without throwing a ton of data at it)
> - YAML Record Reader/Writer
>
> Looks like proposals are being accepted on March 18 (I don't know if
> that's for students proposing/selecting projects or for organizations
> to propose possible projects), but there are a number of Apache Jira
> issues already tagged as gsoc2018 [1].
>
> Regards,
> Matt
>
> [1] http://s.apache.org/gsoc2018ideas
>
>
> On Wed, Feb 28, 2018 at 12:12 AM, Joe Witt <joe.w...@gmail.com> wrote:
> > Matt
> >
> > Did you have some ideas/features/enhancements in mind you think would
> > be good to propose?
> >
> > Thanks
> > Joe
> >
> > On Tue, Feb 27, 2018 at 6:56 PM, Matt Burgess <mattyb...@apache.org> wrote:
> >> If you haven't heard yet, the Apache Software Foundation was selected
> >> as an organization for this year's Google Summer of Code [1]. I've
> >> seen activity on other Apache projects' mailing lists requesting ideas
> >> for issues, features, components, etc. that could be good
> >> proposals/ideas for GSoC, and I'd like to also make that request of
> >> this community.
> >>
> >> As Michael Mior (of Apache Calcite PMC) eloquently put it: "It's no
> >> guarantee we would get someone to work on it, but it could be a good
> >> push to move some isolated bits of functionality forward that may not
> >> get much attention otherwise."
> >>
> >> Thoughts?
> >>
> >> Thanks in advance,
> >> Matt
> >>
> >> [1] https://summerofcode.withgoogle.com/organizations/5718432427802624/
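
For anyone sizing up the SampleReservoir and RouteOnProbability ideas from Matt's list, here is a rough, standalone Java sketch of the core logic each would need. It is not NiFi code and uses no NiFi API: the class name GsocSketches and both method names are made up for illustration, the sampling uses the textbook "Algorithm R", and the routing assumes the probabilities are supplied as a map of relationship name to weight summing to 1, as the proposal describes. A real processor would still need property handling, validation and FlowFile transfer around this.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper class for illustration only; not part of NiFi.
final class GsocSketches {

    // Reservoir sampling (Algorithm R): keeps a uniform random sample of up to
    // k items from a stream whose length is not known in advance.
    static <T> List<T> reservoirSample(Iterator<T> stream, int k, Random rng) {
        List<T> reservoir = new ArrayList<>(Math.max(k, 0));
        long seen = 0;
        while (stream.hasNext()) {
            T item = stream.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k items.
                reservoir.add(item);
            } else {
                // Draw a slot in [0, seen); the item is kept only if the slot
                // falls inside the reservoir, i.e. with probability k / seen.
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return reservoir;
    }

    // Chooses a relationship name by walking the cumulative distribution.
    // 'u' is a uniform draw in [0, 1); the map values are assumed to sum to 1.
    // Returns null if the map is empty.
    static String pickRelationship(Map<String, Double> probabilities, double u) {
        double cumulative = 0.0;
        String last = null;
        for (Map.Entry<String, Double> entry : probabilities.entrySet()) {
            cumulative += entry.getValue();
            last = entry.getKey();
            if (u < cumulative) {
                return last;
            }
        }
        // Falls through only when rounding leaves the total slightly below u.
        return last;
    }
}
```

In use, something like GsocSketches.pickRelationship(weights, random.nextDouble()) would be called once per FlowFile, with weights held in an insertion-ordered map so that routing is reproducible for a given seed.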