Hi Stephan,

I can certainly imagine future DSLs on top of Apache Beam. However, melting all the features of the Beam API into a DSL is not that easy. Likely, you will end up with something as complex to use as the existing API :)
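
To make the idea a bit more concrete, here is a minimal sketch of a pipe-style query language of the kind described in the quoted message (search | filter | ...), written in plain Python over in-memory events. It does not use Beam, Flink, or Splunk at all; the command names (search, filter, fields) and every function name are hypothetical and exist only for this illustration. In a real system each stage would have to be translated into a transform of the underlying engine, which is where the complexity mentioned above comes in.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, str]
Stage = Callable[[Iterable[Event]], Iterator[Event]]


def make_search(arg: str) -> Stage:
    """search <text>: keep events whose 'message' field contains <text>."""
    def stage(events: Iterable[Event]) -> Iterator[Event]:
        return (e for e in events if arg in e.get("message", ""))
    return stage


def make_filter(arg: str) -> Stage:
    """filter <field>=<value>: keep events where the field equals the value."""
    field, _, value = arg.partition("=")
    field, value = field.strip(), value.strip()

    def stage(events: Iterable[Event]) -> Iterator[Event]:
        return (e for e in events if e.get(field) == value)
    return stage


def make_fields(arg: str) -> Stage:
    """fields <a>,<b>: project each event onto the listed fields."""
    names = [n.strip() for n in arg.split(",") if n.strip()]

    def stage(events: Iterable[Event]) -> Iterator[Event]:
        return ({n: e[n] for n in names if n in e} for e in events)
    return stage


# Hypothetical command table: DSL keyword -> stage factory.
COMMANDS: Dict[str, Callable[[str], Stage]] = {
    "search": make_search,
    "filter": make_filter,
    "fields": make_fields,
}


def compile_query(query: str) -> List[Stage]:
    """Split 'cmd arg | cmd arg | ...' into a list of stage functions."""
    stages: List[Stage] = []
    for part in query.split("|"):
        name, _, arg = part.strip().partition(" ")
        if name not in COMMANDS:
            raise ValueError(f"unknown command: {name!r}")
        stages.append(COMMANDS[name](arg.strip()))
    return stages


def run_query(query: str, events: Iterable[Event]) -> List[Event]:
    """Chain the compiled stages lazily and collect the results."""
    stream: Iterable[Event] = iter(events)
    for stage in compile_query(query):
        stream = stage(stream)
    return list(stream)
```

For example, run_query("search disk | filter level=error | fields host,message", events) keeps only error events mentioning "disk" and returns just their host and message fields. The stages are chained as generators, so nothing is evaluated until the final list() call.
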
There are projects that try to simplify Big Data processing and visualization:

Apache NiFi https://nifi.apache.org/
Apache Zeppelin (incubating) https://zeppelin.incubator.apache.org/

I would love to see those integrate with Apache Beam. Both of these projects have integrated with Apache Flink in the past.

Best,
Max

On Wed, May 25, 2016 at 1:43 PM, Stephan Buys <[email protected]> wrote:
> Hi all,
>
> Hope I'm in the right forum. I'm someone with about a decade's worth of log
> management/event analytics experience - for the last 2 years though we've
> been building our own solutions based on a variety of open source
> technologies. As hopefully some of you might appreciate, whenever you want to
> do something interesting, or at scale, with timeseries/event data, a lot of the
> tools are lacking.
>
> I started off working in Splunk and it sort of spoiled me with
> end-user/administrator functionality from the get-go (even if it is
> prohibitively expensive and slow). In Splunk the 'sandpit' that you play in
> has all the toys a non-developer can ask for: built-in map/reduce +
> streaming, and manipulation of results/streams through a simple DSL familiar
> to anyone with a bit of Unix CLI/Bash experience. (i.e. search something |
> filter | map | eval | visualise
> http://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutsearchlanguagesyntax)
>
> At the moment we spend our days in logstash + elasticsearch (and sundry).
>
> I looked into Beam and Flink a bit and from a technical perspective it seems
> like the ideal direction to go, combining many sources of data (such as
> elasticsearch, influxdb, rethinkdb, etc.) and many analytics use-cases. The
> only gotcha seems to be that, from what I can see, the target audience is
> almost always developers. This isn't a problem for myself, but ideally I
> would want to bolt a simple DSL (submittable via simple interfaces, such as
> a CLI) on top of my datasets but have all of the stream/batch processing
> capabilities that projects like Flink allow.
>
> Is anyone aware of projects/efforts along these lines? Ideas on how we could
> get there from a project such as Apache Beam? (Am I being naive?)
>
> Your input/perspectives are most welcome!
>
> Kind regards,
> Stephan Buys
