Got a 404 on that link: https://github.com/Intel-bigdata/spark-streamsql
*Irfan Ahmad*
CTO | Co-Founder | *CloudPhysics* <http://www.cloudphysics.com>
Best of VMworld Finalist | Best Cloud Management Award
NetworkWorld 10 Startups to Watch | EMA Most Notable Vendor

On Wed, Mar 11, 2015 at 6:41 AM, Jason Dai <jason....@gmail.com> wrote:

> Yes, a previous prototype is available at
> https://github.com/Intel-bigdata/spark-streamsql, and a talk was given at
> last year's Spark Summit:
> http://spark-summit.org/2014/talk/streamsql-on-spark-manipulating-streams-by-sql-using-spark
>
> We are currently porting the prototype to use the latest DataFrame API,
> and will provide a stable version for people to try soon.
>
> Thanks,
> -Jason
>
>
> On Wed, Mar 11, 2015 at 9:12 AM, Tobias Pfeiffer <t...@preferred.jp> wrote:
>
>> Hi,
>>
>> On Wed, Mar 11, 2015 at 9:33 AM, Cheng, Hao <hao.ch...@intel.com> wrote:
>>
>>> Intel has a prototype for doing this; SaiSai and Jason are the
>>> authors. You can probably ask them for some materials.
>>
>> The GitHub repository is here: https://github.com/intel-spark/stream-sql
>>
>> Also, what I did was write a wrapper class SchemaDStream that internally
>> holds a DStream[Row] and a DStream[StructType] (the latter having just
>> one element in every RDD). This allows you to:
>> - perform SchemaRDD => SchemaRDD operations using
>>   `rowStream.transformWith(schemaStream, ...)`
>> - in particular, register this stream's data as a table this way
>> - obtain a new stream from previously registered tables via a companion
>>   object with a method `fromSQL(sql: String): SchemaDStream`
>>
>> However, you are limited to batch-internal operations, i.e., you can't
>> aggregate across batches.
>>
>> I am not able to share the code at the moment, but will within the next
>> few months. It is not very advanced code, though, and should be easy to
>> replicate. Also, I have no idea about the performance of transformWith.
>>
>> Tobias
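For readers trying to replicate the wrapper Tobias describes, the design could be sketched roughly as below, written against the Spark 1.2-era API (SchemaRDD, SQLContext.applySchema). This is a hypothetical reconstruction, not his actual code: everything except `DStream.transformWith`, `DStream.transform`, `SQLContext.applySchema`, `SQLContext.sql`, and `SchemaRDD.registerTempTable` is a made-up name, and it is a sketch rather than a working implementation (e.g. schema propagation is simplified).

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext, SchemaRDD}
import org.apache.spark.sql.catalyst.types.StructType
import org.apache.spark.streaming.dstream.DStream

// Hypothetical wrapper: pairs a stream of rows with a stream of schemas
// (one StructType element per batch), as described in the mail above.
class SchemaDStream(
    sqlc: SQLContext,
    val rowStream: DStream[Row],
    val schemaStream: DStream[StructType]) {

  // Apply a SchemaRDD => SchemaRDD operation batch by batch, pairing
  // each row RDD with its single-element schema RDD via transformWith.
  def transform(f: SchemaRDD => SchemaRDD): SchemaDStream = {
    val out = rowStream.transformWith(schemaStream,
      (rows: RDD[Row], schemas: RDD[StructType]) => {
        val schema = schemas.first() // exactly one element per batch
        f(sqlc.applySchema(rows, schema)): RDD[Row]
      })
    // Simplification: reuses the input schema; a real version would
    // propagate the schema of f's output instead.
    new SchemaDStream(sqlc, out, schemaStream)
  }

  // Register each batch's data as a temporary table, so later SQL
  // queries can refer to this stream by name.
  def registerStreamAsTable(name: String): Unit =
    transform { srdd => srdd.registerTempTable(name); srdd }
      .rowStream.foreachRDD(_ => ()) // force per-batch evaluation
}

object SchemaDStream {
  // Run `sql` once per batch against previously registered tables,
  // using an existing stream only as a per-batch "clock".
  def fromSQL(sqlc: SQLContext, clock: DStream[_], sql: String): SchemaDStream = {
    val rows: DStream[Row] = clock.transform(_ => sqlc.sql(sql): RDD[Row])
    val schemas = clock.transform(rdd =>
      rdd.sparkContext.parallelize(Seq(sqlc.sql(sql).schema), 1))
    new SchemaDStream(sqlc, rows, schemas)
  }
}
```

As Tobias notes, anything built this way is limited to batch-internal operations: each batch is turned into an independent SchemaRDD, so SQL cannot aggregate across batch boundaries.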