[CFP] DataWorks Summit Europe 2018 - Call for abstracts

2017-12-09 Thread Yanbo Liang
The DataWorks Summit Europe is in Berlin, Germany this year, on April 16-19, 2018. This is a great place to talk about work you are doing in Apache Spark or how you are using Spark for SQL/streaming processing, machine learning and data science. Information on submitting an abstract is at

Re: queryable state & streaming

2017-12-09 Thread Stavros Kontopoulos
Nice, I was looking for a JIRA. I agree we should justify why we are building something. In that direction, here is what I have seen in my experience. People quite often use state within their streaming apps and may have large states (TBs). Shortening the pipeline by not having to copy data

Re: RDD[internalRow] -> DataSet

2017-12-09 Thread Jacek Laskowski
Hi Satyajit, That's exactly what Dataset.rdd does --> https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala?utf8=%E2%9C%93#L2916-L2921 Regards, Jacek Laskowski https://about.me/JacekLaskowski Spark Structured Streaming

Re: BUILD FAILURE due to...not found: value AnalysisBarrier in spark-catalyst_2.11?

2017-12-09 Thread Jacek Laskowski
Hi, Thanks Sean! You're right -- my local repo got hosed. I don't know why the patch with AnalysisBarrier didn't go through. Speaking of the patch [1], I've noticed a sentence in the scaladoc of AnalysisBarrier that does not make much sense to me. There's something missing in it, isn't there? >