Re: Apache Spark - Spark Structured Streaming - Watermark usage

2018-01-26 Thread Jacek Laskowski
Hi, I'm curious how you would meet the requirement "by a certain amount of time" without a watermark. How would you know what's current and compute the lag? Let's forget about watermark for a moment and see if it pops up as an inevitable feature :) "I am trying to filter out records which are

Apache Spark - Spark Structured Streaming - Watermark usage

2018-01-26 Thread M Singh
Hi: I am trying to filter out records which are lagging behind (based on event time) by a certain amount of time. Is the watermark API applicable to this scenario (i.e., filtering lagging records), or is it only applicable with aggregation? I could not get a clear understanding from the

Re: Apache Spark - Custom structured streaming data source

2018-01-26 Thread M Singh
Thanks TD. When is 2.3 scheduled for release? On Thursday, January 25, 2018 11:32 PM, Tathagata Das wrote: Hello Mans, The streaming DataSource APIs are still evolving and are not public yet. Hence there is no official documentation. In fact, there is a new

Re: Spark Standalone Mode, application runs, but executor is killed

2018-01-26 Thread Chandu
/Reply from Marco posted in another thread/ Re: Best active groups, forums or contacts for Spark ? Posted by Marco Mistroni on Jan 26, 2018; 9:08am URL: http://apache-spark-user-list.1001560.n3.nabble.com/Best-active-groups-forums-or-contacts-for-Spark-tp30744p30748.html Hi From personal

Re: Best active groups, forums or contacts for Spark ?

2018-01-26 Thread Chandu
@Marco Thanks. I have provided information in my original post ( http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Standalone-Mode-application-runs-but-executor-is-killed-tc30739.html

Re: Spark Standalone Mode, application runs, but executor is killed

2018-01-26 Thread Chandu
@Marco Thank you. I thought Standalone and Standalone cluster are the same? The app is not a huge app. It's just the Pi calculation example. The value of Pi is calculated and passed to the driver successfully. When I issue the spark.stop from my driver, that is when I see the KILLED message on the worker

Re: Spark Standalone Mode, application runs, but executor is killed

2018-01-26 Thread Chandu
/Reply from Marco in another post/ Re: Best active groups, forums or contacts for Spark ? Posted by Marco Mistroni on Jan 26, 2018; 9:08am URL: http://apache-spark-user-list.1001560.n3.nabble.com/Best-active-groups-forums-or-contacts-for-Spark-tp30744p30748.html Hi From personal

Re: Best active groups, forums or contacts for Spark ?

2018-01-26 Thread Marco Mistroni
Hi, From personal experience, and I might be asking you obvious questions: 1. Does it work in standalone (no cluster)? 2. Can you break down the app in pieces and try to see at which step the code gets killed? 3. Have you had a look at the Spark GUI to see if the executors go OOM? I might be oversimplifying what

Re: how to create a DataType Object using the String representation in Java using Spark 2.2.0?

2018-01-26 Thread Rick Moritz
Hi, We solved this the ugly way, when parsing external column definitions: private def columnTypeToFieldType(columnType: String): DataType = { columnType match { case "IntegerType" => IntegerType case "StringType" => StringType case "DateType" => DateType case "FloatType" =>

Re: Best active groups, forums or contacts for Spark ?

2018-01-26 Thread Chandu
@Esa Thanks for posting this as I was thinking the same way when trying to get some help about Spark (I am just a beginner) @Jack I posted a question @ here ( http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Standalone-Mode-application-runs-but-executor-is-killed-tc30739.html

Re: Best active groups, forums or contacts for Spark ?

2018-01-26 Thread Jacek Laskowski
Hi Esa, I'd say https://stackoverflow.com/questions/tagged/apache-spark is where many active sparkians hang out :) Regards, Jacek Laskowski https://about.me/JacekLaskowski Mastering Spark SQL https://bit.ly/mastering-spark-sql Spark Structured Streaming

Best active groups, forums or contacts for Spark ?

2018-01-26 Thread Esa Heikkinen
Hi, It is very often difficult to get answers to questions about Spark in many forums. Maybe they are inactive or my questions are too bad. I don't know, but does anyone know of other good active groups, forums or contacts like this? Esa Heikkinen