Re: Run Python User Defined Functions / code in Spark with Scala Codebase

2018-07-15 Thread Chetan Khatri
Hello Jayant, Thanks for the great OSS contribution :) On Thu, Jul 12, 2018 at 1:36 PM, Jayant Shekhar wrote: > Hello Chetan, > > Sorry missed replying earlier. You can find some sample code here: > > http://sparkflows.readthedocs.io/en/latest/user-guide/python/pipe-python.html > > We will

Security in pyspark using extensions

2018-07-15 Thread Maximiliano Patricio MĂ©ndez
Hi, I'm trying to build an authorization/security extension for spark using the hooks provided in SPARK-18127 ( https://issues.apache.org/jira/browse/SPARK-18127). The problem I've encountered is that those hooks aren't available for pyspark, as the extensions are loaded in the getOrCreate method
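One route worth noting: SPARK-18127 also added the `spark.sql.extensions` configuration property, which names the extensions class to inject when the JVM-side session is built, so it can be set from pyspark even though the `withExtensions` builder method itself is Scala-only. A configuration sketch (the class name is a placeholder for your own `SparkSessionExtensions` implementation, which must be on the driver classpath):

```python
from pyspark.sql import SparkSession

# Hypothetical extensions class; replace with your own implementation.
spark = (
    SparkSession.builder
    .appName("secured-session")
    .config("spark.sql.extensions", "com.example.MyAuthExtensions")
    .getOrCreate()
)
```

Note that getOrCreate() reuses any existing session, so the config must be set before the first session is created.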

Re: Stopping a Spark Streaming Context gracefully

2018-07-15 Thread Dhaval Modi
+1 Regards, Dhaval Modi dhavalmod...@gmail.com On 8 November 2017 at 00:06, Bryan Jeffrey wrote: > Hello. > > I am running Spark 2.1, Scala 2.11. We're running several Spark streaming > jobs. In some cases we restart these jobs on an occasional basis. We have > code that looks like the

Re: Pyspark access to scala/java libraries

2018-07-15 Thread Holden Karau
If you want to see some examples, a library that shows a way to do it is https://github.com/sparklingpandas/sparklingml, and High Performance Spark also talks about it. On Sun, Jul 15, 2018, 11:57 AM <0xf0f...@protonmail.com.invalid> wrote: > Check >

Re: Properly stop applications or jobs within the application

2018-07-15 Thread Dhaval Modi
@sagar - YARN kill is not a reliable process for spark streaming. Regards, Dhaval Modi dhavalmod...@gmail.com On 8 March 2018 at 17:18, bsikander wrote: > I am running in Spark standalone mode. No YARN. > > anyways, yarn application -kill is a manual process. I do not want that. I > was to

Re: Stopping StreamingContext

2018-07-15 Thread Dhaval Modi
+1 Regards, Dhaval Modi dhavalmod...@gmail.com On 29 March 2018 at 19:57, Sidney Feiner wrote: > Hey, > > I have a Spark Streaming application processing some events. > > Sometimes, I want to stop the application if I get a specific event. I > collect the executor's results in the driver and

How to stop streaming jobs

2018-07-15 Thread Dhaval Modi
Hi Team, I have a condition where I want to stop infinitely running spark streaming jobs. Currently the spark streaming job is configured with "awaitTerminationOrTimeout(-1)" and is deployed in cluster mode in YARN. I have read that YARN kill does not work in this case. Can you please guide what are the
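A pattern often suggested for this situation (not spelled out in the thread itself) is to replace the blocking awaitTerminationOrTimeout(-1) with a driver-side polling loop that watches for a marker file (on HDFS or local disk) and calls a graceful stop when it appears. A sketch with the Spark calls stubbed out so the polling logic stands alone; the marker path is hypothetical:

```python
import os
import time

def run_until_marker(stop_fn, marker_path, poll_secs=10,
                     exists=os.path.exists, sleep=time.sleep):
    """Poll for a shutdown marker file; when it appears, invoke
    stop_fn (e.g. a graceful ssc.stop) and return the number of
    polls performed before the marker was seen."""
    polls = 0
    while not exists(marker_path):
        polls += 1
        sleep(poll_secs)
    stop_fn()
    return polls

# Hypothetical wiring against a pyspark StreamingContext `ssc`:
#   ssc.start()
#   run_until_marker(
#       lambda: ssc.stop(stopSparkContext=True, stopGraceFully=True),
#       "/tmp/stop_streaming_job")
# To stop the job, touch /tmp/stop_streaming_job from outside.
```

Graceful stop lets in-flight batches finish before shutdown, which YARN kill does not; for an HDFS marker you would swap os.path.exists for an HDFS existence check.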

Re: Do GraphFrames support streaming?

2018-07-15 Thread kant kodali
I have tried this sort of approach in other streaming cases I ran into, and I believe the problem with this approach is: 1) we have one stream (say stream1) going to disk, say HDFS or a database, and we have another stream (say stream2) where for every row in stream2 we make an I/O call to see if we

Re: Can I specify watermark using raw sql alone?

2018-07-15 Thread kant kodali
I don't see a withWatermark UDF to use in raw SQL. I am currently using Spark 2.3.1. On Sat, Jul 14, 2018 at 4:19 PM, kant kodali wrote: > Hi All, > > Can I specify a watermark using raw SQL alone? In other words, without using > .withWatermark from the > Dataset API. > > Thanks! >
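As of Spark 2.3 there is no SQL syntax or UDF for watermarks; the usual workaround is to apply withWatermark() on the streaming Dataset first, register the result as a temp view, and write the rest of the query in raw SQL (the watermark is carried along in the view's logical plan). A non-runnable sketch; the `rate` source and column names are illustrative, and `spark` is an active SparkSession:

```python
# `events` is a streaming DataFrame with a timestamp column; the
# built-in rate source is used here only to have a concrete example.
events = (spark.readStream.format("rate").load()
          .withColumnRenamed("timestamp", "eventTime"))

# Apply the watermark via the Dataset API, then drop into raw SQL.
events.withWatermark("eventTime", "10 minutes") \
      .createOrReplaceTempView("events_wm")

# The aggregation itself can now be plain SQL; late data beyond the
# 10-minute watermark is dropped by the engine.
counts = spark.sql("""
    SELECT window(eventTime, '5 minutes') AS w, count(*) AS c
    FROM events_wm
    GROUP BY window(eventTime, '5 minutes')
""")
```

So the query body can be pure SQL, but the single withWatermark() call still has to happen through the Dataset API before the view is registered.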

Re: Pyspark access to scala/java libraries

2018-07-15 Thread Mohit Jaggi
Trying again…anyone know how to make this work? > On Jul 9, 2018, at 3:45 PM, Mohit Jaggi wrote: > > Folks, > I am writing some Scala/Java code and want it to be usable from pyspark. > > For example: > class MyStuff(addend: Int) { > def myMapFunction(x: Int) = x + addend > } > > I want
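Assuming the Scala class above is compiled into a jar on the driver classpath (e.g. passed via --jars), the usual route from pyspark is the py4j gateway exposed as sc._jvm. A thin wrapper sketch for the MyStuff class from the thread, with the gateway injected so the pattern is visible on its own; the package name com.example is hypothetical:

```python
class MyStuffWrapper:
    """Thin py4j-style wrapper for the Scala class from the thread:
        class MyStuff(addend: Int) { def myMapFunction(x: Int) = x + addend }
    The JVM gateway is injected; in a real job you would pass sc._jvm."""

    def __init__(self, jvm, addend):
        # Hypothetical package: adjust to wherever MyStuff actually lives.
        self._jstuff = jvm.com.example.MyStuff(addend)

    def my_map_function(self, x):
        # Delegates to the JVM-side method through py4j.
        return self._jstuff.myMapFunction(x)
```

One caveat: the wrapped JVM object lives in the driver's JVM and cannot be serialized into Python map functions on executors; for per-row use inside a DataFrame job, registering the Scala logic as a UDF and calling it from SQL is the more common route.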