A Spark job using a streaming context is an endless "while" loop until you kill
it or specify when to stop. Start a TCP server before you start the stream
processing
(https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_a_WebSocket_server_in_Java).

As long as your driver program is running, it will have a TCP server listening
for connections on the port you specify (you can check with netstat).

And as long as your job is running, a client (in your case, a browser running
the dashboard code) will be able to connect to the TCP server running in your
Spark job and receive the data that you write from the TCP server.
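
To make the lifecycle concrete, here is a minimal sketch of a server that lives as long as the driver does and pushes one line of text to every connected client. The class name DashboardFeed and its methods are hypothetical, not part of Spark. It is a plain TCP server only: a browser's WebSocket client additionally needs the HTTP upgrade handshake and frame encoding described in the MDN article above (or a WebSocket library), which this sketch leaves out.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical helper: a plain TCP server that stays up as long as the driver runs.
final class DashboardFeed implements AutoCloseable {
    private final ServerSocket server;
    private final List<PrintWriter> clients = new CopyOnWriteArrayList<>();

    DashboardFeed(int port) {
        try {
            server = new ServerSocket(port); // port 0 lets the OS pick a free port
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread acceptor = new Thread(this::acceptLoop, "dashboard-feed-accept");
        acceptor.setDaemon(true); // do not keep the JVM alive once the driver exits
        acceptor.start();
    }

    // Accept clients in the background and remember a writer for each of them.
    private void acceptLoop() {
        while (!server.isClosed()) {
            try {
                Socket socket = server.accept();
                clients.add(new PrintWriter(new OutputStreamWriter(
                        socket.getOutputStream(), StandardCharsets.UTF_8), true));
            } catch (IOException e) {
                // accept() fails once the server socket is closed; the loop then ends
            }
        }
    }

    // Send one line to every connected client; forget clients whose connection failed.
    void broadcast(String line) {
        for (PrintWriter client : clients) {
            client.println(line);
            if (client.checkError()) {
                clients.remove(client);
            }
        }
    }

    int port() {
        return server.getLocalPort();
    }

    int clientCount() {
        return clients.size();
    }

    @Override
    public void close() {
        try {
            server.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (PrintWriter client : clients) {
            client.close();
        }
        clients.clear();
    }
}
```

In a Spark Streaming job you would presumably create the feed once, before starting the streaming context, and call broadcast(...) from the driver-side part of foreachRDD after collecting each batch's result. The connections stay open between batches, so nothing is reopened when a task finishes.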

As per the WebSocket protocol, this connection stays open. Read this too:
https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_WebSocket_servers

Once you have the data you need at the client, you just need to figure out a
way to push it into the JavaScript object holding the data in your dashboard and
refresh it. If it is an array, you can just remove the oldest data and append
the latest data to it. If it is a hash or a dictionary, you could just update
the value.
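
The two update rules can be sketched as follows. This is written in Java only to match your driver language; in the browser the array case would correspond to shift() followed by push() on a JavaScript array. The class DashboardState and its method names are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for the data behind a chart.
final class DashboardState {
    private final int capacity;
    private final Deque<Double> series = new ArrayDeque<>();
    private final Map<String, Double> latestByKey = new HashMap<>();

    DashboardState(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Array case: drop the oldest point once the window is full, then append the newest.
    void append(double value) {
        if (series.size() == capacity) {
            series.removeFirst();
        }
        series.addLast(value);
    }

    // Dictionary case: overwrite the value stored under the key.
    void update(String key, double value) {
        latestByKey.put(key, value);
    }

    // Copies, so callers cannot modify the internal state by accident.
    List<Double> series() {
        return new ArrayList<>(series);
    }

    Map<String, Double> latest() {
        return new HashMap<>(latestByKey);
    }
}
```

After each update the dashboard would redraw from series() or latest().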

I would suggest using JSON for server-client communication. It is easier to
navigate JSON objects in JavaScript :) But your requirements may vary.
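
As a rough illustration of the server side of that, the sketch below turns a map of counts into a one-line JSON object that could be written to the socket. The class JsonPayload and the shape of the payload are assumptions for the example. The formatter is hand-rolled only to keep the sketch self-contained; in a real job a JSON library such as Jackson or Gson would be the more sensible choice.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical formatter for a flat {"key": number} payload.
final class JsonPayload {
    static String toJson(Map<String, Long> counts) {
        StringJoiner fields = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            fields.add("\"" + escape(entry.getKey()) + "\":" + entry.getValue());
        }
        return fields.toString();
    }

    // Escape the characters that JSON does not allow unescaped inside a string.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}
```

On the client, JSON.parse on the received line gives you back an object you can index by key.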

This may help too
(http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/HomeWebsocket/WebsocketHome.html#section7).

Regards,

Sivakumaran S 




> On 25-Aug-2016, at 8:09 PM, kant kodali <kanth...@gmail.com> wrote:
> 
> Your assumption is right (that's what I intend to do). My driver code will be
> in Java. The link sent by Kevin is an API reference for WebSockets.
> I understand how WebSockets work in general, but my question was more geared
> towards seeing the end-to-end path of how the front-end dashboard
> gets updated in real time. When we collect the data back to the driver program
> and finish writing it to the websocket client, the websocket connection
> terminates, right? So:
> 
> 1) Is the Spark driver program something that needs to run forever, like a
> typical server? If not,
> 2) do we need to open a new websocket connection each time a task
> terminates?
> 
> 
> 
> 
> 
> On Thu, Aug 25, 2016 6:06 AM, Sivakumaran S <siva.kuma...@me.com> wrote:
> I am assuming that you are doing some calculations over a time window. At the 
> end of the calculations (using RDDs or SQL), once you have collected the data 
> back to the driver program, you format the data in the way your client 
> (dashboard) requires it and write it to the websocket. 
> 
> Is your driver code in Python? The link Kevin has sent should start you off.
> 
> Regards,
> 
> Sivakumaran 
>> On 25-Aug-2016, at 11:53 AM, kant kodali <kanth...@gmail.com> wrote:
>> 
>> Yes, for now it will be a Spark Streaming job, but later it may change.
>> 
>> 
>> 
>> 
>> 
>> On Thu, Aug 25, 2016 2:37 AM, Sivakumaran S <siva.kuma...@me.com> wrote:
>> Is this a Spark Streaming job?
>> 
>> Regards,
>> 
>> Sivakumaran S
>> 
>> 
>> 
>>> On 25-Aug-2016, at 7:08 AM, kant kodali <kanth...@gmail.com> wrote:
>>> 
>>> @Sivakumaran when you say "create a websocket object in your Spark code", I
>>> assume you meant a Spark "task" opening a websocket connection from one of
>>> the worker machines to some Node.js server. In that case the websocket
>>> connection terminates after the Spark task is completed, right? And when
>>> new data comes in, a new task gets created and opens a new websocket
>>> connection again. Is that how it should be?
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Aug 24, 2016 10:38 PM, Sivakumaran S <siva.kuma...@me.com> wrote:
>>> You create a websocket object in your Spark code and write your data to the
>>> socket. You create a websocket object in your dashboard code, receive
>>> the data in real time, and update the dashboard. You can use Node.js in your
>>> dashboard (socket.io, http://socket.io/). I am sure there are other ways
>>> too.
>>> 
>>> Does that help?
>>> 
>>> Sivakumaran S
>>> 
>>>> On 25-Aug-2016, at 6:30 AM, kant kodali <kanth...@gmail.com> wrote:
>>>> 
>>>> So I would need to open a websocket connection from a Spark worker machine
>>>> to where?
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Wed, Aug 24, 2016 8:51 PM, Kevin Mellott <kevin.r.mell...@gmail.com> wrote:
>>>> In the diagram you referenced, a real-time dashboard can be created using
>>>> WebSockets. This technology essentially allows your web page to keep an
>>>> active line of communication between the client and server, so that
>>>> you can detect and display new information without requiring any user
>>>> input or page refreshes. The link below contains additional information on
>>>> this concept, as well as links to several different implementations (based
>>>> on your programming language preferences).
>>>> 
>>>> https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API
>>>> 
>>>> Hope this helps!
>>>> - Kevin
>>>> 
>>>> On Wed, Aug 24, 2016 at 3:52 PM, kant kodali <kanth...@gmail.com> wrote:
>>>> 
>>>> ---------- Forwarded message ----------
>>>> From: kant kodali <kanth...@gmail.com>
>>>> Date: Wed, Aug 24, 2016 at 1:49 PM
>>>> Subject: quick question
>>>> To: d...@spark.apache.org, us...@spark.apache.org
>>>> 
>>>> 
>>>> <attachment-1.png>
>>>> 
>>>> In this picture, what does "Dashboards" really mean? Is there an open source
>>>> project that would allow me to push the results back to Dashboards so
>>>> that they are always in sync with real-time updates? (A push-based
>>>> solution is better than polling, but I am open to whatever is possible given
>>>> the above picture.)
>>>> 
>>> 
>> 
>> 
> 
> 
