Re: Happy Diwali to those forum members who celebrate this great festival

2016-10-30 Thread Sivakumaran S
Thank you Dr Mich :) Regards Sivakumaran S > On 30-Oct-2016, at 4:07 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote: > > Enjoy the festive season. > > Regards, > > Dr Mich Talebzadeh > > LinkedIn > https://www.linkedin.com/profile/view?id=AAE

Re: Scala Vs Python

2016-09-02 Thread Sivakumaran S
Whatever benefits you may accrue from rapid prototyping and coding in Python will be offset by the time taken to convert it to run inside the JVM. This of course depends on the complexity of the DAG. I guess it is a matter of language preference. Regards, Sivakumaran S > On

Re: How to convert List into json object / json Array

2016-08-30 Thread Sivakumaran S
Look at scala.util.parsing.json or the Jackson library for json manipulation. Also read http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets <http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets> Regards, Sivakumaran S
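A minimal sketch of the first suggestion (not from the thread itself): the `scala.util.parsing.json` API was part of the Scala 2.10 standard library in this era, but later moved to the separate `scala-parser-combinators` artifact and was deprecated. Sample data and names are invented.

```scala
// Sketch only: converting a List / Map into JSON text with the
// scala.util.parsing.json API mentioned above (Scala 2.10-era stdlib;
// a separate scala-parser-combinators dependency in later versions).
import scala.util.parsing.json.{JSONArray, JSONObject}

object ListToJson {
  def main(args: Array[String]): Unit = {
    val names = List("alice", "bob", "carol")
    // A JSON array from a plain List
    val jsonArray = JSONArray(names).toString()
    println(jsonArray) // a JSON array string with quoted elements

    // A JSON object from a Map
    val jsonObject = JSONObject(Map("count" -> names.size)).toString()
    println(jsonObject)
  }
}
```

For anything beyond simple structures, Jackson (or a dedicated Scala JSON library) is the more robust choice, as the reply suggests.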

Re: Design patterns involving Spark

2016-08-28 Thread Sivakumaran S
Spark is best suited for processing. But depending on the use case, you could expand the scope of Spark to moving data using the native connectors. The only thing Spark is not is storage. Connectors are available for most storage options though. Regards, Sivakumaran S > On 28-Aug-2016, at 6

Re: quick question

2016-08-25 Thread Sivakumaran S
requirements may vary. This may help too (http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/HomeWebsocket/WebsocketHome.html#section7 <http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/HomeWebsocket/WebsocketHome.html#section7>) Regards, Sivakumaran S >

Re: quick question

2016-08-25 Thread Sivakumaran S
driver code in Python? The link Kevin has sent should start you off. Regards, Sivakumaran > On 25-Aug-2016, at 11:53 AM, kant kodali <kanth...@gmail.com> wrote: > > yes for now it will be Spark Streaming Job but later it may change. > > > > > > On Thu, Au

Re: quick question

2016-08-25 Thread Sivakumaran S
Is this a Spark Streaming job? Regards, Sivakumaran S > @Sivakumaran when you say create a web socket object in your spark code I > assume you meant a spark "task" opening websocket > connection from one of the worker machines to some node.js server in that > case th

Re: quick question

2016-08-24 Thread Sivakumaran S
that help? Sivakumaran S > On 25-Aug-2016, at 6:30 AM, kant kodali <kanth...@gmail.com> wrote: > > so I would need to open a websocket connection from spark worker machine to > where? > > > > > > On Wed, Aug 24, 2016 8:51 PM, Kevin Mellott kevin.r.mell.

Re: Spark streaming not processing messages from partitioned topics

2016-08-10 Thread Sivakumaran S
> wrote: > > Hi Siva, > > Does topic has partitions? which version of Spark you are using? > > On Wed, Aug 10, 2016 at 2:38 AM, Sivakumaran S <siva.kuma...@me.com > <mailto:siva.kuma...@me.com>> wrote: > Hi, > > Here is a working example I did. >

Re: Spark streaming not processing messages from partitioned topics

2016-08-09 Thread Sivakumaran S
Hi, Here is a working example I did. HTH Regards, Sivakumaran S val topics = "test" val brokers = "localhost:9092" val topicsSet = topics.split(",").toSet val sparkConf = new SparkConf().setAppName("KafkaWeatherCalc").setMaster("local")
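The archived snippet above is cut off by the archive. A hedged sketch of how such an example typically continues, using the Spark 1.6 / Kafka 0.8-0.9 direct-stream API (topic, broker and app names are placeholders, not the poster's actual code):

```scala
// Hedged sketch of a complete Kafka direct-stream job in the style of the
// truncated example above (Spark 1.6-era spark-streaming-kafka API).
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaWeatherCalc {
  def main(args: Array[String]): Unit = {
    val topics = "test"
    val brokers = "localhost:9092"
    val topicsSet = topics.split(",").toSet
    val sparkConf = new SparkConf()
      .setAppName("KafkaWeatherCalc")
      .setMaster("local[*]") // more than one core so batch tasks can run
    val ssc = new StreamingContext(sparkConf, Seconds(60))
    val kafkaParams = Map[String, String]("metadata.broker.list" -> brokers)
    // Direct stream: one RDD partition per Kafka partition, no receiver
    val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topicsSet)
    messages.map(_._2).print() // print message payloads each batch
    ssc.start()
    ssc.awaitTermination()
  }
}
```

The direct stream reads all partitions of the topic in parallel, which is the usual answer to the "partitioned topics" question in this thread's subject.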

Re: Have I done everything correctly when subscribing to Spark User List

2016-08-08 Thread Sivakumaran S
Does it have anything to do with the fact that the mail address is displayed as user @spark.apache.org ? There is a space before ‘@‘. This is as received in my mail client. Sivakumaran > On 08-Aug-2016, at 7:42 PM, Chris Mattmann wrote: > >

Re: Machine learning question (using spark) - removing redundant factors while doing clustering

2016-08-08 Thread Sivakumaran S
Not an expert here, but the first step would be to devote some time and identify which of these 112 factors are actually causative. Some domain knowledge of the data may be required. Then, you can start off with PCA. HTH, Regards, Sivakumaran S > On 08-Aug-2016, at 3:01 PM, Tony Lane <to
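An illustrative sketch of the PCA step suggested above (not from the thread): MLlib's `PCA` can project a high-dimensional dataset onto its top principal components. The toy data and the component count are invented stand-ins for the 112-factor dataset.

```scala
// Illustrative sketch only: dimensionality reduction with MLlib PCA
// (Spark 1.6-era RDD API). Data and component count are invented.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.feature.PCA
import org.apache.spark.mllib.linalg.Vectors

object PcaSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("PcaSketch").setMaster("local[*]"))
    // Toy stand-in for the real feature matrix
    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0, 3.0, 4.0),
      Vectors.dense(2.0, 4.0, 6.0, 8.0),
      Vectors.dense(3.0, 5.0, 7.0, 9.0)))
    val pca = new PCA(2).fit(rows)        // keep the top 2 principal components
    val reduced = rows.map(pca.transform) // project each row into 2 dimensions
    reduced.collect().foreach(println)
    sc.stop()
  }
}
```

As the reply notes, PCA only removes linear redundancy; deciding which factors are causative still needs domain knowledge.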

Re: Help testing the Spark Extensions for the Apache Bahir 2.0.0 release

2016-08-07 Thread Sivakumaran S
Hi, How can I help? regards, Sivakumaran S > On 06-Aug-2016, at 6:18 PM, Luciano Resende <luckbr1...@gmail.com> wrote: > > Apache Bahir is voting it's 2.0.0 release based on Apache Spark 2.0.0. > > https://www.mail-archive.com/dev@bahir.apache.org/msg00312.html

Re: Visualization of data analysed using spark

2016-07-31 Thread Sivakumaran S
Hi Tony, If your requirement is browser based plotting (real time or otherwise), you can load the data and display it in a browser using D3. Since D3 has very low level plotting routines, you can look at C3 (provided by www.pubnub.com) or Rickshaw (https://github.com/shutterstock/rickshaw

Is spark-submit a single point of failure?

2016-07-22 Thread Sivakumaran S
fails and has to be restarted. Is there any way to obviate this? Is my understanding correct that the spark-submit in its current form is a Single Point of Vulnerability, much akin to the NameNode in HDFS? regards Sivakumaran S

Re: Question on Spark shell

2016-07-11 Thread Sivakumaran S
t; You should have the same output starting the application on the console. You > are not seeing any output? > > On Mon, 11 Jul 2016 at 11:55 Sivakumaran S <siva.kuma...@me.com > <mailto:siva.kuma...@me.com>> wrote: > I am running a spark streaming application using Scala

Re: Question on Spark shell

2016-07-11 Thread Sivakumaran S
ves you a Spark Context to play with straight > away. The output is printed to the console. > > On Mon, 11 Jul 2016 at 11:47 Sivakumaran S <siva.kuma...@me.com > <mailto:siva.kuma...@me.com>> wrote: > Hello, > > Is there a way to start the spark server with t

Question on Spark shell

2016-07-11 Thread Sivakumaran S
Hello, Is there a way to start the spark server with the log output piped to screen? I am currently running spark in the standalone mode on a single machine. Regards, Sivakumaran

Re: problem extracting map from json

2016-07-07 Thread Sivakumaran S
Hi Michal, Will an example help? import scala.util.parsing.json._ // Requires scala-parser-combinators because it is no longer part of core scala val wbJSON = JSON.parseFull(weatherBox) // wbJSON is a JSON object now // Depending on the structure, now traverse through the object val
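An expanded, hedged version of the truncated snippet above: the `weatherBox` payload is not shown in the thread, so the JSON below is a placeholder. Note that `JSON.parseFull` returns `Some(Map[String, Any])` for an object, with every JSON number represented as a `Double`.

```scala
// Sketch of extracting a map from a JSON string with JSON.parseFull
// (scala-parser-combinators; placeholder payload, not the thread's data).
import scala.util.parsing.json.JSON

object ParseWeather {
  def main(args: Array[String]): Unit = {
    val weatherBox = """{"id": 121156, "ht": 42.0, "rotor_rpm": 180.0}"""
    JSON.parseFull(weatherBox) match {
      case Some(fields: Map[String, Any] @unchecked) =>
        // parseFull returns all JSON numbers as Double
        val id = fields("id").asInstanceOf[Double].toInt
        val rpm = fields("rotor_rpm").asInstanceOf[Double]
        println(s"id=$id rpm=$rpm")
      case _ =>
        println("malformed JSON")
    }
  }
}
```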

Re: Multiple aggregations over streaming dataframes

2016-07-07 Thread Sivakumaran S
ond aggregation. I could > probably rewrite the query in such a way that it does aggregation in one pass > but that would obfuscate the purpose of the various stages. > > Le 7 juil. 2016 12:55, "Sivakumaran S" <siva.kuma...@me.com > <mailto:siva.kuma...@me.com&g

Re: Multiple aggregations over streaming dataframes

2016-07-07 Thread Sivakumaran S
Hi Arnauld, Sorry for the doubt, but what exactly is multiple aggregation? What is the use case? Regards, Sivakumaran > On 07-Jul-2016, at 11:18 AM, Arnaud Bailly wrote: > > Hello, > > I understand multiple aggregations over streaming dataframes is not currently >

Re: Python to Scala

2016-06-18 Thread Sivakumaran S
If you can identify a suitable java example in the spark directory, you can use that as a template and convert it to scala code using http://javatoscala.com/ Siva > On 18-Jun-2016, at 6:27 AM, Aakash Basu wrote: > > I don't have a sound

Re: choice of RDD function

2016-06-16 Thread Sivakumaran S
direction":8.50031} In my Spark app, I have set the batch duration as 60 seconds. Now, as per the 1.6.1 documentation, "Spark SQL can automatically infer the schema of a JSON dataset and load it as a DataFrame. This conversion can be done using SQLContext.read.json() on either an R
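A hedged sketch of the schema inference quoted above (Spark 1.6-era API; the JSON fields echo the thread's example but the code is otherwise invented): `sqlContext.read.json` accepts an `RDD[String]` of JSON documents and infers the schema automatically.

```scala
// Sketch: inferring a DataFrame schema from JSON strings in an RDD,
// as described in the quoted documentation (Spark 1.6-era SQLContext API).
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object JsonSchemaSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("JsonSchemaSketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    // An RDD[String] of JSON documents, e.g. one batch from a DStream
    val jsonRDD = sc.parallelize(Seq(
      """{"id": 121156, "ht": 42, "rotor_rpm": 180}"""))
    val df = sqlContext.read.json(jsonRDD) // schema inferred from the data
    df.printSchema() // integer fields are typically inferred as long
    df.show()
    sc.stop()
  }
}
```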

Re: choice of RDD function

2016-06-16 Thread Sivakumaran S
ki > > https://medium.com/@jaceklaskowski/ > Mastering Apache Spark http://bit.ly/mastering-apache-spark > Follow me at https://twitter.com/jaceklaskowski > > > On Wed, Jun 15, 2016 at 11:55 PM, Sivakumaran S <siva.kuma...@me.com> wrote: >> Cody, >> >&g

Re: choice of RDD function

2016-06-15 Thread Sivakumaran S
Jun 15, 2016 at 11:19 AM, Sivakumaran S <siva.kuma...@me.com> wrote: >> Of course :) >> >> object sparkStreaming { >> def main(args: Array[String]) { >>StreamingExamples.setStreamingLogLevels() //Set reasonable logging >> levels for streaming if the user has

Re: choice of RDD function

2016-06-15 Thread Sivakumaran S
com/@jaceklaskowski/ > Mastering Apache Spark http://bit.ly/mastering-apache-spark > Follow me at https://twitter.com/jaceklaskowski > > > On Wed, Jun 15, 2016 at 5:03 PM, Sivakumaran S <siva.kuma...@me.com> wrote: >> Thanks Jacek, >> >> Job completed!! :) Ju

Re: choice of RDD function

2016-06-15 Thread Sivakumaran S
Thanks Jacek, Job completed!! :) Just used data frames and a SQL query. Very clean and functional code. Siva > On 15-Jun-2016, at 3:10 PM, Jacek Laskowski wrote: > > mapWithState
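A hedged sketch of the "data frames and sql query" pattern described above (not the poster's actual code): load JSON into a DataFrame, register it as a temp table, and aggregate with Spark SQL. Table and column names are invented, echoing the thread's weather schema; the API is Spark 1.6-era.

```scala
// Sketch only: the DataFrame + SQL-query approach the thread settled on
// (Spark 1.6-era API; names invented to match the thread's weather JSON).
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object DfSqlSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("DfSqlSketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    val rdd = sc.parallelize(Seq(
      """{"id": 121156, "rotor_rpm": 180}""",
      """{"id": 121156, "rotor_rpm": 175}"""))
    val df = sqlContext.read.json(rdd)
    df.registerTempTable("readings") // replaced by createOrReplaceTempView in 2.0
    sqlContext.sql(
      "SELECT id, MAX(rotor_rpm) AS max_rpm FROM readings GROUP BY id").show()
    sc.stop()
  }
}
```

In a streaming job the same pattern is applied per batch inside `foreachRDD`, registering each micro-batch before querying it.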

choice of RDD function

2016-06-14 Thread Sivakumaran S
Dear friends, I have set up Kafka 0.9.0.0, Spark 1.6.1 and Scala 2.10. My source is sending a json string periodically to a topic in kafka. I am able to consume this topic using Spark Streaming and print it. The schema of the source json is as follows: { “id”: 121156, “ht”: 42, “rotor_rpm”: