Re: How to retrieve multiple column values (in one row) into variables in a Spark Scala method

2019-04-05 Thread ayan guha
Try like this:

    val primitiveDS = spark.sql("select 1.2 avg, 2.3 stddev").collect().apply(0)
    val arr = Array(primitiveDS.getDecimal(0), primitiveDS.getDecimal(1))

    primitiveDS: org.apache.spark.sql.Row = [1.2,2.3]
    arr: Array[java.math.BigDecimal] = Array(1.2, 2.3)
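A minimal self-contained version of that snippet, for reference (Spark 2.x; the SparkSession setup and the fieldIndex lookup are additions, not from the original post):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("row-to-vars").getOrCreate()

    // collect() returns Array[Row]; take the single row of the aggregate result
    val row = spark.sql("select 1.2 as avg, 2.3 as stddev").collect()(0)

    // Pull each column into its own variable, by position or by name
    val avg    = row.getDecimal(0)                         // java.math.BigDecimal
    val stddev = row.getDecimal(row.fieldIndex("stddev"))  // lookup by column name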

How to retrieve multiple column values (in one row) into variables in a Spark Scala method

2019-04-05 Thread Mich Talebzadeh
Hi, Pretty basic question. I use Spark on HBase to retrieve the average and standard deviation of the last 14 prices for a security (ticker) from an HBase table. However, the call is expensive in Spark Streaming, where these values are used to indicate buy and sell and subsequently the high value
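A hedged sketch of the aggregate being described (the prices DataFrame, its column names, and the 14-row window are assumptions for illustration):

    import org.apache.spark.sql.functions._

    // Take the most recent 14 prices for one ticker, then aggregate
    val last14 = prices
      .where(col("ticker") === "IBM")
      .orderBy(col("ts").desc)
      .limit(14)

    val stats = last14
      .agg(avg("price").as("avg"), stddev("price").as("stddev"))
      .collect()(0)

    val avgPrice = stats.getDouble(0)
    val stdPrice = stats.getDouble(1)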

Re: Checking if cascading graph computation is possible in Spark

2019-04-05 Thread Jason Nerothin
*I guess I was focusing on this:* #2 I want to do the above in an event-driven way, *without using batches* (I tried micro-batches, but I realised that's not what I want), i.e., *for each arriving event, as soon as an event message comes into my stream, not by accumulating events.* If you want
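For truly per-record processing, one option is Spark's experimental continuous trigger (available since 2.3). A hedged sketch (the Kafka topic, servers, and checkpoint path are placeholders):

    import org.apache.spark.sql.streaming.Trigger

    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "host:9092")
      .option("subscribe", "events")
      .load()

    events.writeStream
      .format("console")
      .option("checkpointLocation", "/tmp/checkpoints")
      .trigger(Trigger.Continuous("1 second"))  // checkpoint interval, not a batch interval
      .start()

Note that continuous mode supports only map-like operations (no aggregations or joins).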

Re: Checking if cascading graph computation is possible in Spark

2019-04-05 Thread Basavaraj
I have checked broadcast of accumulated values, but not Arbitrary Stateful Streaming. But I am not sure how that helps here. On Fri, 5 Apr 2019, 10:13 pm Jason Nerothin, wrote: > Have you looked at Arbitrary Stateful Streaming and Broadcast Accumulators? > > On Fri, Apr 5, 2019 at 10:55 AM

Re: combineByKey

2019-04-05 Thread Jason Nerothin
Take a look at this SO post: https://stackoverflow.com/questions/24804619/how-does-spark-aggregate-function-aggregatebykey-work On Fri, Apr 5, 2019 at 12:25 PM Madabhattula Rajesh Kumar < mrajaf...@gmail.com> wrote: > Hi, > > Thank you for the details. It was a typo made while composing the mail. >
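A small runnable illustration of aggregateByKey, in the spirit of the linked answer (the data is made up):

    // Compute a per-key (sum, count) in one pass
    val pairs = sc.parallelize(Seq(("d1", 1), ("d1", 2), ("d2", 3)))

    val sums = pairs.aggregateByKey((0, 0))(
      (acc, v) => (acc._1 + v, acc._2 + 1),    // seqOp: fold a value into the accumulator
      (a, b)   => (a._1 + b._1, a._2 + b._2)   // combOp: merge accumulators across partitions
    )

    sums.collect()  // Array((d1,(3,2)), (d2,(3,1)))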

Re: combineByKey

2019-04-05 Thread Madabhattula Rajesh Kumar
Hi, Thank you for the details. It was a typo made while composing the mail. Below is the actual flow. Any idea why combineByKey is not working? aggregateByKey is working.

    // Defining createCombiner, mergeValue and mergeCombiner functions
    def createCombiner = (Id: String, value: String) =>
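A hedged guess at the failure: combineByKey expects createCombiner to take a single value (V => C), whereas the definition above takes two parameters. A sketch with the expected shapes (the types are chosen for illustration):

    val rdd = sc.parallelize(Seq(("k1", ("d1", "t1")), ("k1", ("d1", "t2"))))

    val combined = rdd.combineByKey(
      (v: (String, String)) => List(v),                            // createCombiner: V => C
      (c: List[(String, String)], v: (String, String)) => v :: c,  // mergeValue: (C, V) => C
      (c1: List[(String, String)], c2: List[(String, String)]) => c1 ::: c2  // mergeCombiners
    )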

Re: Checking if cascading graph computation is possible in Spark

2019-04-05 Thread Jason Nerothin
Have you looked at Arbitrary Stateful Streaming and Broadcast Accumulators? On Fri, Apr 5, 2019 at 10:55 AM Basavaraj wrote: > Hi > > Have two questions > > #1 > I am trying to process events in real time; the outcome of the processing has > to find a node in GraphX and update that node as well
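A hedged sketch of the Arbitrary Stateful Streaming suggestion via mapGroupsWithState (the event and state types are invented for illustration):

    import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout}
    import spark.implicits._

    case class Event(nodeId: String, value: Double)
    case class NodeState(count: Long, sum: Double)

    // Fold each group of arriving events for a node into its running state
    def update(nodeId: String, events: Iterator[Event],
               state: GroupState[NodeState]): NodeState = {
      val old  = state.getOption.getOrElse(NodeState(0L, 0.0))
      val next = events.foldLeft(old)((s, e) => NodeState(s.count + 1, s.sum + e.value))
      state.update(next)
      next
    }

    val updated = eventStream            // a streaming Dataset[Event], assumed
      .groupByKey(_.nodeId)
      .mapGroupsWithState(GroupStateTimeout.NoTimeout)(update)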

Re: combineByKey

2019-04-05 Thread Jason Nerothin
I broke some of your code down into the following lines:

    import spark.implicits._

    val a: RDD[Messages] = sc.parallelize(messages)
    val b: Dataset[Messages] = a.toDF.as[Messages]
    val c: Dataset[(String, (String, String))] =
      b.map { x => (x.timeStamp + "-" + x.Id, (x.Id, x.value)) }
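Continuing that breakdown, the typed API can finish the grouping without dropping to RDDs (a sketch, not from the thread):

    // Group the keyed pairs and collect the values per key
    val d = c.groupByKey(_._1)
      .mapGroups((key, rows) => (key, rows.map(_._2).toList))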

Checking if cascading graph computation is possible in Spark

2019-04-05 Thread Basavaraj
Hi Have two questions. #1 I am trying to process events in real time; the outcome of the processing has to find a node in GraphX and update that node as well (in case of any anomaly or state change). If a node is updated, I have to update the related nodes as well; want to know if GraphX can
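A hedged sketch of cascading updates with the GraphX Pregel API (the graph, the seed value, and the max-propagation semantics are invented for illustration):

    import org.apache.spark.graphx._
    import org.apache.spark.rdd.RDD

    // Vertex attribute = current state value; vertex 1 is seeded with an update
    val vertices: RDD[(VertexId, Double)] =
      sc.parallelize(Seq((1L, 5.0), (2L, 0.0), (3L, 0.0)))
    val edges: RDD[Edge[Double]] =
      sc.parallelize(Seq(Edge(1L, 2L, 1.0), Edge(2L, 3L, 1.0)))
    val graph = Graph(vertices, edges)

    // Pregel propagates the updated value to downstream (related) nodes
    val updated = graph.pregel(Double.NegativeInfinity, maxIterations = 3)(
      (id, attr, msg) => math.max(attr, msg),              // vertex program
      triplet =>                                           // send updates downstream
        if (triplet.srcAttr > triplet.dstAttr) Iterator((triplet.dstId, triplet.srcAttr))
        else Iterator.empty,
      (a, b) => math.max(a, b)                             // merge incoming messages
    )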

Is there any Spark API function to handle a group of companies at once in this scenario?

2019-04-05 Thread Shyam P
Hi, In my scenario I have a few companies for which I need to calculate a few stats, like the average, which need to be stored in Cassandra. For the next set of records I need to fetch the previously calculated stats and compute accumulated results over them (i.e. the present set of data + the previously stored stats), and
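A hedged sketch of the read-merge-write cycle being described, using the DataStax spark-cassandra-connector (the keyspace, table, and column names are placeholders, and the merge assumes simple sum/count stats):

    import org.apache.spark.sql.functions._

    // Previously stored stats
    val prev = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "stats", "table" -> "company_stats"))
      .load()
      .select("company", "sum", "count")

    // Stats for the next set of records (newRecords is an assumed DataFrame)
    val current = newRecords.groupBy("company")
      .agg(sum("value").as("sum"), count(lit(1)).as("count"))

    // Accumulate: previous totals + this batch's totals, then derive the avg
    val merged = prev.union(current)
      .groupBy("company")
      .agg(sum("sum").as("sum"), sum("count").as("count"))
      .withColumn("avg", col("sum") / col("count"))

    merged.write
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "stats", "table" -> "company_stats"))
      .mode("append")
      .save()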

combineByKey

2019-04-05 Thread Madabhattula Rajesh Kumar
Hi, Is there any issue in the below code?

    case class Messages(timeStamp: Int, Id: String, value: String)

    val messages = Array(
      Messages(0, "d1", "t1"),
      Messages(0, "d1", "t1"),
      Messages(0, "d1", "t1"),
      Messages(0, "d1", "t1"),
      Messages(0, "d1", "t2"),
      Messages(0,