Re: Calculate sum of values in 2nd element of tuple

2016-01-03 Thread robert_dodier
jimitkr wrote:
> I've tried fold, reduce, foldLeft but with no success in my below code to
> calculate total:
>
>   val valuesForDEF = input.lookup("def")
>   val totalForDEF: Int = valuesForDEF.toList.reduce((x: Int, y: Int) => x + y)
>   println("THE TOTAL FOR DEF IS" + totalForDEF)

Hmm, what
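The cut-off reply is most likely heading toward a type error: on an RDD[(String, List[Int])], lookup("def") returns a Seq[List[Int]], not a sequence of Ints, so a reduce over (x: Int, y: Int) cannot compile. A minimal local sketch of one fix, using a plain Seq to stand in for the lookup result (sample values taken from the original thread):

```scala
// lookup("def") on an RDD[(String, List[Int])] yields Seq[List[Int]],
// so reduce((x: Int, y: Int) => x + y) does not type-check on it.
val valuesForDEF: Seq[List[Int]] = Seq(List(5, 6, 7, 8)) // stand-in for input.lookup("def")

// Flatten the nested lists first, then sum the Ints.
val totalForDEF: Int = valuesForDEF.flatten.sum

println("THE TOTAL FOR DEF IS " + totalForDEF) // prints 26
```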

Re: translate algorithm in spark

2016-01-03 Thread robert_dodier
domibd wrote:
> find(v, collection) : boolean
> begin
>   item = collection.first // assuming collection has at least one item
>
>   while (item != v and collection has next item)
>     item = collection.nextItem
>
>   return item == v
> end

I'm not an expert, so
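For comparison, the linear scan above collapses to a one-liner both in plain Scala and in Spark. A sketch (the Spark variant in the comment assumes the collection is an RDD[Int]; the runnable version below uses a local Seq):

```scala
// Sequential pseudocode "find(v, collection)" expressed functionally.
// On a Spark RDD the equivalent would be:
//   !collection.filter(_ == v).isEmpty()   // assumes collection: RDD[Int]
// On a local Scala collection it is simply:
def find(v: Int, collection: Seq[Int]): Boolean =
  collection.contains(v)

println(find(3, Seq(1, 2, 3))) // true
println(find(9, Seq(1, 2, 3))) // false
```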

Unable to run spark SQL Join query.

2016-01-03 Thread ๏̯͡๏
Code:

  val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
  hiveContext.sql("drop table sojsuccessevents2_spark")
  hiveContext.sql("CREATE TABLE `sojsuccessevents2_spark`( `guid` string COMMENT 'from deserializer', `sessionkey` bigint COMMENT 'from deserializer',

Re: Unable to run spark SQL Join query.

2016-01-03 Thread Jins George
Column 'itemId' is not present in table 'success_events.sojsuccessevents1' or 'dw_bid'. Did you mean the 'sojsuccessevents2_spark' table in your select query?

Thanks,
Jins

On 01/03/2016 07:22 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote:
> Code: val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

Re: GLM in ml pipeline

2016-01-03 Thread Yanbo Liang
AFAIK, Spark MLlib will improve and support most GLM functions in the next release (Spark 2.0).

2016-01-03 23:02 GMT+08:00:
> keyStoneML could be an alternative.
>
> Ardo.
>
> On 03 Jan 2016, at 15:50, Arunkumar Pillai wrote:
>
> Is there any road

sql:Exception in thread "main" scala.MatchError: StringType

2016-01-03 Thread Bonsen
(sbt) scala:

  import org.apache.spark.SparkContext
  import org.apache.spark.SparkConf
  import org.apache.spark.sql
  object SimpleApp {
    def main(args: Array[String]) {
      val conf = new SparkConf()
      conf.setAppName("mytest").setMaster("spark://Master:7077")
      val sc = new SparkContext(conf)

Re: sql:Exception in thread "main" scala.MatchError: StringType

2016-01-03 Thread Jeff Zhang
Spark only supports one JSON object per line. You need to reformat your file.

On Mon, Jan 4, 2016 at 11:26 AM, Bonsen wrote:
> (sbt) scala:
> import org.apache.spark.SparkContext
> import org.apache.spark.SparkConf
> import org.apache.spark.sql
> object SimpleApp {
> def
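To illustrate Jeff's point: Spark 1.x's read.json treats each input line as one complete JSON document, so a pretty-printed object spanning several lines triggers errors like the MatchError above. One crude pre-processing sketch (assumes the file holds exactly one object; a real fix would use a proper JSON parser, and `collapseToOneLine` is a hypothetical helper, not a Spark API):

```scala
// Spark 1.x's sqlContext.read.json expects one complete JSON object
// per line. This toy helper collapses a single pretty-printed object
// into one line so read.json can parse it.
def collapseToOneLine(prettyJson: String): String =
  prettyJson.split("\n").map(_.trim).mkString("")

val pretty = "{\n  \"name\": \"abc\",\n  \"value\": 1\n}"
val oneLine = collapseToOneLine(pretty)
// oneLine can now be written out and read with sqlContext.read.json(path)
println(oneLine) // {"name": "abc","value": 1}
```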

Re: Calculate sum of values in 2nd element of tuple

2016-01-03 Thread Roberto Congiu
For the first one,

  input.map { case (x, l) => (x, l.reduce(_ + _)) }

will do what you need. For the second, yes, there's a difference: one is a List, the other is a Tuple. See for instance

  val a = (1,2,3)
  a.getClass.getName
  res4: String = scala.Tuple3

You should look up tuples

Re: GLM in ml pipeline

2016-01-03 Thread Arunkumar Pillai
Thanks, eagerly waiting for the next Spark release.

On Mon, Jan 4, 2016 at 7:36 AM, Yanbo Liang wrote:
> AFAIK, Spark MLlib will improve and support most GLM functions in the next
> release (Spark 2.0).
>
> 2016-01-03 23:02 GMT+08:00:
>
>> keyStoneML could be

Can a tempTable registered by sqlContext be used inside a forEachRDD?

2016-01-03 Thread SRK
Hi, Can a tempTable registered in sqlContext be used to query inside forEachRDD as shown below? My requirement is that I have a set of data in the form of parquet inside hdfs and I need to register the data as a tempTable using sqlContext and query it inside forEachRDD as shown below.
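For what it's worth, a sketch of the pattern being asked about. All paths, table names, and the query are hypothetical, and it assumes a DStream `stream` and a `sqlContext` already exist; the function passed to foreachRDD executes on the driver, so the driver-side sqlContext and its temp table are in scope there:

```scala
// Register parquet data from HDFS as a temp table (driver side).
val parquetDF = sqlContext.read.parquet("hdfs:///data/events.parquet") // hypothetical path
parquetDF.registerTempTable("events") // hypothetical table name

// Query it inside foreachRDD; the closure body runs on the driver,
// so sqlContext and the temp table are both visible.
stream.foreachRDD { rdd =>
  if (!rdd.isEmpty()) {
    val matches = sqlContext.sql("SELECT * FROM events WHERE id = 42") // hypothetical query
    matches.show()
  }
}
```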

GLM in ml pipeline

2016-01-03 Thread Arunkumar Pillai
Is there any road map for glm in pipeline?

Re: GLM in ml pipeline

2016-01-03 Thread ndjido
keyStoneML could be an alternative.

Ardo.

> On 03 Jan 2016, at 15:50, Arunkumar Pillai wrote:
>
> Is there any road map for glm in pipeline?

Re: Can a tempTable registered by sqlContext be used inside a forEachRDD?

2016-01-03 Thread Sathish Kumaran Vairavelu
I think you can use foreachPartition instead of foreachRDD.

Sathish

On Sun, Jan 3, 2016 at 5:51 AM SRK wrote:
> Hi,
>
> Can a tempTable registered in sqlContext be used to query inside forEachRDD
> as shown below?
> My requirement is that I have a set of data in the

subscribe

2016-01-03 Thread Rajdeep Dua

Re: subscribe

2016-01-03 Thread prayag chandran
You should email users-subscr...@kafka.apache.org if you are trying to subscribe.

On 3 January 2016 at 11:52, Rajdeep Dua wrote:

Calculate sum of values in 2nd element of tuple

2016-01-03 Thread jimitkr
Hi,

I've created tuples of type (String, List[Int]) and want to sum the values in the List[Int] part, i.e. the 2nd element in each tuple. Here is my list:

  val input = sc.parallelize(List(("abc", List(1,2,3,4)), ("def", List(5,6,7,8))))

I want to sum up values in the 2nd element of the tuple so
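Since the message is cut off: given this structure, the per-key sums can be obtained directly with a map over the pairs. A sketch using a plain Scala List so it runs without a SparkContext; the same call works unchanged on the pair RDD:

```scala
val input = List(("abc", List(1, 2, 3, 4)), ("def", List(5, 6, 7, 8)))

// Sum the List[Int] in the second element of each tuple.
// On the RDD version, input.mapValues(_.sum) does the same thing.
val sums = input.map { case (k, values) => (k, values.sum) }

println(sums) // List((abc,10), (def,26))
```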