Re: Calculate sum of values in 2nd element of tuple

2016-01-03 Thread robert_dodier
jimitkr wrote
> I've tried fold, reduce, foldLeft but with no success in my below code to
> calculate total:
>
> val valuesForDEF=input.lookup("def")
> val totalForDEF: Int = valuesForDEF.toList.reduce((x: Int,y: Int)=>x+y)
> println("THE TOTAL FOR DEF IS" + totalForDEF)

Hmm, what exactly is the error message you get? From what I can tell, that
should work as expected.


> Another query. What will be the difference between the following tuples
> when created:
>
>   val input=sc.parallelize(List(("abc",List(1,2,3,4)),("def",List(5,6,7,8))))
>
>   val input=sc.parallelize(List(("abc",(1,2,3,4)),("def",(5,6,7,8))))
>
> Is there a difference in how (1,2,3,4) and List(1,2,3,4) are handled?

Well, the difference is that (1, 2, 3, 4) is a Tuple4 instead of a List. In
Scala, Tuples have some things in common with Lists and some differences.
You can probably find some discussion about that via a web search.

Depending on what you're trying to do, you'll prefer one or the other. I
believe that in the example you gave before, you want a List, since reduce is
not defined for Tuples.
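
To make that concrete, here's a quick illustration in plain Scala (nothing
Spark-specific):

    // A List is a homogeneous collection, so reduce is available:
    val xs = List(1, 2, 3, 4)
    xs.reduce(_ + _)            // 10

    // A Tuple4 is a fixed-arity value with positional fields and no reduce;
    // you access its elements as _1 .. _4 instead:
    val t = (1, 2, 3, 4)
    t._1 + t._2 + t._3 + t._4   // 10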

Hope this helps,

Robert Dodier







Re: translate algorithm in spark

2016-01-03 Thread robert_dodier
domibd wrote
> find(v, collection) : boolean
> begin
>   item = collection.first   // assuming collection has at least one item
>
>   while (item != v and collection has next item)
>     item = collection.nextItem
>
>   return item == v
> end

I'm not an expert, so take my advice with a grain of salt. Anyway, one idea
you can try is to write a search function that works on the values in one
partition -- that part is sequential, not parallel. Then call mapPartitions
to map that function over all partitions in an RDD. Presumably you will then
need to reduce the output of mapPartitions (which, I guess, will be a
collection of Boolean values) by taking the logical disjunction (i.e., a or b)
of those values.
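
Here is a rough sketch of that idea (the names are just illustrative, not from
your post; an empty RDD is treated as "not found"):

    import org.apache.spark.rdd.RDD

    // Search each partition sequentially, then OR the per-partition answers.
    def existsInRDD[T](data: RDD[T], v: T): Boolean =
      data
        .mapPartitions(iter => Iterator(iter.contains(v)))  // one Boolean per partition
        .fold(false)(_ || _)                                // logical disjunction

For example, existsInRDD(sc.parallelize(1 to 100), 42) should return true.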

Hope this helps you figure out a solution.

Robert Dodier






Unable to run spark SQL Join query.

2016-01-03 Thread ๏̯͡๏
Code:

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

hiveContext.sql("drop table sojsuccessevents2_spark")

hiveContext.sql("CREATE TABLE `sojsuccessevents2_spark`( `guid` string
COMMENT 'from deserializer', `sessionkey` bigint COMMENT 'from
deserializer', `sessionstartdate` string COMMENT 'from deserializer',
`sojdatadate` string COMMENT 'from deserializer', `seqnum` int COMMENT
'from deserializer', `eventtimestamp` string COMMENT 'from deserializer',
`siteid` int COMMENT 'from deserializer', `successeventtype` string COMMENT
'from deserializer', `sourcetype` string COMMENT 'from deserializer',
`itemid` bigint COMMENT 'from deserializer', `shopcartid` bigint COMMENT
'from deserializer', `transactionid` bigint COMMENT 'from deserializer',
`offerid` bigint COMMENT 'from deserializer', `userid` bigint COMMENT 'from
deserializer', `priorpage1seqnum` int COMMENT 'from deserializer',
`priorpage1pageid` int COMMENT 'from deserializer',
`exclwmsearchattemptseqnum` int COMMENT 'from deserializer',
`exclpriorsearchpageid` int COMMENT 'from deserializer',
`exclpriorsearchseqnum` int COMMENT 'from deserializer',
`exclpriorsearchcategory` int COMMENT 'from deserializer',
`exclpriorsearchl1` int COMMENT 'from deserializer', `exclpriorsearchl2`
int COMMENT 'from deserializer', `currentimpressionid` bigint COMMENT 'from
deserializer', `sourceimpressionid` bigint COMMENT 'from deserializer',
`exclpriorsearchsqr` string COMMENT 'from deserializer',
`exclpriorsearchsort` string COMMENT 'from deserializer', `isduplicate` int
COMMENT 'from deserializer', `transactiondate` string COMMENT 'from
deserializer', `auctiontypecode` int COMMENT 'from deserializer', `isbin`
int COMMENT 'from deserializer', `leafcategoryid` int COMMENT 'from
deserializer', `itemsiteid` int COMMENT 'from deserializer', `bidquantity`
int COMMENT 'from deserializer', `bidamtusd` double COMMENT 'from
deserializer', `offerquantity` int COMMENT 'from deserializer',
`offeramountusd` double COMMENT 'from deserializer', `offercreatedate`
string COMMENT 'from deserializer', `buyersegment` string COMMENT 'from
deserializer', `buyercountryid` int COMMENT 'from deserializer', `sellerid`
bigint COMMENT 'from deserializer', `sellercountryid` int COMMENT 'from
deserializer', `sellerstdlevel` string COMMENT 'from deserializer',
`csssellerlevel` string COMMENT 'from deserializer', `experimentchannel`
int COMMENT 'from deserializer') ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' LOCATION
'hdfs://
apollo-phx-nn.vip.ebay.com:8020/user/dvasthimal/spark/successeventstaging/sojsuccessevents2'
TBLPROPERTIES ( 'avro.schema.literal'='{\"type\":\"record\",\"name\":\"
success\",\"namespace\":\"Reporting.detail\",\"doc\":\"\",\"fields\":[{\"
name\":\"guid\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String
\"},\"doc\":\"\",\"default\":\"\"},{\"name\":\"sessionKey\",\"type\":\"long
\",\"doc\":\"\",\"default\":0},{\"name\":\"sessionStartDate\",\"type\":{\"
type\":\"string\",\"avro.java.string\":\"String\"},\"doc\":\"\",\"default\":
\"\"},{\"name\":\"sojDataDate\",\"type\":{\"type\":\"string\",\"
avro.java.string\":\"String\"},\"doc\":\"\",\"default\":\"\"},{\"name\":\"
seqNum\",\"type\":\"int\",\"doc\":\"\",\"default\":0},{\"name\":\"
eventTimestamp\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String
\"},\"doc\":\"\",\"default\":\"\"},{\"name\":\"siteId\",\"type\":\"int\",\"
doc\":\"\",\"default\":0},{\"name\":\"successEventType\",\"type\":{\"type\":
\"string\",\"avro.java.string\":\"String\"},\"doc\":\"\",\"default\":\"\"},{
\"name\":\"sourceType\",\"type\":{\"type\":\"string\",\"avro.java.string\":
\"String\"},\"doc\":\"\",\"default\":\"\"},{\"name\":\"itemId\",\"type\":\"
long\",\"doc\":\"\",\"default\":0},{\"name\":\"shopCartId\",\"type\":\"long
\",\"doc\":\"\",\"default\":0},{\"name\":\"transactionId\",\"type\":\"long\"
,\"doc\":\"\",\"default\":0},{\"name\":\"offerId\",\"type\":\"long\",\"doc\"
:\"\",\"default\":0},{\"name\":\"userId\",\"type\":\"long\",\"doc\":\"\",\"
default\":0},{\"name\":\"priorPage1SeqNum\",\"type\":\"int\",\"doc\":\"\",\"
default\":0},{\"name\":\"priorPage1PageId\",\"type\":\"int\",\"doc\":\"\",\"
default\":0},{\"name\":\"exclWMSearchAttemptSeqNum\",\"type\":\"int\",\"doc
\":\"\",\"default\":0},{\"name\":\"exclPriorSearchPageId\",\"type\":\"int\",
\"doc\":\"\",\"default\":0},{\"name\":\"exclPriorSearchSeqNum\",\"type\":\"
int\",\"doc\":\"\",\"default\":0},{\"name\":\"exclPriorSearchCategory\",\"
type\":\"int\",\"doc\":\"\",\"default\":0},{\"name\":\"exclPriorSearchL1\",
\"type\":\"int\",\"doc\":\"\",\"default\":0},{\"name\":\"exclPriorSearchL2\"
,\"type\":\"int\",\"doc\":\"\",\"default\":0},{\"name\":\"
currentImpressionId\",\"type\":\"long\",\"doc\":\"\",\"default\":0},{\"name

Re: Unable to run spark SQL Join query.

2016-01-03 Thread Jins George
Column 'itemId' is not present in table 
'success_events.sojsuccessevents1' or  'dw_bid'


Did you mean the 'sojsuccessevents2_spark' table in your select query?

Thanks,
Jins
On 01/03/2016 07:22 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote:

Code:

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

hiveContext.sql("drop table sojsuccessevents2_spark")

hiveContext.sql("CREATE TABLE `sojsuccessevents2_spark`( `guid` string 
COMMENT 'from deserializer', `sessionkey` bigint COMMENT 'from 
deserializer', `sessionstartdate` string COMMENT 'from deserializer', 
`sojdatadate` string COMMENT 'from deserializer', `seqnum` int COMMENT 
'from deserializer', `eventtimestamp` string COMMENT 'from 
deserializer', `siteid` int COMMENT 'from deserializer', 
`successeventtype` string COMMENT 'from deserializer', `sourcetype` 
string COMMENT 'from deserializer', `itemid` bigint COMMENT 'from 
deserializer', `shopcartid` bigint COMMENT 'from deserializer', 
`transactionid` bigint COMMENT 'from deserializer', `offerid` bigint 
COMMENT 'from deserializer', `userid` bigint COMMENT 'from 
deserializer', `priorpage1seqnum` int COMMENT 'from deserializer', 
`priorpage1pageid` int COMMENT 'from deserializer', 
`exclwmsearchattemptseqnum` int COMMENT 'from deserializer', 
`exclpriorsearchpageid` int COMMENT 'from deserializer', 
`exclpriorsearchseqnum` int COMMENT 'from deserializer', 
`exclpriorsearchcategory` int COMMENT 'from deserializer', 
`exclpriorsearchl1` int COMMENT 'from deserializer', 
`exclpriorsearchl2` int COMMENT 'from deserializer', 
`currentimpressionid` bigint COMMENT 'from deserializer', 
`sourceimpressionid` bigint COMMENT 'from deserializer', 
`exclpriorsearchsqr` string COMMENT 'from deserializer', 
`exclpriorsearchsort` string COMMENT 'from deserializer', 
`isduplicate` int COMMENT 'from deserializer', `transactiondate` 
string COMMENT 'from deserializer', `auctiontypecode` int COMMENT 
'from deserializer', `isbin` int COMMENT 'from deserializer', 
`leafcategoryid` int COMMENT 'from deserializer', `itemsiteid` int 
COMMENT 'from deserializer', `bidquantity` int COMMENT 'from 
deserializer', `bidamtusd` double COMMENT 'from deserializer', 
`offerquantity` int COMMENT 'from deserializer', `offeramountusd` 
double COMMENT 'from deserializer', `offercreatedate` string COMMENT 
'from deserializer', `buyersegment` string COMMENT 'from 
deserializer', `buyercountryid` int COMMENT 'from deserializer', 
`sellerid` bigint COMMENT 'from deserializer', `sellercountryid` int 
COMMENT 'from deserializer', `sellerstdlevel` string COMMENT 'from 
deserializer', `csssellerlevel` string COMMENT 'from deserializer', 
`experimentchannel` int COMMENT 'from deserializer') ROW FORMAT SERDE 
'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' 
OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' LOCATION 
'hdfs://apollo-phx-nn.vip.ebay.com:8020/user/dvasthimal/spark/successeventstaging/sojsuccessevents2 
' 
TBLPROPERTIES ( 

Re: GLM in ml pipeline

2016-01-03 Thread Yanbo Liang
AFAIK, Spark MLlib will improve and support most GLM functions in the next
release (Spark 2.0).

2016-01-03 23:02 GMT+08:00 :

> keyStoneML could be an alternative.
>
> Ardo.
>
> On 03 Jan 2016, at 15:50, Arunkumar Pillai 
> wrote:
>
> Is there any road map for glm in pipeline?
>
>


sql:Exception in thread "main" scala.MatchError: StringType

2016-01-03 Thread Bonsen
(sbt) scala:
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql
object SimpleApp {
  def main(args: Array[String]) {
val conf = new SparkConf()
conf.setAppName("mytest").setMaster("spark://Master:7077")
val sc = new SparkContext(conf)
val sqlContext = new sql.SQLContext(sc)
    val d = sqlContext.read.json("/home/hadoop/2015data_test/Data/Data/100808cb11e9898816ef15fcdde4e1d74cbc0b/Db6Jh2XeQ.json")
sc.stop()
  }
}
__
after sbt package :
./spark-submit --class "SimpleApp" 
/home/hadoop/Downloads/sbt/bin/target/scala-2.10/simple-project_2.10-1.0.jar
___
json file:
{
"programmers": [
{
"firstName": "Brett",
"lastName": "McLaughlin",
"email": ""
},
{
"firstName": "Jason",
"lastName": "Hunter",
"email": ""
},
{
"firstName": "Elliotte",
"lastName": "Harold",
"email": ""
}
],
"authors": [
{
"firstName": "Isaac",
"lastName": "Asimov",
"genre": "sciencefiction"
},
{
"firstName": "Tad",
"lastName": "Williams",
"genre": "fantasy"
},
{
"firstName": "Frank",
"lastName": "Peretti",
"genre": "christianfiction"
}
],
"musicians": [
{
"firstName": "Eric",
"lastName": "Clapton",
"instrument": "guitar"
},
{
"firstName": "Sergei",
"lastName": "Rachmaninoff",
"instrument": "piano"
}
]
}
___
Exception in thread "main" scala.MatchError: StringType (of class
org.apache.spark.sql.types.StringType$)
at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
at
org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
___
Why does this happen?






Re: sql:Exception in thread "main" scala.MatchError: StringType

2016-01-03 Thread Jeff Zhang
Spark only supports one JSON object per line. You need to reformat your
file.
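
For example, a line-delimited version of that file would put one complete
record per line, something like this (just a sketch reusing the field names
from your JSON; how you group the records into lines is up to you):

    {"firstName": "Brett", "lastName": "McLaughlin", "email": ""}
    {"firstName": "Jason", "lastName": "Hunter", "email": ""}
    {"firstName": "Elliotte", "lastName": "Harold", "email": ""}

sqlContext.read.json should then infer a schema with firstName, lastName and
email columns.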

On Mon, Jan 4, 2016 at 11:26 AM, Bonsen  wrote:

> (sbt) scala:
> import org.apache.spark.SparkContext
> import org.apache.spark.SparkConf
> import org.apache.spark.sql
> object SimpleApp {
>   def main(args: Array[String]) {
> val conf = new SparkConf()
> conf.setAppName("mytest").setMaster("spark://Master:7077")
> val sc = new SparkContext(conf)
> val sqlContext = new sql.SQLContext(sc)
> val
>
> d=sqlContext.read.json("/home/hadoop/2015data_test/Data/Data/100808cb11e9898816ef15fcdde4e1d74cbc0b/Db6Jh2XeQ.json")
> sc.stop()
>   }
> }
>
> __
> after sbt package :
> ./spark-submit --class "SimpleApp"
>
> /home/hadoop/Downloads/sbt/bin/target/scala-2.10/simple-project_2.10-1.0.jar
>
> ___
> json fIle:
> {
> "programmers": [
> {
> "firstName": "Brett",
> "lastName": "McLaughlin",
> "email": ""
> },
> {
> "firstName": "Jason",
> "lastName": "Hunter",
> "email": ""
> },
> {
> "firstName": "Elliotte",
> "lastName": "Harold",
> "email": ""
> }
> ],
> "authors": [
> {
> "firstName": "Isaac",
> "lastName": "Asimov",
> "genre": "sciencefiction"
> },
> {
> "firstName": "Tad",
> "lastName": "Williams",
> "genre": "fantasy"
> },
> {
> "firstName": "Frank",
> "lastName": "Peretti",
> "genre": "christianfiction"
> }
> ],
> "musicians": [
> {
> "firstName": "Eric",
> "lastName": "Clapton",
> "instrument": "guitar"
> },
> {
> "firstName": "Sergei",
> "lastName": "Rachmaninoff",
> "instrument": "piano"
> }
> ]
> }
>
> ___
> Exception in thread "main" scala.MatchError: StringType (of class
> org.apache.spark.sql.types.StringType$)
> at
> org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
> at
>
> org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
>
> ___
> why
>
>
>


-- 
Best Regards

Jeff Zhang


Re: Calculate sum of values in 2nd element of tuple

2016-01-03 Thread Roberto Congiu
For the first one,

 input.map { case(x,l) => (x, l.reduce(_ + _) ) }

will do what you need.
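
For example, with the input from your first snippet, something like this is
what I'd expect (a sketch of the REPL output):

    val input = sc.parallelize(List(("abc", List(1,2,3,4)), ("def", List(5,6,7,8))))
    input.map { case (x, l) => (x, l.reduce(_ + _)) }.collect()
    // Array((abc,10), (def,26))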
For the second, yes, there's a difference: one is a List, the other is a
Tuple. See for instance:
val a = (1,2,3)
a.getClass.getName
res4: String = scala.Tuple3

You should look up tuples in the Scala docs, as they are not specific to
Spark; in particular, read up on case classes and pattern matching.
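
For instance, the map above relies on pattern matching against a Tuple2; a
standalone sketch:

    val pair = ("abc", List(1, 2, 3, 4))
    pair match {
      case (key, values) => println(key + " -> " + values.sum)   // prints: abc -> 10
    }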


2016-01-03 12:00 GMT-08:00 jimitkr :

> Hi,
>
> I've created tuples of type (String, List[Int]) and want to sum the values
> in the List[Int] part, i.e. the 2nd element in each tuple.
>
> Here is my list
>   val input=sc.parallelize(List(("abc",List(1,2,3,4)),("def",List(5,6,7,8))))
>
> I want to sum up values in the 2nd element of the tuple so that the output
> is
> (abc,10)
> (def, 26)
>
> I've tried fold, reduce, foldLeft but with no success in my below code to
> calculate total:
> val valuesForDEF=input.lookup("def")
> val totalForDEF: Int = valuesForDEF.toList.reduce((x: Int,y: Int)=>x+y)
> println("THE TOTAL FOR DEF IS" + totalForDEF)
>
> How do i calculate the total?
>
> Another query. What will be the difference between the following tuples
> when
> created:
>   val input=sc.parallelize(List(("abc",List(1,2,3,4)),("def",List(5,6,7,8))))
>   val input=sc.parallelize(List(("abc",(1,2,3,4)),("def",(5,6,7,8))))
>
> Is there a difference in how (1,2,3,4) and List(1,2,3,4) are handled?
>
>
>


-- 
--
"Good judgment comes from experience.
Experience comes from bad judgment"
--


Re: GLM in ml pipeline

2016-01-03 Thread Arunkumar Pillai
Thanks, eagerly waiting for the next Spark release.

On Mon, Jan 4, 2016 at 7:36 AM, Yanbo Liang  wrote:

> AFAIK, Spark MLlib will improve and support most GLM functions in the next
> release(Spark 2.0).
>
> 2016-01-03 23:02 GMT+08:00 :
>
>> keyStoneML could be an alternative.
>>
>> Ardo.
>>
>> On 03 Jan 2016, at 15:50, Arunkumar Pillai 
>> wrote:
>>
>> Is there any road map for glm in pipeline?
>>
>>
>


-- 
Thanks and Regards
Arun


Can a tempTable registered by sqlContext be used inside a forEachRDD?

2016-01-03 Thread SRK
Hi,

Can a tempTable registered in sqlContext be used in a query inside foreachRDD
as shown below?
My requirement is that I have a set of data in the form of Parquet files inside
HDFS, and I need to register that data as a tempTable using sqlContext and then
query it inside foreachRDD, as shown below.

  sqlContext.registerTempTable("tempTable")

messages.foreachRDD { rdd =>
  val message:RDD[String] = rdd.map { y => y._2 }

  sqlContext.sql("SELECT time,To FROM tempTable")
}

Thanks!






GLM in ml pipeline

2016-01-03 Thread Arunkumar Pillai
Is there any road map for glm in pipeline?


Re: GLM in ml pipeline

2016-01-03 Thread ndjido
KeystoneML could be an alternative.

Ardo.

> On 03 Jan 2016, at 15:50, Arunkumar Pillai  wrote:
> 
> Is there any road map for glm in pipeline?


Re: Can a tempTable registered by sqlContext be used inside a forEachRDD?

2016-01-03 Thread Sathish Kumaran Vairavelu
I think you can use foreachPartition instead of foreachRDD.


Sathish
On Sun, Jan 3, 2016 at 5:51 AM SRK  wrote:

> Hi,
>
> Can a tempTable registered in sqlContext be used to query inside forEachRDD
> as shown below?
> My requirement is that I have a set of data in the form of parquet inside
> hdfs and I need to register the data
> as a tempTable using sqlContext and query it inside forEachRDD as shown
> below.
>
>   sqlContext.registerTempTable("tempTable")
>
> messages.foreachRDD { rdd =>
>   val message:RDD[String] = rdd.map { y => y._2 }
>
>   sqlContext.sql("SELECT time,To FROM tempTable")
> }
>
> Thanks!
>
>
>


subscribe

2016-01-03 Thread Rajdeep Dua



Re: subscribe

2016-01-03 Thread prayag chandran
You should email users-subscr...@kafka.apache.org if you are trying to
subscribe.

On 3 January 2016 at 11:52, Rajdeep Dua  wrote:

>
>


Calculate sum of values in 2nd element of tuple

2016-01-03 Thread jimitkr
Hi,

I've created tuples of type (String, List[Int]) and want to sum the values
in the List[Int] part, i.e. the 2nd element in each tuple.

Here is my list
  val input=sc.parallelize(List(("abc",List(1,2,3,4)),("def",List(5,6,7,8))))

I want to sum up values in the 2nd element of the tuple so that the output
is
(abc,10)
(def, 26)

I've tried fold, reduce, foldLeft but with no success in my below code to
calculate total:
val valuesForDEF=input.lookup("def")
val totalForDEF: Int = valuesForDEF.toList.reduce((x: Int,y: Int)=>x+y)
println("THE TOTAL FOR DEF IS" + totalForDEF)

How do i calculate the total?

Another query. What will be the difference between the following tuples when
created:
  val input=sc.parallelize(List(("abc",List(1,2,3,4)),("def",List(5,6,7,8))))
  val input=sc.parallelize(List(("abc",(1,2,3,4)),("def",(5,6,7,8))))

Is there a difference in how (1,2,3,4) and List(1,2,3,4) are handled?


