Re: Put all elements of RDD into array
The ArrayBuffer did the trick! Thanks a lot! I'm learning Scala through Spark, so these details are still new to me.

> On Jan 11, 2016, at 5:18 PM, Jakob Odersky wrote:
>
> Hey,
> I just reread your question and saw I overlooked some crucial information.
> Here's a solution:
>
> val data =
> model.asInstanceOf[DistributedLDAModel].topicDistributions.sortByKey().collect()
>
> val tpdist = data.map(doc => doc._2.toArray)
>
> hope it works this time
Re: Put all elements of RDD into array
Hey,

I just reread your question and saw I overlooked some crucial information. Here's a solution:

val data = model.asInstanceOf[DistributedLDAModel].topicDistributions.sortByKey().collect()

val tpdist = data.map(doc => doc._2.toArray)

hope it works this time

On 11 January 2016 at 17:14, Jakob Odersky wrote:

> Hi Daniel,
>
> You're actually not modifying the original array: `array :+ x` will give
> you a new array with `x` appended to it.
> In your case the fix is simple: collect() already returns an array, use it
> as the assignment value to your val.
>
> In case you ever want to append values iteratively, search for how to use
> scala "ArrayBuffer"s. Also, keep in mind that RDDs have a foreach method,
> so no need to call collect followed by foreach.
>
> regards,
> --Jakob
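The collect-then-map approach in this reply can be tried without a cluster. The following is only a minimal sketch in plain Scala, assuming a local array of `(Long, Vector[Double])` pairs as a stand-in for what `topicDistributions.collect()` returns; `CollectDemo`, `valuesInKeyOrder`, and the sample data are hypothetical names and values, not part of Spark or of the original code.

```scala
object CollectDemo {
  // Assumption: `pairs` stands in for the collected (docId, topicDistribution)
  // pairs. A local sortBy replaces the RDD's sortByKey(), and a plain Scala
  // Vector replaces the MLlib vector type.
  def valuesInKeyOrder(pairs: Array[(Long, Vector[Double])]): Array[Array[Double]] =
    pairs
      .sortBy(_._1)                 // order by document id
      .map(doc => doc._2.toArray)   // keep only the values, as arrays

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data, deliberately out of key order.
    val collected = Array(
      (2L, Vector(0.3, 0.7)),
      (0L, Vector(0.9, 0.1)),
      (1L, Vector(0.5, 0.5))
    )
    val tpdist = valuesInKeyOrder(collected)
    println(tpdist.length) // prints 3
  }
}
```

The result of `map` is assigned directly to a `val`, so no empty array has to be declared and filled afterwards.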
Re: Put all elements of RDD into array
Hi Daniel,

You're actually not modifying the original array: `array :+ x` will give you a new array with `x` appended to it.
In your case the fix is simple: collect() already returns an array, use it as the assignment value to your val.

In case you ever want to append values iteratively, search for how to use scala "ArrayBuffer"s. Also, keep in mind that RDDs have a foreach method, so no need to call collect followed by foreach.

regards,
--Jakob

On 11 January 2016 at 16:55, Daniel Valdivia wrote:

> Hello,
>
> I'm trying to put all the values in pair rdd into an array (or list) for
> later storing, however even if I'm collecting the data then pushing it to
> the array the array size after the run is 0.
>
> Any idea on what I'm missing?
>
> Thanks in advance
>
> scala> val tpdist: Array[Array[Double]] = Array()
> tpdist: Array[Array[Double]] = Array()
>
> scala>
> ldaModel.asInstanceOf[DistributedLDAModel].topicDistributions.sortByKey().collect().foreach(doc
> => tpdist :+ doc._2.toArray )
>
> scala> tpdist.size
> res27: Int = 0
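The difference between `:+` and an `ArrayBuffer` described in this reply can be reproduced with plain Scala collections. This is a small sketch, not the original poster's code: `AppendDemo`, both helper functions, and the sample rows are hypothetical, and no Spark is involved.

```scala
import scala.collection.mutable.ArrayBuffer

object AppendDemo {
  // `:+` builds and returns a NEW array; the array it is called on is unchanged.
  // Discarding the result inside foreach therefore leaves `acc` empty.
  def appendIgnoringResult(rows: Seq[Array[Double]]): Array[Array[Double]] = {
    val acc: Array[Array[Double]] = Array()
    rows.foreach(row => acc :+ row) // result thrown away on every iteration
    acc
  }

  // ArrayBuffer is mutable: `+=` appends to the buffer in place.
  def appendWithBuffer(rows: Seq[Array[Double]]): ArrayBuffer[Array[Double]] = {
    val acc = ArrayBuffer.empty[Array[Double]]
    rows.foreach(row => acc += row)
    acc
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq(Array(0.1, 0.9), Array(0.5, 0.5)) // hypothetical sample rows
    println(appendIgnoringResult(rows).length) // prints 0
    println(appendWithBuffer(rows).length)     // prints 2
  }
}
```

One caveat, assuming a real Spark job: this pattern applies to driver-side collections, for example after `collect()`. A closure passed to an RDD's own `foreach` runs on the executors, so appending to a driver-side buffer from there would generally not populate it.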
Put all elements of RDD into array
Hello,

I'm trying to put all the values of a pair RDD into an array (or list) for later storage. However, even though I collect the data and then push it to the array, the array size after the run is 0.

Any idea what I'm missing?

Thanks in advance

scala> val tpdist: Array[Array[Double]] = Array()
tpdist: Array[Array[Double]] = Array()

scala> ldaModel.asInstanceOf[DistributedLDAModel].topicDistributions.sortByKey().collect().foreach(doc => tpdist :+ doc._2.toArray )

scala> tpdist.size
res27: Int = 0