Re: Put all elements of RDD into array

2016-01-11 Thread Daniel Valdivia
The ArrayBuffer did the trick!

Thanks a lot! I'm learning Scala through Spark, so these details are still new
to me.



Re: Put all elements of RDD into array

2016-01-11 Thread Jakob Odersky
Hey,
I just reread your question and saw I overlooked some crucial information.
Here's a solution:

val data =
  ldaModel.asInstanceOf[DistributedLDAModel].topicDistributions.sortByKey().collect()

val tpdist = data.map(doc => doc._2.toArray)

Hope it works this time.
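
The shape of this transformation can be tried without a cluster. The following is only a minimal sketch: it assumes the collected result is an array of (document id, distribution) pairs, uses plain Array[Double] as a stand-in for the MLlib vector type, and the object name, helper names, and sample numbers are invented for illustration.

```scala
object CollectThenMap {
  // Hypothetical stand-in for the (documentId, topicDistribution) pairs;
  // the values are made up and deliberately out of order.
  val pairs: Array[(Long, Array[Double])] = Array(
    (2L, Array(0.1, 0.9)),
    (0L, Array(0.6, 0.4)),
    (1L, Array(0.5, 0.5))
  )

  // Local counterpart of sortByKey(): order the pairs by document id.
  def sortedByKey(in: Array[(Long, Array[Double])]): Array[(Long, Array[Double])] =
    in.sortBy(_._1)

  // Same step as in the reply: keep only the second element of each pair.
  def distributions(in: Array[(Long, Array[Double])]): Array[Array[Double]] =
    in.map(doc => doc._2)

  def main(args: Array[String]): Unit = {
    val tpdist = distributions(sortedByKey(pairs))
    println(tpdist.length)              // prints 3
    println(tpdist(0).mkString(", "))   // prints 0.6, 0.4
  }
}
```

Because map returns a new array, the result is assigned directly to a val; nothing needs to be appended element by element.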



Re: Put all elements of RDD into array

2016-01-11 Thread Jakob Odersky
Hi Daniel,

You're actually not modifying the original array: `array :+ x` returns a new
array with `x` appended and leaves `array` itself unchanged.
In your case the fix is simple: collect() already returns an array, so use it
as the value assigned to your val.

In case you ever want to append values iteratively, look into Scala's
`ArrayBuffer`. Also, keep in mind that RDDs have a foreach method, so there is
no need to call collect followed by foreach when you only want a side effect
per element. Note that RDD.foreach runs on the executors, so it cannot be used
to fill a collection that lives on the driver.
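
The difference between the two kinds of append can be seen in plain Scala, without Spark. This is a minimal sketch under stated assumptions: the `collected` array is a made-up stand-in for what collect() would return, and the object and method names are hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer

object AppendDemo {
  // Hypothetical stand-in for the collected (documentId, distribution) pairs.
  val collected: Array[(Long, Array[Double])] = Array(
    (0L, Array(0.7, 0.3)),
    (1L, Array(0.2, 0.8))
  )

  // `:+` builds a new array on every call. The new array is discarded here,
  // so `result` is still empty when it is returned.
  def appendWithColonPlus(): Array[Array[Double]] = {
    val result = Array.empty[Array[Double]]
    collected.foreach(doc => result :+ doc._2)
    result
  }

  // An ArrayBuffer is mutable, so `+=` appends to the same buffer in place.
  def appendWithBuffer(): ArrayBuffer[Array[Double]] = {
    val buffer = ArrayBuffer.empty[Array[Double]]
    collected.foreach(doc => buffer += doc._2)
    buffer
  }

  def main(args: Array[String]): Unit = {
    println(appendWithColonPlus().length) // prints 0
    println(appendWithBuffer().length)    // prints 2
  }
}
```

The buffer version works here because foreach is called on a local array. With an RDD, the data would still need to be collected to the driver first.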

regards,
--Jakob





Put all elements of RDD into array

2016-01-11 Thread Daniel Valdivia
Hello,

I'm trying to put all the values of a pair RDD into an array (or list) to store
later. However, even though I collect the data and then push it into the array,
the array's size after the run is 0.

Any idea what I'm missing?

Thanks in advance

scala> val tpdist: Array[Array[Double]]  = Array()
tpdist: Array[Array[Double]] = Array()

scala> ldaModel.asInstanceOf[DistributedLDAModel].topicDistributions.sortByKey().collect().foreach(doc => tpdist :+ doc._2.toArray)

scala> tpdist.size
res27: Int = 0