Hi Chester,

It's on our to-do list, but it doesn't work at the moment. The
Parallella's coprocessor cores cannot be utilized by the JVM, so
Spark would only use the board's ARM cores. We'll look at the
Parallella again when the JVM supports it.

Best regards,

-chanwit

--
Chanwit Kaewkasi
linkedin.com/in/chanwit


On Thu, Mar 20, 2014 at 8:52 PM, Chester <chesterxgc...@yahoo.com> wrote:
> I am curious to see whether you have tried a Parallella supercomputer
> (16- or 64-core) cluster; running Spark on that should be fun.
>
> Chester
>
> Sent from my iPad
>
> On Mar 19, 2014, at 9:18 AM, Chanwit Kaewkasi <chan...@gmail.com> wrote:
>
>> Hi Koert,
>>
>> There's some NAND flash built into each node. We mount the NAND
>> flash as a local directory for Spark to spill data to.
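>>
>> For anyone curious, a minimal sketch of how that spill directory can
>> be wired up (the mount point /mnt/nand below is a made-up example,
>> not our actual path):
>>
>> import org.apache.spark.{SparkConf, SparkContext}
>>
>> // Point Spark's scratch space for spills and shuffle files at the
>> // NAND flash mount. "/mnt/nand" is a hypothetical mount point.
>> val conf = new SparkConf()
>>   .setAppName("arm-cluster-job")
>>   .set("spark.local.dir", "/mnt/nand")
>> val sc = new SparkContext(conf)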
>> A DZone article, also written by me, tells more about the cluster.
>> We really appreciate the Spark team's design of the RDD. It turned
>> out to be perfect for ARM clusters.
>>
>> http://www.dzone.com/articles/big-data-processing-arm-0
>>
>> Another great thing is that our cluster can operate at room
>> temperature (25°C / 77°F), too.
>>
>> The board is the Cubieboard; here is its spec:
>> https://en.wikipedia.org/wiki/Cubieboard#Specification
>>
>> Best regards,
>>
>> -chanwit
>>
>> --
>> Chanwit Kaewkasi
>> linkedin.com/in/chanwit
>>
>>
>> On Wed, Mar 19, 2014 at 9:43 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>> I don't know anything about ARM clusters... but it looks great. What
>>> are the specs? Do the nodes have no local disk at all?
>>>
>>>
>>> On Tue, Mar 18, 2014 at 10:36 PM, Chanwit Kaewkasi <chan...@gmail.com>
>>> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> We are a small team doing research on low-power (and low-cost) ARM
>>>> clusters. We built a 20-node ARM cluster that is able to run Hadoop.
>>>> But as you all know, Hadoop performs on-disk operations, so it's not
>>>> suitable for a resource-constrained machine powered by ARM.
>>>>
>>>> We then switched to Spark and had to say wow!!
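>>>>
>>>> A big part of the win is that Spark lets us keep the working set in
>>>> RAM and spill to local storage only when memory runs out. A tiny
>>>> illustration (the HDFS path here is made up):
>>>>
>>>> import org.apache.spark.storage.StorageLevel
>>>>
>>>> // Cache the dataset in memory in serialized form, falling back to
>>>> // local storage only when a node's limited RAM fills up.
>>>> val data = sc.textFile("hdfs://master:9000/input")
>>>>   .persist(StorageLevel.MEMORY_AND_DISK_SER)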
>>>>
>>>> Spark / HDFS enables us to crunch the Wikipedia articles of year
>>>> 2012, 34 GB in size, in 1h50m. We have identified the bottleneck:
>>>> it's our 100 Mbit network.
>>>>
>>>> Here's the cluster:
>>>> https://dl.dropboxusercontent.com/u/381580/aiyara_cluster/Mk-I_SSD.png
>>>>
>>>> And this is what we got from Spark's shell:
>>>> https://dl.dropboxusercontent.com/u/381580/aiyara_cluster/result_00.png
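>>>>
>>>> For a rough idea of the kind of shell session behind that
>>>> screenshot, here is a simplified sketch (the HDFS paths and the
>>>> word-count logic are illustrative, not our exact job):
>>>>
>>>> // In spark-shell, where sc is the pre-built SparkContext:
>>>> val articles = sc.textFile("hdfs://master:9000/wikipedia/2012")
>>>> val counts = articles
>>>>   .flatMap(_.split("\\s+"))
>>>>   .map(word => (word, 1L))
>>>>   .reduceByKey(_ + _)
>>>> counts.saveAsTextFile("hdfs://master:9000/wikipedia/word-counts")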
>>>>
>>>> I think it's the first ARM cluster that can process a non-trivial
>>>> amount of Big Data (please correct me if I'm wrong).
>>>> I really want to thank the Spark team for making this possible!
>>>>
>>>> Best regards,
>>>>
>>>> -chanwit
>>>>
>>>> --
>>>> Chanwit Kaewkasi
>>>> linkedin.com/in/chanwit
>>>
>>>
