[jira] [Commented] (SPARK-6646) Spark 2.0: Rearchitecting Spark for Mobile Platforms

Steve Loughran (JIRA) Wed, 01 Apr 2015 03:57:08 -0700

    [ https://issues.apache.org/jira/browse/SPARK-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390373#comment-14390373 ]


Steve Loughran commented on SPARK-6646:
---------------------------------------

Obviously the barrier will be data source access; talking to remote data is 
going to run up bills.

# CouchDB has an offline mode, so its RDD/DataFrame support would allow 
spark-mobile to work in embedded mode.
# Hadoop 2.8 adds hardware CRC on ARM parts for HDFS (HADOOP-11660). A 
{{MiniDFSCluster}} could be instantiated locally to benefit from this.
# Alternatively, mDNS could be used to discover and dynamically build up an 
HDFS cluster from nearby devices, MANET-style. The limited connectivity 
guarantees of moving devices mean that a block size of <1536 bytes would be 
appropriate; probably 1KB blocks are safest.
# Those nodes on the network with limited CPU power but access to external 
power supplies, such as toasters and coffee machines, could have a role as the 
persistent co-ordinators of work and HDFS Namenodes, as well as being used as 
the preferred routers of Wi-Fi packets.
# It may be necessary to extend the Hadoop {{s3://}} filesystem with the notion 
of monthly data quotas, possibly even separate roaming and non-roaming quotas. 
The S3 client would need to query the runtime to determine whether it was at 
home or roaming, and use the relevant quota. Apps could then set something like
{code}
fs.s3.quota.home=15GB
fs.s3.quota.roaming=2GB
{code}
Dealing with use abroad would be more complex: if a cost value were to be 
included, exchange rates would have to be assessed dynamically.
# It may be interesting to consider the notion of having devices publish some of 
their data (photos, healthkit history, movement history) to other devices 
nearby. If one phone could enumerate those nearby **and submit work to them**, 
the bandwidth problems could be addressed.
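
As a rough illustration of the quota idea in the list above, here is a minimal Python sketch of how a client might pick the active quota and decide whether a transfer fits. Everything in it is an assumption: the {{fs.s3.quota.*}} keys are the hypothetical ones proposed in this comment and do not exist in Hadoop, the helper names are invented, sizes are assumed to use binary units, and the roaming flag would have to come from the device runtime.

```python
import re

# Hypothetical keys from the comment above; neither exists in Hadoop.
HOME_KEY = "fs.s3.quota.home"
ROAMING_KEY = "fs.s3.quota.roaming"

# Assumption: sizes use binary multiples (1KB = 1024 bytes).
_UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a size string such as '15GB' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(KB|MB|GB|TB)\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def active_quota(conf, roaming):
    """Return the monthly quota in bytes for the current network state."""
    key = ROAMING_KEY if roaming else HOME_KEY
    return parse_size(conf[key])


def may_transfer(conf, roaming, used_bytes, request_bytes):
    """True if the request still fits within this month's active quota."""
    return used_bytes + request_bytes <= active_quota(conf, roaming)


# The example settings from the comment, held in a plain dict.
conf = {HOME_KEY: "15GB", ROAMING_KEY: "2GB"}
```

A caller would check {{may_transfer(conf, roaming, used, size)}} before each read; tracking the bytes already used in the month and detecting roaming are left out of the sketch.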



> Spark 2.0: Rearchitecting Spark for Mobile Platforms
> ----------------------------------------------------
>
>                 Key: SPARK-6646
>                 URL: https://issues.apache.org/jira/browse/SPARK-6646
>             Project: Spark
>          Issue Type: Improvement
>          Components: Project Infra
>            Reporter: Reynold Xin
>            Assignee: Reynold Xin
>            Priority: Blocker
>         Attachments: Spark on Mobile - Design Doc - v1.pdf
>
>
> Mobile computing is quickly rising to dominance, and by the end of 2017, it 
> is estimated that 90% of CPU cycles will be devoted to mobile hardware. 
> Spark’s project goal can be accomplished only when Spark runs efficiently for 
> the growing population of mobile users.
> Designed and optimized for modern data centers and Big Data applications, 
> Spark is unfortunately not a good fit for mobile computing today. In the past 
> few months, we have been prototyping the feasibility of a mobile-first Spark 
> architecture, and today we would like to share with you our findings. This 
> ticket outlines the technical design of Spark’s mobile support, and shares 
> results from several early prototypes.
> Mobile friendly version of the design doc: 
> https://databricks.com/blog/2015/04/01/spark-2-rearchitecting-spark-for-mobile.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
