[ 
https://issues.apache.org/jira/browse/SPARK-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-4368:
-----------------------------
    Issue Type: Improvement  (was: Bug)

I don't think this ever evolved into a proposal to change Spark, and 
non-essential integration is generally directed to hosting outside the project 
now.

> Ceph integration?
> -----------------
>
>                 Key: SPARK-4368
>                 URL: https://issues.apache.org/jira/browse/SPARK-4368
>             Project: Spark
>          Issue Type: Improvement
>          Components: Input/Output
>            Reporter: Serge Smertin
>
> There is a use case of storing a large number of relatively small BLOB objects 
> (2-20 MB), which requires some ugly workarounds in HDFS environments. These 
> BLOBs need to be processed close to the data itself, which is why the 
> MapReduce paradigm is a good fit, as it guarantees data locality.
> Ceph seems to be one of the systems that offers both properties 
> (small files and data locality) -  
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-July/032119.html. I 
> already know that Spark supports GlusterFS - 
> http://mail-archives.apache.org/mod_mbox/spark-user/201404.mbox/%3ccf657f2b.5b3a1%25ven...@yarcdata.com%3E
> So I wonder, could there be an integration with this storage solution, and 
> how much effort would that take? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

