Hi Sun,
The issue with using Ceph as the underlying file system for Spark is that you
lose data locality. Ceph is not designed to have Spark run directly on top
of the OSDs. I know that CephFS provides data location information via a
Hadoop-compatible API. The last time I looked into this, the
in
Hi Jerry,
Yeah, we already run Ceph in a few of our production environments,
especially with OpenStack.
The reason we want to use Ceph is that we are looking for workarounds to get a
unified storage layer, and the design
concepts of Ceph are quite appealing. I am just interested i