[ceph-users] Re: HBase/HDFS on Ceph/CephFS

2020-04-27 Thread Gregory Farnum
On Thu, Apr 23, 2020 at 11:05 PM wrote:
>
> Hi
>
> We have a 3-year-old Hadoop cluster - up for refresh - so it is time
> to evaluate options. The "only" use case is running an HBase installation
> which is important for us, and migrating out of HBase would be a hassle.
>
> Our Ceph usage has

[ceph-users] Re: HBase/HDFS on Ceph/CephFS

2020-04-27 Thread jesper
> local filesystem is a bit tricky, we just tried a POC that mounts CephFS
> into every Hadoop node and configures Hadoop to use LocalFS with replica = 1.
> As a result, each piece of data is written only once into CephFS, and
> CephFS takes care of the data durability.

Can you tell a bit more about this?
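
As a rough illustration of why the replica = 1 setup is attractive, here is a minimal sketch of the raw-copy arithmetic. It assumes a replicated Ceph pool of size 3 and, for comparison, a Hadoop replication factor of 3 layered on top of Ceph; both numbers and the helper name `raw_copies` are assumptions for illustration, not values taken from this thread.

```python
def raw_copies(hadoop_replicas: int, ceph_pool_size: int) -> int:
    """Physical copies of each block when Hadoop-level replication
    is layered on top of a replicated Ceph pool."""
    if hadoop_replicas < 1 or ceph_pool_size < 1:
        raise ValueError("replica counts must be at least 1")
    # Every Hadoop-level replica is itself stored ceph_pool_size times.
    return hadoop_replicas * ceph_pool_size


# Assumed comparison case: Hadoop keeps 3 replicas on a size-3 Ceph pool.
stacked = raw_copies(hadoop_replicas=3, ceph_pool_size=3)

# The POC described above: Hadoop replica = 1, durability left to CephFS
# (pool size 3 is an assumption here).
poc = raw_copies(hadoop_replicas=1, ceph_pool_size=3)

print(f"stacked replication: {stacked} raw copies per block")
print(f"LocalFS on CephFS:   {poc} raw copies per block")
```

With replica = 1 the raw footprint is set by the Ceph pool alone, which is the point of writing each piece of data only once.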

[ceph-users] Re: HBase/HDFS on Ceph/CephFS

2020-04-27 Thread Xiaoxi Chen
RBD is never a workable solution unless you want to pay the cost of double replication in both HDFS and Ceph. I think the right approach is to consider other implementations of the FileSystem interface, like s3a and localfs. s3a is straightforward: Ceph RGW provides an S3 interface and s3a is

[ceph-users] Re: HBase/HDFS on Ceph/CephFS

2020-04-24 Thread Marc Roos
I think the idea behind a pool size of 1 is that Hadoop already writes copies to 2 other pools(?). However, that leaves the possibility that PGs of these 3 pools share an OSD, and if that OSD fails, you lose data in these pools. I have no idea what the chances are that the same

[ceph-users] Re: HBase/HDFS on Ceph/CephFS

2020-04-24 Thread Serkan Çoban
You do not want to mix Ceph with Hadoop, because you'll lose data locality, which is the main point of Hadoop systems. Every read/write request will go through the network, which is not optimal.

On Fri, Apr 24, 2020 at 9:04 AM wrote:
>
> Hi
>
> We have a 3-year-old Hadoop cluster - up for refresh -