Re: [ceph-users] Ceph as an Alternative to HDFS for Hadoop

2017-12-26 Thread Aristeu Gil Alves Jr
In a recent thread on the list, I received several important answers to my questions about the Hadoop plugin. Maybe that thread will help you: https://www.spinics.net/lists/ceph-users/msg40790.html One of the most important answers is about data locality. The last message led me to this article.

Re: [ceph-users] Cephfs Hadoop Plugin and CEPH integration

2017-11-29 Thread Aristeu Gil Alves Jr
> > Does s3 or swifta (for hadoop or spark) have integrated data-layout APIs for
> > local processing data as have cephfs hadoop plugin?
> >
> With s3 and swift you won't have data locality as it was designed for
> public cloud.
> We recommend disable locality based scheduling in Hadoop when

Re: [ceph-users] Cephfs Hadoop Plugin and CEPH integration

2017-11-29 Thread Aristeu Gil Alves Jr
-29 4:19 GMT-02:00 Orit Wasserman <owass...@redhat.com>:
> On Tue, Nov 28, 2017 at 7:26 PM, Aristeu Gil Alves Jr
> <aristeu...@gmail.com> wrote:
> > Greg and Donny,
> >
> > Thanks for the answers. It helped a lot!
> >
> > I just watched the swifta

Re: [ceph-users] Cephfs Hadoop Plugin and CEPH integration

2017-11-28 Thread Aristeu Gil Alves Jr
Greg and Donny, Thanks for the answers. They helped a lot! I just watched the swifta presentation and it looks quite good! Due to the lack of updates/development, and the fact that we can also choose Spark, I think swift/swifta with Ceph may be a good strategy too. I need to study it more, though.

[ceph-users] Cephfs Hadoop Plugin and CEPH integration

2017-11-27 Thread Aristeu Gil Alves Jr
Hi. This is my first post to the list. First of all, I have to say I'm new to Hadoop. We are a small lab and we have been running CephFS for almost two years, loading it with large files (4GB to 4TB in size). Our cluster has approximately 400TB at ~75% usage, and we are planning