Re: [ceph-users] hadoop on cephfs
On Sat, Apr 30, 2016 at 2:55 PM, Adam Tygart wrote:
> Supposedly cephfs-hadoop worked and/or works on hadoop 2. I am in the
> process of getting it working with cdh5.7.0 (based on hadoop 2.6.0).
> I'm under the impression that it is/was working with 2.4.0 at some
> point in time.
>
> At this very moment, I can use all of the DFS tools built into hadoop
> to create, list, delete, rename, and concat files. What I am not able
> to do (currently) is run any jobs.

Thanks a lot for looking at this. I am currently without any spare cycles, but will gladly test and merge changes!

When you say that you cannot run jobs, are you referring to something like map-reduce? If so, it would be very helpful if you could turn on client debugging and get a log file so we can see which calls are failing. Getting the log file for the client can be a mess sometimes, but it usually helps debug much, much faster.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
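For anyone following the request above, here is a minimal sketch of one way to get such a log. It assumes "client debugging" means the Ceph client's own debug logging, enabled through a `[client]` section in the ceph.conf that the Hadoop processes read. The option names `debug client`, `debug ms` and `log file` are standard Ceph settings; the levels 20 and 1 and the log path are illustrative assumptions, not values given in this thread.

```python
import configparser
import io


def client_debug_fragment(log_path: str) -> str:
    """Render a [client] section that turns on Ceph client debug logging.

    The debug levels below are assumptions chosen for illustration;
    adjust them to whatever the developers ask for.
    """
    cfg = configparser.ConfigParser()
    cfg["client"] = {
        "debug client": "20",  # assumed level: verbose client logging
        "debug ms": "1",       # assumed level: basic messenger logging
        "log file": log_path,  # where the client writes its log
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# Hypothetical path; $pid is a Ceph metavariable, so each process gets its own file.
print(client_debug_fragment("/tmp/ceph-client.$pid.log"))
```

The printed fragment would be merged by hand into the ceph.conf used on the nodes that run the job. Writing one log file per process is a design choice that keeps the output of concurrent map and reduce tasks apart.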
Re: [ceph-users] hadoop on cephfs
You can also have Hadoop talking to the Rados Gateway (SWIFT API) so that the data is in Ceph instead of HDFS.

I wrote this tutorial that might help:
https://github.com/zioproto/hadoop-swift-tutorial

Saverio

2016-04-30 23:55 GMT+02:00 Adam Tygart:
> Supposedly cephfs-hadoop worked and/or works on hadoop 2. I am in the
> process of getting it working with cdh5.7.0 (based on hadoop 2.6.0).
> I'm under the impression that it is/was working with 2.4.0 at some
> point in time.
>
> At this very moment, I can use all of the DFS tools built into hadoop
> to create, list, delete, rename, and concat files. What I am not able
> to do (currently) is run any jobs.
>
> https://github.com/ceph/cephfs-hadoop
>
> It can be built using current (at least infernalis with my testing)
> cephfs-java and libcephfs. The only thing you'll for sure need to do
> is patch the file referenced here:
> https://github.com/ceph/cephfs-hadoop/issues/25
>
> When building, you'll want to tell maven to skip tests
> (-Dmaven.test.skip=true).
>
> Like I said, I am digging into this still, and I am not entirely
> convinced my issues are ceph related at the moment.
>
> --
> Adam
Re: [ceph-users] hadoop on cephfs
Supposedly cephfs-hadoop worked and/or works on hadoop 2. I am in the process of getting it working with cdh5.7.0 (based on hadoop 2.6.0). I'm under the impression that it is/was working with 2.4.0 at some point in time.

At this very moment, I can use all of the DFS tools built into hadoop to create, list, delete, rename, and concat files. What I am not able to do (currently) is run any jobs.

https://github.com/ceph/cephfs-hadoop

It can be built using current (at least infernalis with my testing) cephfs-java and libcephfs. The only thing you'll for sure need to do is patch the file referenced here:
https://github.com/ceph/cephfs-hadoop/issues/25

When building, you'll want to tell maven to skip tests (-Dmaven.test.skip=true).

Like I said, I am digging into this still, and I am not entirely convinced my issues are ceph related at the moment.

--
Adam

On Sat, Apr 30, 2016 at 1:51 PM, Erik McCormick wrote:
> I think what you are thinking of is the driver that was built to actually
> replace hdfs with rbd. As far as I know that thing had a very short lifespan
> on one version of hadoop. Very sad.
>
> As to what you proposed:
>
> 1) Don't use Cephfs in production pre-jewel.
>
> 2) running hdfs on top of ceph is a massive waste of disk and fairly
> pointless as you make replicas of replicas.
>
> -Erik
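The build step Adam describes can be sketched as a small helper that assembles the Maven command line. Only the `-Dmaven.test.skip=true` flag comes from the message above; the `package` goal and the helper name `maven_build_command` are assumptions made for illustration, and the patch from issue 25 still has to be applied separately.

```python
import shlex


def maven_build_command(goal="package", skip_tests=True):
    """Assemble the argv for a Maven build of cephfs-hadoop.

    `goal` defaults to the standard `package` lifecycle phase, which is an
    assumption; the thread only specifies the test-skipping flag.
    """
    cmd = ["mvn"]
    if skip_tests:
        # Flag quoted in the thread: skips compiling and running the tests.
        cmd.append("-Dmaven.test.skip=true")
    cmd.append(goal)
    return cmd


# Show the command as it would be typed in a shell.
print(shlex.join(maven_build_command()))

# To actually run it from a patched cephfs-hadoop checkout (hypothetical path):
#   import subprocess
#   subprocess.run(maven_build_command(), cwd="/path/to/cephfs-hadoop", check=True)
```

Returning a list instead of a single string lets the command go straight to `subprocess.run` without shell quoting concerns.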
Re: [ceph-users] hadoop on cephfs
I think what you are thinking of is the driver that was built to actually replace hdfs with rbd. As far as I know that thing had a very short lifespan on one version of hadoop. Very sad.

As to what you proposed:

1) Don't use Cephfs in production pre-jewel.

2) Running hdfs on top of ceph is a massive waste of disk and fairly pointless as you make replicas of replicas.

-Erik

On Apr 29, 2016 9:20 PM, "Bill Sharer" wrote:
> Actually this guy is already a fan of Hadoop. I was just wondering
> whether anyone has been playing around with it on top of cephfs lately. It
> seems like the last round of papers were from around cuttlefish.
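To put a number on the "replicas of replicas" point, here is a back-of-the-envelope sketch. It assumes HDFS at its usual replication factor of 3 writing its block files onto a Ceph pool that is itself replicated 3 times; neither figure is stated in the thread, and the 90 TB cluster size is made up for the example.

```python
def raw_copies_per_byte(hdfs_replication, ceph_pool_size):
    """Raw copies stored for every byte of user data when HDFS block files
    live on a replicated Ceph pool: each HDFS replica is replicated again."""
    return hdfs_replication * ceph_pool_size


HDFS_REPLICATION = 3   # assumed: common HDFS dfs.replication setting
CEPH_POOL_SIZE = 3     # assumed: common replicated pool size
RAW_CAPACITY_TB = 90   # hypothetical cluster size, for illustration only

copies = raw_copies_per_byte(HDFS_REPLICATION, CEPH_POOL_SIZE)
usable_tb = RAW_CAPACITY_TB / copies

print(f"{copies} raw copies per byte of user data")
print(f"{RAW_CAPACITY_TB} TB raw holds about {usable_tb:.0f} TB of user data")
```

Under these assumptions, lowering the HDFS replication factor to 1 would bring the overhead back to the 3 copies Ceph already keeps, which is roughly what using CephFS or the gateway directly achieves.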
Re: [ceph-users] hadoop on cephfs
Actually this guy is already a fan of Hadoop. I was just wondering whether anyone has been playing around with it on top of cephfs lately. It seems like the last round of papers were from around cuttlefish.

On 04/28/2016 06:21 AM, Oliver Dzombic wrote:
> Hi,
>
> bad idea :-)
>
> It's of course nice and important to draw developers towards a
> new/promising technology/software.
>
> But if the technology does not match the specific requirements, you
> just risk showing this developer how bad this new/promising
> technology is.
>
> So you will just achieve the opposite of what you want.
>
> So before you do something, usually big, like hadoop on unstable
> software, maybe you should not use it.
>
> For the good of the developer, for your own good, and for the good of
> the reputation of the new/promising technology/software you favor.
>
> Forcing a penguin to somehow live in the Sahara might be possible (at
> least for some time), but is usually not a good idea ;-)
Re: [ceph-users] hadoop on cephfs
Hi,

bad idea :-)

It's of course nice and important to draw developers towards a new/promising technology/software.

But if the technology does not match the specific requirements, you just risk showing this developer how bad this new/promising technology is.

So you will just achieve the opposite of what you want.

So before you do something, usually big, like hadoop on unstable software, maybe you should not use it.

For the good of the developer, for your own good, and for the good of the reputation of the new/promising technology/software you favor.

Forcing a penguin to somehow live in the Sahara might be possible (at least for some time), but is usually not a good idea ;-)

--
Best regards

Oliver Dzombic
IP-Interactive

mailto:i...@ip-interactive.de

Address:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 at Amtsgericht Hanau
Managing director: Oliver Dzombic

Tax no.: 35 236 3622 1
VAT ID: DE274086107

On 28.04.2016 at 08:31, Bill Sharer wrote:
> Just got into a discussion today where I may have a chance to do work
> with a db guy who wants hadoop and I want to steer him to it on cephfs.
> While I'd really like to run gentoo with either infernalis or jewel
> (when it becomes stable in portage), odds are more likely that I will be
> required to use rhel/centos6.7 and thus stuck back at Hammer. Any
> thoughts?
[ceph-users] hadoop on cephfs
Just got into a discussion today where I may have a chance to do work with a db guy who wants hadoop and I want to steer him to it on cephfs. While I'd really like to run gentoo with either infernalis or jewel (when it becomes stable in portage), odds are more likely that I will be required to use rhel/centos6.7 and thus stuck back at Hammer. Any thoughts?