Do i really need HDFS?

2014-10-20 Thread Ankur Chauhan
Hi all, I am trying to set up a new Mesos cluster, and so far I have a set of master and slave nodes working and I can get everything running. I am able to install and run a couple of sample apps, hook up Jenkins, etc. My main question now is: do I really need HDFS? All my artifacts (for apps

Re: Do i really need HDFS?

2014-10-20 Thread David Greenberg
esos cluster and I so far I have a set of > master and slave nodes working and I can get everything running. I am able > to install and run a couple of sample apps, hookup jenkins etc. My main > question now is Do I really need HDFS? All my artifacts (for apps) are on a > protected S3 bu

Re: Do i really need HDFS?

2014-10-20 Thread CCAAT
ng to setup a new mesos cluster and I so far I have a set of master and slave nodes working and I can get everything running. I am able to install and run a couple of sample apps, hookup jenkins etc. My main question now is Do I really need HDFS? All my artifacts (for apps) are o

Re: Do i really need HDFS?

2014-10-20 Thread Leigh Martell
of master and slave nodes working and I can get everything running. >> I am able to install and run a couple of sample apps, hookup jenkins >> etc. My main question now is Do I really need HDFS? All my artifacts >> (for apps) are on a protected S3 bucket or in a

Re: Do i really need HDFS?

2014-10-20 Thread David Greenberg
m trying to setup a new mesos cluster and I so far I have a set >> of master and slave nodes working and I can get everything running. >> I am able to install and run a couple of sample apps, hookup jenkins >> etc. My main question now is Do I really need HDFS? All my

Re: Do i really need HDFS?

2014-10-20 Thread Ankur Chauhan
; Hi all, >>> >>> I am trying to setup a new mesos cluster and I so far I have a set >>> of master and slave nodes working and I can get everything running. >>> I am able to install and run a couple of sample apps, hookup jenkins >>> e

Re: Do i really need HDFS?

2014-10-20 Thread Steven Schlansker
ns etc. My main question > now is Do I really need HDFS? All my artifacts (for apps) are on a protected > S3 bucket or in a private docker registry. > > If I need HDFS, do I need to go "all in" even when I am not using hdfs as a > data store but rather as a simple way to fetch

Re: Do i really need HDFS?

2014-10-20 Thread CCAAT
On 10/20/14 11:46, Steven Schlansker wrote: We are running Mesos entirely without HDFS with no problems. We use Docker to distribute our application to slave nodes, and keep no state on individual nodes. Background: I'm building up a 3-node cluster to run Mesos and Spark. No legacy Hadoop

Re: Do i really need HDFS?

2014-10-21 Thread Dick Davies
I think Spark needs a way to send jobs to/from the workers - the Spark distro itself will pull down the executor OK, but in my (very basic) tests I got stuck without HDFS. So basically it depends on the framework. I think in Spark's case they assume most users are migrating from an existing Hadoop

Re: Do i really need HDFS?

2014-10-21 Thread David Greenberg
We use spark without HDFS--in our case, we just use ansible to copy the spark executors onto all hosts at the same path. We also load and store our spark data from non-HDFS sources. On Tue, Oct 21, 2014 at 4:57 AM, Dick Davies wrote: > I think Spark needs a way to send jobs to/from the workers -

Re: Do i really need HDFS?

2014-10-21 Thread Ankur Chauhan
This is what I also intend to do. Is an S3 path considered non-HDFS? If so, how does it know which credentials to use to fetch the file? > On Oct 21, 2014, at 5:16 AM, David Greenberg wrote: > > We use spark without HDFS--in our case, we just use ansible to copy the spark >

Re: Do i really need HDFS?

2014-10-21 Thread Tim St Clair
ry result = hdfs.copyToLocal(uri, path); if (result.isError()) { LOG(ERROR) << "HDFS copyToLocal failed: " << result.error(); return Error(result.error()); } ------------- - Original Message - > Fro

Re: Do i really need HDFS?

2014-10-21 Thread Ankur Chauhan
<< "' to '" << path << "'"; >Try result = hdfs.copyToLocal(uri, path); >if (result.isError()) { > LOG(ERROR) << "HDFS copyToLocal failed: " << result.error(); > return Error(result.error()); >} > --

Re: Do i really need HDFS?

2014-10-21 Thread Tim St Clair
No, it just means you need the utility libraries to access the path. - Original Message - > From: "Ankur Chauhan" > To: user@mesos.apache.org > Sent: Tuesday, October 21, 2014 11:18:11 AM > Subject: Re: Do i really need HDFS? > > So that means even if I do
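
The point in this reply can be sketched in code: which fetch path is used depends on the URI scheme, and the Hadoop path needs only client-side libraries, not a running NameNode or DataNode. The sketch below is illustrative only and is not the Mesos fetcher source. `FetchMethod`, `startsWith`, `chooseFetchMethod`, and both scheme lists are assumptions made for this example, and the real behaviour may differ between Mesos versions.

```cpp
#include <initializer_list>
#include <string>

// Hypothetical fetch strategies; these names are not Mesos identifiers.
enum class FetchMethod { HadoopClient, Http, LocalCopy };

// Returns true when `s` begins with `prefix`.
bool startsWith(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Picks a fetch strategy from the URI scheme alone.
FetchMethod chooseFetchMethod(const std::string& uri) {
  // Assumed list of schemes that would be handed to a Hadoop client
  // (for example, via a copyToLocal call like the one quoted in the thread).
  for (const char* scheme : {"hdfs://", "hftp://", "s3://", "s3n://"}) {
    if (startsWith(uri, scheme)) {
      return FetchMethod::HadoopClient;
    }
  }

  // Assumed list of schemes that a plain network download could handle.
  for (const char* scheme : {"http://", "https://", "ftp://", "ftps://"}) {
    if (startsWith(uri, scheme)) {
      return FetchMethod::Http;
    }
  }

  // Anything else is treated as a path already present on the node.
  return FetchMethod::LocalCopy;
}
```

Under these assumptions, an artifact at an `s3n://` URI would go through the Hadoop client, while an artifact served over `https://` or copied to every host at a fixed local path would never touch it.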

Re: Do i really need HDFS?

2014-10-21 Thread Adam Bordelon
ur Chauhan" > > To: user@mesos.apache.org > > Sent: Tuesday, October 21, 2014 11:18:11 AM > > Subject: Re: Do i really need HDFS? > > > > So that means even if I don't use the dfs I would need HDFS namenode and > data > > node and related config to fetch s3

Re: Do i really need HDFS?

2014-10-22 Thread Dick Davies
Be interested to know what that is, if you don't mind sharing. We're thinking of deploying a Ceph cluster for another project anyway, it seems to remove some of the chokepoints/points of failure HDFS suffers from but I've no idea how well it can interoperate with the usual HDFS clients (Spark in m

Re: Do i really need HDFS?

2014-10-22 Thread David Greenberg
We use lustre and a couple internal data storage services. I wouldn't recommend lustre much; it's got an SPOF which is a problem at scale. I just wanted to point out that you can skip hdfs if you so choose. On Wednesday, October 22, 2014, Dick Davies wrote: > Be interested to know what that is,

Re: Do i really need HDFS?

2014-10-22 Thread Tim St Clair
y, October 22, 2014 2:29:20 AM > Subject: Re: Do i really need HDFS? > > Be interested to know what that is, if you don't mind sharing. > > We're thinking of deploying a Ceph cluster for another project anyway, > it seems to remove some of the chokepoints/points of failur

Re: Do i really need HDFS?

2014-10-22 Thread CCAAT
Ok so, I'd be curious to know your final architecture (D. Davies)? I was looking to put Ceph on top of the (3) btrfs nodes in case we need a DFS at some later point. We're not really sure what software will be in our final mix. Certainly installing Ceph does not hurt anything (?); and I'm not

Re: Do i really need HDFS?

2014-10-22 Thread Dick Davies
I haven't got as far as deploying a FS yet - still weighing up the options. Our Mesos cluster is just a PaaS at the moment but I think the option to use capacity for ad hoc distributed computing alongside the web workloads is a killer feature. We're soon to Dockerize as well so some option that ca