So that means even if I don't use the DFS itself, I would still need the HDFS NameNode and DataNode pieces and related config just to fetch s3:// and s3n:// URIs?
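In other words, would a client-only setup be enough -- just the hadoop binary on each slave plus a core-site.xml carrying the s3n credentials, with no NameNode or DataNode daemons running anywhere? Something like this sketch (the fs.s3n.* property names are the stock Hadoop settings for the s3n filesystem; the paths, bucket, and keys are placeholders):

---------------------------------------------------------------------
# Hypothetical client-only setup on a slave: no HDFS daemons, just the
# hadoop client plus credentials for the s3n:// filesystem.
# (Hadoop 1.x keeps this file under conf/; 2.x under etc/hadoop/.)
cat > "$HADOOP_HOME/conf/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>PLACEHOLDER_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>PLACEHOLDER_SECRET_ACCESS_KEY</value>
  </property>
</configuration>
EOF

# Sanity check that the bare client can reach the bucket:
hadoop fs -ls s3n://my-bucket/some/path/
---------------------------------------------------------------------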
Sent from my iPhone

> On Oct 21, 2014, at 8:40 AM, Tim St Clair <tstcl...@redhat.com> wrote:
>
> Ankur -
>
> To answer your specific question re:
> Q: Is a s3 path considered non-hdfs?
> A: At this time, no; it uses the HDFS layer to resolve it (for better or worse).
>
> ---------------------------------------------------------------------
> // Grab the resource using the hadoop client if it's one of the known schemes
> // TODO(tarnfeld): This isn't very scalable with hadoop's pluggable
> // filesystem implementations.
> // TODO(matei): Enforce some size limits on files we get from HDFS
> if (strings::startsWith(uri, "hdfs://") ||
>     strings::startsWith(uri, "hftp://") ||
>     strings::startsWith(uri, "s3://") ||
>     strings::startsWith(uri, "s3n://")) {
>   Try<string> base = os::basename(uri);
>   if (base.isError()) {
>     LOG(ERROR) << "Invalid basename for URI: " << base.error();
>     return Error("Invalid basename for URI");
>   }
>   string path = path::join(directory, base.get());
>
>   HDFS hdfs;
>
>   LOG(INFO) << "Downloading resource from '" << uri
>             << "' to '" << path << "'";
>   Try<Nothing> result = hdfs.copyToLocal(uri, path);
>   if (result.isError()) {
>     LOG(ERROR) << "HDFS copyToLocal failed: " << result.error();
>     return Error(result.error());
>   }
> ---------------------------------------------------------------------
>
> ----- Original Message -----
>
>> From: "Ankur Chauhan" <an...@malloc64.com>
>> To: user@mesos.apache.org
>> Sent: Tuesday, October 21, 2014 10:28:50 AM
>> Subject: Re: Do i really need HDFS?
>>
>> This is what I also intend to do. Is a s3 path considered non-hdfs? If so,
>> how does it know which credentials to use to fetch the file?
>>
>> Sent from my iPhone
>>
>> On Oct 21, 2014, at 5:16 AM, David Greenberg <dsg123456...@gmail.com> wrote:
>>
>>> We use Spark without HDFS--in our case, we just use Ansible to copy the
>>> Spark executors onto all hosts at the same path. We also load and store
>>> our Spark data from non-HDFS sources.
>>>
>>> On Tue, Oct 21, 2014 at 4:57 AM, Dick Davies <d...@hellooperator.net> wrote:
>>>
>>>> I think Spark needs a way to send jobs to/from the workers - the Spark
>>>> distro itself will pull down the executor ok, but in my (very basic)
>>>> tests I got stuck without HDFS.
>>>>
>>>> So basically it depends on the framework. I think in Spark's case they
>>>> assume most users are migrating from an existing Hadoop deployment, so
>>>> HDFS is sort of assumed.
>>>>
>>>> On 20 October 2014 23:18, CCAAT <cc...@tampabay.rr.com> wrote:
>>>>> On 10/20/14 11:46, Steven Schlansker wrote:
>>>>>
>>>>>> We are running Mesos entirely without HDFS with no problems. We use
>>>>>> Docker to distribute our application to slave nodes, and keep no
>>>>>> state on individual nodes.
>>>>>
>>>>> Background: I'm building up a 3 node cluster to run Mesos and Spark. No
>>>>> legacy Hadoop needed or wanted. I am using btrfs for the local file
>>>>> system, with (2) drives set up for raid1 on each system.
>>>>>
>>>>> So you are suggesting that I can install mesos + spark + docker
>>>>> and not a DFS on these (3) machines?
>>>>>
>>>>> Will I need any other software? My application is a geophysical
>>>>> fluid simulator, so Scala, R, and all sorts of advanced math will
>>>>> be required on the cluster for the Finite Element Methods.
>>>>>
>>>>> James
>
> --
> Cheers,
> Timothy St. Clair
> Red Hat Inc.
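A note on the HDFS helper in the snippet Tim quoted: as far as I can tell it is a thin wrapper that locates the hadoop binary (via the slave's --hadoop_home flag, HADOOP_HOME, or the PATH) and shells out to it, so the fetch boils down to roughly the following. This is a sketch, not the exact invocation; the install path, URI, and sandbox directory are placeholders:

---------------------------------------------------------------------
# Rough shell equivalent of hdfs.copyToLocal(uri, path) above.
# Assumes the hadoop client is installed at /opt/hadoop (placeholder)
# and core-site.xml already carries the s3n credentials.
export HADOOP_HOME=/opt/hadoop
"$HADOOP_HOME/bin/hadoop" fs -copyToLocal \
  "s3n://my-bucket/my-executor.tar.gz" \
  "/tmp/mesos/slaves/S0/frameworks/F0/executors/E0/runs/latest"
---------------------------------------------------------------------

If that's right, each slave needs Hadoop's client binaries and config, but no running HDFS daemons.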