Usage of data node to run on commodity hardware

2016-06-06 Thread Krishna
Hi All, I am new to Hadoop and I have a requirement, but I don't know whether it is feasible or not. I want to run Hadoop in a non-cluster environment, meaning I want to run it on commodity hardware. I have one desktop machine with a higher CPU and memory configuration, and I have close to 20 laptops a

Re: HDFS in Kubernetes

2016-06-06 Thread Ravi Prakash
Klaus! Good luck with your attempt to run HDFS inside Kubernetes! Please keep us posted. For creating a new file, a DFSClient: 1. First calls addBlock on the NameNode. https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode
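
For context, a minimal Java sketch of the file-creation path being described, assuming a client whose fs.defaultFS points at the NameNode; the addBlock call happens inside the DFSClient once the output stream starts writing, not in user code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CreateFileSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Assumes fs.defaultFS points at the target NameNode, e.g. hdfs://namenode:9000
            FileSystem fs = FileSystem.get(conf);
            try (FSDataOutputStream out = fs.create(new Path("/tmp/hello.txt"))) {
                // Under the hood the DFSClient asks the NameNode for a new block
                // (ClientProtocol#addBlock) and then streams the bytes to the
                // DataNode pipeline returned for that block.
                out.writeUTF("hello hdfs");
            }
            fs.close();
        }
    }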

Re: HDFS Federation

2016-06-06 Thread Kun Ren
I figured it out by creating the directories at the beginning: hdfs dfs -mkdir hdfs://Master:9000/my and hdfs dfs -mkdir hdfs://Slave1:9000/your. Thanks. On Mon, Jun 6, 2016 at 4:44 PM, Ravi Prakash wrote: > Perhaps use the "viewfs://" protocol prepended to your path? > > > On Sun, Jun 5, 2016 at
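
A rough Java equivalent of those two shell commands, for anyone scripting against the federated NameNodes (the Master/Slave1 hostnames and port 9000 are taken from the thread):

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FederatedMkdirs {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Each federated NameNode owns its own namespace, so the directory
            // has to be created against the NameNode that serves it.
            try (FileSystem myNs = FileSystem.get(URI.create("hdfs://Master:9000"), conf)) {
                myNs.mkdirs(new Path("/my"));
            }
            try (FileSystem yourNs = FileSystem.get(URI.create("hdfs://Slave1:9000"), conf)) {
                yourNs.mkdirs(new Path("/your"));
            }
        }
    }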

Re: HDFS Federation

2016-06-06 Thread Ravi Prakash
Perhaps use the "viewfs://" protocol prepended to your path? On Sun, Jun 5, 2016 at 1:10 PM, Kun Ren wrote: > Hi Genius, > > I just configured HDFS Federation and tried to use it (2 NameNodes, one is > for /my, the other is for /your). When I run the command: > hdfs dfs -ls /, > > I can get: > -r-
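
A sketch of what the viewfs suggestion might look like with a client-side mount table; the mount-table name "cluster" is a placeholder, the link targets come from the paths mentioned in the thread, and in practice these keys would normally live in core-site.xml:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ViewFsListing {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Map each federated namespace into a single client-side view.
            conf.set("fs.viewfs.mounttable.cluster.link./my",   "hdfs://Master:9000/my");
            conf.set("fs.viewfs.mounttable.cluster.link./your", "hdfs://Slave1:9000/your");
            FileSystem viewFs = FileSystem.get(URI.create("viewfs://cluster/"), conf);
            // Listing "/" now shows both /my and /your mount points.
            for (FileStatus status : viewFs.listStatus(new Path("/"))) {
                System.out.println(status.getPath());
            }
        }
    }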

Re: HDFS2 vs MaprFS

2016-06-06 Thread Aaron Eng
As others have answered, the number of blocks/files/directories that can be addressed by a NameNode is limited by the amount of heap space available to the NameNode JVM. If you need more background on this topic, I'd suggest reviewing various materials from Hadoop JIRA and other vendors that suppl
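
As a back-of-the-envelope illustration of that heap limit (the roughly 150 bytes per namespace object used below is a commonly cited rule of thumb, not an exact figure):

    public class NameNodeHeapEstimate {
        public static void main(String[] args) {
            // Assumption: each file, block, or directory costs on the order of
            // 150 bytes of NameNode heap. Real usage varies with path lengths,
            // replication, and blocks per file.
            long bytesPerObject = 150L;
            long heapGb = 64L;
            long heapBytes = heapGb * 1024L * 1024L * 1024L;
            long objects = heapBytes / bytesPerObject;
            System.out.printf("~%d GB of heap -> roughly %,d namespace objects%n",
                              heapGb, objects);
        }
    }

By that estimate a 64 GB heap addresses on the order of a few hundred million namespace objects, which is roughly where the commonly quoted file-count limits come from.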

Re: HDFS2 vs MaprFS

2016-06-06 Thread Ascot Moss
Hi Aaron, from the MapR site, [now HDFS2] "Limit to 50-200 million files": is it really true? On Tue, Jun 7, 2016 at 12:09 AM, Aaron Eng wrote: > As I said, MapRFS has topologies. You assign a volume (which is mounted > at a directory path) to a topology and in turn all the data for the volume > (e

Re: HDFS2 vs MaprFS

2016-06-06 Thread Aaron Eng
As I said, MapRFS has topologies. You assign a volume (which is mounted at a directory path) to a topology and in turn all the data for the volume (e.g. under the directory) is stored on the storage hardware assigned to the topology. These topological labels provide the same benefits as dfs.stora

Re: HDFS2 vs MaprFS

2016-06-06 Thread Ascot Moss
In HDFS2, I can find "dfs.storage.policy"; for instance, HDFS2 allows you to *apply the COLD storage policy to a directory*. Where are these features in MapR-FS? On Mon, Jun 6, 2016 at 11:43 PM, Aaron Eng wrote: > >Since MapR is proprietary, I find that it has many compatibility issues > in Apac
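
For reference, setting that policy on the HDFS side is a one-liner; a sketch assuming a DistributedFileSystem client and a hypothetical /archive directory (the shell equivalent is hdfs storagepolicies -setStoragePolicy -path /archive -policy COLD):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class ColdPolicySketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            if (fs instanceof DistributedFileSystem) {
                // Marks everything under /archive for ARCHIVE-tier storage.
                ((DistributedFileSystem) fs).setStoragePolicy(new Path("/archive"), "COLD");
            }
            fs.close();
        }
    }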

Re: HDFS2 vs MaprFS

2016-06-06 Thread Aaron Eng
>Since MapR is proprietary, I find that it has many compatibility issues in Apache open source projects. This is faulty logic. And rather than saying it has "many compatibility issues", perhaps you can describe one. Both MapRFS and HDFS are accessible through the same API. The backend implementa
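
To illustrate the "same API" point: code written against the abstract FileSystem class does not care which implementation backs it; the concrete class is picked by the URI scheme in fs.defaultFS (hdfs:// for HDFS, maprfs:// for MapRFS, the latter requiring the MapR client libraries on the classpath). A minimal sketch:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SchemeAgnosticListing {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Whatever fs.defaultFS resolves to (hdfs://... or maprfs://...),
            // the calling code below stays the same.
            try (FileSystem fs = FileSystem.get(conf)) {
                for (FileStatus status : fs.listStatus(new Path("/"))) {
                    System.out.println(status.getPath() + "\t" + status.getLen());
                }
            }
        }
    }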

Re: HDFS2 vs MaprFS

2016-06-06 Thread Ascot Moss
Since MapR is proprietary, I find that it has many compatibility issues in Apache open source projects, or, even worse, it loses Hadoop's features. For instance, Hadoop has a built-in storage policy named COLD; where is it in MapR-FS? Not to mention that MapR-FS loses data locality. On Mon, Jun 6, 2

Re: HDFS2 vs MaprFS

2016-06-06 Thread Ascot Moss
I don't think HDFS2 needs a SAN; using the QuorumJournal approach is much better than using the shared edits directory SAN approach. On Monday, June 6, 2016, Peyman Mohajerian wrote: > It is very common practice to back up the metadata in some SAN store. So > the idea of complete loss of all the metada
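
For comparison, a sketch of how the QJM-based shared edits configuration differs from the SAN/NFS approach; hostnames and paths are placeholders, and in practice these properties go in hdfs-site.xml rather than being set programmatically:

    import org.apache.hadoop.conf.Configuration;

    public class SharedEditsConfigSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();

            // Quorum Journal Manager: the "shared edits dir" is a qjournal:// URI
            // naming an odd number of JournalNodes; no SAN/NFS mount is involved.
            conf.set("dfs.namenode.shared.edits.dir",
                     "qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster");
            conf.set("dfs.journalnode.edits.dir", "/data/hadoop/journalnode");

            // The older shared-storage approach instead points the same property
            // at an NFS/SAN-backed directory, e.g. file:///mnt/shared/edits.
            System.out.println(conf.get("dfs.namenode.shared.edits.dir"));
        }
    }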