As others have answered, the number of blocks/files/directories that can be
addressed by a NameNode is limited by the amount of heap space available to
the NameNode JVM. If you need more background on this topic, I'd suggest
reviewing various materials from the Hadoop JIRA and other vendors that supply…
Hi Aaron, from the MapR site, [now HDFS2] "Limit to 50-200 million files", is
it really true?
On Tue, Jun 7, 2016 at 12:09 AM, Aaron Eng wrote:
As I said, MapRFS has topologies. You assign a volume (which is mounted at
a directory path) to a topology and in turn all the data for the volume
(e.g. under the directory) is stored on the storage hardware assigned to
the topology.
These topological labels provide the same benefits as dfs.storage.policy…
In HDFS2, I can find "dfs.storage.policy"; for instance, HDFS2 allows you to
*apply the COLD storage policy to a directory*.
Where are these features in MapR-FS?
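For reference, applying that policy on the HDFS2 side is a one-liner. A
minimal sketch, assuming an hdfs:// default filesystem; /data/cold is a
made-up example directory, and the exact CLI syntax differs a little between
2.x releases:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class ColdPolicySketch {
        public static void main(String[] args) throws Exception {
            // CLI equivalent (Hadoop 2.7-style):
            //   hdfs storagepolicies -setStoragePolicy -path /data/cold -policy COLD
            DistributedFileSystem dfs =
                    (DistributedFileSystem) FileSystem.get(new Configuration());
            // New blocks written under this directory go to ARCHIVE storage.
            dfs.setStoragePolicy(new Path("/data/cold"), "COLD");
        }
    }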
On Mon, Jun 6, 2016 at 11:43 PM, Aaron Eng wrote:
>Since MapR is proprietary, I find that it has many compatibility issues
in Apache open source projects
This is faulty logic. And rather than saying it has "many compatibility
issues", perhaps you can describe one.
Both MapRFS and HDFS are accessible through the same API. The backend
implementation…
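To illustrate the "same API" point, the sketch below is plain
org.apache.hadoop.fs.FileSystem code and does not care which backend serves
it; only the fs.defaultFS URI changes (hdfs:// for HDFS, maprfs:// for
MapR-FS, as I recall from MapR's docs), and the hostnames here are
placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SameApiSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Point at either backend; the rest of the code is identical.
            conf.set("fs.defaultFS",
                     args.length > 0 ? args[0] : "hdfs://namenode1:8020");
            FileSystem fs = FileSystem.get(conf);
            for (FileStatus st : fs.listStatus(new Path("/"))) {
                System.out.println(st.getPath());
            }
        }
    }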
Since MapR is proprietary, I find that it has many compatibility issues in
Apache open source projects, or even worse, it loses Hadoop's features. For
instance, Hadoop has a built-in storage policy named COLD; where is it in
MapR-FS? Not to mention that MapR-FS loses data locality.
I don't think HDFS2 needs a SAN; using the QuorumJournal approach is much
better than using the shared-edits-directory-on-a-SAN approach.
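To make that concrete, a minimal sketch of the QJM shared-edits setting,
written as Java Configuration calls rather than hdfs-site.xml; the property
names are the documented HA-with-QJM ones, while the "mycluster" nameservice
and the jn1..jn3 hosts are placeholders:

    import org.apache.hadoop.conf.Configuration;

    public class QjmHaSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            conf.set("dfs.nameservices", "mycluster");
            conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
            // The edit log is replicated to a quorum of JournalNodes,
            // so no SAN/NFS shared-edits directory is required.
            conf.set("dfs.namenode.shared.edits.dir",
                     "qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster");
            System.out.println(conf.get("dfs.namenode.shared.edits.dir"));
        }
    }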
On Monday, June 6, 2016, Peyman Mohajerian wrote:
It is very common practice to back up the metadata to some SAN store, so the
idea of a complete loss of all the metadata is preventable. You could lose a
day's worth of data if, e.g., you back up the metadata once a day, but you
could do it more frequently. I'm not saying S3 or Azure Blob are bad ideas.
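As one concrete way to do that kind of backup, a rough sketch that pulls the
latest fsimage from the NameNode with the standard dfsadmin tool; the
/backup/nn-fsimage destination (e.g. a SAN or NFS mount) is made up, and you
would typically drive this from cron:

    public class FetchImageBackup {
        public static void main(String[] args) throws Exception {
            // "hdfs dfsadmin -fetchImage <dir>" downloads the most recent
            // fsimage checkpoint from the NameNode to a local directory.
            Process p = new ProcessBuilder(
                    "hdfs", "dfsadmin", "-fetchImage", "/backup/nn-fsimage")
                    .inheritIO()
                    .start();
            System.exit(p.waitFor());
        }
    }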
The namenode architecture is a source of fragility in HDFS. While a high
availability deployment (with two namenodes, and a failover mechanism)
means you're unlikely to see service interruption, it is still possible to
have a complete loss of filesystem metadata with the loss of two machines.
Second…
Another correction about the terminology needs to be made.
I said 1 GB = 1 million blocks. Pay attention to the term "block"; it is not a
file. A file may contain more than one block. The default block size is 64 MB,
so a 640 MB file will hold 10 blocks. Each file has its name, permissions,
path, creation date, etc.
It is written as 128 000 000 million in my previous post; that was incorrect
(million million). What I meant is 128 million.
1 GB is roughly 1 million blocks.
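To make the rule of thumb concrete, a quick back-of-envelope sketch using the
~1 GB of heap per ~1 million files/blocks estimate from this thread (the heap
sizes are arbitrary examples):

    public class NameNodeCapacityEstimate {
        public static void main(String[] args) {
            long bytesPerObject = 1024L; // ~1 KB per namespace object, rough
            long[] heapGb = {128L, 256L, 512L};
            for (long gb : heapGb) {
                long objects = gb * 1024L * 1024L * 1024L / bytesPerObject;
                System.out.printf("%d GB heap -> ~%d million objects%n",
                        gb, objects / 1_000_000L);
            }
        }
    }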
On 5 Jun 2016 at 16:58, "Ascot Moss" wrote:
> HDFS2 "Limit to 50-200 million files", is it really true like what MapR
> says?
No, it is not true. It totally depends on the server's RAM.
Assume that each file holds 1 KB in RAM and your server has 128 GB of RAM; so
you will have 128 000 000 million files. But 1 KB is just an approximation.
Roughly, 1 GB holds 1 million blocks. So if your server has 512 GB of RAM then
you can approximately…
HDFS2 "Limit to 50-200 million files", is it really true like what MapR
says?
On Sun, Jun 5, 2016 at 7:55 PM, Hayati Gonultas wrote:
I forgot to mention the file system limit.
Yes, HDFS has a limit, because for performance considerations the HDFS
filesystem image is read from disk into RAM and the rest of the work is done
in RAM. So RAM should be big enough to fit the filesystem image. But HDFS has
configuration options like HAR files (Hadoop archives)…
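To illustrate the HAR option: many small files get packed into one archive
(which keeps the NameNode object count down) and the result stays readable
through the ordinary FileSystem API via the har:// scheme. The archive name
and paths below are invented:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HarReadSketch {
        public static void main(String[] args) throws Exception {
            // The archive is built offline with the standard tool, e.g.:
            //   hadoop archive -archiveName images.har -p /user/data images /user/data/archives
            Path inHar = new Path("har:///user/data/archives/images.har");
            FileSystem fs = inHar.getFileSystem(new Configuration());
            for (FileStatus st : fs.listStatus(inHar)) {
                System.out.println(st.getPath());
            }
        }
    }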
Hi,
In most cases I think one cluster is enough. HDFS is a file system, and with
federation you may have multiple namenodes for different mount points. So you
may mount /images/facebook on namenode1 and /images/instagram on namenode2,
similar to Linux file system mounts. In such a way you…
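A sketch of how those mounts look on the client side with ViewFS; the cluster
name, namenode hostnames and ports are placeholders, and the paths are the
ones from this example:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ViewFsMountSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Client-side mount table: each directory maps to a different
            // federated namenode, similar to Linux mounts.
            conf.set("fs.defaultFS", "viewfs://clusterX");
            conf.set("fs.viewfs.mounttable.clusterX.link./images/facebook",
                     "hdfs://namenode1:8020/images/facebook");
            conf.set("fs.viewfs.mounttable.clusterX.link./images/instagram",
                     "hdfs://namenode2:8020/images/instagram");
            FileSystem fs = FileSystem.get(conf);
            System.out.println(fs.getFileStatus(new Path("/images/facebook")));
        }
    }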
Will the common pool of datanodes and namenode federation be a more effective
alternative in HDFS2 than multiple clusters?
On Sun, Jun 5, 2016 at 12:19 PM, daemeon reiydelle wrote:
There are indeed many tuning points here. If the name nodes and journal nodes
can be larger, perhaps even bonding multiple 10 GbE NICs, one can easily
scale. I did have one client where the file counts forced multiple clusters,
but we were able to differentiate by airframe types ... e.g. fixed wing…
Here is what I found on the Hortonworks website.
Namespace scalability
While HDFS cluster storage scales horizontally with the addition of datanodes,
the namespace does not. Currently the namespace can only be vertically scaled
on a single namenode. The namenode stores the entire file system metadata…
Hi,
I read some (old?) articles from the Internet about MapR-FS vs HDFS.
https://www.mapr.com/products/m5-features/no-namenode-architecture
It states that HDFS Federation has
a) "Multiple Single Points of Failure"; is it really true?
Why does MapR use HDFS rather than HDFS2 in its comparison, as this would…