I am now looking for Hadoop training classes, so please give me suggestions on
where to join Hadoop classes and how to study the Hadoop framework.
Sent from my Windows Phone
Hi there,
By looking at the .jhist file of a job, I see that there are startTime and
finishTime for each map task.
My question is: is reading the input data (local or remote) included in
the execution time?
Thanks,
Sultan
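For anyone who wants to inspect those timestamps programmatically: a .jhist file is essentially a sequence of JSON event records (after a short Avro schema header), and the start/finish times can be paired up per task id. Below is a minimal Python sketch, not an official parser; the field names (taskid, startTime, finishTime) follow the MapReduce job-history event schema, and the payload is taken as whatever single object sits under the "event" key, since in real files it is keyed by the full Java class name.

```python
import json

def task_durations(jhist_lines):
    """Pair TASK_STARTED / TASK_FINISHED events by task id and
    return duration (finishTime - startTime) in ms per task."""
    starts, durations = {}, {}
    for line in jhist_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip the non-JSON header line
        rec = json.loads(line)
        etype = rec.get("type")
        # the event payload is nested under one key (the Java class name)
        payload = next(iter(rec.get("event", {}).values()), {})
        if etype == "TASK_STARTED":
            starts[payload["taskid"]] = payload["startTime"]
        elif etype == "TASK_FINISHED":
            tid = payload["taskid"]
            if tid in starts:
                durations[tid] = payload["finishTime"] - starts[tid]
    return durations
```

Note this only answers what the recorded duration is, not what it contains; whether input-read time is included is exactly the open question above.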
Hi, all.
Just wanted to provide an update, which is that I’m finally getting good YARN
cluster utilization (consistently within the 90-100% range!). I believe the
biggest change was to increase the min split size. Since our input is all in
S3 and data locality is not really an issue, I bumpe
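For reference on why raising the min split size helps: FileInputFormat computes each split as max(minSize, min(maxSize, blockSize)), so setting mapreduce.input.fileinputformat.split.minsize above the block size directly yields larger splits and therefore fewer, longer-running map tasks. A tiny Python sketch of that formula (the function name is mine, for illustration):

```python
def split_size(block_size, min_size=1, max_size=2**63 - 1):
    """FileInputFormat's split computation:
    max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))

MB = 2**20
# default: splits track the 128 MB block size
# with min split size bumped to 512 MB: 4x fewer, larger splits
```

For example, with a 128 MB block size, bumping the min split size to 512 MB turns four block-sized reads into one split.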
Hi Deepak,
Hadoop is just a platform (Hadoop and everything around it), a toolset to do
what you want to do.
If you are writing bad code, you can't blame the programming language; it's you
not being able to write good code. There's also nothing bad in using
commodity hardware (and I'm not sure I understand what's co
Sorry once again if I am wrong, or my comments are without significance.
I am not saying Hadoop is bad or good... It is just that Hadoop might be
indirectly encouraging commodity hardware and software to be developed,
which is convenient but might not be very good (also the cost factor is
unproven wi
We run several clusters of thousands of nodes (as do many companies); our
largest one has over 10K nodes. Disks, machines, memory, and network fail
all the time. The larger the scale, the higher the odds that some machine
is bad on a given day. On the other hand, scale helps. If a single node our
o
My thoughts inline
You could see it the other way around, it is enabling everyone to solve
problems that are too complex for one server.
Deepak
If one server (with scale up) would provide the same scalability as several
hundred of them (scale out), then the effect computing
You could see it the other way around, it is enabling everyone to solve
problems that are too complex for one server.
Another way to look at it is that it reduces costs because scaling out is much
cheaper than scaling up.
You can actually (and usually you have to) be pretty ingenious and you ha
What I think, and I am sorry if I am wrong :-(
In the cluster you are not only adding hardware (CPU, memory, disk) but you
are also running separate software (OS, JVM, application)... So the reason the
cluster scales linearly is not due to hardware, but due to separate
software on each machine [As comp
Deepak,
I believe Yahoo and Facebook have the largest clusters, over 4-5 thousand nodes
in size.
If you add a new server to the cluster, you are simply adding to the CPU,
memory, and disk space of the cluster. So the capacity grows linearly as you add
nodes, except that network bandwidth is share
Deepak,
I have managed clusters where worker nodes crashed and disks failed.
HDFS takes care of the data replication unless you lose too many of the nodes,
such that there is not enough space to fit the replicas.
Sent from my iPhone
> On May 27, 2016, at 11:54 AM, Deepak Goel wrote:
>
>
> Hey
>
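As a back-of-the-envelope illustration of the replication point above: with HDFS's default replication factor of 3, the effective space for user data is roughly the raw capacity of the surviving nodes divided by 3, which is why losing too many nodes can leave no room to re-fit the replicas. A hedged sketch (the function name and numbers are mine, for illustration):

```python
def usable_capacity_gb(node_capacities_gb, replication=3):
    """Effective HDFS storage for user data: raw capacity across
    surviving nodes divided by the replication factor."""
    return sum(node_capacities_gb) / replication

# six 100 GB nodes hold ~200 GB of user data at replication 3;
# lose half the nodes and only ~100 GB of replicated data still fits
```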
One potential problem might be with the hostname
configuration. If you are using the hosts file to resolve DNS,
please verify the hostnames set in the hosts file. The problem looks to
be related to the invalid principal name, which can be due to a bad
hostname-to-IP mapping.
Regards,
Gaga
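A quick way to sanity-check the hostname-to-IP mapping described above is to forward-resolve the name and then reverse-resolve the resulting IP; if the two don't agree, Kerberos principal construction can end up with the wrong host part. A minimal Python sketch (the function name is mine, for illustration):

```python
import socket

def check_hostname_mapping(hostname):
    """Forward-resolve a hostname, reverse-resolve the resulting IP,
    and report whether the round trip is consistent."""
    ip = socket.gethostbyname(hostname)          # forward: name -> IP
    reverse, _, _ = socket.gethostbyaddr(ip)     # reverse: IP -> name
    return ip, reverse, reverse.lower() == hostname.lower()
```

Running this on each cluster node for its own fully-qualified hostname should return a consistent mapping; a mismatch points at a bad /etc/hosts or DNS entry.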
Hey
Namaskara~Nalama~Guten Tag~Bonjour
We are yet to see any server go down among our cluster nodes in the production
environment. Has anyone seen reliability problems in their production
environment? How many times?
Thanks
Deepak
--
Keigu
Deepak
73500 12833
www.simtree.net, dee...@simtree.net
Hey
Namaskara~Nalama~Guten Tag~Bonjour
Are there any performance benchmarks as to how many machines Hadoop can
scale up to? Is the growth linear (for 1 machine, x growth; for 2 machines,
2x growth; for n machines, nx growth?)?
Also, does the scaling depend on the type of jobs and amoun
Thank you for the detailed explanation.
On Tue, May 24, 2016 at 10:48 PM, Chris Nauroth
wrote:
> Hello Kumar,
>
> I answered at the Stack Overflow link. I'll repeat the same information
> here for everyone's benefit.
>
> HDFS implements the POSIX ACL model [1]. The linked documentation
> expla