Where I work we are running a transient (temporary) cluster using
Amazon EMR. When I was reading up on how things work, the suggestion for
monitoring was to use Ganglia to track memory usage, network usage, etc.
That way, depending on how things are set up, be it using an Amazon S3
bucket
Hi Rishabh,
I didn't know anything about Hadoop a few months ago, and I started from
the very beginning. I don't suggest you start with the online documentation,
which is always fragmented, incomplete, and sometimes not even up to date.
Also, starting by directly using Hadoop is the fastest way to
Hi Tariq,
Glad to see that your issue is resolved, thank you. This reaffirms the
compatibility issue with OpenJDK. Thanks
Regards,
Ravi
On Sat, Feb 21, 2015 at 1:40 PM, tesm...@gmail.com
wrote:
Dear Nair,
Your tip in your first email saved my day. Thanks once again. I am
Thank you for sharing.
Appreciated.
Tim
On Feb 22, 2015, at 1:23 AM, Jonathan Aquilina jaquil...@eagleeyet.net
wrote:
Hi Tim,
Not sure if this might be of any use in terms of improving overall
cluster performance for you, but I hope that it might shed some ideas
for you and others.
https://media.amazonwebservices.com/AWS_Amazon_EMR_Best_Practices.pdf
---
Regards,
Jonathan Aquilina
Founder Eagle Eye T
On
Can anyone help me?
Thanks,
Tim
On Feb 21, 2015, at 2:54 PM, Fang Zhou timchou@gmail.com wrote:
Hi All,
I want to test the memory usage on the NameNode and DataNode.
I tried using jmap, jstat, /proc/<pid>/stat, top, ps aux, and the Hadoop web
interface to check the memory.
The values I
Hi Jonathan,
Very useful information. I will look at Ganglia.
However, I do not have administrative privileges for the cluster, so I
don't know if I can install Ganglia on it.
Thank you for your information.
Best,
Tim
2015-02-22 0:53 GMT-06:00 Jonathan Aquilina
Hi
Be careful: HTTPS only secures WebHDFS. If you want to protect all
network streams you need more than that:
https://s3.amazonaws.com/dev.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.2/bk_reference/content/reference_chap-wire-encryption.html
If you're just interested in HTTPS, an lsof -p
I am rather new to Hadoop, but wouldn't the difference potentially be in
how the files are split in terms of size?
---
Regards,
Jonathan Aquilina
Founder Eagle Eye T
On 2015-02-21 21:54, Fang Zhou wrote:
Hi All,
I want to test the memory usage on the NameNode and DataNode.
I tried using
Hi Jonathan,
Thank you.
The number of files impacts the memory usage on the NameNode.
I just want to get the real memory usage of the NameNode.
The heap usage always changes, so I have no idea which value
is the right one.
Thanks,
Tim
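A rough way to see why file count, rather than data volume, drives NameNode memory is the common rule of thumb of roughly 150 bytes of heap per file/directory/block object. A minimal sketch; the object counts and the 150-byte constant are my own illustrative assumptions, not from the thread:

```shell
# Back-of-the-envelope NameNode heap estimate (rule of thumb: ~150 bytes
# of heap per file, directory, or block object; the constant is approximate).
files=1000000          # hypothetical: one million small files
blocks_per_file=1      # hypothetical: one block each
objects=$(( files + files * blocks_per_file ))
echo "$(( objects * 150 / 1024 / 1024 )) MiB (approx)"
```

This also illustrates why many small files cost the NameNode far more memory per byte stored than a few large ones.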
On Feb 22, 2015, at 12:22 AM,
Hello,
Please tell me where I can learn the concepts of Big Data and Hadoop from
scratch. Please provide some online links.
Rishabh Agrawal
I have been learning and trying to implement a Hadoop ecosystem for a
POC for the last month or so, and I think that the best way to learn is by
doing it.
Hadoop as a concept has lots of implementations, and I picked up the
Hortonworks Sandbox for learning.
This has helped me in gauging
$ time hadoop fs -put <local file> <hdfs path>
For small files, I would expect the time to have a significant variance
between runs. For larger files, it should be more consistent (since the
throughput will be bound by the network bandwidth of the local machine).
On 21 Feb 2015 08:43,
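The variance the reply describes can be checked by repeating the same put and comparing wall-clock times. A sketch under assumptions: the file and HDFS paths are placeholders, and a no-op stub stands in for `hadoop` when it is not on PATH, so the loop itself can be dry-run outside a cluster:

```shell
# Repeat the same upload a few times; small files should show large
# relative variance, large files should converge on network bandwidth.
command -v hadoop >/dev/null 2>&1 || hadoop() { :; }   # stub off-cluster

for i in 1 2 3; do
  start=$(date +%s%N)                      # GNU date, nanoseconds
  hadoop fs -put -f local.dat /tmp/local.dat
  end=$(date +%s%N)
  echo "run $i: $(( (end - start) / 1000000 )) ms"
done
```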
Hi,
Is it possible to run jobs on Hadoop in batch mode?
I have 5 different datasets in HDFS and need to run the same MapReduce
application on these datasets one after the other.
Right now I am doing it manually. How can I automate this?
How can I save the log of each execution in text
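One way to automate this is a plain shell loop, sketched here under assumptions: the jar, class, and HDFS path names are placeholders, and `hadoop` is stubbed when absent so the loop can be dry-run outside a cluster. Redirecting each run's stdout/stderr to its own file also answers the logging question:

```shell
# Run the same MapReduce job over five datasets in sequence, keeping a
# per-dataset text log of each execution.
command -v hadoop >/dev/null 2>&1 || hadoop() { echo "[stub] hadoop $*"; }

mkdir -p joblogs
for ds in data1 data2 data3 data4 data5; do
  hadoop jar myapp.jar com.example.MyJob \
    "/input/$ds" "/output/$ds" > "joblogs/$ds.log" 2>&1
done
```

For anything more involved than a fixed sequence (dependencies, retries, schedules), a workflow scheduler such as Apache Oozie is the usual next step.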
Rishabh:
You can start with:
http://wiki.apache.org/hadoop/HowToContribute
There are several components: common, HDFS, YARN, MapReduce, ...
Which ones are you interested in ?
Cheers
On Sat, Feb 21, 2015 at 12:18 AM, Bhupendra Gupta bhupendra1...@gmail.com
wrote:
I have been learning and trying
Hi All,
I want to test the memory usage on the NameNode and DataNode.
I tried using jmap, jstat, /proc/<pid>/stat, top, ps aux, and the Hadoop web
interface to check the memory.
The values I get from them are different. I also found that the memory
always changes periodically.
This is the first thing
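Part of why the tools disagree: top and ps report VmRSS, the resident memory of the whole JVM process, while jmap/jstat report only the JVM-managed heap, which shrinks and grows with every GC cycle. A Linux-only sketch reading VmRSS straight from /proc, using the current shell's pid as a stand-in for the NameNode pid (which you would find with `jps`):

```shell
# VmRSS is the whole-process resident memory in kB, the figure top/ps
# show; it will always exceed the GC-fluctuating heap that jstat reports.
rss_kb=$(awk '/^VmRSS:/ {print $2}' "/proc/$$/status")
echo "resident set: ${rss_kb} kB"
```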
Alex,
Thanks for looking at the output and your feedback. I want to make sure I
understand your input correctly.
My cluster is a set of old dual-core machines and my client is a VirtualBox
VM with 10 GB of memory allocated to it.
I did some more testing (and will continue to do so to track