Hi,
I just wanted to add that I know 45GB is really too little data to test the
performance of Hadoop/Hive, as it is designed for data in the terabytes. However, I have to
implement a POC, and it requires me to test only 45GB of data. Please let me
know if the performance can be improved.
Thanks,
Ramya
__
Hi,
I have set up a 4-node (physical) Hadoop cluster. Configuration: 2GB RAM on each
machine. Currently I am using the sub-project Hive for firing queries on 45GB of
data. I have certain queries that need to be resolved:
1) The performance that I am getting with the above setup is quite bad. It
ta
Hi all,
The 3rd Hadoop in China event (Hadoop World:Beijing 2009) is open for
registration now.
http://hadoop-world-beijing.eventbrite.com/
Please register as early as possible.
Thanks,
Yongqiang
On 09-8-22 12:21 AM, "He Yongqiang" wrote:
>
> http://www.hadooper.c
For info on newer JDK support for compressed oops, see http://java.sun.com/javase/6/webnotes/6u14.html
and http://wikis.sun.com/display/HotSpotInternals/CompressedOops
-Bryan
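For completeness, compressed oops is enabled with a single JVM flag (on 6u14+ it must be requested explicitly); in a Hadoop deployment it would typically be added to the daemon options in conf/hadoop-env.sh. This is an illustrative sketch, not a recommendation from the thread:

```shell
# conf/hadoop-env.sh -- illustrative; see the release notes linked above.
# On JDK 6u14 and later, compressed oops must be requested explicitly:
export HADOOP_NAMENODE_OPTS="-XX:+UseCompressedOops $HADOOP_NAMENODE_OPTS"
```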
On Sep 1, 2009, at 12:21 PM, Brian Bockelman wrote:
On Sep 1, 2009, at 1:58 PM, Stas Oskin wrote:
Hi.
With regards to memory, have you tried the compressed pointers JDK
option
(we saw great benefits on the NN)? Java is incredibly hard to get a
straight answer from with regards to memory. You need to perform a
GC first
manually - the act
On Sep 1, 2009, at 2:02 PM, Stas Oskin wrote:
Hi.
What does 'up to 700MB' mean? Is it the JVM's virtual memory? Resident
memory? Or the Java heap in use?
700 MB is what is taken by the overall Java process.
Resident, shared, or virtual? Unix memory management is not
straightforward; the worst thi
Hi.
What does 'up to 700MB' mean? Is it the JVM's virtual memory? Resident memory?
> or the Java heap in use?
>
700 MB is what is taken by the overall Java process.
>
> How many blocks do you have? For an idle DN, most of the memory is taken by
> block info structures. It does not really optimize for it.. May
Hi.
[
> https://issues.apache.org/jira/browse/HADOOP-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
>
>
Does it have any effect on the issue I have?
It seems from the description that the issues are related to various node
tasks, and not to one in particular.
Regards.
Hi.
> With regards to memory, have you tried the compressed pointers JDK option
> (we saw great benefits on the NN)? Java is incredibly hard to get a
> straight answer from with regards to memory. You need to perform a GC first
> manually - the actual usage is the amount it reports used post-GC
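The "measure after a GC" advice above can be sketched in plain Java. The class and helper name here are mine, not from the thread; in practice you would look at the post-GC figure from jstat or a heap dump, but the same idea can be shown with the Runtime API:

```java
// Sketch: Java only gives a meaningful "used heap" figure after a GC,
// because the heap fills with garbage between collections.
public class PostGcHeap {
    static long usedHeapPostGc() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();  // request a full collection first (a hint, but usually honored)
        return rt.totalMemory() - rt.freeMemory();  // bytes actually in use
    }

    public static void main(String[] args) {
        long usedMb = usedHeapPostGc() / (1024 * 1024);
        System.out.println("post-GC used heap: " + usedMb + " MB");
    }
}
```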
Hi.
The datanode would be using the major part of its memory to do the following:
> a. Continuously (at regular interval) send heartbeat messages to namenode
> to
> say 'I am live and awake'
> b. In case, any data/file is added to DFS, OR Map Reduce jobs are running,
> datanode would again be talking to n
I think this thread is moving in all possible directions... without
many details on the original problem.
There is no need to speculate on where the memory goes: you can run 'jmap
-histo:live' and 'jmap -heap' to get a much better idea.
What does 'up to 700MB' mean? Is it JVM's virtual memory?
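The jmap commands mentioned above are run against the live daemon's PID. A purely illustrative invocation (the jps-based PID lookup is my example; use the same JDK the daemon runs on):

```shell
# Illustrative: inspect a running DataNode's heap with the JDK tools.
PID=$(jps | awk '/DataNode/ {print $1}')
jmap -heap "$PID"          # heap configuration and current generation usage
jmap -histo:live "$PID"    # live-object histogram per class (triggers a full GC)
```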
After split Distcp belongs to hadoop-mapreduce.
Make sure hadoop-mapred-tools-$VER-dev.jar is in your classpath.
Boris.
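One way to do that is via HADOOP_CLASSPATH before launching your application; the jar path below is illustrative, and the command-line distcp tool is shown for comparison:

```shell
# Illustrative: expose the mapreduce tools jar to the hadoop script's classpath.
export HADOOP_CLASSPATH="/opt/hadoop/hadoop-mapred-tools-$VER-dev.jar:$HADOOP_CLASSPATH"
# The command-line equivalent of calling the DistCp class directly:
hadoop distcp hdfs://nn1:8020/src hdfs://nn2:8020/dst
```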
On 8/31/09 2:19 PM, "Kevin Peterson" wrote:
> On Fri, Aug 28, 2009 at 10:34 AM, mpiller wrote:
>
>>
>> I am using the DistCp class inside of my application to copy final ou
Hi,
The recommended RAM for the namenode, datanode, jobtracker and tasktracker is 1
GB each.
The datanode would be using the major part of its memory to do the following:
a. Continuously (at regular intervals) send heartbeat messages to the namenode to
say 'I am live and awake'
b. In case any data/file is added to DFS,
I have resolved the issue:
What I did:
1) '/etc/init.d/iptables stop' --> stopped the firewall
2) SELINUX=disabled in the '/etc/selinux/config' file --> disabled SELinux
It worked for me after these two changes.
thanks,
--umer
> From: m_umer_ars...@hotmail.com
> To: common-user@hadoop.apache.org
> Subject
Ahh.. very luckily I got a message on that JIRA today itself.
--
[
https://issues.apache.org/jira/browse/HADOOP-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Allen Wittenauer resolved HADOOP-6168.
--
Resolution: Duplicate
Hey Mafish,
If you are getting 1-2m blocks on a single datanode, you'll have many
other problems - especially with regards to periodic block reports.
With regards to memory, have you tried the compressed pointers JDK
option (we saw great benefits on the NN)? Java is incredibly hard to
ge
ashish pareek wrote:
Hello Bharath,
Earlier even I faced the same problem. I think you are
accessing the internet through a proxy, so try using a direct broadband connection.
Hope this will solve your problem.
or set Ant's proxy up
http://ant.apache.org/manual/proxy.html
Ashish
2009/9/1 Mafish Liu :
> Both NameNode and DataNode will be affected by number of files greatly.
> In my test, almost 60% memory are used in datanodes while storing 1m
> files, and the value reach 80% with 2m files.
> My test best is with 5 nodes, 1 namenode and 4 datanodes. All nodes
test bed
>
Both NameNode and DataNode will be greatly affected by the number of files.
In my test, almost 60% of memory is used in the datanodes while storing 1m
files, and the value reaches 80% with 2m files.
My test bed has 5 nodes, 1 namenode and 4 datanodes. All nodes
have 2GB memory and replication is 3.
2009/
Hi.
2009/9/1 Amogh Vasekar
> This won't change the daemon configs.
> Hadoop by default allocates 1000MB of memory for each of its daemons, which
> can be controlled by HADOOP_HEAPSIZE, HADOOP_NAMENODE_OPTS,
> HADOOP_TASKTRACKER_OPTS in the hadoop script.
> However, there was a discussion on this
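The knobs Amogh names live in conf/hadoop-env.sh. A hedged example (the values are illustrative; 1000MB is the default):

```shell
# conf/hadoop-env.sh -- example values only.
export HADOOP_HEAPSIZE=2000                                   # MB per daemon; default is 1000
export HADOOP_NAMENODE_OPTS="-Xmx3g $HADOOP_NAMENODE_OPTS"    # override the NameNode alone
export HADOOP_TASKTRACKER_OPTS="-Xmx1g $HADOOP_TASKTRACKER_OPTS"
```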
Hi.
2009/9/1 Mafish Liu
> Did you have many small files in your system?
>
>
Yes, quite a lot.
But this should influence the NameNode, and not the DataNode, correct?
Regards.