Re: dual core configuration
First of all, mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum are both set to 2 in the hadoop-default.xml file. That file is read before hadoop-site.xml, so any property not set in hadoop-site.xml falls back to the value in hadoop-default.xml.

As for why only one core is utilized: I think that really depends on the process scheduling of the underlying OS. There is no guarantee that the two tasks (the two JVM subprocesses spawned by the tasktracker) will always run on separate cores, since other processes also need core time.

By the way, what tool did you use to find out which tasks (or processes) run on which cores?

/Taeho
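One way to answer that question on Linux is sketched below (this assumes the procps versions of top and ps; the exact commands are illustrative, not something anyone in the thread confirmed using):

```shell
# In interactive top, pressing "1" toggles a separate Cpu0/Cpu1 line per core,
# which shows whether both cores are busy.

# Non-interactively, ps can report the core (PSR column) each process
# last ran on:
ps -eo pid,psr,comm | head -5

# Or for one specific process, e.g. the current shell:
ps -o psr= -p $$
```

Watching the PSR column of the two task JVMs over time shows whether the scheduler is actually spreading them across cores.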
Re: dual core configuration
Elia, perhaps you can try changing mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum to 4 in hadoop-site.xml in hopes of getting better utilization. It's strange to me that having both set to 2 utilizes only a single core, because I would imagine that any modern OS scheduler would do a good job of spreading load across cores. Just a thought.

Alex
Re: dual core configuration
False alarm, guys. Thanks for the replies. I do have the task maximum set to 2, and according to top it is utilizing both cores. I must have caught it between tasks or during the reduce phase, since I had only one reducer per node running at the time.

From hadoop-default.xml:

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>

Output from top:

top - 12:54:50 up 48 days, 16:19, 1 user, load average: 2.60, 1.55, 0.66
Tasks: 80 total, 3 running, 77 sleeping, 0 stopped, 0 zombie
Cpu0 : 98.1%us, 1.6%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu1 : 95.8%us, 2.9%sy, 0.0%ni, 0.0%id, 1.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:  1035160k total, 1019608k used, 15552k free, 1808k buffers
Swap: 2031608k total, 372k used, 2031236k free, 293612k cached

 PID USER PR NI VIRT RES  SHR S %CPU %MEM TIME+   COMMAND
2469 root 25  0 410m 161m 10m R 44.5 15.9 0:40.40 java
2446 root 25  0 411m 161m 11m R 43.2 16.0 0:45.88 java
dual core configuration
Hello, I have some dual-core nodes, and I've noticed Hadoop is only running one instance and so is only using one of the CPUs on each node. Is there a configuration option to tell it to run more than one task at a time, or do I need to turn each machine into two nodes? Thanks.
Re: dual core configuration
You can have your node (tasktracker) run more than one task simultaneously. You can set the mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum properties in the hadoop-site.xml file. You should change hadoop-site.xml on each of your slave nodes depending on how many cores it has; for example, you don't really want 8 tasks running at once on a 2-core machine.

/Taeho
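For reference, an override along those lines in hadoop-site.xml might look like the following sketch (the values are illustrative and should be chosen per node's core count; the tasktracker reads its configuration at startup, so it needs a restart to pick up the change):

```xml
<!-- hadoop-site.xml on each slave node; values below suit a dual-core box -->
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
```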
Re: dual core configuration
Taeho, I was going to suggest this change as well, but mapred.tasktracker.map.tasks.maximum is documented to default to 2. Can you explain why only one core is being utilized for Elia even though this option is set to 2? Here is the documentation I'm referring to: http://hadoop.apache.org/core/docs/r0.18.1/cluster_setup.html

Alex