Re: Appending and seeking files while writing
Append is supported in hadoop 0.20.

Hi. I think this really depends on the append functionality. Any idea whether it supports such behaviour now? Regards.

On Fri, Jun 11, 2010 at 10:41 AM, hadooprcoks hadoopro...@gmail.com wrote:
Stas, I also believe that there should be a seek interface on the write path, so that the FS API is complete. FSDataInputStream already supports seek(), so FSDataOutputStream should too. For file systems that do not support seek on the write path, the seek can be a no-op. Could you open a JIRA to track this? I am willing to provide the patch if you do not have the time to do so. Thanks, hadooprocks

On Thu, Jun 10, 2010 at 5:05 AM, Stas Oskin stas.os...@gmail.com wrote:
Hi. Was the append functionality finally added to the 0.20.1 version? Also, is the ability to seek within a file being written, and to write data at another position, also supported? Thanks in advance!
Hadoop 0.20.2 looking *inside* a file in the input path for files?
Hello, I am a newbie to Hadoop, following the WordCount tutorial at http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html but trying to update it to use the mapreduce classes instead of mapred. However I am getting the following error:

10/06/13 18:24:50 INFO mapred.JobClient: Task Id : attempt_201006131625_0023_m_00_0, Status : FAILED
java.io.FileNotFoundException: 123 123 123 (No such file or directory)

"123 123 123" is the first line of a file on my specified input path. Anyone have any idea what is going on? I do not run into this problem when I use the deprecated mapred classes. Thanks, Yipeng

--
View this message in context: http://old.nabble.com/Hadoop-0.20.2-looking-*inside*-a-file-in-the-input-path-for-files--tp28870429p28870429.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Hadoop 0.20.2 looking *inside* a file in the input path for files?
See https://issues.apache.org/jira/browse/MAPREDUCE-1734

BTW, it's easier for other people to reproduce your scenario if you post your code.

On Sun, Jun 13, 2010 at 3:35 AM, suckerfish yip...@gmail.com wrote:
Hello, I am a newbie to Hadoop, following the WordCount tutorial at http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html but trying to update it to use the mapreduce classes instead of mapred. However I am getting the following error: java.io.FileNotFoundException: 123 123 123 (No such file or directory). "123 123 123" is the first line of a file on my specified input path.
Re: Appending and seeking files while writing
On Sun, Jun 13, 2010 at 12:46 AM, Vidur Goyal vi...@students.iiit.ac.in wrote:
Append is supported in hadoop 0.20 .

Append will be supported in the 0.20-append branch, which is still in progress. It is NOT supported in vanilla 0.20. You can turn on the config option, but it is dangerous and highly discouraged for real use. Append will be supported fully in 0.21.

Also, append does *not* add random write. It simply adds the ability to re-open a file and add more data to the end.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
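To illustrate the distinction Todd draws, here is a minimal sketch of the same contract using plain java.io against the local filesystem (not the HDFS API; file name is arbitrary): re-opening in append mode can only extend the file at its end, never overwrite data at an arbitrary offset.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class AppendDemo {
    public static void main(String[] args) throws IOException {
        String path = "append-demo.txt";
        // Initial write creates (or truncates) the file.
        try (FileWriter w = new FileWriter(path)) {
            w.write("first\n");
        }
        // Re-open in append mode: new data can only go at the end,
        // which is also all that HDFS append promises.
        try (FileWriter w = new FileWriter(path, true)) {
            w.write("second\n");
        }
        System.out.print(new String(Files.readAllBytes(Paths.get(path))));
    }
}
```

The second open extends the file rather than replacing it, so both lines survive; there is no seek-then-write in this mode.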
problem setting up development environment for hadoop
Hello All, I have been trying to set up a development environment for HDFS using this link: http://wiki.apache.org/hadoop/EclipseEnvironment , but the project shows errors after the build completes. It does not contain certain files. Please help! vidur
Is it possible to sort values before they are sent to the reduce function?
Hi, For each key, there might be millions of values (LongWritable), but I only want to emit the top 20 of these values, sorted in descending order. So is it possible to sort these values before they enter the reduce phase? Thank you in advance! Kevin
Re: Is it possible to sort values before they are sent to the reduce function?
Hi Kevin, This is a very common technique. Look for secondary sort in Tom White's Hadoop: The Definitive Guide (Chapter 6). You'll most likely have to write your own Partitioner and WritableComparator. -- Alex K

On Sun, Jun 13, 2010 at 7:16 PM, Kevin Tse kevintse.on...@gmail.com wrote:
Hi, For each key, there might be millions of values (LongWritable), but I only want to emit the top 20 of these values, sorted in descending order. So is it possible to sort these values before they enter the reduce phase?
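In Hadoop itself a secondary sort means a composite (key, value) key, a Partitioner that partitions on the natural key only, a sort comparator ordering the composite key, and a grouping comparator so all values for one natural key reach one reduce() call. The sketch below is not Hadoop code; it simulates in plain Java, with illustrative names, the ordering contract that setup buys you: values arrive at the "reducer" already descending, so emitting the top N is just taking the first N.

```java
import java.util.*;

public class TopNPerKey {
    // Simulates the secondary-sort shuffle: sort composite (key, value)
    // pairs by natural key ascending, then by value descending, then
    // group by natural key and keep only the first n values per key.
    static Map<String, List<Long>> topN(List<Map.Entry<String, Long>> pairs, int n) {
        // Plays the role of the sort comparator on the composite key.
        pairs.sort((a, b) -> {
            int c = a.getKey().compareTo(b.getKey());               // natural key ascending
            return c != 0 ? c : Long.compare(b.getValue(), a.getValue()); // value descending
        });
        Map<String, List<Long>> out = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : pairs) {
            List<Long> vals = out.computeIfAbsent(e.getKey(), k -> new ArrayList<>());
            if (vals.size() < n) vals.add(e.getValue());            // top-N cutoff in the "reducer"
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>(Arrays.asList(
            new AbstractMap.SimpleEntry<>("a", 3L),
            new AbstractMap.SimpleEntry<>("a", 9L),
            new AbstractMap.SimpleEntry<>("b", 1L),
            new AbstractMap.SimpleEntry<>("a", 7L)));
        System.out.println(topN(pairs, 2)); // {a=[9, 7], b=[1]}
    }
}
```

The point of doing this in the framework rather than in the reducer is that the sort happens during the shuffle's merge, so the reducer never has to buffer millions of values in memory.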
Re: Problems with HOD and HDFS
Anybody? I am completely stuck here. I have no idea who else I can ask or where I can go for more information. Is there somewhere specific where I should be asking about HOD? Thank you, Dave

On Thu, Jun 10, 2010 at 2:56 PM, David Milne d.n.mi...@gmail.com wrote:
Hi there, I am trying to get Hadoop on Demand up and running, but am having problems with the ringmaster not being able to communicate with HDFS. The output from the hod allocate command ends with this, with full verbosity:

[2010-06-10 14:40:22,650] CRITICAL/50 hadoop:298 - Failed to retrieve 'hdfs' service address.
[2010-06-10 14:40:22,654] DEBUG/10 hadoop:631 - Cleaning up cluster id 34029.symphony.cs.waikato.ac.nz, as cluster could not be allocated.
[2010-06-10 14:40:22,655] DEBUG/10 hadoop:635 - Calling rm.stop()
[2010-06-10 14:40:22,665] DEBUG/10 hadoop:637 - Returning from rm.stop()
[2010-06-10 14:40:22,666] CRITICAL/50 hod:401 - Cannot allocate cluster /home/dmilne/hadoop/cluster
[2010-06-10 14:40:23,090] DEBUG/10 hod:597 - return code: 7

I've attached the hodrc file below, but briefly: HOD is supposed to provision an HDFS cluster as well as a Map/Reduce cluster, and seems to be failing to do so. The ringmaster log looks like this:

[2010-06-10 14:36:05,144] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
[2010-06-10 14:36:05,145] DEBUG/10 ringMaster:487 - getServiceAddr service: hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8
[2010-06-10 14:36:05,147] DEBUG/10 ringMaster:504 - getServiceAddr addr hdfs: not found
[2010-06-10 14:36:06,195] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
[2010-06-10 14:36:06,197] DEBUG/10 ringMaster:487 - getServiceAddr service: hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8
[2010-06-10 14:36:06,198] DEBUG/10 ringMaster:504 - getServiceAddr addr hdfs: not found

... and so on, until it gives up. Any ideas why?
One red flag is that when running the allocate command, some of the variables echoed back look dodgy:

--gridservice-hdfs.fs_port 0
--gridservice-hdfs.host localhost
--gridservice-hdfs.info_port 0

These are not what I specified in the hodrc. Are the port numbers just set to 0 because I am not using an external HDFS, or is this a problem? The software versions involved are:

- Hadoop 0.20.2
- Python 2.5.2 (no Twisted)
- Java 1.6.0_20
- Torque 2.4.5

The hodrc file looks like this:

[hod]
stream = True
java-home = /opt/jdk1.6.0_20
cluster = debian5
cluster-factor = 1.8
xrs-port-range = 32768-65536
debug = 3
allocate-wait-time = 3600
temp-dir = /scratch/local/dmilne/hod

[ringmaster]
register = True
stream = False
temp-dir = /scratch/local/dmilne/hod
log-dir = /scratch/local/dmilne/hod/log
http-port-range = 8000-9000
idleness-limit = 864000
work-dirs = /scratch/local/dmilne/hod/1,/scratch/local/dmilne/hod/2
xrs-port-range = 32768-65536
debug = 4

[hodring]
stream = False
temp-dir = /scratch/local/dmilne/hod
log-dir = /scratch/local/dmilne/hod/log
register = True
java-home = /opt/jdk1.6.0_20
http-port-range = 8000-9000
xrs-port-range = 32768-65536
debug = 4

[resource_manager]
queue = express
batch-home = /opt/torque-2.4.5
id = torque
options = l:pmem=3812M,W:X=NACCESSPOLICY:SINGLEJOB
#env-vars = HOD_PYTHON_HOME=/foo/bar/python-2.5.1/bin/python

[gridservice-mapred]
external = False
pkgs = /opt/hadoop-0.20.2
tracker_port = 8030
info_port = 50080

[gridservice-hdfs]
external = False
pkgs = /opt/hadoop-0.20.2
fs_port = 8020
info_port = 50070

Cheers, Dave
Re: Problems with HOD and HDFS
Hey Dave, I can't speak for the folks at Yahoo!, but from watching the JIRA, I don't think HOD is actively used or developed anywhere these days. You're attempting to use a mostly deprecated project, and hence not receiving any support on the mailing list. Thanks, Jeff

On Sun, Jun 13, 2010 at 7:33 PM, David Milne d.n.mi...@gmail.com wrote:
Anybody? I am completely stuck here. I have no idea who else I can ask or where I can go for more information. Is there somewhere specific where I should be asking about HOD? Thank you, Dave
Re: Is it possible to sort values before they are sent to the reduce function?
Hi Alex, I was reading Tom's book, but I had not reached Chapter 6 yet. I just read it, and it is really helpful. Thank you for mentioning it, and thanks also to Tom. Kevin

On Mon, Jun 14, 2010 at 10:22 AM, Alex Kozlov ale...@cloudera.com wrote:
Hi Kevin, This is a very common technique. Look for secondary sort in Tom White's Hadoop: The Definitive Guide (Chapter 6). You'll most likely have to write your own Partitioner and WritableComparator. -- Alex K
Re: Caching in HDFS C API Client
I'd bet on the Linux file cache. Assuming you wrote the file with the default replication factor of 3, there is one replica on the local filesystem, which is what you are reading... Try writing multiple GBs of data and randomly reading large files to blow your file cache? Arun

On Jun 11, 2010, at 10:05 AM, Patrick Donnelly wrote:
Hi List, I need to explain a higher than expected throughput (bandwidth) for an HDFS C API client. Specifically, the client is getting bandwidth higher than its link rate :). The client first writes a 512 MB file and then reads the entire file back. The read is what's getting the higher-than-link-rate bandwidth. I assume this is a consequence of caching? Is this done by HDFS or by Linux? Thanks for any help, -- - Patrick Donnelly
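The effect Arun describes is easy to reproduce locally. The hedged sketch below (plain Java, file name and sizes arbitrary) mirrors the poster's scenario: write a file, then read it straight back. Because the OS page cache still holds the freshly written blocks, the read's apparent bandwidth can far exceed what the disk or network could deliver; no timing numbers are asserted since they vary by machine.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class CacheReadDemo {
    // Time one sequential pass over the file; returns elapsed nanoseconds.
    static long timedRead(String path) throws IOException {
        byte[] buf = new byte[1 << 16];
        long t0 = System.nanoTime();
        try (FileInputStream in = new FileInputStream(path)) {
            while (in.read(buf) != -1) { /* discard the data */ }
        }
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) throws IOException {
        String path = "cache-demo.bin";
        byte[] chunk = new byte[1 << 20];
        // Write an 8 MB test file; like the 512 MB case on the list,
        // the just-written pages sit in the OS page cache.
        try (FileOutputStream out = new FileOutputStream(path)) {
            for (int i = 0; i < 8; i++) out.write(chunk);
        }
        // Both passes are typically served from the page cache here, which
        // is why the measured "bandwidth" can exceed the link rate.
        System.out.println("pass 1: " + timedRead(path) + " ns");
        System.out.println("pass 2: " + timedRead(path) + " ns");
    }
}
```

To measure the true disk or network path instead, the cache has to be invalidated between passes, for example by reading a dataset much larger than RAM, as Arun suggests.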
Re: Problems with HOD and HDFS
Ok, thanks Jeff. This is pretty surprising though. I would have thought many people would be in my position, where they have to use Hadoop on a general-purpose cluster and need it to play nicely with a resource manager. What do other people do in this position, if they don't use HOD? Deprecated normally means there is a better alternative. - Dave

On Mon, Jun 14, 2010 at 2:39 PM, Jeff Hammerbacher ham...@cloudera.com wrote:
Hey Dave, I can't speak for the folks at Yahoo!, but from watching the JIRA, I don't think HOD is actively used or developed anywhere these days. You're attempting to use a mostly deprecated project, and hence not receiving any support on the mailing list. Thanks, Jeff
Re: Problems with HOD and HDFS
On Monday 14 June 2010 08:03 AM, David Milne wrote:
Anybody? I am completely stuck here. I have no idea who else I can ask or where I can go for more information. Is there somewhere specific where I should be asking about HOD? Thank you, Dave

In the ringmaster logs you should see which node was supposed to run the Namenode; this can be found above the logs that you've printed. I can barely remember, but I think it reads something like getCommand(). Once you find that node, check the hodring logs there; something must have gone wrong on it. The return code was 7, indicating HDFS failure. See http://hadoop.apache.org/common/docs/r0.20.0/hod_user_guide.html#The+Exit+Codes+For+HOD+Are+Not+Getting+Into+Torque, and check if you are hitting one of the problems listed there. HTH, +vinod