unsubscribe
Original Message From: Smita Deshpande smita.deshpa...@cumulus-systems.com Sent: Wednesday, July 16, 2014 11:10 PM To: user@hadoop.apache.org Subject: Progress indicator should not be negative. Hi, I am running the distributed shell example of YARN on Apache Hadoop 2.4.0. I have implemented the getProgress method in my ApplicationMaster as follows:

public float getProgress() {
  // set progress to deliver to RM on next heartbeat
  float progress = 0;
  try {
    progress = (float) numCompletedContainers.get() / numTotalContainers.get();
  } catch (Exception _ex) {
    _ex.printStackTrace();
  }
  return progress;
}

While shutting down the application I get the following exception: Interrupted while waiting for queue java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:275) When I restart my application I get the following error: java.lang.IllegalArgumentException: Progress indicator should not be negative Because of this, my ApplicationMaster is launched in another container. This exception occurs every time in the above scenario. Can you suggest what is going wrong? Thanks, Smita
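A defensive variant of the getProgress method above may be worth trying. This is only a sketch based on the snippet in the mail, assuming numCompletedContainers and numTotalContainers are AtomicInteger fields of the ApplicationMaster: if numTotalContainers is 0 the float division yields NaN rather than a valid fraction, and during shutdown or restart the counters can be read in an inconsistent state, so clamping the reported value to the [0.0, 1.0] range the ResourceManager expects is a cheap safeguard. Whether this addresses the root cause in this scenario would still need verifying.

public float getProgress() {
    int total = numTotalContainers.get();
    if (total <= 0) {
        return 0.0f; // avoid 0/0, which yields NaN
    }
    float progress = (float) numCompletedContainers.get() / total;
    // Clamp to the [0.0, 1.0] range the ResourceManager expects.
    if (Float.isNaN(progress) || progress < 0.0f) {
        return 0.0f;
    }
    return Math.min(progress, 1.0f);
}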
Multiple Part files
Hi

After a Map Reduce job, we are seeing multiple small part files in the output directory. We are using the RC file format (snappy codec).

1) Will each part file take a 64MB block?
2) How can we merge these multiple RC format part files into one RC file?
3) What are the pros and cons of having multiple part files?
4) Will merging the part files improve performance?

Thanks and Regards Prabakaran.N aka NP nsn, Bangalore When I is replaced by We - even Illness becomes Wellness
Replace a block with a new one
Hi guys, I recently encountered a scenario that requires replacing an existing block with a newly written block. The most straightforward way to do this may be the following: suppose the original file is A; we write a new file B composed of the new data blocks, then merge A and B into C, which is the file we want. The obvious shortcoming of this method is that it wastes network bandwidth. I'm wondering whether there is a way to replace the old block with the new block directly. Any thoughts? -- Best Wishes! Yours, Zesheng
Re: Evaluation of cost of cluster
Hi, The mail is now nicely formatted, but I would suggest you take the time to digest the answers to the same question you have already asked twice. https://www.mail-archive.com/search?l=user%40hadoop.apache.orgq=YIMEN+YIMGA+Gael https://www.mail-archive.com/user%40hadoop.apache.org/msg15411.html The Hadoop mailing list is not a hardware shop. The best way to know the price is to ask vendors for a quote and/or check their prices on their websites. Regards Bertrand Dechoux

On Thu, Jul 17, 2014 at 11:01 AM, YIMEN YIMGA Gael gael.yimen-yi...@sgcib.com wrote: Hello Guys, I need your help to fix a cost evaluation for a cluster. Characteristics are the following: a cluster with 6 servers (1 Namenode, 1 Secondary namenode, 4 datanodes). The configuration is given below in the tables.

COMPONENTS OF OUR NAMENODE MACHINE
- Hard disk which will store all information (file configuration for the Hadoop cluster, Namenode and JobTracker): SATA, 12 HDD of 3TB, SAS Controller 6GB/s
- RAM (JobTracker, Namenode): 10 GB of RAM, DDR3 ECC
- CPU (Namenode CPU, Datanode CPU): 4 cores of 2 GHz = 8 GHz
- Operating System: Redhat or Debian
- Others: possibility to get an address by DHCP on the network, possibility to access the Internet to install some OS packages, an admin account on the server, SSH installed by default

COMPONENTS OF OUR DATANODE MACHINE
- Hard disk which will store all information (application logs, Datanode and TaskTracker): SATA, 12 HDD of 3TB, SAS Controller 6GB/s
- RAM (TaskTracker, Datanode, data processing with MapReduce): 24 GB of RAM, DDR3 ECC
- CPU (Datanode CPU, TaskTracker CPU): 4 cores of 3 GHz = 12 GHz
- Operating System: Redhat or Debian
- Others: possibility to get an address by DHCP on the network, possibility to access the Internet to install some OS packages, an admin account on the server, SSH installed by default

NB: I would like to have an evaluation using commodity hardware (the lowest cost possible). Do not forget to include the labor costs. Standing by for your return. Warm regards --- Gaël YIMEN YIMGA
RE: Evaluation of cost of cluster
Hi, First, To produce this clear architecture, I used the answer from the question I asked last time. I thank all people who gave me the clue to do so. Second, I know that Hadoop mailing is not a shop. I would like just to use experience of some people here. I know that what I’m challenging today is an issue that has an answer already, that’s why I submit my request. I can contact some vendors but in my scope, I do not have ways to do so. To contact a vendors you should have money. I do not have it. Thanks in advance for your help. Regards From: Bertrand Dechoux [mailto:decho...@gmail.com] Sent: Thursday 17 July 2014 11:50 To: user@hadoop.apache.org Cc: vino...@hortonworks.com Subject: Re: Evaluation of cost of cluster Hi, The mail is now nicely formated but I would suggest you take the time to digest answers from the same question you already asked twice. https://www.mail-archive.com/search?l=user%40hadoop.apache.orgq=YIMEN+YIMGA+Gael https://www.mail-archive.com/user%40hadoop.apache.org/msg15411.html The hadoop mailing is not a hardware shop. The best way to know the price is ask vendors for quote and/or check their prices on their website. Regards Bertrand Dechoux On Thu, Jul 17, 2014 at 11:01 AM, YIMEN YIMGA Gael gael.yimen-yi...@sgcib.commailto:gael.yimen-yi...@sgcib.com wrote: Hello Guys, I need your help to fix cost evaluation for a cluster. Characteristics are the followings : A cluster with 6 servers (1 Namenode, 1 Secondary namenode, 4 datanodes). The configuration is given below in the tables. COMPONENTS OF OUR NAMENODE MACHINE SIZE Hard Disk which will store all information. - File configuration for Hadoop cluster - Namenode and JobTracker - SATA, 12 HDD of 3TB - SAS Controller 6GB/s RAM - Job Tracker - Namenode - 10 GB of RAM - DDR3 ECC CPU - Namenode CPU - Datanode CPU - 4 cores of 2 GHz = 8GHz Operating System - Redhat or Debian Others - Possibility to get address by DHCP on the network. - Possibility to access Internet to install some OS packages - An admin account on the server - By default, SSH should be install COMPONENTS OF OUR DATANODE MACHINE SIZE Hard Disk which will store all information. - Applications logs - Datanode and tasktracker - SATA, 12 HDD of 3TB - SAS Controller 6GB/s RAM - Task Tracker - Datanode - Data processes (Map Reduce) - 24 GB of RAM - DDR3 ECC CPU - datanode CPU - tasktracker CPU - 4 cores of 3 GHz = 12GHz Operating System - Redhat or Debian Others - Possibility to get address by DHCP on the network. - Possibility to access Internet to install some OS packages - An admin account on the server - By default, SSH should be install NB : I would like to have an evaluation using commodity hardware (the most low cost possible). Do not forget to give the labor costs Standing by for your return. Warm regards --- Gaël YIMEN YIMGA * This message and any attachments (the message) are confidential, intended solely for the addressee(s), and may contain legally privileged information. Any unauthorised use or dissemination is prohibited. E-mails are susceptible to alteration. Neither SOCIETE GENERALE nor any of its subsidiaries or affiliates shall be liable for the message if altered, changed or falsified. Please visit http://swapdisclosure.sgcib.com for important information with respect to derivative products. Ce message et toutes les pieces jointes (ci-apres le message) sont confidentiels et susceptibles de contenir des informations couvertes par le secret professionnel. Ce message est etabli a l'intention exclusive de ses destinataires. 
RE: Multiple Part files
Hi Prabakaran, Multiple small part files appear in the output directory because each reducer task's output comes out as one part file. 1. Will each part file take a 64MB block? One part file is created per reducer, based on the size of its output. The file can be smaller than the HDFS block size, i.e. it is not necessarily 64MB. 2. How to merge these multiple RC format part files into one RC file? One way (maybe the longer way) is to get the part files into a local directory and write a tool to merge the RC files. But I feel that in the first place we should ensure there is a single reducer, so that no merging is needed. 3. What are the pros and cons of having multiple part files? It depends on the next operation you want to do. For example, if you are planning to load into Hive, it is better to configure the MR job to partition its output according to the Hive partitions so that loading is easier. 4. Will merging part files improve performance? Performance of the MapReduce job, or of a later operation? If the overall scenario is known, we will be able to help better. Regards, Naga Huawei Technologies Co., Ltd. Phone: Fax: Mobile: +91 9980040283 Email: naganarasimh...@huawei.com Huawei Technologies Co., Ltd. Bantian, Longgang District, Shenzhen 518129, P.R.China http://www.huawei.com From: Natarajan, Prabakaran 1. (NSN - IN/Bangalore) [prabakaran.1.natara...@nsn.com] Sent: Thursday, July 17, 2014 15:52 To: user@hadoop.apache.org Subject: Multiple Part files Hi After Map Reduce job, we are seeing multiple small part files in the output directory. We are using RC file format (snappy codec) 1. Do each part file will take 64MB block size? 2. How to merge these multiple RC format part files into one RC file? 3. What is the pros-cons of having multiple part files? 4. Do merging part files will improve performance? Thanks and Regards Prabakaran.N aka NP nsn, Bangalore When I is replaced by We - even Illness becomes Wellness
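If a single output file is the goal, forcing a single reduce task, as suggested above, is usually simpler than merging afterwards. A minimal driver-side sketch follows; the class name and job name are placeholders, and a single reducer gives up reduce-phase parallelism, so it only makes sense when the reduce-side data volume is modest.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SingleOutputDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "single-output-example");
    // One reducer means exactly one part file in the output directory.
    job.setNumReduceTasks(1);
    // ... mapper/reducer/input/output setup for the real job goes here ...
  }
}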
Re: Evaluation of cost of cluster
Vendors should always be able to give you no obligation quotations based on your requirements (though you may have to fend off follow up sales calls afterwards) Or you can simply use the vendors websites many of which will have tools that allow you to configure a server thus giving you an estimated cost per server which you can use to calculate an estimated cluster cost Rob From: YIMEN YIMGA Gael gael.yimen-yi...@sgcib.com Reply-To: user@hadoop.apache.org Date: Thursday, 17 July 2014 10:58 To: user@hadoop.apache.org user@hadoop.apache.org Cc: vino...@hortonworks.com vino...@hortonworks.com, decho...@gmail.com decho...@gmail.com Subject: RE: Evaluation of cost of cluster Hi, First, To produce this clear architecture, I used the answer from the question I asked last time. I thank all people who gave me the clue to do so. Second, I know that Hadoop mailing is not a shop. I would like just to use experience of some people here. I know that what I’m challenging today is an issue that has an answer already, that’s why I submit my request. I can contact some vendors but in my scope, I do not have ways to do so. To contact a vendors you should have money. I do not have it. Thanks in advance for your help. Regards From: Bertrand Dechoux [mailto:decho...@gmail.com] Sent: Thursday 17 July 2014 11:50 To: user@hadoop.apache.org Cc: vino...@hortonworks.com Subject: Re: Evaluation of cost of cluster Hi, The mail is now nicely formated but I would suggest you take the time to digest answers from the same question you already asked twice. https://www.mail-archive.com/search?l=user%40hadoop.apache.orgq=YIMEN+YIMGA+G ael https://www.mail-archive.com/user%40hadoop.apache.org/msg15411.html The hadoop mailing is not a hardware shop. The best way to know the price is ask vendors for quote and/or check their prices on their website. Regards Bertrand Dechoux On Thu, Jul 17, 2014 at 11:01 AM, YIMEN YIMGA Gael gael.yimen-yi...@sgcib.com wrote: Hello Guys, I need your help to fix cost evaluation for a cluster. Characteristics are the followings : A cluster with 6 servers (1 Namenode, 1 Secondary namenode, 4 datanodes). The configuration is given below in the tables. COMPONENTS OF OUR NAMENODE MACHINE SIZE Hard Disk which will store all information.- File configuration for Hadoop cluster- Namenode and JobTracker - SATA, 12 HDD of 3TB- SAS Controller 6GB/s RAM- Job Tracker- Namenode - 10 GB of RAM- DDR3 ECC CPU- Namenode CPU- Datanode CPU - 4 cores of 2 GHz = 8GHz Operating System - Redhat or Debian Others - Possibility to get address by DHCP on the network.- Possibility to access Internet to install some OS packages- An admin account on the server- By default, SSH should be install COMPONENTS OF OUR DATANODE MACHINE SIZE Hard Disk which will store all information.- Applications logs- Datanode and tasktracker - SATA, 12 HDD of 3TB- SAS Controller 6GB/s RAM- Task Tracker- Datanode- Data processes (Map Reduce) - 24 GB of RAM- DDR3 ECC CPU- datanode CPU- tasktracker CPU - 4 cores of 3 GHz = 12GHz Operating System - Redhat or Debian Others - Possibility to get address by DHCP on the network.- Possibility to access Internet to install some OS packages- An admin account on the server- By default, SSH should be install NB : I would like to have an evaluation using commodity hardware (the most low cost possible). Do not forget to give the labor costs Standing by for your return. 
Warm regards --- Gaël YIMEN YIMGA
Re: Replace a block with a new one
Hi, there's no way to do that, as HDFS does not provide file update features. You'll need to write a new file with the changes. Note that even if you manage to find the physical block replica files on disk corresponding to the part of the file you want to change, you can't simply update them manually, as this would give a different checksum, making HDFS mark such blocks as corrupt. Regards, Wellington. On 17 Jul 2014, at 10:50, Zesheng Wu wuzeshen...@gmail.com wrote: Hi guys, I recently encounter a scenario which needs to replace an exist block with a newly written block The most straightforward way to finish may be like this: Suppose the original file is A, and we write a new file B which is composed by the new data blocks, then we merge A and B to C which is the file we wanted The obvious shortcoming of this method is wasting of network bandwidth I'm wondering whether there is a way to replace the old block by the new block directly. Any thoughts? -- Best Wishes! Yours, Zesheng
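One possible refinement of the merge step described in the original mail: if the goal is only to append the newly written file onto the original without re-copying its bytes, HDFS exposes FileSystem.concat(), which stitches the source file's blocks onto the target at the namespace level. It does not replace a block in place, and it carries version-dependent restrictions on block sizes and block fullness, so treat the following as a sketch to test rather than a drop-in answer; the paths are placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ConcatExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Placeholders: the original file and the newly written file.
    Path target = new Path("/data/fileA");
    Path[] sources = { new Path("/data/fileB") };
    // HDFS-only operation: moves B's blocks onto the end of A in the
    // namespace, without copying bytes through the client.
    fs.concat(target, sources);
  }
}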
Re: Multiple Part files
Hadoop has a getmerge command ( http://hadoop.apache.org/docs/r0.19.1/hdfs_shell.html#getmerge) command, I'm not certain if it works with RC file, i think it should. So maybe you don't have to copy the files to local. On Thu, Jul 17, 2014 at 6:18 AM, Naganarasimha G R (Naga) garlanaganarasi...@huawei.com wrote: Hi Prabakaran, Multiple small part files in the output directory is because each reducer task output is coming as one part file. 1. Do each part file will take 64MB block size? *Based on the output size of the reducer one part file is created. Filesize can be smaller size than the hdfs block size, i.e. it not be mandatorily be of 64MB* 2. How to merge these multiple RC format part files into one RC file? *One way (may be longer way ) is to get the part files in to local diretory and write a tool to merge all the RC files. * *But anyway i feel in the first place we need to ensure we have single reducer so that there is no need for merging* 3. What is the pros-cons of having multiple part files? *Depends on the next operation what you want to do, * *Like if you are planning to load into Hive then based on Hive paritions better to configure the MR to be partitioned as per Hive partiions and loading would be easier? etc ... * 4. Do merging part files will improve performance? Performance of the Map reduce or later operation ? I think if the overall scenario is known then we will be able to support better Regards, Naga Huawei Technologies Co., Ltd. Phone: Fax: Mobile: +91 9980040283 Email: naganarasimh...@huawei.com Huawei Technologies Co., Ltd. Bantian, Longgang District,Shenzhen 518129, P.R.China http://www.huawei.com ¡ This e-mail and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! -- *From:* Natarajan, Prabakaran 1. (NSN - IN/Bangalore) [ prabakaran.1.natara...@nsn.com] *Sent:* Thursday, July 17, 2014 15:52 *To:* user@hadoop.apache.org *Subject:* Multiple Part files Hi After Map Reduce job, we are seeing multiple small part files in the output directory. We are using RC file format (snappy codec) 1. Do each part file will take 64MB block size? 2. How to merge these multiple RC format part files into one RC file? 3. What is the pros-cons of having multiple part files? 4. Do merging part files will improve performance? *Thanks and Regards* Prabakaran.N aka NP nsn, Bangalore *When I is replaced by We - even Illness becomes Wellness*
Re: Configuration set up questions - Container killed on request. Exit code is 143
Another thing to try is smaller input splits if your data can be broken up into smaller files that can be independently processed. That way s you get more but smaller map tasks. You could also use more but smaller reducers. The many files will tax your NameNode more but you might get to use all you cores. On Jul 17, 2014 9:07 AM, Chris MacKenzie stu...@chrismackenziephotography.co.uk wrote: Hi Chris, Thanks for getting back to me. I will set that value to 10 I have just tried this. https://support.gopivotal.com/hc/en-us/articles/201462036-Mapreduce-YARN-Me mory-Parameters Setting both to mapreduce.map.memory.mb mapreduce.reduce.memory.mb. Though after setting it I didn’t get the expected change. As the output was still 2.1 GB of 2.1 GB virtual memory used. Killing container Regards, Chris MacKenzie telephone: 0131 332 6967 email: stu...@chrismackenziephotography.co.uk corporate: www.chrismackenziephotography.co.uk http://www.chrismackenziephotography.co.uk/ http://plus.google.com/+ChrismackenziephotographyCoUk/posts http://www.linkedin.com/in/chrismackenziephotography/ From: Chris Mawata chris.maw...@gmail.com Reply-To: user@hadoop.apache.org Date: Thursday, 17 July 2014 13:36 To: Chris MacKenzie stu...@chrismackenziephotography.co.uk Cc: user@hadoop.apache.org Subject: Re: Configuration set up questions - Container killed on request. Exit code is 143 Hi Chris MacKenzie, I have a feeling (I am not familiar with the kind of work you are doing) that your application is memory intensive. 8 cores per node and only 12GB is tight. Try bumping up the yarn.nodemanager.vmem-pmem-ratio Chris Mawata On Wed, Jul 16, 2014 at 11:37 PM, Chris MacKenzie stu...@chrismackenziephotography.co.uk wrote: Hi, Thanks Chris Mawata I’m working through this myself, but wondered if anyone could point me in the right direction. I have attached my configs. I’m using hadoop 2.41 My system is: 32 Clusters 8 processors per machine 12 gb ram Available disk space per node 890 gb This is my current error: mapreduce.Job (Job.java:printTaskEvents(1441)) - Task Id : attempt_1405538067846_0006_r_00_1, Status : FAILED Container [pid=25848,containerID=container_1405538067846_0006_01_04] is running beyond virtual memory limits. Current usage: 439.0 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container. 
Dump of the process-tree for container_1405538067846_0006_01_04 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 25853 25848 25848 25848 (java) 2262 193 2268090368 112050 /usr/java/latest//bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx768m -Djava.io.tmpdir=/tmp/hadoop-cm469/nm-local-dir/usercache/cm469/appcache/ap plication_1405538067846_0006/container_1405538067846_0006_01_04/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/scratch/extra/cm469/hadoop-2.4.1/logs/userlog s/application_1405538067846_0006/container_1405538067846_0006_01_04 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 137.195.143.103 59056 attempt_1405538067846_0006_r_00_1 4 |- 25848 25423 25848 25848 (bash) 0 0 108613632 333 /bin/bash -c /usr/java/latest//bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx768m -Djava.io.tmpdir=/tmp/hadoop-cm469/nm-local-dir/usercache/cm469/appcache/ap plication_1405538067846_0006/container_1405538067846_0006_01_04/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/scratch/extra/cm469/hadoop-2.4.1/logs/userlog s/application_1405538067846_0006/container_1405538067846_0006_01_04 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 137.195.143.103 59056 attempt_1405538067846_0006_r_00_1 4 1/scratch/extra/cm469/hadoop-2.4.1/logs/userlogs/application_1405538067846 _0006/container_1405538067846_0006_01_04/stdout 2/scratch/extra/cm469/hadoop-2.4.1/logs/userlogs/application_1405538067846 _0006/container_1405538067846_0006_01_04/stderr Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143 Regards, Chris MacKenzie telephone: 0131 332 6967 email: stu...@chrismackenziephotography.co.uk corporate: www.chrismackenziephotography.co.uk http://www.chrismackenziephotography.co.uk http://www.chrismackenziephotography.co.uk/ http://plus.google.com/+ChrismackenziephotographyCoUk/posts http://www.linkedin.com/in/chrismackenziephotography/ From: Chris Mawata chris.maw...@gmail.com Reply-To: user@hadoop.apache.org Date: Thursday, 17 July 2014 02:10 To: user@hadoop.apache.org Subject: Re: Can someone shed some light on this ? - java.io.IOException: Spill failed I would post
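For reference, the memory settings discussed in this thread generally have to move together: the YARN container sizes (mapreduce.map.memory.mb / mapreduce.reduce.memory.mb), the child JVM heap set through the *.java.opts properties (the -Xmx768m visible in the process dump above), and the NodeManager-side yarn.nodemanager.vmem-pmem-ratio, which is read from yarn-site.xml on the nodes rather than from the job. A hedged, driver-side sketch with purely illustrative values, not a recommendation for this particular cluster:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MemoryTuningDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Container sizes requested from YARN, in MB (illustrative values only).
    conf.setInt("mapreduce.map.memory.mb", 2048);
    conf.setInt("mapreduce.reduce.memory.mb", 2048);
    // Keep the JVM heap well below the container size so the task is not
    // killed for exceeding its physical/virtual memory allowance.
    conf.set("mapreduce.map.java.opts", "-Xmx1536m");
    conf.set("mapreduce.reduce.java.opts", "-Xmx1536m");
    Job job = Job.getInstance(conf, "memory-tuning-example");
    // ... remaining job setup ...
  }
}

Raising yarn.nodemanager.vmem-pmem-ratio itself (the default of 2.1 matches the "2.1 GB virtual memory" limit in the log above for a 1 GB container) belongs in yarn-site.xml on the NodeManagers and takes effect after a NodeManager restart.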
Re: Providing a file instead of a directory to a M/R job
No reason why not. And a permission problem explains why there is an error: missing access rights. Bertrand Dechoux On Thu, Jul 17, 2014 at 4:58 PM, Shahab Yunus shahab.yu...@gmail.com wrote: In MRv2 or Yarn is it possible to provide a complete path to a file instead of a directory to a mapreduce job? Usually we provide a list of directory paths by using FileInputFormat.addInputPath. Can we provide a path which is a full path to an actual file? I have tried it but getting unexpected permission errors. Thanks.
Re: Providing a file instead of a directory to a M/R job
That is what I thought too, but when I give the parent directory of that same file as the input path, it works. Perhaps I am messing something up. I am using Cloudera 4.6 btw. Meanwhile I have noticed that I can read files directly by using MultipleInputs and that works fine. Regards, Shahab On Thu, Jul 17, 2014 at 11:23 AM, Bertrand Dechoux decho...@gmail.com wrote: No reason why not. And a permission explains why there is an error : missing access rights Bertrand Dechoux On Thu, Jul 17, 2014 at 4:58 PM, Shahab Yunus shahab.yu...@gmail.com wrote: In MRv2 or Yarn is it possible to provide a complete path to a file instead of a directory to a mapreduce job? Usually we provide a list of directory paths by using FileInputFormat.addInputPath. Can we provide a path which is a full path to an actually file? I have tried it but getting unexpected permission errors. Thanks.
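For what it's worth, FileInputFormat accepts a path that points directly at a file, not just at a directory, so a sketch like the one below should work; the class name and path are placeholders. If the direct file path fails where its parent directory succeeds, comparing HDFS permissions for that exact file (including execute bits on the parent directories) seems like a reasonable next check.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class SingleFileInputDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "single-file-input-example");
    // A full path to one file is accepted the same way as a directory path;
    // the path below is just a placeholder.
    FileInputFormat.addInputPath(job, new Path("/data/input/part-00000"));
    // ... mapper/reducer/output setup ...
  }
}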
Upgrading from 1.1.2 to 2.2.0
Has anyone upgraded directly from 1.1.2 to 2.2.0? If so, is there anything I should be concerned about? Thanks, Rich -- *Kernighan's Law* Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
unsubscribe
unsubscribe *Don Hilborn *Solutions Engineer, Hortonworks *Mobile: 832-444-5463* Email: *dhilb...@hortonworks.com* Website: *http://www.hortonworks.com/* Hortonworks where business data becomes business insight
Re: unsubscribe
Please send email to user-unsubscr...@hadoop.apache.org See http://hadoop.apache.org/mailing_lists.html#User On Thu, Jul 17, 2014 at 9:20 AM, Don Hilborn dhilb...@hortonworks.com wrote: unsubscribe *Don Hilborn *Solutions Engineer, Hortonworks *Mobile: 832-444-5463* Email: *dhilb...@hortonworks.com* Website: *http://www.hortonworks.com/* Hortonworks where business data becomes business insight
Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?
Hello Sophie, If you're using the HDFS lib like you said (i.e. obtaining an instance of FileSystem and using its methods), then I expect your code will just work with no code changes required when you start running against a secure cluster. The work of switching to a secured deployment with Kerberos is mostly a matter of configuration: http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SecureMode.html This configuration is external to application code. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jul 16, 2014 at 2:00 PM, Xiaohua Chen xiaohua.c...@gmail.com wrote: Hi Experts, I am new to Hadoop. I would like to get some help from you: Our current HDFS java client works fine with hadoop server which has NO Kerberos security enabled. We use HDFS lib e.g. org.apache.hadoop.fs.*. Now we need to change it to work with Kerberosed Hadoop server. Can you let me know what changes are needed ? Thanks and regards, Sophie -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?
Hi Chris, Thank you very much for your reply. One more question: I come across org.apache.hadoop.security.SecurityUtil class(http://hadoop.apache.org/docs/stable1/api/index.html?org/apache/hadoop/security/SecurityUtil.html) and it provides a couple of login methods e.g. login(Configuration conf, String keytabFileKey, String userNameKey) . So if Kerberos kinit utility is not available from client workstation where our java client is deployed , do you think the above SecurityUtil.login(...) can help our application code to authenticate the user defined through the userNameKey argument and its credential is provided through keyTab file ? Thanks again your help! Best Regards, Sophie On Thu, Jul 17, 2014 at 10:42 AM, Chris Nauroth cnaur...@hortonworks.com wrote: Hello Sophie, If you're using the HDFS lib like you said (i.e. obtaining an instance of FileSystem and using its methods), then I expect your code will just work with no code changes required when you start running against a secure cluster. The work of switching to a secured deployment with Kerberos is mostly a matter of configuration: http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SecureMode.html This configuration is external to application code. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jul 16, 2014 at 2:00 PM, Xiaohua Chen xiaohua.c...@gmail.com wrote: Hi Experts, I am new to Hadoop. I would like to get some help from you: Our current HDFS java client works fine with hadoop server which has NO Kerberos security enabled. We use HDFS lib e.g. org.apache.hadoop.fs.*. Now we need to change it to work with Kerberosed Hadoop server. Can you let me know what changes are needed ? Thanks and regards, Sophie CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?
Hi Sophie, Yes, you could authenticate via SecurityUtil#login, which is a convenience wrapper over UserGroupInformation#loginUserFromKeytab. This is essentially what daemons like the NameNode do. However, you might find that it's best overall to get kinit deployed to your client machines. For example, the CLI commands like hdfs dfs -ls aren't coded to do an explicit login like this, so you'll really need kinit available if users on the client machines want to use the CLI. Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Jul 17, 2014 at 2:45 PM, Xiaohua Chen xiaohua.c...@gmail.com wrote: Hi Chris, Thank you very much for your reply. One more question: I come across org.apache.hadoop.security.SecurityUtil class( http://hadoop.apache.org/docs/stable1/api/index.html?org/apache/hadoop/security/SecurityUtil.html ) and it provides a couple of login methods e.g. login(Configuration conf, String keytabFileKey, String userNameKey) . So if Kerberos kinit utility is not available from client workstation where our java client is deployed , do you think the above SecurityUtil.login(...) can help our application code to authenticate the user defined through the userNameKey argument and its credential is provided through keyTab file ? Thanks again your help! Best Regards, Sophie On Thu, Jul 17, 2014 at 10:42 AM, Chris Nauroth cnaur...@hortonworks.com wrote: Hello Sophie, If you're using the HDFS lib like you said (i.e. obtaining an instance of FileSystem and using its methods), then I expect your code will just work with no code changes required when you start running against a secure cluster. The work of switching to a secured deployment with Kerberos is mostly a matter of configuration: http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SecureMode.html This configuration is external to application code. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jul 16, 2014 at 2:00 PM, Xiaohua Chen xiaohua.c...@gmail.com wrote: Hi Experts, I am new to Hadoop. I would like to get some help from you: Our current HDFS java client works fine with hadoop server which has NO Kerberos security enabled. We use HDFS lib e.g. org.apache.hadoop.fs.*. Now we need to change it to work with Kerberosed Hadoop server. Can you let me know what changes are needed ? Thanks and regards, Sophie CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
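To make the keytab approach described above concrete, here is a minimal sketch; the principal, keytab path, and class name are placeholders, and it assumes the client's core-site.xml already declares Kerberos as the authentication method.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosClientExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumes core-site.xml on the client already sets
    // hadoop.security.authentication to kerberos.
    UserGroupInformation.setConfiguration(conf);
    // Explicit keytab login in place of an external kinit.
    // Principal and keytab path below are placeholders.
    UserGroupInformation.loginUserFromKeytab(
        "someuser@EXAMPLE.COM", "/etc/security/keytabs/someuser.keytab");
    // Subsequent FileSystem calls run as the keytab user.
    FileSystem fs = FileSystem.get(conf);
    System.out.println(fs.listStatus(new Path("/")).length + " entries under /");
  }
}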
Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?
Unsubscribe On Jul 16, 2014 5:00 PM, Xiaohua Chen xiaohua.c...@gmail.com wrote: Hi Experts, I am new to Hadoop. I would like to get some help from you: Our current HDFS java client works fine with hadoop server which has NO Kerberos security enabled. We use HDFS lib e.g. org.apache.hadoop.fs.*. Now we need to change it to work with Kerberosed Hadoop server. Can you let me know what changes are needed ? Thanks and regards, Sophie
Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?
Thanks Chris for the very helpful reply. Now I understand the preferred way is to use kinit. Do you mind to share: what is the road map for Hadoop authentication in the near future ? Specifically I understand the latest released hadoop supports Kerberos protocol for authentication, do you know if hadoop has any plan to support other authenticators in the foreseeable future? Thanks and regards! Sophie On Thu, Jul 17, 2014 at 4:14 PM, Chris Nauroth cnaur...@hortonworks.com wrote: Hi Sophie, Yes, you could authenticate via SecurityUtil#login, which is a convenience wrapper over UserGroupInformation#loginUserFromKeytab. This is essentially what daemons like the NameNode do. However, you might find that it's best overall to get kinit deployed to your client machines. For example, the CLI commands like hdfs dfs -ls aren't coded to do an explicit login like this, so you'll really need kinit available if users on the client machines want to use the CLI. Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Jul 17, 2014 at 2:45 PM, Xiaohua Chen xiaohua.c...@gmail.com wrote: Hi Chris, Thank you very much for your reply. One more question: I come across org.apache.hadoop.security.SecurityUtil class(http://hadoop.apache.org/docs/stable1/api/index.html?org/apache/hadoop/security/SecurityUtil.html) and it provides a couple of login methods e.g. login(Configuration conf, String keytabFileKey, String userNameKey) . So if Kerberos kinit utility is not available from client workstation where our java client is deployed , do you think the above SecurityUtil.login(...) can help our application code to authenticate the user defined through the userNameKey argument and its credential is provided through keyTab file ? Thanks again your help! Best Regards, Sophie On Thu, Jul 17, 2014 at 10:42 AM, Chris Nauroth cnaur...@hortonworks.com wrote: Hello Sophie, If you're using the HDFS lib like you said (i.e. obtaining an instance of FileSystem and using its methods), then I expect your code will just work with no code changes required when you start running against a secure cluster. The work of switching to a secured deployment with Kerberos is mostly a matter of configuration: http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SecureMode.html This configuration is external to application code. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jul 16, 2014 at 2:00 PM, Xiaohua Chen xiaohua.c...@gmail.com wrote: Hi Experts, I am new to Hadoop. I would like to get some help from you: Our current HDFS java client works fine with hadoop server which has NO Kerberos security enabled. We use HDFS lib e.g. org.apache.hadoop.fs.*. Now we need to change it to work with Kerberosed Hadoop server. Can you let me know what changes are needed ? Thanks and regards, Sophie CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. 
HDFS input/output error - fuse mount
Hello, Hadoop package installed: hadoop-0.20-0.20.2+737-33.osg.el5.noarch Operating System: CentOS release 5.8 (Final) I am mounting HDFS from my namenode to another node with fuse. After mounting to /hdfs, any attempts to 'ls', 'cd', or use 'hadoop fs' leads to the below output. $ls /hdfs *ls: /hdfs: Input/output error* $hadoop fs -ls *Exception in thread main java.lang.UnsupportedClassVersionError: org/apache/hadoop/fs/FsShell : Unsupported major.minor version 51.0at java.lang.ClassLoader.defineClass1(Native Method)at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)at java.lang.ClassLoader.defineClass(ClassLoader.java:615)at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)at java.net.URLClassLoader.access$000(URLClassLoader.java:58)at java.net.URLClassLoader$1.run(URLClassLoader.java:197)at java.security.AccessController.doPrivileged(Native Method)at java.net.URLClassLoader.findClass(URLClassLoader.java:190)at java.lang.ClassLoader.loadClass(ClassLoader.java:306)at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)at java.lang.ClassLoader.loadClass(ClassLoader.java:247)Could not find the main class: org.apache.hadoop.fs.FsShell. Program will exit.* I have attempted to mount /hdfs manually in debug mode and then attempted to access /hdfs from a different terminal. This is the output. The namenode is *glados*. The server where /hdfs is being mounted is *glados2*. $hdfs -oserver=glados,port=9000,rdbuffer=131072,allow_other /hdfs -d *fuse-dfs ignoring option allow_otherERROR fuse_options.c:162 fuse-dfs didn't recognize /hdfs,-2fuse-dfs ignoring option -dunique: 1, opcode: INIT (26), nodeid: 0, insize: 56INIT: 7.10flags=0x000bmax_readahead=0x0002INFO fuse_init.c:115 Mounting glados:9000Exception in thread main java.lang.UnsupportedClassVersionError: org/apache/hadoop/conf/Configuration : Unsupported major.minor version 51.0at java.lang.ClassLoader.defineClass1(Native Method)at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)at java.lang.ClassLoader.defineClass(ClassLoader.java:615)at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)at java.net.URLClassLoader.access$000(URLClassLoader.java:58)at java.net.URLClassLoader$1.run(URLClassLoader.java:197)at java.security.AccessController.doPrivileged(Native Method)at java.net.URLClassLoader.findClass(URLClassLoader.java:190)at java.lang.ClassLoader.loadClass(ClassLoader.java:306)at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)at java.lang.ClassLoader.loadClass(ClassLoader.java:247)Can't construct instance of class org.apache.hadoop.conf.ConfigurationERROR fuse_init.c:127 Unable to establish test connection to server INIT: 7.8 flags=0x0001 max_readahead=0x0002 max_write=0x0002 unique: 1, error: 0 (Success), outsize: 40unique: 2, opcode: GETATTR (3), nodeid: 1, insize: 56Exception in thread Thread-0 java.lang.UnsupportedClassVersionError: org/apache/hadoop/conf/Configuration : Unsupported major.minor version 51.0at java.lang.ClassLoader.defineClass1(Native Method)at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)at java.lang.ClassLoader.defineClass(ClassLoader.java:615)at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)at java.net.URLClassLoader.access$000(URLClassLoader.java:58)at 
java.net.URLClassLoader$1.run(URLClassLoader.java:197)at java.security.AccessController.doPrivileged(Native Method)at java.net.URLClassLoader.findClass(URLClassLoader.java:190)at java.lang.ClassLoader.loadClass(ClassLoader.java:306)at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)at java.lang.ClassLoader.loadClass(ClassLoader.java:247)Can't construct instance of class org.apache.hadoop.conf.ConfigurationERROR fuse_connect.c:83 Unable to instantiate a filesystem for user027ERROR fuse_impls_getattr.c:40 Could not connect to glados:9000 unique: 2, error: -5 (Input/output error), outsize: 16unique: 3, opcode: GETATTR (3), nodeid: 1, insize: 56* I adopted this system after this was already setup, so I do not know which java version was used during install. Currently I'm using: $java -version *java version 1.6.0_45Java(TM) SE Runtime Environment (build 1.6.0_45-b06)Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)* $java -version *java version 1.6.0_45Java(TM) SE Runtime Environment (build 1.6.0_45-b06)Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)* Is my java version really the cause of this issue? What is the correct java version to be used for this version of hadoop. I have also tried 1.6.0_31
Re: HDFS input/output error - fuse mount
Version 51 ia Java 7 Chris On Jul 17, 2014 7:50 PM, andrew touchet adt...@latech.edu wrote: Hello, Hadoop package installed: hadoop-0.20-0.20.2+737-33.osg.el5.noarch Operating System: CentOS release 5.8 (Final) I am mounting HDFS from my namenode to another node with fuse. After mounting to /hdfs, any attempts to 'ls', 'cd', or use 'hadoop fs' leads to the below output. $ls /hdfs *ls: /hdfs: Input/output error* $hadoop fs -ls *Exception in thread main java.lang.UnsupportedClassVersionError: org/apache/hadoop/fs/FsShell : Unsupported major.minor version 51.0 at java.lang.ClassLoader.defineClass1(Native Method)at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)at java.lang.ClassLoader.defineClass(ClassLoader.java:615)at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)at java.net.URLClassLoader.access$000(URLClassLoader.java:58)at java.net.URLClassLoader$1.run(URLClassLoader.java:197)at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190)at java.lang.ClassLoader.loadClass(ClassLoader.java:306)at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)at java.lang.ClassLoader.loadClass(ClassLoader.java:247) Could not find the main class: org.apache.hadoop.fs.FsShell. Program will exit.* I have attempted to mount /hdfs manually in debug mode and then attempted to access /hdfs from a different terminal. This is the output. The namenode is *glados*. The server where /hdfs is being mounted is *glados2*. $hdfs -oserver=glados,port=9000,rdbuffer=131072,allow_other /hdfs -d *fuse-dfs ignoring option allow_otherERROR fuse_options.c:162 fuse-dfs didn't recognize /hdfs,-2fuse-dfs ignoring option -d unique: 1, opcode: INIT (26), nodeid: 0, insize: 56INIT: 7.10flags=0x000bmax_readahead=0x0002INFO fuse_init.c:115 Mounting glados:9000Exception in thread main java.lang.UnsupportedClassVersionError: org/apache/hadoop/conf/Configuration : Unsupported major.minor version 51.0 at java.lang.ClassLoader.defineClass1(Native Method)at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)at java.lang.ClassLoader.defineClass(ClassLoader.java:615)at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)at java.net.URLClassLoader.access$000(URLClassLoader.java:58)at java.net.URLClassLoader$1.run(URLClassLoader.java:197)at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190)at java.lang.ClassLoader.loadClass(ClassLoader.java:306)at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)at java.lang.ClassLoader.loadClass(ClassLoader.java:247) Can't construct instance of class org.apache.hadoop.conf.ConfigurationERROR fuse_init.c:127 Unable to establish test connection to server INIT: 7.8 flags=0x0001 max_readahead=0x0002 max_write=0x0002 unique: 1, error: 0 (Success), outsize: 40unique: 2, opcode: GETATTR (3), nodeid: 1, insize: 56Exception in thread Thread-0 java.lang.UnsupportedClassVersionError: org/apache/hadoop/conf/Configuration : Unsupported major.minor version 51.0 at java.lang.ClassLoader.defineClass1(Native Method)at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)at java.lang.ClassLoader.defineClass(ClassLoader.java:615)at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)at 
java.net.URLClassLoader.access$000(URLClassLoader.java:58)at java.net.URLClassLoader$1.run(URLClassLoader.java:197)at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190)at java.lang.ClassLoader.loadClass(ClassLoader.java:306)at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)at java.lang.ClassLoader.loadClass(ClassLoader.java:247) Can't construct instance of class org.apache.hadoop.conf.ConfigurationERROR fuse_connect.c:83 Unable to instantiate a filesystem for user027ERROR fuse_impls_getattr.c:40 Could not connect to glados:9000 unique: 2, error: -5 (Input/output error), outsize: 16 unique: 3, opcode: GETATTR (3), nodeid: 1, insize: 56* I adopted this system after this was already setup, so I do not know which java version was used during install. Currently I'm using: $java -version *java version 1.6.0_45Java(TM) SE Runtime Environment (build 1.6.0_45-b06)Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)* $java -version *java version 1.6.0_45 Java(TM) SE Runtime Environment (build 1.6.0_45-b06)Java HotSpot(TM) 64-Bit Server
Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?
I'm not sure if this directly answers your question, but you might try taking a look at issue HADOOP-9671 and the various issues that are linked to it: https://issues.apache.org/jira/browse/HADOOP-9671 Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Jul 17, 2014 at 4:30 PM, Xiaohua Chen xiaohua.c...@gmail.com wrote: Thanks Chris for the very helpful reply. Now I understand the preferred way is to use kinit. Do you mind to share: what is the road map for Hadoop authentication in the near future ? Specifically I understand the latest released hadoop supports Kerberos protocol for authentication, do you know if hadoop has any plan to support other authenticators in the foreseeable future? Thanks and regards! Sophie On Thu, Jul 17, 2014 at 4:14 PM, Chris Nauroth cnaur...@hortonworks.com wrote: Hi Sophie, Yes, you could authenticate via SecurityUtil#login, which is a convenience wrapper over UserGroupInformation#loginUserFromKeytab. This is essentially what daemons like the NameNode do. However, you might find that it's best overall to get kinit deployed to your client machines. For example, the CLI commands like hdfs dfs -ls aren't coded to do an explicit login like this, so you'll really need kinit available if users on the client machines want to use the CLI. Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Jul 17, 2014 at 2:45 PM, Xiaohua Chen xiaohua.c...@gmail.com wrote: Hi Chris, Thank you very much for your reply. One more question: I come across org.apache.hadoop.security.SecurityUtil class( http://hadoop.apache.org/docs/stable1/api/index.html?org/apache/hadoop/security/SecurityUtil.html ) and it provides a couple of login methods e.g. login(Configuration conf, String keytabFileKey, String userNameKey) . So if Kerberos kinit utility is not available from client workstation where our java client is deployed , do you think the above SecurityUtil.login(...) can help our application code to authenticate the user defined through the userNameKey argument and its credential is provided through keyTab file ? Thanks again your help! Best Regards, Sophie On Thu, Jul 17, 2014 at 10:42 AM, Chris Nauroth cnaur...@hortonworks.com wrote: Hello Sophie, If you're using the HDFS lib like you said (i.e. obtaining an instance of FileSystem and using its methods), then I expect your code will just work with no code changes required when you start running against a secure cluster. The work of switching to a secured deployment with Kerberos is mostly a matter of configuration: http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SecureMode.html This configuration is external to application code. Chris Nauroth Hortonworks http://hortonworks.com/ On Wed, Jul 16, 2014 at 2:00 PM, Xiaohua Chen xiaohua.c...@gmail.com wrote: Hi Experts, I am new to Hadoop. I would like to get some help from you: Our current HDFS java client works fine with hadoop server which has NO Kerberos security enabled. We use HDFS lib e.g. org.apache.hadoop.fs.*. Now we need to change it to work with Kerberosed Hadoop server. Can you let me know what changes are needed ? Thanks and regards, Sophie CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. 
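For anyone following Chris's pointers above, a minimal sketch of a keytab-based login from application code might look like the following; the principal, keytab path and HDFS URI are placeholders, and the relevant call is UserGroupInformation#loginUserFromKeytab, which SecurityUtil#login wraps.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosHdfsClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Settings that a secure cluster's core-site.xml would normally provide.
    conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020"); // placeholder URI
    conf.set("hadoop.security.authentication", "kerberos");

    // Explicit login from a keytab, instead of relying on a ticket cache created by kinit.
    UserGroupInformation.setConfiguration(conf);
    UserGroupInformation.loginUserFromKeytab(
        "sophie@EXAMPLE.COM",                    // placeholder principal
        "/etc/security/keytabs/sophie.keytab");  // placeholder keytab path

    // After the login, FileSystem calls work as they do on an insecure cluster.
    FileSystem fs = FileSystem.get(conf);
    for (FileStatus status : fs.listStatus(new Path("/"))) {
      System.out.println(status.getPath());
    }
  }
}

The same login can also be driven from configuration via SecurityUtil.login(conf, keytabFileKey, userNameKey), which reads the keytab path and principal from the named configuration keys rather than taking them as literal arguments.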
Re: HDFS input/output error - fuse mount
Hi Chris, I tried to mount /hdfs with java versions below but there was no change in output. jre-7u21 jdk-7u21 jdk-7u55 jdk1.6.0_31 jdk1.6.0_45 On Thu, Jul 17, 2014 at 6:56 PM, Chris Mawata chris.maw...@gmail.com wrote: Version 51 is Java 7 Chris On Jul 17, 2014 7:50 PM, andrew touchet adt...@latech.edu wrote: Hello, Hadoop package installed: hadoop-0.20-0.20.2+737-33.osg.el5.noarch Operating System: CentOS release 5.8 (Final) I am mounting HDFS from my namenode to another node with fuse. After mounting to /hdfs, any attempts to 'ls', 'cd', or use 'hadoop fs' leads to the below output. $ls /hdfs ls: /hdfs: Input/output error $hadoop fs -ls Exception in thread main java.lang.UnsupportedClassVersionError: org/apache/hadoop/fs/FsShell : Unsupported major.minor version 51.0 ... Could not find the main class: org.apache.hadoop.fs.FsShell. Program will exit. I have attempted to mount /hdfs manually in debug mode and then attempted to access /hdfs from a different terminal. This is the output. The namenode is glados. The server where /hdfs is being mounted is glados2. $hdfs -oserver=glados,port=9000,rdbuffer=131072,allow_other /hdfs -d ... INFO fuse_init.c:115 Mounting glados:9000 Exception in thread main java.lang.UnsupportedClassVersionError: org/apache/hadoop/conf/Configuration : Unsupported major.minor version 51.0 ... Can't construct instance of class org.apache.hadoop.conf.Configuration ERROR fuse_init.c:127 Unable to establish test connection to server ... ERROR fuse_connect.c:83 Unable to instantiate a filesystem for user027 ERROR fuse_impls_getattr.c:40 Could not connect to glados:9000
Re: Re: HDFS input/output error - fuse mount
I think you should first confirm your local Java version; some Linux distributions come with a pre-installed Java that is very old. firefly...@gmail.com From: andrew touchet Date: 2014-07-18 09:06 To: user Subject: Re: HDFS input/output error - fuse mount Hi Chris, I tried to mount /hdfs with java versions below but there was no change in output. jre-7u21 jdk-7u21 jdk-7u55 jdk1.6.0_31 jdk1.6.0_45 On Thu, Jul 17, 2014 at 6:56 PM, Chris Mawata chris.maw...@gmail.com wrote: Version 51 is Java 7 Chris
Re: Configuration set up questions - Container killed on request. Exit code is 143
Hi Chris MacKenzie, Since your output still says "2.1 GB of 2.1 GB virtual memory used. Killing", I guess yarn.nodemanager.vmem-pmem-ratio doesn't take effect; if it took effect, it should read xxGB of 10GB virtual memory used ... Have you tried restarting the NM after configuring that option? Thanks, Wangda On Thu, Jul 17, 2014 at 11:15 PM, Chris Mawata chris.maw...@gmail.com wrote: Another thing to try is smaller input splits if your data can be broken up into smaller files that can be independently processed. That way you get more but smaller map tasks. You could also use more but smaller reducers. The many files will tax your NameNode more but you might get to use all your cores. On Jul 17, 2014 9:07 AM, Chris MacKenzie stu...@chrismackenziephotography.co.uk wrote: Hi Chris, Thanks for getting back to me. I will set that value to 10. I have just tried this: https://support.gopivotal.com/hc/en-us/articles/201462036-Mapreduce-YARN-Memory-Parameters Setting both mapreduce.map.memory.mb and mapreduce.reduce.memory.mb. Though after setting it I didn't get the expected change, as the output was still 2.1 GB of 2.1 GB virtual memory used. Killing container Regards, Chris MacKenzie telephone: 0131 332 6967 email: stu...@chrismackenziephotography.co.uk corporate: www.chrismackenziephotography.co.uk http://www.chrismackenziephotography.co.uk/ http://plus.google.com/+ChrismackenziephotographyCoUk/posts http://www.linkedin.com/in/chrismackenziephotography/ From: Chris Mawata chris.maw...@gmail.com Reply-To: user@hadoop.apache.org Date: Thursday, 17 July 2014 13:36 To: Chris MacKenzie stu...@chrismackenziephotography.co.uk Cc: user@hadoop.apache.org Subject: Re: Configuration set up questions - Container killed on request. Exit code is 143 Hi Chris MacKenzie, I have a feeling (I am not familiar with the kind of work you are doing) that your application is memory intensive. 8 cores per node and only 12GB is tight. Try bumping up the yarn.nodemanager.vmem-pmem-ratio. Chris Mawata On Wed, Jul 16, 2014 at 11:37 PM, Chris MacKenzie stu...@chrismackenziephotography.co.uk wrote: Hi, Thanks Chris Mawata. I'm working through this myself, but wondered if anyone could point me in the right direction. I have attached my configs. I'm using hadoop 2.4.1 My system is: 32 Clusters 8 processors per machine 12 gb ram Available disk space per node 890 gb This is my current error: mapreduce.Job (Job.java:printTaskEvents(1441)) - Task Id : attempt_1405538067846_0006_r_00_1, Status : FAILED Container [pid=25848,containerID=container_1405538067846_0006_01_04] is running beyond virtual memory limits. Current usage: 439.0 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1405538067846_0006_01_04 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 25853 25848 25848 25848 (java) 2262 193 2268090368 112050 /usr/java/latest//bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx768m -Djava.io.tmpdir=/tmp/hadoop-cm469/nm-local-dir/usercache/cm469/appcache/ap plication_1405538067846_0006/container_1405538067846_0006_01_04/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/scratch/extra/cm469/hadoop-2.4.1/logs/userlog s/application_1405538067846_0006/container_1405538067846_0006_01_04 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 137.195.143.103 59056 attempt_1405538067846_0006_r_00_1 4 |- 25848 25423 25848 25848 (bash) 0 0 108613632 333 /bin/bash -c /usr/java/latest//bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx768m -Djava.io.tmpdir=/tmp/hadoop-cm469/nm-local-dir/usercache/cm469/appcache/ap plication_1405538067846_0006/container_1405538067846_0006_01_04/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/scratch/extra/cm469/hadoop-2.4.1/logs/userlog s/application_1405538067846_0006/container_1405538067846_0006_01_04 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 137.195.143.103 59056 attempt_1405538067846_0006_r_00_1 4 1/scratch/extra/cm469/hadoop-2.4.1/logs/userlogs/application_1405538067846 _0006/container_1405538067846_0006_01_04/stdout 2/scratch/extra/cm469/hadoop-2.4.1/logs/userlogs/application_1405538067846 _0006/container_1405538067846_0006_01_04/stderr Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143 Regards, Chris MacKenzie telephone: 0131 332 6967 email: stu...@chrismackenziephotography.co.uk
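For reference, the knobs discussed in this thread live in mapred-site.xml and yarn-site.xml. A sketch with illustrative values only (the right numbers depend on the job and on the 12 GB nodes described above) might look like this:

<!-- mapred-site.xml: per-task container sizes and matching JVM heaps -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1638m</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx1638m</value>
</property>

<!-- yarn-site.xml: virtual-memory policy; changing this requires a NodeManager restart -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>10</value>
</property>

The "2.1 GB of 2.1 GB virtual memory used" message comes from the 1 GB container multiplied by the default vmem-pmem-ratio of 2.1, so either raising mapreduce.reduce.memory.mb (2 GB times 2.1 gives roughly a 4.3 GB virtual ceiling) or raising the ratio on the NodeManagers should move the limit; the -Xmx values are conventionally kept to about 80% of the container size.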
Re: Re: HDFS input/output error - fuse mount
Hi Fireflyhoo, Below I follow the symbolic links for the jdk-7u21. These links are changed accordingly as I change between versions. Also, I have 8 datanodes and 2 other various servers that are capable of mounting /hdfs. So it is just this server that is an issue. $ java -version java version 1.7.0_21 Java(TM) SE Runtime Environment (build 1.7.0_21-b11) Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode) java $ ls -l `which java` lrwxrwxrwx 1 root root 26 Jul 17 19:50 /usr/bin/java -> /usr/java/default/bin/java $ ls -l /usr/java/default lrwxrwxrwx 1 root root 16 Jul 17 19:50 /usr/java/default -> /usr/java/latest $ ls -l /usr/java/latest lrwxrwxrwx 1 root root 21 Jul 17 20:29 /usr/java/latest -> /usr/java/jdk1.7.0_21 jar $ ls -l `which jar` lrwxrwxrwx 1 root root 21 Jul 17 20:18 /usr/bin/jar -> /etc/alternatives/jar $ ls -l /etc/alternatives/jar lrwxrwxrwx 1 root root 29 Jul 17 20:26 /etc/alternatives/jar -> /usr/java/jdk1.7.0_21/bin/jar javac $ ls -l `which javac` lrwxrwxrwx 1 root root 23 Jul 17 20:18 /usr/bin/javac -> /etc/alternatives/javac $ ls -l /etc/alternatives/javac lrwxrwxrwx 1 root root 31 Jul 17 20:26 /etc/alternatives/javac -> /usr/java/jdk1.7.0_21/bin/javac Now that I've tried versions from 6 and 7, I'm really not sure what is causing this issue. On Thu, Jul 17, 2014 at 8:21 PM, firefly...@gmail.com wrote: I think you should first confirm your local Java version; some Linux distributions come with a pre-installed Java that is very old.
Re: Replace a block with a new one
How about writing a new block with a new checksum file, and replacing both the old block file and the old checksum file? 2014-07-17 19:34 GMT+08:00 Wellington Chevreuil wellington.chevre...@gmail.com: Hi, there's no way to do that, as HDFS does not provide file update features. You'll need to write a new file with the changes. Notice that even if you manage to find the physical block replica files on the disk, corresponding to the part of the file you want to change, you can't simply update it manually, as this would give a different checksum, making HDFS mark such blocks as corrupt. Regards, Wellington. On 17 Jul 2014, at 10:50, Zesheng Wu wuzeshen...@gmail.com wrote: Hi guys, I recently encounter a scenario which needs to replace an exist block with a newly written block The most straightforward way to finish may be like this: Suppose the original file is A, and we write a new file B which is composed by the new data blocks, then we merge A and B to C which is the file we wanted The obvious shortcoming of this method is wasting of network bandwidth I'm wondering whether there is a way to replace the old block by the new block directly. Any thoughts? -- Best Wishes! Yours, Zesheng -- Best Wishes! Yours, Zesheng
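To make the bandwidth cost of the rewrite approach concrete, here is a rough sketch of the merge Zesheng describes, written against the plain FileSystem API; the paths, offset and buffer size are placeholders, it assumes the new data B replaces an equally long range of A, and it simply re-copies every byte of A around the replaced range, which is exactly the overhead being discussed.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RewriteRange {
  // Copies 'length' bytes from 'in' (at its current position) to 'out'.
  static void copyRange(FSDataInputStream in, FSDataOutputStream out, long length) throws IOException {
    byte[] buffer = new byte[64 * 1024];
    long remaining = length;
    while (remaining > 0) {
      int read = in.read(buffer, 0, (int) Math.min(buffer.length, remaining));
      if (read < 0) {
        throw new IOException("Unexpected end of stream");
      }
      out.write(buffer, 0, read);
      remaining -= read;
    }
  }

  // Builds C from A, replacing A's bytes in [offset, offset + length of B) with the contents of B.
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path a = new Path("/data/A");        // original file (placeholder)
    Path b = new Path("/data/B");        // newly written replacement data (placeholder)
    Path c = new Path("/data/C");        // merged result (placeholder)
    long offset = 128L * 1024 * 1024;    // where the replaced range starts in A (placeholder)

    long aLen = fs.getFileStatus(a).getLen();
    long bLen = fs.getFileStatus(b).getLen();

    try (FSDataInputStream inA = fs.open(a);
         FSDataInputStream inB = fs.open(b);
         FSDataOutputStream out = fs.create(c, true)) {
      copyRange(inA, out, offset);                 // unchanged prefix of A
      copyRange(inB, out, bLen);                   // the new data
      inA.seek(offset + bLen);                     // skip the replaced range in A
      copyRange(inA, out, aLen - offset - bLen);   // unchanged suffix of A
    }
  }
}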
Re: Re: HDFS input/output error - fuse mount
Check the JAVA_HOME environment variable as well ... On Jul 17, 2014 9:46 PM, andrew touchet adt...@latech.edu wrote: Hi Fireflyhoo, Below I follow the symbolic links for the jdk-7u21. These links are changed accordingly as I change between versions.
Re: Re: HDFS input/output error - fuse mount
Yet another place to check -- in the hadoop-env.sh file there is also a JAVA_HOME setting. Chris On Jul 17, 2014 9:46 PM, andrew touchet adt...@latech.edu wrote: Hi Fireflyhoo, Below I follow the symbolic links for the jdk-7u21. These links are changed accordingly as I change between versions.
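Putting the suggestions in this thread together: since classes with major.minor version 51.0 require a Java 7 runtime, the thing to verify on glados2 is that whatever launches fuse_dfs and the hadoop script actually resolves to a Java 7 JDK, not just the interactive shell's default. A sketch of the relevant line, assuming the jdk1.7.0_21 install shown above and a conf/hadoop-env.sh location (the path is an assumption; adjust it to the actual install):

# hadoop-env.sh (location varies by install; this path is an assumption)
export JAVA_HOME=/usr/java/jdk1.7.0_21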
unsubscribe
Original Message From: Wellington Chevreuil wellington.chevre...@gmail.com Sent: Thursday, July 17, 2014 04:34 AM To: user@hadoop.apache.org Subject: Re: Replace a block with a new one
unsubscribe
Regards, Suman++
Re: NFS Gateway readonly issue
Hello, I was able to reproduce the issue on latest hadoop trunk. Though for me, I could only delete files; deleting directories was correctly blocked. I have opened https://issues.apache.org/jira/browse/HDFS-6703 to further track the issue. Thanks for reporting! Regards, Abhiraj On Thu, Jul 10, 2014 at 3:09 AM, bigdatagroup bigdatagr...@itecons.it wrote: Hello, we are experiencing a strange issue with our Hadoop cluster implementing NFS Gateway. We exported our distributed filesystem with the following configuration (managed by Cloudera Manager over CDH 5.0.1): <property> <name>dfs.nfs.exports.allowed.hosts</name> <value>192.168.0.153 ro</value> </property> As you can see, we expect the exported FS to be read-only, but in fact we are able to delete files and folders stored on it (where the user has the correct permissions) from the client machine that mounted the FS. Other writing operations are correctly blocked. Hadoop version in use: 2.3.0+cdh5.0.1+567 Bye. -- Omar