Multiple Part files

2014-07-17 Thread Natarajan, Prabakaran 1. (NSN - IN/Bangalore)
Hi, after a MapReduce job we are seeing multiple small part files in the output directory. We are using the RC file format (Snappy codec). 1) Will each part file take up a 64MB block? 2) How can we merge these multiple RC-format part files into one RC file? 3) What are the pros and cons of ha

Evaluation of cost of cluster

2014-07-17 Thread YIMEN YIMGA Gael
Hello guys, I need your help to estimate the cost of a cluster. The characteristics are as follows: a cluster with 6 servers (1 NameNode, 1 Secondary NameNode, 4 DataNodes). The configuration is given below in the tables. COMPONENTS OF OUR NAMENODE MACHINE SIZE Hard Disk which will sto

Replace a block with a new one

2014-07-17 Thread Zesheng Wu
Hi guys, I recently encountered a scenario that requires replacing an existing block with a newly written block. The most straightforward way to do this may be the following: suppose the original file is A, and we write a new file B composed of the new data blocks; then we merge A and B into C, which is

Re: Evaluation of cost of cluster

2014-07-17 Thread Bertrand Dechoux
Hi, The mail is now nicely formatted, but I would suggest you take the time to digest the answers to the same question you have already asked twice. https://www.mail-archive.com/search?l=user%40hadoop.apache.org&q=YIMEN+YIMGA+Gael https://www.mail-archive.com/user%40hadoop.apache.org/msg15411.html The ha

RE: Evaluation of cost of cluster

2014-07-17 Thread YIMEN YIMGA Gael
Hi, First, to produce this clear architecture I used the answers to the question I asked last time; I thank all the people who gave me the clues to do so. Second, I know that the Hadoop mailing list is not a shop. I would just like to draw on the experience of some people here. I know that what I’m challenging tod

RE: Multiple Part files

2014-07-17 Thread Naganarasimha G R (Naga)
Hi Prabakaran, You see multiple small part files in the output directory because each reducer task's output is written as one part file. 1. Will each part file take up a 64MB block? One part file is created per reducer, sized to that reducer's output; the file size can be smaller s
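A minimal sketch of the point above, not from the thread itself: since each reducer produces one part file, configuring exactly one reducer yields a single output file. The class name is a placeholder, identity Mapper/Reducer stand in for the real job, and the RCFile/Snappy output configuration from the original question is omitted; with large outputs a single reducer can become a bottleneck.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SinglePartFileJob {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "single-part-output");
    job.setJarByClass(SinglePartFileJob.class);
    job.setMapperClass(Mapper.class);       // identity mapper, stands in for the real one
    job.setReducerClass(Reducer.class);     // identity reducer
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);
    job.setNumReduceTasks(1);               // one reducer => a single part file
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```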

Re: Evaluation of cost of cluster

2014-07-17 Thread Rob Vesse
Vendors should always be able to give you no-obligation quotations based on your requirements (though you may have to fend off follow-up sales calls afterwards). Or you can simply use the vendors' websites, many of which have tools that let you configure a server, thus giving you an estimate

Re: Replace a block with a new one

2014-07-17 Thread Wellington Chevreuil
Hi, there's no way to do that, as HDFS does not provide file update features. You'll need to write a new file with the changes. Notice that even if you manage to find the physical block replica files on disk corresponding to the part of the file you want to change, you can't simply upda
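To illustrate the "write a new file with the changes" approach in this reply, here is a rough sketch of my own (not from the thread) using the FileSystem API: rewrite the content to a temporary path, then swap it in. Paths are placeholders, the sketch just copies bytes through unchanged where a real job would write the modified content, and the delete-plus-rename swap is not atomic.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class RewriteHdfsFile {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path original = new Path("/data/fileA");      // placeholder path
    Path rewritten = new Path("/data/fileA.tmp"); // placeholder path

    try (FSDataInputStream in = fs.open(original);
         FSDataOutputStream out = fs.create(rewritten, true)) {
      // A real rewrite would emit the changed bytes here; this sketch
      // simply streams the original content through unchanged.
      IOUtils.copyBytes(in, out, conf, false);
    }
    fs.delete(original, false);       // remove the old file
    fs.rename(rewritten, original);   // move the rewritten file into place
  }
}
```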

Re: Configuration set up questions - Container killed on request. Exit code is 143

2014-07-17 Thread Chris Mawata
Hi Chris MacKenzie, I have a feeling (I am not familiar with the kind of work you are doing) that your application is memory-intensive. 8 cores per node and only 12GB is tight. Try bumping up yarn.nodemanager.vmem-pmem-ratio. Chris Mawata On Wed, Jul 16, 2014 at 11:37 PM, Chris MacKenz

Re: Configuration set up questions - Container killed on request. Exit code is 143

2014-07-17 Thread Chris MacKenzie
Hi Chris, Thanks for getting back to me. I will set that value to 10. I have just tried this: https://support.gopivotal.com/hc/en-us/articles/201462036-Mapreduce-YARN-Memory-Parameters, setting both mapreduce.map.memory.mb and mapreduce.reduce.memory.mb. Though after setting them I didn’t get the ex
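For reference, a sketch (my own illustration, not from the thread) of how these two job-level properties, plus the matching JVM heap options, can be set from client code instead of mapred-site.xml. The numbers are placeholders and must fit within the NodeManager's configured container limits.

```java
import org.apache.hadoop.conf.Configuration;

public class MemorySettingsExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("mapreduce.map.memory.mb", "2048");         // container size for map tasks
    conf.set("mapreduce.reduce.memory.mb", "4096");      // container size for reduce tasks
    conf.set("mapreduce.map.java.opts", "-Xmx1638m");    // JVM heap ~80% of the map container
    conf.set("mapreduce.reduce.java.opts", "-Xmx3276m"); // JVM heap ~80% of the reduce container
    // ... pass conf to Job.getInstance(conf) when building the job
  }
}
```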

Re: Multiple Part files

2014-07-17 Thread Peyman Mohajerian
Hadoop has a getmerge command (http://hadoop.apache.org/docs/r0.19.1/hdfs_shell.html#getmerge). I'm not certain whether it works with RC files; I think it should. So maybe you don't have to copy the files to local. On Thu, Jul 17, 2014 at 6:18 AM, Naganarasimha G R (Naga) < garlanaganarasi...@
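A sketch of the programmatic counterpart to getmerge (my own addition, not from the thread): FileUtil.copyMerge concatenates the part files under a directory into one destination file. The paths are placeholders; note that this is a raw byte concatenation, so it may well not produce a valid RCFile, which has its own per-file headers.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class MergePartFiles {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path srcDir = new Path("/user/output/job1");        // directory holding the part-* files
    Path merged = new Path("/user/output/job1-merged"); // single merged output file
    // false = keep the source part files; null = no separator string between files
    FileUtil.copyMerge(fs, srcDir, fs, merged, false, conf, null);
  }
}
```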

Providing a file instead of a directory to a M/R job

2014-07-17 Thread Shahab Yunus
In MRv2 / YARN, is it possible to provide a complete path to a file, instead of a directory, to a MapReduce job? Usually we provide a list of directory paths using FileInputFormat.addInputPath. Can we provide a path which is the full path to an actual file? I have tried it but am getting unexpected

Re: Configuration set up questions - Container killed on request. Exit code is 143

2014-07-17 Thread Chris Mawata
Another thing to try is smaller input splits, if your data can be broken up into smaller files that can be independently processed. That way you get more but smaller map tasks. You could also use more but smaller reducers. The many files will tax your NameNode more, but you might get to use all yo
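A related knob (my suggestion, not from the thread): instead of physically splitting the data into smaller files, you can cap the maximum input split size so each splittable file produces more, smaller map tasks, and raise the reducer count. The 32 MB and the reducer count are arbitrary examples; this only helps if the input format is splittable.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class SmallerSplits {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "smaller-splits");
    FileInputFormat.setMaxInputSplitSize(job, 32L * 1024 * 1024); // cap splits at 32 MB
    job.setNumReduceTasks(20);  // more but smaller reducers, as suggested above
    // ... mapper, reducer and input/output paths of the real job omitted
  }
}
```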

Re: Providing a file instead of a directory to a M/R job

2014-07-17 Thread Bertrand Dechoux
No reason why not. And a permissions problem explains why there is an error: missing access rights. Bertrand Dechoux On Thu, Jul 17, 2014 at 4:58 PM, Shahab Yunus wrote: > In MRv2 or Yarn is it possible to provide a complete path to a file > instead of a directory to a mapreduce job? > > Usually we

Re: Providing a file instead of a directory to a M/R job

2014-07-17 Thread Shahab Yunus
That is what I thought too, but when I give the parent directory of that same file as the input path, it works. Perhaps I am messing something up. I am using Cloudera 4.6, btw. Meanwhile I have noticed that I can read files directly by using MultipleInputs and that works fine. Regards, Shahab O
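A sketch of the two approaches discussed in this thread (paths and input format are placeholder assumptions): passing a full file path straight to FileInputFormat, and registering individual files through MultipleInputs, which is what the thread reports working.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class FileAsInputExample {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "file-as-input");

    // Option 1: a full file path is accepted like a directory path,
    // provided the submitting user has read access to the file.
    FileInputFormat.addInputPath(job, new Path("/data/in/part-r-00000"));

    // Option 2: register each file explicitly via MultipleInputs.
    MultipleInputs.addInputPath(job, new Path("/data/in/part-r-00001"), TextInputFormat.class);

    // ... mapper, reducer and output path of the real job omitted
  }
}
```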

Upgrading from 1.1.2 to 2.2.0

2014-07-17 Thread Rich Haase
Has anyone upgraded directly from 1.1.2 to 2.2.0? If so, is there anything I should be concerned about? Thanks, Rich -- *Kernighan's Law* "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not sm

unsubscribe

2014-07-17 Thread Don Hilborn
unsubscribe *Don Hilborn *Solutions Engineer, Hortonworks *Mobile: 832-444-5463* Email: *dhilb...@hortonworks.com * Website: *http://www.hortonworks.com/ * Hortonworks where business data becomes business insight

Re: unsubscribe

2014-07-17 Thread Ted Yu
Please send email to user-unsubscr...@hadoop.apache.org See http://hadoop.apache.org/mailing_lists.html#User On Thu, Jul 17, 2014 at 9:20 AM, Don Hilborn wrote: > unsubscribe > *Don Hilborn *Solutions Engineer, Hortonworks > *Mobile: 832-444-5463 <832-444-5463>* > Email: *dhilb...@hortonworks

Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?

2014-07-17 Thread Chris Nauroth
Hello Sophie, If you're using the HDFS lib like you said (i.e. obtaining an instance of FileSystem and using its methods), then I expect your code will "just work" with no code changes required when you start running against a secure cluster. The work of switching to a secured deployment with Ker

Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?

2014-07-17 Thread Xiaohua Chen
Hi Chris, Thank you very much for your reply. One more question: I came across the org.apache.hadoop.security.SecurityUtil class (http://hadoop.apache.org/docs/stable1/api/index.html?org/apache/hadoop/security/SecurityUtil.html) and it provides a couple of login methods, e.g. login(Configuration conf,

Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?

2014-07-17 Thread Chris Nauroth
Hi Sophie, Yes, you could authenticate via SecurityUtil#login, which is a convenience wrapper over UserGroupInformation#loginUserFromKeytab. This is essentially what daemons like the NameNode do. However, you might find that it's best overall to get kinit deployed to your client machines. For e
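A sketch of the programmatic keytab login this reply describes (the kinit route needs no code changes at all). The principal, keytab path, and the listed directory are placeholders for your environment; once the login succeeds, existing FileSystem-based code works unchanged.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosHdfsClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("hadoop.security.authentication", "kerberos");
    UserGroupInformation.setConfiguration(conf);
    UserGroupInformation.loginUserFromKeytab(
        "svc-user@EXAMPLE.COM",              // placeholder principal
        "/etc/security/keytabs/svc.keytab"); // placeholder keytab path

    FileSystem fs = FileSystem.get(conf);    // existing HDFS client code works as before
    for (FileStatus status : fs.listStatus(new Path("/"))) {
      System.out.println(status.getPath());
    }
  }
}
```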

Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?

2014-07-17 Thread John Glynn
Unsubscribe On Jul 16, 2014 5:00 PM, "Xiaohua Chen" wrote: > Hi Experts, > > I am new to Hadoop. I would like to get some help from you: > > Our current HDFS java client works fine with hadoop server which has > NO Kerberos security enabled. We use HDFS lib e.g. > org.apache.hadoop.fs.*. > > No

Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?

2014-07-17 Thread Xiaohua Chen
Thanks Chris for the very helpful reply. Now I understand the preferred way is to use kinit. Do you mind sharing what the roadmap for Hadoop authentication is in the near future? Specifically, I understand the latest released Hadoop supports the Kerberos protocol for authentication; do you know i

HDFS input/output error - fuse mount

2014-07-17 Thread andrew touchet
Hello, Hadoop package installed: hadoop-0.20-0.20.2+737-33.osg.el5.noarch Operating System: CentOS release 5.8 (Final) I am mounting HDFS from my namenode to another node with fuse. After mounting to /hdfs, any attempt to 'ls', 'cd', or use 'hadoop fs' leads to the output below. $ls /hdfs *l

Re: HDFS input/output error - fuse mount

2014-07-17 Thread Chris Mawata
Version 51 is Java 7 (class file major version 51 corresponds to Java 7). Chris On Jul 17, 2014 7:50 PM, "andrew touchet" wrote: > Hello, > > Hadoop package installed: > hadoop-0.20-0.20.2+737-33.osg.el5.noarch > > Operating System: > CentOS release 5.8 (Final) > > I am mounting HDFS from my namenode to another node with fuse. After > mounting to

Re: what changes needed for existing HDFS java client in order to work with kerberosed hadoop server ?

2014-07-17 Thread Chris Nauroth
I'm not sure if this directly answers your question, but you might try taking a look at issue HADOOP-9671 and the various issues that are linked to it: https://issues.apache.org/jira/browse/HADOOP-9671 Chris Nauroth Hortonworks http://hortonworks.com/ On Thu, Jul 17, 2014 at 4:30 PM, Xiaohua C

Re: HDFS input/output error - fuse mount

2014-07-17 Thread andrew touchet
Hi Chris, I tried to mount /hdfs with java versions below but there was no change in output. jre-7u21 jdk-7u21 jdk-7u55 jdk1.6.0_31 jdk1.6.0_45 On Thu, Jul 17, 2014 at 6:56 PM, Chris Mawata wrote: > Version 51 ia Java 7 > Chris > On Jul 17, 2014 7:50 PM, "andrew touchet" wrote: > >> Hello,

Re: Re: HDFS input/output error - fuse mount

2014-07-17 Thread firefly...@gmail.com
I think you should first confirm your local Java version; some Linux distributions come with Java pre-installed, and that version can be very old. firefly...@gmail.com From: andrew touchet Date: 2014-07-18 09:06 To: user Subject: Re: HDFS input/output error - fuse mount Hi Chris, I tried to mount /hdfs with java versions

Re: Configuration set up questions - Container killed on request. Exit code is 143

2014-07-17 Thread Wangda Tan
Hi Chris MacKenzie, Since your output still says "2.1 GB of 2.1 GB virtual memory used. Killing", I guess yarn.nodemanager.vmem-pmem-ratio didn't take effect; if it had, it would read "xx GB of 10 GB virtual memory used ...". Have you tried restarting the NM after configuring that

Re: Re: HDFS input/output error - fuse mount

2014-07-17 Thread andrew touchet
Hi Fireflyhoo, Below I follow the symbolic links for jdk-7u21. These links are changed accordingly as I switch between versions. Also, I have 8 datanodes and 2 other servers that are capable of mounting /hdfs, so it is just this server that is an issue. $ java -version java version "1.7.0

Re: Replace a block with a new one

2014-07-17 Thread Zesheng Wu
How about writing a new block with a new checksum file, and replacing both the old block file and the old checksum file? 2014-07-17 19:34 GMT+08:00 Wellington Chevreuil < wellington.chevre...@gmail.com>: > Hi, > > there's no way to do that, as HDFS does not provide file updates features. > You'll need to writ

Re: Re: HDFS input/output error - fuse mount

2014-07-17 Thread Chris Mawata
Check the JAVA_HOME environment variable as well ... On Jul 17, 2014 9:46 PM, "andrew touchet" wrote: > Hi Fireflyhoo, > > Below I follow the symbolic links for the jdk-7u21. These links are > changed accordingly as I change between versions. Also, I have 8 datanodes > and 2 other various server

Re: Re: HDFS input/output error - fuse mount

2014-07-17 Thread Chris Mawata
Yet another place to check -- in the hadoop-env.sh file there is also a JAVA_HOME setting. Chris On Jul 17, 2014 9:46 PM, "andrew touchet" wrote: > Hi Fireflyhoo, > > Below I follow the symbolic links for the jdk-7u21. These links are > changed accordingly as I change between versions. Also, I ha

umsubscribe

2014-07-17 Thread jason_j...@yahoo.com
Original Message From: Wellington Chevreuil Sent: Thursday, July 17, 2014 04:34 AM To: user@hadoop.apache.org Subject: Re: Replace a block with a new one >Hi, > >there's no way to do that, as HDFS does not provide file updates features. >You'll need to write a new file with t

unsubscribe

2014-07-17 Thread Debnath, Suman
Regards, Suman++

Re: NFS Gateway readonly issue

2014-07-17 Thread Abhiraj Butala
Hello, I was able to reproduce the issue on the latest Hadoop trunk, though for me I could only delete files; deleting directories was correctly blocked. I have opened https://issues.apache.org/jira/browse/HDFS-6703 to further track the issue. Thanks for reporting! Regards, Abhiraj On Thu, Jul 10,