ListWritable In Hadoop

2014-07-10 Thread unmesha sreeveni
Hi, do we have a ListWritable in Hadoop? -- Thanks, Regards, Unmesha Sreeveni U.B, Hadoop/Bigdata Developer

Re: ListWritable In Hadoop

2014-07-10 Thread Rich Haase
No, but Hadoop 2.2 has ArrayWritable. https://hadoop.apache.org/docs/r2.2.0/api/org/apache/hadoop/io/ArrayWritable.html Could you provide more information about your use case? There may be an alternative that will meet your needs. On Thu, Jul 10, 2014 at 12:31 AM, unmesha
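A minimal sketch of how ArrayWritable can stand in for a list: to use it as a MapReduce key/value type the framework needs a no-arg constructor that fixes the element class, so a small subclass is the usual pattern (TextArrayWritable is my own illustrative name, not part of Hadoop):

import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Text;

// Subclass so MapReduce can instantiate it reflectively; ArrayWritable
// itself has no no-arg constructor that records the element class.
public class TextArrayWritable extends ArrayWritable {
    public TextArrayWritable() {
        super(Text.class);
    }
    public TextArrayWritable(Text[] values) {
        super(Text.class, values);
    }
}

// Usage, e.g. inside a mapper:
//   TextArrayWritable out = new TextArrayWritable(
//           new Text[] { new Text("a"), new Text("b") });
//   context.write(key, out);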

Re: Copy hdfs block from one data node to another

2014-07-10 Thread Yehia Elshater
Thanks a lot. I will take a look at the balancer and decommissioning code. On 10 July 2014 00:34, Arpit Agarwal aagar...@hortonworks.com wrote: The balancer does something similar. It uses DataTransferProtocol.replaceBlock. On Wed, Jul 9, 2014 at 9:20 PM, sudhakara st

Re: The number of simultaneous map tasks is unexpected.

2014-07-10 Thread Tomasz Guziałek
Hi Adam, yarn.nodemanager.resource.memory-mb = 2370 MiB, yarn.nodemanager.resource.cpu-vcores = 2, yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler, Use CGroups for Resource Management

RE: Need to evaluate a cluster

2014-07-10 Thread YIMEN YIMGA Gael
Hi, when I said the size of a disk is 3TB, I meant that a datanode should have a disk space of 3TB (1 to 3 disks of 1TB to 3TB). Could you please use your experience to help me approximate the number of nodes for one year? Regards From: Oner Ak. [mailto:oak26...@gmail.com] Sent: Wednesday 9

RE: Need to evaluate a cluster

2014-07-10 Thread YIMEN YIMGA Gael
Hi, to Mirko: the number of HDDs per datanode is 3 (3 disks of 1TB to 3TB). I calculate the number of nodes using the following formula: - Used space on the cluster by daily feed: daily feed * replication factor = 720GB * 3 = 2160GB - Size of a disk for HDFS: Size of a

RE: Need to evaluate a cluster

2014-07-10 Thread YIMEN YIMGA Gael
Hi, what does « 1.3 for overhead » mean in this calculation? Regards From: Mirko Kämpf [mailto:mirko.kae...@gmail.com] Sent: Wednesday 9 July 2014 18:09 To: user@hadoop.apache.org Subject: Re: Need to evaluate a cluster Hello, if I follow your numbers I see one missing fact: What is the

Re: Need to evaluate a cluster

2014-07-10 Thread Mirko Kämpf
I multiply by 1.3, which means I add 30% of the estimated amount to have reserved capacity for intermediate data. In your case, with approx. 2TB per day, I think data nodes with 1 to 3 disks are not a good idea. You should consider servers with more disks and then add one per week. Start with 10
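To make the arithmetic in this thread concrete, here is a rough sketch of the sizing calculation: the 720GB/day feed, replication factor 3 and 1.3 overhead come from the mails above, while the per-node capacity and the one-year horizon are my own assumptions.

public class ClusterSizing {
    public static void main(String[] args) {
        double dailyFeedGb    = 720;    // raw daily ingest (from the thread)
        int    replication    = 3;      // HDFS replication factor
        double overhead       = 1.3;    // ~30% head-room for intermediate data
        int    days           = 365;    // one-year horizon (assumption)
        double nodeCapacityTb = 3 * 3;  // e.g. 3 disks of 3 TB each (assumption)

        double totalTb = dailyFeedGb * replication * overhead * days / 1024.0;
        long   nodes   = (long) Math.ceil(totalTb / nodeCapacityTb);

        System.out.printf("Raw storage needed: %.1f TB -> ~%d data nodes%n",
                totalTb, nodes);
    }
}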

RE: Need to evaluate a cluster

2014-07-10 Thread YIMEN YIMGA Gael
Thanks for your reply, Mirko. In my case, I can assume a compression factor of 8, according to the service in charge of it. The data I'm dealing with is logs only, but of many types (printing logs, USB logs, remote access logs, Active Directory logs, database server logs, Web

Re: Need to evaluate a cluster

2014-07-10 Thread Olivier Renault
Either you spend your money on servers with more disks, or you spend it on cooling / power consumption and potentially building a new DC ;). A typical server from a tier 1 vendor (HP, Dell, IBM, Cisco) should be around 5k euros (fully loaded with HDDs). Kind regards, Olivier On 10

RE: Need to evaluate a cluster

2014-07-10 Thread YIMEN YIMGA Gael
In addition, when I apply the compression factor of 8, my daily feed is 87GB/day. From: YIMEN YIMGA Gael ItecCsySat Sent: Thursday 10 July 2014 11:11 To: user@hadoop.apache.org Subject: RE: Need to evaluate a cluster Thanks for your reply, Mirko. In my case, I can consider compression

RE: Need to evaluate a cluster

2014-07-10 Thread YIMEN YIMGA Gael
Hi Olivier, when I say LOW-COST, I mean COMMODITY HARDWARE. Could you advise, please? From: Olivier Renault [mailto:orena...@hortonworks.com] Sent: Thursday 10 July 2014 11:18 To: user@hadoop.apache.org Subject: Re: Need to evaluate a cluster Either you spend your money on servers with more

RE: Need to evaluate a cluster

2014-07-10 Thread YIMEN YIMGA Gael
Thank you Mirko, I saw the chapter titled PLANNING A HADOOP CLUSTER. I'll take that book. From: Mirko Kämpf [mailto:mirko.kae...@gmail.com] Sent: Thursday 10 July 2014 11:22 To: user@hadoop.apache.org Subject: Re: Need to evaluate a cluster Just request a quote from the leading and also local

multiple map tasks writing in same hdfs file -issue

2014-07-10 Thread rab ra
Hello, I have one use-case that spans multiple map tasks in a Hadoop environment. I use Hadoop 1.2.1 with 6 task nodes. Each map task writes its output into a file stored in HDFS. This file is shared across all the map tasks. Though they all compute their output, some of them are missing

NFS Gateway readonly issue

2014-07-10 Thread bigdatagroup
Hello, we are experiencing a strange issue with our Hadoop cluster implementing NFS Gateway. We exported our distributed filesystem with the following configuration (managed by Cloudera Manager over CDH 5.0.1): <property> <name>dfs.nfs.exports.allowed.hosts</name> <value>192.168.0.153

Re: The number of simultaneous map tasks is unexpected.

2014-07-10 Thread Adam Kawa
yarn.nodemanager.resource.memory-mb = 2370 MiB, yarn.nodemanager.resource.cpu-vcores = 2, So, you cannot run more than 8 containers on your setup (according to your settings, each container consumes 1GB and 1 vcore). Considering that I have 8 cores in my cluster and not 16 as I thought at
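A small worked check of that container math, under my assumption of 4 NodeManagers with the settings quoted above and a 1 GiB / 1 vcore container request (the node count is not stated in the thread):

public class ContainerMath {
    public static void main(String[] args) {
        int nodes = 4;              // assumed number of NodeManagers
        int nmMemoryMb = 2370;      // yarn.nodemanager.resource.memory-mb
        int nmVcores = 2;           // yarn.nodemanager.resource.cpu-vcores
        int containerMb = 1024;     // per-container memory request
        int containerVcores = 1;    // per-container vcore request

        // Each node is limited by whichever resource runs out first.
        int perNode = Math.min(nmMemoryMb / containerMb, nmVcores / containerVcores);
        System.out.println("Containers per node: " + perNode);           // 2
        System.out.println("Cluster-wide limit:  " + nodes * perNode);   // 8
    }
}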

Re: how to access configuration properties on a remote Hadoop cluster

2014-07-10 Thread Geoff Thompson
Hi Adam, thanks for the suggestion. Geoff On Jul 9, 2014, at 2:02 PM, Adam Kawa kawa.a...@gmail.com wrote: Instead of Resource-Manager-WebApp-Address/conf, if you have the application id and job id, you can query the Resource Manager for the configuration of this particular application. You
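A sketch of how that query might look, assuming the MapReduce Application Master's REST conf resource reached through the ResourceManager web proxy (the host, application id and job id below are hypothetical placeholders; check the MR AM REST documentation for your Hadoop version):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class FetchJobConf {
    public static void main(String[] args) throws Exception {
        String rmWebApp = "http://resourcemanager:8088";          // placeholder
        String appId = "application_1404968400000_0001";          // hypothetical
        String jobId = "job_1404968400000_0001";                  // hypothetical

        URL url = new URL(rmWebApp + "/proxy/" + appId
                + "/ws/v1/mapreduce/jobs/" + jobId + "/conf");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Accept", "application/json");

        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);  // JSON list of property name/value pairs
            }
        }
    }
}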

Re: multiple map tasks writing in same hdfs file -issue

2014-07-10 Thread Arpit Agarwal
HDFS is single-writer, multiple-reader (see sec 8.3.1 of http://aosabook.org/en/hdfs.html). You cannot have multiple writers for a single file at a time. On Thu, Jul 10, 2014 at 2:55 AM, rab ra rab...@gmail.com wrote: Hello I have one use-case that spans multiple map tasks in hadoop

In_use.lock and other directories

2014-07-10 Thread hadoop hive
Hi folks, I am a bit confused about the files and directories present inside the datanode data directories, such as in_use.lock, detach, storage and current. Thanks

Re: Muliple map writing into same hdfs file

2014-07-10 Thread Vinod Kumar Vavilapalli
Concurrent writes to a single file in HDFS are not possible today. You may want to write a per-task file and use that entire directory as your output. +Vinod Hortonworks Inc. http://hortonworks.com/ On Wed, Jul 9, 2014 at 10:42 PM, rab ra rab...@gmail.com wrote: hello I have one use-case
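A minimal sketch of that per-task-file pattern: each map task opens its own file, named after its task attempt, inside one shared output directory, and the consumer reads the whole directory afterwards. The directory path and class name below are my own illustrative choices.

import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class PerTaskFileMapper
        extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

    private FSDataOutputStream out;

    @Override
    protected void setup(Context context) throws IOException {
        FileSystem fs = FileSystem.get(context.getConfiguration());
        Path dir = new Path("/user/rab/shared-output");             // assumed path
        // One file per task attempt, so retries and speculative runs never collide.
        Path perTask = new Path(dir, context.getTaskAttemptID().toString());
        out = fs.create(perTask, false);
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException {
        out.writeBytes(value.toString() + "\n");   // write this task's share
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        out.close();
    }
}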

Re: Issues with documentation on YARN

2014-07-10 Thread Akira AJISAKA
Thanks for the report! You can create an issue at https://issues.apache.org/jira/browse/YARN and submit a patch. This wiki page describes how to contribute to Apache Hadoop: https://wiki.apache.org/hadoop/HowToContribute Thanks, Akira (2014/07/08 19:54), Никитин Константин wrote: Hi! I'm

Re: Issues with documentation on YARN

2014-07-10 Thread Tsuyoshi OZAWA
Hi, to edit the document you mentioned, please edit hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WritingYarnApplications.apt.vm. The apt.vm document is written in the APT format, which is described in the following document:

Re: Muliple map writing into same hdfs file

2014-07-10 Thread Bertrand Dechoux
And besides, with a single file, if that were possible, how would you handle errors? Let's say task 1 ran 3 times: 1 error, 1 speculative and 1 success... A per-task file has been the standard way to easily solve that problem. Bertrand Dechoux On Thu, Jul 10, 2014 at 10:00 PM, Vinod Kumar Vavilapalli