How to download CDH3 RPM's

2013-05-01 Thread Sandeep Nemuri
Guys, please help me out! Thanks in advance. -- Regards, Sandeep

name node and secondary name node

2013-05-01 Thread Aditya exalter
What is the process between the name node and the secondary name node, in detail?

Re: name node and secondary name node

2013-05-01 Thread pradeep T
Hi Aditya, The Secondary NameNode periodically fetches the namespace metadata (the fsimage and the edit log) from the primary NameNode and merges them into a new checkpoint. That is the long story short. To know more, see the picture and description at this link: https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-10/hdfs Regards, Pra
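The checkpoint idea described above can be sketched as a toy model. This is only an illustration in Python, not Hadoop's real on-disk format or API: the names `apply_edits` and `checkpoint`, the dict standing in for the fsimage, and the tuple-based edit log are all assumptions made for the example.

```python
# Toy model of an HDFS checkpoint. The data structures are hypothetical and
# do not match Hadoop's actual fsimage or edit log formats.

def apply_edits(fsimage, edits):
    """Return a new namespace snapshot with the logged edits replayed on top."""
    merged = dict(fsimage)  # copy, so the previous image stays untouched
    for op, path, value in edits:
        if op == "create":
            merged[path] = value
        elif op == "delete":
            merged.pop(path, None)
        else:
            raise ValueError(f"unknown op: {op}")
    return merged


def checkpoint(fsimage, edits):
    """Merge image and edit log; the edit log starts empty again afterwards."""
    return apply_edits(fsimage, edits), []


# Hypothetical namespace: path -> replication factor.
fsimage = {"/a.txt": 3}
edits = [("create", "/b.txt", 3), ("delete", "/a.txt", None)]
new_image, new_edits = checkpoint(fsimage, edits)
print(new_image, new_edits)
```

The point of the sketch is that the merge work happens away from the primary, which then only has to replay a short edit log on restart instead of a long one.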

Re: High IO Usage in Datanodes due to Replication

2013-05-01 Thread Harsh J
Hi, Neither block reports nor block scanning should affect general DN I/O, although the former may affect DN liveness in older versions that lack HDFS-2379. Hence Brahma is partially right in having mentioned the block reports. Your solution, if the # of blocks per DN is too high (

executing with the data but through using a file system interface

2013-05-01 Thread Julian Bui
Hello hadoop users, I have a library that takes a string as input and finds the file on the HDFS and performs operations on it...but at the moment this doesn't take advantage of node awareness; it may or may not run on the node with the data. I'd like to fix this. ***Background*** So a little mo

Hadoop properties

2013-05-01 Thread Aditya exalter
What are the Hadoop properties? Please list all of them and explain each in detail. Thank you.

Re: Hadoop properties

2013-05-01 Thread Harsh J
You can read some maintained descriptions through the following links: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml http://hadoop.apache.org/docs/current/hadoop-mapred

Re: executing with the data but through using a file system interface

2013-05-01 Thread Harsh J
Hi, 1. MR uses the host hints provided by an InputSplit object's getLocations return. [1] 2. Typically, (1) is populated by taking each block's locations, available via the FileSystem's FileStatus objects queried over a block of a file [2]. 3. You need something custom like a WholeFileInputFormat,
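As a rough illustration of points 1 and 2, the following Python toy shows how host hints attached to a split could be used to prefer a data-local assignment. It is not Hadoop's scheduler (which is considerably more involved); `pick_split`, the dict layout, the paths and the host names are all hypothetical.

```python
# Toy model of locality-aware split assignment. Not Hadoop code: the split
# dicts below only mimic the idea of an InputSplit carrying host hints.

def pick_split(free_host, pending_splits):
    """Return (split, is_local) for a host that has a free task slot.

    Prefers a split whose location hints include free_host; otherwise falls
    back to the first pending split, which would then be read remotely.
    """
    for split in pending_splits:
        if free_host in split["locations"]:
            return split, True
    if pending_splits:
        return pending_splits[0], False
    return None, False


splits = [
    {"path": "/data/part-0", "locations": ["node1", "node2"]},
    {"path": "/data/part-1", "locations": ["node3"]},
]

print(pick_split("node3", splits))  # data-local match on part-1
print(pick_split("node9", splits))  # no local data, falls back to part-0
```

Under this model, a library that opens files by path on its own never gets a chance to influence placement; the hints have to come from the split that wraps the file.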

Re: High IO Usage in Datanodes due to Replication

2013-05-01 Thread selva
Hi Harsh, You are right, our Hadoop version is "0.20.2-cdh3u1", which lacks HDFS-2379. As you suggested, I have doubled the DN heap size and will now monitor the block scanning speed. The 2nd idea is good, but I cannot merge the small files (~1 MB) since they are all in Hive table partitions. -Selva

Re: High IO Usage in Datanodes due to Replication

2013-05-01 Thread Harsh J
If your partitions are only storing about 1 MB each, I don't know if it's a good key design or a good application for Hadoop. But if you mean that there are many files under a single partition, each of them about 1 MB, then you can safely merge them without issues. HDFS-2379 should

Re: How to download CDH3 RPM's

2013-05-01 Thread Paul Wilkinson
You can find the repo file at http://archive.cloudera.com/redhat/cdh/cloudera-cdh3.repo; alternatively, have a look under http://archive.cloudera.com/cdh/3. I'm curious as to why you'd want CDH3 when CDH4 has been available for quite some time. In general I'd recommend you look at CDH4 instead: http:/

Using Hadoop core jars on Windows

2013-05-01 Thread Benjamin Sznajder
Hi, Recently, I ran Nutch 1.6 on Windows. I encountered a problem with the Hadoop-related code, which did not run on Windows. To solve this issue, I had to use a relatively old version of the Hadoop core jar: hadoop-0.20.2-core.jar. My question is: - What is the way to use the newest jar o

there is not data-node

2013-05-01 Thread Mohsen B.Sarmadi
Dear Sirs/Madams, I am trying to run Hadoop 1.0.4 in pseudo-distributed mode, but I am facing this error in the datanode log: 01/05/2013 13:16:54 2013-05-01 13:16:54,206 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/mohs/hadoop/dfsdirdata: n

Re: there is not data-node

2013-05-01 Thread Serge Blazhievsky
You also need to remove the data in your data directories on every data node.

hadoop 1.1.2 sources (maven project)

2013-05-01 Thread Oleg Ruchovets
Hi, I have hadoop-core 1.1.2 and hadoop-test 1.1.2 as dependencies in my Maven project (org.apache.hadoop:hadoop-core:1.1.2 and org.apache.hadoop:hadoop-test:1.1.2). For some reason I did not succeed in getting the source code for th

Re: hadoop 1.1.2 sources (maven project)

2013-05-01 Thread Ted Yu
I looked under my local Maven repo and didn't see source code alongside hadoop-core-1.1.2.jar. Can you check out the 1.1.2 source code? Cheers

Re: hadoop 1.1.2 sources (maven project)

2013-05-01 Thread Oleg Ruchovets
Hi Ted. What do you mean by checking out the code? Thanks, Oleg.

Re: hadoop 1.1.2 sources (maven project)

2013-05-01 Thread Ted Yu
I meant using a command such as the following (I use svn): svn co http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 Cheers

Re: hadoop 1.1.2 sources (maven project)

2013-05-01 Thread Oleg Ruchovets
Oh, I see. You propose checking out the source code from source control. Is there any other option? I mean, is there a Maven repository which holds the sources of these jars? Thanks, Oleg.

Re: there is not data-node

2013-05-01 Thread 姚吉龙
The ID is different on the namenode and the data node. You can modify the ID. I met the same issue and completely removed all files under hadoop.

Re: New to Hadoop-SSH communication

2013-05-01 Thread Automation Me
I am still not able to communicate between the two VMs. I changed the hostnames to Master and Slave, copied the hostnames into /etc/hosts on both master and slave, and reinstalled the SSH server. But when I try to copy a file from master to slave, it is copied to the master and not to the slave: scp -r /usr/local/hadoop/conf hdu

Re: New to Hadoop-SSH communication

2013-05-01 Thread kishore alajangi
You might be running the copy while logged in to the slave machine. Exit from the slave session on the master. Thanks, Kishore. On Wed, May 1, 2013 at 3:00 AM, Automation Me wrote: > Thank you Tariq. > > I am using the same username on both the machines and when i try to copy > a file master to slave just to make sure SSH

Re: reducer gets values with empty attributes

2013-05-01 Thread alxsss
Hi, Here is the map and reduce part of the code public void map(Text key, Writable value, OutputCollector output, Reporter reporter) throws IOException { ParseData parseData = (ParseData) value; Metadata parseMeta = parseData.getParseMeta(); if(parseMeta.getValues("id").len

Re: New to Hadoop-SSH communication

2013-05-01 Thread shashwat shriparv
Open the /etc/hostname file on each machine, set it to master or slave respectively, and restart the system. Don't use 127.0.0.1 and 127.0.0.2 as the IPs; run the ifconfig command to see the actual IP and put that in the hosts file. *Thanks & Regards* ∞ Shashwat Shriparv
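The advice about not mapping the cluster hostnames to 127.0.0.x can be checked mechanically. Below is a minimal Python sketch, assuming the usual hosts-file layout (address first, then names, `#` for comments); the function name `loopback_entries` and the sample addresses are made up for illustration.

```python
import ipaddress


def loopback_entries(hosts_text, names):
    """Return (hostname, address) pairs where a wanted name maps to a loopback IP."""
    wanted = {n.lower() for n in names}
    bad = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        addr, hostnames = fields[0], fields[1:]
        try:
            is_loopback = ipaddress.ip_address(addr).is_loopback
        except ValueError:
            continue  # not a parseable IP address, skip the line
        if is_loopback:
            bad.extend((h, addr) for h in hostnames if h.lower() in wanted)
    return bad


# Hypothetical hosts file showing the misconfiguration described above.
broken = "127.0.0.1 localhost\n127.0.0.1 master\n127.0.0.2 slave\n"
print(loopback_entries(broken, ["master", "slave"]))
```

To check a real machine, the same function could be fed the contents of /etc/hosts; an empty result means neither name points at a loopback address.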

Re: New to Hadoop-SSH communication

2013-05-01 Thread shashwat shriparv
Watch these for successful configuration: https://www.youtube.com/watch?v=gIRubPl20oo https://www.youtube.com/watch?v=pgOKKl5P0to https://www.youtube.com/watch?v=8CrgPUaNfjk *Thanks & Regards* ∞ Shashwat Shriparv

Re: there is not data-node

2013-05-01 Thread shashwat shriparv
Format your namenode and start again. *Thanks & Regards* ∞ Shashwat Shriparv

Re: there is not data-node

2013-05-01 Thread Mohammad Tariq
Have you reformatted the NN (unsuccessfully)? Was your NN serving some other cluster earlier, or were your DNs part of some other cluster? Datanodes bind themselves to the namenode through the namespaceID, and in your case the IDs of the DNs and the NN seem to be different. As a workaround you could do this: 1-
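To confirm that the IDs really differ before changing anything, the namespaceID lines from the namenode's and datanode's `current/VERSION` files can be compared. The following Python sketch works on the file contents as text; the helper names and the sample IDs are hypothetical, and it assumes the `key=value` layout with a `namespaceID` entry that Hadoop 1.x writes to those files.

```python
def read_namespace_id(version_text):
    """Extract namespaceID from the text of a current/VERSION file, or None."""
    for line in version_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip the timestamp comment and malformed lines
        key, _, value = line.partition("=")
        if key.strip() == "namespaceID":
            return int(value.strip())
    return None


def ids_match(nn_version_text, dn_version_text):
    """True only if both files carry a namespaceID and the two are equal."""
    nn_id = read_namespace_id(nn_version_text)
    dn_id = read_namespace_id(dn_version_text)
    return nn_id is not None and nn_id == dn_id


# Hypothetical file contents; real IDs will differ.
nn_version = "#Wed May 01 13:16:54 2013\nnamespaceID=1111\ncTime=0\n"
dn_version = "#Wed May 01 12:00:00 2013\nnamespaceID=2222\ncTime=0\n"
print(ids_match(nn_version, dn_version))
```

A mismatch here corresponds to the "Incompatible namespaceIDs" error from the datanode log; which side to change (or whether to wipe the datanode directories) is the decision discussed in the thread.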

RE: Using Hadoop core jars on Windows

2013-05-01 Thread Ivan Mitic
Hi Benjamin, Is there a specific version of Hadoop you are interested in? Hadoop 1.0, Hadoop 2.0? There is not an official Apache release of Hadoop on Windows yet; however, there is an ongoing effort to get there. The Apache branch-1-win and trunk branches already have decent Windows support.

Re: Versions - Confusion

2013-05-01 Thread Harsh J
Hi Naidu, (a) As Steve already mentioned, please do not ask user questions on any of the *-dev lists as the list is only for project development and contributors, not general Q/A. (b) I've moved your question to user@hadoop.apache.org which is a proper list for general Q/A. (c) It is not a good th