How does AWS know how many map/reduce slots should be configured for each EC2 instance?

2013-07-19 Thread WangRamon
Hi All
We have a plan to move to the Amazon AWS cloud. By doing some research I found
that I can start a map/reduce cluster in AWS with the following command:

% bin/hadoop-ec2 launch-cluster test-cluster 2

The command allows me to start a cluster with the required number of nodes (no
more than 20, correct me if I am wrong), so here come my questions:
1. How does AWS know how many map/reduce slots should be configured for each
EC2 instance? Does it depend on the EC2 instance type (m1.large, m1.xlarge...)?
2. How is it charged? Number of nodes * price per node per hour?
3. Is each node a single EC2 instance in my admin console?
Thanks in advance!
Cheers
Ramon
  

Re: How does AWS know how many map/reduce slots should be configured for each EC2 instance?

2013-07-19 Thread TianYi Zhu
1. Yes, it depends on the instance type. Generally, number of map slots +
number of reduce slots = number of ECUs, and number of map slots / number of
reduce slots = 3. You can customize these numbers.
2. Yes: number of nodes * running hours * price per EMR node per hour (an EMR
node is a little more expensive than a plain EC2 node).
3. Yes, you can find them in the admin console.
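
As a worked example of that rule of thumb (ECU counts per the AWS instance
specs of the time, so double-check current values):

    map slots + reduce slots = ECUs,  map slots / reduce slots = 3
    m1.large  (4 ECU)  ->  3 map slots + 1 reduce slot
    m1.xlarge (8 ECU)  ->  6 map slots + 2 reduce slots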






RE: How does AWS know how many map/reduce slots should be configured for each EC2 instance?

2013-07-19 Thread WangRamon
Hi Tianyi
Thanks for the reply, that really helps. I have two further questions:
1. You said I can customize the number of slots on AWS; how do I do it? I know
I can do it in mapred-site.xml if I create the cluster without AWS.
2. You mentioned the EMR node; will the hadoop-ec2 launch-cluster command
start EMR nodes or plain EC2 instances? Thanks a lot.
Cheers
Ramon


Re:

2013-07-19 Thread Anit Alexander
Hello Tariq,
I solved the problem. There must have been some problem in the custom input
format I created, so I took a sample custom input format which was working
in the cdh4 environment and applied the changes as per my requirements. It is
working now. But I haven't tested that code in an Apache Hadoop environment
yet :)

Regards,
Anit
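
For what it's worth, a minimal custom input format of the sort being discussed
(new mapreduce API; the class name and the choice to simply reuse
LineRecordReader are illustrative, not Anit's code) looks like:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;

    // Split computation is inherited from FileInputFormat; only the
    // record reader is customized here.
    public class MyInputFormat extends FileInputFormat<LongWritable, Text> {
      @Override
      public RecordReader<LongWritable, Text> createRecordReader(
          InputSplit split, TaskAttemptContext context)
          throws IOException, InterruptedException {
        return new LineRecordReader();
      }
    }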


On Thu, Jul 18, 2013 at 1:22 AM, Mohammad Tariq donta...@gmail.com wrote:

 Hello Anit,

 Could you show me the exact error log?

 Warm Regards,
 Tariq
 cloudfront.blogspot.com


 On Tue, Jul 16, 2013 at 8:45 AM, Anit Alexander anitama...@gmail.com wrote:

 Yes, I did recompile. But I seem to face the same problem. I am running
 the map reduce job with a custom input format. I am not sure if there is some
 change in the API needed to get the splits correct.

 Regards


 On Tue, Jul 16, 2013 at 6:24 AM, 闫昆 yankunhad...@gmail.com wrote:

 I think you should recompile the program before you run it


 2013/7/13 Anit Alexander anitama...@gmail.com

 Hello,

 I am encountering a problem in a cdh4 environment.
 I can successfully run the map reduce job in the hadoop cluster. But
 when I migrated the same map reduce job to my cdh4 environment it raises an
 error stating that it cannot read the next block (each block is 64 MB). Why
 is that so?

 Hadoop environment: hadoop 1.0.3
 java version 1.6

 cdh4 environment: CDH4.2.0
 java version 1.6

 Regards,
 Anit Alexander







Re: How does AWS know how many map/reduce slots should be configured for each EC2 instance?

2013-07-19 Thread Mischa Tuffield
Hey, 

On 19 Jul 2013, at 07:55, WangRamon ramon_w...@hotmail.com wrote:

 Hi Tianyi
 
 Thanks for the reply, that really helps. I have two further questions:
 
 1. You said I can customize the number of slots on AWS; how do I do it? I
 know I can do it in mapred-site.xml if I create the cluster without AWS.

You can pass arguments to a bootstrap action called configure-hadoop that is
provided by the AWS folk, like so (I do this all the time):

 --bootstrap-action s3://elasticmapreduce/bootstrap-actions/configure-hadoop \
 --args -m,mapred.reduce.child.java.opts=-Xmx7168m,-s,mapred.tasktracker.reduce.tasks.maximum=80,-s,mapred.reduce.tasks=80 \
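
For context, a full launch with that bootstrap action via the Ruby
elastic-mapreduce CLI of the era might look something like this sketch
(cluster name, size, and slot counts are placeholders, not Mischa's settings):

    elastic-mapreduce --create --alive --name "test-cluster" \
      --num-instances 10 --instance-type m1.xlarge \
      --bootstrap-action s3://elasticmapreduce/bootstrap-actions/configure-hadoop \
      --args "-s,mapred.tasktracker.map.tasks.maximum=6,-s,mapred.tasktracker.reduce.tasks.maximum=2"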


 2. You mentioned the EMR node; will the hadoop-ec2 launch-cluster
 command start EMR nodes or plain EC2 instances? Thanks a lot.

An EMR node in this case is an EC2 instance running an AMI which the AWS folk
have configured and installed a version of hadoop on.

You can find the EMR AMI for EC2 by searching for AWS157 under AMIs.

Mischa

 

___
Mischa Tuffield PhD
http://mmt.me.uk/
@mischat







Re:

2013-07-19 Thread Mohammad Tariq
Glad to hear that :)

Warm Regards,
Tariq
cloudfront.blogspot.com





./hdfs namenode -bootstrapStandby error

2013-07-19 Thread lei liu
I use the hadoop-2.0.5 version and use QJM for HA.

I ran ./hdfs namenode -bootstrapStandby for the StandbyNameNode, but it
reports the error below:

=
About to bootstrap Standby ID nn2 from:
   Nameservice ID: mycluster
Other Namenode ID: nn1
  Other NN's HTTP address: 10.232.98.77:20021
  Other NN's IPC  address: dw77.kgb.sqa.cm4/10.232.98.77:20020
 Namespace ID: 1499625118
Block pool ID: BP-2012507965-10.232.98.77-1372993302021
   Cluster ID: CID-921af0aa-b831-4828-965c-3b71a5149600
   Layout version: -40
=
Re-format filesystem in Storage Directory
/home/musa.ll/hadoop2/cluster-data/name ? (Y or N) Y
13/07/19 17:04:28 INFO common.Storage: Storage directory
/home/musa.ll/hadoop2/cluster-data/name has been successfully formatted.
13/07/19 17:04:29 FATAL ha.BootstrapStandby: Unable to read transaction ids
16317-16337 from the configured shared edits storage
qjournal://10.232.98.61:20022;10.232.98.62:20022;
10.232.98.63:20022/mycluster. Please copy these logs into the shared edits
storage or call saveNamespace on the active node.
Error: Gap in transactions. Expected to be able to read up until at least
txid 16337 but unable to find any edit logs containing txid 16331
13/07/19 17:04:29 INFO util.ExitUtil: Exiting with status 6



The edit logs in the JournalNode contain the following:
-rw-r--r-- 1 musa.ll users  30 Jul 19 15:51
edits_0016327-0016328
-rw-r--r-- 1 musa.ll users  30 Jul 19 15:53
edits_0016329-0016330
-rw-r--r-- 1 musa.ll users 1048576 Jul 19 17:03
edits_inprogress_0016331


The edits_inprogress_0016331 file should contain transactions
16331-16337, so why does the ./hdfs namenode -bootstrapStandby command report
an error? How can I initialize the StandbyNameNode?

Thanks,

LiuLei


DistributedCache incompatibility issue between 1.0 and 2.0

2013-07-19 Thread Edward J. Yoon
Hi,

I wonder why the setLocalFiles and addLocalFiles methods have been
removed, and what should I use instead of them?

-- 
Best Regards, Edward J. Yoon
@eddieyoon


Unexpected problem in creating temporary file

2013-07-19 Thread Ajay Srivastava
Hi,

I am seeing many such errors on a datanode -

2013-07-18 22:10:49,473 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(10.254.0.40:50010, 
storageID=DS-595314104-10.254.0.40-50010-1374154266946, infoPort=50075, 
ipcPort=50020):DataXceiver
java.io.IOException: Unexpected problem in creating temporary file for 
blk_-395633752903233591_3721.  File 
/data/hadoop-admin/tmp/blk_-395633752903233591 should not be present, but is.
at 
org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:429)
at 
org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:407)
at 
org.apache.hadoop.hdfs.server.datanode.FSDataset.createTmpFile(FSDataset.java:1242)
at 
org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:1131)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:99)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:305)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:113)
at java.lang.Thread.run(Unknown Source)

What could be the reason behind it? It is causing the system to slow down.
Before these errors, there were a few exceptions with connection reset by peer,
which I guess are harmless.


Regards,
Ajay Srivastava

RE: DistributedCache incompatibility issue between 1.0 and 2.0

2013-07-19 Thread Botelho, Andrew
I have been using Job.addCacheFile() to cache files in the distributed cache.  
It has been working for me on Hadoop 2.0.5:

public void addCacheFile(URI uri)
Add a file to be localized
Parameters:
uri - The uri of the cache to be localized
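
A minimal driver-side sketch of that call (the HDFS path and the '#dict'
symlink alias are made up for illustration):

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "cache-demo");
    // The fragment after '#' becomes the symlink name in each task's
    // working directory; tasks can list the files via context.getCacheFiles().
    job.addCacheFile(new URI("hdfs:///apps/lookup/dict.txt#dict"));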




Re: Namenode automatically going to safemode with 2.1.0-beta

2013-07-19 Thread Krishna Kishore Bonagiri
Hi Harsh,

  I have made my dfs.namenode.name.dir point to a subdirectory of my home,
and I don't see this issue again. So, is this a bug that we need to log
into JIRA?

Thanks,
Kishore


On Tue, Jul 16, 2013 at 6:39 AM, Harsh J ha...@cloudera.com wrote:

  2013-07-12 11:04:26,002 WARN
 org.apache.hadoop.hdfs.server.namenode.NameNodeResourceChecker: Space
 available on volume 'null' is 0, which is below the configured reserved
 amount 104857600

 This is interesting. Its calling your volume null, which may be more
 of a superficial bug.

 What is your dfs.namenode.name.dir set to? From
 /tmp/hadoop-dsadm/dfs/name I'd expect you haven't set it up and /tmp
 is being used off of the out-of-box defaults. Could you try to set it
 to a specific directory thats not on /tmp?

 On Mon, Jul 15, 2013 at 2:43 PM, Krishna Kishore Bonagiri
 write2kish...@gmail.com wrote:
  I don't have it in my hdfs-site.xml, in which case probably the default
  value is taken..
 
 
  On Mon, Jul 15, 2013 at 2:29 PM, Azuryy Yu azury...@gmail.com wrote:
 
  please check dfs.datanode.du.reserved in the hdfs-site.xml
 
  On Jul 15, 2013 4:30 PM, Aditya exalter adityaexal...@gmail.com
 wrote:
 
  Hi Krishna,
 
 Can you please send screenshots of namenode web UI.
 
  Thanks Aditya.
 
 
  On Mon, Jul 15, 2013 at 1:54 PM, Krishna Kishore Bonagiri
  write2kish...@gmail.com wrote:
 
  I have had enough space on the disk that is used, like around 30 Gigs
 
  Thanks,
  Kishore
 
 
  On Mon, Jul 15, 2013 at 1:30 PM, Venkatarami Netla
  venkatarami.ne...@cloudwick.com wrote:
 
  Hi,
  pls see the available space for NN storage directory.
 
  Thanks  Regards
 
  Venkat
 
 
  On Mon, Jul 15, 2013 at 12:14 PM, Krishna Kishore Bonagiri
  write2kish...@gmail.com wrote:
 
  Hi,
 
   I am doing no activity on my single node cluster which is using
  2.1.0-beta, and still observed that it has gone to safe mode by
 itself after
  a while. I was looking at the name node log and see many of these
 kinds of
  entries.. Can anything be interpreted from these?
 
  2013-07-12 09:06:11,256 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log
 segment at
  561
  2013-07-12 09:07:11,290 INFO
  org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log
 from
  9.70.137.114
  2013-07-12 09:07:11,290 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
  2013-07-12 09:07:11,290 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log
 segment 561
  2013-07-12 09:07:11,291 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of
 transactions: 2
  Total time for transactions(ms): 1 Number of transactions batched
 in Syncs:
  0 Number of syncs: 2 SyncTimes(ms): 14
  2013-07-12 09:07:11,292 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of
 transactions: 2
  Total time for transactions(ms): 1 Number of transactions batched
 in Syncs:
  0 Number of syncs: 3 SyncTimes(ms): 15
  2013-07-12 09:07:11,293 INFO
  org.apache.hadoop.hdfs.server.namenode.FileJournalManager:
 Finalizing edits
  file
 /tmp/hadoop-dsadm/dfs/name/current/edits_inprogress_561
  -
 
 /tmp/hadoop-dsadm/dfs/name/current/edits_561-562
  2013-07-12 09:07:11,294 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log
 segment at
  563
  2013-07-12 09:08:11,397 INFO
  org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log
 from
  9.70.137.114
  2013-07-12 09:08:11,398 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
  2013-07-12 09:08:11,398 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log
 segment 563
  2013-07-12 09:08:11,399 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of
 transactions: 2
  Total time for transactions(ms): 2 Number of transactions batched
 in Syncs:
  0 Number of syncs: 2 SyncTimes(ms): 11
  2013-07-12 09:08:11,400 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of
 transactions: 2
  Total time for transactions(ms): 2 Number of transactions batched
 in Syncs:
  0 Number of syncs: 3 SyncTimes(ms): 12
  2013-07-12 09:08:11,402 INFO
  org.apache.hadoop.hdfs.server.namenode.FileJournalManager:
 Finalizing edits
  file
 /tmp/hadoop-dsadm/dfs/name/current/edits_inprogress_563
  -
 
 /tmp/hadoop-dsadm/dfs/name/current/edits_563-564
  2013-07-12 09:08:11,402 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log
 segment at
  565
  2013-07-12 09:09:11,440 INFO
  org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log
 from
  9.70.137.114
  2013-07-12 09:09:11,440 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
  2013-07-12 09:09:11,440 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log
 segment 565
  2013-07-12 09:09:11,440 INFO
  org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of
 transactions: 2
  Total time for transactions(ms): 0 

Re: DistributedCache incompatibility issue between 1.0 and 2.0

2013-07-19 Thread Ted Yu
See this thread also:
http://search-hadoop.com/m/3pgakkVpm71/Distributed+Cache+omkarsubj=Re+Distributed+Cache




Re: ./hdfs namenode -bootstrapStandby error

2013-07-19 Thread Azuryy Yu
Hi,

Can you try 'hdfs namenode -initializeSharedEdits' on the active NN? Remember
to start all journal nodes before trying this.
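
Roughly, the full sequence would be (a sketch against the stock Hadoop 2
scripts; adjust paths for your deployment):

    # 1. On every journal node host:
    sbin/hadoop-daemon.sh start journalnode

    # 2. On the active NN, re-initialize the shared edits storage:
    bin/hdfs namenode -initializeSharedEdits

    # 3. Then retry on the standby NN:
    bin/hdfs namenode -bootstrapStandby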

Re: Namenode automatically going to safemode with 2.1.0-beta

2013-07-19 Thread Azuryy Yu
This is not a bug.

It has been documented.

Re: Namenode automatically going to safemode with 2.1.0-beta

2013-07-19 Thread Harsh J
Yeah, I believe your /tmp was probably misbehaving somehow (running out
of space or otherwise). You could log a JIRA for the null seen in
the log, though; it shouldn't have done that and should've shown the
real mount point.
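
For reference, pinning the metadata directory off /tmp is a single property in
hdfs-site.xml (the path below is only an example):

    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/home/dsadm/dfs/name</value>
    </property>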


Re: DistributedCache incompatibility issue between 1.0 and 2.0

2013-07-19 Thread Omkar Joshi
Check https://issues.apache.org/jira/browse/MAPREDUCE-4493 and
https://issues.apache.org/jira/browse/YARN-916.

Thanks,
Omkar Joshi
*Hortonworks Inc.* http://www.hortonworks.com




Re: Unexpected problem in creating temporary file

2013-07-19 Thread Ajay Srivastava
Any suggestions?
I am stuck.


Regards,
Ajay Srivastava





Wish to subscribe

2013-07-19 Thread Pradeep Singh
Regards
Pradeep Singh



subscribe

2013-07-19 Thread Pradeep Singh
Regards
Pradeep Singh


RE: subscribe

2013-07-19 Thread Devaraj k
Hi Pradeep,

Please send mail to the subscribe mail IDs; after subscription, if you have
any queries, you can reach the corresponding lists. You can find the
subscribe mail IDs on this page:

   http://hadoop.apache.org/mailing_lists.html
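
For example, to join the user list you would send an empty mail to
user-subscribe@hadoop.apache.org (the standard Apache -subscribe convention;
confirm the exact address on that page).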


Thanks
Devaraj k

From: Pradeep Singh [mailto:hadoop.guy0...@gmail.com]
Sent: 20 July 2013 09:38
To: Hadoop Common commits mailing list; Hadoop Common issue tracking system;
Hadoop HDFS issues mailing list; Hadoop MapReduce commits mailing list; Hadoop
user mailing list; general mailing list (announcements and project management)
Subject: subscribe

Regards
Pradeep Singh