Too many CLOSE_WAIT connections bring performance down

2013-08-07 Thread Bing Jiang
Version:
HBase: 0.94.3
HDFS: 0.20.*

There are too many CLOSE_WAIT connections from the RS to the DN, and I find the
number is over 3.
I changed the log level of 'org.apache.hadoop.ipc.HBaseServer.trace' to
DEBUG and checked the performance:

 Call #2649932; Served: HRegionInterface#get queueTime=0 processingTime=284
 contents=1 Get, 86 bytes

 So the conclusion is that when the DataNode server port is occupied by
normal or irregular connections, it brings read/write performance down.
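
(A quick way to count these sockets on the RS host -- the DataNode transfer
port 50010 below is only the default, so treat it as an assumption:)

  # count CLOSE_WAIT sockets held from this host towards the DataNode port
  netstat -anp | grep ':50010' | grep CLOSE_WAIT | wc -l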

According to the TCP/IP protocol, CLOSE_WAIT means that the RS has not closed
an fd it opened. I restarted the RS gracefully and the problem
was resolved. OK, my question is:

Can someone tell me under which conditions the RS will leave the file handle open like this?

Any ideas will be nice.

Thanks!

-- 
Bing Jiang
Tel:(86)134-2619-1361
weibo: http://weibo.com/jiangbinglover
BLOG: www.binospace.com
BLOG: http://blog.sina.com.cn/jiangbinglover
Focus on distributed computing, HDFS/HBase


Re: whitelist feature of YARN

2013-08-07 Thread Sandy Ryza
YARN-521, which brings whitelisting to the AMRMClient APIs, is now included
in 2.1.0-beta.  Check out the doc for the relaxLocality parameter in
ContainerRequest in AMRMClient:
https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AMRMClient.java
and
I can help clarify here if anything's confusing.

-Sandy


On Tue, Jul 9, 2013 at 2:54 AM, Krishna Kishore Bonagiri 
write2kish...@gmail.com wrote:

 Hi Sandy,

   Yes, I have been using AMRMClient APIs. I am planning to shift to
 whatever way is this white list feature is supported with. But am not sure
 what is meant by submitting ResourceRequests directly to RM. Can you please
 elaborate on this or give me a pointer to some example code on how to do
 it...

Thanks for the reply,

 -Kishore


 On Mon, Jul 8, 2013 at 10:53 PM, Sandy Ryza sandy.r...@cloudera.comwrote:

 Hi Krishna,

 From your previous email, it looks like you are using the AMRMClient
 APIs.  Support for whitelisting is not yet supported through them.  I am
 working on this in YARN-521, which should be included in the next release
 after 2.1.0-beta.  If you are submitting ResourceRequests directly to the
 RM, you can whitelist a node by
 * setting the relaxLocality flag on the node-level ResourceRequest to true
 * setting the relaxLocality flag on the corresponding rack-level
 ResourceRequest to false
 * setting the relaxLocality flag on the corresponding any-level
 ResourceRequest to false
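
 (A minimal sketch of those three requests -- the newInstance factory methods,
 ResourceRequest.ANY and the rack name here are assumptions based on the
 2.1.0-beta records API, and the sizes are illustrative:)

   import org.apache.hadoop.yarn.api.records.Priority;
   import org.apache.hadoop.yarn.api.records.Resource;
   import org.apache.hadoop.yarn.api.records.ResourceRequest;

   // whitelist "node1": locality may only be satisfied at the node level
   Priority pri = Priority.newInstance(0);
   Resource cap = Resource.newInstance(1024, 1);   // 1 GB, 1 vcore (illustrative)

   ResourceRequest nodeReq = ResourceRequest.newInstance(pri, "node1", cap, 1, true);
   ResourceRequest rackReq = ResourceRequest.newInstance(pri, "/default-rack", cap, 1, false);
   ResourceRequest anyReq  = ResourceRequest.newInstance(pri, ResourceRequest.ANY, cap, 1, false);  // ANY == "*"
   // add all three to the ask list of the AllocateRequest sent to the RM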

 -Sandy


 On Mon, Jul 8, 2013 at 6:48 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi,

   Can someone please point to some example code of how to use the
 whitelist feature of YARN, I have recently got RC1 for hadoop-2.1.0-beta
 and want to use this feature.

   It would be great if you can point me to some description of what this
 white listing feature is, I have gone through some JIRA logs related to
 this but more concrete explanation would be helpful.

 Thanks,
 Kishore






Re: whitelist feature of YARN

2013-08-07 Thread Krishna Kishore Bonagiri
Hi Sandy,

  Thanks for the reply and it is good to know YARN-521 is done! Please
answer my following questions

1) When is 2.1.0-beta going to be released? Is it soon, or do you suggest I
take it from trunk, or is there a recent release candidate available?

2) I have recently changed my application to use the new Asynchronous
interfaces. I am hoping it works with that too, correct me if I am wrong.

3) Change in interface:

The old interface for ContainerRequest constructor used to be this:

 public ContainerRequest(Resource capability, String[] nodes,
String[] racks, Priority priority, int containerCount);

whereas now it is changed to:

a) public ContainerRequest(Resource capability, String[] nodes,
String[] racks, Priority priority)


b) public ContainerRequest(Resource capability, String[] nodes,
String[] racks, Priority priority, boolean relaxLocality)

That means the old argument containerCount is gone! How would I be able to
specify how many containers I need?

-Kishore




On Wed, Aug 7, 2013 at 11:37 AM, Sandy Ryza sandy.r...@cloudera.com wrote:

 YARN-521, which brings whitelisting to the AMRMClient APIs, is now
 included in 2.1.0-beta.  Check out the doc for the relaxLocality paramater
 in ContainerRequest in AMRMClient:
 https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AMRMClient.java
  and
 I can help clarify here if anything's confusing.

 -Sandy


 On Tue, Jul 9, 2013 at 2:54 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi Sandy,

   Yes, I have been using AMRMClient APIs. I am planning to shift to
 whatever way is this white list feature is supported with. But am not sure
 what is meant by submitting ResourceRequests directly to RM. Can you please
 elaborate on this or give me a pointer to some example code on how to do
 it...

Thanks for the reply,

 -Kishore


 On Mon, Jul 8, 2013 at 10:53 PM, Sandy Ryza sandy.r...@cloudera.comwrote:

 Hi Krishna,

 From your previous email, it looks like you are using the AMRMClient
 APIs.  Support for whitelisting is not yet supported through them.  I am
 working on this in YARN-521, which should be included in the next release
 after 2.1.0-beta.  If you are submitting ResourceRequests directly to the
 RM, you can whitelist a node by
 * setting the relaxLocality flag on the node-level ResourceRequest to
 true
 * setting the relaxLocality flag on the corresponding rack-level
 ResourceRequest to false
 * setting the relaxLocality flag on the corresponding any-level
 ResourceRequest to false

 -Sandy


 On Mon, Jul 8, 2013 at 6:48 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi,

   Can someone please point to some example code of how to use the
 whitelist feature of YARN, I have recently got RC1 for hadoop-2.1.0-beta
 and want to use this feature.

   It would be great if you can point me to some description of what
 this white listing feature is, I have gone through some JIRA logs related
 to this but more concrete explanation would be helpful.

 Thanks,
 Kishore







Namenode is failing with exception to join

2013-08-07 Thread Manish Bhoge
I have all the configuration fine, but whenever I start the NameNode it fails with
the exception below. No clue where to fix this?


2013-08-07 02:56:22,754 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: 
Exception in namenode join
2013-08-07 02:56:22,751 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Number of files = 1
2013-08-07 02:56:22,751 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Number of files under construction = 0
2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Image file of size 115 loaded in 0 seconds.
2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Loaded image for txid 0 from /data/1/dfs/nn/current/fsimage_000
2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@5f18223d 
expecting start txid #1
2013-08-07 02:56:22,752 INFO 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding 
stream '/data/1/dfs/nn/current/edits_0515247-0515255' 
to transaction ID 1
2013-08-07 02:56:22,753 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
Stopping NameNode metrics system...
2013-08-07 02:56:22,754 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
NameNode metrics system stopped.
2013-08-07 02:56:22,754 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
NameNode metrics system shutdown complete. 2013-08-07 02:56:22,754 FATAL 
org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join 
java.io.IOException: There appears to be a gap in the edit log.  We expected 
txid 1, but got txid 515247.

Re: whitelist feature of YARN

2013-08-07 Thread Sandy Ryza
Responses inline:


On Tue, Aug 6, 2013 at 11:55 PM, Krishna Kishore Bonagiri 
write2kish...@gmail.com wrote:

 Hi Sandy,

   Thanks for the reply and it is good to know YARN-521 is done! Please
 answer my following questions

 1) when is 2.1.0-beta going to be released? is it soon or do you suggest
 me take it from the trunk or is there a recent release candidate available?

 We're very close and my guess would be no later than the end of the
month (don't hold me to this).


 2) I have recently changed my application to use the new Asynchronous
 interfaces. I am hoping it works with that too, correct me if I am wrong.

ContainerRequest is shared by the async interfaces as well so it should
work here.


 3) Change in interface:

 The old interface for ContainerRequest constructor used to be this:

  public ContainerRequest(Resource capability, String[] nodes,
 String[] racks, Priority priority, int containerCount);

 where as now it is changed to

 a) public ContainerRequest(Resource capability, String[] nodes,
 String[] racks, Priority priority)
 

 b) public ContainerRequest(Resource capability, String[] nodes,
 String[] racks, Priority priority, boolean relaxLocality)

 that means the old argument containerCount is gone! How would I be able to
 specify how many containers do I need?

 We now expect that you submit a ContainerRequest for each container you
want.
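
(A minimal sketch of that, using the four-argument constructor shown above;
'amClient' is assumed to be an already-started AMRMClient or AMRMClientAsync,
and the sizes/counts are illustrative:)

  // request N containers by adding N individual ContainerRequests
  Resource cap = Resource.newInstance(1024, 1);
  Priority pri = Priority.newInstance(0);
  int numContainers = 2;

  for (int i = 0; i < numContainers; i++) {
    // nodes/racks null => no locality constraint
    AMRMClient.ContainerRequest req =
        new AMRMClient.ContainerRequest(cap, null, null, pri);
    amClient.addContainerRequest(req);
  }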


 -Kishore




 On Wed, Aug 7, 2013 at 11:37 AM, Sandy Ryza sandy.r...@cloudera.comwrote:

 YARN-521, which brings whitelisting to the AMRMClient APIs, is now
 included in 2.1.0-beta.  Check out the doc for the relaxLocality paramater
 in ContainerRequest in AMRMClient:
 https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AMRMClient.java
  and
 I can help clarify here if anything's confusing.

 -Sandy


 On Tue, Jul 9, 2013 at 2:54 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi Sandy,

   Yes, I have been using AMRMClient APIs. I am planning to shift to
 whatever way is this white list feature is supported with. But am not sure
 what is meant by submitting ResourceRequests directly to RM. Can you please
 elaborate on this or give me a pointer to some example code on how to do
 it...

Thanks for the reply,

 -Kishore


 On Mon, Jul 8, 2013 at 10:53 PM, Sandy Ryza sandy.r...@cloudera.comwrote:

 Hi Krishna,

 From your previous email, it looks like you are using the AMRMClient
 APIs.  Support for whitelisting is not yet supported through them.  I am
 working on this in YARN-521, which should be included in the next release
 after 2.1.0-beta.  If you are submitting ResourceRequests directly to the
 RM, you can whitelist a node by
 * setting the relaxLocality flag on the node-level ResourceRequest to
 true
 * setting the relaxLocality flag on the corresponding rack-level
 ResourceRequest to false
 * setting the relaxLocality flag on the corresponding any-level
 ResourceRequest to false

 -Sandy


 On Mon, Jul 8, 2013 at 6:48 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi,

   Can someone please point to some example code of how to use the
 whitelist feature of YARN, I have recently got RC1 for hadoop-2.1.0-beta
 and want to use this feature.

   It would be great if you can point me to some description of what
 this white listing feature is, I have gone through some JIRA logs related
 to this but more concrete explanation would be helpful.

 Thanks,
 Kishore








Re: Namenode is failing with exception to join

2013-08-07 Thread Azuryy Yu
Manish,

You stopped HDFS and then started HDFS on the standby NameNode, right?

Please look at https://issues.apache.org/jira/browse/HDFS-5058

There are two solutions:
1) start HDFS on the active NameNode, not on the SBN
2) copy {namenode.name.dir}/* to the SBN

I advise #1.
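
(For option 2, the copy amounts to something like the following; the directory
matches the one in your log, and the standby hostname is a placeholder:)

  # run on the active NN, with both NameNodes stopped
  scp -r /data/1/dfs/nn/current standby-host:/data/1/dfs/nn/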




On Wed, Aug 7, 2013 at 3:00 PM, Manish Bhoge manishbh...@rocketmail.com wrote:

 I have all configuration fine. But whenever i start namenode it fails with
 a below exception. No clue where to fix this?

 2013-08-07 02:56:22,754 FATAL
 org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join

 2013-08-07 02:56:22,751 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
 Number of files = 1
 2013-08-07 02:56:22,751 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
 Number of files under construction = 0
 2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
 Image file of size 115 loaded in 0 seconds.
 2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
 Loaded image for txid 0 from 
 /data/1/dfs/nn/current/fsimage_000
 2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
 Reading 
 org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@5f18223d 
 expecting start txid #1
 2013-08-07 02:56:22,752 INFO 
 org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding 
 stream '/data/1/dfs/nn/current/edits_0515247-0515255' 
 to transaction ID 1
 2013-08-07 02:56:22,753 INFO 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics 
 system...
 2013-08-07 02:56:22,754 INFO 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system 
 stopped.
 2013-08-07 02:56:22,754 INFO 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system 
 shutdown complete.2013-08-07 02:56:22,754 FATAL 
 org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
 java.io.IOException: There appears to be a gap in the edit log.  We expected 
 txid 1, but got txid 515247.




Re: Namenode is failing with exception to join

2013-08-07 Thread Manish Bhoge
I am not using HA here. All I am trying to do is set up a 2-node cluster, but
before that I wanted to make sure that I am setting up everything right and
get HDFS up in pseudo-distributed mode. However, I suspect a mistake
in my /etc/hosts file, as I have renamed the local host to myhost-1.
 

Please suggest.




 From: Azuryy Yu azury...@gmail.com
To: user@hadoop.apache.org; Manish Bhoge manishbh...@rocketmail.com 
Sent: Wednesday, 7 August 2013 1:08 PM
Subject: Re: Namenode is failing with exception to join
 


Manish,
 
you stop HDFS then start HDFS on the standby name node right?  
 
please looked at https://issues.apache.org/jira/browse/HDFS-5058
 
there are two solutions:
1) start HDFS on the active name node, nor SBN
2) copy {namenode.name.dir}/* to the SBN 
 
I advice #1.
 
 



On Wed, Aug 7, 2013 at 3:00 PM, Manish Bhoge manishbh...@rocketmail.com wrote:

I have all configuration fine. But whenever i start namenode it fails with a 
below exception. No clue where to fix this?



2013-08-07 02:56:22,754 FATAL 
org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
2013-08-07 02:56:22,751 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Number of files = 1
2013-08-07 02:56:22,751 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Number of files under construction = 0
2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Image file of size 115 loaded in 0 seconds.
2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Loaded image for txid 0 from /data/1/dfs/nn/current/fsimage_000
2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@5f18223d 
expecting start txid #1
2013-08-07 02:56:22,752 INFO 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding 
stream '/data/1/dfs/nn/current/edits_0515247-0515255' 
to transaction ID 1
2013-08-07 02:56:22,753 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
Stopping NameNode metrics system...
2013-08-07 02:56:22,754 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
NameNode metrics system stopped.
2013-08-07 02:56:22,754 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
NameNode metrics system shutdown complete. 2013-08-07 02:56:22,754 FATAL 
org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join 
java.io.IOException: There appears to be a gap in the edit log.  We expected 
txid 1, but got txid 515247.

Re: Namenode is failing with exception to join

2013-08-07 Thread Jitendra Yadav
Hi,

Did you configure your NameNode to store multiple copies of its metadata?

You can recover your NameNode in that situation:

#hadoop namenode -recover

It will ask you whether you want to continue or not; please follow the
instructions.
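
(For reference, storing multiple copies of the metadata is just a comma-separated
list of directories in hdfs-site.xml -- property name as in Hadoop 2.x, and the
second path below is only an example:)

  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/1/dfs/nn,/data/2/dfs/nn</value>
  </property>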

Thanks
On Wed, Aug 7, 2013 at 1:44 PM, Manish Bhoge manishbh...@rocketmail.com wrote:

  I am not using HA here. All I am trying here is to make a 2 node
 cluster. But before that i wanted to make sure that i am setting up
 everything right and make the HDFS up on Pseudo distributed mode. However,
 I am suspecting a mistake in my /etc/hosts file. As, I have rename the
 local host to myhost-1

 Please suggest.

   --
 *From:* Azuryy Yu azury...@gmail.com
 *To:* user@hadoop.apache.org; Manish Bhoge manishbh...@rocketmail.com
 *Sent:* Wednesday, 7 August 2013 1:08 PM
*Subject:* Re: Namenode is failing with exception to join

  Manish,

 you stop HDFS then start HDFS on the standby name node right?

 please looked at https://issues.apache.org/jira/browse/HDFS-5058

 there are two solutions:
 1) start HDFS on the active name node, nor SBN
 2) copy {namenode.name.dir}/* to the SBN

 I advice #1.




 On Wed, Aug 7, 2013 at 3:00 PM, Manish Bhoge 
 manishbh...@rocketmail.comwrote:

  I have all configuration fine. But whenever i start namenode it fails
 with a below exception. No clue where to fix this?

 2013-08-07 02:56:22,754 FATAL
 org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join

 2013-08-07 02:56:22,751 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
 Number of files = 1
 2013-08-07 02:56:22,751 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
 Number of files under construction = 0
 2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
 Image file of size 115 loaded in 0 seconds.
 2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
 Loaded image for txid 0 from 
 /data/1/dfs/nn/current/fsimage_000
 2013-08-07 02:56:22,752 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
 Reading 
 org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@5f18223d 
 expecting start txid #1
 2013-08-07 02:56:22,752 INFO 
 org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding 
 stream '/data/1/dfs/nn/current/edits_0515247-0515255' 
 to transaction ID 1
 2013-08-07 02:56:22,753 INFO 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics 
 system...
 2013-08-07 02:56:22,754 INFO 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system 
 stopped.
 2013-08-07 02:56:22,754 INFO 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system 
 shutdown complete.2013-08-07 02:56:22,754 FATAL 
 org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
 java.io.IOException: There appears to be a gap in the edit log.  We expected 
 txid 1, but got txid 515247.







Re: whitelist feature of YARN

2013-08-07 Thread Krishna Kishore Bonagiri
Sandy,
  Thanks again. I found RC1 for 2.1.0-beta available at
http://people.apache.org/~acmurthy/hadoop-2.1.0-beta-rc1/
   Would this have the fix for YARN-521? and, can I use that?

-Kishore


On Wed, Aug 7, 2013 at 12:35 PM, Sandy Ryza sandy.r...@cloudera.com wrote:

 Responses inline:


 On Tue, Aug 6, 2013 at 11:55 PM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi Sandy,

   Thanks for the reply and it is good to know YARN-521 is done! Please
 answer my following questions

 1) when is 2.1.0-beta going to be released? is it soon or do you suggest
 me take it from the trunk or is there a recent release candidate available?

 We're very close and my guess would be no later than the end of the
 month (don't hold me to this).


 2) I have recently changed my application to use the new Asynchronous
 interfaces. I am hoping it works with that too, correct me if I am wrong.

 ContainerRequest is shared by the async interfaces as well so it should
 work here.


 3) Change in interface:

 The old interface for ContainerRequest constructor used to be this:

  public ContainerRequest(Resource capability, String[] nodes,
 String[] racks, Priority priority, int containerCount);

 where as now it is changed to

 a) public ContainerRequest(Resource capability, String[] nodes,
 String[] racks, Priority priority)
 

 b) public ContainerRequest(Resource capability, String[] nodes,
 String[] racks, Priority priority, boolean relaxLocality)

 that means the old argument containerCount is gone! How would I be able
 to specify how many containers do I need?

 We now expect that you submit a ContainerRequest for each container you
 want.


 -Kishore




 On Wed, Aug 7, 2013 at 11:37 AM, Sandy Ryza sandy.r...@cloudera.comwrote:

 YARN-521, which brings whitelisting to the AMRMClient APIs, is now
 included in 2.1.0-beta.  Check out the doc for the relaxLocality paramater
 in ContainerRequest in AMRMClient:
 https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AMRMClient.java
  and
 I can help clarify here if anything's confusing.

 -Sandy


 On Tue, Jul 9, 2013 at 2:54 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi Sandy,

   Yes, I have been using AMRMClient APIs. I am planning to shift to
 whatever way is this white list feature is supported with. But am not sure
 what is meant by submitting ResourceRequests directly to RM. Can you please
 elaborate on this or give me a pointer to some example code on how to do
 it...

Thanks for the reply,

 -Kishore


 On Mon, Jul 8, 2013 at 10:53 PM, Sandy Ryza sandy.r...@cloudera.comwrote:

 Hi Krishna,

 From your previous email, it looks like you are using the AMRMClient
 APIs.  Support for whitelisting is not yet supported through them.  I am
 working on this in YARN-521, which should be included in the next release
 after 2.1.0-beta.  If you are submitting ResourceRequests directly to the
 RM, you can whitelist a node by
 * setting the relaxLocality flag on the node-level ResourceRequest to
 true
 * setting the relaxLocality flag on the corresponding rack-level
 ResourceRequest to false
 * setting the relaxLocality flag on the corresponding any-level
 ResourceRequest to false

 -Sandy


 On Mon, Jul 8, 2013 at 6:48 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi,

   Can someone please point to some example code of how to use the
 whitelist feature of YARN, I have recently got RC1 for hadoop-2.1.0-beta
 and want to use this feature.

   It would be great if you can point me to some description of what
 this white listing feature is, I have gone through some JIRA logs related
 to this but more concrete explanation would be helpful.

 Thanks,
 Kishore









MutableCounterLong metrics display in ganglia

2013-08-07 Thread lei liu
I use hadoop-2.0.5 and configured the hadoop-metrics2.properties file with the
content below:
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
*.sink.ganglia.supportsparse=true
namenode.sink.ganglia.servers=10.232.98.74:8649
datanode.sink.ganglia.servers=10.232.98.74:8649

I wrote a program that calls the FSDataOutputStream.hsync() method once per
second.
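
(For reference, that test program is essentially the following sketch -- the
path and loop count are illustrative:)

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class HsyncTest {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      FSDataOutputStream out = fs.create(new Path("/tmp/hsync-test"), true);
      for (int i = 0; i < 600; i++) {
        out.write(("record " + i + "\n").getBytes("UTF-8"));
        out.hsync();              // increments the DataNode's fsyncCount metric
        Thread.sleep(1000L);      // once per second
      }
      out.close();
    }
  }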

There is a @Metric MutableCounterLong fsyncCount metric in
DataNodeMetrics. When the FSDataOutputStream.hsync() method is called, the
value of fsyncCount is increased, and the DataNode sends the value of fsyncCount
to Ganglia every ten seconds, so I think the value of fsyncCount in
Ganglia should be 10, 20, 30, 40 and so on. But Ganglia displays
1, 1, 1, 1, 1, ..., as if
the value of fsyncCount were set to zero every ten seconds and
reported as fsyncCount.value/10.


Is the value of the MutableCounterLong class set to zero every ten
seconds and reported as MutableCounterLong.value/10?

Thanks,

LiuLei


Re: Large-scale collection of logs from multiple Hadoop nodes

2013-08-07 Thread 武泽胜
We have the same scenario as you described. The following is our solution, just 
FYI:

We installed a local scribe agent on every node of our cluster, and we have
several central scribe servers. We extended log4j to support writing logs to
the local scribe agent, the local scribe agents forward the logs to the
central scribe servers, and finally the central scribe servers write these logs to
a dedicated HDFS cluster used for offline processing.

Then we use Hive/Impala to analyse the collected logs.

From: Public Network Services publicnetworkservi...@gmail.com
Reply-To: user@hadoop.apache.org
Date: Tuesday, August 6, 2013 1:58 AM
To: user@hadoop.apache.org
Subject: Large-scale collection of logs from multiple Hadoop nodes

Hi...

I am facing a large-scale usage scenario of log collection from a Hadoop 
cluster and examining ways as to how it should be implemented.

More specifically, imagine a cluster that has hundreds of nodes, each of which
constantly produces syslog events that need to be gathered and analyzed at
another point. The total amount of logs could be tens of gigabytes per day, if
not more, and the reception rate in the order of thousands of events per
second, if not more.

One solution is to send those events over the network (e.g., using Flume)
and collect them in one or more (fewer than 5) nodes in the cluster, or in
another location, where the logs will be processed either by a continuously
running MapReduce job, or by non-Hadoop servers running some log processing application.

Another approach could be to deposit all these events into a queuing system 
like ActiveMQ or RabbitMQ, or whatever.

In all cases, the main objective is to be able to do real-time log analysis.

What would be the best way of implementing the above scenario?

Thanks!

PNS



Re: Large-scale collection of logs from multiple Hadoop nodes

2013-08-07 Thread Alexander Lorenz
Hi,

the approach with Flume is the most reliable workflow for this, since Flume has a
built-in syslog source as well as a load-balancing channel. On top of that you can
define multiple channels for different sources.
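
(A minimal sketch of such an agent, assuming Flume 1.x -- the agent name, port
and HDFS path are illustrative:)

  a1.sources  = r1
  a1.channels = c1
  a1.sinks    = k1

  a1.sources.r1.type     = syslogtcp
  a1.sources.r1.host     = 0.0.0.0
  a1.sources.r1.port     = 5140
  a1.sources.r1.channels = c1

  a1.channels.c1.type     = memory
  a1.channels.c1.capacity = 10000

  a1.sinks.k1.type          = hdfs
  a1.sinks.k1.channel       = c1
  a1.sinks.k1.hdfs.path     = hdfs://namenode:8020/flume/syslog
  a1.sinks.k1.hdfs.fileType = DataStream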

Best,
Alex

sent via my mobile device

mapredit.blogspot.com
@mapredit


 On Aug 7, 2013, at 1:44 PM, 武泽胜 wuzesh...@xiaomi.com wrote:
 
 We have the same scenario as you described. The following is our solution, 
 just FYI:
 
 We installed a local scribe agent on every node of our cluster, and we have 
 several central scribe servers. We extended log4j to support writing logs to 
 the local scribe agent,  and the local scribe agents forward the logs to the 
 central scribe servers, at last the central scribe servers write these logs 
 to a specified hdfs cluster used for offline processing.
 
 Then we use hive/impale to analyse  the collected logs.
 
 From: Public Network Services publicnetworkservi...@gmail.com
 Reply-To: user@hadoop.apache.org user@hadoop.apache.org
 Date: Tuesday, August 6, 2013 1:58 AM
 To: user@hadoop.apache.org user@hadoop.apache.org
 Subject: Large-scale collection of logs from multiple Hadoop nodes
 
 Hi...
 
 I am facing a large-scale usage scenario of log collection from a Hadoop 
 cluster and examining ways as to how it should be implemented.
 
 More specifically, imagine a cluster that has hundreds of nodes, each of 
 which constantly produces Syslog events that need to be gathered an analyzed 
 at another point. The total amount of logs could be tens of gigabytes per 
 day, if not more, and the reception rate in the order of thousands of events 
 per second, if not more.
 
 One solution is to send those events over the network (e.g., using using 
 flume) and collect them in one or more (less than 5) nodes in the cluster, or 
 in another location, whereby the logs will be processed by a either 
 constantly MapReduce job, or by non-Hadoop servers running some log 
 processing application.
 
 Another approach could be to deposit all these events into a queuing system 
 like ActiveMQ or RabbitMQ, or whatever.
 
 In all cases, the main objective is to be able to do real-time log analysis.
 
 What would be the best way of implementing the above scenario?
 
 Thanks!
 
 PNS
 


RE: Compilation problem of Hadoop Projects after Import into Eclipse

2013-08-07 Thread German Florez-Larrahondo
Sathwik 

 

I experienced something similar a few weeks ago.

I reported a JIRA on the documentation of this, please comment there

 

https://issues.apache.org/jira/browse/HADOOP-9771

 

Regards

./g

 

 

From: Sathwik B P [mailto:sath...@apache.org] 
Sent: Tuesday, August 06, 2013 4:46 AM
To: user@hadoop.apache.org
Subject: Compilation problem of Hadoop Projects after Import into Eclipse

 

Hi guys,

I see a couple of problems with the generation of Eclipse artifacts via mvn
eclipse:eclipse.
There are a couple of compilation issues after importing the Hadoop projects
into Eclipse, though I am able to rectify them.

1) hadoop-common: TestAvroSerialization.java doesn't compile as it uses
AvroRecord which exists under target/generated-test-sources/java. 
Solution: Need to include target/generated-test-sources/java as source.

2) hadoop-streaming: linked source folder conf which should point to
hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn
-server-resourcemanager/conf
doesn't point to the path correctly
Solution: Manually add the conf and link it to
hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn
-server-resourcemanager/conf

Can this be fixed?

I have just compiled the hadoop trunk codebase.

regards,
sathwik



Oozie ssh action error

2013-08-07 Thread Kasa V Varun Tej
What's the probable cause of the error when the error log of the ssh action
reads:

Error: permission denied (publickey password)

I already have passphrase-less SSH set up. Can you guys point me towards the
potential reason and solution for the error?

Thanks,
Kasa.


Re: Oozie ssh action error

2013-08-07 Thread Jitendra Yadav
Hi,

I hope the points below might help you.

*Approach 1#*

You need to change the sshd_config file on the remote server (probably
/etc/ssh/sshd_config).

Change the PasswordAuthentication value from

PasswordAuthentication no

to

PasswordAuthentication yes

and then restart the SSHD daemon.

*Approach 2#*

Check the authorized_keys file permissions:

 chmod 600 ~/.ssh/authorized_keys
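
(The same two approaches as commands -- the sshd_config path and service name
can differ per distribution:)

  # Approach 1: allow password authentication (as root on the remote host)
  sed -i 's/^PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
  service sshd restart

  # Approach 2: tighten permissions for the key-based login
  chmod 700 ~/.ssh
  chmod 600 ~/.ssh/authorized_keys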

Thanks.





On Wed, Aug 7, 2013 at 7:16 PM, Kasa V Varun Tej kasava...@gmail.com wrote:

 Whats the probable cause of the error when the error log of the ssh action
 reads:

 Error: permission denied (publickey password)

 I already have a passphrase-less ssh set. can you guys point me towards
 the potential reason and solution to the error.

 Thanks,
 Kasa.







Re: Extra start-up overhead with hadoop-2.1.0-beta

2013-08-07 Thread Krishna Kishore Bonagiri
Hi Omkar,

 Can you please see if you can answer my question with this info or if you
need anything else from me?

 Also, does resource localization improve or impact any performance?

Thanks,
Kishore


On Thu, Aug 1, 2013 at 11:20 PM, Omkar Joshi ojo...@hortonworks.com wrote:

 How are you making these measurements can you elaborate more? Is it on a
 best case basis or on an average or worst case? How many resources are you
 sending it for localization? were the sizes and number of these resources
 consistent across tests? Were these resources public/private/application
 specific? Apart from this is the other load on node manager same? is the
 load on hdfs same? did you see any network bottleneck?

 More information will help a lot.


 Thanks,
 Omkar Joshi
 *Hortonworks Inc.* http://www.hortonworks.com


 On Thu, Aug 1, 2013 at 2:19 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi,
   Please share with me if you anyone has an answer or clues to my
 question regarding the start up performance.

 Also, one more thing I have observed today is the time taken to run a
 command on a container went up by more than a second in this latest version.

 When using 2.0.4-alpha, it used to take 0.3 to 0.5 seconds from the point
 I call startContainer() to the  point the command is started on the
 container.

 where as

 When using 2.1.0-beta, it is taking around 1.5 seconds from the point it
 came to the call back onContainerStarted() to the point the command is seen
 started running on the container.

 Thanks,
 Kishore


 On Thu, Jul 25, 2013 at 8:38 PM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi,

  I have been using the hadoop-2.1.0-beta release candidate and observed
 that it is slower in running my simple application that runs on 2
 containers. I have tried to find out which parts of it is really having
 this extra overhead(compared to hadoop-2.0.4-alpha), and here is what I
 found that.

 1) From the point my Client has submitted the Application Master to RM,
 it is taking 2  seconds extra
 2) From the point my container request are set up by Application Master,
 till the containers are allocated, it is taking 2 seconds extra

 Is this overhead expected with the changes that went into the new
 version? Or is there to improve it by changing something in configurations
 or so?

 Thanks,
 Kishore






Re: Is there any way to use a hdfs file as a Circular buffer?

2013-08-07 Thread Shekhar Sharma
Use a CEP tool like Esper together with Storm and you will be able to achieve
that. I can give you more input if you can provide me more details of what
you are trying to achieve.
Regards,
Som Shekhar Sharma
+91-8197243810


On Wed, Aug 7, 2013 at 9:58 PM, Wukang Lin vboylin1...@gmail.com wrote:

 Hi Niels and Bertrand,
 Thank you for your great advice.
 In our scenario, we need to store a steady stream of binary data into
 circular storage; throughput and concurrency are the most important
 indicators. The first way seems to work, but as HDFS is not friendly to small
 files, this approach may not be smooth enough. HBase is good, but not
 appropriate for us, both for throughput and for storage. MongoDB is quite good
 for web applications, but likewise not suitable for the scenario we face.
 We need a distributed storage system with high throughput, HA, LB and
 security. Maybe it would act much like HBase, managing a lot of small files (HFiles)
 as one large region; we would manage a lot of small files as one large one. Perhaps
 we should develop it ourselves.

 Thank you.
 Lin Wukang


 2013/7/25 Niels Basjes ni...@basjes.nl

 A circular file on hdfs is not possible.

 Some of the ways around this limitation:
 - Create a series of files and delete the oldest file when you have too
 much.
 - Put the data into an hbase table and do something similar.
 - Use completely different technology like mongodb which has built in
 support for a circular buffer (capped collection).

 Niels

 Hi all,
Is there any way to use a hdfs file as a Circular buffer? I mean, if I 
 set a quotas to a directory on hdfs, and writting data to a file in that 
 directory continuously. Once the quotas exceeded, I can redirect the writter 
 and write the data from the beginning of the file automatically .
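
(A rough sketch of the first workaround above -- a rolling series of HDFS files
where the oldest is deleted once a byte budget is exceeded; the directory and
budget are illustrative:)

  import java.util.Arrays;
  import java.util.Comparator;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class RollingHdfsBuffer {
    // delete the oldest files in dir until the total size fits under maxBytes
    static void enforceBudget(FileSystem fs, Path dir, long maxBytes) throws Exception {
      FileStatus[] files = fs.listStatus(dir);
      Arrays.sort(files, new Comparator<FileStatus>() {
        public int compare(FileStatus a, FileStatus b) {
          return Long.compare(a.getModificationTime(), b.getModificationTime());
        }
      });
      long total = 0;
      for (FileStatus f : files) total += f.getLen();
      for (FileStatus f : files) {
        if (total <= maxBytes) break;
        total -= f.getLen();
        fs.delete(f.getPath(), false);   // drop the oldest segment
      }
    }

    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      enforceBudget(fs, new Path("/circular/device-0001"), 1024L * 1024 * 1024);
    }
  }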





Datanode doesn't connect to Namenode

2013-08-07 Thread Felipe Gutierrez
Hi everyone,

On my slave machine (cloud15) the DataNode shows this log. It doesn't connect
to the master (cloud6).

2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: cloud15/192.168.188.15:54310. Already tried 9 time(s); retry
policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1
SECONDS)
2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.RPC: Server at cloud15/
192.168.188.15:54310 not available yet, Z...

But when I type the jps command on the slave machine, the DataNode is running. This is
my core-site.xml file on the slave machine (cloud15):
<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://cloud15:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>

On the master machine I just swap cloud15 for cloud6.
In the /etc/hosts file I have the lines (192.168.188.15  cloud15) and (192.168.188.6
cloud6), and both machines can access each other through SSH without a password.

Am I missing anything?

Thanks in advance!
Felipe


-- 
*--
-- Felipe Oliveira Gutierrez
-- felipe.o.gutier...@gmail.com
-- https://sites.google.com/site/lipe82/Home/diaadia*


Re: Is there any way to use a hdfs file as a Circular buffer?

2013-08-07 Thread Wukang Lin
Hi Shekhar,
Thank you for your replies. As far as I know, Storm is a distributed
computing framework, but what we need is a storage system where high throughput
and concurrency are what matter. We have thousands of devices, and each device will
produce a steady stream of binary data. The space for every device is
fixed, so it should reuse the space on the disk. So, how can Storm or
Esper achieve that?

Many Thanks
Lin Wukang


2013/8/8 Shekhar Sharma shekhar2...@gmail.com

 Use CEP tool like Esper and Storm, you will be able to achieve that
 ...I can give you more inputs if you can provide me more details of what
 you are trying to achieve
 Regards,
 Som Shekhar Sharma
 +91-8197243810


 On Wed, Aug 7, 2013 at 9:58 PM, Wukang Lin vboylin1...@gmail.com wrote:

 Hi Niels and Bertrand,
 Thank you for you great advices.
 In our scenario, we need to store a steady stream of binary data into
 a circular storage,throughput and concurrency are the most important
 indicators.The first way seems work, but as  hdfs is not friendly for small
 files, this approche may be not smooth enough.HBase is good, but  not
 appropriate for us, both for throughput and storage.mongodb is quite
 good for web applications, but not suitable the scenario we meet all the
 same.
 we need a distributed storage system,with Highe throughput, HA,LB
 and secure. Maybe It act much like hbase, manager a lot of small
 file(hfile) as a large region. we manager a lot of small file as a large
 one. Perhaps we should develop it by ourselives.

 Thank you.
 Lin Wukang


 2013/7/25 Niels Basjes ni...@basjes.nl

 A circular file on hdfs is not possible.

 Some of the ways around this limitation:
 - Create a series of files and delete the oldest file when you have too
 much.
 - Put the data into an hbase table and do something similar.
 - Use completely different technology like mongodb which has built in
 support for a circular buffer (capped collection).

 Niels

 Hi all,
Is there any way to use a hdfs file as a Circular buffer? I mean, if I 
 set a quotas to a directory on hdfs, and writting data to a file in that 
 directory continuously. Once the quotas exceeded, I can redirect the 
 writter and write the data from the beginning of the file automatically .






Re: Datanode doesn't connect to Namenode

2013-08-07 Thread Jitendra Yadav
Hi,

Your logs show that the process is making the IPC call not to the NameNode;
it is hitting the DataNode itself.

Can you please check your DataNode process status?

Regards
Jitendra
On Wed, Aug 7, 2013 at 10:29 PM, Felipe Gutierrez 
felipe.o.gutier...@gmail.com wrote:

 Hi everyone,

 My slave machine (cloud15) the datanode shows this log. It doesn't connect
 to the master (cloud6).

  2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.Client: Retrying
 connect to server: cloud15/192.168.188.15:54310. Already tried 9 time(s);
 retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
 sleepTime=1 SECONDS)
 2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.RPC: Server at cloud15/
 192.168.188.15:54310 not available yet, Z...

 But when I type jps command on slave machine DataNode is running. This is
 my file core-site.xml in slave machine (cloud15):
 <configuration>
 <property>
   <name>hadoop.tmp.dir</name>
   <value>/app/hadoop/tmp</value>
   <description>A base for other temporary directories.</description>
 </property>

 <property>
   <name>fs.default.name</name>
   <value>hdfs://cloud15:54310</value>
   <description>The name of the default file system.  A URI whose
   scheme and authority determine the FileSystem implementation.  The
   uri's scheme determines the config property (fs.SCHEME.impl) naming
   the FileSystem implementation class.  The uri's authority is used to
   determine the host, port, etc. for a filesystem.</description>
 </property>
 </configuration>

 In the master machine I just swap cloud15 to cloud6.
 In the file /etc/host I have (192.168.188.15  cloud15) and (192.168.188.6
   cloud6) lines, and both machines access through ssh with out password.

 Am I missing anything?

 Thanks in advance!
 Felipe


 --
 *--
 -- Felipe Oliveira Gutierrez
 -- felipe.o.gutier...@gmail.com
 -- https://sites.google.com/site/lipe82/Home/diaadia*



Re: Datanode doesn't connect to Namenode

2013-08-07 Thread Sivaram RL
Hi ,

your configuration of the DataNode shows

 <name>fs.default.name</name>
 <value>hdfs://cloud15:54310</value>

But you have said the NameNode is configured on the master (cloud6). Can you check
the configuration again?


Regards,
Sivaram R L


On Wed, Aug 7, 2013 at 10:29 PM, Felipe Gutierrez 
felipe.o.gutier...@gmail.com wrote:

 Hi everyone,

 My slave machine (cloud15) the datanode shows this log. It doesn't connect
 to the master (cloud6).

 2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.Client: Retrying
 connect to server: cloud15/192.168.188.15:54310. Already tried 9 time(s);
 retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
 sleepTime=1 SECONDS)
 2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.RPC: Server at cloud15/
 192.168.188.15:54310 not available yet, Z...

 But when I type jps command on slave machine DataNode is running. This is
 my file core-site.xml in slave machine (cloud15):
 <configuration>
 <property>
   <name>hadoop.tmp.dir</name>
   <value>/app/hadoop/tmp</value>
   <description>A base for other temporary directories.</description>
 </property>

 <property>
   <name>fs.default.name</name>
   <value>hdfs://cloud15:54310</value>
   <description>The name of the default file system.  A URI whose
   scheme and authority determine the FileSystem implementation.  The
   uri's scheme determines the config property (fs.SCHEME.impl) naming
   the FileSystem implementation class.  The uri's authority is used to
   determine the host, port, etc. for a filesystem.</description>
 </property>
 </configuration>

 In the master machine I just swap cloud15 to cloud6.
 In the file /etc/host I have (192.168.188.15  cloud15) and (192.168.188.6
   cloud6) lines, and both machines access through ssh with out password.

 Am I missing anything?

 Thanks in advance!
 Felipe


 --
 *--
 -- Felipe Oliveira Gutierrez
 -- felipe.o.gutier...@gmail.com
 -- https://sites.google.com/site/lipe82/Home/diaadia*



Re: Extra start-up overhead with hadoop-2.1.0-beta

2013-08-07 Thread Ravi Prakash
I believe https://issues.apache.org/jira/browse/MAPREDUCE-5399 causes 
performance degradation in cases where there are a lot of reducers. I can 
imagine it causing degradation if the configuration files are super big / some 
other weird cases.





 From: Krishna Kishore Bonagiri write2kish...@gmail.com
To: user@hadoop.apache.org 
Sent: Wednesday, August 7, 2013 10:03 AM
Subject: Re: Extra start-up overhead with hadoop-2.1.0-beta
 


Hi Omkar,


 Can you please see if you can answer my question with this info or if you need 
anything else from me?

 Also, does resource localization improve or impact any performance?

Thanks,
Kishore



On Thu, Aug 1, 2013 at 11:20 PM, Omkar Joshi ojo...@hortonworks.com wrote:

How are you making these measurements can you elaborate more? Is it on a best 
case basis or on an average or worst case? How many resources are you sending 
it for localization? were the sizes and number of these resources consistent 
across tests? Were these resources public/private/application specific? Apart 
from this is the other load on node manager same? is the load on hdfs same? did 
you see any network bottleneck? 


More information will help a lot.





Thanks,
Omkar Joshi
Hortonworks Inc.



On Thu, Aug 1, 2013 at 2:19 AM, Krishna Kishore Bonagiri 
write2kish...@gmail.com wrote:

Hi,
  Please share with me if you anyone has an answer or clues to my question 
regarding the start up performance. 


Also, one more thing I have observed today is the time taken to run a command 
on a container went up by more than a second in this latest version.


When using 2.0.4-alpha, it used to take 0.3 to 0.5 seconds from the point I 
call startContainer() to the  point the command is started on the container.


where as


When using 2.1.0-beta, it is taking around 1.5 seconds from the point it came 
to the call back onContainerStarted() to the point the command is seen 
started running on the container.


Thanks,

Kishore



On Thu, Jul 25, 2013 at 8:38 PM, Krishna Kishore Bonagiri 
write2kish...@gmail.com wrote:

Hi,


  I have been using the hadoop-2.0.1-beta release candidate and observed 
that it is slower in running my simple application that runs on 2 
containers. I have tried to find out which parts of it is really having this 
extra overhead(compared to hadoop-2.0.4-alpha), and here is what I found 
that.


1) From the point my Client has submitted the Application Master to RM, it 
is taking 2  seconds extra
2) From the point my container request are set up by Application Master, 
till the containers are allocated, it is taking 2 seconds extra


Is this overhead expected with the changes that went into the new version? 
Or is there to improve it by changing something in configurations or so?


Thanks,
Kishore



Re: Datanode doesn't connect to Namenode

2013-08-07 Thread Felipe Gutierrez
yes, in slave I type:
<name>fs.default.name</name>
<value>hdfs://cloud15:54310</value>

in master I type:
<name>fs.default.name</name>
<value>hdfs://cloud6:54310</value>

If I type cloud6 on both configurations, the slave doesn't start.




On Wed, Aug 7, 2013 at 2:40 PM, Sivaram RL sivaram...@gmail.com wrote:

 Hi ,

 your configuration  of Datanode shows

  namefs.default.name/name
   valuehdfs://cloud15:54310/value

 But you have said Namenode is configured on master (cloud6). Can you check
 the configuration again ?


 Regards,
 Sivaram R L


 On Wed, Aug 7, 2013 at 10:29 PM, Felipe Gutierrez 
 felipe.o.gutier...@gmail.com wrote:

 Hi everyone,

 My slave machine (cloud15) the datanode shows this log. It doesn't
 connect to the master (cloud6).

 2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.Client: Retrying
 connect to server: cloud15/192.168.188.15:54310. Already tried 9
 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
 sleepTime=1 SECONDS)
 2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.RPC: Server at cloud15/
 192.168.188.15:54310 not available yet, Z...

 But when I type jps command on slave machine DataNode is running. This is
 my file core-site.xml in slave machine (cloud15):
 configuration
 property
   namehadoop.tmp.dir/name
   value/app/hadoop/tmp/value
   descriptionA base for other temporary directories./description
 /property

 property
   namefs.default.name/name
   valuehdfs://cloud15:54310/value
descriptionThe name of the default file system.  A URI whose
   scheme and authority determine the FileSystem implementation.  The
   uri's scheme determines the config property (fs.SCHEME.impl) naming
   the FileSystem implementation class.  The uri's authority is used to
   determine the host, port, etc. for a filesystem./description
 /property
 /configuration

 In the master machine I just swap cloud15 to cloud6.
 In the file /etc/host I have (192.168.188.15  cloud15) and (192.168.188.6
   cloud6) lines, and both machines access through ssh with out password.

 Am I missing anything?

 Thanks in advance!
 Felipe


 --
 *--
 -- Felipe Oliveira Gutierrez
 -- felipe.o.gutier...@gmail.com
 -- https://sites.google.com/site/lipe82/Home/diaadia*





-- 
*--
-- Felipe Oliveira Gutierrez
-- felipe.o.gutier...@gmail.com
-- https://sites.google.com/site/lipe82/Home/diaadia*


Re: Extra start-up overhead with hadoop-2.1.0-beta

2013-08-07 Thread Krishna Kishore Bonagiri
No Ravi, I am not running any MR job. Also, my configuration files are not
big.


On Wed, Aug 7, 2013 at 11:12 PM, Ravi Prakash ravi...@ymail.com wrote:

 I believe https://issues.apache.org/jira/browse/MAPREDUCE-5399 causes
 performance degradation in cases where there are a lot of reducers. I can
 imagine it causing degradation if the configuration files are super big /
 some other weird cases.


   --
  *From:* Krishna Kishore Bonagiri write2kish...@gmail.com
 *To:* user@hadoop.apache.org
 *Sent:* Wednesday, August 7, 2013 10:03 AM
 *Subject:* Re: Extra start-up overhead with hadoop-2.1.0-beta

 Hi Omkar,

  Can you please see if you can answer my question with this info or if you
 need anything else from me?

  Also, does resource localization improve or impact any performance?

 Thanks,
 Kishore


 On Thu, Aug 1, 2013 at 11:20 PM, Omkar Joshi ojo...@hortonworks.comwrote:

 How are you making these measurements can you elaborate more? Is it on a
 best case basis or on an average or worst case? How many resources are you
 sending it for localization? were the sizes and number of these resources
 consistent across tests? Were these resources public/private/application
 specific? Apart from this is the other load on node manager same? is the
 load on hdfs same? did you see any network bottleneck?

 More information will help a lot.


 Thanks,
 Omkar Joshi
 *Hortonworks Inc.* http://www.hortonworks.com/


 On Thu, Aug 1, 2013 at 2:19 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi,
   Please share with me if you anyone has an answer or clues to my question
 regarding the start up performance.

 Also, one more thing I have observed today is the time taken to run a
 command on a container went up by more than a second in this latest version.

 When using 2.0.4-alpha, it used to take 0.3 to 0.5 seconds from the point
 I call startContainer() to the  point the command is started on the
 container.

 where as

 When using 2.1.0-beta, it is taking around 1.5 seconds from the point it
 came to the call back onContainerStarted() to the point the command is seen
 started running on the container.

 Thanks,
 Kishore


 On Thu, Jul 25, 2013 at 8:38 PM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi,

   I have been using the hadoop-2.0.1-beta release candidate and observed
 that it is slower in running my simple application that runs on 2
 containers. I have tried to find out which parts of it is really having
 this extra overhead(compared to hadoop-2.0.4-alpha), and here is what I
 found that.

 1) From the point my Client has submitted the Application Master to RM, it
 is taking 2  seconds extra
 2) From the point my container request are set up by Application Master,
 till the containers are allocated, it is taking 2 seconds extra

 Is this overhead expected with the changes that went into the new version?
 Or is there to improve it by changing something in configurations or so?

 Thanks,
 Kishore









Re: Datanode doesn't connect to Namenode

2013-08-07 Thread Jitendra Yadav
I'm not able to see a TaskTracker process on your datanode.

On Wed, Aug 7, 2013 at 11:14 PM, Felipe Gutierrez 
felipe.o.gutier...@gmail.com wrote:

 yes, in slave I type:
 <name>fs.default.name</name>
 <value>hdfs://cloud15:54310</value>

 in master I type:
 <name>fs.default.name</name>
 <value>hdfs://cloud6:54310</value>

 If I type cloud6 on both configurations, the slave doesn't start.




 On Wed, Aug 7, 2013 at 2:40 PM, Sivaram RL sivaram...@gmail.com wrote:

 Hi ,

 your configuration  of Datanode shows

  namefs.default.name/name
   valuehdfs://cloud15:54310/value

 But you have said Namenode is configured on master (cloud6). Can you
 check the configuration again ?


 Regards,
 Sivaram R L


  On Wed, Aug 7, 2013 at 10:29 PM, Felipe Gutierrez 
 felipe.o.gutier...@gmail.com wrote:

 Hi everyone,

 My slave machine (cloud15) the datanode shows this log. It doesn't
 connect to the master (cloud6).

  2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.Client: Retrying
 connect to server: cloud15/192.168.188.15:54310. Already tried 9
 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
 sleepTime=1 SECONDS)
 2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.RPC: Server at
 cloud15/192.168.188.15:54310 not available yet, Z...

 But when I type jps command on slave machine DataNode is running. This
 is my file core-site.xml in slave machine (cloud15):
  configuration
 property
   namehadoop.tmp.dir/name
   value/app/hadoop/tmp/value
   descriptionA base for other temporary directories./description
 /property

 property
   namefs.default.name/name
   valuehdfs://cloud15:54310/value
   descriptionThe name of the default file system.  A URI whose
   scheme and authority determine the FileSystem implementation.  The
   uri's scheme determines the config property (fs.SCHEME.impl) naming
   the FileSystem implementation class.  The uri's authority is used to
   determine the host, port, etc. for a filesystem./description
 /property
 /configuration

 In the master machine I just swap cloud15 to cloud6.
 In the file /etc/host I have (192.168.188.15  cloud15) and
 (192.168.188.6   cloud6) lines, and both machines access through ssh with
 out password.

 Am I missing anything?

 Thanks in advance!
 Felipe


 --
 *--
 -- Felipe Oliveira Gutierrez
 -- felipe.o.gutier...@gmail.com
 -- https://sites.google.com/site/lipe82/Home/diaadia*





 --
 *--
 -- Felipe Oliveira Gutierrez
 -- felipe.o.gutier...@gmail.com
 -- https://sites.google.com/site/lipe82/Home/diaadia*



Re: Datanode doesn't connect to Namenode

2013-08-07 Thread Jitendra Yadav
Your HDFS name entry should be the same on the master and the datanodes:

 <name>fs.default.name</name>
 <value>hdfs://cloud6:54310</value>

Thanks
On Wed, Aug 7, 2013 at 11:05 PM, Felipe Gutierrez 
felipe.o.gutier...@gmail.com wrote:

 on my slave the process is running:
 hduser@cloud15:/usr/local/hadoop$ jps
 19025 DataNode
 19092 Jps


 On Wed, Aug 7, 2013 at 2:26 PM, Jitendra Yadav jeetuyadav200...@gmail.com
  wrote:

 Hi,

 Your logs showing that the process is creating IPC call not for namenode,
 it is hitting datanode itself.

 Check you please check you datanode processes status?.

 Regards
 Jitendra

 On Wed, Aug 7, 2013 at 10:29 PM, Felipe Gutierrez 
 felipe.o.gutier...@gmail.com wrote:

 Hi everyone,

 My slave machine (cloud15) the datanode shows this log. It doesn't
 connect to the master (cloud6).

  2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.Client: Retrying
 connect to server: cloud15/192.168.188.15:54310. Already tried 9
 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
 sleepTime=1 SECONDS)
 2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.RPC: Server at
 cloud15/192.168.188.15:54310 not available yet, Z...

 But when I type jps command on slave machine DataNode is running. This
 is my file core-site.xml in slave machine (cloud15):
  configuration
 property
   namehadoop.tmp.dir/name
   value/app/hadoop/tmp/value
   descriptionA base for other temporary directories./description
 /property

 property
   namefs.default.name/name
   valuehdfs://cloud15:54310/value
   descriptionThe name of the default file system.  A URI whose
   scheme and authority determine the FileSystem implementation.  The
   uri's scheme determines the config property (fs.SCHEME.impl) naming
   the FileSystem implementation class.  The uri's authority is used to
   determine the host, port, etc. for a filesystem./description
 /property
 /configuration

 In the master machine I just swap cloud15 to cloud6.
 In the file /etc/host I have (192.168.188.15  cloud15) and
 (192.168.188.6   cloud6) lines, and both machines access through ssh with
 out password.

 Am I missing anything?

 Thanks in advance!
 Felipe


 --
 *--
 -- Felipe Oliveira Gutierrez
 -- felipe.o.gutier...@gmail.com
 -- https://sites.google.com/site/lipe82/Home/diaadia*





 --
 *--
 -- Felipe Oliveira Gutierrez
 -- felipe.o.gutier...@gmail.com
 -- https://sites.google.com/site/lipe82/Home/diaadia*



Re: Datanode doesn't connect to Namenode

2013-08-07 Thread Shekhar Sharma
Disable the firewall on the datanode and namenode machines.
Regards,
Som Shekhar Sharma
+91-8197243810


On Wed, Aug 7, 2013 at 11:33 PM, Jitendra Yadav
jeetuyadav200...@gmail.comwrote:

 Your HDFS fs.default.name entry should be the same on the master and the datanodes:

  <name>fs.default.name</name>
  <value>hdfs://cloud6:54310</value>

 Thanks
 On Wed, Aug 7, 2013 at 11:05 PM, Felipe Gutierrez 
 felipe.o.gutier...@gmail.com wrote:

 on my slave the process is running:
 hduser@cloud15:/usr/local/hadoop$ jps
 19025 DataNode
 19092 Jps


 On Wed, Aug 7, 2013 at 2:26 PM, Jitendra Yadav 
 jeetuyadav200...@gmail.com wrote:

 Hi,

 Your logs show that the process is making its IPC calls not to the
 namenode; it is hitting the datanode itself.

 Could you please check your datanode process status?

 Regards
 Jitendra

 On Wed, Aug 7, 2013 at 10:29 PM, Felipe Gutierrez 
 felipe.o.gutier...@gmail.com wrote:

 Hi everyone,

 On my slave machine (cloud15) the datanode shows this log. It doesn't
 connect to the master (cloud6).

  2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.Client: Retrying
 connect to server: cloud15/192.168.188.15:54310. Already tried 9
 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
 sleepTime=1 SECONDS)
 2013-08-07 13:44:03,110 INFO org.apache.hadoop.ipc.RPC: Server at
 cloud15/192.168.188.15:54310 not available yet, Z...

 But when I type the jps command on the slave machine, DataNode is running.
 This is my core-site.xml file on the slave machine (cloud15):
  <configuration>
 <property>
   <name>hadoop.tmp.dir</name>
   <value>/app/hadoop/tmp</value>
   <description>A base for other temporary directories.</description>
 </property>

 <property>
   <name>fs.default.name</name>
   <value>hdfs://cloud15:54310</value>
   <description>The name of the default file system.  A URI whose
   scheme and authority determine the FileSystem implementation.  The
   uri's scheme determines the config property (fs.SCHEME.impl) naming
   the FileSystem implementation class.  The uri's authority is used to
   determine the host, port, etc. for a filesystem.</description>
 </property>
 </configuration>

 On the master machine I just swap cloud15 for cloud6.
 In the file /etc/hosts I have the lines (192.168.188.15  cloud15) and
 (192.168.188.6   cloud6), and both machines can reach each other through ssh
 without a password.

 Am I missing anything?

 Thanks in advance!
 Felipe


 --
 *--
 -- Felipe Oliveira Gutierrez
 -- felipe.o.gutier...@gmail.com
 -- https://sites.google.com/site/lipe82/Home/diaadia*





 --
 *--
 -- Felipe Oliveira Gutierrez
 -- felipe.o.gutier...@gmail.com
 -- https://sites.google.com/site/lipe82/Home/diaadia*





Re: setLocalResources() on ContainerLaunchContext

2013-08-07 Thread Omkar Joshi
Good that your timestamp worked... Now for hdfs try this:
hdfs://<hdfs-host-name>:<hdfs-host-port><absolute-path>
Now verify that your absolute path is correct. I hope it will work:
bin/hadoop fs -ls <absolute-path>


hdfs://isredeng:8020//kishore/kk.ksh... why the double slash? Do you have the
hdfs file at absolute location /kishore/kk.ksh? Are /kishore and /kishore/kk.ksh
accessible to the user who is making the startContainer call, or to the one
running the AM container?

Thanks,
Omkar Joshi
*Hortonworks Inc.* http://www.hortonworks.com


On Tue, Aug 6, 2013 at 10:43 PM, Krishna Kishore Bonagiri 
write2kish...@gmail.com wrote:

 Hi Harsh, Hitesh  Omkar,

   Thanks for the replies.

 I tried getting the last modified timestamp like this and it works. Is
 this a right thing to do?

   File file = new File("/home_/dsadm/kishore/kk.ksh");
   shellRsrc.setTimestamp(file.lastModified());


 And when I tried using an hdfs file, qualifying it with both node name and
 port, it didn't work; I get a similar error as earlier.

   String shellScriptPath = "hdfs://isredeng:8020//kishore/kk.ksh";


 13/08/07 01:36:28 INFO ApplicationMaster: Got container status for
 containerID= container_1375853431091_0005_01_02, state=COMPLETE,
 exitStatus=-1000, diagnostics=File does not exist:
 hdfs://isredeng:8020/kishore/kk.ksh

 13/08/07 01:36:28 INFO ApplicationMaster: Got failure status for a
 container : -1000



 On Wed, Aug 7, 2013 at 7:45 AM, Harsh J ha...@cloudera.com wrote:

 Thanks Hitesh!

 P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
 port), but isredeng has to be the authority component.

 On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah hit...@apache.org wrote:
  @Krishna, your logs showed the file error for
 hdfs://isredeng/kishore/kk.ksh
 
  I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed that
 the file exists? Also the qualified path seems to be missing the namenode
 port. I need to go back and check if a path without the port works by
 assuming the default namenode port.
 
  @Harsh, adding a helper function seems like a good idea. Let me file a
 jira to have the above added to one of the helper/client libraries.
 
  thanks
  -- Hitesh
 
  On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
 
  It is kinda unnecessary to be asking developers to load in timestamps
 and
  length themselves. Why not provide a java.io.File, or perhaps a Path
  accepting API, that gets it automatically on their behalf using the
  FileSystem API internally?
 
  P.s. A HDFS file gave him a FNF, while a Local file gave him a proper
  TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
  paths.
 
  On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah hit...@apache.org
 wrote:
  Hi Krishna,
 
  YARN downloads a specified local resource on the container's node
 from the url specified. In all situations, the remote url needs to be a
 fully qualified path. To verify that the file at the remote url is still
 valid, YARN expects you to provide the length and last modified timestamp
 of that file.
 
  If you use an hdfs path such as hdfs://<namenode>:<port>/<absolute path
 to file>, you will need to get the length and timestamp from HDFS.
  If you use file:///, the file should exist on all nodes, and all nodes
 should have the file with the same length and timestamp for localization to
 work. (For a single-node setup this works, but it is tougher to get right on a
 multi-node setup - deploying the file via an rpm should likely work.)
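 
  A minimal sketch of the HDFS case, assuming the 2.1.0-era YARN records API
 discussed in this thread; the class name and the path shown are placeholders,
 not code from any Hadoop example:
 
   import java.io.IOException;
 
   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.fs.FileStatus;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   import org.apache.hadoop.yarn.api.records.LocalResource;
   import org.apache.hadoop.yarn.api.records.LocalResourceType;
   import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
   import org.apache.hadoop.yarn.util.ConverterUtils;
   import org.apache.hadoop.yarn.util.Records;
 
   public class LocalResourceSketch {
     // Builds a LocalResource whose size/timestamp come from the remote
     // filesystem, not from java.io.File on the client machine.
     public static LocalResource fromRemoteFile(Configuration conf, String pathStr)
         throws IOException {
       Path path = new Path(pathStr);              // e.g. "hdfs://<namenode>:<port>/kishore/kk.ksh" (placeholder)
       FileSystem fs = path.getFileSystem(conf);   // resolves hdfs://, file://, ...
       FileStatus status = fs.getFileStatus(path); // fails fast if the path does not exist
 
       LocalResource rsrc = Records.newRecord(LocalResource.class);
       rsrc.setResource(ConverterUtils.getYarnUrlFromPath(status.getPath()));
       rsrc.setSize(status.getLen());                   // length as seen by the remote filesystem
       rsrc.setTimestamp(status.getModificationTime()); // mtime as seen by the remote filesystem
       rsrc.setType(LocalResourceType.FILE);
       rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
       return rsrc;
     }
   }
 
  The returned record would then go into the map passed to
 ContainerLaunchContext.setLocalResources() before starting the container.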
 
  -- Hitesh
 
  On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
 
  Hi,
 
  You need to match the timestamp. Probably get the timestamp locally
 before adding it. This is done explicitly to ensure that the file is not
 updated after the user makes the call, to avoid possible errors.
 
 
  Thanks,
  Omkar Joshi
  Hortonworks Inc.
 
 
  On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:
  I tried the following and it works!
  String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
 
  But now getting a timestamp error like below, when I passed 0 to
 setTimestamp()
 
  13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
 containerID= container_1375784329048_0017_01_02, state=COMPLETE,
 exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
 changed on src filesystem (expected 0, was 136758058
 
 
 
 
  On Tue, Aug 6, 2013 at 5:24 PM, Harsh J ha...@cloudera.com wrote:
  Can you try passing a fully qualified local path? That is, including
 the file:/ scheme
 
  On Aug 6, 2013 4:05 PM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:
  Hi Harsh,
The setResource() call on LocalResource() is expecting an argument
 of type org.apache.hadoop.yarn.api.records.URL which is converted from a
 string in the form of URI. This happens in the following call of
 Distributed Shell example,
 
  shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
 shellScriptPath)));
 
  So, if I give a local file I get a parsing error like below, which
 is 

Re: setLocalResources() on ContainerLaunchContext

2013-08-07 Thread Krishna Kishore Bonagiri
Hi Omkar,

  I will try that. I might have got the two '/' characters wrong while trying
it in different ways to make it work. The file /kishore/kk.ksh is accessible to
the same user that is running the AM container.

  And my other question is: what exactly are the benefits of using this
resource localization? Can you please explain briefly or point me to some
online documentation about it?

Thanks,
Kishore


On Wed, Aug 7, 2013 at 11:49 PM, Omkar Joshi ojo...@hortonworks.com wrote:

 Good that your timestamp worked... Now for hdfs try this:
 hdfs://<hdfs-host-name>:<hdfs-host-port><absolute-path>
 Now verify that your absolute path is correct. I hope it will work:
 bin/hadoop fs -ls <absolute-path>


 hdfs://isredeng:8020//kishore/kk.ksh... why the double slash? Do you have the
 hdfs file at absolute location /kishore/kk.ksh? Are /kishore and /kishore/kk.ksh
 accessible to the user who is making the startContainer call, or to the one
 running the AM container?

 Thanks,
 Omkar Joshi
 *Hortonworks Inc.* http://www.hortonworks.com


 On Tue, Aug 6, 2013 at 10:43 PM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:

 Hi Harsh, Hitesh  Omkar,

   Thanks for the replies.

 I tried getting the last modified timestamp like this and it works. Is
 this a right thing to do?

    File file = new File("/home_/dsadm/kishore/kk.ksh");
   shellRsrc.setTimestamp(file.lastModified());


 And when I tried using an hdfs file, qualifying it with both node name and
 port, it didn't work; I get a similar error as earlier.

   String shellScriptPath = "hdfs://isredeng:8020//kishore/kk.ksh";


 13/08/07 01:36:28 INFO ApplicationMaster: Got container status for
 containerID= container_1375853431091_0005_01_02, state=COMPLETE,
 exitStatus=-1000, diagnostics=File does not exist:
 hdfs://isredeng:8020/kishore/kk.ksh

 13/08/07 01:36:28 INFO ApplicationMaster: Got failure status for a
 container : -1000



 On Wed, Aug 7, 2013 at 7:45 AM, Harsh J ha...@cloudera.com wrote:

 Thanks Hitesh!

 P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
 port), but isredeng has to be the authority component.

 On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah hit...@apache.org wrote:
  @Krishna, your logs showed the file error for
 hdfs://isredeng/kishore/kk.ksh
 
  I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed
 that the file exists? Also the qualified path seems to be missing the
 namenode port. I need to go back and check if a path without the port works
 by assuming the default namenode port.
 
  @Harsh, adding a helper function seems like a good idea. Let me file a
 jira to have the above added to one of the helper/client libraries.
 
  thanks
  -- Hitesh
 
  On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
 
  It is kinda unnecessary to be asking developers to load in timestamps
 and
  length themselves. Why not provide a java.io.File, or perhaps a Path
  accepting API, that gets it automatically on their behalf using the
  FileSystem API internally?
 
  P.s. A HDFS file gave him a FNF, while a Local file gave him a proper
  TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
  paths.
 
  On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah hit...@apache.org
 wrote:
  Hi Krishna,
 
  YARN downloads a specified local resource on the container's node
 from the url specified. In all situations, the remote url needs to be a
 fully qualified path. To verify that the file at the remote url is still
 valid, YARN expects you to provide the length and last modified timestamp
 of that file.
 
  If you use an hdfs path such as hdfs://<namenode>:<port>/<absolute path
 to file>, you will need to get the length and timestamp from HDFS.
  If you use file:///, the file should exist on all nodes, and all
 nodes should have the file with the same length and timestamp for
 localization to work. (For a single-node setup this works, but it is tougher
 to get right on a multi-node setup - deploying the file via an rpm should
 likely work.)
 
  -- Hitesh
 
  On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
 
  Hi,
 
  You need to match the timestamp. Probably get the timestamp locally
 before adding it. This is done explicitly to ensure that the file is not
 updated after the user makes the call, to avoid possible errors.
 
 
  Thanks,
  Omkar Joshi
  Hortonworks Inc.
 
 
  On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri 
 write2kish...@gmail.com wrote:
  I tried the following and it works!
   String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
 
  But now getting a timestamp error like below, when I passed 0 to
 setTimestamp()
 
  13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
 containerID= container_1375784329048_0017_01_02, state=COMPLETE,
 exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
 changed on src filesystem (expected 0, was 136758058
 
 
 
 
  On Tue, Aug 6, 2013 at 5:24 PM, Harsh J ha...@cloudera.com wrote:
  Can you try passing a fully qualified local path? That is,
 including the 

compatible hadoop version for hbase-0.94.10

2013-08-07 Thread oc tsdb
Hi,

I need to create an OpenTSDB cluster, which needs HBase and Hadoop.
I picked the latest HBase supported by OpenTSDB, which is hbase-0.94.10.

Can anybody please suggest the latest version of Hadoop I can use with
hbase-0.94.10?

Thanks in advance.

Regards,
VSR.


Re: compatible hadoop version for hbase-0.94.10

2013-08-07 Thread Ted Yu
If you look at pom.xml for 0.94, you should see hadoop-1.1 and hadoop-1.2
profiles.

Those hadoop releases (1.1.2 and 1.2.0, respectively) should work.

On Wed, Aug 7, 2013 at 12:13 PM, oc tsdb oc.t...@gmail.com wrote:

 Hi,

 I need to create an OpenTSDB cluster, which needs HBase and Hadoop.
 I picked the latest HBase supported by OpenTSDB, which is hbase-0.94.10.

 Can anybody please suggest the latest version of Hadoop I can use with
 hbase-0.94.10?

 Thanks in advance.

 Regards,
 VSR.



Re: compatible hadoop version for hbase-0.94.10

2013-08-07 Thread oc tsdb
Thanks Ted.

Regards,
OC.


On Wed, Aug 7, 2013 at 12:22 PM, Ted Yu yuzhih...@gmail.com wrote:

 If you look at pom.xml for 0.94, you should see hadoop-1.1 and hadoop-1.2
 profiles.

 Those hadoop releases (1.1.2 and 1.2.0, respectively) should work.


 On Wed, Aug 7, 2013 at 12:13 PM, oc tsdb oc.t...@gmail.com wrote:

 Hi,

 I need to create an OpenTSDB cluster, which needs HBase and Hadoop.
 I picked the latest HBase supported by OpenTSDB, which is hbase-0.94.10.

 Can anybody please suggest the latest version of Hadoop I can use with
 hbase-0.94.10?

 Thanks in advance.

 Regards,
 VSR.





specify Mapred tasks and slots

2013-08-07 Thread Azuryy Yu
Hi Dears,

Can I specify how many slots to use for reduce?

I know we can specify reduce tasks, but does one task occupy one slot?

Is it possible that one task occupies more than one slot in Hadoop-1.1.2?

Thanks.


Re: specify Mapred tasks and slots

2013-08-07 Thread Shekhar Sharma
use mapred.tasktracker.reduce.tasks.maximum in mapred-site.xml

The default value is 2, which means that this tasktracker will not run more
than 2 reduce tasks at any given point in time.
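
A minimal sketch of the distinction, assuming the Hadoop 1.x mapreduce API
(the class name and job name are placeholders): the number of reduce tasks is
a per-job setting, while the number of reduce slots is a cluster-side,
per-tasktracker setting.

  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.mapreduce.Job;

  public class ReduceTasksSketch {
    public static Job configure() throws IOException {
      Configuration conf = new Configuration();
      Job job = new Job(conf, "example-job");
      // Per job: how many reduce *tasks* the job will run in total.
      job.setNumReduceTasks(10);
      // Per tasktracker: how many reduce *slots* exist is set cluster-side
      // (mapred.tasktracker.reduce.tasks.maximum in mapred-site.xml),
      // not from the job.
      return job;
    }
  }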


Regards,
Som Shekhar Sharma
+91-8197243810


On Thu, Aug 8, 2013 at 7:19 AM, Azuryy Yu azury...@gmail.com wrote:

 Hi Dears,

 Can I specify how many slots to use for reduce?

 I know we can specify reduce tasks, but does one task occupy one slot?

 Is it possible that one task occupies more than one slot in Hadoop-1.1.2?

 Thanks.



Re: specify Mapred tasks and slots

2013-08-07 Thread Shekhar Sharma
Slots are decided based on the configuration of the machines, RAM, etc.

Regards,
Som Shekhar Sharma
+91-8197243810


On Thu, Aug 8, 2013 at 7:19 AM, Azuryy Yu azury...@gmail.com wrote:

 Hi Dears,

 Can I specify how many slots to use for reduce?

 I know we can specify reduce tasks, but does one task occupy one slot?

 Is it possible that one task occupies more than one slot in Hadoop-1.1.2?

 Thanks.



Re: specify Mapred tasks and slots

2013-08-07 Thread Azuryy Yu
My question is: can I specify how many slots are to be used for each M/R task?


On Thu, Aug 8, 2013 at 10:29 AM, Shekhar Sharma shekhar2...@gmail.comwrote:

 Slots are decided upon the configuration of machines, RAM etc...

 Regards,
 Som Shekhar Sharma
 +91-8197243810


 On Thu, Aug 8, 2013 at 7:19 AM, Azuryy Yu azury...@gmail.com wrote:

 Hi Dears,

 Can I specify how many slots to use for reduce?

 I know we can specify reduce tasks, but does one task occupy one
 slot?

 Is it possible that one task occupies more than one slot in Hadoop-1.1.2?

 Thanks.





RE: specify Mapred tasks and slots

2013-08-07 Thread Devaraj k
One task can use only one slot; it cannot use more than one slot. If the task
is a map task it will use one map slot, and if the task is a reduce task it
will use one reduce slot from the configured ones.

Thanks
Devaraj k

From: Azuryy Yu [mailto:azury...@gmail.com]
Sent: 08 August 2013 08:27
To: user@hadoop.apache.org
Subject: Re: specify Mapred tasks and slots

My question is can I specify how many slots to be used for each M/R task?

On Thu, Aug 8, 2013 at 10:29 AM, Shekhar Sharma 
shekhar2...@gmail.commailto:shekhar2...@gmail.com wrote:
Slots are decided upon the configuration of machines, RAM etc...

Regards,
Som Shekhar Sharma
+91-8197243810

On Thu, Aug 8, 2013 at 7:19 AM, Azuryy Yu 
azury...@gmail.commailto:azury...@gmail.com wrote:
Hi Dears,

Can I specify how many slots to use for reduce?

I know we can specify reduce tasks, but does one task occupy one slot?

Is it possible that one task occupies more than one slot in Hadoop-1.1.2?

Thanks.




Re: specify Mapred tasks and slots

2013-08-07 Thread Harsh J
What Devaraj said, except that if you use the CapacityScheduler you can tie
memory requests to the slot concept and have a task grab more than one slot
for itself when needed. We've discussed this aspect previously at
http://search-hadoop.com/m/gnFs91yIg1e
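
A hedged sketch of the job-side knobs this involves, assuming Hadoop 1.x
memory-based scheduling with the CapacityScheduler; the class name and the
values are placeholders, and the cluster side would also need the
corresponding mapred.cluster.*.memory.mb limits configured:

  import org.apache.hadoop.mapred.JobConf;

  public class HighMemoryJobSketch {
    public static JobConf highMemoryReduces() {
      JobConf conf = new JobConf();
      // If the cluster advertises, say, 1024 MB per reduce slot
      // (mapred.cluster.reduce.memory.mb), asking for 2048 MB per reduce
      // task makes the CapacityScheduler account each reduce task as
      // occupying two reduce slots.
      conf.setLong("mapred.job.reduce.memory.mb", 2048L);
      conf.setLong("mapred.job.map.memory.mb", 1024L);
      return conf;
    }
  }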

On Thu, Aug 8, 2013 at 8:34 AM, Devaraj k devara...@huawei.com wrote:
 One task can use only one slot; it cannot use more than one slot. If the
 task is a map task it will use one map slot, and if the task is a reduce
 task it will use one reduce slot from the configured ones.



 Thanks

 Devaraj k



 From: Azuryy Yu [mailto:azury...@gmail.com]
 Sent: 08 August 2013 08:27
 To: user@hadoop.apache.org
 Subject: Re: specify Mapred tasks and slots



 My question is can I specify how many slots to be used for each M/R task?



 On Thu, Aug 8, 2013 at 10:29 AM, Shekhar Sharma shekhar2...@gmail.com
 wrote:

 Slots are decided upon the configuration of machines, RAM etc...


 Regards,

 Som Shekhar Sharma

 +91-8197243810



 On Thu, Aug 8, 2013 at 7:19 AM, Azuryy Yu azury...@gmail.com wrote:

 Hi Dears,



 Can I specify how many slots to use for reduce?



 I know we can specify reduce tasks, but does one task occupy one slot?



 Is it possible that one task occupies more than one slot in Hadoop-1.1.2?



 Thanks.







-- 
Harsh J


Re: specify Mapred tasks and slots

2013-08-07 Thread Azuryy Yu
Thanks Harsh and all friends for the responses. That's helpful.



On Thu, Aug 8, 2013 at 11:55 AM, Harsh J ha...@cloudera.com wrote:

 What Devaraj said. Except that if you use CapacityScheduler, then you
 can bind together memory requests and slot concepts, and be able to have a
 task
 grab more than one slot for itself when needed. We've discussed this
 aspect previously at http://search-hadoop.com/m/gnFs91yIg1e

 On Thu, Aug 8, 2013 at 8:34 AM, Devaraj k devara...@huawei.com wrote:
  One task can use only one slot; it cannot use more than one slot. If the
  task is a map task it will use one map slot, and if the task is a reduce
  task it will use one reduce slot from the configured ones.
 
 
 
  Thanks
 
  Devaraj k
 
 
 
  From: Azuryy Yu [mailto:azury...@gmail.com]
  Sent: 08 August 2013 08:27
  To: user@hadoop.apache.org
  Subject: Re: specify Mapred tasks and slots
 
 
 
  My question is can I specify how many slots to be used for each M/R task?
 
 
 
  On Thu, Aug 8, 2013 at 10:29 AM, Shekhar Sharma shekhar2...@gmail.com
  wrote:
 
  Slots are decided upon the configuration of machines, RAM etc...
 
 
  Regards,
 
  Som Shekhar Sharma
 
  +91-8197243810
 
 
 
  On Thu, Aug 8, 2013 at 7:19 AM, Azuryy Yu azury...@gmail.com wrote:
 
  Hi Dears,
 
 
 
  Can I specify how many slots to use for reduce?
 
 
 
  I know we can specify reduce tasks, but does one task occupy one
 slot?
 
 
 
  Is it possible that one task occupies more than one slot in Hadoop-1.1.2?
 
 
 
  Thanks.
 
 
 
 



 --
 Harsh J



is it ok to build a hadoop cluster on kvm in a production environment?

2013-08-07 Thread ch huang
hi, all:
my company does not have much budget for boxes. If I build the cluster on KVM,
will it have a big impact on performance?