can't submit remote job

2015-05-18 Thread xeonmailinglist-gmail

Hi,

I am trying to submit a remote job to YARN MapReduce, but I can't because
I get the error [1]. There are no other exceptions in the other logs.


My MapReduce runtime has 1 /ResourceManager/ and 3 /NodeManagers/, and
HDFS is running properly (all nodes are alive).


I have looked at all the logs, and I still don't understand why I get this
error. Can anyone help me fix this? Is it a problem with the remote job
that I am submitting?


[1]

|$ less logs/hadoop-ubuntu-namenode-ip-172-31-17-45.log

2015-05-18 10:42:16,570 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK*
NameNode.addBlock: file
/tmp/hadoop-yarn/staging/xeon/.staging/job_1431945660897_0001/job.split
fileId=16394 for DFSClient_NONMAPREDUCE_-1923902075_14
2015-05-18 10:42:16,570 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.getAdditionalBlock:
/tmp/hadoop-yarn/staging/xeon/.staging/job_1431945660897_0001/job.split
inodeId 16394 for DFSClient_NONMAPREDUCE_-1923902075_14
2015-05-18 10:42:16,571 DEBUG
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to
choose remote rack (location = ~/default-rack), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:580)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:348)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:214)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:111)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:126)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1545)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
|





Fwd: can't submit remote job

2015-05-18 Thread xeonmailinglist-gmail
I also can't find a good site or book that explains how to submit
remote jobs. Does anyone know where I can get more useful information?



 Forwarded Message 
Subject: can't submit remote job
Date: Mon, 18 May 2015 11:54:56 +0100
From: xeonmailinglist-gmail
To: user@hadoop.apache.org



[forwarded message body snipped; identical to the original post above]





Re: Jr. to Mid Level Big Data jobs in Bay Area

2015-05-18 Thread mark charts
I agree.

I think it's OK. But be advised there are jerks even in "sheep's clothing."
Humans are good at complaining for no reason at all, simply to hear their
own voice, I suppose.


On Sunday, May 17, 2015 9:15 PM, Juan Suero  wrote:

He's a human asking for human advice; it's OK, methinks.
We should live in a more tolerant world.
Thanks.

On Sun, May 17, 2015 at 8:10 PM, Stephen Boesch  wrote:

Hi,  This is not a job board. Thanks.
2015-05-17 16:00 GMT-07:00 Adam Pritchard :

Hi everyone,
I was wondering if any of you know of any openings looking to hire a big data
dev in the Palo Alto area.
The main thing I am looking for is to be on a team that will embrace having a
Jr-to-Mid-level big data developer, where I can grow my skill set and
contribute.

My skills are:
3 years Java
1.5 years Hadoop
1.5 years HBase
1 year MapReduce
1 year Apache Storm
<1 year Apache Spark (did a Spark Streaming project in Scala)
5 years PHP
3 years iOS development
4 years Amazon EC2 experience

Currently I am working in San Francisco as a big data developer, but the team
I'm on is content to leave me the work I already knew how to do when I came to
the team (web services), and I want to work with big data technologies at
least 70% of the time.

I am not a senior big data dev, but I am motivated to become one, and am just
looking for an opportunity where I can work all or most of the day with big
data technologies, and contribute to and learn from the project at hand.

Thanks if anyone can share any information,

Adam







  

Re: can't submit remote job

2015-05-18 Thread Billy Watson
Netflix Genie is what we use for submitting jobs.

William Watson
Software Engineer
(904) 705-7056 PCS

On Mon, May 18, 2015 at 7:07 AM, xeonmailinglist-gmail <
xeonmailingl...@gmail.com> wrote:

>  I also can't find a good site or book that explains how to submit
> remote jobs. Does anyone know where I can get more useful information?
>
> [forwarded original post and stack trace snipped]
>


Re: can't submit remote job

2015-05-18 Thread Gopy Krishna
REMOVE

On Mon, May 18, 2015 at 6:54 AM, xeonmailinglist-gmail <
xeonmailingl...@gmail.com> wrote:

>  Hi,
>
> [quoted original post and stack trace snipped]
>


-- 
Thanks & Regards
Gopy
Rapisource LLC
Direct: 732-419-9663
Fax: 512-287-4047
Email: g...@rapisource.com
www.rapisource.com
http://www.linkedin.com/in/gopykrishna

According to Bill S.1618 Title III passed by the 105th US Congress, this
message is not considered "Spam" as we have included the contact
information. If you wish to be removed from our mailing list, please respond
with "remove" in the subject field. We apologize for any inconvenience
caused.


Re: can't submit remote job

2015-05-18 Thread xeonmailinglist-gmail

Why "Remove"?

On 05/18/2015 02:25 PM, Gopy Krishna wrote:

REMOVE

On Mon, May 18, 2015 at 6:54 AM, xeonmailinglist-gmail
<xeonmailingl...@gmail.com> wrote:

[quoted original post and stack trace snipped]




[Gopy's signature and disclaimer snipped]





Re: can't submit remote job

2015-05-18 Thread Shahab Yunus
I think that the poster wanted to unsubscribe from the mailing list?

Gopy, if that is the case, then please see this:
https://hadoop.apache.org/mailing_lists.html

Regards,
Shahab

On Mon, May 18, 2015 at 9:42 AM, xeonmailinglist-gmail <
xeonmailingl...@gmail.com> wrote:

>  Why "Remove"?
>
>
> On 05/18/2015 02:25 PM, Gopy Krishna wrote:
>
> REMOVE
>
> [quoted original post, stack trace, and signature snipped]
>
>


Re: can't submit remote job

2015-05-18 Thread xeonmailinglist-gmail
Shahab, I think so, but Hadoop's site says |The user@ mailing list
is the preferred mailing list for end-user questions and discussion.| So
I am using the right mailing list.


Back to my problem: I think this is a problem with HDFS security
(permissions). But the strangest thing is that I have disabled it in
|hdfs-site.xml| [1].


I think this error happens when MapReduce is trying to write the
job configuration files to HDFS.


I have set up the remote client's username on the cluster using the
commands in [2].


Now I am looking at Netflix Genie to figure out how they do it,
but right now I still haven't found a solution for submitting a remote job
using Java. If anyone has a hint or advice, please tell me. I really
don't understand why I get this error.


[1]

|$ cat etc/hadoop/hdfs-site.xml

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
|

[2]

|# On the NameNode host:

$ sudo adduser xeon
$ sudo adduser xeon ubuntu
|
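
Below is a minimal sketch of the kind of Java submission I am attempting
(not my actual code: the host names, ports, and paths are placeholders, and
it relies on the default identity mapper/reducer so the sketch stays
self-contained). As an aside, in Hadoop 2.x the permissions switch is
spelled |dfs.permissions.enabled|; the old |dfs.permissions| name should
still be honored through the deprecation table, but it is worth checking.

|import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Minimal remote-submission sketch: identity map/reduce, placeholder hosts.
public class RemoteSubmit {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://namenode-host:9000");      // remote HDFS
    conf.set("mapreduce.framework.name", "yarn");               // run on YARN, not locally
    conf.set("yarn.resourcemanager.address", "rm-host:8032");   // remote ResourceManager

    Job job = Job.getInstance(conf, "remote-identity-job");
    job.setJarByClass(RemoteSubmit.class);   // this jar is shipped to the cluster
    FileInputFormat.addInputPath(job, new Path("/user/xeon/input"));
    FileOutputFormat.setOutputPath(job, new Path("/user/xeon/output"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
|

With simple authentication, exporting |HADOOP_USER_NAME=xeon| on the client
is one way to control which user the staging files under
/tmp/hadoop-yarn/staging are written as.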

On 05/18/2015 02:46 PM, Shahab Yunus wrote:


[quoted message chain snipped]




Re: Jr. to Mid Level Big Data jobs in Bay Area

2015-05-18 Thread Rich Haase
Hi Adam,

Questions about employment and career advice aren't appropriate for this, or
any, Apache mailing list. However, there are a number of forums on LinkedIn
where this question will be much better received.

The Hadoop mailing lists get bombarded with questions from all over the
ecosystem, many of them not related to Hadoop at all. The community is very
protective of these forums (as you can see) because they are often the best
place to get and share expertise on the development and operation of Hadoop.

Thanks,

Rich



On May 18, 2015, at 5:19 AM, mark charts <mcha...@yahoo.com> wrote:

[quoted thread snipped]



Rich Haase | Sr. Software Engineer | Pandora
m (303) 887-1146 | rha...@pandora.com






hadoop.tmp.dir?

2015-05-18 Thread Caesar Samsi
Hello,

hadoop.tmp.dir seems to be the root of all storage directories.

I'd like for data to be stored in separate locations.

Is there a list of directories and how they can be specified?

Thank you, Caesar.

(.tmp seems to indicate a temporary condition, and yet it's used by HDFS,
etc.)



Has bug HDFS-8179 been fixed for 2.7.0? Is there a patch?

2015-05-18 Thread Caesar Samsi
Hello,

DFSClient#getServerDefaults returns null within 1 hour of system start

https://issues.apache.org/jira/browse/HDFS-8179

I'm coming across this within 5 minutes of start and having to use
-skipTrash.

Is there a configuration option to always use -skipTrash and avoid the bug
altogether?

It's cumbersome when writing scripts.

Thank you, Caesar.

LinuxMint 17 Cinnamon 64 bit.



Re: Has bug HDFS-8179 been fixed for 2.7.0? Is there a patch?

2015-05-18 Thread Ted Yu
The fix is in the upcoming 2.7.1 release.

See this thread: http://search-hadoop.com/m/uOzYt0soQDrSOkY
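
Until then, one stop-gap for scripts is simply to pass the flag on every
delete; a minimal sketch (the path is a placeholder):

|# Workaround sketch until 2.7.1: skip the trash on each delete so the
# client never needs the (null) server defaults.
hdfs dfs -rm -r -skipTrash /tmp/scratch-dir
|

Whether disabling the trash entirely (fs.trash.interval set to 0 in
core-site.xml) also avoids this particular NPE is untested here.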

On Mon, May 18, 2015 at 3:46 PM, Caesar Samsi  wrote:

> [quoted original message snipped]


Re: hadoop.tmp.dir?

2015-05-18 Thread Rajesh Kartha
Hello,

The 3 main settings in hdfs-site.xml are:


   - *dfs.name.dir*: directory where the namenode stores its metadata;
     default value ${hadoop.tmp.dir}/dfs/name.
   - *dfs.data.dir*: directory where HDFS data blocks are stored;
     default value ${hadoop.tmp.dir}/dfs/data.
   - *dfs.namenode.checkpoint.dir*: directory where the secondary namenode
     stores its checkpoints; default value ${hadoop.tmp.dir}/dfs/namesecondary.



By default these all live under ${hadoop.tmp.dir}:
https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

but one can provide a comma-delimited list of directory paths to point to
multiple locations/disks and have the data distributed across them.
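
A sketch with illustrative paths (note that in Hadoop 2.x the preferred
property names are dfs.namenode.name.dir and dfs.datanode.data.dir; the
older names above remain as deprecated aliases):

|<!-- hdfs-site.xml sketch; the /disk1, /disk2 paths are illustrative -->
<property>
  <name>dfs.namenode.name.dir</name>
  <!-- each listed directory holds a full redundant copy of the metadata -->
  <value>/disk1/hdfs/name,/disk2/hdfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- blocks are spread round-robin across the listed directories -->
  <value>/disk1/hdfs/data,/disk2/hdfs/data</value>
</property>
<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>/disk1/hdfs/namesecondary</value>
</property>
|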

HTH

-Rajesh



On Mon, May 18, 2015 at 2:41 PM, Caesar Samsi  wrote:

> [quoted original message snipped]


Verifying distributed clusters are working?

2015-05-18 Thread Caesar Samsi
Hello,

How would I verify that HDFS, MapReduce, and YARN are working across the
cluster?

The purpose is at least twofold:

1. Make sure the computations are distributed.

2. Ascertain that the nodes are healthy (via external monitoring/management
software).

Thank you, Caesar.

Hadoop 2.7.0
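
A sketch of the usual first checks on 2.7.0, assuming a default install (the
examples-jar path varies by distribution; the ResourceManager web UI on port
8088 shows which nodes actually ran containers):

|# 1. HDFS: every DataNode should report in with non-zero capacity.
hdfs dfsadmin -report

# 2. YARN: every NodeManager should be listed as RUNNING.
yarn node -list -all

# 3. End to end: run a small example job, then confirm in the RM web UI
#    (http://resourcemanager-host:8088) that containers ran on several nodes.
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar pi 10 100
|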