RE: intermediate results files
If you are 100% sure that all the data nodes are available and healthy for that period of time, you can choose the replication factor as 1 or 3. Thanks Devaraj k From: John Lilley [mailto:john.lil...@redpoint.net] Sent: 02 July 2013 04:40 To: user@hadoop.apache.org Subject: RE: intermediate results files I've seen some benchmarks where replication=1 runs at about 50MB/sec and replication=3 runs at about 33MB/sec, but I can't seem to find that now. John From: Mohammad Tariq [mailto:donta...@gmail.com] Sent: Monday, July 01, 2013 5:03 PM To: user@hadoop.apache.org Subject: Re: intermediate results files Hello John, IMHO, it doesn't matter. Your job will write the result just once. Replica creation is handled at the HDFS layer, so it has nothing to do with your job. Your job will still be writing at the same speed. Warm Regards, Tariq cloudfront.blogspot.com On Tue, Jul 2, 2013 at 4:16 AM, John Lilley john.lil...@redpoint.net wrote: If my reducers are going to create results that are temporary in nature (consumed by the next processing stage) is it recommended to use a replication factor 3 to improve performance? Thanks john
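[Editor's note] If you do decide to lower the factor for such temporary output, a hedged Java sketch of two ways to do it from the client side is below. The output path is illustrative, and setting dfs.replication in the job configuration is assumed to apply to files created by tasks running with that configuration; verify against your Hadoop version.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class TemporaryOutputReplication {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Option 1: files created under this configuration (including the
        // reducers' output files) are created with replication 1 instead of
        // the cluster default.
        conf.setInt("dfs.replication", 1);
        Job job = Job.getInstance(conf, "stage-1");
        // ... set mapper/reducer/input/output paths as usual ...

        // Option 2: lower the factor after the fact on an existing output
        // directory (illustrative path).
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus st : fs.listStatus(new Path("/tmp/stage1-out"))) {
            if (st.isFile()) {
                fs.setReplication(st.getPath(), (short) 1);
            }
        }
    }
}
```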
RE: temporary folders for YARN tasks
You can make use of this configuration to do the same:

<property>
  <description>List of directories to store localized files in. An
  application's localized file directory will be found in:
  ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.
  Individual containers' work directories, called container_${contid}, will be
  subdirectories of this.</description>
  <name>yarn.nodemanager.local-dirs</name>
  <value>${hadoop.tmp.dir}/nm-local-dir</value>
</property>

Thanks Devaraj k From: John Lilley [mailto:john.lil...@redpoint.net] Sent: 02 July 2013 02:08 To: user@hadoop.apache.org Subject: temporary folders for YARN tasks When a YARN app and its tasks want to write temporary files, how do they know where to write the files? I am assuming that each task has some temporary space available, and I hope it is available across multiple disk volumes for parallel performance. Are those files cleaned up automatically after task exit? If I want to give lifetime control of the files to an auxiliary service (along the lines of MR shuffle passing files to the aux service), how would I do that, and would that entail different file locations? Thanks John
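[Editor's note] A minimal sketch of how the per-application cache path from the property description above can be assembled; the user name and application id are hypothetical values for illustration only, and it assumes yarn-site.xml is on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AppCachePath {
    public static void main(String[] args) {
        // Hypothetical user/application id, purely illustrative.
        String user = "john";
        String appId = "application_1372723628917_0001";

        Configuration conf = new YarnConfiguration();
        // yarn.nodemanager.local-dirs may list several directories (one per disk),
        // which is how the temp space ends up spread across volumes.
        String[] localDirs = conf.getStrings(
                "yarn.nodemanager.local-dirs", "/tmp/nm-local-dir");

        // Per the property description, each application gets an appcache
        // directory under every configured local dir.
        for (String dir : localDirs) {
            System.out.println(dir + "/usercache/" + user + "/appcache/" + appId);
        }
    }
}
```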
Re: Hang when add/remove a datanode into/from a 2 datanode cluster
Yes, the default replication factor is 3. However, in my case, it's strange: while the decommission hangs, I found that some blocks' expected replica count is 3, but the 'dfs.replication' value in hdfs-site.xml of every cluster node has been 2 since the beginning of the cluster setup. Below are my steps: 1. Install a Hadoop 1.1.1 cluster, with 2 datanodes: dn1 and dn2. And, in hdfs-site.xml, set 'dfs.replication' to 2 2. Add node dn3 into the cluster as a new datanode, and do not change the 'dfs.replication' value in hdfs-site.xml; keep it as 2 note: step 2 passed 3. Decommission dn3 from the cluster Expected result: dn3 could be decommissioned successfully Actual result: a). The decommission progress hangs and the status always stays at 'Waiting DataNode status: Decommissioned'. But, if I execute 'hadoop dfs -setrep -R 2 /', the decommission continues and completes eventually. b). However, if the initial cluster includes >= 3 datanodes, this issue is not encountered when adding/removing another datanode. For example, if I set up a cluster with 3 datanodes, I can successfully add a 4th datanode into it, and then also successfully remove the 4th datanode from the cluster. I suspect it's a bug and plan to open a JIRA against Hadoop HDFS for this. Any comments? Thanks! 2013/6/21 Harsh J ha...@cloudera.com dfs.replication is a per-file parameter. If you have a client that does not use the supplied configs, then its default replication is 3 and all files it creates (as part of the app or via a job config) will be created with replication factor 3. You can do an -lsr to find all files and filter which ones have been created with a factor of 3 (versus the expected config of 2). On Fri, Jun 21, 2013 at 3:13 PM, sam liu samliuhad...@gmail.com wrote: Hi George, Actually, in my hdfs-site.xml, I always set 'dfs.replication' to 2. But I still encounter this issue. Thanks! 2013/6/21 George Kousiouris gkous...@mail.ntua.gr Hi, I think I have faced this before. The problem is that you have the rep factor=3, so it seems to hang because it needs 3 nodes to achieve the factor (replicas are not created on the same node). If you set the replication factor=2 I think you will not have this issue. So in general you must make sure that the rep factor is <= the number of available datanodes. BR, George On 6/21/2013 12:29 PM, sam liu wrote: Hi, I encountered an issue which hangs the decommission operation. The steps: 1. Install a Hadoop 1.1.1 cluster, with 2 datanodes: dn1 and dn2. And, in hdfs-site.xml, set 'dfs.replication' to 2 2. Add node dn3 into the cluster as a new datanode, and do not change the 'dfs.replication' value in hdfs-site.xml; keep it as 2 note: step 2 passed 3. Decommission dn3 from the cluster Expected result: dn3 could be decommissioned successfully Actual result: decommission progress hangs and the status always stays at 'Waiting DataNode status: Decommissioned' However, if the initial cluster includes >= 3 datanodes, this issue is not encountered when adding/removing another datanode. Also, after step 2, I noticed that some blocks' expected replica count is 3, but the 'dfs.replication' value in hdfs-site.xml is always 2! Could anyone please help provide some triage? Thanks in advance!
-- --- George Kousiouris, PhD Electrical and Computer Engineer Division of Communications, Electronics and Information Engineering School of Electrical and Computer Engineering Tel: +30 210 772 2546 Mobile: +30 6939354121 Fax: +30 210 772 2569 Email: gkous...@mail.ntua.gr Site: http://users.ntua.gr/gkousiou/ National Technical University of Athens 9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece -- Harsh J
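[Editor's note] Harsh's suggestion above is to -lsr the filesystem and filter for files created with a factor of 3. A hedged Java equivalent of that scan is sketched below; it assumes the cluster configuration is on the client's classpath and uses the Hadoop 2.x FileStatus API. The expected factor and starting path are taken from the thread.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FindOverReplicatedFiles {
    // Recursively list files and report any whose stored replication factor
    // exceeds what the cluster config expects (2 in this thread).
    static void scan(FileSystem fs, Path dir, short expected) throws Exception {
        for (FileStatus st : fs.listStatus(dir)) {
            if (st.isDirectory()) {
                scan(fs, st.getPath(), expected);
            } else if (st.getReplication() > expected) {
                System.out.println(st.getPath() + " has replication " + st.getReplication());
                // Optionally fix it, like 'hadoop dfs -setrep' does:
                // fs.setReplication(st.getPath(), expected);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        scan(fs, new Path("/"), (short) 2);
    }
}
```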
Re: temporary folders for YARN tasks
LocalDirAllocator should help with this. You can look through the MapReduce code to see how it's used. -Sandy On Mon, Jul 1, 2013 at 11:01 PM, Devaraj k devara...@huawei.com wrote: You can make use of this configuration to do the same: <property> <description>List of directories to store localized files in. An application's localized file directory will be found in: ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}. Individual containers' work directories, called container_${contid}, will be subdirectories of this.</description> <name>yarn.nodemanager.local-dirs</name> <value>${hadoop.tmp.dir}/nm-local-dir</value> </property> Thanks Devaraj k From: John Lilley [mailto:john.lil...@redpoint.net] Sent: 02 July 2013 02:08 To: user@hadoop.apache.org Subject: temporary folders for YARN tasks When a YARN app and its tasks want to write temporary files, how do they know where to write the files? I am assuming that each task has some temporary space available, and I hope it is available across multiple disk volumes for parallel performance. Are those files cleaned up automatically after task exit? If I want to give lifetime control of the files to an auxiliary service (along the lines of MR shuffle passing files to the aux service), how would I do that, and would that entail different file locations? Thanks John
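[Editor's note] A hedged sketch of the LocalDirAllocator usage Sandy mentions. The configuration key and file name are illustrative; inside a container you would point the allocator at whatever property holds your container's local directories rather than the NodeManager's own setting.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocalDirAllocator;
import org.apache.hadoop.fs.Path;

public class TaskScratchSpace {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // The allocator round-robins across all directories listed under the
        // given key, spreading temporary files over the available disks.
        LocalDirAllocator allocator =
                new LocalDirAllocator("yarn.nodemanager.local-dirs");

        // Ask for a writable local path with (at least) 64 MB of free space.
        Path scratch = allocator.getLocalPathForWrite(
                "myapp/spill-0001.tmp", 64L * 1024 * 1024, conf);
        System.out.println("Writing temporary data to " + scratch);
    }
}
```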
reply: a question about dfs.replication
YouPeng Yang, you said that may be the answer. Thank you. From: YouPeng Yang [mailto:yypvsxf19870...@gmail.com] Sent: Tuesday, July 02, 2013 12:52 To: user@hadoop.apache.org Subject: Re: reply: a question about dfs.replication Hi Hu and Yu, I agree that dfs.replication is a client-side configuration, not server side. That makes the point in my last mail make sense. And the command 'hdfs dfs -setrep -R -w 2 /' solves the problem that I could not change the existing files' replication value. 2013/7/2 Azuryy Yu azury...@gmail.com It's not an HDFS issue. dfs.replication is a client-side configuration, not server side, so you need to set it to '2' on your client side (where your application is running). Then execute a command such as 'hdfs dfs -put', or call the HDFS API in a Java application. On Tue, Jul 2, 2013 at 12:25 PM, Francis.Hu francis...@reachjunction.com wrote: Thanks, all of you. I just got the problem fixed through the command: hdfs dfs -setrep -R -w 2 / Is that an issue of HDFS? Why do I need to manually execute a command to tell Hadoop the replication factor even though it is set in hdfs-site.xml? Thanks, Francis.Hu From: Francis.Hu [mailto:francis...@reachjunction.com] Sent: Tuesday, July 02, 2013 11:30 To: user@hadoop.apache.org Subject: RE: RE: a question about dfs.replication Yes, it returns 2 correctly after 'hdfs getconf -confkey dfs.replication', but in the web page it is 3, as below: From: yypvsxf19870706 [mailto:yypvsxf19870...@gmail.com] Sent: Monday, July 01, 2013 23:24 To: user@hadoop.apache.org Subject: Re: RE: a question about dfs.replication Hi, Could you please get the property value by using: hdfs getconf -confkey dfs.replication. Sent from my iPhone. On 2013-7-1, at 15:51, Francis.Hu francis...@reachjunction.com wrote: Actually, my Java client is running with the same configuration as Hadoop's. dfs.replication is already set to 2 in my Hadoop configuration, so I think dfs.replication is already overridden by my configuration in hdfs-site.xml, but it seems it doesn't work even though I evidently overrode the parameter. From: Емельянов Борис [mailto:emelya...@post.km.ru] Sent: Monday, July 01, 2013 15:18 To: user@hadoop.apache.org Subject: Re: a question about dfs.replication On 01.07.2013 10:19, Francis.Hu wrote: Hi, all. I am installing a cluster with Hadoop 2.0.5-alpha. I have one namenode and two datanodes. dfs.replication is set to 2 in hdfs-site.xml. After all configuration work was done, I started all nodes. Then I saved a file into HDFS through a Java client. Now I can access the HDFS web page at x.x.x.x:50070, and I also see the file listed in HDFS. My question is: the replication column in the HDFS web page is showing 3, not 2. Does anyone know what the problem is? ---Actual setting in hdfs-site.xml:

<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>

After that, I ran a command to check the file: hdfs fsck /test3/ The result of the above command:

/test3/hello005.txt: Under replicated BP-609310498-192.168.219.129-1372323727200:blk_-1069303317294683372_1006. Target Replicas is 3 but found 2 replica(s).
Status: HEALTHY
 Total size:                    35 B
 Total dirs:                    1
 Total files:                   1
 Total blocks (validated):      1 (avg. block size 35 B)
 Minimally replicated blocks:   1 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       1 (100.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              1 (33.32 %)
 Number of data-nodes:          3
 Number of racks:               1
FSCK ended at Sat Jun 29 16:51:37 CST 2013 in 6 milliseconds

Thanks, Francis Hu If I'm not mistaken, the dfs.replication parameter in the config sets only the default replication factor, which can be overridden when putting a file into HDFS.
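[Editor's note] A hedged sketch of the behaviour described in this thread: replication is a per-file attribute chosen by the client that creates the file, so a client that never loads the cluster's hdfs-site.xml falls back to the shipped default of 3, and that is what the web UI later reports. It assumes fs.defaultFS points at the cluster and uses the file path from the thread.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationIsClientSide {
    public static void main(String[] args) throws Exception {
        // Suppose hdfs-site.xml is NOT on this client's classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // What this client would use for new files: the shipped default (3)
        // rather than the 2 configured on the cluster nodes.
        System.out.println("client default replication = "
                + fs.getDefaultReplication(new Path("/")));

        // Files remember the factor they were created with, which is what the
        // NameNode web UI shows.
        Path p = new Path("/test3/hello005.txt");   // path taken from the thread
        System.out.println("stored factor = " + fs.getFileStatus(p).getReplication());

        // Per-file fix, equivalent to 'hdfs dfs -setrep 2 /test3/hello005.txt':
        fs.setReplication(p, (short) 2);
    }
}
```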
Re: data loss after cluster wide power loss
Hi Uma, I think there is minimum performance degration if set dfs.datanode.synconclose to true. On Tue, Jul 2, 2013 at 3:31 PM, Uma Maheswara Rao G mahesw...@huawei.comwrote: Hi Dave, Looks like your analysis is correct. I have faced similar issue some time back. See the discussion link: http://markmail.org/message/ruev3aa4x5zh2l4w#query:+page:1+mid:33gcdcu3coodkks3+state:results On sudden restarts, it can lost the OS filesystem edits. Similar thing happened in our case, i.e, after restart blocks were moved back to BeingWritten directory even though they were finalized. After restart they were marked as corrupt. You could set dfs.datanode.synconclose to true to avoid this sort of things, but that will degrade performance. Regards, Uma -Original Message- From: ddlat...@gmail.com [mailto:ddlat...@gmail.com] On Behalf Of Dave Latham Sent: 01 July 2013 16:08 To: hdfs-u...@hadoop.apache.org Cc: hdfs-...@hadoop.apache.org Subject: Re: data loss after cluster wide power loss Much appreciated, Suresh. Let me know if I can provide any more information or if you'd like me to open a JIRA. Dave On Mon, Jul 1, 2013 at 8:48 PM, Suresh Srinivas sur...@hortonworks.com wrote: Dave, Thanks for the detailed email. Sorry I did not read all the details you had sent earlier completely (on my phone). As you said, this is not related to data loss related to HBase log and hsync. I think you are right; the rename operation itself might not have hit the disk. I think we should either ensure metadata operation is synced on the datanode or handle it being reported as blockBeingWritten. Let me spend sometime to debug this issue. One surprising thing is, all the replicas were reported as blockBeingWritten. Regards, Suresh On Mon, Jul 1, 2013 at 6:03 PM, Dave Latham lat...@davelink.net wrote: (Removing hbase list and adding hdfs-dev list as this is pretty internal stuff). Reading through the code a bit: FSDataOutputStream.close calls DFSOutputStream.close calls DFSOutputStream.closeInternal - sets currentPacket.lastPacketInBlock = true - then calls DFSOutputStream.flushInternal - enqueues current packet - waits for ack BlockReceiver.run - if (lastPacketInBlock !receiver.finalized) calls FSDataset.finalizeBlock calls FSDataset.finalizeBlockInternal calls FSVolume.addBlock calls FSDir.addBlock calls FSDir.addBlock - renames block from blocksBeingWritten tmp dir to current dest dir This looks to me as I would expect a synchronous chain from a DFS client to moving the file from blocksBeingWritten to the current dir so that once the file is closed that it the block files would be in the proper directory - even if the contents of the file are still in the OS buffer rather than synced to disk. It's only after this moving of blocks that NameNode.complete file is called. There are several conditions and loops in there that I'm not certain this chain is fully reliable in all cases without a greater understanding of the code. Could it be the case that the rename operation itself is not synced and that ext3 lost the fact that the block files were moved? Or is there a bug in the close file logic that for some reason the block files are not always moved into place when a file is closed? Thanks for your patience, Dave On Mon, Jul 1, 2013 at 3:35 PM, Dave Latham lat...@davelink.net wrote: Thanks for the response, Suresh. I'm not sure that I understand the details properly. From my reading of HDFS-744 the hsync API would allow a client to make sure that at any point in time it's writes so far hit the disk. 
For example, for HBase it could apply a fsync after adding some edits to its WAL to ensure those edits are fully durable for a file which is still open. However, in this case the dfs file was closed and even renamed. Is it the case that even after a dfs file is closed and renamed that the data blocks would still not be synced and would still be stored by the datanode in blocksBeingWritten rather than in current? If that is case, would it be better for the NameNode not to reject replicas that are in blocksBeingWritten, especially if it doesn't have any other replicas available? Dave On Mon, Jul 1, 2013 at 3:16 PM, Suresh Srinivas sur...@hortonworks.comwrote: Yes this is a known issue. The HDFS part of this was addressed in https://issues.apache.org/jira/browse/HDFS-744 for 2.0.2-alpha and is not available in 1.x release. I think HBase does not use this API yet. On Mon, Jul 1, 2013 at 3:00 PM, Dave Latham lat...@davelink.net wrote: We're running HBase over HDFS 1.0.2 on about 1000 nodes. On Saturday the data center we were in had a total power failure and the cluster went down hard. When we brought it back up, HDFS reported 4 files as CORRUPT. We recovered the data in question
Re: How to write/run MPI program on Yarn?
Could anyone help answer the above questions? Thanks a lot! 2013/7/1 sam liu samliuhad...@gmail.com Thanks Pramod and Clark! 1. What's the relationship between the Hadoop 2.x branch and the mpich2-yarn project? 2. Does the Hadoop 2.x branch plan to include an MPI implementation? I noticed there is already a JIRA: https://issues.apache.org/jira/browse/MAPREDUCE-2911 2013/7/1 Clark Yang (杨卓荦) yangzhuo...@gmail.com Hi Sam, Please try it following the README.md on https://github.com/clarkyzl/mpich2-yarn/. The design is simple: each NodeManager runs an smpd daemon, and the ApplicationMaster runs mpiexec to submit MPI jobs. The biggest problem here, I think, is that the YARN scheduler does not support a gang scheduling algorithm, and the scheduling is very slow. If you have any ideas, please feel free to share. Cheers, Zhuoluo (Clark) Yang 2013/6/30 Pramod N npramo...@gmail.com Hi Sam, This might interest you. https://github.com/clarkyzl/mpich2-yarn Pramod N http://atmachinelearner.blogspot.in Bruce Wayne of web @machinelearner https://twitter.com/machinelearner -- On Sun, Jun 30, 2013 at 6:02 PM, sam liu samliuhad...@gmail.com wrote: Hi Experts, Does Hadoop 2.0.5-alpha support MPI programs? - If yes, is there any example of writing an MPI program on YARN? How do I write the client-side code to configure and submit the MPI job? - If no, which Hadoop version will support MPI programming? Thanks!
Re: How to write/run MPI program on Yarn?
Sam, The fundamental idea of YARN is to split up the two major functionalities of the JobTracker, resource management and job scheduling/monitoring, into separate daemons. With YARN, we're going to be able to run multiple workload, one of them being MapReduce another one being MPI. So mpich2-yarn is the port of MPI to run on top of yarn. You may want to read : http://hortonworks.com/hadoop/yarn/ http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html Hope it helps, Olivier On 2 July 2013 02:21, sam liu samliuhad...@gmail.com wrote: Any one could help answer above questions? Thanks a lot! 2013/7/1 sam liu samliuhad...@gmail.com Thanks Pramod and Clark! 1. What's the relationship of Hadoop 2.x branch and mpich2-yarn project? 2. Does Hadoop 2.x branch plan to include MPI implementation? I mentioned there is already a JIRA: https://issues.apache.org/jira/browse/MAPREDUCE-2911 2013/7/1 Clark Yang (杨卓荦) yangzhuo...@gmail.com Hi, sam Please try it following the README.md on https://github.com/clarkyzl/mpich2-yarn/. The design is simple ,make each nodemanger runs a smpd daemon and make application master run mpiexec to submit mpi jobs. The biggest problem here, I think, is the yarn scheduler do not support gang secheduling algorithm, and the scheduling is very very slow. Any ideas please feel free to share. Cheers, Zhuoluo (Clark) Yang 2013/6/30 Pramod N npramo...@gmail.com Hi Sam, This might interest you. https://github.com/clarkyzl/mpich2-yarn Pramod N http://atmachinelearner.blogspot.in Bruce Wayne of web @machinelearner https://twitter.com/machinelearner -- On Sun, Jun 30, 2013 at 6:02 PM, sam liu samliuhad...@gmail.comwrote: Hi Experts, Does Hadoop 2.0.5-alpha supports MPI programs? - If yes, is there any example of writing a MPI program on Yarn? How to write the client side code to configure and submit the MPI job? - If no, which Hadoop version will support MPI programming? Thanks! -- Olivier Renault Solution Engineer - Big Data - Hortonworks, Inc. +44 7500 933 036 orena...@hortonworks.com www.hortonworks.com http://hortonworks.com/products/hortonworks-sandbox/
Latest distcp code
Hi, Which branch of Hadoop has the latest distcp code? The branch-1 tree mentions something like distcp2: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1/src/tools/org/apache/hadoop/tools/distcp2/util/DistCpUtils.java The trunk has no mention of distcp2: http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/ Same is the case for the 2.1 branch: http://svn.apache.org/repos/asf/hadoop/common/branches/branch-2.1.0-beta/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/ Can we say that branch-1 has the latest code and that it has not been back-ported to branch-2? What would be the best place to look? Can you please point me to it? Thanks
Re: Latest distcp code
Hello Harsh, Thank you very much for your reply. Regards, Jagat On Tue, Jul 2, 2013 at 8:29 PM, Harsh J ha...@cloudera.com wrote: The trunk's distcp is by default distcp2. The branch-1 tree received a backport of distcp2 recently, so it is named differently. In general we try not to have a new feature introduced in branch-1. All new features must go to the trunk first, before being back-ported into maintained release branches. On Tue, Jul 2, 2013 at 3:19 PM, Jagat Singh jagatsi...@gmail.com wrote: Hi, Which branch of Hadoop has the latest distcp code? The branch-1 tree mentions something like distcp2: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1/src/tools/org/apache/hadoop/tools/distcp2/util/DistCpUtils.java The trunk has no mention of distcp2: http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/ Same is the case for the 2.1 branch: http://svn.apache.org/repos/asf/hadoop/common/branches/branch-2.1.0-beta/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/ Can we say that branch-1 has the latest code and that it has not been back-ported to branch-2? What would be the best place to look? Can you please point me to it? Thanks -- Harsh J
Re: data loss after cluster wide power loss
Hi Uma, Thanks for the pointer. Your case sounds very similar. The main differences that I see are that in my case it happened on all 3 replicas and the power failure occurred merely seconds after the blocks were finalized. So I guess the question is whether HDFS can do anything to better recover from such situations. I'm also curious whether ext4 would be less susceptible than ext3. I will definitely look at enabling dfs.datanode.synconclose once we upgrade to a version of hdfs that has it. I would love to see some performance numbers if anyone has run them. Also appears that HBase is considering enabling it by default (cf. comments on HBase-5954). Dave On Tue, Jul 2, 2013 at 12:31 AM, Uma Maheswara Rao G mahesw...@huawei.comwrote: Hi Dave, Looks like your analysis is correct. I have faced similar issue some time back. See the discussion link: http://markmail.org/message/ruev3aa4x5zh2l4w#query:+page:1+mid:33gcdcu3coodkks3+state:results On sudden restarts, it can lost the OS filesystem edits. Similar thing happened in our case, i.e, after restart blocks were moved back to BeingWritten directory even though they were finalized. After restart they were marked as corrupt. You could set dfs.datanode.synconclose to true to avoid this sort of things, but that will degrade performance. Regards, Uma -Original Message- From: ddlat...@gmail.com [mailto:ddlat...@gmail.com] On Behalf Of Dave Latham Sent: 01 July 2013 16:08 To: hdfs-u...@hadoop.apache.org Cc: hdfs-...@hadoop.apache.org Subject: Re: data loss after cluster wide power loss Much appreciated, Suresh. Let me know if I can provide any more information or if you'd like me to open a JIRA. Dave On Mon, Jul 1, 2013 at 8:48 PM, Suresh Srinivas sur...@hortonworks.com wrote: Dave, Thanks for the detailed email. Sorry I did not read all the details you had sent earlier completely (on my phone). As you said, this is not related to data loss related to HBase log and hsync. I think you are right; the rename operation itself might not have hit the disk. I think we should either ensure metadata operation is synced on the datanode or handle it being reported as blockBeingWritten. Let me spend sometime to debug this issue. One surprising thing is, all the replicas were reported as blockBeingWritten. Regards, Suresh On Mon, Jul 1, 2013 at 6:03 PM, Dave Latham lat...@davelink.net wrote: (Removing hbase list and adding hdfs-dev list as this is pretty internal stuff). Reading through the code a bit: FSDataOutputStream.close calls DFSOutputStream.close calls DFSOutputStream.closeInternal - sets currentPacket.lastPacketInBlock = true - then calls DFSOutputStream.flushInternal - enqueues current packet - waits for ack BlockReceiver.run - if (lastPacketInBlock !receiver.finalized) calls FSDataset.finalizeBlock calls FSDataset.finalizeBlockInternal calls FSVolume.addBlock calls FSDir.addBlock calls FSDir.addBlock - renames block from blocksBeingWritten tmp dir to current dest dir This looks to me as I would expect a synchronous chain from a DFS client to moving the file from blocksBeingWritten to the current dir so that once the file is closed that it the block files would be in the proper directory - even if the contents of the file are still in the OS buffer rather than synced to disk. It's only after this moving of blocks that NameNode.complete file is called. There are several conditions and loops in there that I'm not certain this chain is fully reliable in all cases without a greater understanding of the code. 
Could it be the case that the rename operation itself is not synced and that ext3 lost the fact that the block files were moved? Or is there a bug in the close file logic that for some reason the block files are not always moved into place when a file is closed? Thanks for your patience, Dave On Mon, Jul 1, 2013 at 3:35 PM, Dave Latham lat...@davelink.net wrote: Thanks for the response, Suresh. I'm not sure that I understand the details properly. From my reading of HDFS-744 the hsync API would allow a client to make sure that at any point in time it's writes so far hit the disk. For example, for HBase it could apply a fsync after adding some edits to its WAL to ensure those edits are fully durable for a file which is still open. However, in this case the dfs file was closed and even renamed. Is it the case that even after a dfs file is closed and renamed that the data blocks would still not be synced and would still be stored by the datanode in blocksBeingWritten rather than in current? If that is case, would it be better for the NameNode not to reject replicas that are in blocksBeingWritten, especially if it doesn't have any other replicas available? Dave On Mon, Jul 1, 2013 at 3:16 PM, Suresh Srinivas sur...@hortonworks.comwrote: Yes
Fwd: Number format exception : For input string
I wrote a script as below:

Data = LOAD 'part-r-0' AS (session_start_gmt:long)
FilterData = FILTER Data BY session_start_gmt=1369546091667

I get the error below:

2013-07-01 22:48:06,510 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: For input string: 1369546091667

The detailed log says it is a number format exception. When I run

x = group Data ALL;
y = FOREACH x GENERATE MIN(Data.session_start_gmt) as min_session_start_time, MAX(Data.session_start_gmt) as max_session_start_time;

I get the output below:

(1369546091667,1369638849418)

When I just give session_start_gmt=1369546091 (3 digits less) it works fine. So, why does the first error occur when I compare with the exact value? Thanks
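[Editor's note] The thread contains no answer, but the "For input string" text closely matches the message thrown by Java's Integer.parseInt, which hints (this is a guess, not confirmed in the thread) that the literal is being parsed as a 32-bit int. The small Java illustration below shows why the 13-digit value would fail while the 10-digit one succeeds; if that is indeed the cause, writing the constant with a long suffix (1369546091667L) in the FILTER expression may be enough.

```java
public class ForInputString {
    public static void main(String[] args) {
        String literal = "1369546091667";   // the value from the Pig script

        // Parsing it as a long is fine; 1,369,546,091,667 fits in 64 bits.
        System.out.println(Long.parseLong(literal));

        // Parsing it as an int overflows 2^31 - 1 and throws
        // NumberFormatException with a "For input string: ..." message,
        // similar to the ERROR 1200 text in the thread. The shorter value
        // 1369546091 fits in an int, which would explain why it works.
        try {
            System.out.println(Integer.parseInt(literal));
        } catch (NumberFormatException e) {
            System.out.println(e.getMessage());
        }
    }
}
```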
RE: intermediate results files
Replication also has downstream effects: it puts pressure on the available network bandwidth and disk I/O bandwidth when the cluster is loaded. john From: Mohammad Tariq [mailto:donta...@gmail.com] Sent: Monday, July 01, 2013 6:35 PM To: user@hadoop.apache.org Subject: Re: intermediate results files I see. This difference is because of the fact that the next block of data will not be written to HDFS until the previous block was successfully written to 'all' the DNs selected for replication. This implies that higher RF means more time for the completion of a block write. Warm Regards, Tariq cloudfront.blogspot.comhttp://cloudfront.blogspot.com On Tue, Jul 2, 2013 at 4:39 AM, John Lilley john.lil...@redpoint.netmailto:john.lil...@redpoint.net wrote: I've seen some benchmarks where replication=1 runs at about 50MB/sec and replication=3 runs at about 33MB/sec, but I can't seem to find that now. John From: Mohammad Tariq [mailto:donta...@gmail.commailto:donta...@gmail.com] Sent: Monday, July 01, 2013 5:03 PM To: user@hadoop.apache.orgmailto:user@hadoop.apache.org Subject: Re: intermediate results files Hello John, IMHO, it doesn't matter. Your job will write the result just once. Replica creation is handled at the HDFS layer so it has nothing to with your job. Your job will still be writing at the same speed. Warm Regards, Tariq cloudfront.blogspot.comhttp://cloudfront.blogspot.com On Tue, Jul 2, 2013 at 4:16 AM, John Lilley john.lil...@redpoint.netmailto:john.lil...@redpoint.net wrote: If my reducers are going to create results that are temporary in nature (consumed by the next processing stage) is it recommended to use a replication factor 3 to improve performance? Thanks john
RE: YARN tasks and child processes
Devaraj, Thanks, this is also good information. But I was really asking if a child *process* that was spawned by a task can persist, in addition to the data. john From: Devaraj k [mailto:devara...@huawei.com] Sent: Monday, July 01, 2013 11:50 PM To: user@hadoop.apache.org Subject: RE: YARN tasks and child processes It is possible to persist the data by YARN task, you can choose whichever place you want to persist. If you choose to persist in HDFS, you need to take care deleting the data after using it. If you choose to write in local dir, you may write the data into the nm local dirs (i.e 'yarn.nodemanager.local-dirs' configuration) accordingly with the app id container id, and this will be cleaned up after the app completion. You need to make use of this persisted data before completing the application. Thanks Devaraj k From: John Lilley [mailto:john.lil...@redpoint.net] Sent: 02 July 2013 04:44 To: user@hadoop.apache.orgmailto:user@hadoop.apache.org Subject: YARN tasks and child processes Is it possible for a child process of a YARN task to persist after the task is complete? I am looking at an alternative to a YARN auxiliary process that may be simpler to implement, if I can have a task spawn a process that persists for some time after the task finishes. Thanks, John
Re: intermediate results files
Hi John! If your block is going to be replicated to three nodes, then in the default block placement policy, 2 of them will be on the same rack, and a third one will be on a different rack. Depending on the network bandwidths available intra-rack and inter-rack, writing with replication factor=3 may be almost as fast or (more likely) slower. With replication factor=2, the default block placement is to place them on different racks, so you wouldn't gain much. So you can 1. Either choose replication factor = 1 2. Change the block placement policy such that even with replication factor=2, it will choose two nodes in the same rack. HTH Ravi From: Devaraj k devara...@huawei.com To: user@hadoop.apache.org user@hadoop.apache.org Sent: Tuesday, July 2, 2013 1:00 AM Subject: RE: intermediate results files If you are 100% sure that all the node data nodes are available and healthy for that period of time, you can choose the replication factor as 1 or 3. Thanks Devaraj k From:John Lilley [mailto:john.lil...@redpoint.net] Sent: 02 July 2013 04:40 To: user@hadoop.apache.org Subject: RE: intermediate results files I’ve seen some benchmarks where replication=1 runs at about 50MB/sec and replication=3 runs at about 33MB/sec, but I can’t seem to find that now. John From:Mohammad Tariq [mailto:donta...@gmail.com] Sent: Monday, July 01, 2013 5:03 PM To: user@hadoop.apache.org Subject: Re: intermediate results files Hello John, IMHO, it doesn't matter. Your job will write the result just once. Replica creation is handled at the HDFS layer so it has nothing to with your job. Your job will still be writing at the same speed. Warm Regards, Tariq cloudfront.blogspot.com On Tue, Jul 2, 2013 at 4:16 AM, John Lilley john.lil...@redpoint.net wrote: If my reducers are going to create results that are temporary in nature (consumed by the next processing stage) is it recommended to use a replication factor 3 to improve performance? Thanks john
HDFS file section rewrite
I'm sure this has been asked a zillion times, so please just point me to the JIRA comments: is there a feature underway to allow for re-writing of HDFS file sections? Thanks John
typical JSON data sets
I would like to hear your experiences working with large JSON data sets, specifically: 1) How large is each JSON document? 2) Do they tend to be a single JSON doc per file, or multiples per file? 3) Do the JSON schemas change over time? 4) Are there interesting public data sets you would recommend for experiment? Thanks John
Containers and CPU
I have YARN tasks that benefit from multicore scaling. However, they don't *always* use more than one core. I would like to allocate containers based only on memory, and let each task use as many cores as needed, without allocating exclusive CPU slots in the scheduler. For example, on an 8-core node with 16GB memory, I'd like to be able to run 3 tasks each consuming 4GB memory and each using as much CPU as they like. Is this the default behavior if I don't specify CPU restrictions to the scheduler? Thanks John
RE: some idea about the Data Compression
Geelong, 1. These files will probably be some standard format like .gz or .bz2 or .zip. In that case, pick an appropriate InputFormat. See e.g. http://cotdp.com/2012/07/hadoop-processing-zip-files-in-mapreduce/, http://stackoverflow.com/questions/14497572/reading-gzipped-file-in-hadoop-using-custom-recordreader 2. Generally, compression is a Good Thing and will improve performance. But only if you use a fast compressor like LZO or Snappy. Gzip, ZIP, BZ2, etc are no good for this. You also need to ensure that your compressed files are splittable if you are going to create a single file that will be processed by a later MR stage, for this a SequenceFile is helpful. For typical intermediate outputs it doesn't matter as much because you will have a folder of file parts and these are pre split in some sense. Once upon a time, LZO compression was a thing that you had to install as a separate component, but I think the modern distros include it. See for example: http://kickstarthadoop.blogspot.com/2012/02/use-compression-with-mapreduce.html , http://blog.cloudera.com/blog/2009/05/10-mapreduce-tips/, http://my.safaribooksonline.com/book/software-engineering-and-development/9781449328917/compression/id3689058, https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-4/compression (section 4.2 in the Elephant book). John From: Geelong Yao [mailto:geelong...@gmail.com] Sent: Thursday, June 20, 2013 12:30 AM To: user@hadoop.apache.org Subject: some idea about the Data Compression Hi , everyone I am working on the data compression 1.data compression before the raw data were uploaded into HDFS. 2.data compression while processing in Hadoop to reduce the pressure on IO. Can anyone give me some ideas on above 2 directions BRs Geelong -- From Good To Great
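[Editor's note] To make point 2 concrete, here is a hedged sketch of turning on Snappy for map output and writing block-compressed SequenceFile job output, which keeps the output splittable for a later MR stage. It assumes the Snappy native libraries are installed on the cluster and uses the Hadoop 2.x property and class names.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class CompressionSetup {
    public static Job configure(Configuration conf) throws Exception {
        // Compress intermediate map output with a fast codec.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec",
                SnappyCodec.class, CompressionCodec.class);

        Job job = Job.getInstance(conf, "compressed-output-example");

        // Write the final output as a block-compressed SequenceFile,
        // which remains splittable for the next MR stage.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);
        SequenceFileOutputFormat.setOutputCompressionType(
                job, SequenceFile.CompressionType.BLOCK);
        return job;
    }
}
```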
Custom JoinRecordReader class
Hi all, I would like some help/direction on implementing a custom join class. I believe this is the way to address my task at hand, which is given 2 matrices in SequenceFile format, I wish to run operations on all pairs of rows between them. The rows may not be equal in number. The actual operations will be taken care of in Mahout. I wrote a custom class working off of InnerJoinRecordReader and OuterJoinRecordReader but they of course always get fed and thus return pairs of keys that match. How can I get a return of all key pairs? Or does this go completely against the hadoop map-reduce framework? Thanks in advance for any input.
RE: Containers and CPU
I believe this is the default behavior. By default, only memory limit on resources is enforced. The capacity scheduler will use DefaultResourceCalculator to compute resource allocation for containers by default, which also does not take CPU into account. -Chuan From: John Lilley [mailto:john.lil...@redpoint.net] Sent: Tuesday, July 02, 2013 8:57 AM To: user@hadoop.apache.org Subject: Containers and CPU I have YARN tasks that benefit from multicore scaling. However, they don't *always* use more than one core. I would like to allocate containers based only on memory, and let each task use as many cores as needed, without allocating exclusive CPU slots in the scheduler. For example, on an 8-core node with 16GB memory, I'd like to be able to run 3 tasks each consuming 4GB memory and each using as much CPU as they like. Is this the default behavior if I don't specify CPU restrictions to the scheduler? Thanks John
Re: Containers and CPU
CPU limits are only enforced if cgroups is turned on. With cgroups on, they are only limited when there is contention, in which case tasks are given CPU time in proportion to the number of cores requested for/allocated to them. Does that make sense? -Sandy On Tue, Jul 2, 2013 at 9:50 AM, Chuan Liu chuan...@microsoft.com wrote: I believe this is the default behavior. By default, only memory limit on resources is enforced. The capacity scheduler will use DefaultResourceCalculator to compute resource allocation for containers by default, which also does not take CPU into account. ** ** -Chuan ** ** *From:* John Lilley [mailto:john.lil...@redpoint.net] *Sent:* Tuesday, July 02, 2013 8:57 AM *To:* user@hadoop.apache.org *Subject:* Containers and CPU ** ** I have YARN tasks that benefit from multicore scaling. However, they don’t **always** use more than one core. I would like to allocate containers based only on memory, and let each task use as many cores as needed, without allocating exclusive CPU “slots” in the scheduler. For example, on an 8-core node with 16GB memory, I’d like to be able to run 3 tasks each consuming 4GB memory and each using as much CPU as they like. Is this the default behavior if I don’t specify CPU restrictions to the scheduler? Thanks John ** ** ** **
RE: Yarn HDFS and Yarn Exceptions when processing larger datasets.
Blah blah, Can you build and run the DistributedShell example? If it does not run correctly this would tend to implicate your configuration. If it run correctly then your code is suspect. John From: blah blah [mailto:tmp5...@gmail.com] Sent: Tuesday, June 25, 2013 6:09 PM To: user@hadoop.apache.org Subject: Yarn HDFS and Yarn Exceptions when processing larger datasets. Hi All First let me excuse for the poor thread title but I have no idea how to express the problem in one sentence. I have implemented new Application Master with the use of Yarn. I am using old Yarn development version. Revision 1437315, from 2013-01-23 (SNAPSHOT 3.0.0). I can not update to current trunk version, as prototype deadline is soon, and I don't have time to include Yarn API changes. Currently I execute experiments in pseudo-distributed mode, I use guava version 14.0-rc1. I have a problem with Yarn's and HDFS Exceptions for larger datasets. My AM works fine and I can execute it without a problem for a debug dataset (1MB size). But when I increase the size of input to 6.8 MB, I am getting the following exceptions: AM_Exceptions_Stack Exception in thread Thread-3 java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135) at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:77) at org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:194) at org.tudelft.ludograph.app.AppMasterContainerRequester.sendContainerAskToRM(AppMasterContainerRequester.java:219) at org.tudelft.ludograph.app.AppMasterContainerRequester.run(AppMasterContainerRequester.java:315) at java.lang.Thread.run(Thread.java:662) Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: Response is null.; Host Details : local host is: linux-ljc5.site/127.0.0.1http://127.0.0.1; destination host is: 0.0.0.0:8030; at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212) at $Proxy10.allocate(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75) ... 4 more Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Response is null.; Host Details : local host is: linux-ljc5.site/127.0.0.1http://127.0.0.1; destination host is: 0.0.0.0:8030; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760) at org.apache.hadoop.ipc.Client.call(Client.java:1240) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202) ... 6 more Caused by: java.io.IOException: Response is null. at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:950) at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844) Container_Exception Exception in thread org.apache.hadoop.hdfs.SocketCache@6da0d866mailto:org.apache.hadoop.hdfs.SocketCache@6da0d866 java.lang.NoSuchMethodError: com.google.common.collect.LinkedListMultimap.values()Ljava/util/List; at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:257) at org.apache.hadoop.hdfs.SocketCache.access$100(SocketCache.java:45) at org.apache.hadoop.hdfs.SocketCache$1.run(SocketCache.java:126) at java.lang.Thread.run(Thread.java:662) As I said this problem does not occur for the 1MB input. For the 6MB input nothing is changed except the input dataset. 
Now a little bit of what am I doing, to give you the context of the problem. My AM starts N (debug 4) containers and each container reads its input data part. When this process is finished I am exchanging parts of input between containers (exchanging IDs of input structures, to provide means for communication between data structures). During the process of exchanging IDs these exceptions occur. I start Netty Server/Client on each container and I use ports 12000-12099 as mean of communicating these IDs. Any help will be greatly appreciated. Sorry for any typos and if the explanation is not clear just ask for any details you are interested in. Currently it is after 2 AM I hope this will be a valid excuse. regards tmp
Re: HDFS file section rewrite
HDFS only supports regular writes and append. Random write is not supported. I do not know of any feature/jira that is underway to support this feature. On Tue, Jul 2, 2013 at 9:01 AM, John Lilley john.lil...@redpoint.netwrote: I’m sure this has been asked a zillion times, so please just point me to the JIRA comments: is there a feature underway to allow for re-writing of HDFS file sections? Thanks John ** ** -- http://hortonworks.com/download/
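[Editor's note] A brief hedged illustration of the two write modes Suresh describes, using the public FileSystem API. The path is illustrative, and append support may need to be enabled (or may behave differently) depending on the HDFS version in use.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteAndAppend {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path("/tmp/example.txt");   // illustrative path

        // Regular write: the file is written front to back and then closed.
        try (FSDataOutputStream out = fs.create(p, true)) {
            out.writeBytes("first line\n");
        }

        // Append: new bytes can only be added at the end of the file.
        // There is no API to seek back and overwrite an earlier section.
        try (FSDataOutputStream out = fs.append(p)) {
            out.writeBytes("second line\n");
        }
    }
}
```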
RE: YARN tasks and child processes
Thanks, that answers my question. I am trying to explore alternatives to a YARN auxiliary service, but apparently this isn’t an option. John From: Ravi Prakash [mailto:ravi...@ymail.com] Sent: Tuesday, July 02, 2013 9:55 AM To: user@hadoop.apache.org Subject: Re: YARN tasks and child processes Nopes! The node manager kills the entire process tree when the task reports that it is done. Now if you were able to figure out a way for one of the children to break out of the process tree, maybe? However your approach is obviously not recommended. You would be stealing from the resources that YARN should have available. From: John Lilley john.lil...@redpoint.netmailto:john.lil...@redpoint.net To: user@hadoop.apache.orgmailto:user@hadoop.apache.org user@hadoop.apache.orgmailto:user@hadoop.apache.org Sent: Tuesday, July 2, 2013 10:41 AM Subject: RE: YARN tasks and child processes Devaraj, Thanks, this is also good information. But I was really asking if a child *process* that was spawned by a task can persist, in addition to the data. john From: Devaraj k [mailto:devara...@huawei.com] Sent: Monday, July 01, 2013 11:50 PM To: user@hadoop.apache.orgmailto:user@hadoop.apache.org Subject: RE: YARN tasks and child processes It is possible to persist the data by YARN task, you can choose whichever place you want to persist. If you choose to persist in HDFS, you need to take care deleting the data after using it. If you choose to write in local dir, you may write the data into the nm local dirs (i.e ‘yarn.nodemanager.local-dirs’ configuration) accordingly with the app id container id, and this will be cleaned up after the app completion. You need to make use of this persisted data before completing the application. Thanks Devaraj k From: John Lilley [mailto:john.lil...@redpoint.net] Sent: 02 July 2013 04:44 To: user@hadoop.apache.orgmailto:user@hadoop.apache.org Subject: YARN tasks and child processes Is it possible for a child process of a YARN task to persist after the task is complete? I am looking at an alternative to a YARN auxiliary process that may be simpler to implement, if I can have a task spawn a process that persists for some time after the task finishes. Thanks, John
RE: Containers and CPU
Sandy, Sorry, I don't completely follow. When you say with cgroups on, is that an attribute of the AM, the Scheduler, or the Site/RM? In other words is it site-wide or something that my application can control? With cgroups on, is there still a way to get my desired behavior? I'd really like all tasks to have access to all CPU cores and simply fight it out in the OS thread scheduler. Thanks, john From: Sandy Ryza [mailto:sandy.r...@cloudera.com] Sent: Tuesday, July 02, 2013 11:56 AM To: user@hadoop.apache.org Subject: Re: Containers and CPU CPU limits are only enforced if cgroups is turned on. With cgroups on, they are only limited when there is contention, in which case tasks are given CPU time in proportion to the number of cores requested for/allocated to them. Does that make sense? -Sandy On Tue, Jul 2, 2013 at 9:50 AM, Chuan Liu chuan...@microsoft.commailto:chuan...@microsoft.com wrote: I believe this is the default behavior. By default, only memory limit on resources is enforced. The capacity scheduler will use DefaultResourceCalculator to compute resource allocation for containers by default, which also does not take CPU into account. -Chuan From: John Lilley [mailto:john.lil...@redpoint.netmailto:john.lil...@redpoint.net] Sent: Tuesday, July 02, 2013 8:57 AM To: user@hadoop.apache.orgmailto:user@hadoop.apache.org Subject: Containers and CPU I have YARN tasks that benefit from multicore scaling. However, they don't *always* use more than one core. I would like to allocate containers based only on memory, and let each task use as many cores as needed, without allocating exclusive CPU slots in the scheduler. For example, on an 8-core node with 16GB memory, I'd like to be able to run 3 tasks each consuming 4GB memory and each using as much CPU as they like. Is this the default behavior if I don't specify CPU restrictions to the scheduler? Thanks John
RE: Containers and CPU
To explain my reasoning, suppose that I have an application that performs some CPU-intensive calculation, and can scale to multiple cores internally, but it doesn't need those cores all the time because the CPU-intensive phase is only a part of the overall computation. I'm not sure I understand cgroups' CPU control - does it statically mask cores available to processes, or does it set up a prioritization for access to all available cores? Thanks, John From: John Lilley [mailto:john.lil...@redpoint.net] Sent: Tuesday, July 02, 2013 1:12 PM To: user@hadoop.apache.org Subject: RE: Containers and CPU Sandy, Sorry, I don't completely follow. When you say with cgroups on, is that an attribute of the AM, the Scheduler, or the Site/RM? In other words is it site-wide or something that my application can control? With cgroups on, is there still a way to get my desired behavior? I'd really like all tasks to have access to all CPU cores and simply fight it out in the OS thread scheduler. Thanks, john From: Sandy Ryza [mailto:sandy.r...@cloudera.com] Sent: Tuesday, July 02, 2013 11:56 AM To: user@hadoop.apache.orgmailto:user@hadoop.apache.org Subject: Re: Containers and CPU CPU limits are only enforced if cgroups is turned on. With cgroups on, they are only limited when there is contention, in which case tasks are given CPU time in proportion to the number of cores requested for/allocated to them. Does that make sense? -Sandy On Tue, Jul 2, 2013 at 9:50 AM, Chuan Liu chuan...@microsoft.commailto:chuan...@microsoft.com wrote: I believe this is the default behavior. By default, only memory limit on resources is enforced. The capacity scheduler will use DefaultResourceCalculator to compute resource allocation for containers by default, which also does not take CPU into account. -Chuan From: John Lilley [mailto:john.lil...@redpoint.netmailto:john.lil...@redpoint.net] Sent: Tuesday, July 02, 2013 8:57 AM To: user@hadoop.apache.orgmailto:user@hadoop.apache.org Subject: Containers and CPU I have YARN tasks that benefit from multicore scaling. However, they don't *always* use more than one core. I would like to allocate containers based only on memory, and let each task use as many cores as needed, without allocating exclusive CPU slots in the scheduler. For example, on an 8-core node with 16GB memory, I'd like to be able to run 3 tasks each consuming 4GB memory and each using as much CPU as they like. Is this the default behavior if I don't specify CPU restrictions to the scheduler? Thanks John
HDFS temporary file locations
Is there any convention for clients/applications wishing to use temporary file space in HDFS? For example, my application wants to: 1) Load data into some temporary space in HDFS as an external client 2) Run an AM, which produces HDFS output (also in the temporary space) 3) Read the data out of the HDFS temporary space 4) Remove the temporary files Thanks John
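[Editor's note] A hedged sketch of the four-step workflow described above using the FileSystem API; the staging directory layout and file names are assumptions for illustration, not an established Hadoop convention.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TempHdfsWorkflow {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // 1) Load input into an application-chosen temporary area (assumed layout).
        Path staging = new Path("/tmp/myapp/run-0001");
        fs.mkdirs(staging);
        fs.copyFromLocalFile(new Path("/local/input.dat"),
                             new Path(staging, "input.dat"));

        // 2) The AM would run here, writing its output under the same area,
        //    e.g. new Path(staging, "output").

        // 3) Read the results back out (assumed file name).
        Path result = new Path(staging, "output/part-00000");
        if (fs.exists(result)) {
            System.out.println("result size = " + fs.getFileStatus(result).getLen());
        }

        // 4) Remove the temporary files when done (recursive delete).
        fs.delete(staging, true);
    }
}
```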
Re: Containers and CPU
Use of cgroups for controlling CPU is off by default, but can be turned on as a nodemanager configuration with yarn.nodemanager.linux-container-executor.resources-handler.class. So it is site-wide. If you want tasks to purely fight it out in the OS thread scheduler, simply don't change from the default. Even with cgroups on, all tasks will have access to all CPU cores. We don't do any pinning of tasks to cores. If a task is requested with a single vcore and placed on an otherwise empty machine with 8 cores, it will have access to all 8 cores. If 3 other tasks that requested a single vcore are later placed on the same node, and all tasks are using as much CPU as they can get their hands on, then each of the tasks will get 2 cores of CPU-time. On Tue, Jul 2, 2013 at 12:12 PM, John Lilley john.lil...@redpoint.netwrote: Sandy, Sorry, I don’t completely follow. When you say “with cgroups on”, is that an attribute of the AM, the Scheduler, or the Site/RM? In other words is it site-wide or something that my application can control? With cgroups on, is there still a way to get my desired behavior? I’d really like all tasks to have access to all CPU cores and simply fight it out in the OS thread scheduler. Thanks, john ** ** *From:* Sandy Ryza [mailto:sandy.r...@cloudera.com] *Sent:* Tuesday, July 02, 2013 11:56 AM *To:* user@hadoop.apache.org *Subject:* Re: Containers and CPU ** ** CPU limits are only enforced if cgroups is turned on. With cgroups on, they are only limited when there is contention, in which case tasks are given CPU time in proportion to the number of cores requested for/allocated to them. Does that make sense? ** ** -Sandy ** ** On Tue, Jul 2, 2013 at 9:50 AM, Chuan Liu chuan...@microsoft.com wrote:* *** I believe this is the default behavior. By default, only memory limit on resources is enforced. The capacity scheduler will use DefaultResourceCalculator to compute resource allocation for containers by default, which also does not take CPU into account. -Chuan *From:* John Lilley [mailto:john.lil...@redpoint.net] *Sent:* Tuesday, July 02, 2013 8:57 AM *To:* user@hadoop.apache.org *Subject:* Containers and CPU I have YARN tasks that benefit from multicore scaling. However, they don’t **always** use more than one core. I would like to allocate containers based only on memory, and let each task use as many cores as needed, without allocating exclusive CPU “slots” in the scheduler. For example, on an 8-core node with 16GB memory, I’d like to be able to run 3 tasks each consuming 4GB memory and each using as much CPU as they like. Is this the default behavior if I don’t specify CPU restrictions to the scheduler? Thanks John ** **
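[Editor's note] For completeness, a hedged sketch of how an ApplicationMaster asks for memory and vcores, written against the AMRMClient API roughly as it looks in later Hadoop 2.x releases (package and class names shifted between the alpha and GA releases, so check your version). As discussed above, with the default DefaultResourceCalculator only the memory figure is enforced; the vcore number matters for scheduling and cgroup shares. The priority and sizes are illustrative.

```java
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class AskForContainers {
    public static void addRequests(AMRMClient<ContainerRequest> amrmClient) {
        // 4 GB of memory and 1 vcore per container.
        Resource capability = Resource.newInstance(4096, 1);
        Priority priority = Priority.newInstance(0);   // illustrative priority

        // Ask for three such containers, e.g. on the 16 GB node from the thread.
        for (int i = 0; i < 3; i++) {
            amrmClient.addContainerRequest(
                    new ContainerRequest(capability, null, null, priority));
        }
    }
}
```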
RE: Containers and CPU
Sandy, Thanks, I think I understand. So it only makes a difference if cgroups is on AND the AM requests multiple cores? E.g. if each task wants 4 cores the RM would only allow two containers per 8-core node? John From: Sandy Ryza [mailto:sandy.r...@cloudera.com] Sent: Tuesday, July 02, 2013 1:26 PM To: user@hadoop.apache.org Subject: Re: Containers and CPU Use of cgroups for controlling CPU is off by default, but can be turned on as a nodemanager configuration with yarn.nodemanager.linux-container-executor.resources-handler.class. So it is site-wide. If you want tasks to purely fight it out in the OS thread scheduler, simply don't change from the default. Even with cgroups on, all tasks will have access to all CPU cores. We don't do any pinning of tasks to cores. If a task is requested with a single vcore and placed on an otherwise empty machine with 8 cores, it will have access to all 8 cores. If 3 other tasks that requested a single vcore are later placed on the same node, and all tasks are using as much CPU as they can get their hands on, then each of the tasks will get 2 cores of CPU-time. On Tue, Jul 2, 2013 at 12:12 PM, John Lilley john.lil...@redpoint.netmailto:john.lil...@redpoint.net wrote: Sandy, Sorry, I don't completely follow. When you say with cgroups on, is that an attribute of the AM, the Scheduler, or the Site/RM? In other words is it site-wide or something that my application can control? With cgroups on, is there still a way to get my desired behavior? I'd really like all tasks to have access to all CPU cores and simply fight it out in the OS thread scheduler. Thanks, john From: Sandy Ryza [mailto:sandy.r...@cloudera.commailto:sandy.r...@cloudera.com] Sent: Tuesday, July 02, 2013 11:56 AM To: user@hadoop.apache.orgmailto:user@hadoop.apache.org Subject: Re: Containers and CPU CPU limits are only enforced if cgroups is turned on. With cgroups on, they are only limited when there is contention, in which case tasks are given CPU time in proportion to the number of cores requested for/allocated to them. Does that make sense? -Sandy On Tue, Jul 2, 2013 at 9:50 AM, Chuan Liu chuan...@microsoft.commailto:chuan...@microsoft.com wrote: I believe this is the default behavior. By default, only memory limit on resources is enforced. The capacity scheduler will use DefaultResourceCalculator to compute resource allocation for containers by default, which also does not take CPU into account. -Chuan From: John Lilley [mailto:john.lil...@redpoint.netmailto:john.lil...@redpoint.net] Sent: Tuesday, July 02, 2013 8:57 AM To: user@hadoop.apache.orgmailto:user@hadoop.apache.org Subject: Containers and CPU I have YARN tasks that benefit from multicore scaling. However, they don't *always* use more than one core. I would like to allocate containers based only on memory, and let each task use as many cores as needed, without allocating exclusive CPU slots in the scheduler. For example, on an 8-core node with 16GB memory, I'd like to be able to run 3 tasks each consuming 4GB memory and each using as much CPU as they like. Is this the default behavior if I don't specify CPU restrictions to the scheduler? Thanks John
Re: Yarn HDFS and Yarn Exceptions when processing larger datasets.
Hi Just a quick short reply (tomorrow is my prototype presentation). @Omkar Joshi - RM port 8030 already running when I start my AM - I'll do the client thread size AM - Only AM communicates with RM - RM/NM no exceptions there (as far as I remember will check later [sorry]) Furthermore in fully distributed mode AM doesn't throw exceptions anymore, only Containers. @John Lilley Yes the problem is with my code (I don't want to imply that it is YARN's problem). I have successfully run Distributed Shell and YARN's MapReduce jobs with much bigger datasets than 1mb ;). I just don't know where to start looking for the problem, especially for the Containers exceptions as they occur after my containers are done with HDFS (until they store final output). The only idea I have is that these exceptions occur during Containers communication. Instead of sending multiple messages my containers aggregate all messages per container into one big message (the biggest around 8k-10k chars), thus each container sends only 1 message to other container (which includes multiple messages). I don't know if this information is important, but I am planning to see what will happen if I partition the messages (1024). I got this idea from the Containers exception org.apache.hadoop.hdfs.SocketCache, I am using SocketChannels to send these big messages, so maybe I am creating some Socket conflict . regards tmp 2013/7/2 John Lilley john.lil...@redpoint.net Blah blah, Can you build and run the DistributedShell example? If it does not run correctly this would tend to implicate your configuration. If it run correctly then your code is suspect. John ** ** ** ** *From:* blah blah [mailto:tmp5...@gmail.com] *Sent:* Tuesday, June 25, 2013 6:09 PM *To:* user@hadoop.apache.org *Subject:* Yarn HDFS and Yarn Exceptions when processing larger datasets. ** ** Hi All First let me excuse for the poor thread title but I have no idea how to express the problem in one sentence. I have implemented new Application Master with the use of Yarn. I am using old Yarn development version. Revision 1437315, from 2013-01-23 (SNAPSHOT 3.0.0). I can not update to current trunk version, as prototype deadline is soon, and I don't have time to include Yarn API changes. Currently I execute experiments in pseudo-distributed mode, I use guava version 14.0-rc1. I have a problem with Yarn's and HDFS Exceptions for larger datasets. My AM works fine and I can execute it without a problem for a debug dataset (1MB size). 
But when I increase the size of the input to 6.8 MB, I am getting the following exceptions:

AM_Exceptions_Stack

Exception in thread "Thread-3" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
    at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:77)
    at org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:194)
    at org.tudelft.ludograph.app.AppMasterContainerRequester.sendContainerAskToRM(AppMasterContainerRequester.java:219)
    at org.tudelft.ludograph.app.AppMasterContainerRequester.run(AppMasterContainerRequester.java:315)
    at java.lang.Thread.run(Thread.java:662)
Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: Response is null.; Host Details : local host is: linux-ljc5.site/127.0.0.1; destination host is: 0.0.0.0:8030;
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
    at $Proxy10.allocate(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75)
    ... 4 more
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Response is null.; Host Details : local host is: linux-ljc5.site/127.0.0.1; destination host is: 0.0.0.0:8030;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760)
    at org.apache.hadoop.ipc.Client.call(Client.java:1240)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    ... 6 more
Caused by: java.io.IOException: Response is null.
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:950)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)

Container_Exception

Exception in thread "org.apache.hadoop.hdfs.SocketCache@6da0d866" java.lang.NoSuchMethodError: com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
    at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:257)
    at org.apache.hadoop.hdfs.SocketCache.access$100(SocketCache.java:45)
    at org.apache.hadoop.hdfs.SocketCache$1.run(SocketCache.java:126)
    at java.lang.Thread.run(Thread.java:662)

As I said this
RE: HDFS file section rewrite
I found this: https://issues.apache.org/jira/browse/HADOOP-5215 Doesn't seem to have attracted much interest. John

From: Suresh Srinivas [mailto:sur...@hortonworks.com] Sent: Tuesday, July 02, 2013 1:03 PM To: hdfs-u...@hadoop.apache.org Subject: Re: HDFS file section rewrite

HDFS only supports regular writes and append. Random write is not supported. I do not know of any feature/jira that is underway to support this feature.

On Tue, Jul 2, 2013 at 9:01 AM, John Lilley john.lil...@redpoint.net wrote:

I'm sure this has been asked a zillion times, so please just point me to the JIRA comments: is there a feature underway to allow for re-writing of HDFS file sections? Thanks John

-- http://hortonworks.com/download/
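(To illustrate what is and is not supported: a minimal sketch of appending to an existing file, which is the only way to add data to a file already in HDFS. The path is just an example, and append must be enabled in the HDFS release in use; there is no equivalent API for rewriting a byte range in place.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Existing bytes cannot be rewritten in place; new data can only be
        // added at the end of the file.
        FSDataOutputStream out = fs.append(new Path("/user/hadoop/existing.txt"));
        out.writeBytes("appended record\n");
        out.close();
        fs.close();
    }
}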
Re: Containers and CPU
That's correct. -Sandy

On Tue, Jul 2, 2013 at 12:28 PM, John Lilley john.lil...@redpoint.net wrote:

Sandy, Thanks, I think I understand. So it only makes a difference if cgroups is on AND the AM requests multiple cores? E.g. if each task wants 4 cores the RM would only allow two containers per 8-core node? John
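(For reference, a minimal sketch of an AM asking for a multi-vcore container; it assumes the later, Hadoop 2.2-era AMRMClient API, so class and package names may differ in the 2.0.x alphas discussed here.)

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class VcoreRequestSketch {
    public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(new YarnConfiguration());
        rmClient.start();
        // In a real AM, registerApplicationMaster(...) is called before any requests.
        // Ask for 4096 MB and 4 vcores. Without cgroups, the vcore count only
        // affects how many containers the scheduler packs onto a node; with
        // cgroups on, it also bounds the container's CPU share under contention.
        Resource capability = Resource.newInstance(4096, 4);
        rmClient.addContainerRequest(
            new ContainerRequest(capability, null, null, Priority.newInstance(0)));
        // The request is actually sent to the RM on the next allocate() heartbeat.
    }
}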
Re: Exception in createBlockOutputStream - poss firewall issue
Hi John, exactly what I was thinking; however, I haven't found a way to do that. If I ever have time I'll trawl through the code, but I've managed to avoid the issue by placing both machines inside the firewall. Regards Robin Sent from my iPhone

On 2 Jul 2013, at 19:48, John Lilley john.lil...@redpoint.net wrote:

I don't know the answer... but if it is possible to make the DNs report a domain name instead of an IP quad it may help. John

From: Robin East [mailto:robin.e...@xense.co.uk] Sent: Thursday, June 27, 2013 12:18 AM To: user@hadoop.apache.org Subject: Re: Exception in createBlockOutputStream - poss firewall issue

OK, I should have added that the external address of the cluster is NATed to the internal address. The internal address yyy.yyy.yyy.yyy is not a routable address for the client. The client can reach the namenode on xxx.xxx.xxx.xxx, but I guess the data node must advertise itself using the internal address (yyy.yyy.yyy.yyy). What is the way round this (not an uncommon problem in secure environments)? Robin Sent from my iPad

On 27 Jun 2013, at 03:41, Harsh J ha...@cloudera.com wrote:

Clients will read/write data to the DNs directly. DNs serve on ports 50010 and 50020 by default. Please open up these ports, aside from the NN's RPC ports, to be able to read/write data.

On Thu, Jun 27, 2013 at 2:23 AM, Robin East robin.e...@xense.co.uk wrote:

I have a single-node hadoop cluster set up behind a firewall and am trying to create files using a java program outside the firewall, and I get the exception below. The java program works fine inside the firewall. The ip address for the single-node cluster is xxx.xxx.xxx.xxx; however, it appears that in createBlockOutputStream the client thinks the data node is at ip yyy.yyy.yyy.yyy (the internal address of the cluster), which is not accessible. The java code looks like this (using hadoop 1.1.2):

private static void createHdfsFile() throws IOException {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://" + hdfsHost + ":9000");
    FileSystem hdfs = FileSystem.get(conf);
    System.out.println("HDFS Working Directory: " + hdfs.getWorkingDirectory().toString());
    FSDataOutputStream os = hdfs.create(new Path("/user/hadoop/test2.txt"));
    os.writeChars("Example text\n for a hadoop write call\n\ntesting\n");
    os.close();
}

Any idea how I can get this to work?
HDFS Working Directory: hdfs://xxx.xxx.xxx.xxx:9000/user/z
Jun 26, 2013 8:08:52 PM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream createBlockOutputStream
INFO: Exception in createBlockOutputStream yyy.yyy.yyy.yyy:50010 java.net.ConnectException: Connection timed out: no further information
Jun 26, 2013 8:08:52 PM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Abandoning block blk_4933973859208379842_1028
Jun 26, 2013 8:08:53 PM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Excluding datanode yyy.yyy.yyy.yyy:50010
Jun 26, 2013 8:08:53 PM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer run
WARNING: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/test2.txt could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
    at org.apache.hadoop.ipc.Client.call(Client.java:1107)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at $Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
    at
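(One approach along the lines John suggests: have the client connect to datanodes by hostname rather than the NAT-hidden internal IP the namenode reports. This is a sketch, not something verified in this thread; it assumes the dfs.client.use.datanode.hostname property from HDFS-3150 is available in the release being used, and that the datanode hostnames resolve to routable addresses on the client side.)

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ExternalClientWriteSketch {
    public static void main(String[] args) throws Exception {
        String hdfsHost = args[0]; // namenode hostname, as in the snippet above
        Configuration conf = new Configuration();
        // Ask the client to contact datanodes by hostname instead of the
        // internal IP address the namenode reports for them.
        conf.setBoolean("dfs.client.use.datanode.hostname", true);
        FileSystem hdfs = FileSystem.get(URI.create("hdfs://" + hdfsHost + ":9000"), conf);
        FSDataOutputStream os = hdfs.create(new Path("/user/hadoop/test2.txt"));
        os.writeChars("Example text\n");
        os.close();
        hdfs.close();
    }
}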
Unable to load native-hadoop library
Hello, I have a Hadoop 2.0.5 Alpha cluster. When I execute any Hadoop command, I see the following message:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Is it in the lib/native folder? How do I configure the system to load it? Thanks, Chui-hui
Re: Unable to load native-hadoop library
Take a look here: http://search-hadoop.com/m/FXOOOTJruq1

On Tue, Jul 2, 2013 at 3:25 PM, Chui-Hui Chiu cch...@tigers.lsu.edu wrote:

Hello, I have a Hadoop 2.0.5 Alpha cluster. When I execute any Hadoop command, I see the following message:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Is it in the lib/native folder? How do I configure the system to load it? Thanks, Chui-hui
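(A small way to verify whether the native library is actually being picked up, assuming the Hadoop jars are on the classpath; libhadoop normally lives under $HADOOP_HOME/lib/native and is located via java.library.path.)

import org.apache.hadoop.util.NativeCodeLoader;

public class NativeLoadCheck {
    public static void main(String[] args) {
        // Prints true only if libhadoop was found on java.library.path and loaded.
        System.out.println("native hadoop loaded: " + NativeCodeLoader.isNativeCodeLoaded());
    }
}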
What's Yarn?
Hi Dear all, I just found this by chance; maybe you all know it already, but I am sharing it here again. Yet Another Resource Negotiator (YARN). From: http://adtmag.com/blogs/watersworks/2012/08/apache-yarn-promotion.aspx
Parameter 'yarn.nodemanager.resource.cpu-cores' does not work
Hi, With Hadoop 2.0.4-alpha, yarn.nodemanager.resource.cpu-cores does not work for me:

1. The performance of the same terasort job does not change, even after increasing or decreasing the value of 'yarn.nodemanager.resource.cpu-cores' in yarn-site.xml and restarting the yarn cluster.

2. Even if I set the values of both 'yarn.nodemanager.resource.cpu-cores' and 'yarn.nodemanager.vcores-pcores-ratio' to 0, the MR job still completes without any exception, but the expected behavior should be that no CPU can be assigned to any container, and therefore no job can be executed on the cluster. Right?

Thanks!
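(One way to check whether the setting took effect is to ask the RM what each NodeManager registered. A sketch, assuming the later, Hadoop 2.2-era YarnClient API, which may differ from what 2.0.4-alpha exposes.)

import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class NodeVcoreCheck {
    public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();
        // Each node's advertised capability should reflect the configured vcore count.
        for (NodeReport node : yarnClient.getNodeReports(NodeState.RUNNING)) {
            System.out.println(node.getNodeId() + " memory=" + node.getCapability().getMemory()
                + "MB vcores=" + node.getCapability().getVirtualCores());
        }
        yarnClient.stop();
    }
}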