Re: HDFS upgrade problem of fsImage

2013-11-20 Thread Azuryy Yu
No, I didn't do any upgrade before this.

I just want to do a rolling upgrade of HDFS to hadoop-2.2.0; any further ideas?
Thanks.


On Thu, Nov 21, 2013 at 1:28 PM, Vinayakumar B wrote:

> Looks like you have already upgraded the cluster, and you are trying to
> upgrade one more time.
>
>
> -Original Message-
> From: Azuryy Yu [mailto:azury...@gmail.com]
> Sent: 21 November 2013 09:49
> To: hdfs-dev@hadoop.apache.org; u...@hadoop.apache.org
> Subject: HDFS upgrade problem of fsImage
>
> Hi Dear,
>
> I have a small test cluster with hadoop-2.0.x, HA configured, but I
> want to upgrade to hadoop-2.2.
>
> I don't want to stop the cluster during the upgrade, so my steps are:
>
> 1)  on the standby NN: hadoop-daemon.sh stop namenode
> 2)  remove the HA configuration from the conf
> 3)  hadoop-daemon.sh start namenode -upgrade -clusterID test-cluster
>
> but I get an exception in the NN log, so how do I upgrade without stopping
> the whole cluster?
> Thanks.
>
>
> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
> Directory /hdfs/name is in an inconsistent state: previous fs state should
> not exist during upgrade. Finalize or rollback first.
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.doUpgrade(FSImage.java:323)
> at
>
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:248)
> at
>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:858)
> at
>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:620)
> at
>
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:445)
> at
>
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:494)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:692)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:677)
> at
>
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1345)
>
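The "previous fs state should not exist during upgrade" exception means the name
directory still contains the previous state left behind by an earlier -upgrade that
was never finalized. A hedged recovery sequence, assuming the standard 2.x CLI
(adjust to your deployment), is:

1)  hdfs dfsadmin -finalizeUpgrade                 (keep the earlier upgrade and clear the previous state)
    or
2)  hadoop-daemon.sh start namenode -rollback      (discard the earlier upgrade instead)

Only once the previous state is gone can the NameNode be started with -upgrade again.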


RE: HDFS upgrade problem of fsImage

2013-11-20 Thread Vinayakumar B
Looks like you have already upgraded the cluster, and you are trying to
upgrade one more time.


-Original Message-
From: Azuryy Yu [mailto:azury...@gmail.com] 
Sent: 21 November 2013 09:49
To: hdfs-dev@hadoop.apache.org; u...@hadoop.apache.org
Subject: HDFS upgrade problem of fsImage

Hi Dear,

I have a small test cluster with hadoop-2.0.x, HA configured, but I want
to upgrade to hadoop-2.2.

I don't want to stop the cluster during the upgrade, so my steps are:

1)  on the standby NN: hadoop-daemon.sh stop namenode
2)  remove the HA configuration from the conf
3)  hadoop-daemon.sh start namenode -upgrade -clusterID test-cluster

but I get an exception in the NN log, so how do I upgrade without stopping the whole cluster?
Thanks.


org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
Directory /hdfs/name is in an inconsistent state: previous fs state should not 
exist during upgrade. Finalize or rollback first.
at
org.apache.hadoop.hdfs.server.namenode.FSImage.doUpgrade(FSImage.java:323)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:248)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:858)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:620)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:445)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:494)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:692)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:677)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1345)


Re: Metrics2 code

2013-11-20 Thread Andrew Wang
Hey LiuLei,

Gauges can go up and down; counters only go up. Snapshot doesn't actually
reset anything; it's just a way for the metrics system to get an updated
value. To my knowledge, there aren't any time-based rolling metrics besides
MutableQuantiles.

Best,
Andrew


On Wed, Nov 20, 2013 at 7:34 PM, lei liu  wrote:

> I use the cdh-4.3.1 version.  I am reading the code about metrics2.
>
> There are COUNTER and GAUGE metric types in metrics v2. What is the
> difference between the two?
>
>
> There is an @Metric MutableCounterLong bytesWritten attribute in
> DataNodeMetrics, which is used to track bytes written per second on the
> DataNode. So I think the value of MutableCounterLong should be divided
> by 10 and reset to zero every ten seconds in the MutableCounterLong.snapshot
> method, is that right? But the MutableCounterLong.snapshot method doesn't do
> that. If I am missing anything, please tell me.
>
> Thanks,
>
> LiuLei
>
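A minimal, self-contained sketch (hypothetical names, not the actual metrics2
snapshot code) of the answer above: the counter is never reset; a collector keeps
the previous snapshot and derives the rate from the delta between two readings.

public class CounterRateExample {
  public static void main(String[] args) throws InterruptedException {
    long counter = 0;       // monotonically increasing, like a MutableCounterLong value
    long prevSnapshot = 0;  // value seen at the previous collection

    for (int i = 1; i <= 3; i++) {
      counter += 1024 * i;                       // pretend some bytes were written
      Thread.sleep(1000);                        // collection interval (1s here; sinks often use 10s)
      long bytesPerInterval = counter - prevSnapshot;
      System.out.println("bytesWritten in this interval = " + bytesPerInterval);
      prevSnapshot = counter;                    // remember the snapshot; the counter itself is untouched
    }
  }
}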


HDFS upgrade problem of fsImage

2013-11-20 Thread Azuryy Yu
Hi Dear,

I have a small test cluster with hadoop-2.0.x, HA configured, but I
want to upgrade to hadoop-2.2.

I don't want to stop the cluster during the upgrade, so my steps are:

1)  on the standby NN: hadoop-daemon.sh stop namenode
2)  remove the HA configuration from the conf
3)  hadoop-daemon.sh start namenode -upgrade -clusterID test-cluster

but I get an exception in the NN log, so how do I upgrade without stopping
the whole cluster?
Thanks.


org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
Directory /hdfs/name is in an inconsistent state: previous fs state should
not exist during upgrade. Finalize or rollback first.
at
org.apache.hadoop.hdfs.server.namenode.FSImage.doUpgrade(FSImage.java:323)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:248)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:858)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:620)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:445)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:494)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:692)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:677)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1345)


[jira] [Created] (HDFS-5543) fix narrow race condition in TestPathBasedCacheRequests

2013-11-20 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-5543:
--

 Summary: fix narrow race condition in TestPathBasedCacheRequests
 Key: HDFS-5543
 URL: https://issues.apache.org/jira/browse/HDFS-5543
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


TestPathBasedCacheRequests has a narrow race condition in 
testWaitForCachedReplicasInDirectory where an assert checking the number of 
bytes cached may fail.  The reason is that waitForCachedBlock looks at the 
NameNode data structures directly to see how many replicas are cached, but the 
scanner asynchronously updates the cache entries with this information.
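
A generic polling-wait sketch (hypothetical names, not the actual test code) of the
usual fix for this kind of race: instead of asserting immediately against a value
that another thread updates asynchronously, the test retries until the value
converges or a timeout expires.

import java.util.function.BooleanSupplier;

public class WaitForExample {
  static void waitFor(BooleanSupplier check, long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!check.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        throw new IllegalStateException("condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(100);   // poll instead of failing on the first stale read
    }
  }

  public static void main(String[] args) throws InterruptedException {
    final long[] bytesCached = {0};
    // Stands in for the scanner that updates cache statistics asynchronously.
    new Thread(() -> bytesCached[0] = 4096).start();
    waitFor(() -> bytesCached[0] == 4096, 5000);
    System.out.println("bytes cached: " + bytesCached[0]);
  }
}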



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Metrics2 code

2013-11-20 Thread lei liu
I use the cdh-4.3.1 version.  I am reading the code about metrics2.

There are COUNTER and GAUGE metric types in metrics v2. What is the difference
between the two?


There is an @Metric MutableCounterLong bytesWritten attribute in
DataNodeMetrics, which is used to track bytes written per second on the
DataNode. So I think the value of MutableCounterLong should be divided
by 10 and reset to zero every ten seconds in the MutableCounterLong.snapshot
method, is that right? But the MutableCounterLong.snapshot method doesn't do
that. If I am missing anything, please tell me.

Thanks,

LiuLei


[jira] [Created] (HDFS-5541) LIBHDFS questions and performance suggestions

2013-11-20 Thread Stephen Bovy (JIRA)
Stephen Bovy created HDFS-5541:
--

 Summary: LIBHDFS questions and performance suggestions
 Key: HDFS-5541
 URL: https://issues.apache.org/jira/browse/HDFS-5541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Stephen Bovy
Priority: Minor


Since libhdfs is a "client" interface",  and esspecially because it is a "C" 
interface , it should be assumed that the code will be used accross many 
different platforms, and many different compilers.

1) The code should be cross platform ( no Linux extras )
2) The code should compile on standard c89 compilers, the
>>>  {least common denominator rule applies here} !! <<  

C  code with  "c"   extension should follow the rules of the c standard  

All variables must be declared at the begining of scope , and no (//) comments 
allowed 

>> I just spent a week white-washing the code back to nornal C standards so 
>> that it could compile and build accross a wide range of platforms << 

Now on-to  performance questions 

1) If threads are not used why do a thread attach ( when threads are not used 
all the thread attach nonesense is a waste of time and a performance killer ) 

2) The JVM  init  code should not be imbedded within the context of every 
function call   .  The  JVM init code should be in a stand-alone  LIBINIT 
function that is only invoked once.   The JVM * and the JNI * should be global 
variables for use when no threads are utilized.  

3) When threads are utilized the attach fucntion can use the GLOBAL  jvm * 
created by the LIBINIT  { WHICH IS INVOKED ONLY ONCE } and thus safely outside 
the scope of any LOOP that is using the functions 

4) Hash Table and Locking  Why ?
When threads are used the hash table locking is going to hurt perfromance .  
Why not use thread local storage for the hash table,that way no locking is 
required either with or without threads.   
 
5) FINALLY Windows  Compatibility 

Do not use posix features if they cannot easilly be replaced on other platforms 
  !!
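
A small Java analog (hypothetical; the suggestion above concerns the C/JNI code)
of the thread-local idea in item 4: each thread lazily gets its own map, so no
lock is ever taken, with or without concurrency.

import java.util.HashMap;
import java.util.Map;

public class ThreadLocalMapExample {
  // Each thread gets its own map instance, so puts and gets need no locking.
  private static final ThreadLocal<Map<String, Object>> HANDLES =
      ThreadLocal.withInitial(HashMap::new);

  static void put(String name, Object handle) { HANDLES.get().put(name, handle); }
  static Object get(String name)              { return HANDLES.get().get(name); }

  public static void main(String[] args) throws InterruptedException {
    Thread t1 = new Thread(() -> { put("fs", "handle-1"); System.out.println(get("fs")); });
    Thread t2 = new Thread(() -> { put("fs", "handle-2"); System.out.println(get("fs")); });
    t1.start(); t2.start();
    t1.join(); t2.join();
  }
}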



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5542) Fix TODO and clean up the code in HDFS-2832.

2013-11-20 Thread Tsz Wo (Nicholas), SZE (JIRA)
Tsz Wo (Nicholas), SZE created HDFS-5542:


 Summary: Fix TODO and clean up the code in HDFS-2832.
 Key: HDFS-5542
 URL: https://issues.apache.org/jira/browse/HDFS-5542
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Minor


- Fix TODOs.
- Remove unused code.
- Reduce visibility (e.g. change public to package private.)
- Simplify the code if possible.
- Fix comments and javadoc.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5540) Fix TestBlocksWithNotEnoughRacks

2013-11-20 Thread Binglin Chang (JIRA)
Binglin Chang created HDFS-5540:
---

 Summary: Fix TestBlocksWithNotEnoughRacks
 Key: HDFS-5540
 URL: https://issues.apache.org/jira/browse/HDFS-5540
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang


TestBlocksWithNotEnoughRacks fails with timed out waiting for corrupt replicas

java.util.concurrent.TimeoutException: Timed out waiting for corrupt replicas. 
Waiting for 1, but only found 0
at 
org.apache.hadoop.hdfs.DFSTestUtil.waitCorruptReplicas(DFSTestUtil.java:351)
at 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks.testCorruptBlockRereplicatedAcrossRacks(TestBlocksWithNotEnoughRacks.java:219)




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5539) NFS gateway security enhancement

2013-11-20 Thread Brandon Li (JIRA)
Brandon Li created HDFS-5539:


 Summary: NFS gateway security enhancement
 Key: HDFS-5539
 URL: https://issues.apache.org/jira/browse/HDFS-5539
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Brandon Li


Currently, the NFS gateway only supports AUTH_UNIX RPC authentication. 
AUTH_UNIX is easy to deploy and use but lacks strong security support. 

This JIRA is to track the effort of NFS gateway security enhancements, such as 
the RPCSEC_GSS framework and end-to-end Kerberos support.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5538) URLConnectionFactory should pick up the SSL related configuration by default

2013-11-20 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-5538:


 Summary: URLConnectionFactory should pick up the SSL related 
configuration by default
 Key: HDFS-5538
 URL: https://issues.apache.org/jira/browse/HDFS-5538
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai


The default instance of URLConnectionFactory, DEFAULT_CONNECTION_FACTORY, does 
not pick up any Hadoop-specific, SSL-related configuration. Its callers have 
to set up the ConnectionConfigurator explicitly in order to pick up these 
configurations.

This is less than ideal for HTTPS because whenever the code needs to make an 
HTTPS connection, it is forced to go through that setup.

This jira refactors URLConnectionFactory to ease the handling of HTTPS 
connections (compared to the DEFAULT_CONNECTION_FACTORY we have right now).
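
A generic sketch of the pattern being described, with hypothetical names (this is
not the actual URLConnectionFactory API): every connection is opened through one
factory that applies the shared configuration, so individual call sites never
repeat the SSL/timeout setup.

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ConnectionFactoryExample {
  interface Configurator {
    HttpURLConnection configure(HttpURLConnection conn);
  }

  private final Configurator configurator;

  ConnectionFactoryExample(Configurator configurator) { this.configurator = configurator; }

  HttpURLConnection open(URL url) throws IOException {
    // Every caller gets the same configuration applied, HTTP or HTTPS alike.
    return configurator.configure((HttpURLConnection) url.openConnection());
  }

  public static void main(String[] args) throws Exception {
    ConnectionFactoryExample factory = new ConnectionFactoryExample(conn -> {
      conn.setConnectTimeout(60_000);  // stands in for the Hadoop-specific SSL setup
      conn.setReadTimeout(60_000);
      return conn;
    });
    HttpURLConnection conn = factory.open(new URL("http://example.org/"));
    System.out.println(conn.getConnectTimeout());
  }
}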



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5537) Remove FileWithSnapshot interface

2013-11-20 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-5537:
---

 Summary: Remove FileWithSnapshot interface
 Key: HDFS-5537
 URL: https://issues.apache.org/jira/browse/HDFS-5537
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao


We use the FileWithSnapshot interface to define a set of methods shared by 
INodeFileWithSnapshot and INodeFileUnderConstructionWithSnapshot. After using 
the Under-Construction feature to replace the INodeFileUC and 
INodeFileUCWithSnapshot, we no longer need this interface.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5536) Implement HTTP policy for Namenode and DataNode

2013-11-20 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-5536:


 Summary: Implement HTTP policy for Namenode and DataNode
 Key: HDFS-5536
 URL: https://issues.apache.org/jira/browse/HDFS-5536
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai


This jira implements the HTTP and HTTPS policy in the NameNode and the DataNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5535) Umbrella jira for improved HDFS rolling upgrades

2013-11-20 Thread Nathan Roberts (JIRA)
Nathan Roberts created HDFS-5535:


 Summary: Umbrella jira for improved HDFS rolling upgrades
 Key: HDFS-5535
 URL: https://issues.apache.org/jira/browse/HDFS-5535
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, ha, hdfs-client, namenode
Affects Versions: 2.2.0, 3.0.0
Reporter: Nathan Roberts


In order to roll a new HDFS release through a large cluster quickly and safely, 
a few enhancements are needed in HDFS. An initial high-level design document 
will be attached to this jira, and sub-jiras will itemize the individual tasks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5534) change testUncachingBlocksBeforeCachingFinishes to use explicit synchronization rather than a delay

2013-11-20 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-5534:
--

 Summary: change testUncachingBlocksBeforeCachingFinishes to use 
explicit synchronization rather than a delay
 Key: HDFS-5534
 URL: https://issues.apache.org/jira/browse/HDFS-5534
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Andrew Wang
Priority: Minor


It would be nice for {{testUncachingBlocksBeforeCachingFinishes}} to use 
explicit synchronization rather than a delay, to ensure that we are testing the 
try-to-cache-but-uncache-before-it-finishes case.  We probably need Mockito or 
some more test hooks in {{FsDatasetCache}} in order to do this right.
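
A minimal sketch (hypothetical names, unrelated to FsDatasetCache itself) of what
"explicit synchronization rather than a delay" means: the test blocks on a latch
that the worker trips at the interesting moment, instead of sleeping and hoping
the timing works out.

import java.util.concurrent.CountDownLatch;

public class LatchInsteadOfSleepExample {
  public static void main(String[] args) throws InterruptedException {
    final CountDownLatch cachingStarted = new CountDownLatch(1);

    Thread worker = new Thread(() -> {
      cachingStarted.countDown();   // signal: caching is now in progress
      // ... continue the slow caching work here ...
    });
    worker.start();

    cachingStarted.await();         // deterministic: no guessing how long to sleep
    System.out.println("uncache while caching is still in progress");
    worker.join();
  }
}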



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (HDFS-5527) Fix TestUnderReplicatedBlocks on branch HDFS-2832

2013-11-20 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-5527.
-

   Resolution: Fixed
Fix Version/s: Heterogeneous Storage (HDFS-2832)

Thanks for reviewing and verifying, Junping!

The fix passed Jenkins on HDFS-2832, I have committed it.

> Fix TestUnderReplicatedBlocks on branch HDFS-2832
> -
>
> Key: HDFS-5527
> URL: https://issues.apache.org/jira/browse/HDFS-5527
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: Heterogeneous Storage (HDFS-2832)
>Reporter: Junping Du
>Assignee: Arpit Agarwal
> Fix For: Heterogeneous Storage (HDFS-2832)
>
> Attachments: HDFS-5527.patch, h5527.02.patch
>
>
> The failure seems like a deadlock, which is shown in:
> https://builds.apache.org/job/PreCommit-HDFS-Build/5440//testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestUnderReplicatedBlocks/testSetrepIncWithUnderReplicatedBlocks/



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: issue about rpc activity metrics

2013-11-20 Thread Andrew Wang
The metrics system generates a number of different entries per in-code
metrics object. For instance, the "SendHeartbeat" MutableRate will
generate both a "NumOps" and an "AvgTime" entry. Look in NameNodeMetrics.java
for where these are updated.

Best,
Andrew


On Tue, Nov 19, 2013 at 10:52 PM, ch huang  wrote:

> hi, all:
> I get RPC metrics from the NN 50070 port, and I tried searching the code to
> see how these metrics are calculated.
> I tried to use grep, but got nothing. Why?
> [root@CH124 hadoop-2.0.0-cdh4.3.0]# grep -R 'DeleteNumOps' *
>  {
> "name" : "Hadoop:service=NameNode,name=RpcDetailedActivityForPort8020",
> "modelerType" : "RpcDetailedActivityForPort8020",
> "tag.port" : "8020",
> "tag.Context" : "rpcdetailed",
> "tag.Hostname" : "CHBM220",
> "SendHeartbeatNumOps" : 106434,
> "SendHeartbeatAvgTime" : 0.05366726296958853,
> "VersionRequestNumOps" : 9,
> "VersionRequestAvgTime" : 0.,
> "RegisterDatanodeNumOps" : 9,
> "RegisterDatanodeAvgTime" : 2.2223,
> "BlockReportNumOps" : 24,
> "BlockReportAvgTime" : 3.0,
> "GetServiceStatusNumOps" : 63811,
> "GetServiceStatusAvgTime" : 0.05970149253731349,
> "MonitorHealthNumOps" : 63811,
> "MonitorHealthAvgTime" : 0.0686567164179105,
> "TransitionToStandbyNumOps" : 3,
> "TransitionToStandbyAvgTime" : 27.336,
> "TransitionToActiveNumOps" : 1,
> "TransitionToActiveAvgTime" : 8026.0,
> "RollEditLogNumOps" : 210,
> "RollEditLogAvgTime" : 306.7428571428572,
> "GetListingNumOps" : 516,
> "GetListingAvgTime" : 0.18798449612403115,
> "GetFileInfoNumOps" : 507,
> "GetFileInfoAvgTime" : 0.12228796844181453,
> "CreateNumOps" : 4,
> "CreateAvgTime" : 53.5,
> "CompleteNumOps" : 4,
> "CompleteAvgTime" : 45.0,
> "SetOwnerNumOps" : 4,
> "SetOwnerAvgTime" : 43.0,
> "DeleteNumOps" : 4,
> "DeleteAvgTime" : 44.75
>   }
>
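
A tiny sketch (hypothetical helper, not Hadoop code) of why grepping the source
for "DeleteNumOps" finds nothing: the JMX attribute names are built at runtime
from the rate's base name plus a suffix, so only the base name ("Delete",
"SendHeartbeat", ...) appears in the code.

public class RateNameExample {
  static String[] jmxNamesFor(String baseName) {
    return new String[] { baseName + "NumOps", baseName + "AvgTime" };
  }

  public static void main(String[] args) {
    for (String name : jmxNamesFor("Delete")) {
      System.out.println(name);   // prints DeleteNumOps and DeleteAvgTime
    }
  }
}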


Re: Next releases

2013-11-20 Thread Arun C Murthy
Jason,

 I'm glad to see we are converging. I'll update the Roadmap wiki with details 
about major/minor/patch releases.

 Here is a straightforward approach for now: I'll just roll the contents of 
branch-2.2 as a 2.3-rc0 candidate right away. This way we don't have to get 
embroiled in the details of individual patches (there are too many). Next up, I'll 
roll 2.4 in December.

 Thoughts?

thanks,
Arun

On Nov 13, 2013, at 1:55 PM, Jason Lowe  wrote:

> I think a lot of confusion comes from the fact that the 2.x line is starting 
> to mature.  Before this there wasn't such a big contention of what went into 
> patch vs. minor releases and often the lines were blurred between the two.  
> However now we have significant customers and products starting to use 2.x as 
> a base, which means we need to start treating it like we treat 1.x.  That 
> means getting serious about what we should put into a patch release vs. what 
> we postpone to a minor release.
> 
> Here's my $0.02 on recent proposals:
> 
> +1 to releasing more often in general.  A lot of the rush to put changes into 
> a patch release is because it can be a very long time between any kind of 
> release.  If minor releases are more frequent then I hope there would be less 
> of a need to rush something or hold up a release.
> 
> +1 to limiting checkins of patch releases to Blockers/Criticals.  If 
> necessary committers check into trunk/branch-2 only and defer to the patch 
> release manager for the patch release merge.  Then there should be fewer 
> surprises for everyone what ended up in a patch release and less likely the 
> patch release becomes destabilized from the sheer amount of code churn.  
> Maybe this won't be necessary if everyone understands that the patch release 
> isn't the only way to get a change out in timely manner.
> 
> As for 2.2.1, again I think it's expectations for what that release means.  
> If it's really just a patch release then there shouldn't be features in it 
> and tons of code churn, but I think many were treating it as the next vehicle 
> to deliver changes in general.  If we think 2.2.1 is just as good or better 
> than 2.2.0 then let's wrap it up and move to a more disciplined approach for 
> subsequent patch releases and more frequent minor releases.
> 
> Jason
> 
> On 11/13/2013 12:10 PM, Arun C Murthy wrote:
>> On Nov 12, 2013, at 1:54 PM, Todd Lipcon  wrote:
>> 
>>> On Mon, Nov 11, 2013 at 2:57 PM, Colin McCabe wrote:
>>> 
 To be honest, I'm not aware of anything in 2.2.1 that shouldn't be
 there.  However, I have only been following the HDFS and common side
 of things so I may not have the full picture.  Arun, can you give a
 specific example of something you'd like to "blow away"?
>> There are a bunch of issues in YARN/MapReduce which clearly aren't *critical*; 
>> similarly, in HDFS a cursory glance showed some 
>> *enhancements*/*improvements* in CHANGES.txt which aren't necessary for a 
>> patch release, plus things like:
>> 
>>  HADOOP-9623  Update jets3t dependency to 0.9.0
>> 
>> Having said that, the HDFS devs know their code the best.
>> 
>>> I agree with Colin. If we've been backporting things into a patch release
>>> (third version component) which don't belong, we should explicitly call out
>>> those patches, so we can learn from our mistakes and have a discussion
>>> about what belongs.
>> Good point.
>> 
>> Here is a straw man proposal:
>> 
>> 
>> A patch (third version) release should only include *blocker* bugs which are 
>> critical from an operational, security, or data-integrity standpoint.
>> 
>> This way, we can ensure that a minor series release (2.2.x or 2.3.x or 
>> 2.4.x) is always release-able, and more importantly, deploy-able at any 
>> point in time.
>> 
>> 
>> 
>> Sandy did bring up a related point about timing of releases and the urge for 
>> everyone to cram features/fixes into a dot release.
>> 
>> So, we could remedy that situation by doing a release every 4-6 weeks (2.3, 
>> 2.4 etc.) and keep the patch releases limited to blocker bugs.
>> 
>> Thoughts?
>> 
>> thanks,
>> Arun
>> 
>> 
>> 
>> 
>> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/





RE: is block pool on NN or DN?

2013-11-20 Thread Sirianni, Eric
This blog post should clear it up:
http://hortonworks.com/blog/an-introduction-to-hdfs-federation/


-Original Message-
From: ch huang [mailto:justlo...@gmail.com] 
Sent: Wednesday, November 20, 2013 12:26 AM
To: hdfs-dev@hadoop.apache.org
Subject: is block pool on NN or DN?

hi, all:
 I have a question about block pools. I read the code, and it seems the block
pool is on the DN, but the NN also has a block pool manager to manage all
block pools, so is the block pool also on the NN?


Build failed in Jenkins: Hadoop-Hdfs-trunk #1588

2013-11-20 Thread Apache Jenkins Server
See 

Changes:

[bikas] YARN-744. Race condition in ApplicationMasterService.allocate .. It 
might process same allocate request twice resulting in additional containers 
getting allocated. (Omkar Vinit Joshi via bikas)

[sandy] Move YARN-1407 under 2.2.1 in CHANGES.txt

[cmccabe] HDFS-5511. improve CacheManipulator interface to allow better unit 
testing (cmccabe)

[sandy] YARN-1407. RM Web UI and REST APIs should uniformly use 
YarnApplicationState (Sandy Ryza)

[wang] HDFS-5513. CacheAdmin commands fail when using . as the path. 
Contributed by Andrew Wang.

[jeagles] HDFS-1386. TestJMXGet fails in jdk7 (jeagles)

[sandy] YARN-786: Addendum so that 
RMAppAttemptImpl#getApplicationResourceUsageReport won't return null

[acmurthy] HADOOP-10047. Add a direct-buffer based apis for compression. 
Contributed by Gopal V.

[acmurthy] Revert HADOOP-10047, wrong patch.

[acmurthy] HADOOP-10047. Add a direct-buffer based apis for compression. 
Contributed by Gopal V.

--
[...truncated 11589 lines...]
Running org.apache.hadoop.hdfs.TestEncryptedTransfer
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 79.935 sec - 
in org.apache.hadoop.hdfs.TestEncryptedTransfer
Running org.apache.hadoop.hdfs.TestDFSUpgrade
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 34.518 sec - in 
org.apache.hadoop.hdfs.TestDFSUpgrade
Running org.apache.hadoop.hdfs.TestCrcCorruption
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.788 sec - in 
org.apache.hadoop.hdfs.TestCrcCorruption
Running org.apache.hadoop.hdfs.TestHFlush
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.195 sec - in 
org.apache.hadoop.hdfs.TestHFlush
Running org.apache.hadoop.hdfs.TestFileAppendRestart
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.093 sec - in 
org.apache.hadoop.hdfs.TestFileAppendRestart
Running org.apache.hadoop.hdfs.TestDatanodeReport
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.517 sec - in 
org.apache.hadoop.hdfs.TestDatanodeReport
Running org.apache.hadoop.hdfs.TestShortCircuitLocalRead
Tests run: 10, Failures: 0, Errors: 0, Skipped: 10, Time elapsed: 0.194 sec - 
in org.apache.hadoop.hdfs.TestShortCircuitLocalRead
Running org.apache.hadoop.hdfs.TestFileInputStreamCache
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.205 sec - in 
org.apache.hadoop.hdfs.TestFileInputStreamCache
Running org.apache.hadoop.hdfs.TestRestartDFS
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.892 sec - in 
org.apache.hadoop.hdfs.TestRestartDFS
Running org.apache.hadoop.hdfs.TestDFSUpgradeFromImage
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.439 sec - in 
org.apache.hadoop.hdfs.TestDFSUpgradeFromImage
Running org.apache.hadoop.hdfs.TestDFSRemove
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.062 sec - in 
org.apache.hadoop.hdfs.TestDFSRemove
Running org.apache.hadoop.hdfs.TestHDFSTrash
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.145 sec - in 
org.apache.hadoop.hdfs.TestHDFSTrash
Running org.apache.hadoop.hdfs.TestClientReportBadBlock
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 71.131 sec - in 
org.apache.hadoop.hdfs.TestClientReportBadBlock
Running org.apache.hadoop.hdfs.TestQuota
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.01 sec - in 
org.apache.hadoop.hdfs.TestQuota
Running org.apache.hadoop.hdfs.TestFileLengthOnClusterRestart
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.565 sec - in 
org.apache.hadoop.hdfs.TestFileLengthOnClusterRestart
Running org.apache.hadoop.hdfs.TestDatanodeRegistration
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.062 sec - in 
org.apache.hadoop.hdfs.TestDatanodeRegistration
Running org.apache.hadoop.hdfs.TestAbandonBlock
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.049 sec - in 
org.apache.hadoop.hdfs.TestAbandonBlock
Running org.apache.hadoop.hdfs.TestDFSShell
Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.533 sec - 
in org.apache.hadoop.hdfs.TestDFSShell
Running org.apache.hadoop.hdfs.TestListFilesInDFS
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.407 sec - in 
org.apache.hadoop.hdfs.TestListFilesInDFS
Running org.apache.hadoop.hdfs.TestParallelShortCircuitReadUnCached
Tests run: 4, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 0.164 sec - in 
org.apache.hadoop.hdfs.TestParallelShortCircuitReadUnCached
Running org.apache.hadoop.hdfs.TestPeerCache
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.318 sec - in 
org.apache.hadoop.hdfs.TestPeerCache
Running org.apache.hadoop.hdfs.TestAppendDifferentChecksum
Tests run: 3, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 8.505 sec - in 
org.apache.hadoop.hdfs.TestAppendDifferentChecksum
Running org.apache.hadoop.hdfs.TestD

Hadoop-Hdfs-trunk - Build # 1588 - Failure

2013-11-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1588/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 11782 lines...]
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/trunk/hadoop-hdfs-project/target/test-dir
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:test-jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (dist-enforce) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.0:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.6:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:2.3.2:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] ** FindBugsMojo execute ***
[INFO] canGenerate is false
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  FAILURE 
[1:47:14.041s]
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS-NFS  SKIPPED
[INFO] Apache Hadoop HDFS Project  SUCCESS [1.840s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 1:47:17.260s
[INFO] Finished at: Wed Nov 20 13:21:45 UTC 2013
[INFO] Final Memory: 39M/299M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on 
project hadoop-hdfs: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/trunk/hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports
 for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Updating HDFS-1386
Updating YARN-1407
Updating YARN-744
Updating HADOOP-10047
Updating HDFS-5513
Updating YARN-786
Updating HDFS-5511
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
9 tests failed.
FAILED:  
org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache.testCacheAndUncacheBlockSimple

Error Message:
Cannot start datanode because the configured max locked memory size 
(dfs.datanode.max.locked.memory) is greater than zero and native code is not 
available.

Stack Trace:
java.lang.RuntimeException: Cannot start datanode because the configured max 
locked memory size (dfs.datanode.max.locked.memory) is greater than zero and 
native code is not available.
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:665)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:264)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1757)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1672)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1191)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:666)
at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:335)
at 
org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:317)
at 
org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache.setUp(TestFsDatasetCache.java:113)


FAILED:  
org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache.testCacheAndUncacheBlockWithRetries

Error Message:
Cannot start datanode because the configured max locked memory size 
(dfs.datanode.max.locked.memory) is greater tha

Hadoop-Hdfs-0.23-Build - Build # 796 - Still Failing

2013-11-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/796/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 7892 lines...]
[ERROR] location: class com.google.protobuf.InvalidProtocolBufferException
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[3313,27]
 cannot find symbol
[ERROR] symbol  : method 
setUnfinishedMessage(org.apache.hadoop.hdfs.protocol.proto.DataTransferProtos.OpWriteBlockProto)
[ERROR] location: class com.google.protobuf.InvalidProtocolBufferException
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[3319,8]
 cannot find symbol
[ERROR] symbol  : method makeExtensionsImmutable()
[ERROR] location: class 
org.apache.hadoop.hdfs.protocol.proto.DataTransferProtos.OpWriteBlockProto
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[3330,10]
 cannot find symbol
[ERROR] symbol  : method 
ensureFieldAccessorsInitialized(java.lang.Class,java.lang.Class)
[ERROR] location: class com.google.protobuf.GeneratedMessage.FieldAccessorTable
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[3335,31]
 cannot find symbol
[ERROR] symbol  : class AbstractParser
[ERROR] location: package com.google.protobuf
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[3344,4]
 method does not override or implement a method from a supertype
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[4098,12]
 cannot find symbol
[ERROR] symbol  : method 
ensureFieldAccessorsInitialized(java.lang.Class,java.lang.Class)
[ERROR] location: class com.google.protobuf.GeneratedMessage.FieldAccessorTable
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[4371,104]
 cannot find symbol
[ERROR] symbol  : method getUnfinishedMessage()
[ERROR] location: class com.google.protobuf.InvalidProtocolBufferException
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[5264,8]
 getUnknownFields() in 
org.apache.hadoop.hdfs.protocol.proto.DataTransferProtos.OpTransferBlockProto 
cannot override getUnknownFields() in com.google.protobuf.GeneratedMessage; 
overridden method is final
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[5284,19]
 cannot find symbol
[ERROR] symbol  : method 
parseUnknownField(com.google.protobuf.CodedInputStream,com.google.protobuf.UnknownFieldSet.Builder,com.google.protobuf.ExtensionRegistryLite,int)
[ERROR] location: class 
org.apache.hadoop.hdfs.protocol.proto.DataTransferProtos.OpTransferBlockProto
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[5314,15]
 cannot find symbol
[ERROR] symbol  : method 
setUnfinishedMessage(org.apache.hadoop.hdfs.protocol.proto.DataTransferProtos.OpTransferBlockProto)
[ERROR] location: class com.google.protobuf.InvalidProtocolBufferException
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[5317,27]
 cannot find symbol
[ERROR] symbol  : method 
setUnfinishedMessage(org.apache.hadoop.hdfs.protocol.proto.DataTransferProtos.OpTransferBlockProto)
[ERROR] location: class com.google.protobuf.InvalidProtocolBufferException
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/protocol/proto/DataTransferProtos.java:[5323,8]
 cannot find symbol
[ERROR] symbol  : method makeExtensionsImmutable()
[ERROR] location: class 
org.apache.hadoop.hdfs

Build failed in Jenkins: Hadoop-Hdfs-0.23-Build #796

2013-11-20 Thread Apache Jenkins Server
See 

--
[...truncated 7699 lines...]
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[270,37]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[281,30]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[10533,37]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[10544,30]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[8357,37]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[8368,30]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[12641,37]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[12652,30]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[9741,37]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[9752,30]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[1781,37]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[1792,30]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[5338,37]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[5349,30]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[6290,37]
 cannot find symbol
[ERROR] symbol  : class Parser
[ERROR] location: package com.google.protobuf
[ERROR] 
:[6301,30]
 cannot find sym