[jira] Commented: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters
[ https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981176#action_12981176 ] Liyin Liang commented on HDFS-1583: --- Hi Todd, This is mainly caused by the serialization of the array. The job is done by: {code} ObjectWritable::writeObject(DataOutput out, Object instance, Class declaredClass, Configuration conf) {code} This function traverses the array and serializes each element as an object. According to my test, a byte array with 8000 elements grows to 56008 bytes after serialization (2.4 ms), whereas the wrapped object is 8094 bytes after serialization (0.03 ms). By the way, there is already an array wrapper class: {code} public class ArrayWritable implements Writable {code} This class is used in FSEditLog to log operations, e.g. FSEditLog::logMkDir(String path, INode newNode). I'll update the patch to use ArrayWritable. Improve backup-node sync performance by wrapping RPC parameters --- Key: HDFS-1583 URL: https://issues.apache.org/jira/browse/HDFS-1583 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Liyin Liang Fix For: 0.23.0 Attachments: HDFS-1583-1.patch The journal edit records are sent by the active name-node to the backup-node via RPC: {code} public void journal(NamenodeRegistration registration, int jAction, int length, byte[] records) throws IOException; {code} During the name-node throughput benchmark, the size of the byte array _records_ is around *8000*, which makes serialization and deserialization time-consuming. I wrote a simple application to test RPC with a byte array parameter: when the size reaches 8000, each RPC call needs about 6 ms, while the name-node syncs 8 KB to local disk in only 0.3~0.4 ms. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
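The 56008 figure is consistent with ObjectWritable tagging every element: each of the 8000 bytes carries the declared class name ("byte", 6 bytes as a length-prefixed UTF string) plus the 1-byte value, i.e. 7 * 8000 = 56000 bytes, with the remaining 8 bytes being the array header. Below is a minimal sketch of the wrapper approach Liyin describes (write the length once, then the raw bytes); the class and field names are illustrative, not the actual patch:

{code:java}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Hypothetical wrapper: serializes a byte[] as <length><raw bytes>,
// avoiding ObjectWritable's per-element type tag.
public class JournalRecordsWritable implements Writable {
  private byte[] records;

  public JournalRecordsWritable() {}              // needed for deserialization

  public JournalRecordsWritable(byte[] records) { this.records = records; }

  public void write(DataOutput out) throws IOException {
    out.writeInt(records.length);                 // length written exactly once
    out.write(records);                           // raw bytes, no per-element overhead
  }

  public void readFields(DataInput in) throws IOException {
    records = new byte[in.readInt()];
    in.readFully(records);
  }

  public byte[] get() { return records; }
}
{code}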
[jira] Commented: (HDFS-884) DataNode makeInstance should report the directory list when failing to start up
[ https://issues.apache.org/jira/browse/HDFS-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981212#action_12981212 ] Steve Loughran commented on HDFS-884: - no need to thank me, all I provided was a deployment where none of the directories were valid :) DataNode makeInstance should report the directory list when failing to start up --- Key: HDFS-884 URL: https://issues.apache.org/jira/browse/HDFS-884 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Fix For: 0.22.0 Attachments: HDFS-884.patch, HDFS-884.patch, InvalidDirs.patch, InvalidDirs.patch, InvalidDirs.patch When {{Datanode.makeInstance()}} cannot work with one of the directories in dfs.data.dir, it logs this at warn level (while losing the stack trace). It should include the nested exception for better troubleshooting. Then, when all dirs in the list fail, an exception is thrown, but this exception does not include the list of directories. It should list the absolute path of every missing/failing directory, so that whoever sees the exception can see where to start looking for problems: either the filesystem or the configuration. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
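A sketch of the shape of the fix being described, with illustrative helper names (the actual patch is in the attachments; this is not it):

{code:java}
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import org.apache.commons.logging.Log;

// Illustrative only: log each invalid directory with its cause (keeping
// the stack trace), and name every failed directory if all of them fail.
class DataDirChecker {
  static void checkDirs(Collection<File> dirs, Log log) throws IOException {
    List<String> failed = new ArrayList<String>();
    for (File dir : dirs) {
      try {
        checkDir(dir);                           // stand-in for the existing validation
      } catch (IOException e) {
        log.warn("Invalid directory " + dir.getAbsolutePath(), e); // nested exception kept
        failed.add(dir.getAbsolutePath());
      }
    }
    if (failed.size() == dirs.size()) {
      throw new IOException("All directories in dfs.data.dir are invalid: " + failed);
    }
  }

  static void checkDir(File dir) throws IOException {
    if (!dir.isDirectory() || !dir.canWrite()) {
      throw new IOException("Cannot use " + dir.getAbsolutePath());
    }
  }
}
{code}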
[jira] Commented: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters
[ https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981382#action_12981382 ] Hadoop QA commented on HDFS-1583: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12468257/HDFS-1583-2.patch against trunk revision 1058402. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery org.apache.hadoop.hdfs.server.namenode.TestStorageRestore -1 contrib tests. The patch failed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/105//testReport/ Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/105//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/105//console This message is automatically generated. Improve backup-node sync performance by wrapping RPC parameters --- Key: HDFS-1583 URL: https://issues.apache.org/jira/browse/HDFS-1583 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Liyin Liang Fix For: 0.23.0 Attachments: HDFS-1583-1.patch, HDFS-1583-2.patch The journal edit records are sent by the active name-node to the backup-node via RPC: {code} public void journal(NamenodeRegistration registration, int jAction, int length, byte[] records) throws IOException; {code} During the name-node throughput benchmark, the size of the byte array _records_ is around *8000*, which makes serialization and deserialization time-consuming. I wrote a simple application to test RPC with a byte array parameter: when the size reaches 8000, each RPC call needs about 6 ms, while the name-node syncs 8 KB to local disk in only 0.3~0.4 ms. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters
[ https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981404#action_12981404 ] Konstantin Shvachko commented on HDFS-1583: --- Liyin, This is a nice optimization, and thanks for measuring the performance. I think this is critical for the 0.22 release. Writing edits to the BN should not be slower than writing to disk. bq. a byte array with 8000 elements grows to 56008 bytes after serialization This makes RPC very inefficient. Java arrays cannot hold instances of different types (see ArrayStoreException for references), so serializing the type name for each element does not make sense. The type should be stored only once for the entire array. Does anybody remember discussions or jiras opened for this? We need to decide whether it should be fixed in RPC or locally for the BackupNode only. It seems that an RPC-level fix would optimize communication in general, but would be massively backward incompatible. It could be a good time to do that now, before the major release. Improve backup-node sync performance by wrapping RPC parameters --- Key: HDFS-1583 URL: https://issues.apache.org/jira/browse/HDFS-1583 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Liyin Liang Fix For: 0.23.0 Attachments: HDFS-1583-1.patch, HDFS-1583-2.patch The journal edit records are sent by the active name-node to the backup-node via RPC: {code} public void journal(NamenodeRegistration registration, int jAction, int length, byte[] records) throws IOException; {code} During the name-node throughput benchmark, the size of the byte array _records_ is around *8000*, which makes serialization and deserialization time-consuming. I wrote a simple application to test RPC with a byte array parameter: when the size reaches 8000, each RPC call needs about 6 ms, while the name-node syncs 8 KB to local disk in only 0.3~0.4 ms. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
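To make the point concrete, a sketch (illustrative names, not a proposed patch) of array serialization that records the component type once for the whole array:

{code:java}
import java.io.DataOutput;
import java.io.IOException;

class CompactArrayWriter {
  // Records the element type once, rather than once per element as
  // ObjectWritable does today for arrays.
  static void writeIntArray(DataOutput out, int[] arr) throws IOException {
    out.writeUTF("int");        // component type, stored exactly once
    out.writeInt(arr.length);   // element count
    for (int v : arr) {
      out.writeInt(v);          // raw elements, no per-element type tag
    }
  }
}
{code}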
[jira] Commented: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters
[ https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981415#action_12981415 ] Todd Lipcon commented on HDFS-1583: --- We did this optimization for the RPC layer in HBase long ago (HBASE-82). Here's the current code: https://github.com/apache/hbase/blob/trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java#L387 Is there a way to make the change to ObjectWritable such that the new version can still read old data? Improve backup-node sync performance by wrapping RPC parameters --- Key: HDFS-1583 URL: https://issues.apache.org/jira/browse/HDFS-1583 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Liyin Liang Fix For: 0.23.0 Attachments: HDFS-1583-1.patch, HDFS-1583-2.patch The journal edit records are sent by the active name-node to the backup-node via RPC: {code} public void journal(NamenodeRegistration registration, int jAction, int length, byte[] records) throws IOException; {code} During the name-node throughput benchmark, the size of the byte array _records_ is around *8000*, which makes serialization and deserialization time-consuming. I wrote a simple application to test RPC with a byte array parameter: when the size reaches 8000, each RPC call needs about 6 ms, while the name-node syncs 8 KB to local disk in only 0.3~0.4 ms. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
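One possible shape of the compatibility shim Todd is asking about, purely as a sketch (the marker string and helper are hypothetical): the writer tags new-format arrays with a name the old writer never emitted, and the reader dispatches on it, falling back to the legacy per-element path otherwise.

{code:java}
import java.io.DataInput;
import java.io.IOException;

class CompatArrayReader {
  static Object readMaybeCompactArray(DataInput in) throws IOException {
    String declared = in.readUTF();       // class name, as in the old format
    if ("#bytes".equals(declared)) {      // hypothetical new-format marker
      byte[] arr = new byte[in.readInt()];
      in.readFully(arr);                  // compact path: raw bytes
      return arr;
    }
    return readLegacy(declared, in);      // old data decodes as before
  }

  // Placeholder for ObjectWritable's existing decoding path.
  static Object readLegacy(String declaredClass, DataInput in) throws IOException {
    throw new IOException("legacy decoding elided in this sketch: " + declaredClass);
  }
}
{code}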
[jira] Assigned: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters
[ https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned HDFS-1583: - Assignee: Liyin Liang Improve backup-node sync performance by wrapping RPC parameters --- Key: HDFS-1583 URL: https://issues.apache.org/jira/browse/HDFS-1583 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Liyin Liang Assignee: Liyin Liang Fix For: 0.23.0 Attachments: HDFS-1583-1.patch, HDFS-1583-2.patch The journal edit records are sent by the active name-node to the backup-node via RPC: {code} public void journal(NamenodeRegistration registration, int jAction, int length, byte[] records) throws IOException; {code} During the name-node throughput benchmark, the size of the byte array _records_ is around *8000*, which makes serialization and deserialization time-consuming. I wrote a simple application to test RPC with a byte array parameter: when the size reaches 8000, each RPC call needs about 6 ms, while the name-node syncs 8 KB to local disk in only 0.3~0.4 ms. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters
[ https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981477#action_12981477 ] Konstantin Shvachko commented on HDFS-1583: --- Looks like there is a jira for that: HADOOP-6949. I'll reopen it, since we have more benchmarks now, and let's move the RPC discussion there. Improve backup-node sync performance by wrapping RPC parameters --- Key: HDFS-1583 URL: https://issues.apache.org/jira/browse/HDFS-1583 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Liyin Liang Assignee: Liyin Liang Fix For: 0.23.0 Attachments: HDFS-1583-1.patch, HDFS-1583-2.patch The journal edit records are sent by the active name-node to the backup-node via RPC: {code} public void journal(NamenodeRegistration registration, int jAction, int length, byte[] records) throws IOException; {code} During the name-node throughput benchmark, the size of the byte array _records_ is around *8000*, which makes serialization and deserialization time-consuming. I wrote a simple application to test RPC with a byte array parameter: when the size reaches 8000, each RPC call needs about 6 ms, while the name-node syncs 8 KB to local disk in only 0.3~0.4 ms. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1572) Checkpointer should trigger checkpoint with specified period.
[ https://issues.apache.org/jira/browse/HDFS-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-1572: -- Status: Open (was: Patch Available) Checkpointer should trigger checkpoint with specified period. - Key: HDFS-1572 URL: https://issues.apache.org/jira/browse/HDFS-1572 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Liyin Liang Priority: Blocker Fix For: 0.21.0 Attachments: 1527-1.diff, 1572-2.diff, HDFS-1572-2.patch, HDFS-1572-3.patch, HDFS-1572.patch {code} long now = now(); boolean shouldCheckpoint = false; if(now >= lastCheckpointTime + periodMSec) { shouldCheckpoint = true; } else { long size = getJournalSize(); if(size >= checkpointSize) shouldCheckpoint = true; } {code} {{dfs.namenode.checkpoint.period}} in the configuration determines the checkpoint period. However, with the above code, the Checkpointer triggers a checkpoint every 5 minutes (periodMSec=5*60*1000). According to SecondaryNameNode.java, the first *if* statement should be: {code} if(now >= lastCheckpointTime + 1000 * checkpointPeriod) { {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1572) Checkpointer should trigger checkpoint with specified period.
[ https://issues.apache.org/jira/browse/HDFS-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-1572: -- Attachment: HDFS-1572-3.patch Good catch, Konstantin. Changed the sleep time to be the minimum of the time-based and size-based intervals; this way, whichever is lower, we'll sleep only as long as needed. Checkpointer should trigger checkpoint with specified period. - Key: HDFS-1572 URL: https://issues.apache.org/jira/browse/HDFS-1572 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Liyin Liang Priority: Blocker Fix For: 0.21.0 Attachments: 1527-1.diff, 1572-2.diff, HDFS-1572-2.patch, HDFS-1572-3.patch, HDFS-1572.patch {code} long now = now(); boolean shouldCheckpoint = false; if(now >= lastCheckpointTime + periodMSec) { shouldCheckpoint = true; } else { long size = getJournalSize(); if(size >= checkpointSize) shouldCheckpoint = true; } {code} {{dfs.namenode.checkpoint.period}} in the configuration determines the checkpoint period. However, with the above code, the Checkpointer triggers a checkpoint every 5 minutes (periodMSec=5*60*1000). According to SecondaryNameNode.java, the first *if* statement should be: {code} if(now >= lastCheckpointTime + 1000 * checkpointPeriod) { {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
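A sketch of the corrected trigger loop after both fixes, assuming the Checkpointer's existing members (now(), getJournalSize(), checkpointSize, doCheckpoint()) and a hypothetical checkpointCheckPeriod knob for how often the journal size is polled:

{code:java}
long periodMSec = 1000 * checkpointPeriod;          // time-based trigger interval
long sleepMSec = Math.min(periodMSec, 1000 * checkpointCheckPeriod);

long lastCheckpointTime = now();
while (true) {
  try {
    Thread.sleep(sleepMSec);                        // wake often enough for both triggers
  } catch (InterruptedException e) {
    break;
  }
  long now = now();
  if (now >= lastCheckpointTime + periodMSec        // period elapsed
      || getJournalSize() >= checkpointSize) {      // or journal grew too large
    doCheckpoint();
    lastCheckpointTime = now;
  }
}
{code}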
[jira] Updated: (HDFS-1572) Checkpointer should trigger checkpoint with specified period.
[ https://issues.apache.org/jira/browse/HDFS-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-1572: -- Status: Patch Available (was: Open) Hudson! Checkpointer should trigger checkpoint with specified period. - Key: HDFS-1572 URL: https://issues.apache.org/jira/browse/HDFS-1572 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Liyin Liang Priority: Blocker Fix For: 0.21.0 Attachments: 1527-1.diff, 1572-2.diff, HDFS-1572-2.patch, HDFS-1572-3.patch, HDFS-1572.patch {code} long now = now(); boolean shouldCheckpoint = false; if(now >= lastCheckpointTime + periodMSec) { shouldCheckpoint = true; } else { long size = getJournalSize(); if(size >= checkpointSize) shouldCheckpoint = true; } {code} {{dfs.namenode.checkpoint.period}} in the configuration determines the checkpoint period. However, with the above code, the Checkpointer triggers a checkpoint every 5 minutes (periodMSec=5*60*1000). According to SecondaryNameNode.java, the first *if* statement should be: {code} if(now >= lastCheckpointTime + 1000 * checkpointPeriod) { {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1582) Remove auto-generated native build files
[ https://issues.apache.org/jira/browse/HDFS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981498#action_12981498 ] Eli Collins commented on HDFS-1582: --- Hey Roman, My bad: looks like this error was due to a partially updated libtool package on my system. After rebooting, the build with your patch compiles as expected. Thanks, Eli Remove auto-generated native build files Key: HDFS-1582 URL: https://issues.apache.org/jira/browse/HDFS-1582 Project: Hadoop HDFS Issue Type: Improvement Components: contrib/libhdfs Reporter: Roman Shaposhnik Fix For: 0.23.0 Attachments: HADOOP-6436.patch Original Estimate: 24h Remaining Estimate: 24h The repo currently includes the automake- and autoconf-generated files for the native build. Per discussion on HADOOP-6421, let's remove them and use the host's automake and autoconf. We should also do this for libhdfs and fuse-dfs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1582) Remove auto-generated native build files
[ https://issues.apache.org/jira/browse/HDFS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981506#action_12981506 ] Hadoop QA commented on HDFS-1582: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12468203/HADOOP-6436.patch against trunk revision 1057414. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.fs.permission.TestStickyBit org.apache.hadoop.hdfs.security.TestDelegationToken org.apache.hadoop.hdfs.server.common.TestDistributedUpgrade org.apache.hadoop.hdfs.server.datanode.TestBlockReport org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics org.apache.hadoop.hdfs.server.namenode.TestBackupNode org.apache.hadoop.hdfs.server.namenode.TestBlocksWithNotEnoughRacks org.apache.hadoop.hdfs.server.namenode.TestBlockTokenWithDFS org.apache.hadoop.hdfs.server.namenode.TestCheckpoint org.apache.hadoop.hdfs.server.namenode.TestFsck org.apache.hadoop.hdfs.server.namenode.TestNameEditsConfigs org.apache.hadoop.hdfs.server.namenode.TestStorageRestore org.apache.hadoop.hdfs.TestCrcCorruption org.apache.hadoop.hdfs.TestDatanodeBlockScanner org.apache.hadoop.hdfs.TestDatanodeDeath org.apache.hadoop.hdfs.TestDFSClientRetries org.apache.hadoop.hdfs.TestDFSFinalize org.apache.hadoop.hdfs.TestDFSRollback org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestDFSStartupVersions org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestDFSUpgradeFromImage org.apache.hadoop.hdfs.TestDFSUpgrade org.apache.hadoop.hdfs.TestDistributedFileSystem org.apache.hadoop.hdfs.TestFileAppend2 org.apache.hadoop.hdfs.TestFileAppend3 org.apache.hadoop.hdfs.TestFileAppend4 org.apache.hadoop.hdfs.TestFileAppend org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestFileCreationNamenodeRestart org.apache.hadoop.hdfs.TestFileCreation org.apache.hadoop.hdfs.TestHDFSFileSystemContract org.apache.hadoop.hdfs.TestHDFSTrash org.apache.hadoop.hdfs.TestPread org.apache.hadoop.hdfs.TestQuota org.apache.hadoop.hdfs.TestReplication org.apache.hadoop.hdfs.TestRestartDFS org.apache.hadoop.hdfs.TestSetrepDecreasing org.apache.hadoop.hdfs.TestSetrepIncreasing org.apache.hadoop.hdfs.TestWriteConfigurationToDFS -1 contrib tests. The patch failed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/101//testReport/ Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/101//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/101//console This message is automatically generated.
Remove auto-generated native build files Key: HDFS-1582 URL: https://issues.apache.org/jira/browse/HDFS-1582 Project: Hadoop HDFS Issue Type: Improvement Components: contrib/libhdfs Reporter: Roman Shaposhnik Fix For: 0.23.0 Attachments: HADOOP-6436.patch Original Estimate: 24h Remaining Estimate: 24h The repo currently includes the automake- and autoconf-generated files for the native build. Per discussion on HADOOP-6421, let's remove them and use the host's automake and autoconf. We should also do this for libhdfs and fuse-dfs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1547) Improve decommission mechanism
[ https://issues.apache.org/jira/browse/HDFS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1547: -- Status: Open (was: Patch Available) Improve decommission mechanism -- Key: HDFS-1547 URL: https://issues.apache.org/jira/browse/HDFS-1547 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.23.0 Attachments: HDFS-1547.1.patch, HDFS-1547.patch, show-stats-broken.txt The current decommission mechanism, driven by an exclude file, has several issues. This bug proposes some changes to the mechanism for better manageability. See the proposal in the next comment for more details. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1547) Improve decommission mechanism
[ https://issues.apache.org/jira/browse/HDFS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1547: -- Attachment: HDFS-1547.2.patch The attached patch fixes the following: # Cluster stats updates had a bug. Thanks, Todd, for pointing it out. I have fixed it and added tests to check this. # A node removed from the include file was not being shut down. I added this functionality back and added a test for it. # Decommissioning/decommissioned nodes update the stats as given below (a sketch follows this message): #* The node's used capacity is counted towards cluster used capacity. #* The node's total capacity is not counted towards cluster capacity; only its used capacity is. #* The node's remaining capacity is not counted towards cluster remaining capacity. # Cleaned up TestDecommission and moved it to JUnit 4. Improve decommission mechanism -- Key: HDFS-1547 URL: https://issues.apache.org/jira/browse/HDFS-1547 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.23.0 Attachments: HDFS-1547.1.patch, HDFS-1547.2.patch, HDFS-1547.patch, show-stats-broken.txt The current decommission mechanism, driven by an exclude file, has several issues. This bug proposes some changes to the mechanism for better manageability. See the proposal in the next comment for more details. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
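A sketch of the aggregation rules listed above, assuming DatanodeInfo-style accessors (getDfsUsed(), getCapacity(), getRemaining()); this is illustrative, not the patch's code:

{code:java}
// Illustrative cluster-stats aggregation: a decommissioning or
// decommissioned node contributes only its used space to cluster
// capacity, and nothing to cluster remaining space.
long clusterCapacity = 0, clusterUsed = 0, clusterRemaining = 0;
for (DatanodeDescriptor node : datanodes) {
  clusterUsed += node.getDfsUsed();            // used space always counts
  if (node.isDecommissionInProgress() || node.isDecommissioned()) {
    clusterCapacity += node.getDfsUsed();      // only used space counts as capacity
  } else {
    clusterCapacity += node.getCapacity();
    clusterRemaining += node.getRemaining();
  }
}
{code}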
[jira] Updated: (HDFS-1547) Improve decommission mechanism
[ https://issues.apache.org/jira/browse/HDFS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1547: -- Status: Patch Available (was: Open) Improve decommission mechanism -- Key: HDFS-1547 URL: https://issues.apache.org/jira/browse/HDFS-1547 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.23.0 Attachments: HDFS-1547.1.patch, HDFS-1547.2.patch, HDFS-1547.patch, show-stats-broken.txt The current decommission mechanism, driven by an exclude file, has several issues. This bug proposes some changes to the mechanism for better manageability. See the proposal in the next comment for more details. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1547) Improve decommission mechanism
[ https://issues.apache.org/jira/browse/HDFS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1547: -- Attachment: HDFS-1547.3.patch Additional change to test cluster stats when decommissioning of a datanode is stopped. Improve decommission mechanism -- Key: HDFS-1547 URL: https://issues.apache.org/jira/browse/HDFS-1547 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.23.0 Attachments: HDFS-1547.1.patch, HDFS-1547.2.patch, HDFS-1547.3.patch, HDFS-1547.patch, show-stats-broken.txt The current decommission mechanism, driven by an exclude file, has several issues. This bug proposes some changes to the mechanism for better manageability. See the proposal in the next comment for more details. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1547) Improve decommission mechanism
[ https://issues.apache.org/jira/browse/HDFS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981557#action_12981557 ] Hadoop QA commented on HDFS-1547: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12468302/HDFS-1547.2.patch against trunk revision 1058402. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 13 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.server.namenode.TestStorageRestore -1 contrib tests. The patch failed contrib unit tests. -1 system test framework. The patch failed system test framework compile. Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/107//testReport/ Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/107//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/107//console This message is automatically generated. Improve decommission mechanism -- Key: HDFS-1547 URL: https://issues.apache.org/jira/browse/HDFS-1547 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.23.0 Attachments: HDFS-1547.1.patch, HDFS-1547.2.patch, HDFS-1547.3.patch, HDFS-1547.patch, show-stats-broken.txt The current decommission mechanism, driven by an exclude file, has several issues. This bug proposes some changes to the mechanism for better manageability. See the proposal in the next comment for more details. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1547) Improve decommission mechanism
[ https://issues.apache.org/jira/browse/HDFS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1547: -- Status: Open (was: Patch Available) Improve decommission mechanism -- Key: HDFS-1547 URL: https://issues.apache.org/jira/browse/HDFS-1547 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.23.0 Attachments: HDFS-1547.1.patch, HDFS-1547.2.patch, HDFS-1547.3.patch, HDFS-1547.patch, show-stats-broken.txt The current decommission mechanism, driven by an exclude file, has several issues. This bug proposes some changes to the mechanism for better manageability. See the proposal in the next comment for more details. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1547) Improve decommission mechanism
[ https://issues.apache.org/jira/browse/HDFS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1547: -- Status: Patch Available (was: Open) Improve decommission mechanism -- Key: HDFS-1547 URL: https://issues.apache.org/jira/browse/HDFS-1547 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.23.0 Attachments: HDFS-1547.1.patch, HDFS-1547.2.patch, HDFS-1547.3.patch, HDFS-1547.patch, show-stats-broken.txt The current decommission mechanism, driven by an exclude file, has several issues. This bug proposes some changes to the mechanism for better manageability. See the proposal in the next comment for more details. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1578) First step towards data transfer protocol compatibility: support DatanodeProtocol#getDataTransferProtocolVersion
[ https://issues.apache.org/jira/browse/HDFS-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981647#action_12981647 ] Todd Lipcon commented on HDFS-1578: --- bq. Todd, I really like your proposal of including the data transfer version # in the DN's descriptor. Is there a simple way of making this work without breaking protocol compatibility? Hmm, you're trying to add a compatibility layer that is itself compatible with previous versions? Seems tricky, since we'd have to add a field to DatanodeInfo... so, no, no good ideas if that is a goal. First step towards data transfer protocol compatibility: support DatanodeProtocol#getDataTransferProtocolVersion Key: HDFS-1578 URL: https://issues.apache.org/jira/browse/HDFS-1578 Project: Hadoop HDFS Issue Type: New Feature Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.23.0 HADOOP-6904 allows us to handle RPC changes in a compatible way. However, we have one more protocol to take care of: the data transfer protocol, which a dfs client uses to read data from or write data to a datanode. My proposal is to add a new RPC, getDataTransferProtocolVersion, to DatanodeProtocol that returns the data transfer protocol version running on the datanode. A dfs client gets the datanode's version number before it reads from/writes to a datanode. With this, the dfs client can behave differently according to the datanode's data transfer version. This provides a base for us to make data transfer protocol changes in a compatible way. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
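A sketch of the client-side handshake the description proposes; the proxy type (DatanodeProxy), the version constant, and the caching are assumptions for illustration only:

{code:java}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class DataTransferVersionCache {
  // Hypothetical negotiation: ask each datanode for its data transfer
  // version once, cache it, and pick the wire behavior accordingly.
  private final Map<String, Integer> dnVersions = new HashMap<String, Integer>();

  interface DatanodeProxy {                         // stand-in for the real RPC proxy
    int getDataTransferProtocolVersion() throws IOException;  // the proposed new RPC
  }

  int versionFor(String dnName, DatanodeProxy proxy) throws IOException {
    Integer v = dnVersions.get(dnName);
    if (v == null) {
      v = proxy.getDataTransferProtocolVersion();   // one extra RPC per datanode
      dnVersions.put(dnName, v);
    }
    return v;
  }
}

// Usage, before reading or writing a block (NEW_FORMAT_VERSION assumed):
//   if (cache.versionFor(dn, proxy) >= NEW_FORMAT_VERSION) { /* new wire format */ }
//   else { /* fall back to the old wire format */ }
{code}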