[jira] [Created] (HDFS-3722) TaskTracker's heartbeat is out of control

2012-07-24 Thread Liyin Liang (JIRA)
Liyin Liang created HDFS-3722:
------------------------------

 Summary: TaskTracker's heartbeat is out of control
 Key: HDFS-3722
 URL: https://issues.apache.org/jira/browse/HDFS-3722
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 1.0.3, 1.0.2, 1.0.1, 1.0.0
Reporter: Liyin Liang




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Created: (HDFS-1572) Checkpointer should trigger checkpoint with specified period.

2011-01-06 Thread Liyin Liang (JIRA)
Checkpointer should trigger checkpoint with specified period.
-------------------------------------------------------------

 Key: HDFS-1572
 URL: https://issues.apache.org/jira/browse/HDFS-1572
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Liyin Liang
Priority: Blocker


{code}
long now = now();
boolean shouldCheckpoint = false;
if (now >= lastCheckpointTime + periodMSec) {
  shouldCheckpoint = true;
} else {
  long size = getJournalSize();
  if (size >= checkpointSize)
    shouldCheckpoint = true;
}
{code}
The configuration property {{dfs.namenode.checkpoint.period}} determines the 
checkpoint period. However, with the code above, the Checkpointer triggers a 
checkpoint every 5 minutes regardless of the configured value, because 
periodMSec = 5*60*1000. Consistent with SecondaryNameNode.java, the first 
*if* statement should be:
{code}
if (now >= lastCheckpointTime + 1000 * checkpointPeriod) {
{code}


-- 
This message is automatically generated by JIRA.
--------------------------------------------------
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1572) Checkpointer should trigger checkpoint with specified period.

2011-01-06 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated HDFS-1572:
------------------------------

Attachment: 1527-1.diff

patch for trunk.

> Checkpointer should trigger checkpoint with specified period.
> -------------------------------------------------------------
>
> Key: HDFS-1572
> URL: https://issues.apache.org/jira/browse/HDFS-1572
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Liyin Liang
>Priority: Blocker
> Attachments: 1527-1.diff
>




[jira] Updated: (HDFS-1572) Checkpointer should trigger checkpoint with specified period.

2011-01-09 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated HDFS-1572:
------------------------------

Attachment: 1572-2.diff

Hi Jakob,
Thanks for your work and advice. We can move some of the logic into static 
functions to make it more testable. I'll attach a patch with a simple test 
case. Any thoughts?

> Checkpointer should trigger checkpoint with specified period.
> -------------------------------------------------------------
>
> Key: HDFS-1572
> URL: https://issues.apache.org/jira/browse/HDFS-1572
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Liyin Liang
>Priority: Blocker
> Fix For: 0.21.0
>
> Attachments: 1527-1.diff, 1572-2.diff, HDFS-1572.patch
>




[jira] Created: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters

2011-01-12 Thread Liyin Liang (JIRA)
Improve backup-node sync performance by wrapping RPC parameters
---------------------------------------------------------------

 Key: HDFS-1583
 URL: https://issues.apache.org/jira/browse/HDFS-1583
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: Liyin Liang


The journal edit records are sent by the active name-node to the backup-node 
with RPC:
{code}
  public void journal(NamenodeRegistration registration,
  int jAction,
  int length,
  byte[] records) throws IOException;
{code}
During the name-node throughput benchmark, the size of the byte array 
_records_ is around *8000*, so serialization and deserialization become 
time-consuming. I wrote a simple application to test RPC with a byte-array 
parameter: when the size reaches 8000, each RPC call needs about 6 ms, while 
the name-node syncs 8 KB to local disk in only 0.3~0.4 ms.




[jira] Updated: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters

2011-01-12 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated HDFS-1583:
------------------------------

Attachment: HDFS-1583-1.patch

Attaching a patch that wraps the journal function's parameters in a writable 
class, _JournalArgs_. The following table shows the RPC call performance with 
and without this patch. Each cell is the average time per call: total elapsed 
time (ms) divided by 10,000 calls.
|array size|1k|2k|3k|4k|5k|6k|7k|8k|
|without patch|1.2212|1.9266|2.5415|3.2025|4.8677|4.5679|5.2211|5.9386|
|with patch|0.4774|0.4087|0.4521|0.4375|0.4215|0.4616|0.4551|0.4844|
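The idea behind the wrapper can be sketched with plain java.io (the real _JournalArgs_ would implement org.apache.hadoop.io.Writable; the field set and method bodies here are assumptions for illustration, not the patch itself): the byte array is written once as a length prefix plus raw bytes, so an 8000-byte payload costs 8000 bytes plus a few bytes of header instead of one serialized object per element.

```java
import java.io.*;

// Hypothetical sketch of a JournalArgs-style wrapper using only java.io.
// Field names and wire layout are assumptions, not the actual patch.
public class JournalArgsSketch {
  int jAction;
  byte[] records;

  JournalArgsSketch(int jAction, byte[] records) {
    this.jAction = jAction;
    this.records = records;
  }

  void write(DataOutput out) throws IOException {
    out.writeInt(jAction);
    out.writeInt(records.length); // length prefix
    out.write(records);           // raw bytes: no per-element object overhead
  }

  static JournalArgsSketch read(DataInput in) throws IOException {
    int action = in.readInt();
    byte[] recs = new byte[in.readInt()];
    in.readFully(recs);
    return new JournalArgsSketch(action, recs);
  }

  public static void main(String[] args) throws IOException {
    JournalArgsSketch sent = new JournalArgsSketch(1, new byte[8000]);
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    sent.write(new DataOutputStream(buf));
    // 4 bytes (jAction) + 4 bytes (length) + 8000 bytes (payload) = 8008
    System.out.println(buf.size());

    JournalArgsSketch got =
        read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
    System.out.println(got.records.length);   // 8000: round-trips intact
  }
}
```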

> Improve backup-node sync performance by wrapping RPC parameters
> ---------------------------------------------------------------
>
> Key: HDFS-1583
> URL: https://issues.apache.org/jira/browse/HDFS-1583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Liyin Liang
> Attachments: HDFS-1583-1.patch
>




[jira] Updated: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters

2011-01-12 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated HDFS-1583:
------------------------------

Fix Version/s: 0.23.0
   Status: Patch Available  (was: Open)

> Improve backup-node sync performance by wrapping RPC parameters
> ---------------------------------------------------------------
>
> Key: HDFS-1583
> URL: https://issues.apache.org/jira/browse/HDFS-1583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Liyin Liang
> Fix For: 0.23.0
>
> Attachments: HDFS-1583-1.patch
>




[jira] Commented: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters

2011-01-13 Thread Liyin Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981176#action_12981176
 ] 

Liyin Liang commented on HDFS-1583:
-----------------------------------

Hi Todd,
This is mainly caused by the serialization of the array, which is done by:
{code}
ObjectWritable::writeObject(DataOutput out, Object instance,
                            Class declaredClass,
                            Configuration conf)
{code}
This function traverses the array and serializes each element as an object. 
In my test, a byte array with 8000 elements grew to 56008 bytes after 
serialization (2.4 ms), while the wrapped object was 8094 bytes after 
serialization (0.03 ms).
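The blow-up can be reproduced with a rough simulation in plain java.io (this approximates, rather than reproduces, ObjectWritable's wire format): writing the declared class name alongside every element costs about 7 bytes per byte of payload, which lands on the 56008 figure above.

```java
import java.io.*;

// Rough simulation of per-element object serialization vs. a raw dump.
// It approximates ObjectWritable's format (class name per element); it is
// not Hadoop's actual implementation.
public class PerElementCost {
  // Each element carries writeUTF("byte") (6 bytes) plus its 1-byte value.
  static int perElementSize(byte[] records) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeUTF("[B");            // declared class of the array
    out.writeInt(records.length);  // element count
    for (byte b : records) {
      out.writeUTF("byte");        // element class name, repeated every time
      out.writeByte(b);
    }
    return buf.size();
  }

  // Length prefix + raw bytes, as a wrapper class can do instead.
  static int rawSize(byte[] records) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeInt(records.length);
    out.write(records);
    return buf.size();
  }

  public static void main(String[] args) throws IOException {
    byte[] records = new byte[8000];
    System.out.println(perElementSize(records)); // 4 + 4 + 8000*7 = 56008
    System.out.println(rawSize(records));        // 4 + 8000 = 8004
  }
}
```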

By the way, there is already an array wrapper class:
{code}
public class ArrayWritable implements Writable
{code}
This class is used in FSEditLog to log operations, e.g. 
FSEditLog::logMkDir(String path, INode newNode).
I'll update the patch to use ArrayWritable.

> Improve backup-node sync performance by wrapping RPC parameters
> ---------------------------------------------------------------
>
> Key: HDFS-1583
> URL: https://issues.apache.org/jira/browse/HDFS-1583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Liyin Liang
> Fix For: 0.23.0
>
> Attachments: HDFS-1583-1.patch
>




[jira] Updated: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters

2011-01-13 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated HDFS-1583:
------------------------------

Attachment: HDFS-1583-2.patch

Added a test case for _JournalArgs_.
_ArrayWritable_ only accepts Writable objects, so we cannot use it to wrap a 
byte array.

> Improve backup-node sync performance by wrapping RPC parameters
> ---------------------------------------------------------------
>
> Key: HDFS-1583
> URL: https://issues.apache.org/jira/browse/HDFS-1583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Liyin Liang
> Fix For: 0.23.0
>
> Attachments: HDFS-1583-1.patch, HDFS-1583-2.patch
>




[jira] Updated: (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters

2011-01-15 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated HDFS-1583:
------------------------------

Attachment: test-rpc.diff

Hi Konstantin,
I'll attach the benchmark code as a diff file; it is a simple client-server 
app based on the trunk common project.

> Improve backup-node sync performance by wrapping RPC parameters
> ---------------------------------------------------------------
>
> Key: HDFS-1583
> URL: https://issues.apache.org/jira/browse/HDFS-1583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Liyin Liang
>Assignee: Liyin Liang
> Fix For: 0.23.0
>
> Attachments: HDFS-1583-1.patch, HDFS-1583-2.patch, test-rpc.diff
>




[jira] [Updated] (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters

2011-03-21 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated HDFS-1583:
------------------------------

Status: Open  (was: Patch Available)

HADOOP-6949 has been accepted and committed.

> Improve backup-node sync performance by wrapping RPC parameters
> ---------------------------------------------------------------
>
> Key: HDFS-1583
> URL: https://issues.apache.org/jira/browse/HDFS-1583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Liyin Liang
>Assignee: Liyin Liang
> Fix For: 0.23.0
>
> Attachments: HDFS-1583-1.patch, HDFS-1583-2.patch, test-rpc.diff
>



[jira] [Resolved] (HDFS-1583) Improve backup-node sync performance by wrapping RPC parameters

2011-03-21 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang resolved HDFS-1583.
-------------------------------

Resolution: Fixed

This bug has been fixed by HADOOP-6949.

> Improve backup-node sync performance by wrapping RPC parameters
> ---------------------------------------------------------------
>
> Key: HDFS-1583
> URL: https://issues.apache.org/jira/browse/HDFS-1583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Liyin Liang
>Assignee: Liyin Liang
> Fix For: 0.23.0
>
> Attachments: HDFS-1583-1.patch, HDFS-1583-2.patch, test-rpc.diff
>
