[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2017-01-06 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-6658:
-
Labels:   (was: BB2015-05-TBR)

> Namenode memory optimization - Block replicas list 
> ---
>
> Key: HDFS-6658
> URL: https://issues.apache.org/jira/browse/HDFS-6658
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.4.1
>Reporter: Amir Langer
>Assignee: Daryn Sharp
> Attachments: BlockListOptimizationComparison.xlsx, BlocksMap 
> redesign.pdf, HDFS-6658.patch, HDFS-6658.patch, HDFS-6658.patch, Namenode 
> Memory Optimizations - Block replicas list.docx, New primative indexes.jpg, 
> Old triplets.jpg
>
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a 
> linked list of block references for every DatanodeStorageInfo (called 
> "triplets"). 
> We propose to change the way we store the list in memory. 
> Using primitive integer indexes instead of object references will reduce the 
> memory needed for every block replica (when compressed oops is disabled), and 
> in our new design the list overhead will be per DatanodeStorageInfo rather 
> than per block replica.
> See the attached design doc for details and evaluation results.
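
As a rough illustration of the idea in the description (not the code from the
attached patches), the sketch below threads one storage's block list through
primitive int arrays instead of object references. The class name
IndexedBlockList, its methods, and the fixed capacity are all hypothetical; the
actual data structures are the ones described in the attached design docs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one storage's block list, linked by int slot indexes
// rather than by object references. Not taken from the HDFS-6658 patches.
final class IndexedBlockList {
    private static final int NIL = -1;   // end of a chain
    private static final int FREE = -2;  // marks prev[] of an unused slot

    private final long[] blockIds; // slot -> block id
    private final int[] next;      // slot -> next slot (live list or free chain)
    private final int[] prev;      // slot -> previous live slot, NIL, or FREE
    private int head = NIL;        // first live slot
    private int freeHead;          // first unused slot
    private int size = 0;

    IndexedBlockList(int capacity) {
        blockIds = new long[capacity];
        next = new int[capacity];
        prev = new int[capacity];
        for (int i = 0; i < capacity; i++) {
            next[i] = (i + 1 < capacity) ? i + 1 : NIL; // chain all slots as free
            prev[i] = FREE;
        }
        freeHead = (capacity > 0) ? 0 : NIL;
    }

    /** Adds a block at the head of the list and returns its slot index. */
    int add(long blockId) {
        if (freeHead == NIL) {
            throw new IllegalStateException("no free slots");
        }
        int slot = freeHead;
        freeHead = next[slot];
        blockIds[slot] = blockId;
        prev[slot] = NIL;
        next[slot] = head;
        if (head != NIL) {
            prev[head] = slot;
        }
        head = slot;
        size++;
        return slot;
    }

    /** Unlinks the block stored in the given slot and recycles the slot. */
    void remove(int slot) {
        // Precondition check: refuse to unlink a slot that is not live.
        if (slot < 0 || slot >= prev.length || prev[slot] == FREE) {
            throw new IllegalArgumentException("slot " + slot + " is not in use");
        }
        int p = prev[slot];
        int n = next[slot];
        if (p != NIL) { next[p] = n; } else { head = n; }
        if (n != NIL) { prev[n] = p; }
        prev[slot] = FREE;
        next[slot] = freeHead;
        freeHead = slot;
        size--;
    }

    int size() { return size; }

    /** Returns the block ids in list order, most recently added first. */
    List<Long> toList() {
        List<Long> out = new ArrayList<>(size);
        for (int s = head; s != NIL; s = next[s]) {
            out.add(blockIds[s]);
        }
        return out;
    }
}
```

In this sketch each link costs one 4-byte int whatever the JVM's reference
width, and the arrays are allocated once per list rather than once per entry,
which is the sense in which the overhead moves from the replica to the storage.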



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2017-01-06 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-6658:
-
Target Version/s:   (was: 2.8.0)



[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2015-06-29 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-6658:
--
Target Version/s: 2.8.0  (was: 2.6.0)

Moving features/enhancements out of previously closed releases into the next 
minor release 2.8.0.


Auto-Re: [jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2015-06-29 Thread wsb
Your email has been received! Thank you!

[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2015-05-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-6658:
---
Labels: BB2015-05-TBR  (was: )


[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2015-03-11 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-6658:
--
Attachment: HDFS-6658.patch

Sorry, a last-minute change to revert the code to be as close as possible to 
the current code broke the replication monitor with an NPE.

The preconditions I've added are detecting some bugs in the BM that are 
currently masked.  Namely, the BM is designed to return phony values for 
blocks not in the blocks map (i.e. 0 counts, 0 storages, etc.) instead of 
making the caller deal with the situation.  I added logging to getStorages for 
when it iterates a non-existent block.
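
To make the masking problem above concrete, here is a minimal sketch that
contrasts a lenient lookup (an unknown block looks like a block with zero
replicas) with a strict one guarded by a precondition. The class ReplicaCounts
and its methods are hypothetical and are not the block manager's actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a blocks map that tracks live replica counts.
final class ReplicaCounts {
    private final Map<Long, Integer> liveReplicas = new HashMap<>();

    void put(long blockId, int count) {
        liveReplicas.put(blockId, count);
    }

    /** Lenient: an unknown block is reported as having zero replicas. */
    int countLenient(long blockId) {
        return liveReplicas.getOrDefault(blockId, 0);
    }

    /** Strict: asking about an unknown block is treated as a caller bug. */
    int countStrict(long blockId) {
        Integer count = liveReplicas.get(blockId);
        if (count == null) {
            throw new IllegalStateException(
                "block " + blockId + " is not in the blocks map");
        }
        return count;
    }
}
```

With the lenient form a caller cannot tell a missing block from a block with
no replicas; the strict form surfaces that case at the call site.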


[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2015-03-11 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-6658:
--
Attachment: New primative indexes.jpg
Old triplets.jpg

Excuse my bad whiteboard drawing skills.  These pictures attempt to illustrate 
the triplets vs the new data structures.  They show a 3-block file with 
replication factor 2 that is stored on 2 nodes.  I started trying to diagram a 
replication factor 3 picture with proper block placement on multiple nodes, but 
it was spaghetti for the triplets.  My whiteboard isn't that big.

Everything is a reference in the triplets pic.  The new pic is based on 
primitive indexes.  The design I recently posted goes into more detail on the 
indexing.


[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2015-03-10 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-6658:
--
Attachment: HDFS-6658.patch

There's likely to be some todo debris, maybe snippets of other work, and some 
tests might fail, but this is the fruit of a multi-month effort.  I was able to 
bench an earlier prototype and found block report processing times comparable 
to the current code, within +/-5%.  Since then, I've lost sleep ensuring there 
are adequate precondition and co-modification checks to prevent the data 
structures from going off the rails and scribbling themselves to death.

I need to re-benchmark on a large perf cluster to make sure I didn't regress on 
performance.

The change looks big, but it's actually a lot of tests, and I feel I need even 
more.  The pre-existing logic has generally just moved.  There are no 
fundamental changes.

The main change required to make this primitive array approach work is 
requiring the block manager (BM) to manage all block-related data structures.  
The DatanodeDescriptors (DNDs) and DatanodeStorageInfos (DNSIs) become dumb 
model objects controlled by the BM. 


[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2015-02-27 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-6658:
--
Attachment: BlocksMap redesign.pdf

For months I've been adapting the concepts of Amir's work, and extensively 
profiling implementations.  Here is a rough design doc that describes a working 
implementation.   I'll post a patch, hopefully this afternoon, after rebasing 
on trunk.


[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2014-09-10 Thread Amir Langer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Langer updated HDFS-6658:
--
Attachment: HDFS-6658.patch


[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2014-09-10 Thread Amir Langer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Langer updated HDFS-6658:
--
Status: Patch Available  (was: Open)

Patch includes all sub-tasks.


[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2014-08-28 Thread Amir Langer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Langer updated HDFS-6658:
--

Attachment: BlockListOptimizationComparison.xlsx

Added a comparison of memory use with and without CompressedOops for both the 
original and the modified code.
(Memory in bytes, collected using jmap.)
The difference is marginal with compressed oops but significant without it.
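
A back-of-the-envelope sketch of why the reference width matters: on a 64-bit
HotSpot JVM an object reference typically takes 8 bytes without compressed oops
and 4 bytes with them, while an int index is always 4 bytes. The helper below
is hypothetical and counts only that width difference. It ignores object
headers, alignment, and the actual layout in the patch, and the figure of 3
links per replica is an assumption taken from the name "triplets"; the measured
numbers are in the attached spreadsheet.

```java
// Hypothetical arithmetic helper; not part of any HDFS code.
final class LinkCost {
    static final int INT_BYTES = Integer.BYTES; // 4 bytes per int index

    /** Bytes saved per link when a reference is replaced by an int index. */
    static int savedPerLink(int referenceBytes) {
        return referenceBytes - INT_BYTES;
    }

    /** Bytes saved across all replicas, counting link width only. */
    static long savedBytes(long replicas, int linksPerReplica, int referenceBytes) {
        return replicas * linksPerReplica * savedPerLink(referenceBytes);
    }
}
```

For example, LinkCost.savedBytes(1_000_000L, 3, 8) evaluates to 12,000,000
bytes, while LinkCost.savedBytes(1_000_000L, 3, 4) evaluates to 0, so under
this simplified model any gain with compressed oops has to come from the list
layout rather than from the link width.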




[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2014-07-10 Thread Amir Langer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Langer updated HDFS-6658:
--

Attachment: Namenode Memory Optimizations - Block replicas list.docx

Design doc. + Evaluation results


[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2014-07-10 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6658:
-

Target Version/s: 2.6.0


[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2014-07-10 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6658:
-

Fix Version/s: (was: 2.5.0)
   (was: 3.0.0)


[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list

2014-07-10 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6658:
-

Assignee: Amir Langer
