[jira] [Comment Edited] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983105#comment-13983105
 ] 

Kihwal Lee edited comment on HDFS-6293 at 4/28/14 3:33 PM:
---

Attaching a heap histogram of OIV. The max heap was set to 2GB to make it run 
out of heap early and dump the heap. It had loaded only about 3M files/dirs 
before crashing. If we optimize away the PB inefficiencies, we might be able to 
make it work with 50% of the heap, but that will still be too much.



> Issues with OIV processing PB-based fsimages
> 
>
> Key: HDFS-6293
> URL: https://issues.apache.org/jira/browse/HDFS-6293
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Kihwal Lee
>Priority: Blocker
> Attachments: Heap Histogram.html
>
>
> There are issues with OIV when processing fsimages in the protobuf format.
> Due to the internal layout changes introduced by the protobuf-based fsimage, 
> OIV consumes an excessive amount of memory. We tested with an fsimage containing 
> about 140M files/directories. The peak heap usage when processing this image 
> in the pre-protobuf (i.e. pre-2.4.0) format was about 350MB. After converting 
> the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
> heap (max new size was 1GB). It should be possible to process any image with 
> the default heap size of 1.5GB.
> Another issue is the complete change of format/content in OIV's XML output.
> I also noticed that the secret manager section has no tokens while there were 
> unexpired tokens in the original (pre-2.4.0) image. I did not check whether 
> they were also missing in the new PB fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-29 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984353#comment-13984353
 ] 

Kihwal Lee edited comment on HDFS-6293 at 4/29/14 2:46 PM:
---

bq. One solution we can do is - Add an option to print directory tree 
information (along the lines ls -r) that works against fsimage.
It is still there in 2.4. It was removed by HDFS-6164 in 2.5.  Even if we add 
it back, the excessive memory requirement makes it useless.  

bq. We could also consider either building a tool that works efficiently in 
memory or reorganize the fsimage to make that possible.
It would be great if someone could come up with a standalone tool that can dump 
the directory structure and content with, say, a 1-2GB heap AND complete in 
comparable execution time. Otherwise, we will have to rearrange the internal 
layout of the fsimage to be similar to the previous format.

bq. Kihwal Lee, can you please provide the use cases you are using OIV for?
There are existing apps that use a custom Visitor similar to lsr. It outputs 
directory entries with the full path and the list of blocks for each file.
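
For illustration, a minimal sketch of the kind of lsr-style output such a 
custom visitor produces. The {{InodeRecord}} type and the toy records are 
assumptions for the sketch, not the actual OIV visitor API.

{code:java}
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the per-inode data a custom visitor would see;
// this is NOT the real OIV visitor API, only an illustration of the output.
final class InodeRecord {
  final String fullPath;      // full path of the file or directory
  final boolean isDirectory;
  final List<Long> blockIds;  // block ids, empty for directories

  InodeRecord(String fullPath, boolean isDirectory, List<Long> blockIds) {
    this.fullPath = fullPath;
    this.isDirectory = isDirectory;
    this.blockIds = blockIds;
  }
}

public class LsrStyleVisitor {
  // Emits one lsr-like line per entry: type flag, full path and block list.
  static void visit(InodeRecord rec) {
    StringBuilder line = new StringBuilder(rec.isDirectory ? "d " : "- ");
    line.append(rec.fullPath);
    if (!rec.isDirectory) {
      line.append(" blocks=").append(rec.blockIds);
    }
    System.out.println(line);
  }

  public static void main(String[] args) {
    // Toy records; a real tool would stream these out of the fsimage.
    List<InodeRecord> records = Arrays.asList(
        new InodeRecord("/user/foo", true, Collections.<Long>emptyList()),
        new InodeRecord("/user/foo/data.txt", false,
            Arrays.asList(1073741825L, 1073741826L)));
    for (InodeRecord rec : records) {
      visit(rec);
    }
  }
}
{code}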




[jira] [Comment Edited] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-29 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984360#comment-13984360
 ] 

Kihwal Lee edited comment on HDFS-6293 at 4/29/14 3:02 PM:
---

Another way may be to add an option for checkpointing (the 2NN, the SBN, and 
the checkpoint command) to write out an image in a format that is readily 
consumable. This image would be written in addition to the real fsimage and 
would only be meant to be used by OIV or other postprocessing tools.




[jira] [Comment Edited] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-05-05 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990204#comment-13990204
 ] 

Haohui Mai edited comment on HDFS-6293 at 5/6/14 4:57 AM:
--

bq. There is existing apps that use a custom Visitor similar to lsr. It outputs 
directory entries with full path and list of blocks for files.

[~kihwal], can you please elaborate on it? If you're talking about use cases 
like hdfs-du, scanning through the records might be sufficient.
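
As a minimal sketch of this "scan the records" idea, assuming a stand-in 
(parent inode id, file size) record format rather than the actual image record 
layout:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Single pass over (parentInodeId, fileSize) records, summing bytes per
// parent directory without materializing the namespace tree. The long[]
// records are a stand-in for whatever an image scanner would emit.
public class DuByParentScan {
  public static void main(String[] args) {
    long[][] records = {
        {16385L, 128L * 1024 * 1024},   // parent inode id, file size in bytes
        {16385L,  64L * 1024 * 1024},
        {16390L,   1L * 1024 * 1024},
    };
    Map<Long, Long> bytesPerParent = new HashMap<Long, Long>();
    for (long[] rec : records) {
      Long prev = bytesPerParent.get(rec[0]);
      bytesPerParent.put(rec[0], (prev == null ? 0L : prev) + rec[1]);
    }
    for (Map.Entry<Long, Long> e : bytesPerParent.entrySet()) {
      System.out.println("parent inode " + e.getKey() + ": "
          + e.getValue() + " bytes");
    }
  }
}
{code}

This only yields per-immediate-parent totals; rolling sizes up to ancestors 
still needs parent lookups, which a keyed store such as the LevelDB conversion 
discussed below would provide.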

bq. That was the first thing I thought about doing, but the processing time 
matters too.

It might not be as bad as you think. I ran an experiment to see how much time 
is required to convert an fsimage to a LevelDB database on an 8-core Xeon 
E5530 CPU @ 2.4GHz with 24G memory and a 2TB SATA 3 drive @ 7200 rpm. The 
machine runs RHEL 6.2 and Java 1.6. The numbers reported below are comparable 
to the numbers reported in HDFS-5698.

|Size in Old|512M|1G|2G|4G|8G| 
|Size in PB|469M|950M|1.9G|3.7G|7.0G| 
|Converting to LevelDB (ms)|30505|56531|121579|373108|1047121|

The additional latency for an 8G fsimage is about 17 minutes (1,047 seconds).
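
The conversion code itself is not attached here; as a rough sketch of the idea 
(not the actual experiment), the following writes inodeId -> record entries 
into a LevelDB store using the leveldbjni bindings that Hadoop already 
bundles. The key layout and toy payloads are assumptions.

{code:java}
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import org.fusesource.leveldbjni.JniDBFactory;
import org.iq80.leveldb.DB;
import org.iq80.leveldb.Options;

// Writes inodeId -> payload entries into a LevelDB store so that later
// lookups (e.g. resolving parents to build full paths) do not require
// keeping the namespace in memory. The toy payloads stand in for data
// parsed out of the PB fsimage.
public class FsImageToLevelDb {
  public static void main(String[] args) throws IOException {
    Options options = new Options();
    options.createIfMissing(true);
    DB db = JniDBFactory.factory.open(new File("fsimage-leveldb"), options);
    try {
      put(db, 16386L, "parent=16385,name=user");
      put(db, 16387L, "parent=16386,name=foo");
      put(db, 16388L, "parent=16387,name=data.txt");
    } finally {
      db.close();
    }
  }

  private static void put(DB db, long inodeId, String payload) {
    // Big-endian 8-byte inode id keeps keys sorted by inode id.
    byte[] key = ByteBuffer.allocate(8).putLong(inodeId).array();
    db.put(key, payload.getBytes(StandardCharsets.UTF_8));
  }
}
{code}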




[jira] [Comment Edited] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-05-13 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996649#comment-13996649
 ] 

Kihwal Lee edited comment on HDFS-6293 at 5/13/14 5:19 PM:
---

Oops. The patch applies to trunk but it introduces a duplicate import in the 
test case. I will upload separate patches.




[jira] [Comment Edited] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-05-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996922#comment-13996922
 ] 

Suresh Srinivas edited comment on HDFS-6293 at 5/13/14 9:36 PM:


+1 for the patch. Thanks [~wheat9] and [~kihwal] for making these changes.




[jira] [Comment Edited] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-05-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994126#comment-13994126
 ] 

Kihwal Lee edited comment on HDFS-6293 at 5/10/14 3:45 AM:
---

The new patch adds back the old OIV. It can be used with the "hdfs oiv_legacy" 
command. It is based on the 2.3 code plus HDFS-5961, and it uses 
{{NameNodeLayoutVersion}} instead of {{LayoutVersion}}.

I will post more test results later.

