[ 
https://issues.apache.org/jira/browse/HDDS-1530?focusedWorklogId=245982&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-245982
 ]

ASF GitHub Bot logged work on HDDS-1530:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/May/19 12:11
            Start Date: 21/May/19 12:11
    Worklog Time Spent: 10m 
      Work Description: iamcaoxudong commented on pull request #830: HDDS-1530. 
Freon support big files larger than 2GB and add --bufferSize and 
--validateWrites options.
URL: https://github.com/apache/hadoop/pull/830#discussion_r285991232
 
 

 ##########
 File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/RandomKeyGenerator.java
 ##########
 @@ -228,8 +243,20 @@ public Void call() throws Exception {
       init(freon.createOzoneConfiguration());
     }
 
-    keyValue =
-        DFSUtil.string2Bytes(RandomStringUtils.randomAscii(keySize - 36));
+    keyValueBuffer = DFSUtil.string2Bytes(
+        RandomStringUtils.randomAscii(bufferSize));
 
 Review comment:
   Thank you. Generating the random data probably does not take much time, 
though. As xiaoyuyao said below, repeatedly hashing the same content won't 
change the digest, so as an optimization MessageDigest.update() is only 
required for the first and the last calculation. That is indeed fast.
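
One possible reading of the optimization described in the comment above, as a minimal sketch: if a key is made of repeated copies of one buffer, the digest is fed only the first chunk and the last (possibly partial) chunk. The class name SparseDigestSketch, the method sparseDigest, and the choice of SHA-256 are assumptions made for illustration. This is not the code in the pull request.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class SparseDigestSketch {

  /**
   * Digest over the first and the last chunk of a key that consists of
   * repeated copies of buffer, truncated to keySize bytes.
   */
  static byte[] sparseDigest(byte[] buffer, long keySize)
      throws NoSuchAlgorithmException {
    if (buffer.length == 0) {
      throw new IllegalArgumentException("buffer must not be empty");
    }
    // SHA-256 is an assumption; the review comment names no algorithm.
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    if (keySize <= 0) {
      return md.digest();
    }
    // First chunk: a full buffer, or the whole key if the key is shorter.
    int first = (int) Math.min((long) buffer.length, keySize);
    md.update(buffer, 0, first);
    // Last chunk: only fed when the key spans more than one chunk.
    if (keySize > buffer.length) {
      long rem = keySize % buffer.length;
      int last = rem == 0 ? buffer.length : (int) rem;
      md.update(buffer, 0, last);
    }
    return md.digest();
  }

  public static void main(String[] args) throws Exception {
    byte[] buffer = new byte[4096];
    Arrays.fill(buffer, (byte) 'a');
    long keySize = 3L * 1024 * 1024 * 1024 + 17; // larger than 2GB
    byte[] digest = sparseDigest(buffer, keySize);
    System.out.println("digest length in bytes: " + digest.length);
  }
}
```

Under this reading, the digest covers only the bytes passed to update(), and keys whose sizes differ by a whole number of buffers produce the same digest, so the key length would have to be checked separately.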
 


Issue Time Tracking
-------------------

    Worklog Id:     (was: 245982)
    Time Spent: 2h 40m  (was: 2.5h)

> Ozone: Freon: Support big files larger than 2GB and add "--bufferSize" and 
> "--validateWrites" options.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-1530
>                 URL: https://issues.apache.org/jira/browse/HDDS-1530
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: test
>            Reporter: Xudong Cao
>            Assignee: Xudong Cao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Current problems:*
>  1. Freon does not support files larger than 2GB because it uses an int 
> for both the "keySize" parameter and the size of the "keyValue" buffer.
>  2. Freon allocates an entire buffer for each key at once, so if the key size 
> is large and the concurrency is high, Freon frequently reports OOM 
> exceptions.
>  3. Freon lacks an option such as "--validateWrites", so users cannot 
> explicitly request verification after writing.
> *Some solutions:*
>  1. Use a long for the "keySize" parameter so that Freon can support 
> files larger than 2GB.
>  2. Reuse a small buffer rather than allocating an entire key-size buffer 
> at once; the default buffer size is 4K and can be configured with the 
> "--bufferSize" parameter.
>  3. Add a "--validateWrites" option to the Freon command line; users can pass 
> this option to request validation after a write.
>  
>  
>  
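
The first two proposed solutions (a long key size and a small reusable buffer) can be sketched roughly as follows. This is an illustration under stated assumptions, not Freon's code: the class ChunkedKeyWriter, its methods, and the counting stream are hypothetical, and java.util.Random stands in for the RandomStringUtils call shown in the diff so that the example needs no external library.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Random;

class ChunkedKeyWriter {

  /** Discards data and only counts bytes, so the demo needs no real storage. */
  static final class CountingOutputStream extends OutputStream {
    long count;

    @Override
    public void write(int b) {
      count++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
      count += len;
    }
  }

  /** Fills a buffer of bufferSize bytes with printable ASCII (hypothetical helper). */
  static byte[] randomAsciiBuffer(int bufferSize, long seed) {
    Random random = new Random(seed);
    byte[] buffer = new byte[bufferSize];
    for (int i = 0; i < buffer.length; i++) {
      buffer[i] = (byte) (32 + random.nextInt(95)); // code points 32..126
    }
    return buffer;
  }

  /** Writes keySize bytes by reusing one small buffer; returns bytes written. */
  static long writeKey(OutputStream out, long keySize, byte[] buffer)
      throws IOException {
    if (buffer.length == 0) {
      throw new IllegalArgumentException("buffer must not be empty");
    }
    long written = 0; // long, so keys larger than 2GB do not overflow
    while (written < keySize) {
      // The last chunk may be shorter than the buffer.
      int chunk = (int) Math.min((long) buffer.length, keySize - written);
      out.write(buffer, 0, chunk);
      written += chunk;
    }
    return written;
  }

  public static void main(String[] args) throws IOException {
    byte[] buffer = randomAsciiBuffer(4096, 42L); // 4K, the default named above
    long keySize = 3L * 1024 * 1024 * 1024;       // 3GB, larger than an int can hold
    CountingOutputStream out = new CountingOutputStream();
    System.out.println(writeKey(out, keySize, buffer) + " bytes written");
  }
}
```

Memory use per writer stays at the buffer size regardless of the key size, which is the point of the second solution: many concurrent writers no longer each hold a full key in memory.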


