[ 
https://issues.apache.org/jira/browse/HDDS-1530?focusedWorklogId=244426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-244426
 ]

ASF GitHub Bot logged work on HDDS-1530:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/May/19 23:16
            Start Date: 17/May/19 23:16
    Worklog Time Spent: 10m 
      Work Description: xiaoyuyao commented on pull request #830: HDDS-1530. 
Freon support big files larger than 2GB and add --bufferSize and 
--validateWrites options.
URL: https://github.com/apache/hadoop/pull/830#discussion_r285313507
 
 

 ##########
 File path: 
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/RandomKeyGenerator.java
 ##########
 @@ -622,7 +642,11 @@ public void run() {
                 try (Scope writeScope = GlobalTracer.get()
                     .buildSpan("writeKeyData")
                     .startActive(true)) {
-                  os.write(keyValue);
+                  for (long nrRemaining = keySize - randomValue.length;
+                        nrRemaining > 0; nrRemaining -= bufferSize) {
+                    int curSize = (int)Math.min(bufferSize, nrRemaining);
+                    os.write(keyValueBuffer, 0, curSize);
 
 Review comment:
   Can we generate a fresh random string here, up to the buffer size, and 
update the digest inline? 
   That way we will not repeatedly write the same buffer, which would be 
cached in the OS buffer cache. 
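
A minimal sketch of what this suggestion could look like, not the PR's actual code. The class name ChunkedRandomWriter and the method writeRandomKey are hypothetical, random bytes stand in for the random string, and the loop starts from the full keySize rather than keySize - randomValue.length as in the diff. Each chunk is refilled with new random data before it is written, and the same bytes are fed to the digest, so no full key-size buffer is ever allocated.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.security.MessageDigest;
import java.util.Random;

// Hypothetical helper for illustration only; not part of Freon.
final class ChunkedRandomWriter {

  /**
   * Writes keySize bytes to os in chunks of at most bufferSize bytes.
   * The buffer is refilled with fresh random bytes before every write,
   * and the bytes actually written are fed to the digest.
   * Returns the digest over everything that was written.
   */
  static byte[] writeRandomKey(OutputStream os, long keySize, int bufferSize,
      Random random, MessageDigest digest) throws IOException {
    byte[] buffer = new byte[bufferSize];
    long remaining = keySize;          // long, so sizes above 2GB work
    while (remaining > 0) {
      int curSize = (int) Math.min(bufferSize, remaining);
      random.nextBytes(buffer);        // new content for every chunk
      digest.update(buffer, 0, curSize);
      os.write(buffer, 0, curSize);
      remaining -= curSize;
    }
    return digest.digest();
  }
}
```

The returned digest could then be compared against a digest computed while reading the key back, which is one way the validation step might use it. Generating random data per chunk does add CPU cost to the write path, so it may affect the throughput numbers the tool reports.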
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 244426)
    Time Spent: 40m  (was: 0.5h)

> Ozone: Freon: Support big files larger than 2GB and add "--bufferSize" and 
> "--validateWrites" options.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-1530
>                 URL: https://issues.apache.org/jira/browse/HDDS-1530
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: test
>            Reporter: Xudong Cao
>            Assignee: Xudong Cao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> *Current problems:*
>  1. Freon does not support files larger than 2GB because it uses an int 
> type for both the "keySize" parameter and the "keyValue" buffer size.
>  2. Freon allocates an entire buffer for each key at once, so if the key size 
> is large and the concurrency is high, Freon frequently reports OOM 
> exceptions.
>  3. Freon lacks an option such as "--validateWrites", so users cannot 
> manually specify that verification is required after writing.
> *Some solutions:*
>  1. Use a long type for the "keySize" parameter so that Freon can support 
> files larger than 2GB.
>  2. Reuse a small buffer repeatedly rather than allocating the entire 
> key-size buffer at once; the default buffer size is 4K and can be configured 
> with the "--bufferSize" parameter.
>  3. Add a "--validateWrites" option to the Freon command line; users can 
> provide this option to indicate that validation is required after writing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
