[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=303051=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303051
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 28/Aug/19 17:05
Start Date: 28/Aug/19 17:05
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #1323: HDDS-1094. 
Performance test infrastructure : skip writing user data on Datanode. 
Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 303051)
Time Spent: 1h 50m  (was: 1h 40m)

> Performance test infrastructure : skip writing user data on Datanode
> 
>
> Key: HDDS-1094
> URL: https://issues.apache.org/jira/browse/HDDS-1094
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Goal:
> It can be useful to exercise the IO and control paths in Ozone for simulated 
> large datasets without having huge disk capacity at hand. For example, this 
> will allow us to get things like container reports and incremental container 
> reports, while not needing huge cluster capacity. The 
> [SimulatedFsDataset|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java]
>  does something similar in HDFS. It has been an invaluable tool to simulate 
> large data stores.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=301779=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301779
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 27/Aug/19 08:29
Start Date: 27/Aug/19 08:29
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1323: HDDS-1094. 
Performance test infrastructure : skip writing user data on Datanode. 
Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323#issuecomment-525198173
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 35 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 69 | Maven dependency ordering for branch |
   | +1 | mvninstall | 658 | trunk passed |
   | +1 | compile | 382 | trunk passed |
   | +1 | checkstyle | 78 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 959 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 180 | trunk passed |
   | 0 | spotbugs | 443 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 660 | trunk passed |
   | -0 | patch | 479 | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 29 | Maven dependency ordering for patch |
   | +1 | mvninstall | 631 | the patch passed |
   | +1 | compile | 396 | the patch passed |
   | +1 | javac | 396 | the patch passed |
   | -0 | checkstyle | 40 | hadoop-hdds: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 754 | patch has no errors when building and testing our client artifacts. |
   | -1 | javadoc | 80 | hadoop-hdds generated 2 new + 16 unchanged - 0 fixed = 18 total (was 16) |
   | -1 | findbugs | 382 | hadoop-ozone in the patch failed. |
   ||| _ Other Tests _ |
   | +1 | unit | 343 | hadoop-hdds in the patch passed. |
   | -1 | unit | 415 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 40 | The patch does not generate ASF License warnings. |
   | | | 6631 | |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1323 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 1bda1577d9f1 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 3329257 |
   | Default Java | 1.8.0_222 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/2/artifact/out/diff-checkstyle-hadoop-hdds.txt |
   | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/2/artifact/out/diff-javadoc-javadoc-hadoop-hdds.txt |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/2/artifact/out/patch-findbugs-hadoop-ozone.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/2/artifact/out/patch-unit-hadoop-ozone.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/2/testReport/ |
   | Max. process+thread count | 864 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/container-service hadoop-ozone/tools U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/2/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 301779)
Time Spent: 1h 40m  (was: 1.5h)


[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=300088=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300088
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 23/Aug/19 06:44
Start Date: 23/Aug/19 06:44
Worklog Time Spent: 10m 
  Work Description: supratimdeka commented on pull request #1323: 
HDDS-1094. Performance test infrastructure : skip writing user data on 
Datanode. Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323#discussion_r316993125
 
 

 ##
 File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/RandomKeyGenerator.java
 ##
 @@ -251,6 +252,12 @@ public void init(OzoneConfiguration configuration) throws IOException {
    @Override
    public Void call() throws Exception {
      if (ozoneConfiguration != null) {
 +      if (ozoneConfiguration.getBoolean(HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA,
 
 Review comment:
   This is required because each individual test case in TestDataValidate sets
this parameter in the RandomKeyGenerator builder.
 



Issue Time Tracking
---

Worklog Id: (was: 300088)
Time Spent: 1.5h  (was: 1h 20m)







[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=299406=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299406
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 22/Aug/19 12:40
Start Date: 22/Aug/19 12:40
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1323: HDDS-1094. 
Performance test infrastructure : skip writing user data on Datanode. 
Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323#issuecomment-523888159
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 109 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 63 | Maven dependency ordering for branch |
   | +1 | mvninstall | 631 | trunk passed |
   | +1 | compile | 363 | trunk passed |
   | +1 | checkstyle | 67 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 857 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 151 | trunk passed |
   | 0 | spotbugs | 414 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 610 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 28 | Maven dependency ordering for patch |
   | +1 | mvninstall | 567 | the patch passed |
   | +1 | compile | 402 | the patch passed |
   | +1 | javac | 402 | the patch passed |
   | -0 | checkstyle | 36 | hadoop-hdds: The patch generated 12 new + 0 unchanged - 0 fixed = 12 total (was 0) |
   | -0 | checkstyle | 36 | hadoop-ozone: The patch generated 5 new + 0 unchanged - 0 fixed = 5 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 2 | The patch has no whitespace issues. |
   | +1 | shadedclient | 635 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 145 | the patch passed |
   | -1 | findbugs | 205 | hadoop-hdds generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
   ||| _ Other Tests _ |
   | -1 | unit | 235 | hadoop-hdds in the patch failed. |
   | -1 | unit | 2746 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 43 | The patch does not generate ASF License warnings. |
   | | | 8522 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | FindBugs | module:hadoop-hdds |
   |  |  Possible doublecheck on org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerFactory.instance in org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerFactory.getChunkManager(Configuration, boolean)  At ChunkManagerFactory.java:org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerFactory.getChunkManager(Configuration, boolean)  At ChunkManagerFactory.java:[lines 46-48] |
   | Failed junit tests | hadoop.ozone.container.ozoneimpl.TestOzoneContainer |
   |   | hadoop.ozone.container.server.TestSecureContainerServer |
   |   | hadoop.ozone.TestStorageContainerManager |
   |   | hadoop.ozone.TestOzoneConfigurationFields |
   |   | hadoop.hdds.scm.pipeline.TestRatisPipelineCreateAndDestory |
   |   | hadoop.ozone.client.rpc.Test2WayCommitInRatis |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   |   | hadoop.ozone.TestMiniOzoneCluster |
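
   The FindBugs "possible doublecheck" entry in the table above refers to a lazily initialised static field that is read outside the lock without being declared `volatile`. A minimal sketch of one common remedy follows. It assumes nothing about how the patch eventually resolved the warning; `LazyFactorySketch`, `Manager`, and the `create` supplier are hypothetical names, not Ozone classes.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the lazily created singleton type. */
interface Manager {
  String name();
}

/** Sketch: double-checked locking made safe with a volatile field. */
final class LazyFactorySketch {
  // volatile makes the unsynchronised first read safe under the
  // Java memory model (Java 5 and later).
  private static volatile Manager instance;

  private LazyFactorySketch() { }

  static Manager get(Supplier<Manager> create) {
    Manager local = instance;            // one volatile read on the fast path
    if (local == null) {
      synchronized (LazyFactorySketch.class) {
        local = instance;                // re-check while holding the lock
        if (local == null) {
          local = create.get();
          instance = local;
        }
      }
    }
    return local;
  }

  public static void main(String[] args) {
    Manager first = get(() -> () -> "first");
    Manager second = get(() -> () -> "second");
    System.out.println(first == second);  // prints true
    System.out.println(second.name());    // prints first
  }
}
```

   Making the accessor `synchronized` outright would also silence the warning, at the cost of taking the lock on every call.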
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1323 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux a27542af2401 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / ee7c261 |
   | Default Java | 1.8.0_222 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/1/artifact/out/diff-checkstyle-hadoop-hdds.txt |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/1/artifact/out/diff-checkstyle-hadoop-ozone.txt |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/1/artifact/out/new-findbugs-hadoop-hdds.html |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/1/artifact/out/patch-unit-hadoop-hdds.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1323/1/artifact/out/patch-unit-hadoop-ozone.txt |
   |  Test Results | 

[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=299163=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299163
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 22/Aug/19 04:47
Start Date: 22/Aug/19 04:47
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #1323: HDDS-1094. 
Performance test infrastructure : skip writing user data on Datanode. 
Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323#discussion_r316494881
 
 

 ##
 File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/RandomKeyGenerator.java
 ##
 @@ -251,6 +252,12 @@ public void init(OzoneConfiguration configuration) throws IOException {
    @Override
    public Void call() throws Exception {
      if (ozoneConfiguration != null) {
 +      if (ozoneConfiguration.getBoolean(HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA,
 
 Review comment:
   Is this required because you added a test for freon with the null
ChunkManager?
 



Issue Time Tracking
---

Worklog Id: (was: 299163)
Time Spent: 1h 10m  (was: 1h)

> Performance test infrastructure : skip writing user data on Datanode
> 
>
> Key: HDDS-1094
> URL: https://issues.apache.org/jira/browse/HDDS-1094
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Goal:
> Make Ozone chunk Read/Write operations CPU/network bound for specially 
> constructed performance micro benchmarks.
> Remove disk bandwidth and latency constraints - running ozone data path 
> against extreme low-latency & high throughput storage will expose performance 
> bottlenecks in the flow. But low-latency storage(NVME flash drives, Storage 
> class memory etc) is expensive and availability is limited. Is there a 
> workaround which achieves similar running conditions for the software without 
> actually having the low latency storage? At least for specially constructed 
> datasets -  for example zero-filled blocks (*not* zero-length blocks).
> Required characteristics of the solution:
> No changes in Ozone client, OM and SCM. Changes limited to Datanode, Minimal 
> footprint in datanode code.
> Possible High level Approach:
> The ChunkManager and ChunkUtils can enable writeChunk for zero-filled chunks 
> to be dropped without actually writing to the local filesystem. Similarly, if 
> readChunk can construct a zero-filled buffer without reading from the local 
> filesystem whenever it detects a zero-filled chunk. Specifics of how to 
> detect and record a zero-filled chunk can be discussed on this jira. Also 
> discuss how to control this behaviour and make it available only for internal 
> testing.
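
The approach described above could look roughly like the following. This is a sketch only: `ChunkStore`, `ZeroFillChunkStore`, and the in-memory length map are hypothetical and do not reflect the real `ChunkManager` interface. It assumes the test dataset is known to be zero-filled, so only the length of each chunk needs to be recorded.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified chunk store (not the Ozone ChunkManager API). */
interface ChunkStore {
  void writeChunk(String chunkName, ByteBuffer data);

  ByteBuffer readChunk(String chunkName);
}

/** Drops chunk payloads on write and serves zero-filled buffers on read. */
final class ZeroFillChunkStore implements ChunkStore {
  // Only the length of each chunk is remembered; no user data is stored.
  private final Map<String, Integer> lengths = new ConcurrentHashMap<>();

  @Override
  public void writeChunk(String chunkName, ByteBuffer data) {
    lengths.put(chunkName, data.remaining());
  }

  @Override
  public ByteBuffer readChunk(String chunkName) {
    Integer length = lengths.get(chunkName);
    if (length == null) {
      throw new IllegalArgumentException("Unknown chunk: " + chunkName);
    }
    // ByteBuffer.allocate returns a buffer whose bytes are all zero.
    return ByteBuffer.allocate(length);
  }

  public static void main(String[] args) {
    ChunkStore store = new ZeroFillChunkStore();
    store.writeChunk("chunk-1", ByteBuffer.wrap(new byte[4096]));
    System.out.println(store.readChunk("chunk-1").remaining()); // prints 4096
  }
}
```

A store like this never touches the local filesystem, so disk bandwidth and latency drop out of the measurement. It returns wrong data for any chunk that is not zero-filled, which is why the description asks for the behaviour to be restricted to internal testing.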






[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=299160=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299160
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 22/Aug/19 04:40
Start Date: 22/Aug/19 04:40
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #1323: HDDS-1094. 
Performance test infrastructure : skip writing user data on Datanode. 
Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323#discussion_r316493799
 
 

 ##
 File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/ChunkManagerFactory.java
 ##
 @@ -0,0 +1,89 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.container.keyvalue.impl;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hdds.HddsConfigKeys;
+import org.apache.hadoop.ozone.container.keyvalue.interfaces.ChunkManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA;
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA_DEFAULT;
+
+/**
+ * Select an appropriate ChunkManager implementation as per config setting.
+ * Ozone ChunkManager is a Singleton
+ */
+public final class ChunkManagerFactory {
+  static final Logger LOG = LoggerFactory.getLogger(ChunkManagerFactory.class);
+
+  private static ChunkManager instance = null;
+  private static boolean syncChunks = false;
+
+  private ChunkManagerFactory() {
+  }
+
+  public static ChunkManager getChunkManager(Configuration config,
+      boolean sync) {
+    if (instance == null) {
+      synchronized (ChunkManagerFactory.class) {
+        if (instance == null) {
+          instance = createChunkManager(config, sync);
+          syncChunks = sync;
+        }
+      }
+    }
+
+    Preconditions.checkArgument((syncChunks == sync),
+        "value of sync conflicts with previous invocation");
+    return instance;
+  }
+
+  private static ChunkManager createChunkManager(Configuration config,
+      boolean sync) {
+    ChunkManager manager = null;
+    boolean persist = config.getBoolean(HDDS_CONTAINER_PERSISTDATA,
+        HDDS_CONTAINER_PERSISTDATA_DEFAULT);
+
+    if (persist == false) {
+      boolean scrubber = config.getBoolean(
+          HddsConfigKeys.HDDS_CONTAINERSCRUB_ENABLED,
+          HddsConfigKeys.HDDS_CONTAINERSCRUB_ENABLED_DEFAULT);
+      if (scrubber) {
+        // Data Scrubber needs to be disabled for non-persistent chunks.
+        LOG.warn("Failed to set " + HDDS_CONTAINER_PERSISTDATA + " to false."
+            + " Please set " + HddsConfigKeys.HDDS_CONTAINERSCRUB_ENABLED
+            + " also to false to enable non-persistent containers.");
+        persist = true;
+      }
+    }
+
+    if (persist == true) {
 
 Review comment:
   `if (persist)`
 



Issue Time Tracking
---

Worklog Id: (was: 299160)
Time Spent: 40m  (was: 0.5h)


[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=299162=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299162
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 22/Aug/19 04:40
Start Date: 22/Aug/19 04:40
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #1323: HDDS-1094. 
Performance test infrastructure : skip writing user data on Datanode. 
Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323#discussion_r316493928
 
 

 ##
 File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/ChunkManagerFactory.java
 ##
 @@ -0,0 +1,89 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.container.keyvalue.impl;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hdds.HddsConfigKeys;
+import org.apache.hadoop.ozone.container.keyvalue.interfaces.ChunkManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA;
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA_DEFAULT;
+
+/**
+ * Select an appropriate ChunkManager implementation as per config setting.
+ * Ozone ChunkManager is a Singleton
+ */
+public final class ChunkManagerFactory {
+  static final Logger LOG = LoggerFactory.getLogger(ChunkManagerFactory.class);
+
+  private static ChunkManager instance = null;
+  private static boolean syncChunks = false;
+
+  private ChunkManagerFactory() {
+  }
+
+  public static ChunkManager getChunkManager(Configuration config,
+      boolean sync) {
+    if (instance == null) {
+      synchronized (ChunkManagerFactory.class) {
+        if (instance == null) {
+          instance = createChunkManager(config, sync);
+          syncChunks = sync;
+        }
+      }
+    }
+
+    Preconditions.checkArgument((syncChunks == sync),
+        "value of sync conflicts with previous invocation");
+    return instance;
+  }
+
+  private static ChunkManager createChunkManager(Configuration config,
+      boolean sync) {
+    ChunkManager manager = null;
+    boolean persist = config.getBoolean(HDDS_CONTAINER_PERSISTDATA,
+        HDDS_CONTAINER_PERSISTDATA_DEFAULT);
+
+    if (persist == false) {
+      boolean scrubber = config.getBoolean(
+          HddsConfigKeys.HDDS_CONTAINERSCRUB_ENABLED,
+          HddsConfigKeys.HDDS_CONTAINERSCRUB_ENABLED_DEFAULT);
+      if (scrubber) {
+        // Data Scrubber needs to be disabled for non-persistent chunks.
+        LOG.warn("Failed to set " + HDDS_CONTAINER_PERSISTDATA + " to false."
+            + " Please set " + HddsConfigKeys.HDDS_CONTAINERSCRUB_ENABLED
+            + " also to false to enable non-persistent containers.");
+        persist = true;
+      }
+    }
+
+    if (persist == true) {
+      manager = new ChunkManagerImpl(sync);
+    } else {
+      LOG.warn(HDDS_CONTAINER_PERSISTDATA
 
 Review comment:
   Also augment this message to say that this setting should never be
enabled outside of a test environment.
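
   One way the scrubber guard and a stronger warning could fit together is sketched below. It uses a plain `Map` in place of Hadoop's `Configuration` so that it runs standalone. The key strings, the defaults, `PersistGuardSketch`, and the message wording are all assumptions for illustration, not the values used in the patch.

```java
import java.util.Map;
import java.util.logging.Logger;

/** Sketch of the persist/scrubber guard; not the code from the patch. */
final class PersistGuardSketch {
  private static final Logger LOG =
      Logger.getLogger(PersistGuardSketch.class.getName());

  // Hypothetical key names; the real constants live in HddsConfigKeys.
  static final String PERSIST_KEY = "example.container.persistdata";
  static final String SCRUB_KEY = "example.containerscrub.enabled";

  private PersistGuardSketch() { }

  /** Returns true when chunk data should be written to disk. */
  static boolean shouldPersist(Map<String, Boolean> conf) {
    // Assumed defaults: persistence on, scrubber off.
    boolean persist = conf.getOrDefault(PERSIST_KEY, true);
    boolean scrubber = conf.getOrDefault(SCRUB_KEY, false);

    if (!persist && scrubber) {
      // The scrubber verifies on-disk data, so it cannot run without it.
      LOG.warning("Ignoring " + PERSIST_KEY + "=false because " + SCRUB_KEY
          + " is enabled; disable the scrubber to allow non-persistent chunks.");
      return true;
    }
    if (!persist) {
      LOG.warning(PERSIST_KEY + "=false: user data will NOT be written to disk."
          + " This setting is for test environments only.");
    }
    return persist;
  }

  public static void main(String[] args) {
    System.out.println(shouldPersist(Map.of()));                   // prints true
    System.out.println(shouldPersist(Map.of(PERSIST_KEY, false))); // prints false
  }
}
```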
 



Issue Time Tracking
---

Worklog Id: (was: 299162)
Time Spent: 1h  (was: 50m)


[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=299161=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299161
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 22/Aug/19 04:40
Start Date: 22/Aug/19 04:40
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #1323: HDDS-1094. 
Performance test infrastructure : skip writing user data on Datanode. 
Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323#discussion_r316493799
 
 

 ##
 File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/ChunkManagerFactory.java
 ##
 @@ -0,0 +1,89 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.container.keyvalue.impl;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hdds.HddsConfigKeys;
+import org.apache.hadoop.ozone.container.keyvalue.interfaces.ChunkManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA;
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA_DEFAULT;
+
+/**
+ * Select an appropriate ChunkManager implementation as per config setting.
+ * Ozone ChunkManager is a Singleton
+ */
+public final class ChunkManagerFactory {
+  static final Logger LOG = LoggerFactory.getLogger(ChunkManagerFactory.class);
+
+  private static ChunkManager instance = null;
+  private static boolean syncChunks = false;
+
+  private ChunkManagerFactory() {
+  }
+
+  public static ChunkManager getChunkManager(Configuration config,
+      boolean sync) {
+    if (instance == null) {
+      synchronized (ChunkManagerFactory.class) {
+        if (instance == null) {
+          instance = createChunkManager(config, sync);
+          syncChunks = sync;
+        }
+      }
+    }
+
+    Preconditions.checkArgument((syncChunks == sync),
+        "value of sync conflicts with previous invocation");
+    return instance;
+  }
+
+  private static ChunkManager createChunkManager(Configuration config,
+      boolean sync) {
+    ChunkManager manager = null;
+    boolean persist = config.getBoolean(HDDS_CONTAINER_PERSISTDATA,
+        HDDS_CONTAINER_PERSISTDATA_DEFAULT);
+
+    if (persist == false) {
+      boolean scrubber = config.getBoolean(
+          HddsConfigKeys.HDDS_CONTAINERSCRUB_ENABLED,
+          HddsConfigKeys.HDDS_CONTAINERSCRUB_ENABLED_DEFAULT);
+      if (scrubber) {
+        // Data Scrubber needs to be disabled for non-persistent chunks.
+        LOG.warn("Failed to set " + HDDS_CONTAINER_PERSISTDATA + " to false."
+            + " Please set " + HddsConfigKeys.HDDS_CONTAINERSCRUB_ENABLED
+            + " also to false to enable non-persistent containers.");
+        persist = true;
+      }
+    }
+
+    if (persist == true) {
 
 Review comment:
   `if (persist)`
 



Issue Time Tracking
---

Worklog Id: (was: 299161)
Time Spent: 50m  (was: 40m)


[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=299159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299159
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 22/Aug/19 04:39
Start Date: 22/Aug/19 04:39
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #1323: HDDS-1094. 
Performance test infrastructure : skip writing user data on Datanode. 
Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323#discussion_r316493701
 
 

 ##
 File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/ChunkManagerFactory.java
 ##
 @@ -0,0 +1,89 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.container.keyvalue.impl;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hdds.HddsConfigKeys;
+import org.apache.hadoop.ozone.container.keyvalue.interfaces.ChunkManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA;
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA_DEFAULT;
+
+/**
+ * Select an appropriate ChunkManager implementation as per config setting.
+ * Ozone ChunkManager is a Singleton
+ */
+public final class ChunkManagerFactory {
+  static final Logger LOG = LoggerFactory.getLogger(ChunkManagerFactory.class);
+
+  private static ChunkManager instance = null;
+  private static boolean syncChunks = false;
+
+  private ChunkManagerFactory() {
+  }
+
+  public static ChunkManager getChunkManager(Configuration config,
+      boolean sync) {
+    if (instance == null) {
+      synchronized (ChunkManagerFactory.class) {
+        if (instance == null) {
+          instance = createChunkManager(config, sync);
+          syncChunks = sync;
+        }
+      }
+    }
+
+    Preconditions.checkArgument((syncChunks == sync),
+        "value of sync conflicts with previous invocation");
+    return instance;
+  }
+
+  private static ChunkManager createChunkManager(Configuration config,
+      boolean sync) {
+    ChunkManager manager = null;
+    boolean persist = config.getBoolean(HDDS_CONTAINER_PERSISTDATA,
+        HDDS_CONTAINER_PERSISTDATA_DEFAULT);
+
+    if (persist == false) {
 
 Review comment:
   `!persist`
 



Issue Time Tracking
---

Worklog Id: (was: 299159)
Time Spent: 0.5h  (was: 20m)


[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=299158&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299158
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 22/Aug/19 04:37
Start Date: 22/Aug/19 04:37
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #1323: HDDS-1094. 
Performance test infrastructure : skip writing user data on Datanode. 
Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323#discussion_r316493551
 
 

 ##
 File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/ChunkManagerFactory.java
 ##
 @@ -0,0 +1,89 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.container.keyvalue.impl;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hdds.HddsConfigKeys;
+import org.apache.hadoop.ozone.container.keyvalue.interfaces.ChunkManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA;
+import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_PERSISTDATA_DEFAULT;
+
+/**
+ * Select an appropriate ChunkManager implementation as per config setting.
+ * Ozone ChunkManager is a Singleton
+ */
+public final class ChunkManagerFactory {
+  static final Logger LOG = LoggerFactory.getLogger(ChunkManagerFactory.class);
+
+  private static ChunkManager instance = null;
+  private static boolean syncChunks = false;
+
+  private ChunkManagerFactory() {
+  }
+
+  public static ChunkManager getChunkManager(Configuration config,
+      boolean sync) {
+    if (instance == null) {
 
 Review comment:
   Let's remove this null check.
   
   https://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
   
   In Java, if you are using this pattern, then `instance` should be volatile.
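
A minimal sketch of the volatile variant mentioned in the comment above, not the PR's code: `LazySingletonSketch`, the `CREATED` counter, and the use of a plain `Object` as the instance are hypothetical names chosen for illustration. Without `volatile`, a thread on the unsynchronized fast path may observe a non-null reference before the writes made during construction are visible to it.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a lazily initialized singleton factory. */
final class LazySingletonSketch {
  /** Counts constructions so a caller can confirm only one instance is built. */
  static final AtomicInteger CREATED = new AtomicInteger();

  // volatile gives the write in the synchronized block a happens-before
  // edge to the unsynchronized read on the fast path.
  private static volatile Object instance;

  private LazySingletonSketch() {
  }

  static Object getInstance() {
    Object local = instance; // one volatile read on the fast path
    if (local == null) {
      synchronized (LazySingletonSketch.class) {
        local = instance; // re-check under the lock
        if (local == null) {
          local = new Object();
          CREATED.incrementAndGet();
          instance = local; // publish only after construction completes
        }
      }
    }
    return local;
  }
}
```

Dropping the outer null check instead, as the comment suggests, amounts to taking the lock on every call. That is simpler to reason about and is usually acceptable when the accessor is called rarely.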
 



Issue Time Tracking
---

Worklog Id: (was: 299158)
Time Spent: 20m  (was: 10m)


[jira] [Work logged] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode

2019-08-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1094?focusedWorklogId=298413&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-298413
 ]

ASF GitHub Bot logged work on HDDS-1094:


Author: ASF GitHub Bot
Created on: 21/Aug/19 03:43
Start Date: 21/Aug/19 03:43
Worklog Time Spent: 10m 
  Work Description: supratimdeka commented on pull request #1323: 
HDDS-1094. Performance test infrastructure : skip writing user data on 
Datanode. Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/1323
 
 
   https://issues.apache.org/jira/browse/HDDS-1094
   
   Added an alternate ChunkManager implementation which drops all chunk writes 
without writing to disk. Chunk reads return synthesized zero-filled buffers.
   The goal of this infrastructure is to enable high-throughput tests and 
stress the pipeline (including the Ozone metadata components) without requiring 
faster storage devices like flash drives.
   
   Added an extension to TestDataValidate (with the RandomKeyGenerator) to test 
the alternate ChunkManager.
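
One way to picture the alternate implementation described above, as a hedged sketch rather than the PR's actual classes: the interface `ChunkStoreSketch`, the class `DiscardingChunkStoreSketch`, and the `bytesDropped` counter are all hypothetical and far simpler than Ozone's real `ChunkManager` interface. Writes only account for the payload size, and reads rely on `ByteBuffer.allocate`, which returns a zero-initialized buffer.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified stand-in for a chunk storage interface. */
interface ChunkStoreSketch {
  void writeChunk(String chunkName, ByteBuffer data);

  ByteBuffer readChunk(String chunkName, int length);
}

/** Drops every write and serves every read from a fresh zero-filled buffer. */
final class DiscardingChunkStoreSketch implements ChunkStoreSketch {
  private final AtomicLong bytesDropped = new AtomicLong();

  @Override
  public void writeChunk(String chunkName, ByteBuffer data) {
    // Account for the payload size, then discard it: nothing reaches the disk.
    bytesDropped.addAndGet(data.remaining());
  }

  @Override
  public ByteBuffer readChunk(String chunkName, int length) {
    // ByteBuffer.allocate zero-initializes its contents.
    return ByteBuffer.allocate(length);
  }

  /** Total number of payload bytes accepted and discarded so far. */
  long bytesDropped() {
    return bytesDropped.get();
  }
}
```

Because reads never consult what was written, a store like this is only meaningful for tests whose data is zero-filled to begin with, or whose validation ignores content.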
 



Issue Time Tracking
---

Worklog Id: (was: 298413)
Remaining Estimate: 0h
Time Spent: 10m

> Performance test infrastructure : skip writing user data on Datanode
> 
>
> Key: HDDS-1094
> URL: https://issues.apache.org/jira/browse/HDDS-1094
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Goal:
> Make Ozone chunk read/write operations CPU/network bound for specially 
> constructed performance micro benchmarks.
> Remove disk bandwidth and latency constraints: running the Ozone data path 
> against extremely low-latency, high-throughput storage will expose performance 
> bottlenecks in the flow. But low-latency storage (NVMe flash drives, 
> storage-class memory, etc.) is expensive and its availability is limited. Is 
> there a workaround that achieves similar running conditions for the software 
> without actually having the low-latency storage? At least for specially 
> constructed datasets, for example zero-filled blocks (*not* zero-length blocks).
> Required characteristics of the solution:
> No changes in the Ozone client, OM and SCM. Changes are limited to the 
> Datanode, with a minimal footprint in the Datanode code.
> Possible high-level approach:
> The ChunkManager and ChunkUtils can allow writeChunk to drop zero-filled 
> chunks without actually writing them to the local filesystem. Similarly, 
> readChunk can construct a zero-filled buffer without reading from the local 
> filesystem whenever it detects a zero-filled chunk. The specifics of how to 
> detect and record a zero-filled chunk can be discussed on this jira, as can 
> how to control this behaviour and make it available only for internal 
> testing.
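
The description leaves the detection method open. One simple possibility, offered only as a sketch under that assumption, is a linear scan of the payload; `ZeroFillSketch` and `isZeroFilled` are hypothetical names and are not taken from the issue or the PR.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper illustrating one way to detect a zero-filled payload. */
final class ZeroFillSketch {
  private ZeroFillSketch() {
  }

  /**
   * Returns true if every byte between position and limit is zero.
   * Uses absolute reads, so the buffer's position is left unchanged.
   */
  static boolean isZeroFilled(ByteBuffer data) {
    for (int i = data.position(); i < data.limit(); i++) {
      if (data.get(i) != 0) {
        return false;
      }
    }
    return true;
  }
}
```

An empty buffer is reported as zero-filled here, so a caller that cares about the distinction between zero-filled and zero-length data would need to check the length separately.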



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org