djkevincr commented on a change in pull request #179: GORA-532: Apache Gora 
Benchmark initial pull request for review and comments
URL: https://github.com/apache/gora/pull/179#discussion_r315592665
 
 

 ##########
 File path: gora-benchmark/src/main/java/org/apache/gora/benchmark/GoraBenchmarkClient.java
 ##########
 @@ -0,0 +1,271 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gora.benchmark;
+
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+import java.util.Set;
+import java.util.Vector;
+import org.apache.gora.query.Query;
+import org.apache.gora.query.Result;
+import org.apache.gora.store.DataStore;
+import org.apache.gora.store.DataStoreFactory;
+import org.apache.gora.util.GoraException;
+import org.apache.hadoop.conf.Configuration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import com.yahoo.ycsb.ByteIterator;
+import com.yahoo.ycsb.DB;
+import com.yahoo.ycsb.DBException;
+import com.yahoo.ycsb.Status;
+import com.yahoo.ycsb.StringByteIterator;
+import com.yahoo.ycsb.workloads.CoreWorkload;
+import org.apache.gora.benchmark.generated.User;
+
+/**
+ * The Class GoraBenchmarkClient
+ *
+ * @author sc306 This class extends the Yahoo! Cloud Service Benchmark benchmark
+ *         {@link #com.yahoo.ycsb.DB DB} class to provide functionality for
+ *         {@link #insert(String, String, HashMap) insert},
+ *         {@link #read(String, String, Set, HashMap) read},
+ *         {@link #scan(String, String, int, Set, Vector) scan} and
+ *         {@link #update(String, String, HashMap) update} methods as per Apache
+ *         Gora implementation.
+ */
+public class GoraBenchmarkClient extends DB {
+  private static final Logger LOG = LoggerFactory.getLogger(GoraBenchmarkClient.class);
+  private static final String FIELDS[] = User._ALL_FIELDS;
+  private static volatile boolean executed;
+  public static int fieldCount;
+  /** This is only for set to array conversion in {@link read()} method */
+  private String[] DUMMY_ARRAY = new String[0];
+  DataStore<String, User> dataStore;
+  GoraBenchmarkUtils goraBenchmarkUtils = new GoraBenchmarkUtils();
+  User user = new User();
+  private Properties prop;
+
+  public GoraBenchmarkClient() {
+  }
+
+  /***
+   * Initialisation method. This method is called once for each database
+   * instance.
+   */
+  public void init() throws DBException {
+    try {
+      // Get YCSB properties
+      prop = getProperties();
+      fieldCount = Integer
+          .parseInt(prop.getProperty(CoreWorkload.FIELD_COUNT_PROPERTY, CoreWorkload.FIELD_COUNT_PROPERTY_DEFAULT));
+      String keyClass = prop.getProperty("key.class", "java.lang.String");
+      String persistentClass = prop.getProperty("persistent.class", "org.apache.gora.benchmark.generated.User");
+      Properties p = DataStoreFactory.createProps();
+      dataStore = DataStoreFactory.getDataStore(keyClass, persistentClass, p, new Configuration());
 
 Review comment:
   My point is that there is no need to create a datastore instance as above
for an already initialized GoraBenchmarkClient instance. Say one thread
executes init(), sets the executed boolean variable to 'true', generates the
mapping file etc., and returns. When a second and a third thread then execute
init(), they check the executed variable and, since it is true, return without
generating the mapping file again. For the second and third threads, though,
there is still the overhead of creating a datastore instance and overriding
the class-level field. The instances created first and second will be garbage
collected once the second and third threads override that class-level field,
which I think is unnecessary.
   If you move the DataStoreFactory.getDataStore call into the synchronized
block, no additional datastore objects will be created; only one instance will
exist. The same applies to every class-level field value you set before the
synchronized block: those assignments are overhead. Datastore creation is a
pretty expensive operation because it involves network calls etc.
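   A minimal sketch of the idea, not the actual patch: the expensive creation
happens inside the synchronized block, guarded by the executed flag, so later
callers reuse the single instance. FakeDataStore, SharedInitSketch and LOCK
are hypothetical names that stand in for the Gora DataStore and
GoraBenchmarkClient so the example runs without Gora or YCSB on the classpath.
It also assumes the datastore can be safely shared between client instances,
which would need to be confirmed for the backend in use.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an expensive-to-create Gora DataStore. */
final class FakeDataStore {
  /** Counts constructions so the effect of the guard is observable. */
  static final AtomicInteger CREATED = new AtomicInteger();

  FakeDataStore() {
    CREATED.incrementAndGet();
  }
}

/** Hypothetical stand-in for GoraBenchmarkClient. */
class SharedInitSketch {
  private static final Object LOCK = new Object();
  private static boolean executed;           // guarded by LOCK
  private static FakeDataStore sharedStore;  // guarded by LOCK

  FakeDataStore dataStore;

  void init() {
    synchronized (LOCK) {
      if (!executed) {
        // One-time work: datastore creation (and, in the real client,
        // mapping file generation) happens only for the first caller.
        sharedStore = new FakeDataStore();
        executed = true;
      }
      // Every caller, first or later, reuses the single instance.
      dataStore = sharedStore;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Thread[] threads = new Thread[3];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(() -> new SharedInitSketch().init());
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();
    }
    System.out.println("datastores created: " + FakeDataStore.CREATED.get());
  }
}
```

   Taking the lock on every init() call keeps the sketch simple; init() runs
once per client instance, so the locking cost should be small next to the cost
of creating a datastore.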

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
