I see. Thanks so much!

Bing

On Wed, Aug 29, 2012 at 11:59 PM, N Keywal <nkey...@gmail.com> wrote:

> It's not useful here: if you have a memory issue, it's when you're using the
> list, not when you have finished with it and set it to null.
> You need to monitor the memory consumption of the JVM, on both the client
> and the server.
> Google these keywords; there are many examples on the web.
> Also google ArrayList initialization.
>
> Note as well that the important thing is not the memory size of the
> structure on disk but the size of the "List<Put> puts = new ArrayList<Put>();"
> before the table put.
>
> On Wed, Aug 29, 2012 at 5:42 PM, Bing Li <lbl...@gmail.com> wrote:
>
> > Dear N Keywal,
> >
> > Thanks so much for your reply!
> >
> > The total amount of data is about 110M. The available memory is enough, 2G.
> >
> > In Java, I just set a collection to NULL to collect garbage. Do you think
> > it is fine?
> >
> > Best regards,
> > Bing
> >
> > On Wed, Aug 29, 2012 at 11:22 PM, N Keywal <nkey...@gmail.com> wrote:
> >
> >> Hi Bing,
> >>
> >> You should expect HBase to be slower in the generic case:
> >> 1) it writes much more data (see hbase data model), with extra column
> >>    qualifiers, timestamps & so on.
> >> 2) the data is written multiple times: once in the write-ahead-log, once
> >>    per replica on datanode & so on again.
> >> 3) there are inter-process calls & inter-machine calls on the critical
> >>    path.
> >>
> >> This is the cost of the atomicity, reliability and scalability features.
> >> With these features in mind, HBase is reasonably fast to save data on a
> >> cluster.
> >>
> >> In your specific case (without points 2 & 3 above), the performance
> >> seems to be very bad.
> >>
> >> You should first look at:
> >> - how much time is spent in the put vs. preparing the list
> >> - do you have garbage collection going on? even swap?
> >> - what's the size of your final Array vs. the available memory?
> >>
> >> Cheers,
> >>
> >> N.
> >>
> >> On Wed, Aug 29, 2012 at 4:08 PM, Bing Li <lbl...@gmail.com> wrote:
> >>
> >>> Dear all,
> >>>
> >>> By the way, my HBase is in the pseudo-distributed mode. Thanks!
> >>>
> >>> Best regards,
> >>> Bing
> >>>
> >>> On Wed, Aug 29, 2012 at 10:04 PM, Bing Li <lbl...@gmail.com> wrote:
> >>>
> >>> > Dear all,
> >>> >
> >>> > In my experience, HBase is very slow at saving data. Am I right?
> >>> >
> >>> > For example, today I needed to save data in a HashMap to HBase. It took
> >>> > more than three hours. However, when saving the same HashMap to a file
> >>> > in text format with the redirected System.out, it took only 4.5 seconds!
> >>> >
> >>> > Why is HBase so slow? Is it indexing?
> >>> >
> >>> > My code to save data in HBase is as follows. I think the code must be
> >>> > correct.
> >>> >
> >>> > ......
> >>> > public synchronized void AddVirtualOutgoingHHNeighbors(
> >>> >         ConcurrentHashMap<String, ConcurrentHashMap<String, Set<String>>> hhOutNeighborMap,
> >>> >         int timingScale)
> >>> > {
> >>> >     List<Put> puts = new ArrayList<Put>();
> >>> >
> >>> >     String hhNeighborRowKey;
> >>> >     Put hubKeyPut;
> >>> >     Put groupKeyPut;
> >>> >     Put topGroupKeyPut;
> >>> >     Put timingScalePut;
> >>> >     Put nodeKeyPut;
> >>> >     Put hubNeighborTypePut;
> >>> >
> >>> >     for (Map.Entry<String, ConcurrentHashMap<String, Set<String>>> sourceHubGroupNeighborEntry
> >>> >             : hhOutNeighborMap.entrySet())
> >>> >     {
> >>> >         for (Map.Entry<String, Set<String>> groupNeighborEntry
> >>> >                 : sourceHubGroupNeighborEntry.getValue().entrySet())
> >>> >         {
> >>> >             for (String neighborKey : groupNeighborEntry.getValue())
> >>> >             {
> >>> >                 hhNeighborRowKey = NeighborStructure.HUB_HUB_NEIGHBOR_ROW
> >>> >                         + Tools.GetAHash(sourceHubGroupNeighborEntry.getKey()
> >>> >                                 + groupNeighborEntry.getKey() + timingScale + neighborKey);
> >>> >
> >>> >                 hubKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
> >>> >                 hubKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
> >>> >                         Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_HUB_KEY_COLUMN),
> >>> >                         Bytes.toBytes(sourceHubGroupNeighborEntry.getKey()));
> >>> >                 puts.add(hubKeyPut);
> >>> >
> >>> >                 groupKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
> >>> >                 groupKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
> >>> >                         Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_GROUP_KEY_COLUMN),
> >>> >                         Bytes.toBytes(groupNeighborEntry.getKey()));
> >>> >                 puts.add(groupKeyPut);
> >>> >
> >>> >                 topGroupKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
> >>> >                 topGroupKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
> >>> >                         Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TOP_GROUP_KEY_COLUMN),
> >>> >                         Bytes.toBytes(GroupRegistry.WWW().GetParentGroupKey(groupNeighborEntry.getKey())));
> >>> >                 puts.add(topGroupKeyPut);
> >>> >
> >>> >                 timingScalePut = new Put(Bytes.toBytes(hhNeighborRowKey));
> >>> >                 timingScalePut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
> >>> >                         Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TIMING_SCALE_COLUMN),
> >>> >                         Bytes.toBytes(timingScale));
> >>> >                 puts.add(timingScalePut);
> >>> >
> >>> >                 nodeKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
> >>> >                 nodeKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
> >>> >                         Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_NODE_KEY_COLUMN),
> >>> >                         Bytes.toBytes(neighborKey));
> >>> >                 puts.add(nodeKeyPut);
> >>> >
> >>> >                 hubNeighborTypePut = new Put(Bytes.toBytes(hhNeighborRowKey));
> >>> >                 hubNeighborTypePut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
> >>> >                         Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TYPE_COLUMN),
> >>> >                         Bytes.toBytes(SocialRole.VIRTUAL_NEIGHBOR));
> >>> >                 puts.add(hubNeighborTypePut);
> >>> >             }
> >>> >         }
> >>> >     }
> >>> >
> >>> >     try
> >>> >     {
> >>> >         this.neighborTable.put(puts);
> >>> >     }
> >>> >     catch (IOException e)
> >>> >     {
> >>> >         e.printStackTrace();
> >>> >     }
> >>> > }
> >>> > ......
> >>> >
> >>> > Thanks so much!
> >>> >
> >>> > Best regards,
> >>> > Bing
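
The checks suggested in this thread (time spent preparing the list vs. in the put, heap in use, GC activity, and pre-sizing the ArrayList) could be measured roughly as in the sketch below. It uses only the JDK. The class name, `prepare`, and the `writer` callback are hypothetical stand-ins and not part of the original code: in the real method the list would hold `Put` objects and the writer would be the call to `neighborTable.put(puts)`.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical profiling harness; names are illustrative only.
class PutProfilingSketch {

    /** Approximate heap currently in use by this JVM, in bytes. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Accumulated GC time in milliseconds, summed over all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Stand-in for building the List<Put>; the list is pre-sized to avoid regrowth. */
    public static List<String> prepare(int expectedSize) {
        List<String> items = new ArrayList<>(expectedSize);
        for (int i = 0; i < expectedSize; i++) {
            items.add("row-" + i);
        }
        return items;
    }

    /** Times both phases separately; returns {prepareNanos, writeNanos}. */
    public static long[] profile(int expectedSize, Consumer<List<String>> writer) {
        long heapBefore = usedHeapBytes();
        long gcBefore = totalGcMillis();

        long t0 = System.nanoTime();
        List<String> items = prepare(expectedSize);
        long t1 = System.nanoTime();
        writer.accept(items); // the real code would call the table put here
        long t2 = System.nanoTime();

        System.out.printf("prepare: %d ms, write: %d ms, heap delta: %d bytes, GC time: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000,
                usedHeapBytes() - heapBefore, totalGcMillis() - gcBefore);
        return new long[] {t1 - t0, t2 - t1};
    }

    public static void main(String[] args) {
        // Placeholder writer that does nothing; swap in the real write to compare phases.
        profile(100_000, items -> { });
    }
}
```

If most of the time and GC activity falls in the prepare phase, the problem is on the client side building the list; if it falls in the write phase, the put itself (or the server) is the place to look.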