Well... someone should create a ticket for automatic merging behavior. It seems like the sort of thing you'd really want to have to avoid fragmentation in tables with a lot of deletion.

On Dec 8, 2007, at 5:45 PM, Chad Walters (JIRA) wrote:


[ https://issues.apache.org/jira/browse/HADOOP-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549785 ]

Chad Walters commented on HADOOP-2075:
--------------------------------------

Eventually that might be true but merging is currently a manually-triggered operation. Also, unless a more intelligent heuristic were in place, a small region would count against a whole region server until it was merged, which would slow down the loading.

[hbase] Bulk load and dump tools
--------------------------------

                Key: HADOOP-2075
                URL: https://issues.apache.org/jira/browse/HADOOP-2075
            Project: Hadoop
         Issue Type: New Feature
         Components: contrib/hbase
           Reporter: stack
           Priority: Minor

Hbase needs tools to facilitate bulk upload and possibly dumping. Going via the current APIs, particularly if the dataset is large and cell content is small, uploads can take a long time even when using many concurrent clients. PNUTS folks talked of the need for a different API to manage bulk upload/dump. Another notion would be to have the bulk loader tools write regions directly in HDFS.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

