Well... someone should create a ticket for automatic merging
behavior. It seems like the sort of thing you'd really want to have
to avoid fragmentation in tables with a lot of deletion.
On Dec 8, 2007, at 5:45 PM, Chad Walters (JIRA) wrote:
[ https://issues.apache.org/jira/browse/HADOOP-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549785 ]
Chad Walters commented on HADOOP-2075:
--------------------------------------
Eventually that might be true, but merging is currently a manually
triggered operation. Also, unless a more intelligent heuristic were
in place, a small region would count against a whole region server
until it was merged, which would slow down the loading.
[hbase] Bulk load and dump tools
--------------------------------
Key: HADOOP-2075
URL: https://issues.apache.org/jira/browse/HADOOP-2075
Project: Hadoop
Issue Type: New Feature
Components: contrib/hbase
Reporter: stack
Priority: Minor
HBase needs tools to facilitate bulk upload and possibly dumping.
Going via the current APIs, particularly if the dataset is large
and cell content is small, uploads can take a long time even when
using many concurrent clients.
PNUTS folks talked of the need for a different API to manage bulk
upload/dump.
Another notion would be to have the bulk loader tools somehow
write regions directly in HDFS.