The merging compaction typically does not strip out garbage. Only if the
merging compaction is converted into a major compaction will it strip
garbage. See the logic at the top of AccessGroup::run_compaction():
    if (MaintenanceFlag::merging_compaction(maintenance_flags)) {
      m_needs_merging = find_merge_run(&merge_offset, &merge_length);
      if (!m_needs_merging)
        break;
      m_end_merge = (merge_offset + merge_length) == m_stores.size();
      HT_INFOF("Starting Merging Compaction of %s (end_merge=%s)",
               m_full_name.c_str(), m_end_merge ? "true" : "false");
      if (merge_length == m_stores.size())
        major = true;
      [...]
Notice how it converts to a major compaction if all of the cell stores are
marked for merging.
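
To make the rule concrete, here is a minimal, self-contained sketch. It is not
Hypertable code: Cell, is_effectively_major() and compact() are hypothetical
names invented for illustration, and the toy ignores timestamps, revisions and
sort order. It only shows the idea that a merge run covering every cell store
is treated as major (and may drop delete records and the cells they shadow),
while a partial merge run carries everything forward, delete records included.
The assumed reason is that a delete may still need to shadow data in stores
outside the run.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a stored cell (not a Hypertable type).
struct Cell {
  std::string key;
  bool is_delete;  // true for a delete record (tombstone)
};

// Mirrors the check `merge_length == m_stores.size()` from the snippet above:
// a merge run that spans every store is promoted to a major compaction.
bool is_effectively_major(std::size_t merge_length, std::size_t store_count) {
  return merge_length == store_count;
}

// Toy compaction over stores[offset, offset + length).
// Assumption for illustration: only a major compaction strips garbage;
// a partial merge copies every cell, including delete records.
std::vector<Cell> compact(const std::vector<std::vector<Cell>> &stores,
                          std::size_t offset, std::size_t length) {
  assert(offset + length <= stores.size());
  const bool major = is_effectively_major(length, stores.size());

  // Collect deleted keys only when garbage may actually be stripped.
  std::set<std::string> deleted;
  if (major) {
    for (std::size_t i = offset; i < offset + length; i++)
      for (const Cell &cell : stores[i])
        if (cell.is_delete)
          deleted.insert(cell.key);
  }

  std::vector<Cell> out;
  for (std::size_t i = offset; i < offset + length; i++) {
    for (const Cell &cell : stores[i]) {
      // Major: drop delete records and any cell whose key was deleted.
      if (major && (cell.is_delete || deleted.count(cell.key) > 0))
        continue;
      out.push_back(cell);
    }
  }
  return out;
}
```

With three stores, compact(stores, 1, 2) is a partial merge and keeps any
delete record it encounters, whereas compact(stores, 0, 3) covers all stores,
is promoted to major, and drops both the delete record and the deleted cell.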
- Doug
On Tue, Apr 1, 2014 at 8:14 AM, David <[email protected]> wrote:
> I originally thought that a merging compaction does not strip out garbage
> (such as expired or deleted cells), but after reading the source code of
> AccessGroup::run_compaction, it seems that every compaction except a minor
> compaction strips out garbage. I think the following code fragment
> supports this:
> ......
> {
>   if (m_in_memory) {
>     mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
>                                            MergeScanner::ACCUMULATE_COUNTERS);
>     scanner = mscanner;
>     m_cell_cache_manager->add_immutable_scanner(mscanner, scan_context);
>     filtered_cache = new CellCache();
>   }
>   else if (merging) {
>     mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
>                                            MergeScanner::IS_COMPACTION |
>                                            MergeScanner::RETURN_DELETES);
>     scanner = mscanner;
>     max_num_entries = 0;
>     for (size_t i=merge_offset; i<merge_offset+merge_length; i++) {
>       HT_ASSERT(m_stores[i].cs);
>       mscanner->add_scanner(m_stores[i].cs->create_scanner(scan_context));
>       int divisor =
>         (boost::any_cast<uint32_t>(m_stores[i].cs->get_trailer()->get("flags")) &
>          CellStoreTrailerV6::SPLIT) ? 2: 1;
>       max_num_entries += (boost::any_cast<int64_t>
>         (m_stores[i].cs->get_trailer()->get("total_entries")))/divisor;
>     }
>   }
>   else if (major || gc) {
>     mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
>                                            MergeScanner::IS_COMPACTION |
>                                            MergeScanner::ACCUMULATE_COUNTERS);
>     scanner = mscanner;
>     m_cell_cache_manager->add_immutable_scanner(mscanner, scan_context);
>     for (size_t i=0; i<m_stores.size(); i++) {
>       HT_ASSERT(m_stores[i].cs);
>       mscanner->add_scanner(m_stores[i].cs->create_scanner(scan_context));
>       int divisor =
>         (boost::any_cast<uint32_t>(m_stores[i].cs->get_trailer()->get("flags")) &
>          CellStoreTrailerV6::SPLIT) ? 2: 1;
>       max_num_entries += (boost::any_cast<int64_t>
>         (m_stores[i].cs->get_trailer()->get("total_entries")))/divisor;
>     }
>   }
>   else {
>     scanner = m_cell_cache_manager->create_immutable_scanner(scan_context);
>     HT_ASSERT(scanner);
>   }
> }
>
> cellstore->create(cs_file.c_str(), max_num_entries, m_cellstore_props,
>                   &m_identifier);
>
> while (scanner->get(key, value)) {
>   cellstore->add(key, value);
>   if (m_in_memory)
>     filtered_cache->add(key, value);
>   scanner->forward();
> }
> ......
>
> Merging, major, and gc compactions all appear to do the same thing, i.e.
> they all write the purged data into the new cellstore file, so I think they
> all strip out garbage.
> Maybe I misunderstand the source code. Any ideas would be appreciated!
>
> --
> You received this message because you are subscribed to the Google Groups
> "Hypertable Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/hypertable-dev.
> For more options, visit https://groups.google.com/d/optout.
>
--
Doug Judd
CEO, Hypertable Inc.