The merging compaction typically does not strip out garbage.  Only if the
merging compaction is converted into a major compaction will it strip
garbage.  See the logic at the top of AccessGroup::run_compaction():

if (MaintenanceFlag::merging_compaction(maintenance_flags)) {
  m_needs_merging = find_merge_run(&merge_offset, &merge_length);
  if (!m_needs_merging)
    break;
  m_end_merge = (merge_offset + merge_length) == m_stores.size();
  HT_INFOF("Starting Merging Compaction of %s (end_merge=%s)",
           m_full_name.c_str(), m_end_merge ? "true" : "false");
  if (merge_length == m_stores.size())
    major = true;
  [...]
Notice how it converts to a major compaction if all of the cell stores are
marked for merging.
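
The two comparisons in that excerpt can be pulled out into a small standalone
sketch to make the rule easier to see. This is only an illustration, not
Hypertable code: MergePlan and plan_merge are hypothetical names, and
store_count stands in for m_stores.size().

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of the decision; not a Hypertable type.
struct MergePlan {
  bool end_merge;  // the merge run reaches the last cell store
  bool major;      // the merge run covers every cell store
};

// Mirrors the two comparisons in the excerpt above. store_count is an
// assumed stand-in for m_stores.size().
inline MergePlan plan_merge(std::size_t merge_offset, std::size_t merge_length,
                            std::size_t store_count) {
  // The merge run must lie within the list of cell stores.
  assert(merge_offset + merge_length <= store_count);
  MergePlan plan;
  plan.end_merge = (merge_offset + merge_length) == store_count;
  plan.major = merge_length == store_count;
  return plan;
}
```

With four cell stores, a run of two stores ending at the last one sets
end_merge but not major; only a run covering all four sets major.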

- Doug


On Tue, Apr 1, 2014 at 8:14 AM, David <[email protected]> wrote:

> I originally thought that a merging compaction does not strip out garbage
> (such as expired or deleted cells), but after reading the source code of
> AccessGroup::run_compaction, it seems that every compaction except a minor
> compaction strips out garbage. I think the following code fragment
> supports this:
> ......
> {
>   if (m_in_memory) {
>     mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
>                                            MergeScanner::ACCUMULATE_COUNTERS);
>     scanner = mscanner;
>     m_cell_cache_manager->add_immutable_scanner(mscanner, scan_context);
>     filtered_cache = new CellCache();
>   }
>   else if (merging) {
>     mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
>                                            MergeScanner::IS_COMPACTION |
>                                            MergeScanner::RETURN_DELETES);
>     scanner = mscanner;
>     max_num_entries = 0;
>     for (size_t i=merge_offset; i<merge_offset+merge_length; i++) {
>       HT_ASSERT(m_stores[i].cs);
>       mscanner->add_scanner(m_stores[i].cs->create_scanner(scan_context));
>       int divisor =
>         (boost::any_cast<uint32_t>(m_stores[i].cs->get_trailer()->get("flags"))
>          & CellStoreTrailerV6::SPLIT) ? 2 : 1;
>       max_num_entries +=
>         (boost::any_cast<int64_t>(m_stores[i].cs->get_trailer()->get("total_entries")))
>         / divisor;
>     }
>   }
>   else if (major || gc) {
>     mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
>                                            MergeScanner::IS_COMPACTION |
>                                            MergeScanner::ACCUMULATE_COUNTERS);
>     scanner = mscanner;
>     m_cell_cache_manager->add_immutable_scanner(mscanner, scan_context);
>     for (size_t i=0; i<m_stores.size(); i++) {
>       HT_ASSERT(m_stores[i].cs);
>       mscanner->add_scanner(m_stores[i].cs->create_scanner(scan_context));
>       int divisor =
>         (boost::any_cast<uint32_t>(m_stores[i].cs->get_trailer()->get("flags"))
>          & CellStoreTrailerV6::SPLIT) ? 2 : 1;
>       max_num_entries +=
>         (boost::any_cast<int64_t>(m_stores[i].cs->get_trailer()->get("total_entries")))
>         / divisor;
>     }
>   }
>   else {
>     scanner = m_cell_cache_manager->create_immutable_scanner(scan_context);
>     HT_ASSERT(scanner);
>   }
> }
>
> cellstore->create(cs_file.c_str(), max_num_entries, m_cellstore_props,
>                   &m_identifier);
>
> while (scanner->get(key, value)) {
>   cellstore->add(key, value);
>   if (m_in_memory)
>     filtered_cache->add(key, value);
>   scanner->forward();
> }
> ......
>
> Merging, major and gc compactions all appear to do the same thing, i.e.
> they all write the purged data into the new cellstore file, so I think they
> all strip out garbage.
> Maybe I have misunderstood the source code. Any ideas would be appreciated!
>
> --
> You received this message because you are subscribed to the Google Groups
> "Hypertable Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/hypertable-dev.
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Doug Judd
CEO, Hypertable Inc.

