I originally thought that merging compaction does not strip out garbage
(such as expired or deleted cells), but after reading the source code of
AccessGroup::run_compaction, it seems that every compaction type except
minor compaction strips out garbage. I think the following code fragment
supports this:
...... 
    {
      if (m_in_memory) {
        mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
                                               MergeScanner::ACCUMULATE_COUNTERS);
        scanner = mscanner;
        m_cell_cache_manager->add_immutable_scanner(mscanner, scan_context);
        filtered_cache = new CellCache();
      }
      else if (merging) {
        mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
                                               MergeScanner::IS_COMPACTION |
                                               MergeScanner::RETURN_DELETES);
        scanner = mscanner;
        max_num_entries = 0;
        for (size_t i=merge_offset; i<merge_offset+merge_length; i++) {
          HT_ASSERT(m_stores[i].cs);
          mscanner->add_scanner(m_stores[i].cs->create_scanner(scan_context));
          int divisor = (boost::any_cast<uint32_t>(m_stores[i].cs->get_trailer()->get("flags")) &
                         CellStoreTrailerV6::SPLIT) ? 2 : 1;
          max_num_entries += (boost::any_cast<int64_t>
              (m_stores[i].cs->get_trailer()->get("total_entries")))/divisor;
        }
      }
      else if (major || gc) {
        mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
                                               MergeScanner::IS_COMPACTION |
                                               MergeScanner::ACCUMULATE_COUNTERS);
        scanner = mscanner;
        m_cell_cache_manager->add_immutable_scanner(mscanner, scan_context);
        for (size_t i=0; i<m_stores.size(); i++) {
          HT_ASSERT(m_stores[i].cs);
          mscanner->add_scanner(m_stores[i].cs->create_scanner(scan_context));
          int divisor = (boost::any_cast<uint32_t>(m_stores[i].cs->get_trailer()->get("flags")) &
                         CellStoreTrailerV6::SPLIT) ? 2 : 1;
          max_num_entries += (boost::any_cast<int64_t>
              (m_stores[i].cs->get_trailer()->get("total_entries")))/divisor;
        }
      }
      else {
        scanner = m_cell_cache_manager->create_immutable_scanner(scan_context);
        HT_ASSERT(scanner);
      }
    }

    cellstore->create(cs_file.c_str(), max_num_entries, m_cellstore_props,
                      &m_identifier);

    while (scanner->get(key, value)) {
      cellstore->add(key, value);
      if (m_in_memory)
        filtered_cache->add(key, value);
      scanner->forward();
    }
......

As far as I can tell, merging, major, and GC compactions all perform a
similar operation: each one writes the output of its merge scanner into the
new CellStore file, so I think all of them strip out garbage.
Maybe I have misunderstood the source code. Any ideas would be appreciated!
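
One way to test this reading is to model what a RETURN_DELETES-style flag
could mean. The following is a toy sketch, not Hypertable code: ToyCell,
ToyFlags, and toy_merge are hypothetical names, and the assumption that such
a flag makes the scanner pass delete markers through is mine and has not
been checked against MergeScannerAccessGroup.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy model only: the flag names echo the fragment above, but the semantics
// here are assumptions, not Hypertable's actual implementation.
enum ToyFlags : uint32_t {
  TOY_IS_COMPACTION  = 0x1,
  TOY_RETURN_DELETES = 0x2
};

struct ToyCell {
  std::string row;
  int64_t timestamp;
  bool is_delete;     // true for a delete marker (tombstone)
  std::string value;
};

// Expects cells sorted by (row ascending, timestamp descending).
// A delete marker hides every cell of the same row with an equal or older
// timestamp. The marker itself is emitted only if TOY_RETURN_DELETES is set.
std::vector<ToyCell> toy_merge(const std::vector<ToyCell> &sorted_input,
                               uint32_t flags) {
  std::vector<ToyCell> out;
  bool have_delete = false;
  std::string deleted_row;
  int64_t delete_ts = 0;

  for (const ToyCell &cell : sorted_input) {
    if (cell.is_delete) {
      have_delete = true;
      deleted_row = cell.row;
      delete_ts = cell.timestamp;
      if (flags & TOY_RETURN_DELETES)
        out.push_back(cell);  // keep the tombstone for later compactions
      continue;
    }
    if (have_delete && cell.row == deleted_row && cell.timestamp <= delete_ts)
      continue;               // shadowed by the delete marker: dropped
    out.push_back(cell);
  }
  return out;
}
```

If that assumption holds, a scanner built with IS_COMPACTION | RETURN_DELETES
would drop the cells shadowed by a delete but still write the delete marker
itself into the new CellStore, while the major/GC branch would drop both.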
