Luo Chen has posted comments on this change.

Change subject: Avoid always merging old components in prefix policy
......................................................................


Patch Set 7:

(1 comment)

> After tracing some examples, I think the complexity of
 > getMergableComponentsIndex should be n*log(n), where the log base
 > is roughly 2.  It's better than what I thought of, although it
 > would be nice to be O(n).
 > 
 > I think we can add a simple "performance test" to prevent future
 > perf. regressions of this policy.  Below is my draft.  I tried
 > various values of MAX_MERGABLE_COMPONENT_SIZE_RATIO with the test,
 > it looks 1.2 indeed is good tradeoff :-)
 > 
 > public class PrefixMergePolicyTest extends TestCase {
 > 
 > private final int MAX_COMPONENT_SIZE = 100;
 > 
 > private final int MAX_COMPONENT_COUNT = 5;
 > 
 > public void test() {
 > try {
 > List<Long> sizes = new ArrayList<>();
 > ILSMMergePolicy policy = mockMergePolicy();
 > int maxNumComponents = 0;
 > int sumComponents = 0;
 > Set<Long> writeAmplification = new HashSet<>();
 > int pass = 0;
 > do {
 > pass++;
 > sizes.add(0, 1L);
 > ILSMIndex index = mockIndex(sizes);
 > policy.diskComponentAdded(index, false);
 > int length = sizes.size();
 > maxNumComponents = maxNumComponents >= length ? maxNumComponents :
 > length;
 > sumComponents += length;
 > System.out.println(sizes);
 > writeAmplification.add(sizes.get(sizes.size() - 1));
 > } while (sizes.get(sizes.size() - 1) < 100L);
 > writeAmplification.remove(1L);
 > System.out.println("max num components (read perf.): " +
 > maxNumComponents);
 > System.out.println("avg num components (read perf.): " +
 > sumComponents / pass);
 > System.out.println("write amplications (write perf.): " +
 > (writeAmplification.size() - 1));
 > Assert.assertTrue(maxNumComponents <= 7);
 > Assert.assertTrue(sumComponents / pass <= 3);
 > Assert.assertTrue(writeAmplification.size() <= 5);
 > } catch (HyracksDataException e) {
 > Assert.fail(e.getMessage());
 > }
 > }
 > 
 > private ILSMMergePolicy mockMergePolicy() {
 > Map<String, String> properties = new HashMap<>();
 > properties.put("max-mergable-component-size", 
 > String.valueOf(MAX_COMPONENT_SIZE));
 > properties.put("max-tolerance-component-count", 
 > String.valueOf(MAX_COMPONENT_COUNT));
 > ILSMMergePolicy policy = new PrefixMergePolicy();
 > policy.configure(properties);
 > return policy;
 > }
 > 
 > private ILSMIndex mockIndex(List<Long> componentSizes) throws
 > HyracksDataException {
 > List<ILSMDiskComponent> components = new ArrayList<>();
 > for (Long size : componentSizes) {
 > ILSMDiskComponent component = Mockito.mock(ILSMDiskComponent.class);
 > Mockito.when(component.getComponentSize()).thenReturn(size);
 > Mockito.when(component.getState()).thenReturn(ComponentState.READABLE_UNWRITABLE);
 > components.add(component);
 > }
 > 
 > ILSMIndex index = Mockito.mock(ILSMIndex.class);
 > Mockito.when(index.getImmutableComponents()).thenReturn(components);
 > 
 > ILSMIndexAccessor accessor = Mockito.mock(ILSMIndexAccessor.class);
 > Mockito.doAnswer(new Answer<Void>() {
 > 
 > @Override
 > public Void answer(InvocationOnMock invocation) throws Throwable {
 > List<ILSMDiskComponent> mergedComponents = invocation.getArgumentAt(1,
 > List.class);
 > long sum = 0;
 > for (ILSMDiskComponent c : mergedComponents) {
 > sum += c.getComponentSize();
 > }
 > 
 > ILSMDiskComponent component = Mockito.mock(ILSMDiskComponent.class);
 > Mockito.when(component.getComponentSize()).thenReturn(sum);
 > Mockito.when(component.getState()).thenReturn(ComponentState.READABLE_UNWRITABLE);
 > int swapIndex = components.indexOf(mergedComponents.get(0));
 > components.removeAll(mergedComponents);
 > components.add(swapIndex, component);
 > 
 > componentSizes.clear();
 > for (ILSMDiskComponent c : components) {
 > componentSizes.add(c.getComponentSize());
 > }
 > return null;
 > }
 > }).when(accessor).scheduleMerge(Mockito.any(ILSMIOOperationCallback.class),
 > Mockito.anyListOf(ILSMDiskComponent.class));
 > 
 > Mockito.when(index.createAccessor(Mockito.any(IModificationOperationCallback.class),
 > Mockito.any(ISearchOperationCallback.class))).thenReturn(accessor);
 > return index;
 > }
 > 
 > }

Done, I added this method into PrefixMergePolicyTest.
BTW, if we use a lower ratio (e.g., 0.5 instead of 1.2), we could get a better 
a even lower write amplification at the cost having more disk components. 
However, since applications Cloudberry would have many disk components any way, 
maybe having more disk components could be tolerated? (though this would be a 
future issue of evaluating merge policies :) )

https://asterix-gerrit.ics.uci.edu/#/c/1818/7/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/PrefixMergePolicy.java
File 
hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/PrefixMergePolicy.java:

PS7, Line 285:  if (startComponentSize == 0) {
             :                     startComponentSize = componentSize;
             :                 }
> pull this to the beginning of the loop?
Done


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1818
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I464da3fed38cded0aee7b319a35664eae069a2ba
Gerrit-PatchSet: 7
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Luo Chen <[email protected]>
Gerrit-Reviewer: Ian Maxon <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Luo Chen <[email protected]>
Gerrit-Reviewer: Yingyi Bu <[email protected]>
Gerrit-Reviewer: abdullah alamoudi <[email protected]>
Gerrit-HasComments: Yes

Reply via email to