[ 
https://issues.apache.org/jira/browse/HDFS-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379394#comment-16379394
 ] 

Tsz Wo Nicholas Sze edited comment on HDFS-13102 at 2/27/18 11:05 PM:
----------------------------------------------------------------------

Thanks for the update.  Some comments on the 007 patch:
- In SkipListNode, similar to getSkipNode, setSkipDiff and setSkipTo should 
take care about resize so the callers don't have to.  We should also remove the 
addSkipDiff method.
{code}
    public void setSkipDiff(ChildrenDiff cDiff, int level) {
      if (level < skipDiffList.size()) {
        skipDiffList.get(level).setDiff(cDiff);
      } else {
        skipDiffList.add(new SkipDiff(null, cDiff));
      }
    }

    public void setSkipTo(SkipListNode node, int level) {
      for (int i = skipDiffList.size(); i <= level; i++) {
        skipDiffList.add(null);
      }
      skipDiffList.get(level).setSkipTo(node);
    }
{code}

- In addFirst, there are two "if (combined != null)".

- We should not directly use ID_INTEGER_COMPARATOR in DirectoryDiffList.
-* In SkipListNode, change compareTo to
{code}
    public final int compareTo(final Integer that) {
      return diff.compareTo(that);
    }
{code}
-* In getMinListForRange, use
{code}
next.getDiff().compareTo(toSnapshotId) > 0)
{code}


- In getMinListForRange, we can clean up the code as follow.
{code}
  @Override
  public List<DirectoryDiff> getMinListForRange(
      int fromIndex, int toIndex, INodeDirectory dir) {
    final List<DirectoryDiff> subList = new ArrayList<>();
    final DirectoryDiff toDiff = get(toIndex);
    final int toSnapshotId = toDiff.getSnapshotId();

    for (SkipListNode current = getNode(fromIndex); current != null; ) {
      SkipListNode next = null;
      ChildrenDiff childrenDiff = null;
      for(int level = current.level(); level >= 0; level--) {
        next = current.getSkipNode(level);
        if (next != null && next.getDiff().compareTo(toSnapshotId) <= 0) {
          childrenDiff = current.getChildrenDiff(level);
          break;
        }
      }

      final DirectoryDiff curDiff = current.getDiff();
      subList.add(childrenDiff == null? curDiff
          : new DirectoryDiff(curDiff.getSnapshotId(), dir, childrenDiff);

      current = next;
    }
    return subList;
  }
{code}



was (Author: szetszwo):
Thanks for the update.  Some comments on the 007 patch:
- In SkipListNode, similar to getSkipNode, setSkipDiff and setSkipTo should 
take care about resize so the callers don't have to.  We should also remove the 
addSkipDiff method.
{code}
    public void setSkipDiff(ChildrenDiff cDiff, int level) {
      if (level < skipDiffList.size()) {
        skipDiffList.get(level).setDiff(cDiff);
      } else {
        skipDiffList.add(new SkipDiff(null, cDiff));
      }
    }

    public void setSkipTo(SkipListNode node, int level) {
      for (int i = skipDiffList.size(); i <= level; i++) {
        skipDiffList.add(null);
      }
      skipDiffList.get(level).setSkipTo(node);
    }
{code}

- In addFirst, there are two "if (combined != null)".

- We should not directly use ID_INTEGER_COMPARATOR in DirectoryDiffList.
-* In SkipListNode, change compareTo to
{code}
    public final int compareTo(final Integer that) {
      return diff.compareTo(that);
    }
{code}
-* In getMinListForRange, use
{code}
next.getDiff().compareTo(toSnapshotId) > 0)
{code}


- In getMinListForRange, is it never level < 0 unless the list is inconsistent? 
 So that we can clean up the code as follow.
{code}
  @Override
  public List<DirectoryDiff> getMinListForRange(
      int fromIndex, int toIndex, INodeDirectory dir) {
    final List<DirectoryDiff> subList = new ArrayList<>();
    final DirectoryDiff toDiff = get(toIndex);
    final int toSnapshotId = toDiff.getSnapshotId();

    for(SkipListNode current = getNode(fromIndex); current != null; ) {
      int level = current.level();
      SkipListNode next = current.getSkipNode(level);
      while (next == null || next.getDiff().compareTo(toSnapshotId) > 0) {
        level--;
        next = current.getSkipNode(level);
      }

      DirectoryDiff diff = new DirectoryDiff(current.getDiff().getSnapshotId(),
          dir, current.getChildrenDiff(level));
      subList.add(diff);
      current = next;
    }
    return subList;
  }
}
{code}


> Implement SnapshotSkipList class to store Multi level DirectoryDiffs
> --------------------------------------------------------------------
>
>                 Key: HDFS-13102
>                 URL: https://issues.apache.org/jira/browse/HDFS-13102
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: snapshots
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>         Attachments: HDFS-13102.001.patch, HDFS-13102.002.patch, 
> HDFS-13102.003.patch, HDFS-13102.004.patch, HDFS-13102.005.patch, 
> HDFS-13102.006.patch, HDFS-13102.007.patch
>
>
> HDFS-11225 explains an issue where deletion of older snapshots can take a 
> very long time in case the no of snapshot diffs is quite large for 
> directories. For any directory under a snapshot, to construct the children 
> list , it needs to combine all the diffs from that particular snapshot to the 
> last snapshotDiff record and reverseApply to the current children list of the 
> directory on live fs. This can take  a significant time if the no of snapshot 
> diffs are quite large and changes per diff is significant.
> This Jira proposes to store the Directory diffs in a SnapshotSkip list, where 
> we store multi level DirectoryDiffs. At each level, the Directory Diff will 
> be cumulative diff of k snapshot diffs,
> where k is the level of a node in the list. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to