[jira] [Comment Edited] (OAK-114) MicroKernel API: specify retention policy for old revisions

2012-07-04 Thread Jukka Zitting (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406637#comment-13406637
 ] 

Jukka Zitting edited comment on OAK-114 at 7/4/12 5:53 PM:
---

bq. i can't follow this argument

Here's a snippet of code that illustrates Michael's point:

{code}
String revision = mk.getHeadRevision();
mk.commit(...);  // Could occur in another thread
TimeUnit.MINUTES.sleep(5);   // Could be any delay <10mins, or no delay at all
mk.getNodes("/", revision, ...);
{code}

Say the {{revision}} returned from the first call was committed about an hour 
ago. By the time the {{getNodes}} call is reached, the garbage collector may 
already have removed that revision, since it is older than 10 minutes and is 
no longer the latest revision in the repository.

If that problem isn't fixed, a client can't make any reasonable assumptions 
about how long it can expect a particular revision to stay alive. The only way 
for a client to guarantee that it can see a given revision for at least the 
next 10 minutes would be for it to directly commit that revision, but that's 
definitely not something we want read-only clients to be doing.
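
To make the failure mode concrete, here is a minimal, self-contained sketch of a collector that decides purely by commit time. The class CommitTimeRetention and all of its methods are hypothetical and only model the behaviour described above. They are not the MicroKernel implementation, and the clock is passed in explicitly so the example stays deterministic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical toy model of a revision store; not the MicroKernel implementation. */
class CommitTimeRetention {
    // The 10 minute retention period discussed in this issue.
    static final long RETENTION_MS = 10 * 60 * 1000L;

    // Revision id -> commit time in milliseconds.
    private final Map<String, Long> commitTimes = new LinkedHashMap<>();
    private String head;

    void commit(String revision, long now) {
        commitTimes.put(revision, now);
        head = revision;
    }

    String getHeadRevision() {
        return head;
    }

    boolean exists(String revision) {
        return commitTimes.containsKey(revision);
    }

    /** Removes every non-head revision that was committed more than RETENTION_MS ago. */
    void gc(long now) {
        commitTimes.entrySet().removeIf(e ->
                !e.getKey().equals(head) && now - e.getValue() > RETENTION_MS);
    }
}
```

In this model, a revision that was committed an hour ago and has only just been handed out as head becomes collectable as soon as another commit replaces it, which is the situation in the snippet above.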

> MicroKernel API: specify retention policy for old revisions
> ---
>
> Key: OAK-114
> URL: https://issues.apache.org/jira/browse/OAK-114
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mk
>Reporter: Stefan Guggisberg
>Assignee: Stefan Guggisberg
> Attachments: OAK-114.patch
>
>
> the MicroKernel API javadoc should specify the minimal guaranteed retention 
> period for old revisions. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Comment Edited] (OAK-114) MicroKernel API: specify retention policy for old revisions

2012-07-05 Thread Dominique Pfister (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406910#comment-13406910
 ] 

Dominique Pfister edited comment on OAK-114 at 7/5/12 8:30 AM:
---

The javadoc is possibly not clear enough: a revision returned by 
getHeadRevision remains accessible for _at least_ 10 minutes, or _even longer_ 
if it is still the head revision, regardless of the time it was committed. So 
in Jukka's snippet above, the getNodes call wouldn't fail, because only 5 
minutes have passed.
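
A minimal sketch of that rule, assuming the retention clock starts when a revision is returned by getHeadRevision rather than when it was committed. The class HeadReturnRetention and its method names are hypothetical and chosen only for illustration; time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the "last returned as head" rule; names are illustrative only. */
class HeadReturnRetention {
    static final long RETENTION_MS = 10 * 60 * 1000L;

    // Revision id -> last time it was handed out by getHeadRevision.
    private final Map<String, Long> lastReturnedAsHead = new HashMap<>();
    private String head;

    void commit(String revision) {
        head = revision;
    }

    /** Records the time at which the current head was handed to a client. */
    String getHeadRevision(long now) {
        lastReturnedAsHead.put(head, now);
        return head;
    }

    /** Retained while it is head, or for RETENTION_MS after it was last returned as head. */
    boolean isRetained(String revision, long now) {
        if (revision.equals(head)) {
            return true;
        }
        Long returned = lastReturnedAsHead.get(revision);
        return returned != null && now - returned <= RETENTION_MS;
    }
}
```

Under this assumption the commit time plays no role: a revision fetched as head stays readable for 5 minutes after the fetch even if a newer commit has replaced it in the meantime.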

Anyway, I think we really need some performance figures first, before we can 
decide whether this policy is too aggressive. OTOH, the current GC logic is 
quite small and straightforward, so it shouldn't be difficult to change it 
later if the need arises.





[jira] [Comment Edited] (OAK-114) MicroKernel API: specify retention policy for old revisions

2012-07-05 Thread Dominique Pfister (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407201#comment-13407201
 ] 

Dominique Pfister edited comment on OAK-114 at 7/5/12 3:24 PM:
---

bq. If we are keeping track of when a particular revision was last returned by 
getHeadRevision, wouldn't it be simple to use the same mechanism to also keep 
track of when revisions are returned from or passed to other MicroKernel 
methods? I don't see how that would imply any more "complex state management" 
than what's already needed.

If we just remember the earliest revision returned by getHeadRevision, we need 
just one field and the next GC cycle can skip all revisions committed later. If 
we remember all revisions accessed, we'll end up with some possibly sparse list 
of revisions, and the GC cycle would need to re-link these revisions - modify 
parent commit, re-calculate diff - to get a consistent view.
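
The single-field variant could look roughly like the sketch below. It assumes that revisions can be modelled as increasing numbers, and the class EarliestHeadMarker with its methods is hypothetical. It only illustrates why one field is enough when everything committed after the marker is kept as a contiguous range.

```java
/** Hypothetical sketch of single-field tracking; revisions are modelled as increasing numbers. */
class EarliestHeadMarker {
    private static final long NONE = Long.MAX_VALUE;

    // Earliest revision handed out by getHeadRevision in the current interval.
    private long earliestReturned = NONE;

    /** Called whenever a revision is handed out by getHeadRevision. */
    synchronized void headReturned(long revisionId) {
        earliestReturned = Math.min(earliestReturned, revisionId);
    }

    /**
     * The collector may drop only revisions older than both the marker and the
     * current head; every revision committed later is skipped.
     */
    synchronized boolean mayCollect(long revisionId, long headId) {
        return revisionId < Math.min(earliestReturned, headId);
    }

    /** Starts a new tracking interval, for example after a GC cycle has finished. */
    synchronized void reset() {
        earliestReturned = NONE;
    }
}
```

Because the retained revisions always form one unbroken range ending at the head, no parent commit has to be rewritten. A set of individually accessed revisions would not have that property.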

bq. The benefit of switching from "last returned as head revision" to "last 
accessed/seen" for figuring out when a revision is still needed is that we can 
allow unused revisions to expire much faster. With the "last accessed/seen" 
pattern there'll be no problem with an expiry time of just a few seconds, which 
would in most cases allow the garbage collector to be much more aggressive than 
with the 10 minute time proposed here.

I can see the advantage, but this would leave the door open for some bogus 
polling client that keeps some very old revision alive, which I'd like to avoid.
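
The concern can be illustrated with a small sketch of a "last accessed/seen" policy. The class LastAccessRetention, its methods, and the 5 second expiry are assumptions made for this example and are not taken from any existing code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a "last accessed/seen" expiry rule. */
class LastAccessRetention {
    // "A few seconds"; the exact value is an assumption for this example.
    static final long EXPIRY_MS = 5_000L;

    // Revision id -> last time the revision was returned from or passed to any call.
    private final Map<String, Long> lastAccess = new HashMap<>();

    /** Any access to a revision refreshes its timestamp. */
    void touch(String revision, long now) {
        lastAccess.put(revision, now);
    }

    boolean isRetained(String revision, long now) {
        Long seen = lastAccess.get(revision);
        return seen != null && now - seen <= EXPIRY_MS;
    }
}
```

A client that calls touch on the same revision every 4 seconds never lets it expire, no matter how old the revision is, while a revision nobody reads is gone after 5 seconds.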

