Re: [DISCUSS] Move completely to Github

2017-10-09 Thread Henry Saputra
We could have a VOTE.

But I think that if there is no concern or major objection from the DISCUSS
thread, we could move to Github via Gitbox without a VOTE.

Once done, we just need to send an official ANNOUNCE to make sure dev@ at
bookkeeper and dist-log know about the changes and what to expect.

I am +1 for it.

- Henry

On Fri, Oct 6, 2017 at 1:27 AM, Ivan Kelly  wrote:

> Shouldn't we have a vote before we start actually moving stuff?
>
> I'm going to start going through jira now, closing dead stuff, and
> adding labels for stuff that needs to be moved.
>
> -Ivan
>
> On Thu, Oct 5, 2017 at 10:16 AM, Ivan Kelly  wrote:
> > I'll do the first pass tomorrow.
> >
> > -Ivan
> >
> > On Thu, Oct 5, 2017 at 9:16 AM, Sijie Guo  wrote:
> >> That is awesome! +1 Thank you, Ivan.
> >>
> >> On Wed, Oct 4, 2017 at 11:22 PM, Ivan Kelly  wrote:
> >>
> >>> btw, I'm volunteering to do this.
> >>>
> >>> -Ivan
> >>>
> >>> On Thu, Oct 5, 2017 at 8:21 AM, Ivan Kelly  wrote:
> >>> > +1
> >>> >
> >>> > The github flow is much nicer than the jira flow. I suggest we do a
> >>> > triage on the jira issues, closing stuff that will likely never be
> >>> > done.
> >>> >
> >>> > It would also be worthwhile for someone to triage the issues in
> >>> > github periodically, categorizing, bumping priority, closing stale
> >>> > stuff, selecting issues for releases, etc. Otherwise it's just going
> >>> > to grow into something unmanageable again.
> >>> >
> >>> > -Ivan
> >>> >
> >>> > On Thu, Oct 5, 2017 at 5:21 AM, Enrico Olivelli wrote:
> >>> >> Yeah,
> >>> >> I was going to send this email this week...
> >>> >> I am totally +1
> >>> >> Maybe we can look for some tool to move open JIRAs to GitHub
> >>> >> automatically
> >>> >>
> >>> >> Enrico
> >>> >>
> >>> >> 2017-10-05 5:20 GMT+02:00 Sijie Guo :
> >>> >>
> >>> >>> Hi all,
> >>> >>>
> >>> >>> It has been more than 3 months since we moved to Github for issue
> >>> >>> tracking (for BP-9
> >>> >>>  >>> >>> BP-9+-+Github+issues+for+Issue+Tracking>).
> >>> >>> And we have successfully released 4.5.0 while using Github. I think
> >>> >>> it is time to discuss making JIRA read-only, with all activity
> >>> >>> happening in Github.
> >>> >>>
> >>> >>> Any thoughts?
> >>> >>>
> >>> >>> - Sijie
> >>> >>>
> >>>
>


git/github commit hooks

2017-10-09 Thread Sam Just
Last Thursday, we had a short discussion about possibly changing the
merge process to allow unsquashed commits and the use of the github
merge button.  One sticking point is that we'd like an automatic way
to enforce some commit message metadata requirements and formatting.

Git lets you define some hooks for validating commits and commit
messages locally, see
https://git-scm.com/book/gr/v2/Customizing-Git-Git-Hooks.
Specifically, you can define a commit-msg hook which gets to validate
the file containing the commit message before allowing the commit.  I
think https://developer.github.com/webhooks/ can be leveraged to do
the same checks on a github PR prior to allowing the PR to be merged,
but haven't had time yet to figure out precisely how.
-Sam
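
As a rough sketch of the commit-msg hook described above: git runs the
hook with the path of the commit message file as its only argument and
aborts the commit if the hook exits non-zero. The subject format
("ISSUE #<number>: <summary>"), the 72-character limit and the helper
name check_message are assumptions made for illustration; the thread
does not say what the actual metadata requirements are.

```python
#!/usr/bin/env python3
import re
import sys

# Hypothetical rule: the subject must look like "ISSUE #123: short summary".
SUBJECT_RE = re.compile(r"^ISSUE #\d+: \S.*$")
MAX_SUBJECT_LEN = 72  # assumed limit, not taken from the thread


def check_message(message):
    """Return a list of problems found in a commit message (empty if OK)."""
    # Ignore the comment lines git adds to the message template.
    lines = [line for line in message.splitlines() if not line.startswith("#")]
    if not lines or not lines[0].strip():
        return ["empty commit message"]
    problems = []
    subject = lines[0]
    if not SUBJECT_RE.match(subject):
        problems.append("subject must match 'ISSUE #<number>: <summary>'")
    if len(subject) > MAX_SUBJECT_LEN:
        problems.append("subject longer than %d characters" % MAX_SUBJECT_LEN)
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line must be blank")
    return problems


def main(argv):
    # argv[1] is the path of the file holding the proposed commit message.
    with open(argv[1], encoding="utf-8") as f:
        problems = check_message(f.read())
    for problem in problems:
        print("commit-msg: " + problem, file=sys.stderr)
    return 1 if problems else 0


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Saved as .git/hooks/commit-msg and made executable, this would reject
non-conforming messages locally. The same check_message function could
in principle be reused for a check on PRs, though how to wire that up
through webhooks is still open, as noted above.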


Re: Cookies and empty disks

2017-10-09 Thread Ivan Kelly
On Mon, Oct 9, 2017 at 6:32 PM, Venkateswara Rao Jujjuri
 wrote:
> Can we have a doc to put all these things? Thread has grown enough to cause
> confusion.
I created a github project earlier today. We can manage the different
streams of work from there. Each stream should have a doc though.

https://github.com/apache/bookkeeper/projects/1

> Immediate things.
> 1. Don't assume new bookie if journal dir is empty.
I've already created an issue for this.

> 2. Put cookies through bookie format, and bookie never boots on an empty
> cookie or mismatched cookie.
The reason it was like this in the first place was for backward
compatibility I think. If a bookie was upgraded from the software that
didn't have cookies, the cookies would be created automatically.
Perhaps this would have been better if it was a "format" like command
that would create the cookies on old bookies, but we didn't think of
it at the time. If we change the cookies, we'd probably need to add an
upgrade command also.

Let's discuss more in a doc.

> 3. We can live with an operational procedure to deal with the incarnation
> issue. In fact we run an automated bookie decomm script which runs through
> the entire metadata and makes sure that the bookie is not part of any ledger.
>
> For next step:
> 1. Establish incarnation support.
> 2. Deal with bitrot.
>
> Makes sense?
lgtm.

-Ivan


Re: Cookies and empty disks

2017-10-09 Thread Enrico Olivelli
I like this too.
I have no time to work on this immediately, sorry.
Maybe the only blocker issue is the boot with empty dirs, which Sijie
pointed out.

Enrico

On Mon, Oct 9, 2017, 19:08 Sijie Guo wrote:

> +1. I liked this summary.
>
> JV, is this related to what you were writing? or anyone else want to drive
> this?
>
> - Sijie
>
> On Mon, Oct 9, 2017 at 9:32 AM, Venkateswara Rao Jujjuri <
> jujj...@gmail.com>
> wrote:
>
> > Can we have a doc to put all these things? Thread has grown enough to
> cause
> > confusion.
> >
> > Immediate things.
> > 1. Don't assume new bookie if journal dir is empty.
> > 2. Put cookies through bookie format, and bookie never boots on an empty
> > cookie or mismatched cookie.
> > 3. We can live with operations procedure to deal with incarnation issue.
> > Infact we run an automated bookie decomm script which runs through the
> > entire metadata and makes sure that the bookie is not part of any ledger.
> >
> > For next step:
> > 1. Establish incarnation support.
> > 2. Deal with bitrot.
> >
> > Makes sense?
> >
> > JV
> >
> > On Mon, Oct 9, 2017 at 8:55 AM, Sijie Guo  wrote:
> >
> > > On Oct 9, 2017 1:54 AM, "Ivan Kelly"  wrote:
> > >
> > > Hi folks,
> > >
> > > I was travelling over the weekend, so I didn't have a chance to reply
> > > to anything on this thread. First off, as Enrico said, there's a lot
> > > of different topics being discussed at once. Perhaps each should be
> > > broken into a github issue, and then we can continue each conversation
> > > > there, as it's getting a bit unwieldy for email.
> > >
> > > I've created a cookie monster project, which we can throw all the
> issues
> > > into.
> > > https://github.com/apache/bookkeeper/projects/1
> > >
> > > There's a few individual opinions I'd like to give here though.
> > >
> > > > Needing the check the instance of the bookie when auditing
> > >
> > > The auditor, while it does check when bookies have disappeared, it
> > > also periodically checks all ledgers by reading the first and last
> > > entry of each segment. So even if a bookie has resurrected, the
> > > auditor will find that it is missing entries it is supposed to have.
> > >
> > > > UUID in ledger metadata
> > >
> > > At least for the write path, I'm not sure if this is needed, but
> > > consider the following.
> > >
> > > Only one writer can "vote" on the entries of the ledger. Other writers
> > > are fencing writers. A fencing writer has to hit a majority of bookies
> > > to proceed to closing the ledger. Unless a majority have been wiped,
> > > it will not proceed to close as an empty ledger. However, if a
> > > majority have been wiped, the correct behaviour would be for it not be
> > > possible to close the ledger, as we cannot know what the end of the
> > > ledger is.
> > >
> > > That said, not boot if any ledger refers to a bookie could solve this.
> > >
> > > > No ledgers referencing bookie? (Sijie's suggestion)
> > >
> > > > I'm resistant to this idea, because it assumes a central oracle where all
> > > ledgers can be queried. I know we currently have this, but I don't
> > > think it scales for each bookie to read the metadata of the whole
> > > system.
> > >
> > > In any case, why not instead of refusing to start if any ledgers
> > > reference the bookie, on boot the bookie checks which ledgers it is
> > > supposed to have, and if it doesn't have them, start pulling the data
> > > for them itself. While doing this replication it should avoid all new
> > > writes.
> > >
> > >
> > > Yes, that's another thing we need to improve for auto recovery. It is
> not
> > > only on boot, you need to do it periodically, in the garbage collection
> > > thread. The bookie need to scan what ledgers are missing and what
> entries
> > > are missing and replicate them.
> > >
> > >
> > >
> > > > Storing the list of files in the cookie? (Enrico's suggestion)
> > >
> > > I don't think this is needed. The purpose of the cookie is to protect
> > > against stuff like a mount not coming up, or a machine being
> > > completely wiped. We assume that on a journalled filesystem, files
> > > don't just disappear arbitrarily. There may be corruption in
> > > individual files, but see my first point.
> > >
> > > Anyhow, as I said earlier, we should decide the broad topics here and
> > > move into issues. I've made a first pass.
> > >
> > > Regards,
> > > Ivan
> > >
> >
> >
> >
> > --
> > Jvrao
> > ---
> > First they ignore you, then they laugh at you, then they fight you, then
> > you win. - Mahatma Gandhi
> >
>
-- 


-- Enrico Olivelli


Re: [VOTE] Move completely to github

2017-10-09 Thread Matteo Merli
+1

On Mon, Oct 9, 2017 at 10:08 AM Sijie Guo  wrote:

> +1
>
> On Mon, Oct 9, 2017 at 2:16 AM, Ivan Kelly  wrote:
>
> > Hi folks,
> >
> > We discussed this in
> >
> http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201710.mbox/ajax/%
> > 3CCAO2yDyarL1%2Bf7AWB89NfUH3Ji35RZVaR-CmLYV16i-1RW%2BA8Lg%40mail.
> > gmail.com%3E
> >
> > But we never formally called a vote, so here's the vote.
> >
> > The bylaws don't actually cover votes like this, so lets use lazy
> > majority, active committers.
> >
> > -Ivan
> >
>
-- 
Matteo Merli



Re: [VOTE] Move completely to github

2017-10-09 Thread Sijie Guo
+1

On Mon, Oct 9, 2017 at 2:16 AM, Ivan Kelly  wrote:

> Hi folks,
>
> We discussed this in
> http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201710.mbox/ajax/%
> 3CCAO2yDyarL1%2Bf7AWB89NfUH3Ji35RZVaR-CmLYV16i-1RW%2BA8Lg%40mail.
> gmail.com%3E
>
> But we never formally called a vote, so here's the vote.
>
> The bylaws don't actually cover votes like this, so lets use lazy
> majority, active committers.
>
> -Ivan
>


Re: Cookies and empty disks

2017-10-09 Thread Sijie Guo
+1. I liked this summary.

JV, is this related to what you were writing? Or does anyone else want to
drive this?

- Sijie

On Mon, Oct 9, 2017 at 9:32 AM, Venkateswara Rao Jujjuri 
wrote:

> Can we have a doc to put all these things? Thread has grown enough to cause
> confusion.
>
> Immediate things.
> 1. Don't assume new bookie if journal dir is empty.
> 2. Put cookies through bookie format, and bookie never boots on an empty
> cookie or mismatched cookie.
> 3. We can live with operations procedure to deal with incarnation issue.
> Infact we run an automated bookie decomm script which runs through the
> entire metadata and makes sure that the bookie is not part of any ledger.
>
> For next step:
> 1. Establish incarnation support.
> 2. Deal with bitrot.
>
> Makes sense?
>
> JV
>
> On Mon, Oct 9, 2017 at 8:55 AM, Sijie Guo  wrote:
>
> > On Oct 9, 2017 1:54 AM, "Ivan Kelly"  wrote:
> >
> > Hi folks,
> >
> > I was travelling over the weekend, so I didn't have a chance to reply
> > to anything on this thread. First off, as Enrico said, there's a lot
> > of different topics being discussed at once. Perhaps each should be
> > broken into a github issue, and then we can continue each conversation
> > there, as it's getting a bit unwieldy for email.
> >
> > I've created a cookie monster project, which we can throw all the issues
> > into.
> > https://github.com/apache/bookkeeper/projects/1
> >
> > There's a few individual opinions I'd like to give here though.
> >
> > > Needing the check the instance of the bookie when auditing
> >
> > The auditor, while it does check when bookies have disappeared, it
> > also periodically checks all ledgers by reading the first and last
> > entry of each segment. So even if a bookie has resurrected, the
> > auditor will find that it is missing entries it is supposed to have.
> >
> > > UUID in ledger metadata
> >
> > At least for the write path, I'm not sure if this is needed, but
> > consider the following.
> >
> > Only one writer can "vote" on the entries of the ledger. Other writers
> > are fencing writers. A fencing writer has to hit a majority of bookies
> > to proceed to closing the ledger. Unless a majority have been wiped,
> > it will not proceed to close as an empty ledger. However, if a
> > majority have been wiped, the correct behaviour would be for it not be
> > possible to close the ledger, as we cannot know what the end of the
> > ledger is.
> >
> > That said, not boot if any ledger refers to a bookie could solve this.
> >
> > > No ledgers referencing bookie? (Sijie's suggestion)
> >
> > I'm resistant to this idea, because it assumes a central oracle where all
> > ledgers can be queried. I know we currently have this, but I don't
> > think it scales for each bookie to read the metadata of the whole
> > system.
> >
> > In any case, why not instead of refusing to start if any ledgers
> > reference the bookie, on boot the bookie checks which ledgers it is
> > supposed to have, and if it doesn't have them, start pulling the data
> > for them itself. While doing this replication it should avoid all new
> > writes.
> >
> >
> > Yes, that's another thing we need to improve for auto recovery. It is not
> > only on boot, you need to do it periodically, in the garbage collection
> > thread. The bookie need to scan what ledgers are missing and what entries
> > are missing and replicate them.
> >
> >
> >
> > > Storing the list of files in the cookie? (Enrico's suggestion)
> >
> > I don't think this is needed. The purpose of the cookie is to protect
> > against stuff like a mount not coming up, or a machine being
> > completely wiped. We assume that on a journalled filesystem, files
> > don't just disappear arbitrarily. There may be corruption in
> > individual files, but see my first point.
> >
> > Anyhow, as I said earlier, we should decide the broad topics here and
> > move into issues. I've made a first pass.
> >
> > Regards,
> > Ivan
> >
>
>
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi
>


Re: [VOTE] Move completely to github

2017-10-09 Thread Venkateswara Rao Jujjuri
+1

On Mon, Oct 9, 2017 at 2:17 AM, Ivan Kelly  wrote:

> My vote is a +1
>
> On Mon, Oct 9, 2017 at 11:17 AM, Ivan Kelly  wrote:
> > I forgot the timeframe. The deadline to vote is midday UTC, on
> > Thursday 12th October.
> >
> >
> > On Mon, Oct 9, 2017 at 11:16 AM, Ivan Kelly  wrote:
> >> Hi folks,
> >>
> >> We discussed this in
> >> http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/
> 201710.mbox/ajax/%3CCAO2yDyarL1%2Bf7AWB89NfUH3Ji35RZVaR-
> CmLYV16i-1RW%2BA8Lg%40mail.gmail.com%3E
> >>
> >> But we never formally called a vote, so here's the vote.
> >>
> >> The bylaws don't actually cover votes like this, so lets use lazy
> >> majority, active committers.
> >>
> >> -Ivan
>



-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi


Re: Cookies and empty disks

2017-10-09 Thread Venkateswara Rao Jujjuri
Can we have a doc to put all these things? Thread has grown enough to cause
confusion.

Immediate things.
1. Don't assume new bookie if journal dir is empty.
2. Put cookies through bookie format, and bookie never boots on an empty
cookie or mismatched cookie.
3. We can live with an operational procedure to deal with the incarnation
issue. In fact we run an automated bookie decomm script which runs through
the entire metadata and makes sure that the bookie is not part of any ledger.
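
Items 1 and 2 could be sketched roughly as follows, in Python for
brevity rather than the Java of the actual bookie. The file name, the
BootError type and the function names are hypothetical; the idea is
only that cookies are written by an explicit format step, and that
boot fails on a missing, empty or mismatched cookie instead of
treating empty directories as a new bookie.

```python
import os

COOKIE_FILE = "cookie"  # hypothetical file name, for illustration only


class BootError(Exception):
    """Raised when the bookie must refuse to start."""


def read_cookie(directory):
    """Return the cookie stored in a directory, or None if there is none."""
    path = os.path.join(directory, COOKIE_FILE)
    if not os.path.isfile(path):
        return None
    with open(path, encoding="utf-8") as f:
        return f.read().strip()


def format_bookie(dirs, cookie):
    """Explicit 'format' step: the only place where cookies are written."""
    for d in dirs:
        os.makedirs(d, exist_ok=True)
        with open(os.path.join(d, COOKIE_FILE), "w", encoding="utf-8") as f:
            f.write(cookie)


def check_boot(dirs, registered_cookie):
    """Raise BootError unless every directory holds the registered cookie."""
    if registered_cookie is None:
        raise BootError("bookie was never formatted: no cookie registered")
    for d in dirs:
        local = read_cookie(d)
        if not local:
            # An empty directory is an error, not a sign of a new bookie.
            raise BootError("missing or empty cookie in %s" % d)
        if local != registered_cookie:
            raise BootError("cookie mismatch in %s" % d)
```

Here registered_cookie stands for whatever copy of the cookie is kept
outside the local disks (an assumption of this sketch), so that a
wiped disk cannot pass for a fresh install.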

For next step:
1. Establish incarnation support.
2. Deal with bitrot.

Makes sense?

JV

On Mon, Oct 9, 2017 at 8:55 AM, Sijie Guo  wrote:

> On Oct 9, 2017 1:54 AM, "Ivan Kelly"  wrote:
>
> Hi folks,
>
> I was travelling over the weekend, so I didn't have a chance to reply
> to anything on this thread. First off, as Enrico said, there's a lot
> of different topics being discussed at once. Perhaps each should be
> broken into a github issue, and then we can continue each conversation
> there, as it's getting a bit unwieldy for email.
>
> I've created a cookie monster project, which we can throw all the issues
> into.
> https://github.com/apache/bookkeeper/projects/1
>
> There's a few individual opinions I'd like to give here though.
>
> > Needing the check the instance of the bookie when auditing
>
> The auditor, while it does check when bookies have disappeared, it
> also periodically checks all ledgers by reading the first and last
> entry of each segment. So even if a bookie has resurrected, the
> auditor will find that it is missing entries it is supposed to have.
>
> > UUID in ledger metadata
>
> At least for the write path, I'm not sure if this is needed, but
> consider the following.
>
> Only one writer can "vote" on the entries of the ledger. Other writers
> are fencing writers. A fencing writer has to hit a majority of bookies
> to proceed to closing the ledger. Unless a majority have been wiped,
> it will not proceed to close as an empty ledger. However, if a
> majority have been wiped, the correct behaviour would be for it not be
> possible to close the ledger, as we cannot know what the end of the
> ledger is.
>
> That said, not boot if any ledger refers to a bookie could solve this.
>
> > No ledgers referencing bookie? (Sijie's suggestion)
>
> I'm resistant to this idea, because it assumes a central oracle where all
> ledgers can be queried. I know we currently have this, but I don't
> think it scales for each bookie to read the metadata of the whole
> system.
>
> In any case, why not instead of refusing to start if any ledgers
> reference the bookie, on boot the bookie checks which ledgers it is
> supposed to have, and if it doesn't have them, start pulling the data
> for them itself. While doing this replication it should avoid all new
> writes.
>
>
> Yes, that's another thing we need to improve for auto recovery. It is not
> only on boot, you need to do it periodically, in the garbage collection
> thread. The bookie need to scan what ledgers are missing and what entries
> are missing and replicate them.
>
>
>
> > Storing the list of files in the cookie? (Enrico's suggestion)
>
> I don't think this is needed. The purpose of the cookie is to protect
> against stuff like a mount not coming up, or a machine being
> completely wiped. We assume that on a journalled filesystem, files
> don't just disappear arbitrarily. There may be corruption in
> individual files, but see my first point.
>
> Anyhow, as I said earlier, we should decide the broad topics here and
> move into issues. I've made a first pass.
>
> Regards,
> Ivan
>



-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi


Re: Cookies and empty disks

2017-10-09 Thread Sijie Guo
On Oct 9, 2017 1:54 AM, "Ivan Kelly"  wrote:

Hi folks,

I was travelling over the weekend, so I didn't have a chance to reply
to anything on this thread. First off, as Enrico said, there's a lot
of different topics being discussed at once. Perhaps each should be
broken into a github issue, and then we can continue each conversation
there, as it's getting a bit unwieldy for email.

I've created a cookie monster project, which we can throw all the issues
into.
https://github.com/apache/bookkeeper/projects/1

There's a few individual opinions I'd like to give here though.

> Needing to check the instance of the bookie when auditing

While the auditor does check when bookies have disappeared, it
also periodically checks all ledgers by reading the first and last
entry of each segment. So even if a bookie has been resurrected, the
auditor will find that it is missing entries it is supposed to have.

> UUID in ledger metadata

At least for the write path, I'm not sure if this is needed, but
consider the following.

Only one writer can "vote" on the entries of the ledger. Other writers
are fencing writers. A fencing writer has to hit a majority of bookies
to proceed to closing the ledger. Unless a majority have been wiped,
it will not proceed to close as an empty ledger. However, if a
majority have been wiped, the correct behaviour would be for it not to be
possible to close the ledger, as we cannot know what the end of the
ledger is.
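
The argument above can be illustrated with a deliberately simplified
model; it is not the actual BookKeeper recovery protocol. The function
name, the exception type and the use of -1 for "this bookie holds no
entries" are assumptions of the sketch.

```python
class NotEnoughResponses(Exception):
    """Raised when fewer than a majority of bookies have answered."""


def recovered_last_entry(ensemble_size, replies):
    """Return the last entry id a fencing writer would close the ledger at.

    replies maps each bookie that answered the fence request to the last
    entry id it holds for the ledger, or -1 if it holds none.
    """
    majority = ensemble_size // 2 + 1
    if len(replies) < majority:
        raise NotEnoughResponses(
            "got %d replies, need %d" % (len(replies), majority))
    # Any majority contains at least one bookie that was not wiped, as long
    # as fewer than a majority were wiped, so the maximum is trustworthy.
    return max(replies.values())
```

With three bookies of which one was wiped, replies {"b1": -1, "b2": 41}
still close the ledger at entry 41. With two wiped, replies
{"b1": -1, "b2": -1} would close it as empty, which is the case where
refusing to close would be the correct behaviour.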

That said, not booting if any ledger refers to the bookie could solve this.

> No ledgers referencing bookie? (Sijie's suggestion)

I'm resistant to this idea, because it assumes a central oracle where all
ledgers can be queried. I know we currently have this, but I don't
think it scales for each bookie to read the metadata of the whole
system.

In any case, why not instead of refusing to start if any ledgers
reference the bookie, on boot the bookie checks which ledgers it is
supposed to have, and if it doesn't have them, start pulling the data
for them itself. While doing this replication it should avoid all new
writes.


Yes, that's another thing we need to improve for auto recovery. It is not
only on boot; you need to do it periodically, in the garbage collection
thread. The bookie needs to scan for missing ledgers and missing
entries and replicate them.
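
A minimal sketch of the scan described here, assuming the bookie can
obtain the ensemble of every ledger (that is, the central-oracle
assumption discussed above). The function name and the shape of the
metadata are hypothetical.

```python
def ledgers_to_recover(bookie_id, ledger_ensembles, local_ledgers):
    """Return the ids of ledgers this bookie should hold but does not.

    ledger_ensembles maps a ledger id to the bookies listed in its metadata
    (across all segments); local_ledgers is the set of ledger ids actually
    found on the local disks.
    """
    expected = {
        ledger_id
        for ledger_id, bookies in ledger_ensembles.items()
        if bookie_id in bookies
    }
    return sorted(expected - set(local_ledgers))
```

The same function could be called on boot and then periodically; a
non-empty result would mean the bookie starts pulling data for those
ledgers. Checking for missing entries inside a ledger would need a
finer-grained comparison than this.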



> Storing the list of files in the cookie? (Enrico's suggestion)

I don't think this is needed. The purpose of the cookie is to protect
against stuff like a mount not coming up, or a machine being
completely wiped. We assume that on a journalled filesystem, files
don't just disappear arbitrarily. There may be corruption in
individual files, but see my first point.

Anyhow, as I said earlier, we should decide the broad topics here and
move into issues. I've made a first pass.

Regards,
Ivan


Re: Cookies and empty disks

2017-10-09 Thread Sijie Guo
On Oct 9, 2017 1:13 AM, "Enrico Olivelli"  wrote:

2017-10-09 9:21 GMT+02:00 Sijie Guo :

> okay, but why do you want to track the list of files? I don't get your
> idea here.
>


If you allow a bookie to start with a journal directory which contains the
cookie file but not the other files that the bookie believes have been
persisted durably, you will fall into the correctness issue we are talking
about; you will lose fence bits, for instance.
So having a directory which contains the cookie file is not enough to say
that the bookie is in a good state.
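
The check being proposed here could look roughly like this, assuming
the cookie (or a file next to it) carried the list of file names
expected in the directory. Both function names are hypothetical, and
nothing like this exists in the Cookie class referenced in this thread.

```python
import os


def missing_files(directory, expected_files):
    """Return the expected file names that are absent from the directory."""
    present = set(os.listdir(directory))
    return sorted(set(expected_files) - present)


def directory_is_intact(directory, expected_files):
    """True only if every file named in the manifest is still present."""
    return not missing_files(directory, expected_files)
```

Note that this detects files that have disappeared, not corruption
inside a file that is still present.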


Sure. The case you described can happen. Bits can also be corrupted. The
case of missing files is the same as bit corruption. This has to be
covered by auto recovery or a disk scrubber rather than by reusing the cookie.


-- Enrico





>
> - Sijie
>
> On Sun, Oct 8, 2017 at 11:45 PM, Enrico Olivelli 
> wrote:
>
> > 2017-10-09 7:52 GMT+02:00 Sijie Guo :
> >
> > > On Sat, Oct 7, 2017 at 9:53 AM, Enrico Olivelli 
> > > wrote:
> > >
> > > > On Sat, Oct 7, 2017, 00:27 Sijie Guo wrote:
> > > >
> > > > > Enrico,
> > > > >
> > > > > Let's try to come to a conclusion or an agreement on what we should
> > > > > fix and improve, before talking about who is going to drive this.
> > > > >
> > > >
> > > > Sure.
> > > >
> > > > This is my point of view:
> > > > We have separate issues:
> > > > - missing checksums, to protect fence bits
> > > > - have a bug in bookie boot, we should not allow empty directories
> > > > - have a clear lifecycle for the bookie, add/remove
> > > > - deal with reincarnation of bookies
> > > > - ensuring the correctness of the contents of the directories of the
> > > bookie
> > > >
> > > > I would like to add a new point: we have the cookie inside every
> > > > configured directory managed by the bookie.
> > > > No cookie -> no boot
> > > > This will not be enough, we have to write in that file not only the
> > > > identity of the bookie but the list of files expected to be in the
> > > > directory.
> > > > This way you will not boot with a corrupted directory.
> > > > Config ->  list of dirs -> list of files
> > > >
> > >
> > > I am not sure why this is a new point. This is exactly what cookie is
> > > doing, no?
> > >
> >
> > Sorry, I can't find such behavior in the code on the master branch
> > https://github.com/apache/bookkeeper/blob/master/
> > bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Cookie.java
> >
> > If we have a copy of the cookie inside each directory (index + data +
> > journal) I mean that each file should carry the exact list of files
> > expected to be present in the directory at boot.
> > So for instance when you add a new file to the set of files on a journal
> > directory you must update the file in that directory, same for index,
> > data.
> >
> > Maybe I am missing something.
> > It seems to me that the cookie contains only a list of directories, not of
> > "files"
> >
> > Enrico
> >
> >
> >
> >
> > >
> > >
> > > >
> > > > I agree that the bookie should be added (bookie format) only if
> > > > there is no reference to it in zk.
> > > > The bookie format operation should write the cookie in every configured
> > > > directory so that a bookie with empty directories won't ever start.
> > > >
> > > > I have to think more about this, but I wanted to share my first
> > thoughts
> > > >
> > > > Enrico
> > > >
> > > >
> > > > > - Sijie
> > > > >
> > > > > > On Fri, Oct 6, 2017 at 1:14 PM, Enrico Olivelli <eolive...@gmail.com>
> > > > > > wrote:
> > > > >
> > > > > > +1 for fixing the problem of missing cookie in 4.6
> > > > > >
> > > > > > Who drives the issue?
> > > > > >
> > > > > > Thank you all for the interesting points
> > > > > > Enrico
> > > > > >
> > > > > > > On Fri, Oct 6, 2017, 21:27 Venkateswara Rao Jujjuri <jujj...@gmail.com>
> > > > > > > wrote:
> > > > > >
> > > > > > > Thanks for the writeup Sijie, comments below.
> > > > > > >
> > > > > > > > On Fri, Oct 6, 2017 at 12:14 PM, Sijie Guo wrote:
> > > > > > >
> > > > > > > > > I think the question is mainly around "how do we recognize the
> > > > > > > > > bookie" or "incarnations". And a cookie is designed for
> > > > > > > > > addressing "incarnations".
> > > > > > > > >
> > > > > > > > > I will try to cover the following aspects, and will try to answer
> > > > > > > > > the questions that Ivan and JV raised.
> > > > > > > >
> > > > > > > > - what is cookie?
> > > > > > > > - how the behavior became bad?
> > > > > > > > - how do we fix current bad behavior?
> > > > > > > > - is the cookie enough?
> > > > > > > >
> > > > > > > >
> > > > > > > > *What is Cookie?*
> > > > > > > >
> > > > > > > > Cookie is originally introduced in this commit -
> > > > > > > >
> > > > > > > https://github.com/apache/bookkeeper/commit/
> > > > > > c6cc7cca3a85603c8e935ba6d06fbf
> > > > > > > > 3d8d7a7eb5
> > > > > > > > .
> > > > > > > >
> > > > > > > > A cookie is a identifier 

Re: Cookies and empty disks

2017-10-09 Thread Ivan Kelly
>> In any case, why not instead of refusing to start if any ledgers
>> reference the bookie, on boot the bookie checks which ledgers it is
>> supposed to have,
>
> How can you do this without querying the big oracle? You can use the local
> view as source of truth. Maybe I am missing one piece

Sorry, I was unclear on this. I meant to say, that if we do go down
the big oracle route, doing this may be a better option.

-Ivan


Re: Cookies and empty disks

2017-10-09 Thread Enrico Olivelli
On Mon, Oct 9, 2017, 10:54 Ivan Kelly wrote:

> Hi folks,
>
> I was travelling over the weekend, so I didn't have a chance to reply
> to anything on this thread. First off, as Enrico said, there's a lot
> of different topics being discussed at once. Perhaps each should be
> broken into a github issue, and then we can continue each conversation
> there, as it's getting a bit unwieldy for email.
>
> I've created a cookie monster project, which we can throw all the issues
> into.
> https://github.com/apache/bookkeeper/projects/1
>
> There's a few individual opinions I'd like to give here though.
>
> > Needing the check the instance of the bookie when auditing
>
> The auditor, while it does check when bookies have disappeared, it
> also periodically checks all ledgers by reading the first and last
> entry of each segment. So even if a bookie has resurrected, the
> auditor will find that it is missing entries it is supposed to have.
>
> > UUID in ledger metadata
>
> At least for the write path, I'm not sure if this is needed, but
> consider the following.
>
> Only one writer can "vote" on the entries of the ledger. Other writers
> are fencing writers. A fencing writer has to hit a majority of bookies
> to proceed to closing the ledger. Unless a majority have been wiped,
> it will not proceed to close as an empty ledger. However, if a
> majority have been wiped, the correct behaviour would be for it not be
> possible to close the ledger, as we cannot know what the end of the
> ledger is.
>
> That said, not boot if any ledger refers to a bookie could solve this.
>
> > No ledgers referencing bookie? (Sijie's suggestion)
>
> I'm resistant to this idea, because it assumes a central oracle where all
> ledgers can be queried. I know we currently have this, but I don't
> think it scales for each bookie to read the metadata of the whole
> system.
>
Makes sense

>
> In any case, why not instead of refusing to start if any ledgers
> reference the bookie, on boot the bookie checks which ledgers it is
> supposed to have,

How can you do this without querying the big oracle? You can use the local
view as source of truth. Maybe I am missing one piece

and if it doesn't have them, start pulling the data
> for them itself. While doing this replication it should avoid all new
> writes.
>
> > Storing the list of files in the cookie? (Enrico's suggestion)
>
> I don't think this is needed. The purpose of the cookie is to protect
> against stuff like a mount not coming up, or a machine being
> completely wiped. We assume that on a journalled filesystem, files
> don't just disappear arbitrarily. There may be corruption in
> individual files, but see my first point.
>

I am fine with this assumption. I have never seen this type of corruption, indeed.
I just wanted to enumerate as many cases of error as possible.

>
> Anyhow, as I said earlier, we should decide the broad topics here and
> move into issues. I've made a first pass.
>
> Regards,
> Ivan
>
-- 


-- Enrico Olivelli


Re: [VOTE] Move completely to github

2017-10-09 Thread Ivan Kelly
My vote is a +1

On Mon, Oct 9, 2017 at 11:17 AM, Ivan Kelly  wrote:
> I forgot the timeframe. The deadline to vote is midday UTC, on
> Thursday 12th October.
>
>
> On Mon, Oct 9, 2017 at 11:16 AM, Ivan Kelly  wrote:
>> Hi folks,
>>
>> We discussed this in
>> http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201710.mbox/ajax/%3CCAO2yDyarL1%2Bf7AWB89NfUH3Ji35RZVaR-CmLYV16i-1RW%2BA8Lg%40mail.gmail.com%3E
>>
>> But we never formally called a vote, so here's the vote.
>>
>> The bylaws don't actually cover votes like this, so lets use lazy
>> majority, active committers.
>>
>> -Ivan


Re: [VOTE] Move completely to github

2017-10-09 Thread Ivan Kelly
I forgot the timeframe. The deadline to vote is midday UTC, on
Thursday 12th October.


On Mon, Oct 9, 2017 at 11:16 AM, Ivan Kelly  wrote:
> Hi folks,
>
> We discussed this in
> http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201710.mbox/ajax/%3CCAO2yDyarL1%2Bf7AWB89NfUH3Ji35RZVaR-CmLYV16i-1RW%2BA8Lg%40mail.gmail.com%3E
>
> But we never formally called a vote, so here's the vote.
>
> The bylaws don't actually cover votes like this, so lets use lazy
> majority, active committers.
>
> -Ivan


Re: [VOTE] Move completely to github

2017-10-09 Thread Enrico Olivelli
+1
-- Enrico

2017-10-09 11:16 GMT+02:00 Ivan Kelly :

> Hi folks,
>
> We discussed this in
> http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201710.mbox/ajax/%3CCAO2yDyarL1%2Bf7AWB89NfUH3Ji35RZVaR-CmLYV16i-1RW%2BA8Lg%40mail.gmail.com%3E
>
> But we never formally called a vote, so here's the vote.
>
> The bylaws don't actually cover votes like this, so let's use lazy
> majority, active committers.
>
> -Ivan
>


[VOTE] Move completely to github

2017-10-09 Thread Ivan Kelly
Hi folks,

We discussed this in
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201710.mbox/ajax/%3CCAO2yDyarL1%2Bf7AWB89NfUH3Ji35RZVaR-CmLYV16i-1RW%2BA8Lg%40mail.gmail.com%3E

But we never formally called a vote, so here's the vote.

The bylaws don't actually cover votes like this, so let's use lazy
majority, active committers.

-Ivan


Re: BookKeeper web ui

2017-10-09 Thread Ivan Kelly
On Sun, Oct 8, 2017 at 9:47 AM, Enrico Olivelli  wrote:
> Do you have plans for creating a web ui?
AFAIK, there are no plans for a web ui. The http stuff is to allow a
thin client, and nothing else.

I could be wrong though.

-Ivan


Re: Cookies and empty disks

2017-10-09 Thread Ivan Kelly
Hi folks,

I was travelling over the weekend, so I didn't have a chance to reply
to anything on this thread. First off, as Enrico said, there are a lot
of different topics being discussed at once. Perhaps each should be
broken into a github issue, and then we can continue each conversation
there, as it's getting a bit unwieldy for email.

I've created a cookie monster project, which we can throw all the issues into.
https://github.com/apache/bookkeeper/projects/1

There are a few individual opinions I'd like to give here though.

> Needing to check the instance of the bookie when auditing

The auditor does check when bookies have disappeared, but it also
periodically checks all ledgers by reading the first and last entry of
each segment. So even if a bookie has been resurrected, the auditor
will find that it is missing entries it is supposed to have.
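
As a rough illustration of that periodic check, here is a minimal Java
sketch. It is not the auditor's actual code: Segment, BookieReader and
findSuspectBookies are hypothetical names, and it assumes that every
bookie in a segment's ensemble stores every entry of that segment (no
striping).

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: hypothetical types, not BookKeeper's auditor API. */
final class AuditorCheckSketch {

    /** A ledger segment: a contiguous range of entries written to one set of bookies. */
    static final class Segment {
        final long ledgerId;
        final long firstEntry;
        final long lastEntry;
        final List<String> bookies;

        Segment(long ledgerId, long firstEntry, long lastEntry, List<String> bookies) {
            this.ledgerId = ledgerId;
            this.firstEntry = firstEntry;
            this.lastEntry = lastEntry;
            this.bookies = bookies;
        }
    }

    /** Hypothetical probe that answers whether a bookie can serve a given entry. */
    interface BookieReader {
        boolean hasEntry(String bookie, long ledgerId, long entryId);
    }

    /**
     * Probes the first and last entry of the segment on every bookie and returns
     * the bookies that cannot serve one of them, e.g. a bookie that came back empty.
     * Assumption: every bookie in the segment is expected to hold every entry.
     */
    static Set<String> findSuspectBookies(Segment segment, BookieReader reader) {
        Set<String> suspects = new TreeSet<>();
        for (String bookie : segment.bookies) {
            boolean hasFirst = reader.hasEntry(bookie, segment.ledgerId, segment.firstEntry);
            boolean hasLast = reader.hasEntry(bookie, segment.ledgerId, segment.lastEntry);
            if (!hasFirst || !hasLast) {
                suspects.add(bookie);
            }
        }
        return suspects;
    }
}
```

A bookie that was wiped and restarted would fail both probes and so show
up in the returned set, which is the point being made above.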

> UUID in ledger metadata

At least for the write path, I'm not sure if this is needed, but
consider the following.

Only one writer can "vote" on the entries of the ledger. Other writers
are fencing writers. A fencing writer has to hit a majority of bookies
to proceed to closing the ledger. Unless a majority have been wiped,
it will not proceed to close as an empty ledger. However, if a
majority have been wiped, the correct behaviour would be for it not to be
possible to close the ledger, as we cannot know what the end of the
ledger is.
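
To make the majority argument easier to follow, here is a deliberately
simplified Java model. It is not BookKeeper's recovery protocol: the
class and method names are hypothetical, a wiped bookie is modelled as
one that reports -1 (no entries), and the fencing writer simply takes
the highest entry id reported by the bookies that answered.

```java
import java.util.Map;
import java.util.OptionalLong;

/** Illustrative only: a toy model of a fencing writer choosing where to close a ledger. */
final class FencingCloseSketch {

    /** Value reported by a bookie that holds no entries for the ledger (assumed convention). */
    static final long NO_ENTRIES = -1L;

    /**
     * Returns the last entry id to close the ledger at, or empty if fewer than a
     * majority of the ensemble responded and the writer therefore must not proceed.
     *
     * @param ensembleSize number of bookies the ledger was written to
     * @param responses    last entry id reported by each bookie that answered the fence request
     */
    static OptionalLong decideLastEntry(int ensembleSize, Map<String, Long> responses) {
        int majority = ensembleSize / 2 + 1;
        if (responses.size() < majority) {
            return OptionalLong.empty();
        }
        long last = NO_ENTRIES;
        for (long reported : responses.values()) {
            last = Math.max(last, reported);
        }
        return OptionalLong.of(last);
    }
}
```

In this model a responding set made up only of wiped bookies yields -1,
i.e. the ledger is closed as empty. The function has no way to tell a
wiped bookie from one that never received any entries, which is why the
bookie's identity or incarnation matters in this discussion.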

That said, refusing to boot if any ledger refers to the bookie could solve this.

> No ledgers referencing bookie? (Sijie's suggestion)

I'm resistant to this idea, because it assumes a central oracle where all
ledgers can be queried. I know we currently have this, but I don't
think it scales for each bookie to read the metadata of the whole
system.

In any case, instead of refusing to start if any ledgers reference the
bookie, why not have the bookie check on boot which ledgers it is
supposed to have and, if it doesn't have them, start pulling the data
for them itself? While doing this replication it should avoid all new
writes.
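
A minimal Java sketch of that boot-time decision, assuming the bookie
can somehow obtain the set of ledger ids that the metadata says it
should hold (how it gets that set is exactly the scaling concern above
and is left out here). All names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: hypothetical boot-time self-check, not existing bookie code. */
final class BootRecoverySketch {

    /** Whether the bookie may accept new writes yet. */
    enum Mode { RECOVERING_NO_NEW_WRITES, WRITABLE }

    /** Ledgers the metadata says this bookie should hold but that are absent on disk. */
    static Set<Long> missingLedgers(Set<Long> expectedFromMetadata, Set<Long> presentOnDisk) {
        Set<Long> missing = new TreeSet<>(expectedFromMetadata);
        missing.removeAll(presentOnDisk);
        return missing;
    }

    /** The bookie stays closed to new writes until nothing is left to re-replicate. */
    static Mode modeFor(Set<Long> stillMissing) {
        return stillMissing.isEmpty() ? Mode.WRITABLE : Mode.RECOVERING_NO_NEW_WRITES;
    }
}
```

The bookie would re-evaluate modeFor as it pulls each missing ledger,
and only switch to WRITABLE once the missing set is empty.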

> Storing the list of files in the cookie? (Enrico's suggestion)

I don't think this is needed. The purpose of the cookie is to protect
against stuff like a mount not coming up, or a machine being
completely wiped. We assume that on a journalled filesystem, files
don't just disappear arbitrarily. There may be corruption in
individual files, but see my first point.
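
For reference, the protection described here amounts to something like
the following Java sketch. It is an illustration under assumptions, not
the code in Cookie.java: the file name "cookie" and the method names
are hypothetical, and the cookie is treated as an opaque string.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: checks that every configured directory carries the expected cookie. */
final class CookieCheckSketch {

    /** Hypothetical file name; the real on-disk layout may differ. */
    static final String COOKIE_FILE = "cookie";

    /**
     * Returns the directories whose cookie is missing, unreadable or different
     * from the expected one. A bookie would refuse to start unless this is empty.
     */
    static List<Path> dirsFailingCookieCheck(List<Path> configuredDirs, String expectedCookie) {
        List<Path> failing = new ArrayList<>();
        for (Path dir : configuredDirs) {
            if (!cookieMatches(dir.resolve(COOKIE_FILE), expectedCookie)) {
                failing.add(dir);
            }
        }
        return failing;
    }

    private static boolean cookieMatches(Path cookieFile, String expectedCookie) {
        if (!Files.isRegularFile(cookieFile)) {
            return false; // e.g. the mount did not come up, or the disk was wiped
        }
        try {
            String found = new String(Files.readAllBytes(cookieFile), StandardCharsets.UTF_8);
            return found.equals(expectedCookie);
        } catch (IOException e) {
            return false; // an unreadable cookie is treated like a missing one
        }
    }
}
```

An unmounted or wiped directory has no cookie file and is therefore
reported, while a directory that merely lost some other file still
passes this check.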

Anyhow, as I said earlier, we should decide the broad topics here and
move into issues. I've made a first pass.

Regards,
Ivan


Re: Cookies and empty disks

2017-10-09 Thread Enrico Olivelli
2017-10-09 9:21 GMT+02:00 Sijie Guo :

> okay, but why do you want to track the list of files? I don't get your idea
> here.
>


If you allow a bookie to start with a journal directory which contains the
cookie file but not the other files the bookie thinks have been persisted
durably, you will fall into the correctness issue we are talking about;
you will lose fence bits, for instance.
So having a directory which contains the cookie file is not enough to say
that the bookie is in a good state.
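
A small Java sketch of the check being proposed, under stated
assumptions: the manifest is modelled as a map from directory name to
the set of file names expected in it (Config -> list of dirs -> list of
files), and the class name, method name and file names in the usage
note are hypothetical. This is a proposal sketch, not something the
cookie does today.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Illustrative only: compares a per-directory file manifest with what is on disk. */
final class CookieManifestSketch {

    /**
     * Returns, per directory, the files the manifest expects but that are absent.
     * A directory missing from onDisk counts as having no files at all.
     * Extra files on disk are ignored; only missing ones would block the boot.
     */
    static Map<String, Set<String>> missingFiles(Map<String, Set<String>> manifest,
                                                 Map<String, Set<String>> onDisk) {
        Map<String, Set<String>> missing = new TreeMap<>();
        for (Map.Entry<String, Set<String>> expected : manifest.entrySet()) {
            Set<String> absent = new TreeSet<>(expected.getValue());
            Set<String> present = onDisk.getOrDefault(expected.getKey(), Collections.<String>emptySet());
            absent.removeAll(present);
            if (!absent.isEmpty()) {
                missing.put(expected.getKey(), absent);
            }
        }
        return missing;
    }
}
```

For example, if the manifest lists "cookie" and "1.txn" for the journal
directory and only "cookie" is found, the result maps the journal
directory to "1.txn" and the bookie would not boot.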

-- Enrico





>
> - Sijie
>
> On Sun, Oct 8, 2017 at 11:45 PM, Enrico Olivelli 
> wrote:
>
> > 2017-10-09 7:52 GMT+02:00 Sijie Guo :
> >
> > > On Sat, Oct 7, 2017 at 9:53 AM, Enrico Olivelli 
> > > wrote:
> > >
> > > > On Sat, Oct 7, 2017, 00:27 Sijie Guo wrote:
> > > >
> > > > > Enrico,
> > > > >
> > > > > Let's try to come to a conclusion or an agreement on what we should
> > > > > fix and improve, before talking about who is going to drive this.
> > > > >
> > > >
> > > > Sure.
> > > >
> > > > This is my point of view:
> > > > We have separate issues:
> > > > - missing checksums, to protect fence bits
> > > > - have a bug in bookie boot, we should not allow empty directories
> > > > - have a clear lifecycle for the bookie, add/remove
> > > > - deal with reincarnation of bookies
> > > > - ensuring the correctness of the contents of the directories of the
> > > > bookie
> > > >
> > > > I would like to add a new point, we have the cookie inside every
> > > > configured directory managed by the bookie.
> > > > No cookie -> no boot
> > > > This will not be enough, we have to write in that file not only the
> > > > identity of the bookie but the list of files expected to be in the
> > > > directory.
> > > > This way you will not boot with a corrupted directory.
> > > > Config ->  list of dirs -> list of files
> > > >
> > >
> > > I am not sure why this is a new point. This is exactly what cookie is
> > > doing, no?
> > >
> >
> > Sorry, I can't find such behavior in the code on the master branch
> > https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Cookie.java
> >
> > If we have a copy of the cookie inside each directory (index + data +
> > journal), I mean that each copy should carry the exact list of files
> > expected to be present in the directory at boot.
> > So for instance when you add a new file to the set of files in a journal
> > directory you must update the cookie in that directory, same for index,
> > data.
> >
> > Maybe I am missing something.
> > It seems to me that the cookie contains only a list of directories, not of
> > "files"
> >
> > Enrico
> >
> >
> >
> >
> > >
> > >
> > > >
> > > > I agree on the fact that the bookie should be added (bookie format)
> > > > only if there is no reference to it in zk.
> > > > The bookie format operation should write the cookie in any configured
> > > > directory so that a bookie with empty directories won't ever start.
> > > >
> > > > I have to think more about this, but I wanted to share my first
> > > > thoughts
> > > >
> > > > Enrico
> > > >
> > > >
> > > > > - Sijie
> > > > >
> > > > > On Fri, Oct 6, 2017 at 1:14 PM, Enrico Olivelli <eolive...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > +1 for fixing the problem of missing cookie in 4.6
> > > > > >
> > > > > > Who drives the issue?
> > > > > >
> > > > > > Thank you all for the interesting points
> > > > > > Enrico
> > > > > >
> > > > > > On Fri, Oct 6, 2017, 21:27 Venkateswara Rao Jujjuri
> > > > > > <jujj...@gmail.com> wrote:
> > > > > >
> > > > > > > Thanks for the writeup Sijie, comments below.
> > > > > > >
> > > > > > > On Fri, Oct 6, 2017 at 12:14 PM, Sijie Guo wrote:
> > > > > > >
> > > > > > > > I think the question is mainly around "how do we recognize the
> > > > > > > > bookie" or "incarnations". And the purpose of a cookie is designed for
> > > > > > > > addressing "incarnations".
> > > > > > > >
> > > > > > > > I will try to cover following aspects, and will try to answer
> > > > > > > > questions that Ivan and JV raised.
> > > > > > > >
> > > > > > > > - what is cookie?
> > > > > > > > - how the behavior became bad?
> > > > > > > > - how do we fix current bad behavior?
> > > > > > > > - is the cookie enough?
> > > > > > > >
> > > > > > > >
> > > > > > > > *What is Cookie?*
> > > > > > > >
> > > > > > > > Cookie was originally introduced in this commit -
> > > > > > > > https://github.com/apache/bookkeeper/commit/c6cc7cca3a85603c8e935ba6d06fbf3d8d7a7eb5
> > > > > > > > .
> > > > > > > >
> > > > > > > > A cookie is an identifier of a bookie. A cookie is created on zookeeper
> > > > > > > > when a brand new bookie joins the cluster; the cookie represents the
> > > > > > > > bookie instance during its lifecycle. The cookie is sto

Re: Cookies and empty disks

2017-10-09 Thread Sijie Guo
okay, but why do you want to track the list of files? I don't get your idea
here.

- Sijie

On Sun, Oct 8, 2017 at 11:45 PM, Enrico Olivelli 
wrote:

> 2017-10-09 7:52 GMT+02:00 Sijie Guo :
>
> > On Sat, Oct 7, 2017 at 9:53 AM, Enrico Olivelli 
> > wrote:
> >
> > > On Sat, Oct 7, 2017, 00:27 Sijie Guo wrote:
> > >
> > > > Enrico,
> > > >
> > > > Let's try to come to a conclusion or an agreement on what we should
> > > > fix and improve, before talking about who is going to drive this.
> > > >
> > >
> > > Sure.
> > >
> > > This is my point of view:
> > > We have separate issues:
> > > - missing checksums, to protect fence bits
> > > - have a bug in bookie boot, we should not allow empty directories
> > > - have a clear lifecycle for the bookie, add/remove
> > > - deal with reincarnation of bookies
> > > - ensuring the correctness of the contents of the directories of the
> > > bookie
> > >
> > > I would like to add a new point, we have the cookie inside every
> > > configured directory managed by the bookie.
> > > No cookie -> no boot
> > > This will not be enough, we have to write in that file not only the
> > > identity of the bookie but the list of files expected to be in the
> > > directory.
> > > This way you will not boot with a corrupted directory.
> > > Config ->  list of dirs -> list of files
> > >
> >
> > I am not sure why this is a new point. This is exactly what cookie is
> > doing, no?
> >
>
> Sorry, I can't find such behavior in the code on the master branch
> https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Cookie.java
>
> If we have a copy of the cookie inside each directory (index + data +
> journal), I mean that each copy should carry the exact list of files
> expected to be present in the directory at boot.
> So for instance when you add a new file to the set of files in a journal
> directory you must update the cookie in that directory, same for index,
> data.
>
> Maybe I am missing something.
> It seems to me that the cookie contains only a list of directories, not of
> "files"
>
> Enrico
>
>
>
>
> >
> >
> > >
> > > I agree on the fact that the bookie should be added (bookie format)
> > > only if there is no reference to it in zk.
> > > The bookie format operation should write the cookie in any configured
> > > directory so that a bookie with empty directories won't ever start.
> > >
> > > I have to think more about this, but I wanted to share my first
> > > thoughts
> > >
> > > Enrico
> > >
> > >
> > > > - Sijie
> > > >
> > > > On Fri, Oct 6, 2017 at 1:14 PM, Enrico Olivelli wrote:
> > > >
> > > > > +1 for fixing the problem of missing cookie in 4.6
> > > > >
> > > > > Who drives the issue?
> > > > >
> > > > > Thank you all for the interesting points
> > > > > Enrico
> > > > >
> > > > > On Fri, Oct 6, 2017, 21:27 Venkateswara Rao Jujjuri
> > > > > <jujj...@gmail.com> wrote:
> > > > >
> > > > > > Thanks for the writeup Sijie, comments below.
> > > > > >
> > > > > > On Fri, Oct 6, 2017 at 12:14 PM, Sijie Guo wrote:
> > > > > >
> > > > > > > I think the question is mainly around "how do we recognize the
> > > > > > > bookie" or "incarnations". And the purpose of a cookie is designed for
> > > > > > > addressing "incarnations".
> > > > > > >
> > > > > > > I will try to cover following aspects, and will try to answer
> > > > > > > questions that Ivan and JV raised.
> > > > > > >
> > > > > > > - what is cookie?
> > > > > > > - how the behavior became bad?
> > > > > > > - how do we fix current bad behavior?
> > > > > > > - is the cookie enough?
> > > > > > >
> > > > > > >
> > > > > > > *What is Cookie?*
> > > > > > >
> > > > > > > Cookie was originally introduced in this commit -
> > > > > > > https://github.com/apache/bookkeeper/commit/c6cc7cca3a85603c8e935ba6d06fbf3d8d7a7eb5
> > > > > > > .
> > > > > > >
> > > > > > > A cookie is an identifier of a bookie. A cookie is created on zookeeper
> > > > > > > when a brand new bookie joins the cluster; the cookie represents the
> > > > > > > bookie instance during its lifecycle. The cookie is stored on all the
> > > > > > > disks for verification purposes, so if any of the disks misses the
> > > > > > > cookie (e.g. disks were reformatted or wiped out, or disks are not
> > > > > > > mounted correctly), a bookie will refuse to start.
> > > > > > >
> > > > > > >
> > > > > > > *How the behavior became bad?*
> > > > > > >
> > > > > > > The original behavior worked as expected to use the cookie in
> > > > > > > zookeeper as the source of truth. See
> > > > > > > https://github.com/apache/bookkeeper/commit/c6cc7cca3a85603c8e935ba6d06fbf3d8d7a7eb5
> > > > > > >
> > > > > > >
> > > > > > > The behavior was changed at
> > > > > > >
> > > > > > https://github.com/apache/b