Re: [fossil-users] Unintentional fork/race condition

j. v. d. hoff Sun, 13 Jan 2013 03:59:23 -0800

just my 2 cents:

1.

I agree that it should be easier (or even happen automatically?) to merge such random forks. the `monotone' example was given. `hg' is another obvious one doing that painlessly.

2.

I agree that improvement of the CLI is not given enough attention in comparison to the GUI and that it is especially difficult (not really feasible, that is) to keep track of the branch/fork/merge structure of the timeline when solely using the CLI. in this context: some time ago I asked the list whether an 'ASCII art' DAG could be added to the timeline (as an option, not as default!), looking like this in `Mercurial':


8<--------------------------------------------------------------------
@  changeset:   230:ba70fc98b524
|  user:        u2
|  date:        Thu Dec 02 19:33:36 2010 +0100
|  summary:     inclusion of joe's changes, part 3.
|
o    changeset:   229:896e4bf421cc
|\   parent:      228:b577d53d4484
| |  parent:      227:096dd5485186
| |  user:        u2
| |  date:        Thu Dec 02 17:43:29 2010 +0100
| |  summary:     Automated merge with ssh://somehost/somefile
| o  changeset:   228:b577d53d4484
| |  parent:      226:25a0f016d4e5
| |  user:        u1
| |  date:        Thu Dec 02 17:15:43 2010 +0100
| |  summary:     - updated fig.13
| |
o |  changeset:   227:096dd5485186
|/   user:        u2
|    date:        Thu Dec 02 17:43:24 2010 +0100
|    summary:     intermediate state
|
8<--------------------------------------------------------------------

I believe if such a thing were available, even "militant" CLI users would be able to keep track of where they are on the graph (`hg' uses the `@' sign for *CURRENT*, by the way) and the reported problem would be less annoying/confusing. doing this is probably somewhat tedious but at least the logic for drawing the graph is already in place and used in the web GUI. so maybe it is feasible in finite time...

j.

On Sun, 13 Jan 2013 07:45:51 +0100, Matt Welland <estifo...@gmail.com> wrote:

On Sat, Jan 12, 2013 at 5:31 PM, Richard Hipp <d...@sqlite.org> wrote:
On Sat, Jan 12, 2013 at 6:41 PM, Matt Welland <estifo...@gmail.com> wrote:
This is with regards to the problem described here:


http://lists.fossil-scm.org:8080/pipermail/fossil-users/2008-February/000060.html

We are seeing on the order of 3-5 of these a year in our heaviest hit
repos. While this may seem like no big deal, the fact that it is so silent
is quite disruptive. The problem is that a developer working intently on a
problem may not notice for hours or even days that they are no longer
actually working on the main thread of development.

I contend that this points up issues with your development process, not
with Fossil. If your developers do not notice that a fork has occurred for
days, then they are doing "heads down" programming.  They are not
maintaining situational awareness.  (
http://en.wikipedia.org/wiki/Situation_awareness)  They are fixating on
their own (small) problems and missing the big picture.  This can lead to
dissatisfied customers and/or quality problems.

"Situational awareness" is usually studied in dynamic environments that
are safety critical, such as aviation and surgery.  Loss of situational
awareness is a leading cause of airplane crashes and medical errors.  Loss
of situational awareness is sometimes referred to as "tunnel vision".  The
person fixates on one tiny aspect of the problem and ignores the much larger
crisis unfolding around him.  Eastern Airlines flight 401 (
http://en.wikipedia.org/wiki/Eastern_Air_Lines_Flight_401) is a classic
example of this: All three pilots of an L-1011 were "working intently" on
a malfunctioning indicator light to the point that none of them noticed
that the plane was losing altitude until seconds before it crashed in the
Florida Everglades.

Though usually studied in safety critical environments, situational
awareness is applicable in any complex and dynamic problem environment,
such as developing advanced software.  When you tell me that your
developers are "intently working" on one small aspect of the problem, to
the point of not noticing for several days that the trunk has forked - that
tells me that there are likely other far more serious problems that they
are also not noticing. The fork is easily fixed with a merge. The other
more serious problems might not have such an easy fix. And they might go
undetected until your customer stumbles over them.

So, I would use the observation that forks are going undetected as a
symptom of more serious process problems in your organization, and
encourage you to seek ways of getting your developers to spend more time
"heads up" and looking at the big picture.
(Did you notice - "situational awareness" is kind of a big issue with me.
Fossil is my effort at building a DVCS that does a better job of promoting
situational awareness than the other popular VCSes out there.  I'm
constantly looking for ways to enhance Fossil to promote better situational
awareness.  Suggestions are welcomed.)

Curious response. Did you intend to be insulting? I'm working with a bunch
of very smart people who are very reluctantly learning a new tool and a
different way of doing things and forks are very confusing when they happen
in a scenario where they seemingly should not. We are not operating in a
disconnected fashion here. Fossil falls somewhat short in the support of
people who like to get their job done at the command line (about 80% of
users on my team). Distilling from the fossil timeline command that there
is a fork and how to fix it is not easy. It is very tiresome to have to go
back to the ui to ensure that a fork hasn't magically appeared.

Anyhow, I misunderstood the exact nature of the cause. I assumed that the
race condition lay within the user's fossil process, between the time of the
db query that checked for a leaf and the insertion of the new checkin data into
the db. That is of course incorrect. The actual cause is that the central
database is free to receive a commit via sync after having just done a sync
that informs the user's fossil process that it is fine to commit. Something
like the following:

User1           User2        central
sync
leafcheck       sync
commit          leafcheck
sync            commit       receives delta from user1 just fine
                sync         receives delta from user2 and now a fork exists

As you point out below that is very difficult if not impossible to "fix".
What I think would alleviate this issue would be a check for fork creation
at the end of the final sync. If a fork is found, notify the user so it can
be dealt with before confusion is created.
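
A minimal sketch of that end-of-sync check, written as a toy model in Python rather than as Fossil code. The names `Repo`, `commit`, `sync_from`, and `leaves` are hypothetical, a check-in is reduced to an id plus a parent id, and everything is assumed to live on a single branch:

```python
"""Toy model of the race: two clones pass the leaf check, then both push."""


class Repo:
    def __init__(self):
        # Maps check-in id -> parent id (None for the root check-in).
        self.parents = {"root": None}

    def leaves(self):
        """Return the check-ins that no other check-in names as its parent."""
        have_children = set(self.parents.values())
        return sorted(c for c in self.parents if c not in have_children)

    def commit(self, checkin, parent):
        self.parents[checkin] = parent

    def sync_from(self, other):
        """Pull every check-in from `other`; warn if that leaves two tips."""
        self.parents.update(other.parents)
        tips = self.leaves()
        if len(tips) > 1:
            return "WARNING: fork detected, leaves: " + ", ".join(tips)
        return None


central, user1, user2 = Repo(), Repo(), Repo()

# Both users sync and pass the leaf check against the same tip ("root").
user1.sync_from(central)
user2.sync_from(central)
user1.commit("c1", parent="root")
user2.commit("c2", parent="root")

# Central accepts both deltas; the second push is the one that forks.
first = central.sync_from(user1)
second = central.sync_from(user2)

# The final sync back to user2 is where the proposed notice would appear.
notice = user2.sync_from(central)
print(first, second, notice, sep="\n")
```

The check does not prevent the fork. It only moves the moment of discovery to the end of the sync that created it, which is the behaviour asked for above.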

Just to illustrate, I think monotone deals rather nicely with the natural
but annoying creation of forks. The user is informed immediately the fork
occurs. Then the user only has to issue "mtn merge" and it does the easy
and obvious merge. With fossil I have to poll the ui to ensure I don't have
a fork, if I do have a fork I have to browse the UI and figure out the hash
id of the fork, do the merge and finally do a commit, manually doing what
could probably be mostly automated.

Contrast with git where you know when you are causing a fork because you do
it all the time and dealing with forks is just day to day business. Fossil
will silently fork and only by starting up the ui and digging around will
it become apparent that there is a fork.

In the referred to message DRH writes:

DVCSs make it very easy to fork the tree.  To listen to
Linus Torvalds you would think this is a good thing.  But
experience suggests otherwise.
I still mostly agree with this, but requiring that every developer poll the
database for forks or risk confusion makes me think that the git approach
is perhaps not so crazy after all. If forks suck but only take seconds to
resolve, get people used to dealing with them, don't randomly create them
for no apparent reason. At least provide a heads up when they happen and
provide some help to resolve them.

In short fossil does an imperfect job of hiding the pain of forking and so
when it does occur it can be surprising and a hassle.

We added the fork detection code to the fossil wrapper which helps (we
also see forks due to time lag on syncing between remote sites) but it is
still a rather annoying problem.

My question is can this be solved by wrapping the code that determines
that we are at a leaf and the code that does the final commit with a "BEGIN
IMMEDIATE;" ... "END;"?

No.  Fossil already does that.  Has done so for years.

Ah, I saw the calls to db_begin_transaction in commit.c wrapping the check
for a fork and db_begin_transaction does "BEGIN" not "BEGIN IMMEDIATE".
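
For readers unfamiliar with the distinction, here is a small sketch of how plain "BEGIN" and "BEGIN IMMEDIATE" differ in SQLite. It uses Python's standard `sqlite3` module against a throwaway database file, not Fossil's code; the table name `leaf` and the helper `connect` are made up for the illustration:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")


def connect():
    # isolation_level=None: we issue BEGIN/COMMIT ourselves.
    # timeout=0: fail at once instead of waiting for a lock.
    return sqlite3.connect(path, isolation_level=None, timeout=0)


a, b = connect(), connect()
a.execute("CREATE TABLE leaf(id TEXT)")
a.execute("INSERT INTO leaf VALUES ('tip')")

# Deferred BEGIN: both connections may start and read concurrently,
# so both can see the same "leaf" before either one writes.
a.execute("BEGIN")
b.execute("BEGIN")
rows_a = a.execute("SELECT id FROM leaf").fetchall()
rows_b = b.execute("SELECT id FROM leaf").fetchall()
both_read = rows_a == rows_b == [("tip",)]
a.execute("COMMIT")
b.execute("COMMIT")

# BEGIN IMMEDIATE: the first connection takes the write lock up front,
# so the second one is refused until the first commits.
a.execute("BEGIN IMMEDIATE")
try:
    b.execute("BEGIN IMMEDIATE")
    second_blocked = False
except sqlite3.OperationalError:
    second_blocked = True
a.execute("COMMIT")
```

Either form only serializes access to one database file. It says nothing about a second clone with its own file, which is why the lock alone cannot rule out the fork discussed in this thread.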
The problem is that there are multiple disconnected replicas of the
database.  You cannot (reasonably) lock them all.  See
http://en.wikipedia.org/wiki/CAP_theorem - DVCSes like Fossil choose
availability and partition tolerance at the expense of (immediate)
consistency, since consistency is easily restored later by merging in the
rare event where it doesn't work out straight away.

To "fix" this problem (and again - I'm not yet convinced that it is a
problem that needs fixing) I think what you would need to do is create some
kind of "reservation" system for commits. Suppose user A and user B both
are about to commit.  Each local fossil sends a message to the central
repository that tries to "reserve" the tip of trunk for some limited period
of time, say 60 seconds.  (The reservation interval might need to be
adjusted depending on network latencies). The first reservation wins. If
user B is second, he gets back an error that says "User A is also trying to
commit - wait 60 seconds and try again". That gives user B an opportunity
to go for coffee, then merge in user A's changes before he tries again
later. You can make a reasonable argument that this is a good approach to
development. In terms of the CAP theorem, you are selecting CP rather than
the current AP.

Of course, this fix doesn't really work if you try to do a commit while
off network, since then you cannot make a reservation.  It also doesn't
work if you don't have a single central repository that everybody commits
to.  So it isn't for everybody.  But I can understand how some
organizations would want this.
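
The reservation idea can be sketched as a time-limited lease held by the central repository. This is only an illustration under stated assumptions: nothing like it exists in Fossil as far as this thread says, and the class `TipReservation`, its methods, and the injectable `clock` are all hypothetical names:

```python
import time


class TipReservation:
    """Hypothetical server-side lease on the tip of trunk."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.holder = None
        self.expires_at = 0.0

    def reserve(self, user):
        """Return (granted, message). The first live reservation wins."""
        now = self.clock()
        held_by_other = self.holder not in (None, user)
        if held_by_other and now < self.expires_at:
            wait = self.expires_at - now
            return False, (f"User {self.holder} is also trying to commit - "
                           f"wait {wait:.0f} seconds and try again")
        self.holder, self.expires_at = user, now + self.ttl
        return True, "reserved"

    def release(self, user):
        """Drop the lease early once the holder's commit has arrived."""
        if self.holder == user:
            self.holder = None


# A fake clock keeps the demonstration instant and deterministic.
fake_now = [0.0]
lease = TipReservation(ttl_seconds=60.0, clock=lambda: fake_now[0])

granted_a = lease.reserve("A")        # A is first and wins
granted_b = lease.reserve("B")        # B is told to wait
fake_now[0] = 61.0                    # A never committed; the lease lapses
granted_b_later = lease.reserve("B")  # B can now reserve
```

Because the lease expires by itself, a client that crashes after reserving cannot hold the tip forever; the cost is that a commit slower than the interval loses its protection.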
This increases the risk of leaving the db in a locked state so having a
fossil command to unlock a database would be nice.
In this same vein it would be very nice to be able to control the sqlite3
timeout. I'm fairly sure that a longer timeout would give us much better
behaviour in our usage model.
I have some scripting that can generate the forks and I'm willing to take
a stab at making this change but wanted to hear from the list if this
solution was worth trying.

_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
--
D. Richard Hipp
d...@sqlite.org


