hi Ludo,

> > i have prepared the rest of my commits that were needed to hunt down the 
> > shepherd hanging bug. you can find them at:
> > 
> > https://codeberg.org/attila-lendvai-patches/shepherd/commits/branch/attila
> > 
> > there's some dependency among the commits, so sending them to debbugs would 
> > be either as one big series of commits, or a hopeless labirinth of patches 
> > otherwise.
> 
> 
> Yes, but OTOH, piecemeal, focused changes sent to Debbugs are easier to
> review for me. (There are 34 commits in this branch touching different
> aspects.)


i understand that, but cutting out some of the commits from this branch is a 
lot of work at best, and not possible at worst due to semantic dependencies.

e.g. how shall i implement proper error handling without being able to inspect 
what's happening (i.e. proper logging)?

nevertheless, i'll rebase my work on the devel branch eventually. it will be a 
lot of pain in itself, but if i need to reimplement/rebase stuff by hand 
anyway, then i'll try to further sort the commits in a least-controversial 
order.


> I cherry-picked a couple of patches.
> 
> Some notes:
> 
> + 94c1143 shepherd: Add tests/startup-error.sh
> 
> Redundant with ‘tests/startup-failure.sh’ I think?


one of them just returns #f from its start lambda, while the new one throws an 
error. they exercise different code paths in shepherd.


> + e802761 service: Add custom printer for <service> records.
> 
> 
> Good idea, but the goal is to remove GOOPS, so put aside for now.


ok, i'll get rid of it (move it away into a local kludge branch). its main 
purpose is to be able to simply FORMAT some service objects into the log.


> + af2ebec service: respawn-limit: make #f mean no limit.
> 
> I’d rather not do that: one can use +inf.0 when needed.


i found the respawn-limit API somewhat confusing (it requires a cons cell with 
two numbers). i thought #f could be a simple way to disable the respawn limit; 
simple both in implementation and as an API.

FWIW, it's the first time i've ever met +inf.0

but as you wish, we can manage without this commit.


> + 095e930 shepherd: Do not respawn disabled services.
> 
> That’s already the case (see commit
> 7c88d67076a0bb1d9014b3bc23ed9c68f1c702ab; maybe we hacked it
> independently in parallel).


err, hrm... i'm not sure anymore why i created that commit.

"Respawning ~a." is printed before calling START-SERVICE (that then does honor 
the ENABLED? flag). maybe i recorded this commit without actually checking 
whether the service is respawned (as opposed to merely printing an inert log 
message).

i'll get rid of this, but the incorrect respawning message will remain a source 
of confusion.


> + dbc9150 shepherd: Increase the time range for the default respawn limit.
> 
> This arbitrary and thus debatable, but I think the current setting
> works well, doesn’t it?


the current limit will not catch services whose start-to-fail time is not in 
the ballpark of 1 sec (5 times in 7 seconds).

the startup-to-fail time of the service i'm working with is way above 1 sec.


> + e03b958 support: Add logging operators.
> + 39c2e14 shepherd: add call-with-error-handling
> 
> I like the idea: we really need those backtraces to be logged!
> There are mostly-stylistic issues that would need to be discussed
> though. I’d like logging to be less baroque; I’m not convinced by:


what do you mean by 'baroque' here? too verbose in the source code?


> + 7183c9c shepherd: Populate the code with some log lines.
> 
> This is exactly what I’d like to avoid—adding logging statements all
> around the code base, possibly redundant with existing logging
> statements that target users.
> 
> What I do want though is to have “first-class logs”, pretty much like
> what we see with ‘herd log’ etc. To me, that is much more useful than
> writing the arguments passed each and every ‘fork+exec-command’ call.


don't they serve two entirely different purposes?

1) logs meant for the users of shepherd (aka herd log)

vs

2) logs that the shepherd and service developers need to understand shepherd's 
temporal behavior.


i added every logging related code in the various pursuits of hunting down 
specific bugs.

1. bug gets triggered
2. stare at logs, have some questions
3. add some more log statements
4. goto 1.

i'm not aware of any way to efficiently inspect the temporal behavior of a 
codebase other than adding explicit log statements. ideally using multiple, 
hierarchical log categories that can be turned on and off separately, both at 
runtime and at compile time.

what i added to shepherd is a super simplified, local, mock version of that 
(short of porting/finding a proper logging library in scheme).


> I’ll have to look further that branch. I admit I have limited bandwidth
> available and, perhaps selfishly, I like to use my free-time computing
> to hack myself.


it is of course your call how you make a tradeoff between building/fixing 
things by yourself, and spending your time on external contributions.

note though that such decisions will fundamentally influence the path the 
project takes. and often it's an unseen influence in the form of contributions 
that could have manifested, but in the end did not manifest.

the input i can give here is that it would be dismaying for a contributor like 
me to see a oneliner bugfix get accepted, while the entire work that was needed 
for finding that bug remains ignored (and be unavailable when looking for the 
next bug, or remain in a local branch that regularly takes up a lot of time to 
rebase).

and the icing on the cake here is that many people, who are not enthusiastic 
enough, would not even complain. they just quietly chose not to contribute.

this ties back into other discussions about the use of debbugs, the bandwidth 
issues around contributions, the various demands like the changelog format in 
commit messages, what is seen and what is not seen (Bastiat reference), etc.


> Regardless, I’d like to thank you for your continued efforts on the
> Shepherd. In one way or another, it contributes to shaping it.


my pleaseure. and thank you for creating/maintaining shepherd! it's a lot of 
fun to work with, and it makes things possible that i wouldn't even try in e.g. 
NixOS (where i have arrived from).

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“[…] without children and old people mixing in daily life a community has no 
future and no past, only a continuous present.”
        — John Taylor Gatto (1935–2018), 'Teacher of the Year Acceptance Speech'


Reply via email to