Re: Beancount v3

Martin Blais Mon, 13 Jul 2020 22:34:35 -0700

On Mon, Jul 6, 2020 at 5:00 AM Stefano Zacchiroli <z...@upsilon.cc> wrote:


> On Sat, Jul 04, 2020 at 02:34:35AM -0400, Martin Blais wrote:
> > Today I'm starting development on Beancount v3.
> >
> > This is going to be a pretty big change and will take a while.
> > I've laid down the details in this document:
> >
> > https://docs.google.com/document/d/1qPdNXaz5zuDQ8M9uoZFyyFis7hA0G55BEfhWhrVBsfc/
>
> This is very exciting. And, as usual, your design documents are very
> interesting and insightful to read. I took some time to read through all
> of them and I'm sharing some thoughts of mine about them below.
>
> ==================================
>
>
> Directives
> ----------
>
> Having as output of beancount core two streams of clearly separated
> incomplete/syntactic v. complete/semantic directives sounds like a great
> approach. In terms of terminology, you might use the "raw v. cooked"
> terminology (which I've picked up from proof assistants years ago, but
> which I find fitting here; YMMV). It's not yet clear to me if both
> streams will be accessible to plugins (I think they should). And, if
> they are, how will they be interleaved: a single stream with both raw
> and cooked transactions? Two separate streams?
>

One will really just be an AST and the other the actual desired stream.
One reason for letting plugins access the AST would be to let them make
changes to the input *before* other plugins run over their result.
I'm not sure I have a super compelling use case for this though, and won't
bother with the extra complexity of exposing this if I don't.



> Parser
> ------
>
> You mention you're gonna keep using flex/bison, which is for sure well
> known technology. However, the expressivity of bison grammars makes it
> kinda hard to hack on existing parsers, raising the barrier for
> contributors. Have you considered switching to PEG parsing?
>

It's fashionable, it came through my feed a couple of times recently.
I like the idea of removing the distinction between the scanner and grammar.
But at the EOTD, what's there works pretty well and changes to the grammar
are also rare (on purpose).
Maybe a toy project for later.


> Unrelated (but still on parsing), I don't understand your point about
> getting rid of the cache. Sure, we all hope it will no longer be needed for
> interactive use, but it would still be useful for people building small
> services on top of relatively static Beancount ledgers; including Fava.
> Also, as the output of Beancount core is gonna be streams of protobufs,
> those will be trivial to serialize, and also cross language, why not
> imagine a cache of protobufs serialized on disks?
>

Yes, that's precisely the idea. That's why I'm linking Riegeli, it'll be
the container for that.



> The rework of includes sounds great. We have discussed it on the list in
> the past, so I guess it's your goal, but as it's not explicitly stated
> in the design doc let me repeat it here. I think the goal should be
> "include invariance", i.e., one should always be able to take an
> existing Beancount ledger in a single file and break it down in an
> arbitrary amount of smaller ledger files that include each other,
> without any semantic change. (The stated goal in your doc of being able
> to declare plugins elsewhere than in the main file will derive from
> this, but this principle is more general.)
>

Yes, that should be the goal, though I have in mind a perhaps more
restricted version where, like today, the options have to be set in the
top-level file; the only difference is that it'll barf when you try to set
options in included files (which it always should have, this is essentially
a bug fix).
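
To make the restriction concrete, here is a minimal sketch of such a check. It
is not Beancount's actual loader: the function name, the dict-of-files input,
and the line-based regular expressions are assumptions made for illustration,
and only the `option "..."` and `include "..."` directive syntax comes from
Beancount itself.

```python
import re

# Line-based approximations of the two directives (an assumption; the real
# parser works from a grammar, not regular expressions).
OPTION_RE = re.compile(r'^option\s+"([^"]+)"')
INCLUDE_RE = re.compile(r'^include\s+"([^"]+)"')


def find_misplaced_options(files, top_level):
    """Return (filename, line number, option name) for options set outside top_level.

    `files` is a hypothetical mapping from filename to file contents, used
    here so the sketch does not depend on the filesystem.
    """
    errors = []
    seen = set()
    stack = [top_level]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # guard against include cycles
        seen.add(name)
        for lineno, line in enumerate(files.get(name, "").splitlines(), start=1):
            include = INCLUDE_RE.match(line)
            if include:
                stack.append(include.group(1))
                continue
            option = OPTION_RE.match(line)
            if option and name != top_level:
                errors.append((name, lineno, option.group(1)))
    return errors


files = {
    "main.beancount": 'option "title" "My Ledger"\ninclude "2020.beancount"\n',
    "2020.beancount": 'option "operating_currency" "USD"\n'
                      '2020-01-01 open Assets:Cash\n',
}
errors = find_misplaced_options(files, "main.beancount")
for filename, lineno, option_name in errors:
    print(f"{filename}:{lineno}: option {option_name!r} set in an included file")
```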


> The main feature I lack to have feature parity with Ledger-CLI is the
> ability to add tags to individual transaction legs. I'm assuming this
> will go hand-in-hand with relaxing the distinction between
> metadata/tags/links (by making them syntactic sugar for metadata, I'm guessing),
> which is great, thanks!
>

You mean you'd like to have the ability to add #.... at the end of a
posting line?
That should be easy to add, but I'd have to change the schema.
Can you motivate it?
When / how / why do you need to tag individual postings where tagging the
transaction isn't enough?
That would be added in v2.



> Ulque
> -----
>
> This sounds like an exciting project.
>
> In addition to support for balance columns and totals, there are a bunch
> of other features that would be very welcome, like the ability to filter
> out 0 columns, or to add derived columns (e.g., differences between
> columns, to compute P&L in investments). I don't know how much you plan
> to build on top of Pandas (which will trivially offer many of these),
> but it is absolutely brilliant to see the analogy between the two
> worlds.
>
> Something I'm surprised not to see mentioned here is your vision
> (which we discussed a while ago on list) that the hierarchical nature of
> the account hierarchy is kinda arbitrary and gets in the way (e.g., one
> often wants to pivot around from "Expenses:Home:Repair +
> Expenses:Car:Repair" to "Expenses:Repair:Home + Expenses:Repair:Car" as
> there is no right or wrong hierarchy there). Is this idea of being able
> to pivot around the account hierarchy, considering each component a
> facet of sort, part of your plans for Ulque, or is it out of scope?


I haven't thought much about changing anything in Beancount's core for
that, it seems to me like it belongs in the query tool, as transformations
on the account names. Just functions provided to manipulate the components
of account names would be sufficient.
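
As a rough illustration of what such functions might look like, here is a
sketch in plain Python. The name `swap_components` is hypothetical, not an
existing Beancount or query-tool function; it only shows that pivoting
"Expenses:Home:Repair" into "Expenses:Repair:Home" can be a simple
transformation on the account name.

```python
def swap_components(account, i, j):
    """Swap the i-th and j-th components (0-based) of an account name.

    Accounts with too few components are returned unchanged.
    """
    parts = account.split(":")
    if max(i, j) >= len(parts):
        return account
    parts[i], parts[j] = parts[j], parts[i]
    return ":".join(parts)


accounts = ["Expenses:Home:Repair", "Expenses:Car:Repair", "Assets:Cash"]
pivoted = [swap_components(account, 1, 2) for account in accounts]
print(pivoted)
```

Aggregating on the pivoted names then groups all repairs together, without
touching the ledger itself.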



> Code quality
> ------------
>
> Typing: outside of Google I've the feeling that the state-of-the-art
> static type checker is Mypy. I've myself migrated a substantial codebase
> to it and it's a vibrant environment (with a lot of involvement from
> Guido himself) and active development that goes hand in hand with the
> refinement of the type system (via periodic PEPs). I'd be wary of going
> with pytype instead of Mypy, even though I realize that the type annotations
> are (supposed to be) compatible.
>

No opinion on that... I find pytype to be slow, wouldn't mind giving mypy a
try.
The more static checking the better IMHO.
Basically I need to figure out how to integrate this in Bazel, static type
errors should be treated like build errors.


> How about automated code formatting via Black?
> (https://github.com/psf/black) I've recently switched a substantial
> code base to it and I find it pretty life changing. It would also
> help contributors I think, which is one of your worthwhile meta-goals
> for v3.
>

Auto-formatting actually drives me a bit crazy sometimes. One of the guys
on my team at work hasn't figured out how to set up his editor to disable
clang-format, and it'll arbitrarily fill function call arguments that were
carefully arranged for readability on code he didn't otherwise touch.  I'd
use auto-formatting if it was smart enough to figure out that it should
only change code near a diff hunk...  Sometimes I just want to write things
a certain way.

Maybe it's just one of those things where one day you give up and learn to
love the lack of control.


> Strict payee
> ------------
>
> YAY, everything that makes it possible to have even more automated sanity
> checks is a welcome addition.  I wonder if a relaxed policy where any
> new payee is OK on first use even if undeclared, unless it's "near" (as
> string distance) to a previous one would work well as a default policy.
> But that's probably a matter for a plugin anyway...
>

Yeah, I don't know, maybe.
If the right solution required dedicated syntax it might find its way a
little closer to the core than a plugin.
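
For reference, the relaxed policy described in the quoted text could be
sketched with the standard library alone. This is not a Beancount plugin:
the function name and the 0.8 similarity cutoff are assumptions, and
`difflib`'s ratio is used as a stand-in for whatever string distance would
actually be chosen.

```python
import difflib


def find_suspicious_payees(payees, cutoff=0.8):
    """Return (new_payee, similar_known_payee) pairs.

    A payee is accepted on first use unless it is close to, but not
    identical to, a payee that was already accepted.
    """
    known = []
    warnings = []
    for payee in payees:
        if payee in known:
            continue
        close = difflib.get_close_matches(payee, known, n=1, cutoff=cutoff)
        if close:
            warnings.append((payee, close[0]))
        else:
            known.append(payee)
    return warnings


payees = ["Whole Foods", "Shell", "Whole Foods", "Whole Fodos", "Amazon"]
warnings = find_suspicious_payees(payees)
for payee, similar in warnings:
    print(f"payee {payee!r} looks like a misspelling of {similar!r}")
```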


> Unsigned debit and credit
> -------------------------
>
> This is a very concrete need, which I routinely struggle with when
> showing accounting reports extracted from Beancount (or Fava) to other
> family members. But I'm surprised you mention it as a potential feature
> for Beancount itself. Wouldn't it belong to front-ends, like Fava (or
> maybe Ulque in the future), instead? In the view of "Beancount as an
> accounting calculator", which I've always adhered to, that seems to
> belong elsewhere.
>

I agree, but the parser (the input) is located in the core.
For the output part, I think you're right.
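
On the output side, the conversion is small enough to sketch here. This
assumes the usual signed convention where positive amounts are debits and
negative amounts are credits; the function names and the column layout are
made up for the example and do not come from Beancount or Fava.

```python
from decimal import Decimal


def to_debit_credit(amount):
    """Split a signed amount into unsigned (debit, credit) values."""
    if amount >= 0:
        return amount, Decimal("0")
    return Decimal("0"), -amount


def render(postings):
    """Format (account, signed amount) pairs as account/debit/credit rows."""
    rows = []
    for account, amount in postings:
        debit, credit = to_debit_credit(amount)
        debit_text = f"{debit:.2f}" if debit else ""
        credit_text = f"{credit:.2f}" if credit else ""
        rows.append(f"{account:<22}{debit_text:>10}{credit_text:>10}")
    return rows


postings = [
    ("Assets:Checking", Decimal("-45.00")),
    ("Expenses:Groceries", Decimal("45.00")),
]
rows = render(postings)
print("\n".join(rows))
```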


> bean-sed
> --------
>
> This is something which is not in your design documents, but seems
> important enough to me to be mentioned in light of a new Beancount
> generation. In plain text accounting we maintain two things at once: the
> semantic information captured in our books, and the syntax of those
> books, which matters more than the syntax of paper-based books (which is
> why we use Git to version and often allow ourselves to amend/curate very
> old transactions, which is something you never do with paper-based
> books, and for sure not reaching further in the past before the most
> recent book closure).
>
> But our textual books grow larger and we often need to perform batch
> changes. E.g., split an account category, merge some, rename accounts,
> etc., spanning all our books. Some of these operations are purely
> syntactic, some have impact on the semantics of our accounting data. I
> think we need a tool to automate this, more powerful than search and
> replace in vim/emacs, and with some knowledge of the data it's
> manipulating.
>
> The current style of plugins is not useful for this need. It is OK to
> patch transactions/directives post parsing, but cannot reflect those
> changes back to the textual books.
>
> Would something like this fit your vision for Beancount 3? In
> particular, I'd like to know if the raw/syntactic directives you imagine
> coming out of the new Beancount core would be close enough to the book
> concrete syntax to allow manipulation such as meddling with spacing.
> Provided that, and a good pretty printer for concrete syntax, a
> "bean-sed" project with a dedicated manipulation language can probably
> be created and maintained separately from core.
>

I find operating on the source to have been pretty sufficient for those
things.
It would require tracking whitespace and comments in the AST in order to
ensure a full round-trip, and it's not obvious.
It doesn't seem worth the effort to me.


> ==================================
>
>
> > The short version is that v3's core is going to be ported to C++ using a
> > Bazel build, and the codebase will be sectioned between core and the rest.
> > I just merged the new build definition in master.
>
> Bazel is indeed a great build system, but you should know that, at least
> for now, it is not in Debian/Ubuntu yet. So for the time being it will
> be impossible to ship Beancount v3 on those distros (and any other
> Debian-based distro) until Bazel itself is part of Debian. Work is
> ongoing (see: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=782654
> ), but I'm unable to guess when it will actually happen.
>

As always, I'm not too concerned about packaging... more concerned about
writing good code that will compile and install easily for a long time.
I'm sure they'll figure it out.




> Cheers
> --
> Stefano Zacchiroli . z...@upsilon.cc . upsilon.cc/zack . . o . . . o . o
> Computer Science Professor . CTO Software Heritage . . . . . o . . . o o
> Former Debian Project Leader & OSI Board Director  . . . o o o . . . o .
> « the first rule of tautology club is the first rule of tautology club »
>
