This is fantastic, thanks for sharing, Martin. I typed up a review of my experiences with it. See the doc here <https://docs.google.com/document/d/1FsbdqbPWaupPg67kZp1CK0BoX2Ri4WhGnLa5vNTBBAs/edit#>. I tried to paste it below, but the formatting is lost (this used to work earlier).
*Overview* Extracting cashflows needed to compute IRR is challenging because of the flexibility that bookkeeping systems like Beancount need to offer in order to be useful. This is a review of beancount/experiments/returns that Martin posted recently, including what you need to get up and running, and pitfalls to watch out for. First, a review: - Fantastic documentation (as usual) and clarity in code (also as usual), along with clear output that includes the transactions, posting categorizations, templates that were identified, and extracted cashflows in the output files. - The simplicity of the generalized cashflow method (investments.py: produce_cash_flows_general()) is neat and helpful in building one’s mental model when working with this tool. - Runtime performance: it takes ~3.5 minutes to run on my transactions of about 10k investment related postings on older hardware (x86, intel core duo 1.2GHz) *Per-commodity accounting: not a hard requirement* The documentation may seem to suggest that per-commodity accounts are a requirement or are assumed by the code, but this is not actually true. The code works at the account level, and if you don’t have or use per-commodity accounting, you can still compute returns at an account level. Below are some pitfalls, solutions, and some drawbacks to not using per-commodity accounting. - The configure.py script may not work for you in this case, but it’s trivial to write your own configuration. - One problem is, it’s impossible to know which commodity to ascribe a dividend to. Even worse, reinvested dividends in the “Dividends” computation, if the cash flows into the asset account. Eg: 2020-01-02 * "Buy" Assets:Brokerage 1000 HOOL {1 USD} Assets:Bank 2020-08-01 * "Dividends" Assets:Brokerage 500 USD Income:Dividends:Brokerage ; ← not a cash account; not counted 2020-08-01 * "Reinvest dividends" Assets:Brokerage 500 HOOL {1 USD} Assets:Brokerage -500 USD ; ← not an external account; not counted However, this is only a problem if you want your total performance broken down into dividends vs. appreciation. If not, the total performance numbers you get will still be correct. Simply ignore the ex-dividends and dividends numbers. - Several of my accounts correspond to a “real” brokerage account, meaning they mix multiple commodities, cash and/or the money market settlement fund - returns.py:truncate_cash_flows() fails because my account contains both cash and a settlement fund, variably used to trade commodities. This means the cashflow can’t be reduced to a single currency based on cost, but ends up with two (cash, settlement fund). Note that I use price conversions for the settlement fund (vs. holding them at cost), to avoid generating a ton of lots with no real purpose given that the NAV of these don’t stray much from 1 USD. Fix: @@ -141,7 +144,10 @@ def truncate_cash_flows( balance = compute_balance_at(account_data.transactions, date_start) if not balance.is_empty(): cost_balance = balance.reduce(pricer.get_value, date_start) + cost_balance = cost_balance.reduce(convert.convert_position, "USD", pricer.price_map) cost_position = cost_balance.get_only_position() @@ -154,7 +160,13 @@ def truncate_cash_flows( balance = compute_balance_at(account_data.transactions, date_end) if not balance.is_empty(): cost_balance = balance.reduce(pricer.get_value, date_end) - cost_position = cost_balance.get_only_position() + cost_balance = cost_balance.reduce(convert.convert_position, "USD", pricer.price_map) + try: + cost_position = cost_balance.get_only_position() *Verification of results can be challenging:* After building a basic config file and perhaps making a few simple changes to your ledger, compute_returns.py spits out numbers. How can the correctness of these numbers be verified? One way to identify problems is to construct groupings that reflect IRRs provided by brokerages. However, the details and terminology of how IRR calculations are made can vary widely by brokerage. For example, there seems to be no agreement on what the start/end dates should be when computing “3-year performance”. It can thus be a lot of effort to figure these out even if one has just a handful of brokerages. Verifying the output of compute_returns.py may involve some effort. The challenge fundamentally seems to stem from the (fantastic) flexibility Beancount provides in constructing one's ledger, which means that the configuration for compute_returns.py needs to be correct. Short of bugs, all incorrect results are generally because of errors in configuration. So how can one verify that their configuration is correct? Thankfully, this is a place where compute_returns.py shines, with its clear, relevant output. Below, I share a few tips and personal stumbling blocks to help others figure their own: - Examine the helpful “category map” in the output files. It is the set of all accounts that are involved in transactions with an asset account in question: cat returns.output/investments/*.org | grep "^.'" | grep None Also consider whether each of them has been categorized correctly: cat returns.output/investments/*.org | grep "^.'" - The root finding solver is initialized to 0.2, so if you see a 20.00% as your CAGR, it’s almost certainly because the solver crashed - Examining the cashflow graph: - Do the inflows and outflows make sense? With many accounts, you have a sense of how frequently you transfer money and how much (eg: every paycheck). Is that being violated? - Do the dividends make sense? - Transfers should show up as an equal inflow/outflow. If not, investigate - Spot check at least some of the transactions in the investments/*.org files to ensure their categories make sense. See if there are transactions that may be missing - Finally, building one’s intuition for what the numbers should be, how the cashflow should look like, and how compute_returns.py works, is generally helpful in spotting errors *Modifications I had to make to my ledger:* There were some gotchas that I had to work through that I’ve shared below to give a sense of what one might need to look out for: - Tighten initial pad postings, so they appear on the day of the first transaction. Else, the period used for computing returns is lengthened, resulting in incorrectly worse performance - I use the rename plugin <https://github.com/redstreet/beancount_reds_plugins/tree/master/beancount_reds_plugins/rename_accounts> I wrote to rename Expenses:Taxes:* to Income:Taxes:*, to avoid it from dominating my expenses for analysis. This meant that the income accounts were categorized as income (investments.py matches against an “Income:” regex) and therefore (incorrectly) not counted as casflow, which happens in transactions that involve tax (eg: sell to cover). I had to add “Income:Taxes:*” as a cash_account. The problems manifested as accounts with obviously incorrect IRR values (eg: -144%), though this may not always happen. - In transaction entries involving a paycheck (eg: contributions to an employer sponsored retirement account), many postings were uncategorized, and thus eventually had incorrect workflows. My solution was to insulate brokerage transactions from paycheck transactions by having a transfer account in between. I started using transfer accounts in the last few years anyway via the zerosum <https://github.com/redstreet/beancount_reds_plugins/tree/master/beancount_reds_plugins/zerosum> plugin, but older transactions didn’t - Eventually, I plan to evaluate switching to per-commodity accounts that Martin has long recommended *Configuration maintenance* As one’s ledger evolves, the reports config needs to be kept up to date. One downside of per-commodity accounts is they are likely to be opened and closed more frequently than physical accounts. Omissions of all types in the returns configuration (asset, dividend, and cash accounts) can be hard to detect. Solutions are likely to be highly specific to the conventions one uses in their ledger. For my ledger for example, I plan to ensure all accounts under “(Assets|Income):Investments:*” appear at least once in the reports config file. *In-kind transfers and returns* One case that is not covered seems to be in-kind transfers. For example: option "operating_currency" "USD" plugin "beancount.plugins.implicit_prices" 2005-01-01 commodity USD 2005-01-01 commodity HOOL 2000-01-01 open Assets:Bank 2000-01-01 open Assets:Brokerage:HOOL "STRICT" 2000-01-01 open Assets:Zero-Sum-Accounts:Transfers "STRICT" 2010-01-01 * "Buy" Assets:Zero-Sum-Accounts:Transfers 1000 HOOL {0.5 USD} Assets:Bank 2020-01-01 price HOOL 1 USD 2020-01-02 * "Transfer" Assets:Brokerage:HOOL 1000 HOOL {0.5 USD} Assets:Zero-Sum-Accounts:Transfers -1000 HOOL {0.5 USD} 2020-12-23 price HOOL 1.1 USD Using the market value on the transfer date for the cashflow value should solve this case. This appears in several cases in me ledger: moving from one brokerage to another, and options exercise. It could be argued that the former could be handled by bundling together the old and new accounts, but that assumes all accounts in question deal with only a single commodity each, which means (for me), redoing a whole bunch of historical transactions. The latter is an interesting case. For context, assume an option to buy stock ABC at 1 USD was exercised when the fair market value of ABC was 5 USD. It was then held for a year, and sold for 10 USD. It is helpful to compute the returns of holding on to the exercised stock (which is an investing decision) separately from the “compensation” portion, which is the 5-1 = 4 USD. To do this, I track them in separate accounts like so: 2015-01-01 * "Exercised" Assets:Vested -100 ABC_OPTIONS Expenses:Stock-Options 100 ABC_OPTIONS Assets:XTrade:Exercised 100 ABC {1 USD} ; FMV was 4 USD Assets:Bank 100 USD ; 100 * 1 USD 2015-01-01 price ABC 4 USD ; FMV 2015-01-01 * “Transferred to brokerage” Assets:XTrade:Exercised -100 ABC {1 USD} Assets:XTrade:Invested 100 ABC {1 USD} If stock transfers in-kind were computed in the way above, one could simply compute returns separately on Assets:XTrade:Exercised and Assets:XTrade:Invested, and distinguish between the “compensation” and the “investment” portions. *Minor feedback* - Legend in graph (in .svg): both amortized value from flows, and market value show up as a black line. In the graph, one is blue + dots, other is black - How rae trailing, rolling, and total returns period define? Since everyone defines these a bit differently, perhaps balloon help on the UI would clarify it. This script helps, meanwhile: # Run on 2020-12-23 >>> from reports import * >>> from tabulate import tabulate >>> print(tabulate(get_calendar_intervals(TODAY))) ---- ---------- ---------- 2005 2005-01-01 2006-01-01 2006 2006-01-01 2007-01-01 2007 2007-01-01 2008-01-01 2008 2008-01-01 2009-01-01 2009 2009-01-01 2010-01-01 2010 2010-01-01 2011-01-01 2011 2011-01-01 2012-01-01 2012 2012-01-01 2013-01-01 2013 2013-01-01 2014-01-01 2014 2014-01-01 2015-01-01 2015 2015-01-01 2016-01-01 2016 2016-01-01 2017-01-01 2017 2017-01-01 2018-01-01 2018 2018-01-01 2019-01-01 2019 2019-01-01 2020-01-01 2020 2020-01-01 2020-12-23 ---- ---------- ---------- >>> print(tabulate(get_cumulative_intervals(TODAY))) -------------------- ---------- ---------- 15_years_ago 2005-01-01 2020-12-23 10_years_ago 2010-01-01 2020-12-23 5_years_ago 2015-01-01 2020-12-23 4_years_ago 2016-01-01 2020-12-23 3_years_ago 2017-01-01 2020-12-23 2_years_ago 2018-01-01 2020-12-23 1_year_ago 2019-01-01 2020-12-23 ytd 2020-01-01 2020-12-23 rolling_6_months_ago 2020-06-25 2020-12-23 rolling_3_months_ago 2020-09-25 2020-12-23 -------------------- ---------- ---------- -- You received this message because you are subscribed to the Google Groups "Beancount" group. To unsubscribe from this group and stop receiving emails from it, send an email to beancount+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/ebba4de0-c401-4290-a119-2cab59d5b432n%40googlegroups.com.