On Mon, Dec 20, 2021 at 5:56 PM Aaron Stacy wrote:
>
> Hi, I'm looking for suggestions for categorizing spending (not so much things
> like paycheck, brokerage transactions, etc, but stuff like credit card
> spending for budgeting). My ledger has around 2800 transactions over about 2
> years, so it's not a ton of data, but it seems like enough that I could
> leverage something smarter than just string matching the transaction
> narrations.
>
> Does anyone have recommendations for categorizing spending?
>
> I'm thinking of applying a full text search index as follows:
>
> - Each expense account is a "document".
> - The document contents is the narration of every transaction for that
>   account.
> - To categorize a new transaction, use an engine like Lucene or
>   sklearn.TfidfVectorizer and pick the most likely account.
>
> Any thoughts on this approach? (aside from being over-engineered. I'm an
> engineer, IDK what to tell you it's what I do)
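For what it's worth, the "one document per account" idea quoted above can be sketched in a few lines. This is only a rough illustration, not a tested recipe: it hand-rolls a smoothed TF-IDF weighting and cosine similarity with the Python standard library instead of using Lucene or sklearn's TfidfVectorizer, and the function names (`build_index`, `categorize`), the account names, and the sample narrations in `HISTORY` are all made up for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical (narration, account) pairs; in practice these would be
# exported from the existing ledger file.
HISTORY = [
    ("WHOLE FOODS MARKET #123", "Expenses:Groceries"),
    ("TRADER JOE'S #45", "Expenses:Groceries"),
    ("SHELL OIL 5551212", "Expenses:Auto:Fuel"),
    ("CHEVRON 0099", "Expenses:Auto:Fuel"),
    ("NETFLIX.COM", "Expenses:Entertainment"),
]


def tokenize(text):
    """Lowercase the narration and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def normalize(vec):
    """Scale a sparse {term: weight} vector to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(history):
    """Treat each account as one document made of all its narrations."""
    term_counts = defaultdict(Counter)
    for narration, account in history:
        term_counts[account].update(tokenize(narration))

    # Document frequency: in how many accounts does each term appear?
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    n_docs = len(term_counts)
    # Smoothed inverse document frequency, so no weight is ever zero.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}
    vectors = {
        account: normalize({t: c * idf[t] for t, c in counts.items()})
        for account, counts in term_counts.items()
    }
    return idf, vectors


def categorize(narration, idf, vectors):
    """Return (best_account, cosine_score), or (None, 0.0) if no term is known."""
    counts = Counter(t for t in tokenize(narration) if t in idf)
    query = normalize({t: c * idf[t] for t, c in counts.items()})
    if not query:
        return None, 0.0
    scores = {
        account: sum(w * vec.get(t, 0.0) for t, w in query.items())
        for account, vec in vectors.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


idf, vectors = build_index(HISTORY)
print(categorize("WHOLE FOODS MARKET #987", idf, vectors))
print(categorize("SOME UNSEEN VENDOR", idf, vectors))
```

Returning the score along with the account makes it possible to leave low-scoring transactions uncategorized for manual review rather than guessing; where to set that cutoff is something to tune against real data.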
Greetings

Not sure how many transactions are in my file (a command to count them would be wonderful). My file is more complicated, as I also run a few sole proprietorships.

I started with the GIFI codes from my gooberment. For me that just wasn't enough granularity (most would likely find my system WAY over the top!!!), so I added 6 more digits (sometimes I could in fact use a couple more) in an xxxx.xx.xx.xx pattern. I enter my additions into the document (the list of GIFI codes) that I started with. That document started as an 8-page file. It's now a 54-page file after about 7 years of use. Dunno how many actual line items there are, but I separate out a lot of things to help me do good long-term thinking (the type of product works better than that for the longer term).

When I was looking for record keeping (most commonly called accounting) software, the level of granularity I thought I needed (and am now using) would have moved me into a price range that only very large companies enjoy. I have found that ledger lets me keep my records fairly quickly and very, very accurately, and it is so very easy to pull reports from. I tend to need to ask questions to find good queries, but I am finding the support here excellent, and so I now endorse 'ledger' for this use.

I can treat any level of account (and even combinations of accounts) as a 'document', and every transaction is listed.

As for categorization: I look in my account doc and find the area the item or service belongs in, and if it's new to me, I add a new account number and we're off to the races.

HTH
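To make the xxxx.xx.xx.xx idea a bit more concrete, here is a small sketch of how amounts could be rolled up at every level of such a code. It assumes the codes are kept as dotted strings; the helper names (`code_levels`, `roll_up`), the example codes, and the amounts are invented for illustration and are not real GIFI entries or anything taken from an actual ledger file.

```python
from collections import defaultdict
from decimal import Decimal


def code_levels(code):
    """Expand 'aaaa.bb.cc.dd' into every prefix, from broadest to most specific."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def roll_up(postings):
    """Sum amounts at each level of the account code hierarchy.

    postings is an iterable of (code, amount_string) pairs.
    """
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for code, amount in postings:
        for level in code_levels(code):
            totals[level] += Decimal(amount)
    return dict(totals)


# Made-up codes and amounts, purely for illustration.
postings = [
    ("1111.01.02.03", "10.00"),
    ("1111.01.05.01", "2.50"),
    ("2222.00.00.01", "4.00"),
]

for level, total in sorted(roll_up(postings).items()):
    print(level, total)
```

Each posting counts toward its own code and toward every broader prefix, so the same data answers questions at whatever level of granularity is needed.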
