Very interesting, thank you all. I wonder if a single user's journal would suffice as a training dataset in this case. For me, the expense categories of interest are ones that have been stable for years. Plus, I’m willing to deal with false positives (but preferably not false negatives).
> There is a kind of machine learning problem called outlier detection. I think the scikit-learn library is a good starting point

Excellent, thank you for the helpful pointers! A quick search brought up these, which I’ve noted down to look into when I have time(TM):

https://scikit-learn.org/stable/modules/outlier_detection.html
https://scikit-learn.org/stable/auto_examples/neighbors/plot_lof_outlier_detection.html

On Wednesday, January 24, 2024 at 10:09:54 AM UTC-8 erical...@gmail.com wrote:

This would probably be more useful if users could provide their own examples of abnormal and normal expenses. In that case, the model itself is probably not very difficult; I imagine a variety of off-the-shelf toolkits would work. To me, the harder part seems to be making the workflow smooth and robust -- deciding how users would flag outliers, run the classifier, correct misclassifications, trigger retraining, etc.

On Wed, Jan 24, 2024 at 10:05 AM Yichu Zhou <flyaw...@gmail.com> wrote:

There is a kind of machine learning problem called outlier detection. I think the scikit-learn library is a good starting point if we want to use ML techniques. But in our case, I feel the definition of “abnormal” varies with each person's situation. It might be tricky to formulate the problem properly.

On Tue, Jan 23, 2024 at 21:24 Red S <redst...@gmail.com> wrote:

Definitely! That's what I had in mind. Would you or others on this list have experience in how to frame this as a deep learning classification problem, what tools/libraries to use, and such? Pointers appreciated.

On Tuesday, January 23, 2024 at 8:08:31 AM UTC-8 char...@gmail.com wrote:

Sounds like a good opportunity for a deep learning classification problem.

On Friday, January 19, 2024 at 11:45:35 AM UTC+1 Red S wrote:

I'm curious, has anyone set up Beancount scripts or reports to flag expenses that might need further attention?

The situation that made me think about this is a quarterly bill that doubled multiple times after years of being stable, which is an obvious red flag. Unlike in the past, virtually all of my Beancount interactions are highly automated, which, combined with the fact that time is at a premium these days, causes me to miss details like this. In this particular case, a rule to flag expenses that deviate from their norm over a certain time period (monthly, annually) might be simple to write, but I was wondering about a more general, perhaps fancier solution that would learn to distinguish what's normal and call attention to what's not, as rule-based solutions tend to be incomplete and require constant fiddling.
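
For reference, the simple rule mentioned above (flag an expense that deviates from its own norm) could look roughly like the sketch below. This is only an illustration under stated assumptions, not something anyone in the thread posted. It takes plain (date, account, amount) tuples rather than reading a ledger, and the function name flag_unusual, the account name, and the threshold, min_history and 5% floor values are all made up for the example. It compares each posting with the median of earlier postings to the same account, scaled by the median absolute deviation. A low threshold gives more false positives and fewer misses.

```python
from collections import defaultdict
from statistics import median


def flag_unusual(expenses, threshold=3.0, min_history=4):
    """Flag postings that deviate from the account's own history.

    expenses: iterable of (date, account, amount), sorted by date.
    Returns a list of (date, account, amount, score) for flagged postings.
    All parameter defaults are illustrative assumptions, not tuned values.
    """
    history = defaultdict(list)  # account -> amounts seen so far
    flagged = []
    for date, account, amount in expenses:
        past = history[account]
        if len(past) >= min_history:
            med = median(past)
            # Median absolute deviation: a spread estimate that is not
            # dragged upward by the very outliers we want to catch.
            mad = median([abs(x - med) for x in past])
            # Floor the spread at 5% of the median (and one cent) so that a
            # perfectly stable bill does not produce a division by zero.
            scale = max(mad, 0.05 * abs(med), 0.01)
            score = abs(amount - med) / scale
            if score > threshold:
                flagged.append((date, account, amount, score))
        # Record the posting either way; the median stays robust to it.
        past.append(amount)
    return flagged


if __name__ == "__main__":
    # Hypothetical quarterly bill: stable for a year, then doubling twice.
    account = "Expenses:Utilities:Water"
    rows = [
        ("2022-03-01", account, 100.00),
        ("2022-06-01", account, 100.00),
        ("2022-09-01", account, 100.00),
        ("2022-12-01", account, 100.00),
        ("2023-03-01", account, 200.00),
        ("2023-06-01", account, 400.00),
    ]
    for date, acct, amount, score in flag_unusual(rows):
        print(f"{date} {acct} {amount:.2f} (score {score:.1f})")
```

Because the comparison uses the median, a single doubled bill that enters the history does not immediately become the new normal, so a second doubling is still flagged. Accounts with irregular spending would need a different baseline, which is where the outlier-detection tools mentioned earlier in the thread might fit better.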