Christopher Browne <[EMAIL PROTECTED]> writes:

> > > 1.  Ifilter performance gets increasingly /greatly/ ugly as the number
> > > of categories grows.  Doing this totally automagically means that each
> > > transaction is a "category," and if there are thousands of transactions,
> > > that's not terribly nice.
> > 
> > No, each *account* is a category. This is *NOT* duplicate matching.
> 
> Ah, yes, you're right.  "Thousands of categories" might be ugly, but the
> objection falls away pretty neatly when there is a natural
> already-present, trivial-to-fix-if-it-gets-it-wrong categorization.

Well, cmorgan and I had this conversation on #gnucash last night.  We
basically decided that what we need is an architecture where we have
the following set of maps for each import account.  I'm trying to show
this as a "tree" for simplicity, as it's a multi-layer hierarchy:

<token1>/
  <acct1> == <token_count>
  <acct2> == <token_count>
  ...
<token2>/
  <acct1> == <token_count>
  <acct2> == <token_count>
  ...

Based on this layout, the "find account" algorithm would look
something like:

1- for each token
  1a- look up the token map
  1b- build the partial percentages for the potential accounts of that token
2- combine all the partial percentages for all the potential accounts
3- choose the account with the highest percentage (or none if some threshold is
   not met)
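
As a rough sketch of the "find account" steps, here is how they might look in
Python.  The function name, the sample tokens and accounts, the 0.5 default
threshold, and in particular the choice to combine the partial percentages by
simple averaging are all assumptions made for illustration; nothing above
fixes how step 2 should actually combine them.

```python
from collections import defaultdict

def find_account(token_map, tokens, threshold=0.5):
    """Return the best-scoring account, or None if no score meets threshold.

    token_map is {token: {account: token_count}} (hypothetical in-memory form).
    """
    combined = defaultdict(float)
    seen = 0
    for token in tokens:                          # 1: for each token
        counts = token_map.get(token)             # 1a: look up the token map
        if not counts:
            continue                              # unknown token: no information
        seen += 1
        total = sum(counts.values())
        for account, count in counts.items():
            combined[account] += count / total    # 1b: partial percentage
    if seen == 0:
        return None
    # 2: combine the partial percentages (ASSUMPTION: plain average over tokens)
    scores = {account: s / seen for account, s in combined.items()}
    # 3: pick the highest score, or nothing if it is below the threshold
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

# Hypothetical sample data, not taken from any real import.
token_map = {
    "SAFEWAY": {"Expenses:Groceries": 9, "Expenses:Dining": 1},
    "STORE":   {"Expenses:Groceries": 3, "Expenses:Clothing": 3},
}
print(find_account(token_map, ["SAFEWAY", "STORE"]))
```

Raising the threshold makes the matcher more willing to answer "no account"
rather than guess, which is probably what you want while the counts are small.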

The algorithm to "store account" would look like:

1- for each token
  1a- look up the token map
  1b- increment the token count for the account (or add the account to the map)
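
The "store account" steps are short enough to sketch directly.  Again this is
only an illustration in Python: the function name and the nested-dict
representation of the maps are assumptions, not existing code.

```python
def store_account(token_map, tokens, account):
    """Record that a transaction with these tokens was filed under account.

    token_map is {token: {account: token_count}} and is updated in place.
    """
    for token in tokens:                               # 1: for each token
        counts = token_map.setdefault(token, {})       # 1a: look up the token map
        counts[account] = counts.get(account, 0) + 1   # 1b: increment, or add at 1
    return token_map

# Hypothetical usage: two imports filed under the same account.
token_map = {}
store_account(token_map, ["SAFEWAY", "STORE"], "Expenses:Groceries")
store_account(token_map, ["SAFEWAY"], "Expenses:Groceries")
print(token_map)
```

Fixing a wrong guess is then just another call to the same routine with the
account the user actually picked.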

If you look closely, you'll notice that this hierarchy maps quite
nicely to the KVP tree ;)
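
To make the mapping concrete, here is a small Python sketch that flattens the
two-level token/account hierarchy into path/value pairs, the way a
hierarchical key-value store would address them.  The "import-map" prefix and
the slash-delimited path format are purely hypothetical; this does not show
the real KVP API or the key names it would end up using.

```python
def flatten(token_map, prefix="import-map"):
    """Turn {token: {account: count}} into {"prefix/token/account": count}.

    ASSUMPTION: slash-delimited paths and the "import-map" prefix are
    illustrative only, not the actual KVP layout.
    """
    paths = {}
    for token, counts in token_map.items():
        for account, count in counts.items():
            paths[f"{prefix}/{token}/{account}"] = count
    return paths

# Hypothetical sample data.
token_map = {
    "SAFEWAY": {"Expenses:Groceries": 9, "Expenses:Dining": 1},
}
for path, count in flatten(token_map).items():
    print(path, "==", count)
```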

-derek
-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       [EMAIL PROTECTED]                        PGP key available
_______________________________________________
gnucash-devel mailing list
[EMAIL PROTECTED]
http://www.gnucash.org/cgi-bin/mailman/listinfo/gnucash-devel
