I made a change a little while ago that allows you to turn your import file
into a script:

from beancount.ingest.scripts_utils import ingest
...
CONFIG = [ .. list of importer instances .. ]
...
ingest(CONFIG)

This turns your .import file into a script that you can run with an
"identify", "extract" or "file" subcommand.
(You can still use the bean-identify, bean-extract and bean-file programs
with it as before.)

But why am I mentioning this?
Well, because the purpose of that change was to let you insert code
before and/or after running the ingestion process, and also to pass
arguments to the ingestion tools to customize them. See here:
https://bitbucket.org/blais/beancount/src/353d874f678149eb4af951d1e57b92041f7bbc7b/beancount/ingest/scripts_utils.py#lines-29

What you're looking for here is the "detect_duplicates_func" argument.
You should be able to insert

  ingest(CONFIG, my_duplicates_func)

at the bottom of your .import file and it should be invoked.
If for whatever reason it doesn't meet your customization needs, please
let me know.
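
For illustration, here is a minimal sketch of the matching rule described in
the quoted message below (same date plus same bank-label). It is not
Beancount code: Txn is a stand-in for a real transaction entry, and
find_duplicates_by_label and entry_key are hypothetical names. The exact
arguments and return value that detect_duplicates_func must have are not
shown here, so check scripts_utils.py at the link above and wrap this logic
accordingly.

```python
from collections import namedtuple
import datetime

# Stand-in for a Beancount transaction (assumption: real entries also
# expose a .date attribute and a .meta dict).
Txn = namedtuple("Txn", "date meta narration")

LABEL_KEY = "bank-label"  # metadata key named in the original question


def entry_key(entry):
    """Return (date, bank-label), or None if the entry has no label."""
    meta = getattr(entry, "meta", None) or {}
    label = meta.get(LABEL_KEY)
    if label is None:
        return None
    return (entry.date, label)


def find_duplicates_by_label(new_entries, existing_entries):
    """Return the new entries whose (date, bank-label) already exists."""
    seen = {entry_key(entry) for entry in existing_entries}
    seen.discard(None)  # unlabeled entries never count as duplicates
    return [entry for entry in new_entries if entry_key(entry) in seen]


# Made-up sample data, only to show the rule in action.
existing = [Txn(datetime.date(2018, 9, 1), {"bank-label": "A1"}, "Coffee")]
new = [
    Txn(datetime.date(2018, 9, 1), {"bank-label": "A1"}, "COFFEE SHOP"),
    Txn(datetime.date(2018, 9, 2), {"bank-label": "A1"}, "Coffee again"),
    Txn(datetime.date(2018, 9, 2), {}, "No label"),
]
duplicates = find_duplicates_by_label(new, existing)
```

With this rule only the first new entry is flagged: amounts and narration
are ignored, and there is no time window, since the date has to match
exactly.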




On Sun, Sep 2, 2018 at 8:23 AM Stefano Zacchiroli <z...@upsilon.cc> wrote:

> Heya,
>   I'm using the built-in CSV importer (beancount.ingest.importers.csv)
> with bean-extract and, in spite of being documented as bare-bones, it
> works perfectly fine for my needs :)
>
> The only issue I'm facing is that I want to customize the behavior of
> beancount.ingest.similar.SimilarityComparator and I didn't find a way to
> do so.
>
> (In short, I have a special metadata key, bank-label, which I import from
> my CSV files and which I trust as a quasi-unique ID for deduplicating
> transactions. That key + transaction date would be my ideal
> deduplication criterion. SimilarityComparator() is both stricter than
> what I want, e.g., it requires dates to be relatively near in time, with
> no way to pass a different time window; and more lax, e.g., it allows
> amounts to vary a bit.)
>
> Ideally, I'd like to write my own SimilarityComparator and pass it down
> to bean-extract via the importer configuration, but the configuration
> API doesn't allow that ATM. Would such a generalization be welcome,
> Martin? (as bug report and/or patch)
>
> Cheers
> --
> Stefano Zacchiroli . z...@upsilon.cc . upsilon.cc/zack . . o . . . o . o
> Computer Science Professor . CTO Software Heritage . . . . . o . . . o o
> Former Debian Project Leader & OSI Board Director  . . . o o o . . . o .
> « the first rule of tautology club is the first rule of tautology club »
>
> --
> You received this message because you are subscribed to the Google Groups
> "Beancount" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to beancount+unsubscr...@googlegroups.com.
> To post to this group, send email to beancount@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/beancount/20180902122320.GA27063%40upsilon.cc
> .
> For more options, visit https://groups.google.com/d/optout.
>
