Thank you for the answer, but the actual issue is that the data produced by the decoder (as configured in the conf file) is properly collected by dovecot core but /not/ passed on to the plugin: the plugin receives the original data.

This is not linked to a particular plugin (xapian, solr, squat, etc.) but seems to be a general issue in dovecot core.

On 2021-02-08 01:03, John Fawcett wrote:

On 07/02/2021 18:51, Joan Moreau wrote:

more info: the function fts_parser_script_more in plugins/fts/fts-parser.c properly reads the output of the script

still, the data is not sent to the FTS plugins (xapian or any other)

On 2021-02-07 17:37, Joan Moreau wrote:

more info : I am running dovecot git version

On 2021-02-07 17:15, Joan Moreau wrote:

a bit more on this: after adding logging to decode2text.sh, I can see that pdftotext outputs the right data, but that data is /not/ transmitted to the fts plugin for indexing (only the original PDF code is)

On 2021-02-07 17:00, Joan Moreau wrote:

Hello,

I am trying to deal properly with email attachments in the fts-xapian plugin.

I tried the default script with a PDF file.

The data I receive in the fts plugin part ("xxx_build_more") is the original document, not the output of pdftotext.

Is there anything I am missing?

Here is my config:

plugin {
plugin = fts_xapian managesieve sieve

fts = xapian
fts_xapian = partial=2 full=20 verbose=1 attachments=1

fts_autoindex = yes
fts_enforced = yes
fts_autoindex_exclude = \Trash
fts_autoindex_exclude2 = \Drafts

fts_decoder = decode2text

sieve = /data/mail/%d/%n/local.sieve
sieve_after = /data/mail/after.sieve
sieve_before = /data/mail/before.sieve
sieve_dir = /data/mail/%d/%n/sieve
sieve_global_dir = /data/mail
sieve_global_path = /data/mail/global.sieve
}

...

service decode2text {
executable = script /usr/libexec/dovecot/decode2text.sh
user = dovecot
unix_listener decode2text {
mode = 0666
}
}

Thank you

Joan

I'm not sure I can be much use for xapian, but looking at your configuration I did notice some differences with the documentation. I don't know if they are relevant to the issue you're seeing.

First of all I don't see

mail_plugins = fts

plugin = fts

settings which are both mentioned in the xapian documentation.

Also the documentation states that attachments=1 can only index text attachments. Maybe you should be using attachments=0 and let fts_decoder handle the attachments.
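
As a rough sketch of what these suggestions could look like together, assuming the rest of the posted config stays unchanged (the exact mail_plugins list is an assumption and should be checked against the fts-xapian documentation):

```
# Assumption: fts and fts_xapian are loaded through mail_plugins
mail_plugins = $mail_plugins fts fts_xapian

plugin {
  fts = xapian
  # attachments=0 so that the decoder script, not the plugin, handles attachments
  fts_xapian = partial=2 full=20 verbose=1 attachments=0
  fts_decoder = decode2text
}
```

Whether this changes what the plugin receives is untested here; it only brings the config closer to what the documentation describes.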

Failing that, I can only advise turning on some debugging to see what that brings.
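
For instance, a minimal way to get more verbose mail logging, assuming a stock dovecot 2.3 setup, might be:

```
# In dovecot.conf (or the logging file under conf.d/, if your install uses one)
mail_debug = yes
```

After reloading dovecot, reindexing a test mailbox (for example with doveadm index -u user@example.com INBOX, where the user name is a placeholder) should then show in the logs what is handed to the fts plugin.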

best regards

John
