Hi Bernd,
Thanks for the response. Here's the problem scenario: a company has
lots of dealing with their client, Joe. Several different employees
have to email Joe regularly. So correspondence to Joe is spread across
several mail accounts and several folders in each account (sent, inbox,
archive 2013, archive 2014, etc., and some employees might have a "Joe's
stuff" folder). There are no restrictions as to what folder a
particular email might be in (and it may be moved to a different
folder tomorrow). The 'boss' wants to be able to see a list of 'all
email to Joe'.
As mail arrives I extract relevant search criteria and store it in my
search engine database. So it is easy for me to assemble the list of
emails to Joe. The one thing I still need is to be able to then
query JAMES_MAIL and extract the actual mail records. A simple,
guaranteed, unchangeable UUID in JAMES_MAIL is ALL I need. I simply
store the UUID as the key in my search engine. (Once I find the record,
I can determine the account/folder from the record and then use standard
IMAP functions to access the mail item).
My main design points...
1) The UUID must be immutable, folder independent, and folder-move
independent.
2) I do not want to duplicate a near million mail entry db somewhere
else. I want to pull the mail from the existing JAMES_MAIL db.
3) New mail and deleted mail will be handled by search engine sync
utilities and are not an issue.
In summary, my search engine finds the index/key record it wants.
It must then locate that particular mail item in the JAMES_MAIL table.
I'm currently generating a hash UUID from various fields such as
from, to, subject, etc., which is working fairly well at generating a
unique id/key. It still feels like a hack. And I'm sure there will be
situations where the calculated hash is a dup from a different email.
So it's not 100%.
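
The hash approach described above can be sketched roughly as follows. This is only an illustration under assumptions: the function name, the choice of fields, and the use of SHA-256 are hypothetical and are not taken from the actual implementation.

```python
import hashlib


def content_key(sender, recipients, subject, date=""):
    """Derive a hex key from selected header values.

    The field choice here is an assumption made for illustration only.
    """
    # Sort recipients so that their order does not change the key.
    fields = [sender, ",".join(sorted(recipients)), subject, date]
    # Join with a control character that should not occur in header
    # values, so ("ab", "c") and ("a", "bc") produce different input.
    normalized = "\x1f".join(f.strip() for f in fields)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Two different mails with identical values in all of the chosen fields get the same key, which is exactly the duplicate concern raised above.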
As you originally theorized, the simplest solution would have been to
have the db autogen an incrementing id. But as you pointed out, the
copy/delete on folder move kills that id. Perhaps add a UUID header to
the mail when it first comes in if such header does not exist. Then
always reflect that UUID header value to the JAMES_MAIL table's UUID
field for db query use (??). Headers remain intact in case of
Thunderbird's copy/delete, correct?
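
A minimal sketch of the "add a UUID header if it does not exist" idea, using Python's standard email package purely for illustration. The header name X-Index-UUID and the function name are hypothetical, and this is not James code; in James the equivalent logic would presumably run wherever incoming mail is processed.

```python
import uuid
from email import message_from_string

HEADER = "X-Index-UUID"  # hypothetical header name, not a standard


def ensure_uuid_header(raw_message):
    """Return (uuid, message text), stamping the header only if missing."""
    msg = message_from_string(raw_message)
    value = msg.get(HEADER)
    if value is None:
        # First time this mail is seen: generate a random UUID and
        # store it in the headers so it travels with the message.
        value = str(uuid.uuid4())
        msg[HEADER] = value
    return value, msg.as_string()
```

Running it a second time on an already stamped message returns the same value, so a copy that preserves the headers keeps its identity.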
Thoughts?
Jerry
On 3/13/2015 6:15 PM, Bernd Waibel wrote:
Sorry,
I thought about it again:
I think using a sequence is wrong. Because Thunderbird makes a COPY, you will get a new
UUID for the B:42 mail, and as I understand it, that is not what you need.
Greetings
Bernd
-Original Message-
From: Bernd Waibel [mailto:bwai...@intarsys.de]
Sent: Saturday, 14 March 2015 00:07
To: James Users List
Subject: RE: Tracking Mail After Folder Moves [unsigned]
Hello Jerry,
just a few thoughts about alternatives (not sure I got your problem).
Why not use a database sequence field or AUTO_INCREMENT field instead of a
UUID, and let the database handle the ID creation?
But if you would like to use UUIDs, make sure generating them is not subject
to a race condition.
This is briefly described here for PostgreSQL sequences:
http://www.neilconway.org/docs/sequences/
James is multithreaded.
Maybe the UUID field should be indexed, if you search for it often (a sequence
field does not need to be indexed).
Maybe a database trigger on insert could populate your index table, and another
trigger could remove the entry on delete.
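
A rough sketch of the trigger idea, using SQLite through Python's sqlite3 module only so that the example is self-contained. The table and column names are hypothetical and are not the real JAMES_MAIL schema; trigger syntax differs between databases.

```python
import sqlite3


def make_db():
    """Create an in-memory database with insert/delete triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE mail (id INTEGER PRIMARY KEY AUTOINCREMENT,
                       uuid TEXT, folder TEXT, subject TEXT);
    CREATE TABLE mail_index (uuid TEXT PRIMARY KEY, subject TEXT);

    -- On insert, add an index row unless one already exists for the UUID.
    CREATE TRIGGER mail_after_insert AFTER INSERT ON mail
    BEGIN
        INSERT OR IGNORE INTO mail_index (uuid, subject)
        VALUES (NEW.uuid, NEW.subject);
    END;

    -- On delete, drop the index row only when no other copy of the
    -- mail with the same UUID is left (a move is a copy plus a delete).
    CREATE TRIGGER mail_after_delete AFTER DELETE ON mail
    BEGIN
        DELETE FROM mail_index
        WHERE uuid = OLD.uuid
          AND NOT EXISTS (SELECT 1 FROM mail WHERE uuid = OLD.uuid);
    END;
    """)
    return conn
```

With this arrangement, a folder move done as copy then delete leaves the index row in place, as long as the copy carries the same UUID value.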
You said you will have an hourly indexing delay when using cron. What
happens if a new mail arrives and the user immediately moves it to
another folder before it has been indexed? Is this OK for your process?
It is just the way I handle my mails: on arrival I move the mails to a new
folder (after reading).
But a good indexing solution implemented in James would be nice, too. ;-)
Greetings
Bernd
-Original Message-
From: Jerry Malcolm [mailto:techst...@malcolms.com]
Sent: Friday, 13 March 2015 22:08
To: server-user@james.apache.org
Subject: Re: Tracking Mail After Folder Moves
Benoit,
Thanks for the info. Kinda what I was suspecting. Here's what I've done so
far...
My ultimate objective is to maintain a searchable index for all of the hundreds
of thousands of emails stored in my JAMES mail db. As previously discussed,
this is only possible assuming I have a way to later locate a particular email
that I have built an index for (assuming the user will move it around between
folders...)
1) Step one was to add one more column to the JAMES_MAIL table for my own
globally-unique UUID
2) When JAMES stores an email, this column defaults to -1, so I'll know it
hasn't yet been indexed
3) A cron job runs hourly and creates an index for the new mail. It also adds
the matching index records with all of the keyword info I want to track into my
own separate index table.
4) I have code to process index queries and identify the UUID for the desired
mail
5) I