Re: AW: Tracking Mail After Folder Moves [unsigned]

2015-03-28 Thread Jerry Malcolm

Hi Bernd,

Thanks for the response.  Here's the problem scenario: a company has 
lots of dealings with their client, Joe.  Several different employees 
have to email Joe regularly.  So correspondence to Joe is spread across 
several mail accounts and several folders in each account (sent, inbox, 
archive 2013, archive 2014, etc., and some employees might have a 
"Joe's stuff" folder).  There are no restrictions as to what folder a 
particular email might be in (and it may be moved to a different 
folder tomorrow).  The 'boss' wants to be able to see a list of 'all 
email to Joe'.


As mail arrives I extract relevant search criteria and store it in my 
search engine database.  So it is easy for me to assemble the list of 
emails to Joe.  The one thing I need is to be able to then 
query JAMES_MAIL and extract the actual mail records.  A simple, 
guaranteed, unchangeable UUID in JAMES_MAIL is ALL I need. I simply 
store the UUID as the key in my search engine.  (Once I find the record, 
I can determine the account/folder from the record and then use standard 
IMAP functions to access the mail item).


My main design points...

1) The UUID must be immutable, folder-independent, and independent of 
folder moves.
2) I do not want to duplicate a db of nearly a million mail entries 
somewhere else.  I want to pull the mail from the existing JAMES_MAIL db.
3) New mail and deleted mail will be handled by search engine sync 
utilities and are not an issue.


In summary, my search engine finds the index/key record it wants.  
It must then locate that particular mail item in the JAMES_MAIL table.
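
The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the table is a simplified stand-in for JAMES_MAIL, and the column names (`mailbox`, `mail_uid`, `tracking_uuid`), the `search_index` dictionary and the `locate` helper are all hypothetical, not the real James schema or API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for JAMES_MAIL; the column names are assumptions.
conn.execute(
    "CREATE TABLE james_mail (mailbox TEXT, mail_uid INTEGER, tracking_uuid TEXT)"
)
# Index the key column, since every search hit turns into a lookup on it.
conn.execute("CREATE INDEX idx_tracking_uuid ON james_mail (tracking_uuid)")
conn.executemany(
    "INSERT INTO james_mail VALUES (?, ?, ?)",
    [("alice/Sent", 10, "uuid-a"), ("bob/Archive 2014", 3, "uuid-b")],
)

# Hypothetical external search index: recipient -> stored keys.
search_index = {"joe@example.com": ["uuid-a", "uuid-b"]}


def locate(conn, tracking_uuid):
    """Return (mailbox, mail_uid) for a key, or None if no row carries it."""
    return conn.execute(
        "SELECT mailbox, mail_uid FROM james_mail WHERE tracking_uuid = ?",
        (tracking_uuid,),
    ).fetchone()


# Resolve every search hit for Joe to its current account/folder and IMAP UID.
locations = [locate(conn, key) for key in search_index["joe@example.com"]]
```

Because the folder is read from the row at query time, the search index never has to know where a message currently lives.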


I'm currently generating a hash-based UUID from various fields such as 
from, to, subject, etc., which is working fairly well at generating a 
unique id/key.  It still feels like a hack.  And I'm sure there will be 
situations where the calculated hash duplicates that of a different email.  
So it's not 100%.
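
A minimal sketch of such a field-based hash, to make the collision risk concrete. The choice of SHA-256, the exact field list (including Date) and the `fingerprint` helper are assumptions for illustration; the message above does not say which hash or which fields are actually used.

```python
import hashlib
from email.message import EmailMessage


def fingerprint(msg, fields=("From", "To", "Subject", "Date")):
    """Hash selected header values into a hex key (hypothetical helper)."""
    h = hashlib.sha256()
    for name in fields:
        data = str(msg.get(name, "")).encode("utf-8")
        # Length-prefix each value so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()


def make_msg(body):
    """Build a test message; only the body differs between calls."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "joe@example.com"
    msg["Subject"] = "Status"
    msg["Date"] = "Fri, 13 Mar 2015 22:08:00 +0000"
    msg.set_content(body)
    return msg


a = make_msg("first body")
b = make_msg("a different body")
# Two distinct mails with identical hashed headers end up with the same key.
same_key = fingerprint(a) == fingerprint(b)
```

Two messages sent to the same recipient with the same subject in the same second are indistinguishable to this kind of key, which is the "not 100%" case described above.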


As you originally theorized, the simplest solution would have been to 
have the db autogen an incrementing id.  But as you pointed out, the 
copy/delete on folder move kills that id.  Perhaps I could add a UUID 
header to the mail when it first comes in, if no such header exists, and 
then always mirror that header value into the JAMES_MAIL table's UUID 
field for db query use (??).  Headers remain intact in the case of 
Thunderbird's copy/delete, correct?
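
The header idea can be sketched like this, assuming the raw message bytes are available when mail first arrives. The header name `X-Tracking-UUID` and the `ensure_tracking_id` helper are made up for illustration; nothing here is an existing James feature, and where such a step would hook into James is not covered.

```python
import uuid
from email import message_from_bytes, policy

TRACKING_HEADER = "X-Tracking-UUID"  # hypothetical header name


def ensure_tracking_id(raw):
    """Return (message_bytes, tracking_id), adding the header only if absent."""
    msg = message_from_bytes(raw, policy=policy.default)
    existing = msg[TRACKING_HEADER]
    if existing is not None:
        # Already stamped (e.g. a copy of an earlier mail): keep the id as-is.
        return raw, str(existing).strip()
    tracking_id = str(uuid.uuid4())
    msg[TRACKING_HEADER] = tracking_id
    return msg.as_bytes(), tracking_id


incoming = (
    b"From: alice@example.com\r\n"
    b"To: joe@example.com\r\n"
    b"Subject: Status\r\n"
    b"\r\n"
    b"Hello Joe\r\n"
)
stamped, first_id = ensure_tracking_id(incoming)
# Simulates the same message being stored again after a client-side copy.
copied, second_id = ensure_tracking_id(stamped)
```

A random (version 4) UUID needs no shared counter, so concurrent threads can stamp messages without coordinating. The sketch only holds if the client really does copy the headers unchanged, which is the open question above.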


Thoughts?

Jerry

On 3/13/2015 6:15 PM, Bernd Waibel wrote:

Sorry,

I thought about it again:
I think using a sequence is wrong. Because Thunderbird makes a COPY, you will get a new 
UUID for the B:42 mail, and as I understand it, that is not what you need.

Greetings
Bernd

-Original Message-
From: Bernd Waibel [mailto:bwai...@intarsys.de]
Sent: Saturday, 14 March 2015 00:07
To: James Users List
Subject: AW: Tracking Mail After Folder Moves [unsigned]

Hello Jerry,

just a few thoughts about alternatives (not sure I got your problem).

Why not use a database sequence field or AUTO_INCREMENT field instead of a 
UUID, and let the database handle creating the ID?
But if you would like to use UUIDs, make sure their creation is not subject 
to a race condition, as briefly described here for Postgres sequences: 
http://www.neilconway.org/docs/sequences/
James is multithreaded.

Maybe the UUID field should be indexed, if you search for it often (a sequence 
field does not need to be indexed).

Maybe a database trigger on insert could create the entries in your index table, and 
another trigger could remove them on delete.

You said you will have an hourly indexing delay when using cron. What 
happens if a new mail arrives and the user immediately moves it to 
another folder before it is indexed? Is this OK for your process?
That is just the way I handle my mail: on arrival I move messages to a new 
folder (after reading).


But a good indexing solution implemented in James would be nice, too. ;-)


Greetings
Bernd

-Original Message-
From: Jerry Malcolm [mailto:techst...@malcolms.com]
Sent: Friday, 13 March 2015 22:08
To: server-user@james.apache.org
Subject: Re: Tracking Mail After Folder Moves

Benoit,

Thanks for the info.  Kinda what I was suspecting.  Here's what I've done so 
far...

My ultimate objective is to maintain a searchable index for all of the hundreds 
of thousands of emails stored in my JAMES mail db.  As previously discussed, 
this is only possible assuming I have a way to later locate a particular email 
that I have built an index for (assuming the user will move it around between 
folders...)

1) Step one was to add one more column to the JAMES_MAIL table for my own 
globally-unique UUID
2) When JAMES stores an email, this column defaults to -1, so I'll know it 
hasn't yet been indexed
3) A cron job runs hourly and creates an index for the new mail. It also adds 
the matching index records with all of the keyword info I want to track into my 
own separate index table.
4) I have code to process index queries and identify the UUID for the desired 
mail
5) I 

Re: AW: Tracking Mail After Folder Moves [unsigned]

2015-03-13 Thread Benoit Tellier

On 13/03/2015 17:36, Bernd Waibel wrote:
 I am not familiar with IMAP; is there a move operation?
 If the move operation is implemented as a delete and create operation, 
 the identity will be lost.
 Is it possible to implement the move operation as a database renaming 
 operation, to keep the identity?


The IMAP MOVE operation is not implemented in James:

 - the processor for the IMAP command is incomplete
 - lots of mailbox implementations do not implement this operation.

But yes, you could imagine just updating the mail entry by setting a
new mailbox, a new UID and a new ModSeq.
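
To illustrate the difference between the two strategies, here is a small sketch against an in-memory SQLite table. The table and its columns (`id`, `mailbox`, `uid`, `modseq`, `subject`) are a simplified assumption, not the real James schema, and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in table; the real schema has more columns.
conn.execute(
    "CREATE TABLE mail ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " mailbox TEXT NOT NULL,"
    " uid INTEGER NOT NULL,"
    " modseq INTEGER NOT NULL,"
    " subject TEXT)"
)
conn.execute(
    "INSERT INTO mail (mailbox, uid, modseq, subject)"
    " VALUES ('INBOX', 42, 100, 'Status')"
)


def move_by_update(conn, row_id, target, new_uid, new_modseq):
    """Move by updating the row in place: the primary key survives."""
    conn.execute(
        "UPDATE mail SET mailbox = ?, uid = ?, modseq = ? WHERE id = ?",
        (target, new_uid, new_modseq, row_id),
    )


def move_by_copy_delete(conn, row_id, target, new_uid, new_modseq):
    """Move by copy then delete: the new row gets a new primary key."""
    cur = conn.execute(
        "INSERT INTO mail (mailbox, uid, modseq, subject)"
        " SELECT ?, ?, ?, subject FROM mail WHERE id = ?",
        (target, new_uid, new_modseq, row_id),
    )
    conn.execute("DELETE FROM mail WHERE id = ?", (row_id,))
    return cur.lastrowid


original_id = conn.execute("SELECT id FROM mail").fetchone()[0]
move_by_update(conn, original_id, "Archive 2014", 7, 101)
id_after_update = conn.execute("SELECT id FROM mail").fetchone()[0]
id_after_copy = move_by_copy_delete(conn, original_id, "Joe's stuff", 3, 102)
```

With the in-place update an auto-generated id would remain usable as a tracking key; with copy and delete it changes on every move.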

The current behaviour is copy and delete.

On 13/03/2015 17:36, Bernd Waibel wrote:
 But you may need to implement something like a trash inside the
 database? To cover the delete and insert action.
 Would this help?

You can do this by logging add, copy and delete operations, but you
would still have to modify James to achieve this, and you would need to
look in these logs each time you want the history of an e-mail. I think
this can be expensive.

If I had this problem, I would add to the database schema a value that
identifies a mail and its copies...

Regards,

Benoit
