Re: Tracking Mail After Folder Moves
Benoit,

Thanks for the info. Kinda what I was suspecting. Here's what I've done so far...

My ultimate objective is to maintain a searchable index for all of the hundreds of thousands of emails stored in my JAMES mail db. As previously discussed, this is only possible if I have a way to later locate a particular email that I have built an index for (assuming the user will move it around between folders...)

1) Step one was to add one more column to the JAMES_MAIL table for my own globally-unique UUID.
2) When JAMES stores an email, this column defaults to -1, so I'll know it hasn't yet been indexed.
3) A cron job runs hourly and creates an index for the new mail. It also adds the matching index records, with all of the keyword info I want to track, into my own separate index table.
4) I have code to process index queries and identify the UUID for the desired mail.
5) I query the JAMES_MAIL table for the mail record using the UUID value. I then extract the folder and ID info in that record.
6) Finally, I go back around to the 'front door' and use the standard IMAP interface with the folder and ID info to access the desired email for the user.

Granted, emails can be deleted. I periodically clean out index entries for UUIDs that no longer exist.

This is all pretty much working. But as you said, this is going to require remerging everything each time I upgrade JAMES. I'm not really thrilled with modifying the schema for JAMES db tables. I wouldn't expect all of my indexing functionality to be in JAMES. But I would love to have JAMES maintain a single global UUID column in JAMES_MAIL. That would make merging my functionality with JAMES much cleaner.

As I said, this is pretty much working now the way I described. I just decided to bring it up here on the forum to make sure I'm not re-inventing the wheel by overlooking existing functionality in JAMES. It appears now that I'm blazing new trails and not duplicating anything that already exists. But if there's any talk in the future, I definitely want to keep up with discussions.

Thanks again.

Jerry

On 3/13/2015 11:42 AM, Benoit Tellier wrote:
Hi Jerry, You are right ...
Too many open files - 2.3.2
Hi All,

I am getting the following error in one of our James installations. It is not related to the file repository. I checked the source: the exception is thrown from MimeMessageInputStreamSource, where it tries to create a temp file. Will increasing the ulimit solve the problem? Please share your comments; I appreciate your help on this.

javax.mail.MessagingException: Unable to retrieve the data: Too many open files;
  nested exception is:
    java.io.IOException: Too many open files
    at org.apache.james.core.MimeMessageInputStreamSource.init(MimeMessageInputStreamSource.java:101)
    at org.apache.james.core.MailImpl.init(MailImpl.java:181)
    at org.apache.james.smtpserver.DataCmdHandler.processMail(DataCmdHandler.java:266)
    at org.apache.james.smtpserver.DataCmdHandler.doDATA(DataCmdHandler.java:133)
    at org.apache.james.smtpserver.DataCmdHandler.onCommand(DataCmdHandler.java:81)
    at org.apache.james.smtpserver.SMTPHandler.handleConnection(SMTPHandler.java:393)
    at org.apache.james.util.connection.ServerConnection$ClientConnectionRunner.run(ServerConnection.java:432)
    at org.apache.excalibur.thread.impl.ExecutableRunnable.execute(ExecutableRunnable.java:55)
    at org.apache.excalibur.thread.impl.WorkerThread.run(WorkerThread.java:116)

Thanks
Mahesh
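As background to the ulimit question, here is a small Unix-only sketch of the two things usually worth checking: the per-process open-file limit itself, and whether temp files are reliably closed after use. It is written in Python purely for illustration (James itself is Java), the function names are hypothetical, and it says nothing about what James 2.3.2 actually does internally. If descriptors are being leaked, raising the limit may only postpone the error.

```python
import resource
import tempfile


def nofile_limits():
    """Return the (soft, hard) open-file limits of this process, as seen by `ulimit -n`."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def spool_to_temp_file(data: bytes) -> int:
    """Write data to a temp file and always release the descriptor afterwards.

    The `with` block closes the file even if writing fails, so repeated calls
    do not accumulate open descriptors. Returns the number of bytes written.
    """
    with tempfile.TemporaryFile() as tmp:
        tmp.write(data)
        tmp.flush()
        return tmp.tell()
```

Calling `spool_to_temp_file` in a loop far longer than the soft limit should keep working, because each iteration releases its descriptor before the next one starts.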
AW: Tracking Mail After Folder Moves [unsigned]
Hello Jerry,

just a few thoughts about alternatives (not sure I got your problem).

Why not use a database sequence field or AUTO_INCREMENT field instead of a UUID, and let the database handle the UUID creation?

But if you would like to use UUIDs: make sure their creation is not subject to a race condition, as briefly described here for Postgres sequences: http://www.neilconway.org/docs/sequences/ James is multithreaded.

Maybe the UUID field should be indexed if you search on it often (a sequence field does not need to be indexed).

Maybe a database trigger on insert could create the entries in your index table, and another trigger could remove them on delete.

You said you will have an hourly indexing delay when using cron. What happens if a new mail arrives and the user immediately moves it to another folder, before it has been indexed? Is this OK for your process? That is just the way I handle my mails: on arrival I move them to a new folder (after reading).

But a good indexing solution implemented in James would be nice, too. ;-)

Greetings
Bernd

-----Original Message-----
From: Jerry Malcolm [mailto:techst...@malcolms.com]
Sent: Friday, 13 March 2015 22:08
To: server-user@james.apache.org
Subject: Re: Tracking Mail After Folder Moves

Benoit, Thanks for the info. Kinda what I was suspecting. Here's what I've done so far...
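Bernd's idea of a delete trigger that keeps the index table tidy could look roughly like the sketch below. It assumes SQLite (the thread does not say which database backs the James installation), and the table, column and trigger names are hypothetical. Because a folder move is a copy followed by a delete, the trigger only drops index rows once no remaining mail row carries the same tracking id.

```python
import sqlite3

# Hypothetical schema: `tracking_id` is the extra identifier column discussed in the thread.
SCHEMA = """
CREATE TABLE mail (
    mailbox TEXT NOT NULL,
    uid INTEGER NOT NULL,
    tracking_id TEXT NOT NULL,
    PRIMARY KEY (mailbox, uid)
);
CREATE INDEX mail_tracking_idx ON mail (tracking_id);
CREATE TABLE mail_index (
    tracking_id TEXT NOT NULL,
    keyword TEXT NOT NULL
);
CREATE TRIGGER mail_index_cleanup AFTER DELETE ON mail
BEGIN
    -- Runs after the row is gone, so NOT EXISTS only sees surviving copies.
    DELETE FROM mail_index
    WHERE tracking_id = OLD.tracking_id
      AND NOT EXISTS (SELECT 1 FROM mail WHERE tracking_id = OLD.tracking_id);
END;
"""


def open_store():
    """Create an in-memory database with the sketch schema and cleanup trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

With this in place, expunging the source copy after a move leaves the index untouched, and deleting the last copy removes its index entries without a separate cleanup job.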
AW: Tracking Mail After Folder Moves [unsigned]
Sorry,

I thought about it again: I think using a sequence is wrong. Because Thunderbird makes a COPY, you would get a new ID for the B:42 mail, and as I understand it, that is not what you need.

Greetings
Bernd

-----Original Message-----
From: Bernd Waibel [mailto:bwai...@intarsys.de]
Sent: Saturday, 14 March 2015 00:07
To: James Users List
Subject: AW: Tracking Mail After Folder Moves [unsigned]

Hello Jerry, just a few thoughts about alternatives (not sure I got your problem). [...]
Tracking Mail After Folder Moves
This is somewhat an IMAP question. But also a JAMES implementation question.

My client has a massive amount of mail that must be kept and accessed. They use Thunderbird and Outlook to do the normal mail handling stuff. No problems at all on the client side. But on the back end, I need to sort and organize and keep track of emails and be able to pull them up using a web interface on demand, completely independent of folders that they may currently be in. In other words, I need to keep track of 'email x' and be able to find it at a later time no matter how many times the user moves it from folder to folder.

I believe I understand the philosophy of IMAP for the client is to find a folder, display the contents, refresh periodically and add/remove mail from its records for that folder as contents change. Basically if the user moves a mail item from one folder to another, the first folder recognizes it's no longer there, and is done with it. The other folder subsequently realizes it has a new email item and displays it. But there is no knowledge that this is the same email. Have I got it pretty much correct?

So... I realize I may be stretching/bending the intent of IMAP. But that doesn't diminish the fact that I have the requirement. I've dug through all of the database table schemas for JAMES and have a pretty good handle on how mail is stored and tracked internally. But I may have missed something. So my main question is: is there a way for me to permanently track an email item and be able to locate it at some point down the road even if it's been moved around folders several times? Basically, is there a global unique ID for every email stored?

BTW I'm not bound by having to use only IMAP. I have no problem at all back-dooring to the JAMES database and writing code to use SQL to track through the database tables to find the email. I just don't think there is anything unique/unchangeable that will allow me to permanently track a particular email.

Am I totally off the wall in considering something like this? Seems a complete waste to have to duplicate a hundred gigs of mail data for my own archive when JAMES has a perfectly good copy of everything.

Suggestions?

Thanks.

Jerry

To unsubscribe, e-mail: server-user-unsubscr...@james.apache.org
For additional commands, e-mail: server-user-h...@james.apache.org
AW: Tracking Mail After Folder Moves [unsigned]
Hello Jerry,

a very good question. I would like to give my opinion, though I am not sure it will help.

We use James v2.3.2. We currently do not use the mailboxes, but anyway: we develop with James.

Some time ago we moved our old mail system (Postfix based) to a new mail system (MS Exchange). Because every user wanted to keep their old emails (gigabytes of them), we used a tool to move the mails over IMAP from one system to the other. The tool was called IMAPSync, I think; its author supports many different mail systems and has a lot of experience.

As far as I remember, an email does not have to carry an ID. It may have one, notably the Message-ID, but this header is optional.

The code in IMAPSync for syncing mails did a lot of identity handling. The software tried to sync only missing mails, so mails in both systems had to be recognized as identical in order not to be transferred again on a second sync. You may have the same problem. The author wrote something about this and provided a lot of options to handle it. As far as I remember, the software tried to identify identical mails by their headers and, if the headers were missing, it tried some hash values (or something like that).

It worked fine, with some exceptions: some mails got changed by MS Exchange on arrival. They seemed to be calendar events, which Exchange servers process in order to store them in the Outlook calendar. These mails were changed on every sync, so we had some mails that got duplicated with every sync. We simply accepted that. It was a one-way sync.

So you may use the Message-ID and some other headers to identify identical mails. But I think this is risky. I think it could be possible to identify a mail by its content.

The IMAP folder structure is a virtual structure; it does not need to be the same on the IMAP server. Even the folder names in the client do not need to be the same on the server. If you look at James, the storage may be file storage, but it could also be database storage or anything else. James supports that.

So what happens if you store the mails in a database engine, representing the folder structure as a database schema? Every mail is an object. The folder structure is nothing more than tables or something like that. Because most databases keep IDs or hash values for each object, the object identity should simply be a database field.

I am not well versed in IMAP: is there a move operation? If the move operation is implemented as a delete and create operation, the identity will be lost. Is it possible to implement the move operation as a database renaming operation, to keep the identity?

Or another idea: you could set a header (UUID) every time a mail arrives. It just needs a set-header action in James. Then you have a reliably trackable ID.

But you may need to implement something like a trash inside the database, to cover the delete and insert action?

Would this help?

Regards
Bernd Waibel

-----Original Message-----
From: Jerry Malcolm [mailto:techst...@malcolms.com]
Sent: Friday, 13 March 2015 16:50
To: James Users List
Subject: Tracking Mail After Folder Moves

This is somewhat an IMAP question. But also a JAMES implementation question. [...]
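The identification strategy Bernd recalls (use the Message-ID when present, otherwise fall back to a hash) can be sketched as below. This is not IMAPSync's actual algorithm; the set of fallback headers, the `id:` and `sha256:` prefixes, and the function name are assumptions made for the example. It also shows why the approach is risky: any server that rewrites the chosen headers or the body changes the fingerprint.

```python
import hashlib
from email import message_from_string

# Assumed choice of headers that usually survive a transfer unchanged.
FALLBACK_HEADERS = ("From", "To", "Date", "Subject")


def mail_identity(raw: str) -> str:
    """Return a best-effort identity string for a raw RFC 822 message."""
    msg = message_from_string(raw)
    message_id = msg.get("Message-ID")
    if message_id and message_id.strip():
        return "id:" + message_id.strip()

    # No Message-ID: hash selected headers plus the body text.
    digest = hashlib.sha256()
    for name in FALLBACK_HEADERS:
        value = str(msg.get(name, ""))
        digest.update(f"{name}:{value}\n".encode("utf-8", errors="replace"))
    normalized = raw.replace("\r\n", "\n")
    _, _, body = normalized.partition("\n\n")  # everything after the header block
    digest.update(body.encode("utf-8", errors="replace"))
    return "sha256:" + digest.hexdigest()
```

Headers added in transit, such as `Received`, are deliberately left out of the fallback hash so that the same mail seen on two servers still maps to the same value.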
Re: Tracking Mail After Folder Moves
Hi Jerry,

You are right ...

This is what happens when you drag and drop an e-mail in Thunderbird from folder A to B:

1 : The client receives a mail in folder A. The mail is identified by the pair (mailbox path + UID). The mailbox path (or mailbox Id) is folder specific. The UID is a long, generated per mailbox. It makes no sense on its own. Let's say we have (A : 36).
2 : You perform the drag and drop.
3 : Thunderbird issues a UID COPY command.
4 : So you have the exact same mail in B, let's say (B : 42).
5 : James dispatches an Added event for (B : 42) (here we don't know where this mail came from).
6 : Your client performs a UID EXPUNGE command on (A : 36).
7 : (A : 36) is deleted.
8 : You get the delete event for (A : 36) (here we don't know where this mail came from).

Note that the events I mentioned trigger IDLE operations, so Thunderbird becomes aware of what is happening. Then it reads (B : 42) and displays it.

Well, to sum up:
- You do not have a global e-mail identifier that survives a copy.
- You cannot base such a feature on events.

So what can you do? If I were you, I would do this:

1 : Choose a MAILBOX implementation (the one your client wants to use?).
2 : Generate a value on the mapper's add operation (either a long, if you want it sorted, or a UUID).
3 : Provide a custom message implementation with an accessor for this value.
3.5 : Everywhere in your mapper, use this new message type.
4 : On message mapper copy calls, cast the copied message to your message type and copy the field without modifying it.
5 : Here we are (note that this value may not be unique, as a message can get copied but not deleted).

You can just build it, replace the old jar for your MAILBOX implementation with the new one, and restart your James server (yes, it works). Note: update the db schemas before restarting James ;-)

Note that you do not need more: such a feature cannot be accessed over IMAP, but you can read it using another application. So you are compelled to access it through your mail storage (you said it was no problem ...).

Don't worry, such a feature is not that hard to implement.

Drawbacks: you may have to merge it with other James releases. (Or get it accepted in the project?)

Hope it helps,

Benoit

On 13/03/2015 16:50, Jerry Malcolm wrote:
This is somewhat an IMAP question. But also a JAMES implementation question. [...]
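The copy-then-expunge sequence Benoit describes, together with his proposal of a value generated on add and carried over unchanged on copy, can be modelled with a small in-memory simulation. This is a Python illustration only, not James code: `Mailbox`, `Message`, `tracking_id` and `find` are hypothetical names, and a real change would live in the Java mapper of the chosen MAILBOX implementation.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Message:
    body: str
    tracking_id: str  # the extra value: generated once, copied unchanged


@dataclass
class Mailbox:
    name: str
    next_uid: int = 1
    messages: dict = field(default_factory=dict)

    def add(self, body):
        """Mapper 'add': the only place where a new tracking id is generated."""
        return self._store(Message(body, str(uuid.uuid4())))

    def copy_from(self, source, uid):
        """Mapper 'copy': new UID in this mailbox, tracking id carried over."""
        original = source.messages[uid]
        return self._store(Message(original.body, original.tracking_id))

    def expunge(self, uid):
        """Remove the message; its UID is never reused in this mailbox."""
        del self.messages[uid]

    def _store(self, message):
        uid = self.next_uid
        self.next_uid += 1
        self.messages[uid] = message
        return uid


def find(mailboxes, tracking_id):
    """Return every (mailbox name, uid) pair currently holding this tracking id."""
    return [
        (box.name, uid)
        for box in mailboxes
        for uid, message in box.messages.items()
        if message.tracking_id == tracking_id
    ]
```

Between the copy and the expunge, `find` returns two locations for the same id, which is the non-uniqueness caveat from step 5; after the expunge only the copy in the target folder remains.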
Re: AW: Tracking Mail After Folder Moves [unsigned]
On 13/03/2015 17:36, Bernd Waibel wrote:
I am not firm with IMAP, is there a move operation? If the move operation is implemented as a delete and create operation, the identity will be lost. Is it possible to implement the move operation as a database renaming operation, to keep the identity?

The MOVE IMAP operation is not implemented in James:
- the processor for the IMAP command is incomplete
- lots of MAILBOX implementations do not have this operation implemented.

But yes, you can imagine just updating the mail entry, setting a new mailbox, a new UID and a new ModSeq. The current behaviour is copy followed by delete.

On 13/03/2015 17:36, Bernd Waibel wrote:
But you may need to implement something like a trash inside the database? To cover the delete and insert action. Would this help?

You can do this by logging add, copy and delete operations, but you still have to make modifications in James to achieve this, and you need to look in these logs each time you want the history of an e-mail. I think this can be expensive.

If I had this problem, I would add to the database schema a value that identifies a mail and its copies...

Regards,

Benoit
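The in-place alternative Benoit mentions (update the mail entry with a new mailbox, new UID and new ModSeq instead of copy plus delete) might look like this in outline. The sketch assumes SQLite and an invented two-table schema; none of the names correspond to the real James tables, and the UID and ModSeq allocation is deliberately simplistic.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a hypothetical mailbox/mail schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mailbox (
            name TEXT PRIMARY KEY,
            next_uid INTEGER NOT NULL,
            highest_modseq INTEGER NOT NULL
        );
        CREATE TABLE mail (
            id INTEGER PRIMARY KEY,          -- stable row identity
            mailbox TEXT NOT NULL REFERENCES mailbox(name),
            uid INTEGER NOT NULL,
            modseq INTEGER NOT NULL,
            UNIQUE (mailbox, uid)
        );
    """)
    return conn


def move(conn, mail_id, target):
    """Move one mail by updating its row: same id, new mailbox, UID and ModSeq."""
    with conn:  # commit on success, roll back on error
        next_uid, modseq = conn.execute(
            "SELECT next_uid, highest_modseq + 1 FROM mailbox WHERE name = ?",
            (target,),
        ).fetchone()
        conn.execute(
            "UPDATE mailbox SET next_uid = next_uid + 1, highest_modseq = ? WHERE name = ?",
            (modseq, target),
        )
        conn.execute(
            "UPDATE mail SET mailbox = ?, uid = ?, modseq = ? WHERE id = ?",
            (target, next_uid, modseq, mail_id),
        )
    return next_uid
```

Because the row's primary key never changes, anything keyed on `id` keeps pointing at the same mail after the move, which is the property the copy-and-delete behaviour loses.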