Hey folks,

I've been in the process of moving my mail server over to gafyd.
I moved a lot of mail via the IMAP migration tool, which worked pretty
well ... but I have several thousand/GB of messages in a "not
conducive for IMAP server" format which I also want to move over.  I
thought it'd be pretty straightforward, so I wrote up a Python script
to handle this format and also formats supported by the mailbox
module.  Messages are batched up and then submitted as you would
expect using gdata.

The issue is that not all the messages are making it into gmail.  I've
spent a bit of time trying to debug this, and I can't seem to figure
out what's going on.  In short:

Submitting batch of 2038 messages
Submitting batch of 74 messages
Processed a total of 2112 message(s), 7.22M.
Submitting batch of 1372 messages
Submitting batch of 70 messages
Processed a total of 1442 message(s), 7.68M.

So that's 3554 messages total.  But after letting gmail process it,
i.e. waiting a couple of hours until "All Mail" stops changing, I end
up with only 2089 messages, drawn from all 4 submissions (i.e. the
losses aren't confined to a single batch upload).  Since the source
messages are broken up on-disk by date, I tried uploading several
single months in individual batches, and ended up with:

(source msg count, month/file, imported msg count)

    31 2002.07.csv  15
    70 2002.08.csv  53
   105 2002.09.csv  66
   111 2002.10.csv  67
   101 2002.11.csv  62
   103 2002.12.csv  63

so clearly, I'm usually getting under 2/3 of the messages I upload.
In this per-month mode, if I go ahead and try to batch upload a whole
month again, the missing messages all show up.  This tells me that it
is not my script or the source messages themselves which are the
issue.
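
For reference, the totals and per-month import ratios can be checked
with a few lines of Python.  This is only a sketch over the numbers
quoted above; the variable names are made up for illustration and
nothing here talks to gmail.

```python
# Numbers copied from the log output and the per-month table above.
batch_totals = [2112, 1442]      # "Processed a total of ..." lines
imported_total = 2089            # count seen in "All Mail" afterwards

# (month, source msg count, imported msg count)
months = [
    ("2002.07", 31, 15),
    ("2002.08", 70, 53),
    ("2002.09", 105, 66),
    ("2002.10", 111, 67),
    ("2002.11", 101, 62),
    ("2002.12", 103, 63),
]

uploaded_total = sum(batch_totals)
overall = imported_total / uploaded_total

# Fraction of each month's messages that actually showed up.
ratios = {name: imported / source for name, source, imported in months}

print("overall: %d of %d (%.0f%%)" % (imported_total, uploaded_total, overall * 100))
for name, source, imported in months:
    print("%s: %3d of %3d (%.0f%%)" % (name, imported, source, ratios[name] * 100))
```

Five of the six months come in under 2/3; only 2002.08 is above it.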

Based on the documentation, I was expecting an exception in the case
of an upload error, but none were raised.
I then was going to look at the return value from
gdata.apps.migration.service.SubmitBatch() to see if there's anything
useful there, but noticed two things:

a) The pydoc says:
          Returns:
            A HTTPResponse from the web service call.

  which is great, except it's not actually an HTTPResponse.  It's a
gdata.apps.migration.BatchMailEventFeed. :(

b) As far as I can tell, there are no useful response values in
gdata.apps.migration.BatchMailEventFeed; all the methods seem to be
about getting URLs.

Based on some other messages I've read, I thought maybe there were too
many/frequent submissions, so after each submit I added a sleep(O(log
n)), which helped out somewhat but still didn't completely solve the
problem.  I haven't tried making the sleep time larger, or trying a
single large batch to see what happens.
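
A rough sketch of the "sleep after each submit" idea, with a delay
that grows logarithmically with the batch size.  The exact formula,
the scale factor, and the submit callable are assumptions for
illustration; submit stands in for whatever actually sends the batch
(for example a wrapper around SubmitBatch()).

```python
import math
import time


def pause_seconds(message_count, scale=1.0):
    """Delay that grows with log(batch size).  The formula is an assumption."""
    return scale * math.log(max(message_count, 1) + 1)


def submit_with_pauses(batches, submit, sleep=time.sleep):
    """Call submit(batch) for every batch, sleeping after each one.

    `submit` is a hypothetical callable supplied by the caller; `sleep`
    is injectable so the loop can be exercised without really waiting.
    Returns the list of delays that were used.
    """
    delays = []
    for batch in batches:
        submit(batch)
        delay = pause_seconds(len(batch))
        sleep(delay)
        delays.append(delay)
    return delays
```

Raising scale is the easy way to try a larger sleep time without
touching the rest of the loop.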

Any thoughts?  Is there a way to get better result information?  Is
there a way to look at the gmail import process and see errors?  (I'm
assuming API migration is the same as IMAP migration, in that there's
a "submit into queue" portion and an "import from queue" portion.)

Thanks. :)



PS: I'm counting/checking the imported messages using an IMAP client
against gmail, so the difference isn't a message count vs conversation
count issue.  Also, the batches are limited by message count and size
(sum of the return values from AddBatchEntry()), so I shouldn't be
anywhere near the documented 32MB limit.
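
To illustrate the batching limits, here is a minimal sketch of
splitting messages into batches capped by both count and total size.
It is not the actual script: the function name and the two default
limits are placeholders, and len(msg) stands in for the size that
AddBatchEntry() returns.

```python
def make_batches(messages, max_count=2000, max_bytes=16 * 1024 * 1024):
    """Yield lists of messages, each within max_count and max_bytes.

    Both limits are illustrative placeholders, chosen to stay below the
    documented 32MB limit.  A single message larger than max_bytes
    still gets a batch of its own.
    """
    batch, batch_bytes = [], 0
    for msg in messages:
        size = len(msg)  # stand-in for the AddBatchEntry() return value
        # Close the current batch if adding this message would exceed a limit.
        if batch and (len(batch) >= max_count or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(msg)
        batch_bytes += size
    if batch:
        yield batch
```

For example, five 10-byte messages with max_bytes=25 come out as
batches of 2, 2 and 1 messages.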
