Hi Developers,

In case you haven't seen recent Developer Meeting notes, I wanted to 
update everyone here on recent work investigating the migration of our 
DSpace mailing lists off of SourceForge (lists.sourceforge.net). As you 
may have heard, SourceForge had some major stability issues recently 
[1], plus there's been controversy around its practices [2], not to 
mention the fact that all our mailing lists have crashed twice this year 
already (Feb then last week).

So, in some discussions on IRC, several of us feel it's about time to 
move entirely off SourceForge. This includes finding a new home for our 
mailing lists (including this one).

Thus far, I've concentrated on migrating us to Google 
Groups. While everyone has their favorites, I've personally found Google 
Groups easier to manage, and much easier to browse and search (than 
Mailman which SourceForge uses).  Plus, many other open source projects 
in our space have jumped to Google Groups, including Fedora, Hydra, and 
Islandora. DSpace also already uses Google Groups for the DSpace 
Community Advisory Team (DCAT) mailing list (and it's become the "de 
facto" standard within DuraSpace for new mailing lists, honestly). So, 
in a sense we'd be consolidating on GG.

But, there is a big "gotcha" (hence this email discussion).

In my testing, while I can migrate our SF mailing list archives to GG, 
Google Groups ignores the *original* message's "Date" header. This means 
that if we were to move our mailing list archives to Google Groups, all 
the old messages would "appear" as if they were posted on the migration 
date (i.e. while the message's date header may say 2004, Google Groups 
would show it as 2015).  Only the *date* seems affected. From my testing, 
the archived messages, the authors, subjects and their discussion 
threads all migrate well (and in the proper order). But, the visible 
date ends up wrong.

(If anyone else has experience with this, please do get in touch. At 
this point, I suspect Google Groups simply ignores these old "Date" 
email headers in favor of the latest "Received" email header. But I 
honestly cannot find proof of others seeing the same behavior.  
Strangely, Fedora didn't see this behavior when they migrated back in 
2013 from SF to GG. But, since I'm using the exact same process they 
used, I suspect this may be a recent change in GG behavior.)
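
For anyone who wants to look at their own archived messages, here is a 
rough Python sketch of the comparison I have in mind: the original 
"Date" header versus the timestamp on the latest "Received" header. The 
sample message, its addresses and the header_dates() helper are made up 
for illustration, and this only shows the suspected mismatch. It does 
not prove what Google Groups actually does internally.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical sample: an old post re-delivered during a migration.
# All hosts and addresses below are invented for illustration.
RAW = """\
Received: from migrate.example.org by mx.example.com; Mon, 03 Aug 2015 10:15:00 +0000
Received: from lists.example.org by mail.example.org; Tue, 06 Jul 2004 09:30:00 +0000
Date: Tue, 06 Jul 2004 09:29:50 +0000
From: someone@example.org
Subject: [Dspace-devel] test message

Body text.
"""


def header_dates(raw):
    """Return (date_header, latest_received) as datetimes for one raw message."""
    msg = message_from_string(raw)
    date_header = parsedate_to_datetime(msg["Date"])
    # The timestamp of a Received header follows the last ';' in that header.
    received = [
        parsedate_to_datetime(value.rsplit(";", 1)[1].strip())
        for value in msg.get_all("Received", [])
    ]
    return date_header, max(received)


original, latest = header_dates(RAW)
print("Date header year:     ", original.year)
print("Latest Received year: ", latest.year)
```

If the two years differ (as they do for this made-up sample), that 
message is one whose displayed date would change under the behavior I'm 
suspecting.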

Because of this odd date issue, we are left with a bit of a conundrum. 
Do we...

1) Migrate to Google Groups, and just let the older messages all appear 
under Aug 2015 (or whatever the migration date ends up being).  This 
makes the old archives browsable/searchable via GG, but the dates are 
not at all trustworthy / may cause confusion.

2) Migrate to Google Groups, but leave our archives behind / saved 
elsewhere.  This would mean we'd be starting "fresh".  The old SF 
archives could be saved as static files off dspace.org (so they would be 
searchable in Google).  Plus, they'd still be searchable via archival 
sites like Nabble, GMane, The Mail Archive, etc.  (and we tend to point 
users to those services to search our archives anyways, since SF 
archives are hard to search/browse).

3) Look into migrating our list elsewhere (not Google Groups). (Though 
as mentioned, GG seems to be the new "de facto" standard these days both 
within DuraSpace and with other open source repository platforms. I 
don't see that changing anytime soon, as they all seem happy with GG.)

4) Stay on SourceForge a bit longer for mailing lists ONLY.  (Though as 
mentioned, our lists have crashed twice in the last 6 months. Not very 
confidence-inspiring.)

Thoughts? Or anyone else have experience with migrating list archives 
into Google Groups with tips to share?

- Tim

[1] https://twitter.com/sfnet_ops (see posts from July 17 until today. 
As of today, all SF services are still not fully restored)

Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org

Dspace-devel mailing list