On Thu, 04 Dec 2014 19:15:54 -0500 Andrew Douglas Pitonyak <and...@pitonyak.org> wrote:
>
> On 12/04/2014 12:36 PM, Alexandro Colorado wrote:
> > On Thu, Dec 4, 2014 at 11:11 AM, Andrea Pescetti <pesce...@apache.org>
> > wrote:
> >
> >> Alexandro Colorado wrote:
> >>
> >>> Unfortunately seems these matters went into private lists. I would
> >>> suggest a public IRC meetup
> >>>
> >> This is not an official resource of the project, so the project is trying
> >> to help simply as a benefit to existing users. Edward, who owns the domain
> >> name, was cooperative and we had a brief exchange of e-mails a few months
> >> ago.
> >>
> >> The outcome, with no need of dedicated discussions, is that the best
> >> solution is:
> >> 1) Edward keeps the oooforum.org domain name, since it has historically
> >> been his
> >> 2) We agree that Ed will point oooforum.org to something like
> >> forum-archive.openoffice.org (the name is made up, but I mean something
> >> under Apache control)
> >> 3) Ed provides Apache with a full database dump and a full files tree for
> >> the phpbb installation now powering oooforum.org
> >> 4) oooforum.org remains as a public archive, but gradually we encourage
> >> people to post to forum.openoffice.org (a neutral resource, but on Apache
> >> infrastructure and under control of the project)
> >>
> >> If Ed agrees with this, we can surely implement it reasonably quickly. But
> >> we will need action from his side for item #3.
> >>
> > Agreed and maybe he is under a lot of work. My question here is if he ever
> > got back, were there further outreach? And is it possible to share the
> > admin credentials with an AOO contributor like Andrew P. I heard he already
> > did an rsync of the site but was too large to hold on his client. Maybe AOO
> > could share a space to rsync there as a read-only. And then perform some
> > cleanup to tag spam posts and delete the pages. 100G should do it IMO.
>
> Problem is that it was not able to package up what was needed so that it
> could be downloaded. I have plenty of storage to have been able to
> download it.
>
> I did a scrape of the pages, and it is about 8GB last time I did it. Off
> hand, I expect that a huge chunk of that is SPAM, especially since most
> of the SPAMS have large graphics included. I considered writing a PERL
> script to clean that based on certain search criteria, but, it just
> feels like a huge annoyance to spend hours removing posts and then
> trolling the rest of the files to rearrange all of the links so that
> things continue to function. So, I did not start the clean-up process
> from my scrape.
>
> --
> Andrew Pitonyak
> My Macro Document: http://www.pitonyak.org/AndrewMacro.odt
> Info: http://www.pitonyak.org/oo.php

A possible method to speed up a clean-up might be to leave the spam
postings in place (to maintain the structure), but to delete their
content, replacing it with a "Spam deleted" flag. Not terribly elegant,
but it could probably be done automatically.

--
Rory O'Farrell <ofarr...@iol.ie>
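
To illustrate the "leave the post, blank the content" idea, here is a rough Python sketch. It is only an illustration under stated assumptions: the `Post` record, the `looks_like_spam` heuristic, its keyword list, and the image threshold are all hypothetical, and nothing here reads the real phpBB database or the scraped HTML. The point is just that every post keeps its ID, so links between pages would not need to be rearranged.

```python
import re
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List

SPAM_PLACEHOLDER = "Spam deleted"

# Hypothetical, simplified stand-in for a forum post record.
@dataclass(frozen=True)
class Post:
    post_id: int
    author: str
    body: str

# Matches BBCode or HTML image tags; spam was said to carry large graphics.
IMG_TAG = re.compile(r"\[img\]|<img\b", re.IGNORECASE)

def looks_like_spam(post: Post,
                    keywords: Iterable[str] = ("casino", "cheap pills"),
                    max_images: int = 3) -> bool:
    """Assumed heuristic: flag on a keyword hit or on too many images."""
    text = post.body.lower()
    if any(word in text for word in keywords):
        return True
    return len(IMG_TAG.findall(post.body)) > max_images

def blank_spam(posts: Iterable[Post],
               is_spam: Callable[[Post], bool] = looks_like_spam) -> List[Post]:
    """Return the posts in the same order with the same IDs,
    replacing only the body of anything flagged as spam."""
    return [replace(p, body=SPAM_PLACEHOLDER) if is_spam(p) else p
            for p in posts]
```

Any real criteria would have to be tuned against the actual spam in the archive; the heuristic above is deliberately crude and would need manual review of what it flags.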