On Thu, 04 Dec 2014 19:15:54 -0500
Andrew Douglas Pitonyak <and...@pitonyak.org> wrote:

> 
> On 12/04/2014 12:36 PM, Alexandro Colorado wrote:
> > On Thu, Dec 4, 2014 at 11:11 AM, Andrea Pescetti <pesce...@apache.org>
> > wrote:
> >
> >> Alexandro Colorado wrote:
> >>
> >>> Unfortunately seems these matters went into private lists. I would
> >>> suggest a public IRC meetup
> >>>
> >> This is not an official resource of the project, so the project is trying
> >> to help simply as a benefit to existing users. Edward, who owns the domain
> >> name, was cooperative and we had a brief exchange of e-mails a few months
> >> ago.
> >>
> >> The outcome, with no need of dedicated discussions, is that the best
> >> solution is:
> >> 1) Edward keeps the oooforum.org domain name, since it has historically
> >> been his
> >> 2) We agree that Ed will point oooforum.org to something like
> >> forum-archive.openoffice.org (the name is made up, but I mean something
> >> under Apache control)
> >> 3) Ed provides Apache with a full database dump and a full files tree for
> >> the phpbb installation now powering oooforum.org
> >> 4) oooforum.org remains as a public archive, but gradually we encourage
> >> people to post to forum.openoffice.org (a neutral resource, but on Apache
> >> infrastructure and under control of the project)
> >>
> >> If Ed agrees with this, we can surely implement it reasonably quickly. But
> >> we will need action from his side for item #3.
> >>
> > Agreed, and maybe he is under a lot of work. My question here is whether he
> > ever got back, and whether there was any further outreach. And is it possible
> > to share the admin credentials with an AOO contributor like Andrew P.? I heard
> > he already did an rsync of the site, but it was too large to hold on his
> > client. Maybe AOO could share space to rsync there as a read-only copy, and
> > then perform some cleanup to tag spam posts and delete the pages. 100G should
> > do it IMO.
> 
> The problem is that it was not possible to package up what was needed so
> that it could be downloaded. I have plenty of storage and would have been
> able to download it.
> 
> I did a scrape of the pages, and it was about 8GB the last time I did it.
> Offhand, I expect that a huge chunk of that is spam, especially since most
> of the spam posts have large graphics included. I considered writing a Perl
> script to clean that based on certain search criteria, but it just
> feels like a huge annoyance to spend hours removing posts and then
> trolling the rest of the files to rearrange all of the links so that
> things continue to function. So, I did not start the clean-up process
> from my scrape.
> 
> -- 
> Andrew Pitonyak
> My Macro Document: http://www.pitonyak.org/AndrewMacro.odt
> Info:  http://www.pitonyak.org/oo.php

A possible method to speed up a clean-up might be to leave the spam postings in
place (to maintain the structure), but to delete their content, replacing it with
a "Spam deleted" flag.  Not terribly elegant, but it could probably be done
automatically.
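
As a rough illustration of that idea, here is a minimal Python sketch. It
assumes each post body in the scraped pages sits in a single, non-nested
<span class="postbody"> element, which is a guess at the markup and not
something checked against the real scrape. The spam patterns and the function
names are likewise hypothetical placeholders. Each matching body is blanked in
place, so anchors and links around it stay untouched.

```python
import re
from typing import Tuple

PLACEHOLDER = "Spam deleted"

# Assumed markup: one non-nested <span class="postbody"> per post.
# The real scraped pages may differ and would need a proper HTML parser.
POST_BODY = re.compile(r'(<span class="postbody">)(.*?)(</span>)', re.DOTALL)

# Hypothetical search criteria; a real list would need tuning by hand.
SPAM_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"cheap\s+watches", r"online\s+casino")
]


def looks_like_spam(body: str) -> bool:
    """Return True if any of the spam patterns occurs in the post body."""
    return any(pattern.search(body) for pattern in SPAM_PATTERNS)


def blank_spam_posts(page_html: str) -> Tuple[str, int]:
    """Replace the body of each spam post with a placeholder.

    The surrounding tags are kept, so the page structure and any links
    pointing at the post remain valid. Returns the cleaned page and the
    number of posts that were blanked.
    """
    blanked = 0

    def replace(match: "re.Match[str]") -> str:
        nonlocal blanked
        if looks_like_spam(match.group(2)):
            blanked += 1
            return match.group(1) + PLACEHOLDER + match.group(3)
        return match.group(0)  # leave legitimate posts unchanged

    cleaned = POST_BODY.sub(replace, page_html)
    return cleaned, blanked
```

Running blank_spam_posts over every file in the scrape and writing the result
back would leave the file names and post anchors as they are, which is what
avoids the link-rearranging step.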
-- 
Rory O'Farrell <ofarr...@iol.ie>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@openoffice.apache.org
For additional commands, e-mail: dev-h...@openoffice.apache.org