Re: wiki slightly broken still?
On Tue, Aug 3, 2010 at 4:18 PM, David Gerard wrote:
> [to list as well]
>
> On 3 August 2010 15:07, Dimi Paun wrote:
>> On Tue, 2010-08-03 at 14:30 +0200, Alexandru Băluț wrote:
>
>>> How difficult would it be to use ReCaptcha?
>>> http://www.google.com/recaptcha
>
>> Hm, don't know. We could hack our version to support recaptcha,
>> but I'm not familiar with the code base, and I don't have the
>> time right now. But I can take patches if someone is willing
>> to do it.
>
> The MoinMoin developers consider TextCHA inherently superior and so
> have no interest in writing a reCaptcha interface:
>
> http://moinmo.in/FeatureRequests/ReCaptcha
>
> (Note also the problems people have had with TextCHA: it becomes too
> much work to write the questions and to answer the questions.)
>
> If someone wants reCaptcha in MoinMoin, it appears they will need to
> write it all themselves.
>
> - d.

reCaptcha has essentially been cracked now
(http://it.slashdot.org/story/10/08/05/2054247/ReCAPTCHAnet-Now-Vulnerable-to-Algorithmic-Attack)
so I'm not sure it's worth using it in the wiki.

Damjan Jovanovic
Re: wiki slightly broken still?
[to list as well]

On 3 August 2010 15:07, Dimi Paun wrote:
> On Tue, 2010-08-03 at 14:30 +0200, Alexandru Băluț wrote:

>> How difficult would it be to use ReCaptcha?
>> http://www.google.com/recaptcha

> Hm, don't know. We could hack our version to support recaptcha,
> but I'm not familiar with the code base, and I don't have the
> time right now. But I can take patches if someone is willing
> to do it.

The MoinMoin developers consider TextCHA inherently superior and so
have no interest in writing a reCaptcha interface:

http://moinmo.in/FeatureRequests/ReCaptcha

(Note also the problems people have had with TextCHA: it becomes too
much work to write the questions and to answer the questions.)

If someone wants reCaptcha in MoinMoin, it appears they will need to
write it all themselves.

- d.
Re: wiki slightly broken still?
On Tue, 2010-08-03 at 14:30 +0200, Alexandru Băluț wrote:
> How difficult would it be to use ReCaptcha?
>
> http://www.google.com/recaptcha

Hm, don't know. We could hack our version to support recaptcha,
but I'm not familiar with the code base, and I don't have the
time right now. But I can take patches if someone is willing
to do it.

--
Dimi Paun
Lattica, Inc.
Re: wiki slightly broken still?
On Thu, Jul 29, 2010 at 18:01, Dimi Paun wrote:
> Now that this issue is fixed, we can look again at the spam
> problem. It was suggested that we use a 'TextChas' for non
> logged in users:
> http://moinmo.in/HelpOnSpam
>
> But it seems it's not too easy to come up with decent questions.
> Should we try it?

How difficult would it be to use ReCaptcha?
http://www.google.com/recaptcha

Thanks,
Alex
Re: wiki slightly broken still?
On Fri, 2010-07-30 at 08:58 +0200, Francois Gouget wrote:
> I have a theory: did the script move the remaining files to another
> directory?

Yes, it did.

--
Dimi Paun
Lattica, Inc.
Re: wiki slightly broken still?
On Thu, 29 Jul 2010, Dimi Paun wrote:
> On Thu, 2010-07-29 at 18:17 +0200, Michael Stefaniuc wrote:
> > Yes, the LocalBadContent page got pretty long; I'm fairly sure it's
> > the spam checking that takes so long.
>
> I tried to empty it, and it does seem to help. However, it's not
> the only cause of the problem, it's still not fast even with an
> empty LocalBadContent.

I have a theory: did the script move the remaining files to another
directory? If not, it may be that there's a fragmentation problem at
the directory level; i.e. the directory structure was grown to
accommodate 32k entries, now there are only 5k entries, but they are
spread over the old 32k entries, leading to inefficient lookups.

If so, something like this should fix it:

    mkdir newdir
    mv olddir/* newdir    # hope there's no dot file
    rmdir olddir
    mv newdir olddir

--
Francois Gouget http://fgouget.free.fr/
A polar bear is a cartesian bear after a coordinate transform.
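[Editor's sketch] The directory rebuild suggested above could also be written in Python, which sidesteps the dot-file caveat because os.listdir returns hidden entries too. This is only a sketch under assumptions: the function name rebuild_directory is made up for illustration, the wiki should be stopped while it runs, and only the permission bits are carried over (ownership and timestamps of the directory itself are not).

```python
import os
import tempfile


def rebuild_directory(old_dir):
    """Move every entry of old_dir into a fresh directory, then swap the names."""
    old_dir = os.path.abspath(old_dir)
    parent = os.path.dirname(old_dir)
    # Create the replacement next to the original so all renames stay on one filesystem.
    new_dir = tempfile.mkdtemp(prefix="rebuild-", dir=parent)
    # mkdtemp creates a private directory; copy the original permission bits.
    os.chmod(new_dir, os.stat(old_dir).st_mode & 0o7777)
    # os.listdir includes dot files, unlike the shell glob olddir/*.
    for name in os.listdir(old_dir):
        os.rename(os.path.join(old_dir, name), os.path.join(new_dir, name))
    os.rmdir(old_dir)            # raises if anything was left behind
    os.rename(new_dir, old_dir)  # the fresh directory takes over the old name
    return old_dir
```

Whether a freshly created directory actually speeds up lookups depends on the filesystem; the sketch only reproduces the four shell steps.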
Re: wiki slightly broken still?
> "Octavian" == Octavian Voicu writes: >> "What is the name for a billion bytes?" Terabyte, at least in germany. Billion -> 10^12. 10^9 -> "Milliarde" So these questions can be tricky... -- Uwe Bonnesb...@elektron.ikp.physik.tu-darmstadt.de Institut fuer Kernphysik Schlossgartenstrasse 9 64289 Darmstadt - Tel. 06151 162516 Fax. 06151 164321 --
Re: wiki slightly broken still?
On Thu, Jul 29, 2010 at 8:41 PM, Dan Kegel wrote:
> I like the idea. It is hard, but here are some possible questions:
>
> "What is the first name of the Finn who created the Linux operating system?"
> "What is the abbreviation for the GNU C Compiler?"
> "What is the name of the simple text editor that comes with Windows?"
> "Complete the phrase: screen of death"
> "What is the name for a billion bytes?"
>
> I have no idea if those will faze spammers.

Those questions will certainly keep away non-geek spammers. We could
make a quiz-like anti-spam system with not-so-trivial questions :)

Octavian
Re: wiki slightly broken still?
On Thu, Jul 29, 2010 at 9:01 AM, Dimi Paun wrote:
>> Should we also add another hurdle (possibly even manual approval)
>> to make it harder for spammers to get accounts?
>
> Now that this issue is fixed, we can look again at the spam
> problem. It was suggested that we use a 'TextChas' for non
> logged in users:
> http://moinmo.in/HelpOnSpam
>
> But it seems it's not too easy to come up with decent questions.
> Should we try it?

I like the idea. It is hard, but here are some possible questions:

"What is the first name of the Finn who created the Linux operating system?"
"What is the abbreviation for the GNU C Compiler?"
"What is the name of the simple text editor that comes with Windows?"
"Complete the phrase: screen of death"
"What is the name for a billion bytes?"

I have no idea if those will faze spammers.
- Dan
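[Editor's sketch] A question-and-answer check of the kind proposed above can be modelled as a table that maps each question to a regular expression for accepted answers. The sketch below is plain Python and not MoinMoin's actual configuration; the names QUESTIONS, check_answer and pick_question are hypothetical, and the answer patterns are the editor's assumptions about what the questions expect.

```python
import random
import re

# Hypothetical table: question text -> regex of accepted answers (assumed answers).
QUESTIONS = {
    "What is the first name of the Finn who created the Linux operating system?": r"linus",
    "What is the abbreviation for the GNU C Compiler?": r"gcc",
    "What is the name of the simple text editor that comes with Windows?": r"notepad",
    "Complete the phrase: screen of death": r"blue",
    # Assumes the English short scale (billion = 10^9).
    "What is the name for a billion bytes?": r"giga\s*byte|gb",
}


def check_answer(question, answer):
    """Return True if the whole answer matches the pattern, ignoring case."""
    pattern = QUESTIONS[question]
    return re.fullmatch(pattern, answer.strip(), re.IGNORECASE) is not None


def pick_question(rng=random):
    """Choose one question at random to show to an anonymous editor."""
    return rng.choice(sorted(QUESTIONS))
```

Matching with a regular expression rather than an exact string lets one question tolerate spelling variants such as "giga byte", at the price of having to think about which variants to accept.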
Re: wiki slightly broken still?
On Thu, 2010-07-29 at 18:17 +0200, Michael Stefaniuc wrote:
> Yes, the LocalBadContent page got pretty long; I'm fairly sure it's
> the spam checking that takes so long.

I tried to empty it, and it does seem to help. However, it's not
the only cause of the problem, it's still not fast even with an
empty LocalBadContent.

--
Dimi Paun
Lattica, Inc.
Re: wiki slightly broken still?
Dimi Paun wrote:
> On Wed, 2010-07-28 at 22:35 +0100, David Gerard wrote:
>> Ubuntu hit this one:
>>
>> https://bugs.edge.launchpad.net/ubuntu/+source/moin/+bug/217191
>> http://moinmo.in/MoinMoinBugs/AllPagesSavedToSingleDirectory
>
> Thanks David for the links.
>
> I've run the cleanup scripts, and we are now down to ~5K pages,
> down from 32K. So there is still plenty of room to grow for the
> time being.
>
> If we hit the limit again, please let me know and I'll clean it
> up right away, now I know what I need to do :)
>
> P.S. There is still something wrong with the Wiki, saving pages
> takes a really long time with no reason whatsoever (no load on the
> box, etc). I think we're hitting an inefficiency in Moin, as the
> httpd process shoots up to 95% CPU usage for a few good seconds.
> I've trimmed the edit-log and the event-log files, which were very
> big, but that doesn't seem to help. Any other ideas?

Yes, the LocalBadContent page got pretty long; I'm fairly sure it's
the spam checking that takes so long.

bye
michael
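[Editor's sketch] To illustrate why a long blocklist can make saves slow: if every save tests the page text against each pattern in turn, the cost grows with the number of patterns times the page size. The code below is a generic illustration and not Moin's implementation; the one-pattern-per-line layout with "#" comments is an assumption, and compile_patterns and first_spam_match are hypothetical names.

```python
import re


def compile_patterns(lines):
    """Compile one regex per non-empty, non-comment line; skip invalid ones."""
    compiled = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            compiled.append(re.compile(line, re.IGNORECASE))
        except re.error:
            continue  # a broken pattern should not block every save
    return compiled


def first_spam_match(text, compiled):
    """Return the first blocklisted fragment found in text, or None."""
    for pattern in compiled:
        match = pattern.search(text)
        if match:
            return match.group(0)
    return None
```

Compiling the list once and reusing it across saves avoids repeating the compile step, but the per-save scan still scales with the length of the list, which fits the observation that emptying the page helped.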
Re: wiki slightly broken still?
On Wed, 2010-07-28 at 15:06 -0700, Dan Kegel wrote:
> > I'm looking into how we can clean this up.
>
> Should we also add another hurdle (possibly even manual approval)
> to make it harder for spammers to get accounts?

Now that this issue is fixed, we can look again at the spam
problem. It was suggested that we use a 'TextChas' for non
logged in users:
http://moinmo.in/HelpOnSpam

But it seems it's not too easy to come up with decent questions.
Should we try it?

--
Dimi Paun
Lattica, Inc.
Re: wiki slightly broken still?
On Wed, 2010-07-28 at 22:35 +0100, David Gerard wrote:
> Ubuntu hit this one:
>
> https://bugs.edge.launchpad.net/ubuntu/+source/moin/+bug/217191
> http://moinmo.in/MoinMoinBugs/AllPagesSavedToSingleDirectory

Thanks David for the links.

I've run the cleanup scripts, and we are now down to ~5K pages,
down from 32K. So there is still plenty of room to grow for the
time being.

If we hit the limit again, please let me know and I'll clean it
up right away, now I know what I need to do :)

P.S. There is still something wrong with the Wiki, saving pages
takes a really long time with no reason whatsoever (no load on the
box, etc). I think we're hitting an inefficiency in Moin, as the
httpd process shoots up to 95% CPU usage for a few good seconds.
I've trimmed the edit-log and the event-log files, which were very
big, but that doesn't seem to help. Any other ideas?

--
Dimi Paun
Lattica, Inc.
Re: wiki slightly broken still?
On Wed, Jul 28, 2010 at 23:35, David Gerard wrote:
> On 28 July 2010 21:49, Dimi Paun wrote:
>> On Wed, 2010-07-28 at 13:05 -0700, Dan Kegel wrote:
>
>>> Creating new wiki pages seems broken today...
>
>> Yes, due to all the spam, we've hit the ext3 limit
>> of subdirectories (32k). More here:
>> http://www.rooftopsolutions.nl/blog/135
>> I'm looking into how we can clean this up.
>
> Ubuntu hit this one:
>
> https://bugs.edge.launchpad.net/ubuntu/+source/moin/+bug/217191
> http://moinmo.in/MoinMoinBugs/AllPagesSavedToSingleDirectory
>
> The other solution is permanent deletion of the spam pages from the
> actual file system. I've done such pruning before, and it needs
> (obviously) to be done with *remarkable* care. It's also very fiddly.
> I eventually cobbled together scripts to do the deletion for me. (At
> an old workplace, I don't have them to hand.) The MoinMoin page above
> lists maintenance scripts that can do it for you.
>
> They also suggest moving the wiki directories to a filesystem that can
> allow stupid amounts of directories, like XFS. (Even ext4 only scales
> to 64,000 directories.)

https://ext4.wiki.kernel.org/index.php/Ext4_Howto#Sub_directory_scalability
seems to indicate there is no such limit. Maybe this was the case a
couple of years ago.
Additionally, migrating from ext3 to ext4 should give the least
headaches (maybe a kernel recompile, YMMV)

> MoinMoin 2.0 will apparently use a database instead of flat files.
> ETA: some time or other in the far future. "we can't tell exactly when
> the new storage stuff will be production ready, but I expect end 2008
> .. mid 2009." Ahem.
>
> Oh, and moinmo.in regards this as not being a "bug", but the result of
> bad file system design. (And not, e.g., a wiki that doesn't scale.)
>
> - d.
Re: wiki slightly broken still?
On Wed, Jul 28, 2010 at 1:49 PM, Dimi Paun wrote:
> Yes, due to all the spam, we've hit the ext3 limit
> of subdirectories (32k). More here:
> http://www.rooftopsolutions.nl/blog/135
>
> I'm looking into how we can clean this up.

Should we also add another hurdle (possibly even manual approval)
to make it harder for spammers to get accounts?
Re: wiki slightly broken still?
On 28 July 2010 21:49, Dimi Paun wrote:
> On Wed, 2010-07-28 at 13:05 -0700, Dan Kegel wrote:

>> Creating new wiki pages seems broken today...

> Yes, due to all the spam, we've hit the ext3 limit
> of subdirectories (32k). More here:
> http://www.rooftopsolutions.nl/blog/135
> I'm looking into how we can clean this up.

Ubuntu hit this one:

https://bugs.edge.launchpad.net/ubuntu/+source/moin/+bug/217191
http://moinmo.in/MoinMoinBugs/AllPagesSavedToSingleDirectory

The other solution is permanent deletion of the spam pages from the
actual file system. I've done such pruning before, and it needs
(obviously) to be done with *remarkable* care. It's also very fiddly.
I eventually cobbled together scripts to do the deletion for me. (At
an old workplace, I don't have them to hand.) The MoinMoin page above
lists maintenance scripts that can do it for you.

They also suggest moving the wiki directories to a filesystem that can
allow stupid amounts of directories, like XFS. (Even ext4 only scales
to 64,000 directories.)

MoinMoin 2.0 will apparently use a database instead of flat files.
ETA: some time or other in the far future. "we can't tell exactly when
the new storage stuff will be production ready, but I expect end 2008
.. mid 2009." Ahem.

Oh, and moinmo.in regards this as not being a "bug", but the result of
bad file system design. (And not, e.g., a wiki that doesn't scale.)

- d.
Re: wiki slightly broken still?
On Wed, 2010-07-28 at 13:05 -0700, Dan Kegel wrote:
> Creating new wiki pages seems broken today...

Yes, due to all the spam, we've hit the ext3 limit
of subdirectories (32k). More here:
http://www.rooftopsolutions.nl/blog/135

I'm looking into how we can clean this up.

--
Dimi Paun
Lattica, Inc.
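[Editor's sketch] Since each wiki page lives in its own subdirectory, a small check could warn before the limit is hit again. This is a sketch under assumptions: the 32000 figure is the editor's reading of the "32k" ext3 limit mentioned above, and count_subdirectories and headroom are hypothetical helper names.

```python
import os

# Assumption: roughly the ext3 per-directory subdirectory limit ("32k" in this thread).
EXT3_SUBDIR_LIMIT = 32000


def count_subdirectories(path):
    """Count the immediate subdirectories of path, ignoring files and symlinks."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def headroom(path, limit=EXT3_SUBDIR_LIMIT):
    """Return how many more subdirectories fit under path before reaching limit."""
    return limit - count_subdirectories(path)
```

Run from cron against the wiki's pages directory, a low headroom value would be the cue to run the cleanup scripts before page creation breaks.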
wiki slightly broken still?
Creating new wiki pages seems broken today...