Re: [OSM-dev] Chinese spam diaries, an analysis
On 03.12.14 17:14, Andy Allan wrote:
> Thanks for the analysis, I hope it provides developers with ideas for combatting it via the automated spam filters that we already have[1].

I'd suggest extending/refining the automated filter somewhat. Say:

* a novice is not allowed to post at all
* a novice who did some changesets is allowed to post, say, once per day
* an intermediate is allowed to post, say, once per hour
* for an expert (either subscribed for years or lots of changesets) the posting limit is waived

One could even think of allowing experts to delete other users' posts (because of spam). Of course a log has to be maintained. And so no special people (moderators etc.) are needed!

And of course the parameters need to be optimized:

* how long is a user a novice?
* is 10 changesets enough to allow him/her to post?
* when does the intermediate level start? 2 years? 100 changesets?
* what are the achievements to reach the expert level? 4 years? 1000 changesets?

Those parameters could be tweaked on the fly, I'd say.

/al
___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev
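The tiered limits proposed in this message could be sketched roughly as follows. This is only an illustration in Python, not code from the OSM website: the level names follow the message, but `Account`, `user_level`, `may_post` and every threshold value are assumptions (the message itself leaves the thresholds as open questions), as is the choice to let either account age or changeset count qualify a user for a level.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

# Hypothetical thresholds: the message above only poses these as open questions.
ACTIVE_NOVICE_MIN_CHANGESETS = 10
INTERMEDIATE_MIN_CHANGESETS = 100
INTERMEDIATE_MIN_YEARS = 2
EXPERT_MIN_CHANGESETS = 1000
EXPERT_MIN_YEARS = 4


@dataclass
class Account:
    changesets: int
    years_registered: float


def user_level(account: Account) -> str:
    """Classify an account; age or changeset count alone is enough (assumed)."""
    if (account.changesets >= EXPERT_MIN_CHANGESETS
            or account.years_registered >= EXPERT_MIN_YEARS):
        return "expert"
    if (account.changesets >= INTERMEDIATE_MIN_CHANGESETS
            or account.years_registered >= INTERMEDIATE_MIN_YEARS):
        return "intermediate"
    if account.changesets >= ACTIVE_NOVICE_MIN_CHANGESETS:
        return "active_novice"
    return "novice"


# Minimum time between two posts for each level that may post at all.
MIN_INTERVAL = {
    "active_novice": timedelta(days=1),
    "intermediate": timedelta(hours=1),
    "expert": timedelta(0),  # posting limit waived
}


def may_post(account: Account, since_last_post: Optional[timedelta]) -> bool:
    """Return True if the account may post now.

    since_last_post is None when the account has never posted.
    """
    level = user_level(account)
    if level == "novice":
        return False  # novices without enough changesets cannot post at all
    if since_last_post is None:
        return True
    return since_last_post >= MIN_INTERVAL[level]
```

Keeping the thresholds as module-level constants reflects the remark that the parameters "could be tweaked on the fly"; in a real deployment they would more likely live in configuration.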
Re: [OSM-dev] Chinese spam diaries, an analysis
On 04/12/14 11:17, Andreas Labres wrote:
> On 03.12.14 17:14, Andy Allan wrote:
>> Thanks for the analysis, I hope it provides developers with ideas for combatting it via the automated spam filters that we already have[1].
>
> I'd suggest extending/refining the automated filter somewhat. Say:
>
> * a novice is not allowed to post at all
> * a novice who did some changesets is allowed to post, say, once per day
> * an intermediate is allowed to post, say, once per hour
> * for an expert (either subscribed for years or lots of changesets) the posting limit is waived

So in other words, most of the things we already factor in to our spam scoring... We're just not quite as rigid.

In particular you can still post (within reason) without having made any edits - it is actually surprisingly common for non-spammers to do that.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://compton.nu/
Re: [OSM-dev] Chinese spam diaries, an analysis
On 03/12/14 16:14, Andy Allan wrote:
> However, spam is an arms race, and I think we might need a different long-term approach. I know in the past using 3rd-party spam filtering services was too expensive (and not really very OSM-ish either).

The main such system is Akismet and I'd love to use it but (a) it costs money and (b) to make it most effective we would have to send it things like email addresses and IP addresses, which I figure people may object to.

> Perhaps we need a new set of human content moderators on the site, say 40-80 people with a variety of languages between them. We can consider grey-listing all accounts - i.e. the first few posts of every account are held for review automatically by default, and direct posting is enabled after we're more certain they aren't a spammer.

Once we have a review queue and moderator system then obviously it becomes trivial to do things like holding posts from new users for moderation - we need the basic infrastructure first though.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://compton.nu/
Re: [OSM-dev] Chinese spam diaries, an analysis
On 04.12.14 12:33, Tom Hughes wrote:
> So in other words, most of the things we already factor in to our spam scoring... We're just not quite as rigid.

A (hidden) spam score is bad (IMO). Nobody sees it, almost nobody can test it. A documented user level with documented rules would make much more sense and (IMO) would be much more likely to be accepted.

> In particular you can still post (within reason) without having made any edits - it is actually surprisingly common for non-spammers to do that.

OSM is not a blog site. OSM is about making the data better. Once you have figured out a little bit of how OSM works, you could blog about it. IMO.

/al
Re: [OSM-dev] Chinese spam diaries, an analysis
On 04/12/14 12:06, Andreas Labres wrote:
> On 04.12.14 12:33, Tom Hughes wrote:
>> So in other words, most of the things we already factor in to our spam scoring... We're just not quite as rigid.
>
> A (hidden) spam score is bad (IMO). Nobody sees it, almost nobody can test it.

Nothing is hidden:

https://github.com/openstreetmap/openstreetmap-website/blob/master/app/models/user.rb#L210

Tom

-- 
Tom Hughes (t...@compton.nu)
http://compton.nu/
[OSM-dev] Chinese spam diaries, an analysis
A spammer is periodically posting messages in Chinese to the User Diaries. These diaries follow a distinct pattern:

1. Judging by machine translations, the messages advertise a variety of products and services that are against the law. This may be to attract people who would be reluctant to contact the authorities and admit what they are looking for.
2. Diaries are posted in batches of considerable size (up to 20+), typically differing only in the names of cities and provinces in the text. This appears to be targeted at searches through search engines.
3. Diaries rarely contain links (with occasional exceptions), so they cannot be aimed at boosting search engine rankings for pages hosted away from OSM.
4. Numbers preceded by the letters QQ appear regularly; these may be accounts with the Tencent QQ messaging service.
5. The spammer has come back repeatedly, creating new accounts, so it is likely that the operation is successful.

I have not followed any message account, keyword or link in any of the spams. Among other issues I am wary about possible malware in scam pages.

--
Andrew
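Two of the patterns listed above (QQ numbers, and batches that differ only in place names) lend themselves to simple heuristics. The following Python sketch is purely illustrative and is not taken from the OSM spam scoring code: the function names, the assumed 5-12 digit length of a QQ number and the 0.8 similarity threshold are all assumptions, and `difflib.SequenceMatcher` is just one convenient standard-library way to measure text similarity.

```python
import re
from difflib import SequenceMatcher
from typing import List

# Assumption: a QQ contact appears as "QQ", an optional colon, then 5-12 digits.
QQ_PATTERN = re.compile(r"QQ\s*[:：]?\s*\d{5,12}", re.IGNORECASE)


def mentions_qq_number(text: str) -> bool:
    """Return True if the text contains something that looks like a QQ number."""
    return QQ_PATTERN.search(text) is not None


def near_duplicates(texts: List[str], threshold: float = 0.8) -> int:
    """Count entries that are nearly identical to an earlier entry in the batch.

    The threshold is a hypothetical value; entries that differ only in a
    city or province name should score well above it.
    """
    count = 0
    for i, text in enumerate(texts):
        if any(SequenceMatcher(None, text, earlier).ratio() >= threshold
               for earlier in texts[:i]):
            count += 1
    return count
```

The pairwise comparison is quadratic in the batch size, which is tolerable for batches of around 20 entries but would need a cheaper fingerprint for anything larger.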
Re: [OSM-dev] Chinese spam diaries, an analysis
On 3 December 2014 at 15:46, Andrew Hain andrewhain...@hotmail.co.uk wrote:
> A spammer is periodically posting messages in Chinese to the User Diaries.

Thanks for the analysis, I hope it provides developers with ideas for combatting it via the automated spam filters that we already have[1].

However, spam is an arms race, and I think we might need a different long-term approach. I know in the past using 3rd-party spam filtering services was too expensive (and not really very OSM-ish either). Perhaps we need a new set of human content moderators on the site, say 40-80 people with a variety of languages between them. We can consider grey-listing all accounts - i.e. the first few posts of every account are held for review automatically by default, and direct posting is enabled after we're more certain they aren't a spammer.

Of course, this would all need coding, but I'm interested in other people's ideas. The current situation where our spam filters can be overwhelmed, and all the removal of spam depends on full-blown system-administrator[2] accounts, isn't perfect!

Thanks,
Andy

[1] https://github.com/openstreetmap/openstreetmap-website/blob/master/app/models/user.rb#L211
[2] https://github.com/openstreetmap/openstreetmap-website/blob/master/app/controllers/diary_entry_controller.rb#L10
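The grey-listing idea in this message (hold the first few posts of every account for review, then allow direct posting) might look something like the Python sketch below. It is an illustration under stated assumptions, not an existing OSM feature: `Author`, `Post`, `ReviewQueue` and the value of `TRUSTED_AFTER_APPROVED_POSTS` are all hypothetical, since the message does not say how many posts "the first few" is.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical: approved posts an account needs before it may post directly.
TRUSTED_AFTER_APPROVED_POSTS = 3


@dataclass
class Author:
    approved_posts: int = 0


@dataclass(eq=False)  # compare posts by identity, not by content
class Post:
    author: Author
    text: str
    status: str = "held_for_review"


class ReviewQueue:
    """Holds posts from not-yet-trusted accounts until a moderator decides."""

    def __init__(self) -> None:
        self.pending: List[Post] = []

    def submit(self, author: Author, text: str) -> Post:
        post = Post(author, text)
        if author.approved_posts >= TRUSTED_AFTER_APPROVED_POSTS:
            post.status = "published"  # trusted account: publish directly
        else:
            self.pending.append(post)  # grey-listed: wait for review
        return post

    def approve(self, post: Post) -> None:
        self.pending.remove(post)
        post.status = "published"
        post.author.approved_posts += 1

    def reject(self, post: Post) -> None:
        self.pending.remove(post)
        post.status = "rejected"
```

Because held posts never reach the `published` state until approved, they would also stay out of feeds and relays, which addresses the prevention concern rather than only the clean-up.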
Re: [OSM-dev] Chinese spam diaries, an analysis
2014-12-03 17:14 GMT+01:00 Andy Allan gravityst...@gmail.com:
> Thanks for the analysis, I hope it provides developers with ideas for combatting it via the automated spam filters that we already have[1].
>
> However, spam is an arms race, and I think we might need a different long-term approach. I know in the past using 3rd-party spam filtering services was too expensive (and not really very OSM-ish either). Perhaps we need a new set of human content moderators on the site, say 40-80 people with a variety of languages between them. We can consider grey-listing all accounts - i.e. the first few posts of every account are held for review automatically by default, and direct posting is enabled after we're more certain they aren't a spammer.

Maybe we could have a crowd-sourced approach and introduce a spam flag that logged-in users could set, i.e. another button in the "comment, reply" line which says something like "flag as spam", with a counter, and if more than x people have clicked on it we would automatically or manually hide/delete the post. This should work similarly to our stackexchange-like help system (you can flag or unflag with the same button).

Cheers,
Martin
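A minimal Python sketch of the flag counter described in this message, assuming each logged-in user is identified by an integer id. The class name, `HIDE_THRESHOLD` (the "x" in the message, whose value is not given) and the choice to hide automatically rather than manually are assumptions made for illustration only.

```python
from typing import Set

# Hypothetical value for "x": the post is hidden once MORE than x users flag it.
HIDE_THRESHOLD = 3


class FlaggablePost:
    def __init__(self, text: str) -> None:
        self.text = text
        self.flagged_by: Set[int] = set()  # ids of users who currently flag it

    def toggle_flag(self, user_id: int) -> None:
        """Flag or unflag with the same button; each user counts at most once."""
        if user_id in self.flagged_by:
            self.flagged_by.remove(user_id)
        else:
            self.flagged_by.add(user_id)

    @property
    def flag_count(self) -> int:
        return len(self.flagged_by)

    @property
    def hidden(self) -> bool:
        return self.flag_count > HIDE_THRESHOLD
```

Storing the set of flagging users rather than a bare counter is what makes the flag/unflag toggle possible and stops one account from flagging the same post repeatedly.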
Re: [OSM-dev] Chinese spam diaries, an analysis
On 3 December 2014 at 16:25, Martin Koppenhoefer dieterdre...@gmail.com wrote:
> maybe we could have a crowd-sourced approach and introduce a spam-flag that logged-in users could set, i.e. another button in the comment, reply line which says something like flag as spam, with a counter, and if more than x people have clicked on it we would automatically or manually hide/delete the post.

Good idea. I'd also like to do something to prevent (as well as react to) spam, since reaction-only processes still fill RSS/Atom feeds and relayers like https://twitter.com/osmblogs with spam posts.

Cheers,
Andy
Re: [OSM-dev] Chinese spam diaries, an analysis
I think the solution to this is actually pretty simple and straightforward.

First, right now there's only a single person who can remove spam from diary entries or profiles. Allowing other people (such as existing site moderators) to address this would go a long way.

Second of all, we need a flagging mechanism. I know that Tom wants a complete solution that includes a work queue, etc. I think that's a very laudable goal, but something that just sends an email would be great for now.

Third, I think we can get our queue, etc. through either funding, or else through a GSoC project next year, as we did with Changeset Discussions. I volunteer to mentor for it.

- Serge

On Wed, Dec 3, 2014 at 11:25 AM, Martin Koppenhoefer dieterdre...@gmail.com wrote:
> 2014-12-03 17:14 GMT+01:00 Andy Allan gravityst...@gmail.com:
>> Thanks for the analysis, I hope it provides developers with ideas for combatting it via the automated spam filters that we already have[1].
>>
>> However, spam is an arms race, and I think we might need a different long-term approach. I know in the past using 3rd-party spam filtering services was too expensive (and not really very OSM-ish either). Perhaps we need a new set of human content moderators on the site, say 40-80 people with a variety of languages between them. We can consider grey-listing all accounts - i.e. the first few posts of every account are held for review automatically by default, and direct posting is enabled after we're more certain they aren't a spammer.
>
> maybe we could have a crowd-sourced approach and introduce a spam-flag that logged-in users could set, i.e. another button in the comment, reply line which says something like flag as spam, with a counter, and if more than x people have clicked on it we would automatically or manually hide/delete the post. This should work similarly to our stackexchange-like help system (you can flag or unflag with the same button).
>
> Cheers,
> Martin
Re: [OSM-dev] Chinese spam diaries, an analysis
On 03/12/14 16:25, Martin Koppenhoefer wrote:
> maybe we could have a crowd-sourced approach and introduce a spam-flag that logged-in users could set, i.e. another button in the comment, reply line which says something like flag as spam, with a counter, and if more than x people have clicked on it we would automatically or manually hide/delete the post. This should work similarly to our stackexchange-like help system (you can flag or unflag with the same button).

Because nobody has ever thought of that before, or maybe discussed how it might work, or... Oh, but they have:

https://github.com/openstreetmap/openstreetmap-website/issues/841

You might also notice that there hasn't actually been any such spam for nearly a fortnight now. Maybe the administrators noticed there was a problem and made a change to combat it?

Tom

-- 
Tom Hughes (t...@compton.nu)
http://compton.nu/
Re: [OSM-dev] Chinese spam diaries, an analysis
On 3 December 2014 at 16:33, Serge Wroclawski emac...@gmail.com wrote:
> First, right now there's only a single person who can remove spam from diary entries or profiles.

Not strictly true - any user with site administrator privileges can remove spam - see my previous link to the code. There are multiple people who have those privileges. Of course, in reality it's mainly one person (Tom) who does the work.

> Second of all, we need a flagging mechanism. I know that Tom wants a complete solution that includes a work queue, etc. I think that's a very laudable goal, but something that just sends an email would be great for now.

The flagging mechanism is still reactive - I'd like to look at ideas for blocking spam before it hits the site.

Cheers,
Andy
Re: [OSM-dev] Chinese spam diaries, an analysis
There’s another suspicious post at http://www.openstreetmap.org/user/Medyum%20Y%C4%B1lmaz%20Eren%20Hoca/diary/28134, which is in Turkish.

I personally prefer an increase in human blog moderators.
Re: [OSM-dev] Chinese spam diaries, an analysis
On Wed, Dec 3, 2014 at 2:17 PM, Antje 2...@minoa.li wrote:
> There’s another suspicious post at http://www.openstreetmap.org/user/Medyum%20Y%C4%B1lmaz%20Eren%20Hoca/diary/28134, which is in Turkish.
>
> I personally prefer an increase in human blog moderators.

I agree. It's usually pretty obvious (even without knowing the language) when a diary post is spam. I'm happy to help if someone gives me the ability to do it.