In most but not all cases, undiscussed imports get reverted, and when they
later get the go-ahead they would still be marked as spam. That is a very
bad way to train the dataset compared with ground-truthed spam
identification.

On Mar 5, 2018 9:50 AM, "Michał Brzozowski" <www.ha...@gmail.com> wrote:

Could we use something similar to detect generic vandalism by training on
reverted changesets? Many of them have "this changeset was reverted fully
or in part..." comments. Also, analyzing object history or detecting
created_by=reverter;JOSM * would give you more examples to train on.

* Unfortunately this persists for the whole JOSM session, so there will be
some false positives.

Michał
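
A minimal sketch of how one might harvest such training examples, assuming
the standard OSM API 0.6 changeset endpoint; the exact created_by value
written by the JOSM reverter plugin is taken from the pattern above and may
differ in practice:

    import requests
    import xml.etree.ElementTree as ET

    CHANGESET_URL = "https://api.openstreetmap.org/api/0.6/changeset/{}"

    def looks_like_revert(changeset_id):
        # Fetch the changeset metadata (XML) and collect its tags.
        xml = requests.get(CHANGESET_URL.format(changeset_id),
                           timeout=30).text
        tags = {t.get("k"): t.get("v", "")
                for t in ET.fromstring(xml).iter("tag")}
        # Heuristic 1: revert tools usually say so in the comment,
        # e.g. "this changeset was reverted fully or in part...".
        if "revert" in tags.get("comment", "").lower():
            return True
        # Heuristic 2: the reverter plugin stamps created_by, but the
        # value persists for the whole JOSM session, so expect some
        # false positives (prefix per the pattern quoted above).
        if tags.get("created_by", "").startswith("reverter;JOSM"):
            return True
        return False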

On 5 Mar 2018 15:09, "Jason Remillard" <remillard.ja...@gmail.com> wrote:

> Hi,
>
> This weekend I put together a SPAM detector for OSM changesets.
>
> https://github.com/jremillard/osm-changeset-classification
>
> You don't need to be a developer to contribute: send over any spammy
> changesets you come across via a GitHub issue, a pull request, or even an
> email to me. I just need the changeset ID.
>
> The code currently achieves 99+% accuracy distinguishing 1500 random
> normal edits from 1500 sketchy changesets that Fredrick shared with the
> talk-us list last week. This is with zero tuning, so it looks like it
> will work well.
>
> Jason
>
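For anyone curious what a baseline along these lines could look like, here
is a minimal sketch assuming a bag-of-words model over changeset comments
with scikit-learn; the actual repository may use a different representation
and model, and the comments/labels inputs are hypothetical stand-ins for
the 1500 + 1500 changesets mentioned above:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    def train_spam_classifier(comments, labels):
        # comments: changeset comment strings; labels: 1 = spam, 0 = normal.
        train_x, test_x, train_y, test_y = train_test_split(
            comments, labels, test_size=0.2, random_state=0)
        # TF-IDF over word unigrams/bigrams feeding logistic regression,
        # a common first baseline for short-text classification.
        model = make_pipeline(
            TfidfVectorizer(ngram_range=(1, 2), min_df=2),
            LogisticRegression(max_iter=1000))
        model.fit(train_x, train_y)
        # Accuracy on the held-out 20% split.
        print("held-out accuracy:", model.score(test_x, test_y))
        return model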
_______________________________________________
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk