Re: [OSM-dev] OSM Project Idea for GSoC 17: Vandalism Detection in Map Edits

2016-12-21 Thread Animesh Sinha
Hi,

As Ethan mentioned, binary classification would be a tough nut to crack
because a changeset can contain multiple objects. According to the paper
[2], OSMPatrol marked 44% of the edits as possible vandalism, which
suggests a lot of false positives. 50% of the users whose edits were
detected as vandalism had a reputation above 66%. Hence, some work can
also be done on the rule for assigning reputation to users. Since I have
worked on text processing in the past, I'd like to work on detecting this
type of vandalism: *Adding fake data and tags*.

As Jason mentioned, there would be plenty of data representing good edits,
so we know that most edits are regular or correct. The distribution of
good to bad edits would therefore be skewed, which leaves a lot of scope
for modeling this problem as an Anomaly Detection task. As I mentioned
above, some edits by users with a reputation above 66% are also marked as
potential vandalism. This goes against the well-defined notion of normal
behavior for a highly reputed user, so such an edit is an anomaly in that
user's behavior.
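
To make the skew argument concrete, here is a minimal Python sketch. All
labels are synthetic: the 1% vandalism rate is an assumption for
illustration only, and the 44% flag rate merely echoes the figure quoted
above. It shows why plain accuracy says little when regular edits
dominate, and why precision and recall are the numbers to watch:

```python
# Toy illustration only: labels are synthetic. 1 = vandal edit, 0 = regular edit.

def precision_recall(y_true, y_pred):
    """Return (precision, recall) for the positive (vandal) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def accuracy(y_true, y_pred):
    """Fraction of edits whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Assumed skew: 10 vandal edits among 1000 (hypothetical numbers).
y_true = [1] * 10 + [0] * 990

# A useless classifier that calls every edit "regular".
always_regular = [0] * 1000

# An over-eager detector that flags 44% of all edits and happens to
# catch every vandal edit among them.
over_eager = [1] * 440 + [0] * 560

for name, pred in [("always_regular", always_regular), ("over_eager", over_eager)]:
    p, r = precision_recall(y_true, pred)
    print(f"{name}: accuracy={accuracy(y_true, pred):.3f} "
          f"precision={p:.3f} recall={r:.3f}")
```

In this toy setup the do-nothing classifier looks accurate but finds no
vandalism at all, while the over-eager detector finds everything but is
wrong about most of what it flags.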

I plan to use *One Class SVM* (the kernel can be chosen using cross
validation) and *Isolation Forest* to detect these kinds of possible
outliers. The anomaly detection systems implemented in the past usually
detect outliers against the dataset as a whole rather than focusing on a
particular stakeholder (the editor in this case). Hence, I would like to
explore whether such a per-editor system would increase Precision and
Recall simultaneously.
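
As a rough sketch of the per-editor idea, the Python below compares a
global outlier check with one computed against each editor's own history.
To keep it dependency-free it uses a simple median/MAD robust z-score as a
stand-in, not One Class SVM or Isolation Forest. The editor names, the
"tags changed per changeset" feature, the numbers, and the threshold are
all hypothetical:

```python
from statistics import median

def robust_z(value, history):
    """Distance of value from the median of history, in scaled MAD units."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        return 0.0 if value == med else float("inf")
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    return abs(value - med) / (1.4826 * mad)

# Hypothetical feature: number of tags changed in each past changeset.
history = {
    "alice": [2, 3, 2, 4, 3, 2, 3],        # usually makes small edits
    "bob": [40, 55, 48, 60, 52, 45, 50],   # routinely makes bulk edits
}

THRESHOLD = 3.5  # assumed cut-off, would need tuning on real data

def is_outlier_for_editor(editor, value):
    """Compare the new edit only with that editor's own history."""
    return robust_z(value, history[editor]) > THRESHOLD

def is_outlier_globally(value):
    """Compare the new edit with everyone's history pooled together."""
    pooled = [v for values in history.values() for v in values]
    return robust_z(value, pooled) > THRESHOLD

new_edit_size = 45
print("global:", is_outlier_globally(new_edit_size))
for editor in history:
    print(editor, is_outlier_for_editor(editor, new_edit_size))
```

With these made-up numbers, a 45-tag changeset is unremarkable in the
pooled data and for bob, but stands out against alice's history. That is
the kind of case a whole-dataset detector would miss.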

Thanks,
Animesh Sinha

On Tue, Dec 20, 2016 at 9:27 AM, Jason Remillard 
wrote:

> Hi,
>
> There is obviously plenty of data that represents "good" changes. The
> data working group reversions could be used to train a classifier on
> what a bad edit looks like. After that, looking for change sets that
> are logically erased after a short period of time (say 2 weeks) might
> also yield some bad change sets.
>
> Jason
>
> On Sun, Dec 18, 2016 at 6:38 PM, Animesh Sinha
>  wrote:
> > Hi,
> >
> > I am a first-year master's student at Purdue University and would like to
> > propose a project idea for GSoC 2017. I have worked on Vandalism Detection
> > in Wikipedia in the past and understand how important it is to predict
> > whether a piece of information is correct, as incorrect information may
> > mislead others.
> >
> > Hence, I would like to propose this project idea:
> >
> > Title: Detect if a user edit made in OSM is a vandal edit or regular.
> > Summary: It is very challenging to manually monitor malicious edits or
> > spam for a large active user base. I plan to identify cases of
> > vandalism on OSM by classifying edits as either regular or vandal. This
> > is clearly a Binary Classification task, but if the distribution of
> > regular and vandalism cases in the dataset is skewed, it can also be
> > explored as an Anomaly Detection problem.
> > Requirements: Lots of data about the edits made, information about the
> > users making the edits, information about the people annotating the true
> > labels, etc.
> >
> > I would appreciate it if someone could provide feedback on the project
> > idea and the requirements needed.
> >
> > Thanks,
> > Animesh Sinha
> >
> > ___
> > dev mailing list
> > dev@openstreetmap.org
> > https://lists.openstreetmap.org/listinfo/dev
> >
>


[OSM-dev] OpenStreetMap Carto release v3.0.0

2016-12-21 Thread Paul Norman

Dear all,

Today, v3.0.0 of the openstreetmap-carto stylesheet (the default
stylesheet on openstreetmap.org) has been released.

Major changes include

- Mapnik 3 is now required
- CartoCSS 0.16.x is now required
- Official TileMill support is dropped
- Shapefiles are downloaded with a new Python script

Changes include

- Noto Naskh is now used for Arabic
- Visual impact of campsites and quarries reduced below z13
- Wilderness huts rendered
- Subway entrances rendered

Thanks to all the contributors for this release including jojo4u, a new 
contributor.


For a full list of commits, see
https://github.com/gravitystorm/openstreetmap-carto/compare/v2.45.1...v3.0.0

As always, we welcome any bug reports at
https://github.com/gravitystorm/openstreetmap-carto/issues

See also http://www.openstreetmap.org/user/pnorman/diary/40114 for the
diary version


