[OSM-dev] OSM Level of Detail for GSoC 2017

2016-12-20 Thread Balázs Dukai
Hi Peda,

yes indeed, yours is a very pertinent question. For the problem of
differentiating between low-detail complex buildings and high-detail
simple buildings, I currently have two ideas. First, check the spatial
distribution of the level of detail. This is based on the assumption
that buildings in a spatial cluster are mapped at a similar level of
detail. I'm aware that there are several exceptions to this that might
make my assumption invalid. Second, use an authoritative data set to
compare the detail of the footprints, again assuming that the
authoritative data set is consistent in its level of detail.
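
A minimal sketch of the first idea, under stated assumptions: footprints are closed rings of planar (x, y) coordinates in metres, "level of detail" is approximated by vertices per metre of perimeter, and a building is flagged when that value differs from the median of its neighbours by more than a chosen ratio. The function names, the 50 m radius, the ratio of 2 and the use of the vertex average as a centroid are all illustrative choices, not part of any existing tool.

```python
import math
from statistics import median


def vertex_density(ring):
    """Vertices per unit of perimeter for a closed ring of (x, y) tuples.

    The first vertex is NOT repeated at the end of the ring.
    """
    n = len(ring)
    perimeter = sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))
    return n / perimeter


def vertex_centroid(ring):
    """Average of the vertices (a simplification, not the area centroid)."""
    xs, ys = zip(*ring)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def detail_outliers(footprints, radius=50.0, ratio=2.0):
    """Indices of footprints whose vertex density deviates from the
    median density of neighbours within `radius` by more than `ratio`."""
    densities = [vertex_density(r) for r in footprints]
    centres = [vertex_centroid(r) for r in footprints]
    flagged = []
    for i, c in enumerate(centres):
        neighbours = [
            densities[j]
            for j, other in enumerate(centres)
            if j != i and math.dist(c, other) <= radius
        ]
        if not neighbours:
            continue  # isolated building: nothing to compare against
        m = median(neighbours)
        if densities[i] > ratio * m or densities[i] < m / ratio:
            flagged.append(i)
    return flagged
```

A flagged building is only a candidate: as noted above, a complex building in a simple neighbourhood would be flagged just as readily as an over-detailed one.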

I admit that I was not aware of your second point, so thank you for
bringing it to my attention. It also helps me understand the
limitations of such a method for enriching the OSM database. I will still
work with OSM data in my thesis; however, my project for GSoC 2017 is off
the table.

In general, my work is focused on the statistical comparison of
footprints or 3D buildings using a machine learning approach. However,
I'm still in the middle of developing the method, and ultimately I would
like to know to what extent the automatic inference of LOD is
possible.

Thank you very much for your feedback!
Balázs

-- 
Balázs Dukai
MSc Geomatics engineer candidate | MEng Landscape architect
The Hague, Netherlands
LinkedIn: www.linkedin.com/in/balazsdukai
___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] OSM Project Idea for GSoC 17: Vandalism Detection in Map Edits

2016-12-20 Thread Jason Remillard
Hi,

There is obviously plenty of data that represents "good" changes. The
Data Working Group reversions could be used to train a classifier on
what a bad edit looks like. After that, looking for change sets that
are logically erased after a short period of time (say 2 weeks) might
also yield some bad change sets.
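
A rough sketch of the second heuristic, under stated assumptions: the object history has already been flattened into records with the hypothetical keys `obj`, `version`, `changeset` and `time`, and a change set counts as "logically erased" when every object version it wrote was superseded by a different change set within the window. The record layout and the function name are illustrative only and do not correspond to an existing OSM API.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def short_lived_changesets(versions, window=timedelta(days=14)):
    """Return ids of change sets whose every object version was
    superseded by another change set within `window`.

    `versions` is an iterable of dicts with the (assumed) keys
    obj, version, changeset, time.
    """
    by_obj = defaultdict(list)
    for v in versions:
        by_obj[v["obj"]].append(v)

    seen = set()
    survived = set()  # change sets with at least one lasting version
    for history in by_obj.values():
        history = sorted(history, key=lambda v: v["version"])
        for cur, nxt in zip(history, history[1:] + [None]):
            cs = cur["changeset"]
            seen.add(cs)
            if nxt is not None and nxt["changeset"] == cs:
                continue  # later version in the same change set stands in
            quickly_replaced = (
                nxt is not None and nxt["time"] - cur["time"] <= window
            )
            if not quickly_replaced:
                survived.add(cs)
    return sorted(seen - survived)
```

Versions with no successor are treated as surviving, so very recent change sets are never flagged. Legitimate edits that were quickly refined by another mapper would be flagged as well, so the output is a list of candidates rather than labels.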

Jason

On Sun, Dec 18, 2016 at 6:38 PM, Animesh Sinha wrote:
> Hi,
>
> I am a first-year master's student at Purdue University and would like to
> propose a project idea for GSoC 2017. I have worked on vandalism detection
> in Wikipedia in the past and understand how important it is to predict whether
> information is correct, as incorrect information may mislead others.
>
> Hence, I would like to propose this project idea:
>
> Title: Detect if a user edit made in OSM is a vandal edit or regular.
> Summary: It's a very challenging task to monitor malicious edits or
> spam manually for a large active user base. I plan to identify cases of
> vandalism on OSM by classifying edits as either regular or vandal. This is
> clearly a binary classification task, but if the distribution of regular and
> vandalism cases in the dataset is skewed, it can also be explored as an
> anomaly detection problem.
> Requirements: Lots of data about the edits made, information about the users
> making the edit, information about the people annotating the true labels,
> etc.
>
> I would appreciate it if someone could provide feedback on the project idea and
> the requirements needed.
>
> Thanks,
> Animesh Sinha


Re: [OSM-dev] OSM Project Idea for GSoC 17: Vandalism Detection in Map Edits

2016-12-20 Thread Ethan Nelson
Hi Animesh,


You can check out the vandalism page on the OSM Wiki, which provides a pretty 
good overview of OSM vandalism and the challenge of detecting it [0].


I think a binary classification won't be straightforward as the basis for a 
wide-sweeping 'vandal detection' tool because the problem of vandalism in OSM is 
multifaceted: you can change many objects in one changeset, and each object 
itself has multiple dimensions (there are the spatial dimensions, such as shape 
and detail, and then there are the data property dimensions). In addition, 
the line between poor-quality edits and vandalism is sometimes very thin, so 
apparent vandalism may be the result not of malice but of an uninformed editor.



Thus, for a binary classification, it would be useful to focus on one type of 
vandalism. Perhaps it could be detecting doodles (in which case you could 
search for data that isn't normally shaped: small angles, very high detail, and 
so on). Or it could be finding times when people are deleting a lot of data. I 
started a form that aims to collect "bad" edits in general [1], but I haven't 
really advertised it and thus don't have data that could help show which 
type of vandalism is most common.
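
As a minimal sketch of the "small angles" signal for doodles, assuming a way is available as a closed ring of planar (x, y) coordinates: compute the angle between the two edges meeting at each vertex and report the share of vertices that are sharper than a threshold. The function names and the 30 degree threshold are illustrative assumptions, and the angle is the unsigned angle between edges (0 to 180 degrees), not the true interior angle at reflex vertices.

```python
import math


def vertex_angles(ring):
    """Unsigned angle in degrees between the two edges at each vertex
    of a closed ring of (x, y) tuples (first vertex not repeated)."""
    n = len(ring)
    angles = []
    for i in range(n):
        px, py = ring[i - 1]
        cx, cy = ring[i]
        nx, ny = ring[(i + 1) % n]
        ax, ay = px - cx, py - cy  # vector to previous vertex
        bx, by = nx - cx, ny - cy  # vector to next vertex
        la, lb = math.hypot(ax, ay), math.hypot(bx, by)
        if la == 0 or lb == 0:
            continue  # skip duplicated vertices
        cos = (ax * bx + ay * by) / (la * lb)
        cos = max(-1.0, min(1.0, cos))  # guard against rounding
        angles.append(math.degrees(math.acos(cos)))
    return angles


def sharp_angle_share(ring, threshold_deg=30.0):
    """Fraction of vertices whose angle is below `threshold_deg`."""
    angles = vertex_angles(ring)
    if not angles:
        return 0.0
    return sum(a < threshold_deg for a in angles) / len(angles)
```

A regular building outline should score close to zero, while a zigzag scribble scores high. A real detector would probably combine this with other signals such as vertex count, since some legitimate features (for example jagged coastlines) also contain sharp angles.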



You may also check out some of the projects that have implemented parts of the 
algorithms listed on the wiki page for further inspiration [2,3,4].



Best,

Ethan aka FTA


[0]: http://wiki.openstreetmap.org/wiki/Vandalism

[1]: https://docs.google.com/forms/d/e/1FAIpQLSf4bVukO5OUXviSujW1gUtM1NTroTz3lPsXy7EcKxIp8ZzX5g/viewform

[2]: http://www.mdpi.com/2220-9964/1/3/315

[3]: https://github.com/willemarcel/osmcha-django

[4]: https://github.com/ethan-nelson/osm_hall_monitor
