[OSM-dev] OSM Level of Detail for GSoC 2017
Hi Peda,

yes indeed, yours is a very pertinent question. For the problem of differentiating between low-detail complex buildings and high-detail simple buildings, I currently have two ideas.

One, check the spatial distribution of the level of detail. This is based on the assumption that buildings in a spatial cluster are mapped at a similar level of detail. I'm aware that there are several exceptions that might make this assumption invalid.

Two, use an authoritative data set to compare the detail of the footprints. Again, this assumes that the authoritative data set is consistent in its level of detail.

I admit that I was not aware of your second point, so thank you for bringing it to my attention. It also helps me understand the limitations of such a method for enriching the OSM database. I will still work with OSM data in my thesis, however my project for GSoC 2017 is off the table.

In general, my work is focused on the statistical comparison of footprints or 3D buildings using a machine learning approach. However, I'm still in the middle of developing the method, and ultimately I would like to know to what extent the automatic inference of LOD is possible.

Thank you very much for your feedback!

Balázs

--
Balázs Dukai
MSc Geomatics engineer candidate | MEng Landscape architect
The Hague, Netherlands
LinkedIn: www.linkedin.com/in/balazsdukai
___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev
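The first idea in the message above (buildings in a spatial cluster tend to share a level of detail) could be sketched roughly as follows. This is only an illustration and not the method from the thesis: using the vertex count as a proxy for detail, the `radius` and `ratio` thresholds, and the dictionary keys `id`, `centroid` and `vertices` are all assumptions made for the example. Centroids are assumed to be in a projected coordinate system with metre units.

```python
import math
from statistics import median


def detail_outliers(buildings, radius=100.0, ratio=3.0):
    """Flag buildings whose vertex count differs strongly from their neighbours.

    `buildings` is a list of dicts with the hypothetical keys 'id',
    'centroid' (an (x, y) tuple in metres) and 'vertices' (vertex count,
    used here as a crude stand-in for level of detail).
    """
    flagged = []
    for b in buildings:
        # Vertex counts of all other buildings within `radius` metres.
        neighbours = [
            o["vertices"]
            for o in buildings
            if o is not b and math.dist(b["centroid"], o["centroid"]) <= radius
        ]
        if not neighbours:
            continue  # an isolated building cannot be compared to a cluster
        m = median(neighbours)
        # Flag if much more detailed or much less detailed than the cluster.
        if b["vertices"] > ratio * m or b["vertices"] * ratio < m:
            flagged.append(b["id"])
    return flagged


sample = [
    {"id": "a", "centroid": (0.0, 0.0), "vertices": 4},
    {"id": "b", "centroid": (10.0, 0.0), "vertices": 5},
    {"id": "c", "centroid": (20.0, 0.0), "vertices": 4},
    {"id": "d", "centroid": (30.0, 0.0), "vertices": 40},
    {"id": "e", "centroid": (1000.0, 0.0), "vertices": 4},
]
print(detail_outliers(sample))
```

A flagged building is only a candidate for inspection: as the message notes, a genuinely complex building in a simple neighbourhood would be flagged as well, so this does not by itself separate complexity from detail.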
Re: [OSM-dev] OSM Project Idea for GSoC 17: Vandalism Detection in Map Edits
Hi,

There is obviously plenty of data that represents "good" changes. The Data Working Group reversions could be used to train a classifier on what a bad edit looks like. After that, looking for change sets that are logically erased after a short period of time (say 2 weeks) might also yield some bad change sets.

Jason

On Sun, Dec 18, 2016 at 6:38 PM, Animesh Sinha wrote:
> Hi,
>
> I am a first-year master's student at Purdue University and would like to
> propose a project idea for GSoC 2017. I have worked on Vandalism Detection
> in Wikipedia in the past and understand how important it is to predict
> whether a piece of information is correct, as it may be misleading to others.
>
> Hence, I would like to propose this project idea:
>
> Title: Detect if a user edit made in OSM is a vandal edit or regular.
> Summary: It's a very challenging task to monitor the malicious edits or
> spam manually for a large active user base. I plan to identify the cases of
> vandalism on OSM by classifying edits as either regular or vandal. This is
> clearly a Binary Classification task, but if the distribution of regular and
> vandalism cases in the dataset is skewed, it can also be explored as an
> Anomaly Detection problem.
> Requirements: Lots of data about the edits made, information about the users
> making the edit, information about the people annotating the true labels,
> etc.
>
> I would appreciate it if someone could provide feedback on the project idea
> and the requirements needed.
>
> Thanks,
> Animesh Sinha
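The labelling idea in Jason's reply could be sketched along these lines. It is a minimal illustration under stated assumptions, not an existing OSM tool: the record keys `id`, `created_at`, `reverted_by_dwg` and `undone_at` are hypothetical, and working out when a change set has been "logically erased" is left to whatever history processing produces these records.

```python
from datetime import datetime, timedelta


def label_changesets(changesets, window=timedelta(weeks=2)):
    """Assign a training label to each change set.

    Each change set is a dict with the hypothetical keys:
      'id'              - change set identifier
      'created_at'      - datetime the change set was made
      'reverted_by_dwg' - True if the Data Working Group reverted it (optional)
      'undone_at'       - datetime its changes were logically erased (optional)
    """
    labels = {}
    for cs in changesets:
        undone_at = cs.get("undone_at")
        if cs.get("reverted_by_dwg"):
            labels[cs["id"]] = "bad"  # known-bad: reverted by the DWG
        elif undone_at is not None and undone_at - cs["created_at"] <= window:
            labels[cs["id"]] = "suspect"  # erased soon after creation
        else:
            labels[cs["id"]] = "good"
    return labels


example = [
    {"id": 1, "created_at": datetime(2016, 12, 1), "reverted_by_dwg": True},
    {"id": 2, "created_at": datetime(2016, 12, 1), "undone_at": datetime(2016, 12, 5)},
    {"id": 3, "created_at": datetime(2016, 12, 1), "undone_at": datetime(2017, 3, 1)},
    {"id": 4, "created_at": datetime(2016, 12, 1)},
]
print(label_changesets(example))
```

Keeping "suspect" separate from "bad" reflects the hedge in the reply ("might also yield"): quickly erased change sets are candidates for review rather than confirmed vandalism.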
Re: [OSM-dev] OSM Project Idea for GSoC 17: Vandalism Detection in Map Edits
Hi Animesh,

You can check out the vandalism page on the OSM Wiki, which provides a pretty good overview of OSM vandalism and the challenge of detecting it [0].

I think a binary classification won't be straightforward for creating a wide-sweeping 'vandal detection' tool because the problem of vandalism in OSM is multifaceted: you can change many objects in one changeset, and each object itself has multiple dimensions (there are the spatial dimensions--shape, detail, etc.--and then there are the data property dimensions). In addition, the line between poor-quality edits and vandalism is sometimes very thin, so apparent vandalism may not be the result of malice but rather of an uninformed editor.

Thus, for a binary classification, it would be useful to focus on one type of vandalism. Perhaps it could be detecting doodles (in which case you could search for data that isn't normally shaped: small angles, very high detail, and so on). Or it could be finding times when people are deleting a lot of data. I started a form that aims to collect "bad" edits in general [1], but I haven't really advertised it and thus don't have data that could show which type is most common.

You may also check out some of the projects that have implemented parts of the algorithms listed on the wiki page for further inspiration [2,3,4].

Best,
Ethan aka FTA

[0]: http://wiki.openstreetmap.org/wiki/Vandalism
[1]: https://docs.google.com/forms/d/e/1FAIpQLSf4bVukO5OUXviSujW1gUtM1NTroTz3lPsXy7EcKxIp8ZzX5g/viewform
[2]: http://www.mdpi.com/2220-9964/1/3/315
[3]: https://github.com/willemarcel/osmcha-django
[4]: https://github.com/ethan-nelson/osm_hall_monitor
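As a rough illustration of the "small angles" heuristic for doodles mentioned above, the following sketch measures the angle at every vertex of a closed way and flags the way if a large share of them are very sharp. The function names and both thresholds (`min_angle_deg`, `max_sharp_fraction`) are assumptions chosen for the example, not values taken from the wiki page or from the linked projects, and coordinates are assumed to be planar (projected) rather than raw latitude/longitude.

```python
import math


def vertex_angles(points):
    """Return the angle in degrees at each vertex of a closed ring.

    `points` is a list of (x, y) tuples in a planar coordinate system,
    without the first point repeated at the end.
    """
    n = len(points)
    angles = []
    for i in range(n):
        px, py = points[i - 1]
        cx, cy = points[i]
        nx, ny = points[(i + 1) % n]
        v1 = (px - cx, py - cy)  # vector to the previous vertex
        v2 = (nx - cx, ny - cy)  # vector to the next vertex
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0:
            continue  # duplicate consecutive points have no defined angle
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
        cos_a = max(-1.0, min(1.0, cos_a))  # guard against rounding error
        angles.append(math.degrees(math.acos(cos_a)))
    return angles


def looks_like_doodle(points, min_angle_deg=15.0, max_sharp_fraction=0.3):
    """Flag a ring when the share of very sharp vertices exceeds a threshold."""
    angles = vertex_angles(points)
    if not angles:
        return False
    sharp = sum(1 for a in angles if a < min_angle_deg)
    return sharp / len(angles) > max_sharp_fraction


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
zigzag = [(0, 0), (10, 1), (0, 2), (10, 3), (0, 4), (10, 5)]
print(looks_like_doodle(square), looks_like_doodle(zigzag))
```

A check like this covers only one narrow type of vandalism, which matches the suggestion to focus a binary classifier on a single type; legitimate features with sharp corners would need manual review.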