Dear Nicholas,

> But from your description, it seems like there might be other questions
> that should guide your analysis.
> Context should drive your exploration of the data.
>
The most important context in my case is the user's current location,
since I'm researching the usefulness of SDA for Location-Based Services.

> As Virgilio pointed out you won't get much mileage plotting 10000 points.
> You need some way of aggregating.

Yeah, I think so too. 10,000+ points is the entire dataset, but in the
application it would be perhaps 100 points, which are nearby restaurants.
If the entire Netherlands is shown, I may need to aggregate; otherwise
the clutter just won't say much to the user.

> I am not a big
> fan of pie charts, but if you have only a few categories they may show a
> pattern.

I have 90 kitchen types / categories. The summary() function from
spatstat might enable me to make these pie charts and display the
frequency.

> Another route that might be interesting is if you have street maps, look
> at clustering of restaurants on different
> streets. It may show interesting patterns, i.e. fast food clustered near
> freeways and Walmarts.

This would be beneficial to users of the LBS, since they might be
positioned on a certain street and might want to know whether restaurants
are close to each other / clustered. With street maps, do you mean the
actual shapefiles of the streets?

> You might want to look at flowingdata.com; there are some nice map
> visualizations
> there.

Thanks very much for the link. There are truly nice map visualisations.
There might be something I could use.

> Yes, if you need help rating restaurants, put me in your grant too :-)

Hehe, as soon as the LBS prototype is ready, I'll give you a sign :)
I originally planned to implement ratings as well, but have restricted
myself to just analysing the ratings I have so far. I'm not sure if
rating_interior, rating_food and rating_service would qualify the dataset
as a Spatial Continuous Dataset, or whether they're still covariates.
Visualising the restaurants based on the ratings by dimensions is
something I'm investigating as well.
I didn't come across any SDA technique to support this though.

Thanks very much for taking the effort to reply and for supplying me with
valuable tips, Nicholas. Very much appreciated.

2009/2/16 Nicholas Lewin-Koh <ni...@hailmail.net>

> Hi,
> Yes, if you need help rating restaurants, put me in your grant too :-)
> Seriously, there are many ways to skin a cat. I don't think cartograms
> will help you much in this particular case. If you have data besides
> your point pattern, e.g. postal codes, census data, zoning, ... you
> could look for the obvious patterns, e.g. Italian restaurants clustered
> in Little Italy and Chinese in Chinatown, and then look for the more
> interesting, not so obvious patterns.
>
> But from your description, it seems like there might be other questions
> that should guide your analysis. Context should drive your exploration
> of the data.
>
> As Virgilio pointed out you won't get much mileage plotting 10000
> points. You need some way of aggregating. Glyphs might be one way if
> you have some polygonal unit that makes sense, such as census blocks.
> I am not a big fan of pie charts, but if you have only a few categories
> they may show a pattern. Kernel density estimation is limited; it will
> show you the spatial distribution of one particular type.
>
> Another route that might be interesting is if you have street maps, look
> at clustering of restaurants on different streets. It may show
> interesting patterns, i.e. fast food clustered near freeways and
> Walmarts.
>
> The sky is the limit. Once you have done a lot of this more basic EDA,
> then think about what kind of analytical methods you want to use to
> address specific questions. You are more likely to get what you want.
> You might want to look at flowingdata.com; there are some nice map
> visualizations there.
>
> Nicholas
>
> > ------------------------------
> >
> > Message: 10
> > Date: Sun, 15 Feb 2009 22:17:29 +0100
> > From: Virgilio Gomez Rubio <virgilio.go...@uclm.es>
> > Subject: Re: [R-sig-Geo] Point pattern analysis
> > To: Michel Barbosa <cica...@gmail.com>
> > Cc: r-sig-geo@stat.math.ethz.ch
> > Message-ID: <1234732649.8833.84.ca...@virgilio-gomez>
> > Content-Type: text/plain
> >
> > Dear Michel,
> >
> > > I'm new to Spatial Data Analysis and have just begun working through
> > > "Applied Spatial Data Analysis with R" by Bivand et al. For my
> > > research I would like to use SDA to be able to tell more about my
> > > restaurant data set than just pinpointing them on a Google map. So
> > > far, from reading the literature on SDA I've been able to construct
> > > the following questions.
> >
> > Interesting problem. Let me know if you need help collecting data. ;)
> >
> > > 1. How far / close are restaurants from each other? (answered by
> > >    using kernel density estimation)
> > > 2. Which type of restaurants stand next to each other?
> > > 3. How are the restaurants positioned relative to each other?
> > > 4. What's the difference between restaurant A and restaurant B?
> >
> > Questions 2 and 3 are much alike, and I believe that question 4 is too
> > general and not necessarily about the spatial distribution of the
> > restaurants.
> >
> > Depending on the number of different types of restaurants, you may want
> > to estimate a different surface for each type. Basically, you may
> > consider a multivariate point pattern, so that you estimate a different
> > surface for each type and you compare them to see if they are similar
> > or not. This will address the question of whether the spatial
> > distribution of different types of restaurants is the same or not. This
> > is discussed in Diggle et al. (2005, JRSS Series A). Some of the methods
> > described in the paper are implemented in package spatialkernel.
> >
> > You may also want to compute bivariate K-functions (see 'k12hat' in
> > splancs; 'Kmulti' in spatstat) to detect differences between the spatial
> > distributions of types of restaurants. This will give you a partial
> > answer to Question 2.
> >
> > If you have a set of covariates for each restaurant and you want to
> > estimate their effect and how they explain the spatial distribution of
> > the data you can check Diggle et al. (2006, Biometrics). There is also
> > an example of this in Bivand et al. (2008).
> >
> > I am not sure about the best way of tackling Question 3 (and why this is
> > important). Have you considered testing whether a certain type of
> > restaurant tends to appear around a particular area of the city? For
> > example, are Chinese restaurants clustered around Chinatown?
> >
> > Finally, another option is to aggregate your data (counts per
> > neighbourhood, for example) and do a similar analysis as in disease
> > mapping.
> >
> > > I've exported a subset of my dataset to CSV in order to import it in R.
> > > Currently, my CSV file is of the form
> > >
> > > *restaurant name; latitude; longitude; type*
> > > Amigo;52.996058;6.564229;Italian
> > > Bella Italia;52.99281;6.560353;Italian
> > > Isola Bella;52.993764;6.560245;Italian
> >
> > I would not use long/lat but UTM to do your analysis. You can do this
> > very easily with R.
> >
> > > I've tried to import the CSV in R by doing:
> > >
> > > library(spatstat)
> > > info <- read.csv(file = "sample.csv", sep = ";", strip.white = TRUE)
> > > win <- owin(c(0,100),c(0,100))
> > > pattern <- ppp(info$lat, info$lng, window = win, marks=info$name)
> > >
> > > However, if I plot the pattern, the points are all cluttered. What
> > > advice could you give me on setting the window size?
> >
> > If you try to plot more than 10,000 points, then I am not surprised that
> > they are all cluttered. :) I would plot the estimated intensity of the
> > point patterns.
> > Or you may aggregate your data and produce a map based
> > on the neighbourhoods in your area.
> >
> > Hope this helps.
> >
> > Virgilio

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
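
For reference, the import advice in the thread (project to UTM, size the
window from the data, plot the estimated intensity rather than raw points)
could be sketched roughly as follows. This is only a sketch under stated
assumptions: the sf package is used for the projection although the thread
names no package for that step, the three CSV rows quoted in the thread
stand in for sample.csv, EPSG:32632 (UTM zone 32N) is chosen because the
sample points lie near 6.56 degrees east, and the 50 m margin and 100 m
bandwidth are arbitrary illustration values.

```r
library(spatstat)  # ppp, owin, density, unmark
library(sf)        # assumption: sf handles the reprojection (not named in the thread)

# Inline stand-in for sample.csv, using the three rows quoted in the thread.
csv_text <- "restaurant name;latitude;longitude;type
Amigo;52.996058;6.564229;Italian
Bella Italia;52.99281;6.560353;Italian
Isola Bella;52.993764;6.560245;Italian"

# read.csv converts the header "restaurant name" to the column restaurant.name.
info <- read.csv(text = csv_text, sep = ";", strip.white = TRUE,
                 stringsAsFactors = FALSE)

# Project longitude/latitude (EPSG:4326) to UTM zone 32N (EPSG:32632), in metres.
pts <- st_as_sf(info, coords = c("longitude", "latitude"), crs = 4326)
xy  <- st_coordinates(st_transform(pts, 32632))

# Derive the window from the data instead of a fixed 0-100 square.
pad <- 50  # hypothetical margin in metres
win <- owin(range(xy[, "X"]) + c(-pad, pad),
            range(xy[, "Y"]) + c(-pad, pad))

# x is the easting (from longitude), y is the northing (from latitude);
# the restaurant type is used as the mark.
pattern <- ppp(xy[, "X"], xy[, "Y"], window = win,
               marks = factor(info$type))

summary(pattern)  # includes the frequency of each type

# Smoothed intensity of all points; sigma is an illustrative bandwidth in metres.
plot(density(unmark(pattern), sigma = 100), main = "Estimated intensity")
```

With the full data set, split(pattern) would give one point pattern per
type, so a separate intensity surface could be estimated for each, and
Kcross(pattern, "Italian", "Chinese") would be one way to compute a
bivariate K-function between two types (the type "Chinese" is hypothetical
here, since the sample rows contain only Italian restaurants). For data
covering the whole Netherlands, which straddles UTM zones 31 and 32, a
single national grid such as EPSG:28992 may be a more convenient choice
than one UTM zone.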