On Tue, 1 Jul 2008, David Hugh-Jones wrote:

I thought about this some more. One solution would be some wiki-like
documentation which people could easily edit as they learnt. The
obvious place is the R wiki ( http://wiki.r-project.org/rwiki/ ). At
the moment there's no section on spatial data and the spatial
statistics section just points you at the spatial view on CRAN.

Things I would like to know
- a list of external data types and how to get them into and out of R
- a list of internal R data representations (RArcInfo, spatstat etc.)
and how to convert between them
- a list of things to do to data (subset, thin, measure distances,
graphing etc.) and what packages do them

I am sure other people have their own thoughts - e.g. I haven't even
mentioned data analysis or statistics yet - so I am just going to
start hacking at
http://wiki.r-project.org/rwiki/doku.php?id=tips:spatial-data .
Helpers would be welcome - people who know a lot can provide answers,
and if (like me) you know barely anything, then you'll know what
questions need answering.

Excellent. Please do support David's initiative - I'll add a link to the task view very soon.

Roger


David Hugh-Jones
PhD Candidate
Essex University Department of Government
http://davidhughjones.googlepages.com


2008/7/1 Roger Bivand <[EMAIL PROTECTED]>:
On Mon, 30 Jun 2008, tom sgouros wrote:


Roger Bivand <[EMAIL PROTECTED]> wrote:


I don't mean to rant, but believe me, I've spent plenty of time with
the documentation and it's really not helping.

Partly this is a problem of R's doc format which treats package
documentation as an alphabetical list of functions - which gives me no
idea where to start.

I would tend to second this.  I've been lurking on the list for a few
months, hoping to learn a bit, but so far without much success, since
the conversation and the documentation are so far above where I am and
what I need.  I am almost familiar with R, using it for time-series
statistics, and learned early on that the Dalgaard book was a better
intro than any of the real R documents.  It doesn't cover nearly
everything, but it seems to cover what I needed, so I use what is
probably a baby-level set of R functions, but it's adequate.  Without a
professor lurking over my shoulder to explain stuff, I am perpetually
slightly lost, because all the documentation assumes I know stuff that I
don't.

I still use R because I know that with enough poking around it will
eventually provide a solution.  But if the alternatives were not very
expensive, I would have given up a while ago.

I joined this group when I wanted to expand into making maps of
geographical economic data, and after a month of working on the problem,
I essentially had to give up for the time being.  I wish there were an
introduction that showed me how to use R with a GIS program, but to my
knowledge, there is not.  I did run across a GRASS book that claimed it
would help, but as I recall, it cost upward of US$100.

I would be happy to link to such a guide from the Spatial Task View, for
example on the R-Geo site. There are other nice resources, for instance
Dylan Beaudette's site - one page is:

http://casoilresource.lawr.ucdavis.edu/drupal/node/100

Seen from the developer side, it is hard to know what users see as the most
useful advice. The courses that have been provided - say like:

http://www.bias-project.org.uk/ASDARcourse/

are rather "developer-view", as indeed the forthcoming "useR" series book
will be. By "developer-view", I mean attempting both to provide information
for beginning users and to move them along the useR-developeR continuum
where experience has shown that this may be advisable, even though it is
neither desired nor immediately applauded.

A typical immediate response to the courses has been that "all that class
and coordinate reference system stuff is unnecessary". This seems to hold
until the participants actually get to do work with their own data, at which
point having a reference to what is going on is handy. The specific
difficulty, as teachers often find, is that the initial expectation from the
user is often not the most fruitful question for helping the user to become
more self-reliant going forward.

One clear reason for this difficulty is that many different disciplines use
spatial data, and all of them seem to feel that they know enough for their
internal purposes, so get frustrated when they encounter barriers which are
inherent in their perception(s) of spatial data. So listening to other
disciplines and learning from them can be helpful.

As far as sp classes are concerned, is the R-News note of 2005 too outdated
to be helpful? Should it be placed more prominently on the Task View page?
I would acknowledge that ease of use is not what it could be; we are still
where time series (and time representation) were in R a couple of years ago.
However, the sp classes ought to work for many who do not need to manipulate
the actual coordinates. For working statisticians, simple mapping of model
residuals is no more than:

library(sp)
library(rgdal)
mydata <- readOGR(dsn="directory", layer="shpfile")
# or:
# library(maptools)
# mydata <- readShapeSpatial("shpfile.shp")
# readShapeSpatial() is available from maptools release 0.7-14, already
# submitted to CRAN; with earlier releases you have to know whether your
# shapefile holds points, lines or polygons
myobj <- lm(response ~ x1 + x2, data=mydata)
mydata$residuals <- residuals(myobj)
spplot(mydata, "residuals")

where the mydata Spatial*DataFrame behaves as a data.frame. The two-faced
nature of the Spatial*DataFrame classes is intentional, looking like GIS
data models for GIS people, and data frames for statisticians. But
manipulating coordinates is just a good deal more complicated - unless you
just need subsetting with the "[" methods.
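A minimal sketch of the two faces described above, assuming the sp package is installed and using its bundled meuse data set in place of a shapefile. The model formula and the object names (fit, high) are illustrative only.

```r
library(sp)

# meuse ships with sp as a plain data.frame that has x and y columns.
data(meuse)

# Naming the coordinate columns promotes it to a SpatialPointsDataFrame.
coordinates(meuse) <- ~x + y

# Data-frame face: fit a model on the attribute table and store the
# residuals as a new column with "$<-".
fit <- lm(log(zinc) ~ dist, data = as.data.frame(meuse))
meuse$residuals <- residuals(fit)

# "[" subsetting: rows select features, columns select attributes,
# and the result is still a spatial object.
high <- meuse[meuse$zinc > 1000, c("zinc", "residuals")]
class(high)
```

The subset can be passed straight to spplot(high, "residuals") without touching the coordinates.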

To summarise, contributions of user tips and examples, and links to those
examples, would be very welcome.

Roger


To make this more useful than just a rant, I would second David's point.
What is missing is only what David misses: an introduction that says
where to start to deal with simple geographic data, maybe providing a
few examples of common techniques and frequent problems, and pushing
data back and forth to some GIS.  I was not able to find that, and
without it, found the R documentation pretty much useless.  I'd be happy
to know of some source I hadn't found before, so if you have one to
recommend, please do.

-tom




This is an inherent (and perhaps ugly) characteristic of the S4
object/class structure as you suggest below. New style classes are not
as well integrated into the documentation as straight functions
are. Here, coerce is as(), but the issue of how to improve
documentation is not resolved.
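A short sketch of what calling as() looks like in practice, assuming the sp package and its bundled meuse data set. The object names df and pts are made up for the example.

```r
library(sp)

data(meuse)
coordinates(meuse) <- ~x + y   # now a SpatialPointsDataFrame

# as() looks up the "coerce" method registered for this pair of classes.
df  <- as(meuse, "data.frame")     # attribute table with the coordinates as columns
pts <- as(meuse, "SpatialPoints")  # geometry only, attributes dropped

# List the coercions known for a class instead of paging through help files.
showMethods("coerce", classes = "SpatialPointsDataFrame")
```

showMethods() only reports methods from packages that are currently loaded, so the list grows as further spatial packages are attached.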


This then interacts badly with the OO structure. For example, look at
the 20+ pages on "coerce". Hmm, what does "coerce" actually do? In
fact that's in a whole different library. But I didn't know that, so I
click on a page at random, say

coerce,SpatialGrid,data.frame-method

and this takes me to SpatialGrid class - which doesn't mention coerce
at all. (Nor does it tell me what SpatialGrid is, or what it is used
for.)

On the other hand, maybe I might guess that to get a list of
coordinates, I'd use "coordinates". So I click on that method, and it
tells me yes, this "retrieves spatial coordinates". But unfortunately
it retrieves them hidden inside another object ("an object of class
SpatialPointsDataFrame"). OK, but how do I get the _actual_data_?
Maybe the SpatialPointsDataFrame class page will tell me. Nope. Et
cetera.
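For point data, a minimal sketch (assuming the sp package and its bundled meuse data set) suggests that coordinates() already hands back a plain numeric matrix when it is called on an existing SpatialPointsDataFrame:

```r
library(sp)

data(meuse)
coordinates(meuse) <- ~x + y   # build the SpatialPointsDataFrame

# Called as an accessor, coordinates() returns an ordinary matrix
# with one row per point and one column per dimension.
xy <- coordinates(meuse)
is.matrix(xy)
head(xy, 3)

# Ordinary vectors of x and y values, if those are what is needed.
x <- xy[, 1]
y <- xy[, 2]
```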

Rick: yes, I agree that using the internal data structures is how to
do things, but this is broken isn't it? The whole point of having OO
is to be able to use it _without_ understanding the internal data
structures. The ideal, in other words, would be to have a "thin.lines"
method that I could just run on any polygon or set of polygons.
Failing that, then I should be able to get at the internal data
without hours of head scratching.


No, because the underlying understanding of dp and other methods for
thinning is that the objects implement an arc-node topological model,
so that each arc can be thinned without different thinning happening
on otherwise identical boundaries of neighbouring polygons. But we do
not have an arc-node representation, so there cannot be line thinning
for polygon boundaries in a spaghetti world.

Right now, it's like, everything is hidden behind a layer of classes
and slots and methods, but I still need to go behind that layer to get
at the actual raw data, and this is so complicated and confusing that
it would be easier just to work with the raw data.


You need to build topology first, so if need be take the data out to a
GIS that does topology properly, do the arc line thinning there, and
bring it back in. Building topology from a stream of straight line
segments is a serious challenge, especially if you want to retain the
association with attribute data.

Roger

OK, I'll stop venting. If there's anything I could do to improve this
situation, I would gladly try.

David Hugh-Jones


2008/6/30 Virgilio Gomez-Rubio <[EMAIL PROTECTED]>:

Dear David,

Probably the best way to start is by checking the HTML documentation. It
should be installed locally, but it is also accessible, for example, here:

http://finzi.psych.upenn.edu/R/library/sp/html/00Index.html

Hope this helps.

Virgilio

On Mon, 2008-06-30 at 18:48 +0200, David Hugh-Jones wrote:

Thanks David for his comment about dp.

Quick question: is there any reasonably comprehensible API
documentation for the "sp" package? I have just spent about an hour
trying to get a list of points from a SpatialPolygons object. I
eventually just printed everything out and found the data by hand, so
now I am doing:

coords <- [EMAIL PROTECTED]@[EMAIL PROTECTED]

but I don't assume that is right. Surely there must be some simple way
to get a list of x and y coords out of any object?
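One way to reach the vertices without printing everything out, sketched under the assumption that the object is an sp SpatialPolygons. The toy square and the names sp_poly, all_rings and first_ring are hypothetical and stand in for real data.

```r
library(sp)

# A hand-built, single-ring polygon (clockwise, so sp treats it as an
# outer ring rather than a hole).
ring <- cbind(c(0, 0, 1, 1, 0), c(0, 1, 1, 0, 0))
sp_poly <- SpatialPolygons(list(Polygons(list(Polygon(ring)), ID = "a")))

# The nesting is SpatialPolygons -> Polygons (one per feature)
# -> Polygon (one per ring) -> coords (a two-column vertex matrix).
all_rings <- lapply(slot(sp_poly, "polygons"), function(feature) {
  lapply(slot(feature, "Polygons"), function(r) slot(r, "coords"))
})

# Vertex matrix of the first ring of the first feature.
first_ring <- all_rings[[1]][[1]]

# Note: coordinates(sp_poly) returns one label point per feature,
# not the boundary vertices.
```

Keeping the result as a nested list preserves which ring belongs to which feature; do.call(rbind, unlist(all_rings, recursive = FALSE)) flattens it when that association does not matter.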

in frustration,
David Hugh-Jones


2008/6/30 David PINAUD <[EMAIL PROTECTED]>:

Maybe you can try the function dp() in the package "shapefiles", which is
an implementation of the Douglas-Peucker polyline simplification algorithm.
Hope it helps
David
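To show what this kind of thinning does, here is a base-R sketch of the Douglas-Peucker idea. It is not the code of dp() in "shapefiles"; the function names perp_dist and douglas_peucker, the tolerance and the test line are all assumptions made for the illustration. Distances are measured to the straight line through the two end points of each section.

```r
# Distance from each row of pts to the line through a and b
# (falls back to the distance from a when a and b coincide).
perp_dist <- function(pts, a, b) {
  d <- b - a
  len <- sqrt(sum(d^2))
  if (len == 0) {
    return(sqrt((pts[, 1] - a[1])^2 + (pts[, 2] - a[2])^2))
  }
  abs(d[1] * (pts[, 2] - a[2]) - d[2] * (pts[, 1] - a[1])) / len
}

# Recursive thinning of a two-column vertex matrix xy.
douglas_peucker <- function(xy, tol) {
  n <- nrow(xy)
  if (n < 3) return(xy)
  inner <- xy[2:(n - 1), , drop = FALSE]
  d <- perp_dist(inner, xy[1, ], xy[n, ])
  k <- which.max(d)
  # Every interior vertex is within tol: keep only the end points.
  if (d[k] <= tol) return(xy[c(1, n), , drop = FALSE])
  # Otherwise keep the farthest vertex and thin both halves.
  split <- k + 1
  left  <- douglas_peucker(xy[1:split, , drop = FALSE], tol)
  right <- douglas_peucker(xy[split:n, , drop = FALSE], tol)
  rbind(left[-nrow(left), , drop = FALSE], right)
}

# A slightly wobbly L-shaped line with seven vertices.
line <- cbind(x = c(0, 1, 2, 3, 4, 4, 4),
              y = c(0, 0.01, 0, 0.01, 0, 1, 2))
thinned <- douglas_peucker(line, tol = 0.1)
nrow(thinned)  # 3: the two end points and the corner at (4, 0)
```

The result depends on where a line starts and ends, so running this separately on two polygon rings that share a boundary can keep different vertices on each copy of that boundary.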

David Hugh-Jones wrote:

Hi all

I have a big dataset of points and am doing stuff on them that takes a
lot of time. To speed it up, I would like to use "thinlines" from
RArcInfo, which basically makes the maps "rougher" by throwing away
points. Is there an equivalent function for SpatialPolygon type
objects? (I assume that there's no way to convert _to_ Arcinfo, though
I know it's possible to read from it).

Cheers

David Hugh-Jones

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-sig-geo





--
***************************************************
David PINAUD
Ingénieur de Recherche "Analyses spatiales"

Centre d'Etudes Biologiques de Chizé - CNRS UPR1934
79360 Villiers-en-Bois, France poste 485
Tel: +33 (0)5.49.09.35.58
Fax: +33 (0)5.49.09.65.26
http://www.cebc.cnrs.fr/

***************************************************





--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]