Re: [R] OpenOffice ods spreadsheets in R?

2011-07-10 Thread Martin Rittner
David, you are right, I seem not to be able to reproduce the problem 
with a synthetic file (see attached). However, if I run


tst<-read.ods(file="test.ods",stringsAsFactors=FALSE)

I do not get a data.frame() in tst[[1]], just a vector of the row names:

1> tst[[1]]
[1] NA "row1" "row2" "row3" "row4" "row5"

I tried to reproduce the "messy" structure of the original file, which 
is not my fault but how I got the data... Sorry, I can't post the file 
that actually causes my problems, as it contains all my most important 
(unpublished) research data... And I don't want to spend hours on 
finding the exact table lines in many thousands that reproduce the 
problem, sorry.


Anyway, I'll go with the less satisfying solution of exporting all the 
sheets as .csv (as proposed by Ista and Don, thanks) and see how that 
goes, just thought there was another solution.
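
For the record, the CSV route might look roughly like this in base R. It is only a sketch: the file content and the names csv_path, sheet and sheets are made up for illustration, and the temporary file stands in for a sheet exported from the .ods file.

```r
# Stand-in for one sheet exported from the .ods file (hypothetical content);
# note the duplicated label "row1" in the first column.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,value", "row1,1.5", "row1,2.5", "row2,3.0"), csv_path)

# row.names = NULL keeps the first column as ordinary data,
# so duplicated labels are not a problem.
sheet <- read.csv(csv_path, row.names = NULL, stringsAsFactors = FALSE)
print(sheet)

# Several exported sheets can be read into one named list.
paths <- c(sheet1 = csv_path)  # hypothetical: one path per exported sheet
sheets <- lapply(paths, read.csv, row.names = NULL, stringsAsFactors = FALSE)
str(sheets)
```

Keeping the labels as a normal column means they can still be turned into row names later, once any duplicates have been dealt with.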


Richard, thanks for the tip, I didn't try it, but as far as I can see, 
it's supposed to run R within OpenOffice? That's not quite what I had in 
mind, I am using R on different machines and don't want to have to 
install OpenOffice on each and open it up each time I want to work on 
the data...


Thanks to everyone for the quick and detailed reply!
Martin


On 10/07/11 02:13, David Winsemius wrote:


On Jul 9, 2011, at 8:49 PM, Martin Rittner wrote:


I would like to open OpenOffice (LibreOffice) .ods files in R. I've
tried the ROpenOffice package from omegahat, but unfortunately, the
read.ods() function attempts to use the values of the first column in
a worksheet as row names, and thus does not allow duplicates in there
(which, even more unfortunately, occur in my files). Also, the
function does not allow forwarding any other parameters to the called
functions, e.g. row.names=FALSE.
Another option read.ods() does not offer is choice of the worksheet(s)
to open, but that's a minor issue for me.

Now, I could alter the files to fit the function. I could alter the
function itself. But before I do any of that, I wanted to ask if
someone has a better idea, another approach to ods-files altogether,
maybe?



I'm not getting the same results as you. Duplicates in the first column
do not seem to be causing problems. If you look at the code for read.ods,
you can step through it and modify its behavior to your liking. Perhaps
you should post a reproducible example and also the error messages you
are getting.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] OpenOffice ods spreadsheets in R?

2011-07-09 Thread Martin Rittner
I would like to open OpenOffice (LibreOffice) .ods files in R. I've 
tried the ROpenOffice package from omegahat, but unfortunately, the 
read.ods() function attempts to use the values of the first column in a 
worksheet as row names, and thus does not allow duplicates in there 
(which, even more unfortunately, occur in my files). Also, the function 
does not allow forwarding any other parameters to the called functions, 
e.g. row.names=FALSE.
Another option read.ods() does not offer is choice of the worksheet(s) 
to open, but that's a minor issue for me.


Now, I could alter the files to fit the function. I could alter the 
function itself. But before I do any of that, I wanted to ask if someone 
has a better idea, another approach to ods-files altogether, maybe?


Thanks, Martin



Re: [R] Reading an ArcGIS raster file

2008-04-04 Thread Martin Rittner
I don't know too much about R myself, but a little about ArcGIS:

As far as I know, the .aux files ArcGIS produces do not hold the actual 
data, but metadata. The actual file should have some other extension 
(dat, tif, hgz, img, csv... many more), but with rather small datasets 
the metadata may be larger than the actual data (thanks, ESRI!).

As for importing raster data in R: I only know of the image() function, 
which displays tabulated data (a matrix of values) as a colour-coded 
image. I'd be interested in importing raster files (in a defined 
resolution and scaling) myself (e.g. for plotting raster files on a 
map), but haven't found a good solution yet (in R, that is)...
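
To illustrate what I mean by image(), here is a minimal sketch in base R. The grid coordinates and depth values are invented; a real raster would first have to be read into a matrix by some other means, which this example does not cover.

```r
# Hypothetical 3 x 4 grid of depths; x and y are the cell centre coordinates.
x <- c(0, 10, 20)
y <- c(0, 5, 10, 15)
z <- outer(x, y, function(a, b) -(a + 2 * b))  # made-up bathymetry values

# image() expects nrow(z) == length(x) and ncol(z) == length(y);
# asp = 1 keeps one x unit as long as one y unit on screen.
image(x, y, z, asp = 1, xlab = "Easting", ylab = "Northing")
```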

If you have access to ArcGIS, I'd go the other way: let R calculate 
some raster data, import it into the GIS and do the mapping there?

Sorry I can't help you more, I'm rather new to R myself...
Martin


Juliane Struve wrote:
> Dear members,
>
> How can I read and plot an ArcGIS raster file into R ? The file has extension 
> .aux and contains floating point bathymetry data. The purpose is to create a 
> spatial model in R that uses ArcGIS map data. I have managed to read and plot 
> various shape files into my R project, but I am stuck with this now. I am new 
> to this list and also to R, so any help would be much appreciated. 
>
> Many thanks and best wishes,
>
> Juliane 
>
>
>  
> Dr. Juliane Struve
> Adjunct Environmental Scientist
> Mote Marine Laboratory
> Center for Fisheries Enhancement
> 1600 Ken Thomson Parkway
> Sarasota, Florida, 34236
> (941)388-4441 Ext. 408
>
>
>   
>
>



Re: [R] Thinking about using two y-scales on your plot?

2008-04-03 Thread Martin Rittner
Richie,
A plot of the actual temperature during a year (or thousands of years, 
as people in palaeoclimate studies are rather used to) is just so much 
more intuitive than some correlation coefficients or such. I know I'm 
largely speaking to statisticians in this forum, but in Earth Sciences, 
most people aren't... I see the use of correlation coefficients and 
correlation plots in proving that an apparent correlation is "real", but 
the first question upon presenting any statistical analysis is always 
"What does the DATA look like?".

Of course, these plots could be plotted separately with a common x-axis, 
it's just a matter of saving space and of being used to that kind of 
graph. I can't imagine anyone being misled into a thought like "oh 
gosh, the temperature is much higher/bigger/more than the 
precipitation!" - that makes no sense. I do see the point for graphs 
where values are plotted together whose possible interaction with each 
other might lead to wrong conclusions. There, it might not be obvious 
that one is drawing a senseless conclusion.
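
The separate-panels variant could be sketched in base R as below. The monthly values are invented purely for illustration, and the variable names are my own.

```r
# Made-up monthly climate values, for illustration only.
month     <- 1:12
temp_c    <- c(-1, 0, 4, 9, 14, 17, 19, 18, 14, 9, 4, 0)
precip_mm <- c(40, 35, 45, 50, 70, 85, 90, 80, 60, 50, 50, 45)

# Two stacked panels that share the same x-axis range.
op <- par(mfrow = c(2, 1), mar = c(4, 4, 1, 1))
plot(month, temp_c, type = "l", col = "red",
     xlim = c(1, 12), xlab = "", ylab = "Temperature (deg C)")
plot(month, precip_mm, type = "h", col = "blue",
     xlim = c(1, 12), xlab = "Month", ylab = "Precipitation (mm)")
par(op)  # restore the previous graphics settings
```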

Best regards,
Martin


Richard Cotton wrote:
>
> thegeologician wrote:
>   
>> ... very often time-series plots of some values are 
>> given rather to show the temporal correlation of these, than to show the 
>> actual numerical values! The same applies for plots of some sample 
>> values over distance (eg. element concentration over a sample or 
>> investigation area). In this case one is more interested in whether some 
>> values change simultaneously, than what the actual values at every point 
>> are.
>>
>> In the mentioned plot (see link below), the temporal  evolution of the 
>> mean temperature and of the precipitation over a year is the important 
>> information. 
>>
>> 
>
> If temporal correlation is what you are interested in, then why not plot
> that?  If you also care about the evolution of temperature and
> precipitation, then these can be plotted on individual graphs, to give three
> graphs in total, each with a common x-axis (time), and each showing one
> variable of interest on its y-axis.  This way the problems of multiple
> y-axes are avoided.
>
>
> -
> Regards,
> Richie.
>
> Mathematical Sciences Unit
> HSL
>



Re: [R] Thinking about using two y-scales on your plot?

2008-03-27 Thread Martin Rittner
Hello all,
I know I'm not making friends with this, but: I absolutely see the point 
in dual-(or more!)-y-axis plots! I find them quite informative, and I 
see them often. In Earth-Sciences (and I very generously include 
atmospheric sciences here, as Johannes has given an example of a 
meteorological plot...) very often time-series plots of some values are 
given rather to show the temporal correlation of these, than to show the 
actual numerical values! The same applies for plots of some sample 
values over distance (eg. element concentration over a sample or 
investigation area). In this case one is more interested in whether some 
values change simultaneously, than what the actual values at every point 
are.

In the mentioned plot (see link below), the temporal evolution of the 
mean temperature and of the precipitation over a year is the important 
information. No-one would get confused or draw wrong conclusions if 
the curves intersected somewhere else only because of a shift of 
one y-axis relative to the other (which was said to be one of the 
great dangers of dual-scaled axes in the article Hadley posted).

On the other hand, you would never express temperature in terms of a 
percentage of some arbitrary start value, if you could give it just in 
plain °C!? (as was proposed as a workaround in the article mentioned) An 
awkward scale like this makes the actual graph much harder to read, not 
easier, as proposed. Furthermore, since the observed values in Earth 
Sciences often show a cyclic behavior, the graphs would still cross each 
other over and over again, no matter what the scale was.
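
For anyone who wants to try such a dual-axis plot, here is a minimal base R sketch. The monthly values are made up, and overlaying a second plot with par(new = TRUE) is just one way of doing it, not the only one.

```r
# Made-up monthly climate values, for illustration only.
month     <- 1:12
temp_c    <- c(-1, 0, 4, 9, 14, 17, 19, 18, 14, 9, 4, 0)
precip_mm <- c(40, 35, 45, 50, 70, 85, 90, 80, 60, 50, 50, 45)

op <- par(mar = c(4, 4, 1, 4))  # leave room for the right-hand axis
plot(month, temp_c, type = "l", col = "red",
     xlab = "Month", ylab = "Temperature (deg C)")

# Overlay the second series on its own y-scale, without redrawing the axes.
par(new = TRUE)
plot(month, precip_mm, type = "l", col = "blue",
     axes = FALSE, xlab = "", ylab = "")
axis(side = 4)
mtext("Precipitation (mm)", side = 4, line = 2.5)
par(op)  # restore the previous graphics settings
```

Where the two curves cross depends entirely on the two axis ranges chosen, which is the shift effect discussed above.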

So my conclusion for now: I'd answer the question "are dual-scaled axes 
in graphs ever the best solution?" with a definite YES. Maybe only in 
some specialized applications, but - yes. I strongly expect this 
discussion to go on (as I've read frequently here that this kind of 
graph is considered very "inappropriate"..) and I am happy to learn to 
do better graphs, if you can show me to be wrong...

Greetings,
Martin


Johannes Hüsing wrote:
> I wonder how long it will take until meteorologists will see the light.
>
> http://www.zoolex.org/walter.html
>
>



Re: [R] ggplot2: bug in geom_ribbon + log scale!?

2008-02-17 Thread Martin Rittner
It works nicely with logging the values. Thank you very much!
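
For later readers, the workaround of explicitly logging the values might look like the sketch below. Two assumptions: it is written against a current ggplot2 release, where the ribbon aesthetics are called ymin and ymax (my 0.5.7 code used min and max), and the data is made up and kept strictly positive so that log10() is defined.

```r
library(ggplot2)

set.seed(1)
# Made-up, strictly positive data.
d <- data.frame(x = 1:20, y1 = runif(20, 2, 4), y2 = runif(20, 5, 7))

# Log the values explicitly and plot them on an ordinary continuous scale.
p <- ggplot(d, aes(x = x)) +
  geom_ribbon(aes(ymin = log10(y1), ymax = log10(y2)), fill = "grey70") +
  geom_line(aes(y = log10(y1)), colour = "blue") +
  geom_line(aes(y = log10(y2)), colour = "red") +
  ylab("log10(y)")
print(p)
```

The axis then shows log10 values rather than the original units, which is the price of this workaround.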

Keep on the great work, the more I use ggplot2, the more I love it!

Greetings, Martin


[EMAIL PROTECTED] wrote:
> Yes, that's a bug in the current version (fixed in the development
> version) - the problem is that the min and max aesthetics aren't being
> correctly scaled.  You can fix it by explicitly logging those values,
> or I can send you the development version off list if you remind me of
> your OS.
>
> Hadley
>
> On 2/17/08, Martin Rittner <[EMAIL PROTECTED]> wrote:
>   
>> Hi everyone, Hadley,
>>
>> it seems there's a bug in geom_ribbon() when using it in a log-scaled plot:
>>
>> d<-data.frame(x=c(1:20),y1=rnorm(20)+3,y2=rnorm(20)+5)
>> p<-ggplot()
>> p<-p+geom_ribbon(data=d,aes(x=d[["x"]],min=d[["y1"]],max=d[["y2"]]))
>> p<-p+geom_line(data=d,aes(x=d[["x"]],y=d[["y1"]]),colour="blue")
>> p<-p+geom_line(data=d,aes(x=d[["x"]],y=d[["y2"]]),colour="red")
>> p+scale_y_continuous()
>>
>> ...gives the plot I want, only with my actual data, I'd need it in
>> log-scale:
>>
>> p+scale_y_log10()
>>
>> shifts the ribbon far above the actual data (factor 1000, if I see it
>> right, but I haven't tested it with different data to check on that...).
>>
>> I'm using the latest version of ggplot2 (0.5.7).
>>
>> Has anyone any suggestions for a workaround/patch?
>>
>> Many Thanks, Martin
>>
>>
>> 
>
>
>



[R] ggplot2: bug in geom_ribbon + log scale!?

2008-02-17 Thread Martin Rittner
Hi everyone, Hadley,

it seems there's a bug in geom_ribbon() when using it in a log-scaled plot:

d<-data.frame(x=c(1:20),y1=rnorm(20)+3,y2=rnorm(20)+5)
p<-ggplot()
p<-p+geom_ribbon(data=d,aes(x=d[["x"]],min=d[["y1"]],max=d[["y2"]]))
p<-p+geom_line(data=d,aes(x=d[["x"]],y=d[["y1"]]),colour="blue")
p<-p+geom_line(data=d,aes(x=d[["x"]],y=d[["y2"]]),colour="red")
p+scale_y_continuous()

...gives the plot I want, only with my actual data, I'd need it in 
log-scale:

p+scale_y_log10()

shifts the ribbon far above the actual data (factor 1000, if I see it 
right, but I haven't tested it with different data to check on that...).

I'm using the latest version of ggplot2 (0.5.7).

Has anyone any suggestions for a workaround/patch?

Many Thanks, Martin



Re: [R] ggplot2 used in a function - variable scope/environment

2008-02-16 Thread Martin Rittner
Hi Hadley,

that helps perfectly! The actual solution, given my former example, would be

mapping=aes_string(x=names(da)[1],y=c)

I read about aes_string() sometime, but I didn't realize it was the
solution to this problem... As often, PEBKAC!
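
For later readers, here is a sketch of how the function from my first post could build its mappings from column names. Assumptions: it is written against a current ggplot2 release and uses the .data pronoun inside aes() instead of aes_string() itself (a swapped-in technique, not what Hadley posted); each layer is created inside lapply() so that every mapping keeps its own column name; and scale_y_log10() is left out because the toy data contains negative values.

```r
library(ggplot2)

# Hypothetical rewrite of getPlot(): one geom_line layer per data column,
# with the mapping referring to column names, not to local variables.
getPlot <- function(da) {
  xcol <- names(da)[1]  # first column holds the x values
  layers <- lapply(names(da)[-1], function(col) {
    # Each call has its own 'col', so the layers do not all end up
    # pointing at the last column of the loop.
    geom_line(aes(x = .data[[xcol]], y = .data[[col]]))
  })
  ggplot(data = da) + scale_x_continuous() + layers
}

set.seed(1)
d <- data.frame(x = seq(0, 1, length.out = 5), y1 = rnorm(5), y2 = rnorm(5))
p <- getPlot(d)
print(p)
```

Per-series attributes such as colour or size could be passed to geom_line() inside the lapply() call.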

Many Thanks!
Martin


hadley wickham wrote:
> Hi Martin,
> 
> Two comments:
> 
>  * ggplot always requires the data to be plotted to be stored in a
> data.frame, not the environment  - this makes it possible to (e.g.)
> save self contained plot objects - but that isn't the problem here,
> the problem is setting up the appropriate mapping
> 
>  * aes_string makes it easier to build up aesthetic mappings
> programmatically - aes_string(x = names(data)[1], y = names(data)[c])
> 
> Does that help?
> 
> Hadley
> 
> On Fri, Feb 15, 2008 at 2:21 PM, Martin Rittner
> <[EMAIL PROTECTED]> wrote:
>> Hi everybody!
>>
>>  I'm trying to use ggplot2 to return a plot from a function (so I can add
>>  something or alter it then). Unfortunately, if I add a mapping to a
>>  layer in the function, the variable *name* is stored in the layer,
>>  rather than the variable's *value* - so that after the function returns
>>  the ggplot2-object, it doesn't plot because the variables don't exist in
>>  the environment calling the function, e.g.:
>>
>>  my function does something like:
>>
>>  getPlot<-function(da=NULL,...){
>>      #1st column holds x-values, others hold data series to plot...
>>      co<-as.character(names(da))
>>      co<-co[2:length(co)]
>>
>>      pl<-ggplot(data=da)
>>      pl<-pl+scale_y_log10()+scale_x_continuous()
>>      for(c in co){
>>          pl<-pl+geom_line(x=da[[1]],y=da[[c]],mapping=aes(x=da[[1]],y=da[[c]]))
>>      }
>>
>>      return(pl)
>>  }
>>
>>  I need to add every layer separately, because I want to be able to
>>  explicitly define attributes for every data series (colour, size... e.g.
>>  highlight only two specific out of 10 series...).
>>
>>  Anyway, my problem is this:
>>
>>  d<-data.frame(x=seq(0.0,1.0,length=5),y1=rnorm(5),y2=rnorm(5))
>>  p<-getPlot(da=d)
>>  p
>>
>>  returns with
>>
>>  Error in data.frame(..., check.names = FALSE) :
>>arguments imply differing number of rows: 0, 5
>>
>>  and the plot object contains:
>>
>>  Title:
>>  Labels:  x=, y=
>>  ---
>>  Data:x, y1, y2 [5x3]
>>  Mapping:
>>  Scales:  y,x -> y,x
>>  $margins
>>  [1] FALSE
>>
>>  $facets
>>  [1] ". ~ ."
>>
>>  ---
>>  geom_line: (colour=black, size=1, linetype=1, x=NA, y=NA) + (x=c(0,
>>  0.25, 0.5, 0.75, 1), y=c(0.180036717548597, -0.369556903134046,
>>  -0.924474152821948, -2.40773640658189, 0.801471591443009))
>>  stat_sort: (...=) + (x=c(0, 0.25, 0.5, 0.75, 1), y=c(0.180036717548597,
>>  -0.369556903134046, -0.924474152821948, -2.40773640658189,
>>  0.801471591443009))
>>  position_identity: ()
>>  mapping: (x=da[[1]], y=da[[c]])
>>
>>  geom_line: (colour=black, size=1, linetype=1, x=NA, y=NA) + (x=c(0,
>>  0.25, 0.5, 0.75, 1), y=c(-1.59744511956184, -0.9333541477049,
>>  1.88697835844878, 0.921829569181679, -0.741077741846118))
>>  stat_sort: (...=) + (x=c(0, 0.25, 0.5, 0.75, 1), y=c(-1.59744511956184,
>>  -0.9333541477049, 1.88697835844878, 0.921829569181679, -0.741077741846118))
>>  position_identity: ()
>>  mapping: (x=da[[1]], y=da[[c]])
>>
>>  Note the mappings, they refer to "da" and "c" (defined in the function)
>>  which are not available in the calling environment. Any Idea how I can
>>  avoid the problem/paste the actual values in, like it did for the
>>  geometry and the statistics?
>>
>>  Thanks, Martin
>>
>>
> 
> 
>



[R] ggplot2 used in a function - variable scope/environment

2008-02-15 Thread Martin Rittner
Hi everybody!

I'm trying to use ggplot2 to return a plot from a function (so I can add 
something or alter it then). Unfortunately, if I add a mapping to a 
layer in the function, the variable *name* is stored in the layer, 
rather than the variable's *value* - so that after the function returns 
the ggplot2-object, it doesn't plot because the variables don't exist in 
the environment calling the function, e.g.:

my function does something like:

getPlot<-function(da=NULL,...){
    #1st column holds x-values, others hold data series to plot...
    co<-as.character(names(da))
    co<-co[2:length(co)]

    pl<-ggplot(data=da)
    pl<-pl+scale_y_log10()+scale_x_continuous()
    for(c in co){
        pl<-pl+geom_line(x=da[[1]],y=da[[c]],mapping=aes(x=da[[1]],y=da[[c]]))
    }

    return(pl)
}

I need to add every layer separately, because I want to be able to 
explicitly define attributes for every data series (colour, size... e.g. 
highlight only two specific out of 10 series...).

Anyway, my problem is this:

d<-data.frame(x=seq(0.0,1.0,length=5),y1=rnorm(5),y2=rnorm(5))
p<-getPlot(da=d)
p

returns with

Error in data.frame(..., check.names = FALSE) :
   arguments imply differing number of rows: 0, 5

and the plot object contains:

Title:
Labels:  x=, y=
---
Data:x, y1, y2 [5x3]
Mapping:
Scales:  y,x -> y,x
$margins
[1] FALSE

$facets
[1] ". ~ ."

---
geom_line: (colour=black, size=1, linetype=1, x=NA, y=NA) + (x=c(0, 
0.25, 0.5, 0.75, 1), y=c(0.180036717548597, -0.369556903134046, 
-0.924474152821948, -2.40773640658189, 0.801471591443009))
stat_sort: (...=) + (x=c(0, 0.25, 0.5, 0.75, 1), y=c(0.180036717548597, 
-0.369556903134046, -0.924474152821948, -2.40773640658189, 
0.801471591443009))
position_identity: ()
mapping: (x=da[[1]], y=da[[c]])

geom_line: (colour=black, size=1, linetype=1, x=NA, y=NA) + (x=c(0, 
0.25, 0.5, 0.75, 1), y=c(-1.59744511956184, -0.9333541477049, 
1.88697835844878, 0.921829569181679, -0.741077741846118))
stat_sort: (...=) + (x=c(0, 0.25, 0.5, 0.75, 1), y=c(-1.59744511956184, 
-0.9333541477049, 1.88697835844878, 0.921829569181679, -0.741077741846118))
position_identity: ()
mapping: (x=da[[1]], y=da[[c]])

Note the mappings, they refer to "da" and "c" (defined in the function) 
which are not available in the calling environment. Any Idea how I can 
avoid the problem/paste the actual values in, like it did for the 
geometry and the statistics?

Thanks, Martin
