[R] extend limited dimension in netcdf

2017-09-07 Thread raphael.felber
Dear all

I have to combine 3D netCDF files (lon, lat, time). The files each contain one
month of data, and I need a year file containing all the data. Because the
attributes of all files are the same, I copied the first file and appended the
data of the other months. This worked well until the provider of the data changed
the time dimension from UNLIMITED to fixed (limited). Is there a way to change the
time dimension back to UNLIMITED?

I tried

ncnew$dim[[3]]$unlim <- TRUE

but this has no effect.
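
That flag only changes the R object, not the file on disk, so the file has to be
rewritten. A rough, untested sketch of one way to do it with ncdf4 (the file names,
the variable name "data" and the fill value -999 are placeholders): define a new
output file whose time dimension is created with unlim = TRUE and copy the monthly
slabs into it at the right offsets.

library(ncdf4)

files <- sprintf("month%02d.nc", 1:12)         # placeholder monthly file names
nc1   <- nc_open(files[1])

lon <- ncvar_get(nc1, "lon")
lat <- ncvar_get(nc1, "lat")
tim <- unlist(lapply(files, function(f) {      # collect the full year's time axis
  nc <- nc_open(f); on.exit(nc_close(nc))
  ncvar_get(nc, "time")
}))

londim <- ncdim_def("lon",  nc1$dim[["lon"]]$units,  lon)
latdim <- ncdim_def("lat",  nc1$dim[["lat"]]$units,  lat)
timdim <- ncdim_def("time", nc1$dim[["time"]]$units, tim, unlim = TRUE)  # UNLIMITED

var   <- ncvar_def("data", nc1$var[["data"]]$units, list(londim, latdim, timdim), -999)
ncout <- nc_create("year.nc", list(var))

start <- 1
for (f in files) {                             # append each month along the time axis
  nc <- nc_open(f)
  nt <- nc$dim[["time"]]$len
  ncvar_put(ncout, var, ncvar_get(nc, "data"),
            start = c(1, 1, start), count = c(length(lon), length(lat), nt))
  start <- start + nt
  nc_close(nc)
}
nc_close(nc1)
nc_close(ncout)

Global and per-variable attributes from the monthly files would still have to be
carried over, e.g. with ncatt_get() and ncatt_put().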

Thanks for any help.

Kind regards

Raphi


Raphael Felber, Dr. sc.
Scientific Officer, Climate & Air Pollution

Federal Department of Economic Affairs,
Education and Research EAER
Agroscope
Research Division, Agroecology and Environment

Reckenholzstrasse 191, 8046 Zürich
Phone 058 468 75 11
Fax 058 468 72 01
raphael.fel...@agroscope.admin.ch
www.agroscope.ch



Re: [R] How to benchmark speed of load/readRDS correctly

2017-08-23 Thread raphael.felber
Hi there

Thanks for your answers. I didn't expect that this would be so complex. Honestly, I
don't understand everything you wrote, since I'm not an IT specialist. But I had read
that reading *.rds files is faster than loading *.Rdata files, and I wanted to verify
that for my system and R version. Thanks anyway for your time.

Cheers Raphael


> -----Original Message-----
> From: Jeff Newmiller [mailto:jdnew...@dcn.davis.ca.us]
> Sent: Tuesday, 22 August 2017 18:33
> To: J C Nash ; r-help@r-project.org; Felber Raphael
> Agroscope 
> Subject: Re: [R] How to benchmark speed of load/readRDS correctly
> 
> Caching happens, both within the operating system and within the C
> standard library. Ostensibly the intent for those caches is to help
> performance, but you are right that different low-level caching algorithms
> can be a poor match for specific application level use cases such as copying
> files or parsing text syntax. However, the OS and even the specific file
> system drivers (e.g. ext4 on flash disk or FAT32 on magnetic media) can
> behave quite differently for the same application level use case, so a generic
> discussion at the R language level (this mailing list) can be almost 
> impossible
> to sort out intelligently.
> --
> Sent from my phone. Please excuse my brevity.
> 
> On August 22, 2017 7:11:39 AM PDT, J C Nash 
> wrote:
> >Not convinced Jeff is completely right about this not concerning R,
> >since I've found that the application language (R, perl, etc.) makes a
> >difference in how files are accessed via the OS. He is certainly correct
> >that the OS (and its versions) are where the actual reading and writing
> >happen, but sometimes the calls to them can be inefficient. (Sorry,
> >I've not got examples specifically for file reads, but I had a case in
> >computation where there was an 800%, i.e. 8-fold, difference in
> >timing with R, which rather took my breath away. That's probably been
> >sorted now.) The difficulty in making general statements is that a
> >rather full set of comparisons over different commands, datasets, OS
> >and version variants is needed before the general picture can emerge.
> >Using microbenchmark when you need to find the bottlenecks is how I'd
> >proceed, which the OP is doing.
> >
> >About 30 years ago, I did write up some preliminary work, never
> >published, on estimating the two halves of a copy, that is, the reading
> >from file and storing to "memory" or a different storage location. This
> >was via regression with a singular design matrix, but one can get a
> >minimal length least squares solution via svd. Possibly relevant today
> >to try to get at slow links on a network.
> >
> >JN
> >
> >On 2017-08-22 09:07 AM, Jeff Newmiller wrote:
> >> You need to study how reading files works in your operating system.
> >This question is not about R.
> >>

Re: [R] How to benchmark speed of load/readRDS correctly

2017-08-23 Thread raphael.felber
Hi Bill

Thanks for your answer and the explanations. I tried using garbage collection
but I'm still not satisfied with the result. Maybe the question was not stated
clearly enough: I want to test the speed of reading/loading data into R when a
'fresh' R session is started (or even after a restart of the computer).

To understand what really happens, I tried:

r1 <- sapply(1:1, function(x) {
  gc()
  t <- system.time(n <- readRDS('file.Rdata'))[3]
  rm(n); gc()
  return(t)
})

and found a similar behaviour to yours; now and then the time is much larger, but
the times are not as stable as in your example. The highest values are up to 50
times larger than most of the other times (8 sec vs 0.15 sec), even with garbage
collection. I assume that with the code above the time spent on garbage collection
isn't measured.

However, the first iteration always takes the longest. I'm wondering if I 
should take the first value as best guess.
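
For a genuinely cold measurement, one option (a sketch, untested) is to run each
repetition in a brand-new R process via Rscript, so nothing cached at the R level
from earlier iterations can help; the operating system's file cache can still warm
the file up, so only the very first run after a reboot is fully cold. The helper
name time_cold_read, the file name and the number of repetitions are placeholders,
and Rscript is assumed to be on the PATH.

# Sketch: time readRDS() in a fresh R session for each repetition.
time_cold_read <- function(path, reps = 5) {
  script <- tempfile(fileext = ".R")
  # the spawned session prints only the elapsed time
  writeLines(sprintf("cat(system.time(readRDS('%s'))['elapsed'], '\\n')", path), script)
  vapply(seq_len(reps), function(i) {
    out <- system2("Rscript", script, stdout = TRUE)  # new R process every call
    as.numeric(tail(out, 1))                          # last output line is the timing
  }, numeric(1))
}

# Hypothetical usage: time_cold_read("file.rds")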

Cheers Raphael
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Tuesday, 22 August 2017 19:13
To: Felber Raphael Agroscope 
Cc: r-help@r-project.org
Subject: Re: [R] How to benchmark speed of load/readRDS correctly

Note that if you force a garbage collection each iteration the times are more 
stable.  However, on the average it is faster to let the garbage collector 
decide when to leap into action.

mb_gc <- microbenchmark::microbenchmark(
  gc(),
  { x <- as.list(sin(1:5e5)); x <- unlist(x) / cos(1:5e5); sum(x) },
  times = 1000, control = list(order = "inorder"))
with(mb_gc, plot(time[expr != "gc()"]))
with(mb_gc, quantile(1e-6 * time[expr != "gc()"], c(0, .5, .75, .9, .95, .99, 1)))
#       0%      50%      75%      90%      95%      99%      100%
# 59.33450 61.33954 63.43457 66.23331 68.93746 74.45629 158.09799



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Aug 22, 2017 at 9:26 AM, William Dunlap 
mailto:wdun...@tibco.com>> wrote:
The large value for maximum time may be due to garbage collection, which 
happens periodically.   E.g., try the following, where the unlist(as.list()) 
creates a lot of garbage.  I get a very large time every 102 or 51 iterations 
and a moderately large time more often.

mb <- microbenchmark::microbenchmark(
  { x <- as.list(sin(1:5e5)); x <- unlist(x) / cos(1:5e5); sum(x) },
  times = 1000)
plot(mb$time)
quantile(mb$time * 1e-6, c(0, .5, .75, .90, .95, .99, 1))
#        0%       50%       75%       90%       95%       99%      100%
#  59.04446  82.15453 102.17522 180.36986 187.52667 233.42062 249.33970
diff(which(mb$time > quantile(mb$time, .99)))
# [1] 102  51 102 102 102 102 102 102  51
diff(which(mb$time > quantile(mb$time, .95)))
# [1]  6 41  4 47  4 40  7  4 47  4 33 14  4 47  4 47  4 47  4 47  4 47  4  6 41
#[26]  4  6  7  9 25  4 47  4 47  4 47  4 22 25  4 33 14  4  6 41  4 47  4 22



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Aug 22, 2017 at 5:53 AM, <raphael.fel...@agroscope.admin.ch> wrote:
Dear all

I was thinking about efficiently reading data into R and tried several ways to
test whether load('file.Rdata') or readRDS('file.rds') is faster. The files
file.Rdata and file.rds contain the same data, the first created with
save(d, file = 'file.Rdata', compress = FALSE) and the second with
saveRDS(d, 'file.rds', compress = FALSE).

First I used the function microbenchmark() and was astonished by the max value
of the output.

FIRST TEST:
> library(microbenchmark)
> microbenchmark(
+   n <- readRDS('file.rds'),
+   load('file.Rdata')
+ )
Unit: milliseconds
              expr      min       lq     mean   median       uq       max neval
 n <- readRDS(fl1) 106.5956 109.6457 237.3844 117.8956 141.9921 10934.162   100
         load(fl2) 295.0654 301.8162 335.6266 308.3757 319.6965  1915.706   100

It looks like the max value is an outlier.

So I tried:
SECOND TEST:
> sapply(1:10, function(x) system.time(n <- readRDS('file.rds'))[3])
elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed
  10.50    0.11    0.11    0.11    0.10    0.11    0.11    0.11    0.12    0.12
> sapply(1:10, function(x) system.time(load('file.Rdata'))[3])
elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed
   1.86    0.29    0.31    0.30    0.30    0.31    0.30    0.29    0.31    0.30

[R] How to benchmark speed of load/readRDS correctly

2017-08-22 Thread raphael.felber
Dear all

I was thinking about efficiently reading data into R and tried several ways to
test whether load('file.Rdata') or readRDS('file.rds') is faster. The files
file.Rdata and file.rds contain the same data, the first created with
save(d, file = 'file.Rdata', compress = FALSE) and the second with
saveRDS(d, 'file.rds', compress = FALSE).

First I used the function microbenchmark() and was astonished by the max value
of the output.

FIRST TEST:
> library(microbenchmark)
> microbenchmark(
+   n <- readRDS('file.rds'),
+   load('file.Rdata')
+ )
Unit: milliseconds
              expr      min       lq     mean   median       uq       max neval
 n <- readRDS(fl1) 106.5956 109.6457 237.3844 117.8956 141.9921 10934.162   100
         load(fl2) 295.0654 301.8162 335.6266 308.3757 319.6965  1915.706   100

It looks like the max value is an outlier.

So I tried:
SECOND TEST:
> sapply(1:10, function(x) system.time(n <- readRDS('file.rds'))[3])
elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed
  10.50    0.11    0.11    0.11    0.10    0.11    0.11    0.11    0.12    0.12
> sapply(1:10, function(x) system.time(load('file.Rdata'))[3])
elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed elapsed
   1.86    0.29    0.31    0.30    0.30    0.31    0.30    0.29    0.31    0.30

Which confirmed my suspicion: the first time the data are loaded takes much longer
than the following times. I suspect this has something to do with how the data are
assigned and that R doesn't have to 'fully' read the data if it is read a second
time.

So the question remains: how can I make a realistic benchmark test? From the first
test I would conclude that reading the *.rds file is faster. But this holds only
for a large number of neval. If I set times = 1 then loading the *.Rdata file would
be faster (as also indicated by the second test).

Thanks for any help or comments.

Kind regards

Raphael

Raphael Felber, PhD
Scientific Officer, Climate & Air Pollution

Federal Department of Economic Affairs,
Education and Research EAER
Agroscope
Research Division, Agroecology and Environment

Reckenholzstrasse 191, CH-8046 Zürich
Phone +41 58 468 75 11
Fax +41 58 468 72 01
raphael.fel...@agroscope.admin.ch
www.agroscope.ch



Re: [R] ncdf4: Why are NAs converted to _FillValue when saving?

2017-08-15 Thread raphael.felber
Dear Dave

Thanks a lot for your answer. I agree that it is more an R issue than a package
issue, but it's the first time I have encountered such a problem.

For my R version (3.4.1) on x86_64-w64-mingw32, the second part of your answer
only holds for data_temp2: if I do any manipulation to data_temp2 before using
ncvar_put(…, data_temp), then data_temp2 remains unchanged. However, this doesn't
hold for data_temp; after using ncvar_put(…, data_temp), the NAs in data_temp are
converted to the _FillValue (-999.99). For clarification I added two examples
below.

Regards

Raphael


Examples:

> # *
> # without data manipulation
> # *
>
> # copy data
> data_temp2 <- data_temp
>
> # show what we have
> data_temp[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> # write to netCDF connection
> ncvar_put( ncid_new, var_temp, data_temp )
>
> # show what we have
> data_temp[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]  -999.9900  -999.9900  -999.990 0.03887696 0.04786269
[2,]  -999.9900  -999.9900  -999.990 0.07736548 0.09524715
[3,]  -999.9900  -999.9900  -999.990 0.11508099 0.14167993
[4,]  -999.9900  -999.9900  -999.990 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]  -999.9900  -999.9900  -999.990 0.03887696 0.04786269
[2,]  -999.9900  -999.9900  -999.990 0.07736548 0.09524715
[3,]  -999.9900  -999.9900  -999.990 0.11508099 0.14167993
[4,]  -999.9900  -999.9900  -999.990 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885




> # *
> # with data manipulation
> # *
>
> # show what we have
> data_temp[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> # do some manipulations
> data_temp <- data_temp * 1.0
> data_temp2 <- data_temp2 * 1.0
>
> # write to netCDF connection
> ncvar_put( ncid_new, var_temp, data_temp )
>
> # show what we have
> data_temp[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]  -999.9900  -999.9900  -999.990 0.03887696 0.04786269
[2,]  -999.9900  -999.9900  -999.990 0.07736548 0.09524715
[3,]  -999.9900  -999.9900  -999.990 0.11508099 0.14167993
[4,]  -999.9900  -999.9900  -999.990 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
>
> # *
> # RESULT
> # with manipulation of data_temp2 the variable is copied and NAs remain NAs
> # but manipulation of data_temp doesn't help
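
A possible guard, sketched (untested) under the assumption suggested by the
examples above, namely that ncvar_put() replaces NAs with the _FillValue in place
in whatever array it is handed: pass a throwaway copy, so only the temporary
object is clobbered and data_temp keeps its NAs.

vals <- data_temp + 0                    # arithmetic forces a fresh array, no shared memory
ncvar_put( ncid_new, var_temp, vals )    # NAs in 'vals' may become -999.99, but vals is discarded
rm(vals)
data_temp[1:5, 1:5, 1]                   # still contains NA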

From: davidwilliampie...@gmail.com [mailto:davidwilliampie...@gmail.com] On behalf of David W. Pierce
Sent: Monday, 14 August 2017 17:29
To: Felber Raphael Agroscope 
Cc: r-help@r-project.org
Subject: Re: [R] ncdf4: Why are NAs converted to _FillValue when saving?

On Mon, Aug 14, 2017 at 5:29 AM, <raphael.fel...@agroscope.admin.ch> wrote:

Dear all

I'm a newbie regarding netcdf data. Today I 

[R] ncdf4: Why are NAs converted to _FillValue when saving?

2017-08-14 Thread raphael.felber
Dear all

I'm a newbie regarding netCDF data. Today I realized that I may not understand some
basics of netCDF. I want to create a *.nc file containing three variables for
Switzerland; all data outside the country are NAs. The third variable is calculated
from the first two. Basically there is no problem doing that: I copy the file with
the data of the first variable, open this copy with write=TRUE (nc1 <- nc_open()),
read the data into 'var1', open the other file (nc2 <- nc_open()), read its data
into 'var2', put this variable into the file (nc1) and calculate the third variable
from var1 and var2.

So far everything is fine. But I noticed that when I write the data 'var2' to nc1,
all NAs in this variable are converted to the _FillValue. Clearly, I expect all NAs
to be converted to the _FillValue in the file, but I do not expect the NAs in 'var2'
(i.e. the data available in the R console) to be changed as well. Since I use this
data for further calculations, the NAs should remain.

Is that a bug or intended? Below you find a minimal example (adapted from the code
in the ncdf4 manual) of the – in my eyes – strange behaviour.

Thanks for any explanation.

Kind regards

Raphael





Minimal working example (adapted from netcdf4 manual):

library(ncdf4)
#
# Make dimensions
#
xvals <- 1:360
yvals <- -90:90
nx <- length(xvals)
ny <- length(yvals)
xdim <- ncdim_def('Lon','degreesE', xvals )
ydim <- ncdim_def('Lat','degreesE', yvals )
tdim <- ncdim_def('Time','days since 1900-01-01', 0, unlim=TRUE )
#-
# Make var
#-
mv <- 1.e30 # missing value
var_temp <- ncvar_def('Temperature','K', list(xdim,ydim,tdim), mv )
#-
# Make new output file
#-
output_fname <-'test_real3d.nc'
ncid_new <- nc_create( output_fname, list(var_temp))
#---
# Put some test data in the file
#---
data_temp <- array(0.,dim=c(nx,ny,1))
for( j in 1:ny )
for( i in 1:nx )
data_temp[i,j,1] <- sin(i/10)*sin(j/10)

# add some NAs
data_temp[1:10, 1:5, 1] <- NA

# copy data
data_temp2 <- data_temp

# show what we have
data_temp[1:12, 1:7, 1]
data_temp2[1:12, 1:7, 1]

# write to netCDF connection
ncvar_put( ncid_new, var_temp, data_temp, start=c(1,1,1), count=c(nx,ny,1))

# show what we have now
data_temp[1:12, 1:7, 1]
data_temp2[1:12, 1:7, 1]

# Why are there no more NAs in data_temp?  ncvar_put changed NAs to 
_FillValue-value
# But why are the NAs in data_temp2 also changed to _FillValue?
#--
# Close
#--
nc_close( ncid_new )


Raphael Felber, Dr. sc.
Scientific Officer, Climate & Air Pollution

Federal Department of Economic Affairs,
Education and Research EAER
Agroscope
Research Division, Agroecology and Environment

Reckenholzstrasse 191, 8046 Zürich
Phone 058 468 75 11
Fax 058 468 72 01
raphael.fel...@agroscope.admin.ch
www.agroscope.ch



Re: [R] Remove attribute from netcdf4 object

2017-08-02 Thread raphael.felber
Hi Marc

That's a workaround I can use, thanks. I'm a newbie regarding netCDF data. Is
there any information I lose when switching between the packages?
Raphael

From: Marc Girondot [mailto:marc.giron...@u-psud.fr]
Sent: Wednesday, 2 August 2017 15:13
To: Felber Raphael Agroscope 
Subject: Re: AW: [R] Remove attribute from netcdf4 object

OK. Sorry, I didn't understand correctly.
I don't think you can do it with the ncdf4 functions. The only solution would be to
open the file in RNetCDF, delete the attribute, save it and then open it in ncdf4.
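
A minimal sketch of that route (untested; the file name and the attribute name
"history" are only examples):

library(RNetCDF)
nc <- open.nc("combined.nc", write = TRUE)   # open the file writable
att.delete.nc(nc, "NC_GLOBAL", "history")    # remove a global attribute by name
close.nc(nc)
# afterwards the file can be opened again with ncdf4::nc_open()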

Marc

On 02/08/2017 at 15:02, raphael.fel...@agroscope.admin.ch wrote:
Dear Marc

Thanks for your remark. I don't want to use both packages. I mentioned the
package RNetCDF to show that there is a similar function I'd like to use.

Raphael

From: Marc Girondot [mailto:marc.giron...@u-psud.fr]
Sent: Wednesday, 2 August 2017 14:51
To: Felber Raphael Agroscope ; r-help@r-project.org
Subject: Re: [R] Remove attribute from netcdf4 object

On 02/08/2017 at 12:03, raphael.fel...@agroscope.admin.ch wrote:

Dear all



For a model I need to combine several netCDF files into one (which works fine).
For a better overview I'd like to delete/remove some of the attributes. Is there
a simple way of doing this?

I'm using the package ncdf4, which creates an object of class(nc) = "ncdf4".
It seems that for earlier netCDF objects there was the function
att.delete.nc{RNetCDF}, but this function returns the following error when
applied to ncdf4 objects:

Error: class(ncfile) == "NetCDF" is not TRUE

You should use package ncdf4 or package RNetCDF, but not mix both.
Marc







Thanks a lot for any help.

Kind regards

Raphael Felber


Raphael Felber, Dr. sc.
Scientific Officer, Climate & Air Pollution

Federal Department of Economic Affairs,
Education and Research EAER
Agroscope
Research Division, Agroecology and Environment

Reckenholzstrasse 191, 8046 Zürich
Phone 058 468 75 11
Fax 058 468 72 01
raphael.fel...@agroscope.admin.ch
www.agroscope.ch



--
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11, UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot

Re: [R] Remove attribute from netcdf4 object

2017-08-02 Thread raphael.felber
Dear Marc

Thanks for your remark. I don't want to use both packages. I mentioned the 
package RNetCDF to show that there is a similar function I' d like to use.

Raphael

From: Marc Girondot [mailto:marc.giron...@u-psud.fr]
Sent: Wednesday, 2 August 2017 14:51
To: Felber Raphael Agroscope ; r-help@r-project.org
Subject: Re: [R] Remove attribute from netcdf4 object

On 02/08/2017 at 12:03, raphael.fel...@agroscope.admin.ch wrote:

Dear all



For a model I need to combine several netCDF files into one (which works fine).
For a better overview I'd like to delete/remove some of the attributes. Is there
a simple way of doing this?

I'm using the package ncdf4, which creates an object of class(nc) = "ncdf4".
It seems that for earlier netCDF objects there was the function
att.delete.nc{RNetCDF}, but this function returns the following error when
applied to ncdf4 objects:

Error: class(ncfile) == "NetCDF" is not TRUE

You should use package ncdf4 or package RNetCDF, but not mix both.
Marc






Thanks a lot for any help.

Kind regards

Raphael Felber


Raphael Felber, Dr. sc.
Scientific Officer, Climate & Air Pollution

Federal Department of Economic Affairs,
Education and Research EAER
Agroscope
Research Division, Agroecology and Environment

Reckenholzstrasse 191, 8046 Zürich
Phone 058 468 75 11
Fax 058 468 72 01
raphael.fel...@agroscope.admin.ch
www.agroscope.ch



--
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11, UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot


[R] Remove attribute from netcdf4 object

2017-08-02 Thread raphael.felber
Dear all

For a model I need to combine several netCDF files into one (which works fine).
For a better overview I'd like to delete/remove some of the attributes. Is there
a simple way of doing this?

I'm using the package ncdf4, which creates an object of class(nc) = "ncdf4".
It seems that for earlier netCDF objects there was the function
att.delete.nc{RNetCDF}, but this function returns the following error when
applied to ncdf4 objects:
Error: class(ncfile) == "NetCDF" is not TRUE

Thanks a lot for any help.

Kind regards

Raphael Felber


Raphael Felber, Dr. sc.
Scientific Officer, Climate & Air Pollution

Federal Department of Economic Affairs,
Education and Research EAER
Agroscope
Research Division, Agroecology and Environment

Reckenholzstrasse 191, 8046 Zürich
Phone 058 468 75 11
Fax 058 468 72 01
raphael.fel...@agroscope.admin.ch
www.agroscope.ch



Re: [R] Problem with POSIXt time zone

2014-02-14 Thread raphael.felber
Thanks a lot for the fast answer. Sys.setenv(TZ="GMT") is a good solution for 
me.
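
For the record, an alternative that leaves the TZ environment variable alone
(using the timestamps from the original post): combine first and then reattach the
"tzone" attribute, since c() only drops the display time zone while the stored
instants are unchanged.

t1 <- as.POSIXct("01.01.2013 20:00:00", format = "%d.%m.%Y %H:%M:%S", tz = "GMT")
t2 <- as.POSIXct("02.01.2013 01:00:00", format = "%d.%m.%Y %H:%M:%S", tz = "GMT")
x <- c(t1, t2)              # prints in the local zone because the tzone attribute is dropped
attr(x, "tzone") <- "GMT"   # reattach it; the underlying seconds are not shifted
x
# [1] "2013-01-01 20:00:00 GMT" "2013-01-02 01:00:00 GMT"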

Best regards

Raphael

-----Original Message-----
From: Frede Aakmann Tøgersen [mailto:fr...@vestas.com]
Sent: Friday, 14 February 2014 10:02
To: Felber Raphael Agroscope; r-help@r-project.org
Subject: RE: Problem with POSIXt time zone

Hi Raphael

Bug or not, that's for others to say. Here is an explanation and a workaround.

This is on my Windows 8 laptop and R-3.0.2:


## strptime gives you POSIXlt objects (see ?DateTimeClasses or ?strptime)
> t1 <- strptime(paste("01.01.2013", "20:00:00"), format="%d.%m.%Y %H:%M:%S", tz="GMT")
> t2 <- strptime(paste("02.01.2013", "01:00:00"), format="%d.%m.%Y %H:%M:%S", tz="GMT")

## c is probably doing some conversions. It does that for other classes of 
objects so be aware!
> c(t1, t2)
[1] "2013-01-01 21:00:00 CET" "2013-01-02 02:00:00 CET"

## see which methods exist for c()
> methods(c)
[1] c.bibentry*   c.Datec.noquote c.numeric_version
[5] c.person* c.POSIXct c.POSIXlt

   Non-visible functions are asterisked

## since strptime gives POSIXlt this is called
> c.POSIXlt
function (..., recursive = FALSE)
as.POSIXlt(do.call("c", lapply(list(...), as.POSIXct)))



## c.POSIXlt converts to POSIXct,
## so c.POSIXct is called, which does an unclass whereby the time zone information
## is lost; the unclass gives us the number of seconds since 1 January 1970
> c.POSIXct
function (..., recursive = FALSE)
.POSIXct(c(unlist(lapply(list(...), unclass



## lastly, c.POSIXct calls .POSIXct with tz set to NULL by default. Is this a bug?
## Since tz=NULL by default, the time zone information is now read from the locale
## on your computer (which depends on what the OS has set; on our Linux HPC it is
## set to UTC)
> .POSIXct
function (xx, tz = NULL)
structure(xx, class = c("POSIXct", "POSIXt"), tzone = tz)



## However, when I am dealing with data with timestamps, those are usually saved
## in the "UTC" = "GMT" zone, because we get data from all over the world
## and we do not want to deal with daylight saving time. So do this to begin
## with in the R session:
> Sys.setenv(TZ="GMT")

## and then we have:
> c(t1, t2)
[1] "2013-01-01 20:00:00 GMT" "2013-01-02 01:00:00 GMT"
>

Yours sincerely / Med venlig hilsen


Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling

Technology & Service Solutions
T +45 9730 5135
M +45 2547 6050
fr...@vestas.com
http://www.vestas.com

Company reg. name: Vestas Wind Systems A/S This e-mail is subject to our e-mail 
disclaimer statement.
Please refer to www.vestas.com/legal/notice If you have received this e-mail in 
error please contact the sender. 


> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org]
> On Behalf Of raphael.fel...@agroscope.admin.ch
> Sent: 14. februar 2014 09:24
> To: r-help@r-project.org
> Subject: [R] Problem with POSIXt time zone
> 
> Hello
> 
> I have to convert character strings into POSIXt format and would like
> to combine two of them. The following code does not do what I expect: the
> single conversions of the character strings give the date and time
> with time zone "GMT", as I expect, but if I combine two date-times
> with c() the time zone is changed to CET.
> 
> > strptime(paste("01.01.2013", "20:00:00"),format="%d.%m.%Y %H:%M:%S",
> tz="GMT")
> [1] "2013-01-01 20:00:00 GMT"
> > strptime(paste("02.01.2013", "01:00:00"),format="%d.%m.%Y %H:%M:%S",
> tz="GMT")
> [1] "2013-01-02 01:00:00 GMT"
> > c(strptime(paste("01.01.2013", "20:00:00"),format="%d.%m.%Y
> %H:%M:%S", tz="GMT"),
> +   strptime(paste("02.01.2013", "01:00:00"),format="%d.%m.%Y
> %H:%M:%S", tz="GMT"))
> [1] "2013-01-01 21:00:00 CET" "2013-01-02 02:00:00 CET"
> 
> Is that a bug? How can I solve this problem? I really need the time in
> the time zone "GMT", otherwise I run into trouble when the clocks change to
> summer time.
> 
> Thanks for any help.
> 
> Kind regards
> 
> Raphael Felber
> PhD Student
> 
> Federal Department of Economic Affairs,
> Education and Research EAER
> Agroscope
> Institute for Sustainability Sciences INH, Climate and Air Pollution
>
> Reckenholzstrasse 191, CH-8046 Zürich
> Phone +41 44 377 75 11
> Fax +41 44 377 72 01
> raphael.fel...@agroscope.admin.ch
> www.agroscope.ch
> 
> 


[R] Problem with POSIXt time zone

2014-02-14 Thread raphael.felber
Hello

I have to convert character strings into POSIXt format and would like to
combine two of them. The following code does not do what I expect: the single
conversions of the character strings give the date and time with time zone
"GMT", as I expect, but if I combine two date-times with c() the time zone is
changed to CET.

> strptime(paste("01.01.2013", "20:00:00"),format="%d.%m.%Y %H:%M:%S", tz="GMT")
[1] "2013-01-01 20:00:00 GMT"
> strptime(paste("02.01.2013", "01:00:00"),format="%d.%m.%Y %H:%M:%S", tz="GMT")
[1] "2013-01-02 01:00:00 GMT"
> c(strptime(paste("01.01.2013", "20:00:00"), format="%d.%m.%Y %H:%M:%S", tz="GMT"),
+   strptime(paste("02.01.2013", "01:00:00"), format="%d.%m.%Y %H:%M:%S", tz="GMT"))
[1] "2013-01-01 21:00:00 CET" "2013-01-02 02:00:00 CET"

Is that a bug? How can I solve this problem? I really need the time in the time
zone "GMT", otherwise I run into trouble when the clocks change to summer time.

Thanks for any help.

Kind regards

Raphael Felber
PhD Student

Federal Department of Economic Affairs,
Education and Research EAER
Agroscope
Institute for Sustainability Sciences INH
Climate and Air Pollution

Reckenholzstrasse 191, CH-8046 Zürich
Phone +41 44 377 75 11
Fax +41 44 377 72 01
raphael.fel...@agroscope.admin.ch
www.agroscope.ch




[R] remove NA in df results in NA, NA.1 ... rows

2012-12-13 Thread raphael.felber
Good morning!

I have the following data frame (df):

    X.outer  Y.outer   X.PAD1   Y.PAD1   X.PAD2 Y.PAD2   X.PAD3 Y.PAD3   X.PAD4 Y.PAD4
73 574690.0 179740.0 574690.2 179740.0 574618.3 179650 574729.2 179674 574747.1 179598
74 574680.6 179737.0 574693.4 179740.0 574719.0 179688 574831.8 179699 574724.9 179673
75 574671.0 179734.0 574696.2 179740.0 574719.0 179688 574807.8 179787 574729.2 179674
76 574663.6 179736.0 574699.1 179734.0 574723.5 179678 574703.4 179760 574831.8 179699
77 574649.9 179734.0 574704.7 179724.0 574724.9 179673 574702.4 179755 574852.3 179626
78 574647.3 179742.0 574706.9 179719.0 574747.1 179598 574702.0 179754 574747.1 179598
79 574633.6 179739.0 574711.4 179710.0 574641.8 179570 574698.0 179747       NA     NA
80 574634.9 179732.0 574716.6 179698.0 574639.6 179573 574700.2 179738       NA     NA
81 574616.5 179728.6 574716.7 179695.0 574618.3 179650 574704.4 179729       NA     NA
82 574615.4 179731.0 574718.2 179690.0       NA     NA 574708.1 179724       NA     NA
83 574614.4 179733.6 574719.1 179688.0       NA     NA 574709.3 179720       NA     NA
...
44 574702.0 179754.0       NA       NA       NA     NA       NA     NA       NA     NA
45 574695.1 179751.0       NA       NA       NA     NA       NA     NA       NA     NA
46 574694.4 179752.0       NA       NA       NA     NA       NA     NA       NA     NA

Which I subset to

df2 <- df[, c("X.PAD2", "Y.PAD2")]

df2
     X.PAD2 Y.PAD2
73 574618.3 179650
74 574719.0 179688
75 574719.0 179688
76 574723.5 179678
77 574724.9 179673
78 574747.1 179598
79 574641.8 179570
80 574639.6 179573
81 574618.3 179650
82       NA     NA
83       NA     NA
...
44       NA     NA
45       NA     NA
46       NA     NA





followed by removing the NA's using

df2 <- df2[!is.na(df2), ]

If I now call df2, I get:

       X.PAD2 Y.PAD2
73   574618.3 179650
74   574719.0 179688
75   574719.0 179688
76   574723.5 179678
77   574724.9 179673
78   574747.1 179598
79   574641.8 179570
80   574639.6 179573
81   574618.3 179650
NA         NA     NA
NA.1       NA     NA
NA.2       NA     NA
NA.3       NA     NA
NA.4       NA     NA
NA.5       NA     NA
NA.6       NA     NA
NA.7       NA     NA
NA.8       NA     NA



It seems there are still NA's in my data frame. How can I get rid of them? What 
is the meaning of the rows numbered NA, NA.1 and so on?
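
For what it's worth, a short note on what is happening and a base-R fix:
!is.na(df2) is a logical matrix with one column per variable, so using it as a row
index amounts to indexing with a logical vector of length 2*nrow(df2); the TRUE
positions beyond nrow(df2) produce the NA, NA.1, ... rows. Indexing with a logical
vector that is TRUE only for complete rows avoids this:

df2 <- df2[complete.cases(df2), ]   # keep rows with no NA in any column
# or, equivalently
df2 <- na.omit(df2)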



Thanks for any hints.



Best regards



Raphael Felber




[R] What is wrong with this plotting?

2012-01-06 Thread raphael.felber
Hello

I have a data frame, called input, like this:

              DateTime CO2_A1cont
1  2011-04-08 11:47:01         NA
2  2011-04-08 12:42:01        8.9
3  2011-04-08 13:07:01         NA
4  2011-04-08 13:32:01         NA
5  2011-04-08 13:57:01   7.556482
6  2011-04-08 14:22:01         NA
...
57 2011-04-09 16:52:01   4.961558

And I'd like to plot this series with plot(), with connected lines and no
interruption by the NAs.
I found that this code works:

y <- input[,2]
times <- DateTime

plot(y ~ as.POSIXct(times, format="%d.%m. %H:%M"), type="l",
     data=na.omit(data.frame(y, times)))

whereas this plot command...

plot(input[,2] ~ as.POSIXct(DateTime, format="%d.%m. %H:%M"), type="l",
     data=na.omit(data.frame(input[,2], DateTime)))

... produces the error:

Error in model.frame.default(formula = input[, 2] ~ as.POSIXct(DateTime,  :
  variable lengths differ (found for 'as.POSIXct(DateTime, format = "%d.%m. %H:%M")')

I already checked the lengths of y, times, input[,2], DateTime and
as.POSIXct(DateTime, format="%d.%m. %H:%M"), which are all 57!
nrow(data.frame(input[,2], DateTime)) and nrow(data.frame(y, times)) both give
57, too.
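
A hedged guess at what goes wrong, with one way around it: model.frame() looks the
formula variables up in 'data' first; data.frame(input[,2], DateTime) stores the
first column under a mangled name, so 'input[, 2]' is taken from the workspace with
all 57 values while 'DateTime' is found inside the na.omit()'d data frame with
fewer rows, hence "variable lengths differ". Naming the column explicitly (the
name 'co2' below is made up) sidesteps this:

d <- na.omit(data.frame(co2   = input[, 2],
                        times = as.POSIXct(DateTime, format = "%d.%m. %H:%M")))
plot(co2 ~ times, data = d, type = "l")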

Thanks for any help.

Best regards

Raphael

