Re: [R-sig-Geo] Did I correctly convert my data from UTM zone 12 to 11?

2023-08-08 Thread Alexander Ilich
If the vast majority of your data fell within one UTM zone and only a small 
amount bled over into the adjacent UTM zone, it would probably be fine to 
choose the zone that contains the majority of your data and accept that there 
will be some distortion. Since your split is roughly 50/50, though, you should 
either split the data into two data sets and perform calculations separately 
for each one, or transform them to a common projection (e.g. EPSG:4326, aka 
WGS84, aka lat/lon) and then perform the analysis. Most of this has been stated 
already, but I figured I'd point you to some tools that may help. Rather than 
using workarounds that seem to make the data look right, it's better to use 
tools built specifically to transform between projections. Based on what you 
wrote, it seems like you're working with vector data. For vector data, the sf 
package in R is very comprehensive (though terra also has vector capabilities). 
The st_crs function lets you check or manually set the projection of a data 
set, and the st_transform function converts from one projection to another. 
Note the difference: st_crs does not change the coordinates, it just assigns 
the name of the coordinate system, whereas st_transform converts from one 
system to the other. UTM is convenient in certain cases because it can simplify 
calculations, since you can assume the area is essentially flat over the space 
considered. That said, in the sf package many calculations can now be done 
using spherical geometry, which is more accurate (I'm not sure about Google 
Earth Engine, though, as I haven't used it). Below is some R code to take a 
dataframe of coordinates (like from your CSV), define the projection as UTM 11 
or UTM 12 in separate objects, and transform them to lat/lon and combine them 
into a single object.

``` r
library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.6.2, PROJ 9.1.1; sf_use_s2() is TRUE

# Create Data frames of points (Here you would use read.csv)
UTM11<- data.frame(X = c(358334, 571173, 446818, 576082, 331892),
                   Y = c(5128282, 5105900, 4730385, 4667181, 4811548))

UTM12<- data.frame(X = c(405776, 614821, 428211, 602751, 366593),
                   Y = c(5132654, 5117139, 4841550, 4704516, 4625745))

#Have 2 separate objects with different crs
UTM11<- st_as_sf(UTM11, coords = c("X", "Y"))
st_crs(UTM11)<- "EPSG:32611" #Set crs (https://epsg.io/32611)

UTM12<- st_as_sf(UTM12, coords = c("X", "Y"))
st_crs(UTM12)<- "EPSG:32612" #Set crs (https://epsg.io/32612)

#Transform to common crs and merge
WGS84<- rbind(st_transform(UTM11, crs = "EPSG:4326"),
              st_transform(UTM12, crs = "EPSG:4326"))
```

Created on 2023-08-08 with [reprex v2.0.2](https://reprex.tidyverse.org)
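
As a quick illustration of the spherical geometry point, a minimal sketch 
(using the WGS84 object created above); with sf_use_s2() TRUE, sf measures 
geographic coordinates on the sphere rather than on a flat plane:

``` r
# Great-circle distance between the first two merged points; st_distance()
# on EPSG:4326 points uses spherical (s2) geometry and returns metres
st_distance(WGS84[1, ], WGS84[2, ])
```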



From: R-sig-Geo  on behalf of Jason Edelkind 
via R-sig-Geo 
Sent: Sunday, August 6, 2023 4:23 PM
To: r-sig-geo@r-project.org 
Subject: [R-sig-Geo] Did I correctly convert my data from UTM zone 12 to 11?

hello, first time user here and aspiring grad student. I have a set of location 
data that I've been trying to import into google earth engine from R as a CSV 
file. The problem is that about half of my data is from utm zone 12, and the 
other half is from utm zone 11. When I import my original data into google 
earth engine, the zone 11 data is shifted over to the right because I use utm 
zone 12 as the crs in R. After some reading into the definition of a utm zone, 
I tried to just subtract 6 from the zone 11 longitude values after first 
converting them to a lon/lat format. This appears to have worked, as on first 
glance all of the zone 11 points are where they should be; however, it feels 
like too easy a fix after struggling with this for several days. So my 
question is, is this an acceptable way to convert my data, or am I doing 
something wrong that could result in inaccurate location data? Thanks!


Re: [R-sig-Geo] Merge dataframe with NetCDF file

2023-05-19 Thread Alexander Ilich
The writeRaster function should be able to do it 
(https://stackoverflow.com/questions/50026442/writing-r-raster-stack-to-netcdf).
 Also, if practical, I'd recommend switching from the raster package to the 
terra package, which has replaced it and is written by the same author. The 
syntax is almost identical, so it's an easy transition. terra has a writeCDF 
function which may work (https://rdrr.io/cran/terra/man/writeCDF.html).
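
As a rough sketch of the terra route (with synthetic stand-in data, since I 
don't have your dataframe; I'm assuming a long format with x, y, and one value 
column per year):

``` r
library(terra)

# Stand-in for the real data: 0.5 degree global grid, two years instead of 49
df <- expand.grid(x = seq(-179.75, 179.75, 0.5),
                  y = seq(-89.75, 89.75, 0.5))
df$fs_1971 <- runif(nrow(df))
df$fs_1972 <- runif(nrow(df))

r <- rast(df, type = "xyz", crs = "EPSG:4326") # one layer per value column
writeCDF(r, "fs.nc", varname = "fs", overwrite = TRUE)
```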

Best Regards,
Alex


From: R-sig-Geo  on behalf of Miluji Sb 

Sent: Friday, May 19, 2023 7:19 PM
To: R-sig-geo mailing list 
Subject: [R-sig-Geo] Merge dataframe with NetCDF file


Dear all,

I am struggling to convert a dataframe with 49 years of data for 259,200
coordinates. How can I convert this dataset into a NetCDF file with the
following attributes:

class  : RasterBrick
dimensions : 360, 720, 259200, 10  (nrow, ncol, ncell, nlayers)
resolution : 0.5, 0.5  (x, y)
extent : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
crs: +proj=longlat +datum=WGS84 +no_defs
year (): 1971, 1972, ..., 2019
varname: fs

Any help will be highly appreciated.

Best,

Milu



Re: [R-sig-Geo] Finding the highest and lowest rates of increase at specific x value across several time series in R

2023-05-18 Thread Alexander Ilich
"df" is not an object, but is an input to a function (known as a function 
argument). If you run the code for the "ExtractFirstMin" function definition 
with a clear environment you'll notice there's no error event though there's no 
object df. What will happen after you run the code is a new variable called 
"ExtactFirstMin" will be defined. This is new variable in your environent is 
actually a function. It works just like any built in R function such as "mean", 
"range", "min", etc, but it only exists because you defined it. When you supply 
an input to the function it is substituted for "df" in that function code. When 
you use "sapply" you input a list of all your data frames as well as the 
function to apply to them. So when you do sapply(df_list, ExtactFirstMin), you 
are applying that ExtractFirstMin function across all of your dataframes.  You 
should only need to edit the right side of the following line of code to put 
your dataframes in the list by substituting the names of your data
 frames:

df_list<- list(dataframe1, dataframe2, dataframe3, dataframe4, dataframe5, 
dataframe6, dataframe7, dataframe8, dataframe9, dataframe10)

You do not need the code block to create 10 data frames. Since I don't have 
your data, I needed to generate data with a similar structure to run the code 
on, but you can run the code on your real data.

Here are some resources on functions and iteration that may help clarify a few 
things.
https://r4ds.had.co.nz/functions.html
https://r-coder.com/sapply-function-r/

____
From: rain1...@aim.com 
Sent: Wednesday, May 17, 2023 3:56 PM
To: Alexander Ilich ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Hi Alexander,

Yes, you're right - that approach would be much faster and much less subject to 
error. The method that I was using worked as intended, but I am more than happy 
to try to learn this arguably more effective way. My only question in that 
regard is the defining of the object "df" in:

ExtractFirstMin<- function(df){
  df$abs_diff<- abs(df$em-1)
  min_rate<- df$pct[which.min(df$abs_diff)]
  return(min_rate)
}

Is object "df" in your example above coming from this?

#Generate data
set.seed(5)
for (i in 1:10) {
  assign(x = paste0("df", i),
 value = data.frame(Time = sort(rnorm(n = 10, mean = 1, sd = 0.1)),
Rate= rnorm(n = 10, mean = 30, sd = 1)))
} # Create 10 Data Frames

If so, how would I approach placing all 10 of my dataframes (i.e. df1, df2, 
df3, df4...df10) in that command?

Thanks, again, and sorry if I missed this previously in your explanation! In 
any case, at least I am able to obtain the results that I was looking for!

-Original Message-
From: Alexander Ilich 
To: r-sig-geo@r-project.org ; rain1...@aim.com 

Sent: Wed, May 17, 2023 10:16 am
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Awesome, glad you were able to get the result you needed. Just to be clear 
though, you shouldn't need to manually copy the code 
"df$pct[which.min(df$abs_diff)]" repeatedly for each dataframe. That I sent 
just to explain what internally was happening when using sapply and the 
function. If you replace "$x" with "$em" and "$y" with "$pct", you can 
automatically iterate through as many dataframes as you want as long as they 
are in df_list.

# Define function (one of two versions based on how you want to deal with ties)
ExtractFirstMin<- function(df){
  df$abs_diff<- abs(df$em-1)
  min_rate<- df$pct[which.min(df$abs_diff)]
  return(min_rate)
}

# Put all dataframes into a list
df_list<- list(df1,df2,df3,df4,df5,df6,df7,df8,df9,df10)

# Apply function across list
w<- sapply(df_list, ExtractFirstMin)
w
#>  [1] 29.40269 32.21546 30.75330 30.12109 30.38361 28.64928 30.45568 29.66190
#>  [9] 31.57229 31.33907

From: rain1...@aim.com 
Sent: Tuesday, May 16, 2023 11:13 PM
To: Alexander Ilich ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Hi Alexander,

Oh, wow...you are absolutely right - I cannot believe that I did not notice 
that previously! Thank you so much, yet again, including for the insight on 
what "numeric(0)" signifies! Indeed, it all works just fine now!

I am now able to flexibly achieve the goal of deriving the range of these 
values across the 10 dataframes using the "range" function!

I cannot thank you enough, including for your tireless efforts to explain 
everything step-by-step throughout all of this, though I do apologize for the 
time spent on this!

Re: [R-sig-Geo] Finding the highest and lowest rates of increase at specific x value across several time series in R

2023-05-17 Thread Alexander Ilich
Awesome, glad you were able to get the result you needed. Just to be clear 
though, you shouldn't need to manually copy the code 
"df$pct[which.min(df$abs_diff)]" repeatedly for each dataframe. That I sent 
just to explain what internally was happening when using sapply and the 
function. If you replace "$x" with "$em" and "$y" with "$pct", you can 
automatically iterate through as many dataframes as you want as long as they 
are in df_list.

# Define function (one of two versions based on how you want to deal with ties)
ExtractFirstMin<- function(df){
  df$abs_diff<- abs(df$em-1)
  min_rate<- df$pct[which.min(df$abs_diff)]
  return(min_rate)
}

# Put all dataframes into a list
df_list<- list(df1,df2,df3,df4,df5,df6,df7,df8,df9,df10)

# Apply function across list
w<- sapply(df_list, ExtractFirstMin)
w
#>  [1] 29.40269 32.21546 30.75330 30.12109 30.38361 28.64928 30.45568 29.66190
#>  [9] 31.57229 31.33907
____
From: rain1...@aim.com 
Sent: Tuesday, May 16, 2023 11:13 PM
To: Alexander Ilich ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Hi Alexander,

Oh, wow...you are absolutely right - I cannot believe that I did not notice 
that previously! Thank you so much, yet again, including for the insight on 
what "numeric(0)" signifies! Indeed, it all works just fine now!

I am now able to flexibly achieve the goal of deriving the range of these 
values across the 10 dataframes using the "range" function!

I cannot thank you enough, including for your tireless efforts to explain 
everything step-by-step throughout all of this, though I do apologize for the 
time spent on this!



-Original Message-
From: Alexander Ilich 
To: r-sig-geo@r-project.org ; rain1...@aim.com 

Sent: Tue, May 16, 2023 10:24 pm
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

I believe you didn't clear your environment, and that's why df1 works. All 
should evaluate to "numeric(0)" with the current code. You call df2$abs_diff, 
but you never defined that variable. You assigned that result to an object 
called diff2, which is not used anywhere else in your code. If you type in 
df2$abs_diff, you'll see it evaluates to NULL, and that carries through the rest 
of your code. numeric(0) means that it's a variable of type numeric but it's 
empty (zero in length).

set.seed(5)
df2<- data.frame(em= rnorm(10), pct=rnorm(10))

diff2 <- abs(df2$em-1) #You defined diff2
df2$abs_diff #This was never defined so it evaluates to NULL
#> NULL

which.min(df2$abs_diff) #can't find position of min since df2$abs_diff was 
never defined
#> integer(0)

df2$pct[which.min(df2$abs_diff)] #cannot subset df2$pct since 
which.min(df2$abs_diff) evaluates to integer(0)
#> numeric(0)


From: rain1...@aim.com 
Sent: Tuesday, May 16, 2023 8:29 PM
To: Alexander Ilich ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Hi Alexander,

I am receiving that for my real data, which is, indeed, odd. It works just fine 
with my very first dataframe, but for all other dataframes, it returns 
"numeric(0)". I did the following to organize each dataframe accordingly (note 
that I renamed my dataframes to "df1" through to "df10" for simplicity):

diff1 <- abs(df1$em-1)
w1 <- df1$pct[which.min(df1$abs_diff)]

diff2 <- abs(df2$em-1)
w2 <- df2$pct[which.min(df2$abs_diff)]

diff3 <- abs(df3$em-1)
w3 <- df3$pct[which.min(df3$abs_diff)]

diff4 <- abs(df4$em-1)
w4 <- df4$pct[which.min(df4$abs_diff)]

diff5 <- abs(df5$em-1)
w5 <- df5$pct[which.min(df5$abs_diff)]

diff6 <- abs(df6$em-1)
w6 <- df6$pct[which.min(df6$abs_diff)]

diff7 <- abs(df7$em-1)
w7 <- df7$pct[which.min(df7$abs_diff)]

diff8 <- abs(df8$em-1)
w8 <- df8$pct[which.min(df8$abs_diff)]

diff9 <- abs(df9$em-1)
w9 <- df9$pct[which.min(df9$abs_diff)]

diff10 <- abs(df10$em-1)
w10 <- df10$pct[which.min(df10$abs_diff)]

This is what object "df2" looks like (the first 21 rows are displayed - there 
are 140 rows in total). All dataframes are structured the same way, including 
"df1" (which, as mentioned previously, worked just fine). All begin with 
"0.0" in the first row. "em" is my x-column name, and "pct" is my 
y-column name, as shown in the image below:

[df2.jpg]

What could make the other dataframes so different from "df1" to cause 
"numeric(0)"? Essentially, why would "df1" be fine, and not the other 9 
dataframes? Unless my code for the other dataframes is flawed somehow?

Thanks,

Re: [R-sig-Geo] Finding the highest and lowest rates of increase at specific x value across several time series in R

2023-05-16 Thread Alexander Ilich
It's not clear to me why that would be happening. Are you getting that with 
your real data or the example data generated in the code I sent? The only 
reasons I can think of for that happening are if you're trying to access the 
zeroth element of a vector, which would require which.min(df2$abs_diff) to 
somehow evaluate to zero (which I don't see how it could), or if your 
dataframe has zero rows.
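
For reference, a small illustration (my own example, not your data) of how a 
zero-length index produces an empty result:

``` r
x <- c(10, 20, 30)
which.min(NULL) # integer(0): nothing to take the minimum of
x[integer(0)]   # numeric(0): a zero-length index returns a zero-length vector
x[0]            # numeric(0) as well
```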

From: rain1...@aim.com 
Sent: Tuesday, May 16, 2023 5:58:22 PM
To: Alexander Ilich ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Hi Alexander,
Thank you so much, once again! Very, very helpful explanations.

I experimented with this method:

df1$abs_diff<- abs(df1$x-1)
min_rate[1]<- df1$y[which.min(df1$abs_diff)]

df2$abs_diff<- abs(df2$x-1)
min_rate[2]<- df2$y[which.min(df2$abs_diff)]


For the first dataframe, it correctly returned the first y-value where x = ~1. 
However, for dataframe2 to dataframe9, I strangely received: "numeric(0)".  
Everything is correctly placed. It does not appear to be an error per se, but 
is there a way around that to avoid that message and see the correct value?

Thanks, again,

-Original Message-
From: Alexander Ilich 
To: r-sig-geo@r-project.org ; rain1...@aim.com 

Sent: Tue, May 16, 2023 2:03 pm
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

sapply goes element by element in your list, where each element is one of your 
dataframes. So mydata starts out as dataframe1, then dataframe2, then 
dataframe3, etc. It is never all of them at once. It goes through the list 
sequentially. So, at the end of the sapply call, you have a vector of length 10 
where the first element corresponds to the rate closest to x=1 in dataframe 1, 
and the tenth element corresponds to the rate closest to x=1 in dataframe 10. 
If your columns are not named x and y, then the function should be edited 
accordingly based on the names. It does assume the "x" and "y" columns have the 
same names across dataframes. For example, if x was actually "Time" and y was "Rate", 
you could use

#Generate data
set.seed(5)
for (i in 1:10) {
  assign(x = paste0("df", i),
 value = data.frame(Time = sort(rnorm(n = 10, mean = 1, sd = 0.1)),
Rate= rnorm(n = 10, mean = 30, sd = 1)))
} # Create 10 Data Frames

# Define function (one of two versions based on how you want to deal with ties)
ExtractFirstMin<- function(df){
  df$abs_diff<- abs(df$Time-1)
  min_rate<- df$Rate[which.min(df$abs_diff)]
  return(min_rate)
}

# Put all dataframes into a list
df_list<- list(df1,df2,df3,df4,df5,df6,df7,df8,df9,df10)

# Apply function across list
sapply(df_list, ExtractFirstMin)
____
From: rain1...@aim.com 
Sent: Tuesday, May 16, 2023 12:46 PM
To: Alexander Ilich ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Hi Alexander,

Wow, thank you so very much for taking the time to articulate this answer! It 
really gives a good understanding of what is going on at each stage in the 
coding!

And sorry if I missed this previously, but the object "mydata" is defined based 
on the incorporation of all dataframes? Since it is designed to swiftly obtain 
the first minimum at y = ~1 across each dataframe, "mydata" must take into 
account "dataframe1" to dataframe10", correct?

Also, the "x" is simply replaced with the name of the x-column and the "y" with 
the y-column name, if I understand correctly?

Again, sorry if I overlooked this, but that would be all, and thank you so very 
much, once again for your help and time with this! Much appreciated!

~Trav.~


-Original Message-
From: Alexander Ilich 
To: r-sig-geo@r-project.org ; rain1...@aim.com 

Sent: Tue, May 16, 2023 11:42 am
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

The only spot you'll need to change the names for is when putting all of your 
dataframes in a list as that is based on the names you gave them in your script 
when reading in the data. In the function, you don't need to change the input 
to "dataframe1", and naming it that way could be confusing since you are 
applying the function to more than just dataframe1 (you're applying it to all 
10 of your dataframes). I named the argument df to indicate that you should 
supply your dataframe as the input to the function, but you could name it 
anything you want. For example, you could call it "mydata" and define the 
function this way if you wanted to.

ExtractFirstMin<- function(mydata){
  mydata$abs_diff<- abs(mydata$x-1)
  min_rate<- mydata$y[which.min(mydata$abs_diff)]
  return(min_rate)
}

Re: [R-sig-Geo] Finding the highest and lowest rates of increase at specific x value across several time series in R

2023-05-16 Thread Alexander Ilich
sapply goes element by element in your list, where each element is one of your 
dataframes. So mydata starts out as dataframe1, then dataframe2, then 
dataframe3, etc. It is never all of them at once. It goes through the list 
sequentially. So, at the end of the sapply call, you have a vector of length 10 
where the first element corresponds to the rate closest to x=1 in dataframe 1, 
and the tenth element corresponds to the rate closest to x=1 in dataframe 10. 
If your columns are not named x and y, then the function should be edited 
accordingly based on the names. It does assume the "x" and "y" columns have the 
same names across dataframes. For example, if x was actually "Time" and y was "Rate", 
you could use

#Generate data
set.seed(5)
for (i in 1:10) {
  assign(x = paste0("df", i),
 value = data.frame(Time = sort(rnorm(n = 10, mean = 1, sd = 0.1)),
Rate= rnorm(n = 10, mean = 30, sd = 1)))
} # Create 10 Data Frames

# Define function (one of two versions based on how you want to deal with ties)
ExtractFirstMin<- function(df){
  df$abs_diff<- abs(df$Time-1)
  min_rate<- df$Rate[which.min(df$abs_diff)]
  return(min_rate)
}

# Put all dataframes into a list
df_list<- list(df1,df2,df3,df4,df5,df6,df7,df8,df9,df10)

# Apply function across list
sapply(df_list, ExtractFirstMin)

From: rain1...@aim.com 
Sent: Tuesday, May 16, 2023 12:46 PM
To: Alexander Ilich ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Hi Alexander,

Wow, thank you so very much for taking the time to articulate this answer! It 
really gives a good understanding of what is going on at each stage in the 
coding!

And sorry if I missed this previously, but the object "mydata" is defined based 
on the incorporation of all dataframes? Since it is designed to swiftly obtain 
the first minimum at y = ~1 across each dataframe, "mydata" must take into 
account "dataframe1" to dataframe10", correct?

Also, the "x" is simply replaced with the name of the x-column and the "y" with 
the y-column name, if I understand correctly?

Again, sorry if I overlooked this, but that would be all, and thank you so very 
much, once again for your help and time with this! Much appreciated!

~Trav.~


-Original Message-
From: Alexander Ilich 
To: r-sig-geo@r-project.org ; rain1...@aim.com 

Sent: Tue, May 16, 2023 11:42 am
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

The only spot you'll need to change the names for is when putting all of your 
dataframes in a list as that is based on the names you gave them in your script 
when reading in the data. In the function, you don't need to change the input 
to "dataframe1", and naming it that way could be confusing since you are 
applying the function to more than just dataframe1 (you're applying it to all 
10 of your dataframes). I named the argument df to indicate that you should 
supply your dataframe as the input to the function, but you could name it 
anything you want. For example, you could call it "mydata" and define the 
function this way if you wanted to.

ExtractFirstMin<- function(mydata){
  mydata$abs_diff<- abs(mydata$x-1)
  min_rate<- mydata$y[which.min(mydata$abs_diff)]
  return(min_rate)
}

#The function has its own environment of variables that is separate from the
#global environment of variables you've defined in your script.
#When we supply one of your dataframes to the function, we are assigning that
#information to a variable in the function's environment called "mydata".
#Functions allow you to generalize your code so that you're not required to
#name your variables a certain way. Note here, we do assume that "mydata" has
#a "$x" and "$y" slot though.

#Without generalizing the code using a function, we'd need to copy and paste
#the code over and over again and make sure to change the name of the dataframe
#each time. This is very time consuming and error prone. Here's an example for
#the first 3 dataframes.

min_rate<- rep(NA_real_, 10) #initialize empty vector
df1$abs_diff<- abs(df1$x-1)
min_rate[1]<- df1$y[which.min(df1$abs_diff)]

df2$abs_diff<- abs(df2$x-1)
min_rate[2]<- df2$y[which.min(df2$abs_diff)]

df3$abs_diff<- abs(df3$x-1)
min_rate[3]<- df3$y[which.min(df3$abs_diff)]

print(min_rate)
#>  [1] 29.40269 32.21546 30.75330   NA   NA   NA   NA   NA
#>  [9]   NA   NA

#With the function defined we can run it for each individual dataframe,
#which is less error prone than copying and pasting but still fairly repetitive
ExtractFirstMin(mydata = df1) # You can explicitly say "mydata ="

Re: [R-sig-Geo] Finding the highest and lowest rates of increase at specific x value across several time series in R

2023-05-16 Thread Alexander Ilich
The only spot you'll need to change the names for is when putting all of your 
dataframes in a list as that is based on the names you gave them in your script 
when reading in the data. In the function, you don't need to change the input 
to "dataframe1", and naming it that way could be confusing since you are 
applying the function to more than just dataframe1 (you're applying it to all 
10 of your dataframes). I named the argument df to indicate that you should 
supply your dataframe as the input to the function, but you could name it 
anything you want. For example, you could call it "mydata" and define the 
function this way if you wanted to.

ExtractFirstMin<- function(mydata){
  mydata$abs_diff<- abs(mydata$x-1)
  min_rate<- mydata$y[which.min(mydata$abs_diff)]
  return(min_rate)
}

#The function has its own environment of variables that is separate from the
#global environment of variables you've defined in your script.
#When we supply one of your dataframes to the function, we are assigning that
#information to a variable in the function's environment called "mydata".
#Functions allow you to generalize your code so that you're not required to
#name your variables a certain way. Note here, we do assume that "mydata" has
#a "$x" and "$y" slot though.

#Without generalizing the code using a function, we'd need to copy and paste
#the code over and over again and make sure to change the name of the dataframe
#each time. This is very time consuming and error prone. Here's an example for
#the first 3 dataframes.

min_rate<- rep(NA_real_, 10) #initialize empty vector
df1$abs_diff<- abs(df1$x-1)
min_rate[1]<- df1$y[which.min(df1$abs_diff)]

df2$abs_diff<- abs(df2$x-1)
min_rate[2]<- df2$y[which.min(df2$abs_diff)]

df3$abs_diff<- abs(df3$x-1)
min_rate[3]<- df3$y[which.min(df3$abs_diff)]

print(min_rate)
#>  [1] 29.40269 32.21546 30.75330   NA   NA   NA   NA   NA
#>  [9]   NA   NA

#With the function defined we can run it for each individual dataframe,
#which is less error prone than copying and pasting but still fairly repetitive
ExtractFirstMin(mydata = df1) # You can explicitly say "mydata ="
#> [1] 29.40269
ExtractFirstMin(df2) # Or equivalently, arguments are matched by the order in
#which they were defined in the function. Since there is just one argument,
#what you supply is assigned to "mydata"
#> [1] 32.21546
ExtractFirstMin(df3)
#> [1] 30.7533

# Rather than manually typing out a call to run the function on each dataframe
# and bringing the results together, we can instead use sapply.
# sapply takes a list of inputs and a function as arguments. It then applies
# the function to every element in the list and returns a vector (i.e. it goes
# through each dataframe in your list, applies the function to each one
# individually, and then records the result for each one in a single variable).
sapply(df_list, ExtractFirstMin)
#>  [1] 29.40269 32.21546 30.75330 30.12109 30.38361 28.64928 30.45568 29.66190
#>  [9] 31.57229 31.33907



From: rain1...@aim.com 
Sent: Monday, May 15, 2023 4:44 PM
To: Alexander Ilich ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Hi Alexander and everyone,

I hope that all is well! Just to follow up with this, I recently was able to 
try the following code that you had kindly previously shared:

ExtractFirstMin<- function(df){
  df$abs_diff<- abs(df$x-1)
  min_rate<- df$y[which.min(df$abs_diff)]
  return(min_rate)
} #Get first y value closest to x=1

Just to be clear, do I simply replace the "df" in that code with the name of my 
individual dataframes? For example, here is the name of my 10 dataframes, which 
are successfully placed in a list (i.e. df_list), as you showed previously:

dataframe1
dataframe2
dataframe3
dataframe4
dataframe5
dataframe6
dataframe7
dataframe8
dataframe9
dataframe10

Thus, using your example above, using the first dataframe listed there, would 
this become:

ExtractFirstMin<- function(dataframe1){
  dataframe1$abs_diff<- abs(dataframe1$x-1)
  min_rate<- dataframe1$y[which.min(dataframe1$abs_diff)]
  return(min_rate)
} #Get first y value closest to x=1

df_list<- list(dataframe1, dataframe2, dataframe3, dataframe4, dataframe5, 
dataframe6, dataframe7, dataframe8, dataframe9, dataframe10)

# Apply function across list
sapply(df_list, ExtractFirstMin)


Am I doing this correctly?

Thanks, again!


-Original Message-
From: Alexander Ilich 
To: rain1...@aim.com 
Sent: Thu, May 11, 2023 1:48 am
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Sure thing. Glad I could help!

From: rain1...

Re: [R-sig-Geo] Finding the highest and lowest rates of increase at specific x value across several time series in R

2023-05-10 Thread Alexander Ilich
So using your data but removing x=1, the values 0.8 and 1.2 would be equally 
close. Two potential options are to choose the y value corresponding to the 
first minimum difference (in this case x=0.8, y=39), or to average the y values 
for all that are equally close (in this case, average the y values for x=0.8 
and x=1.2). I think the easiest way to do that would be to first calculate a 
column of the absolute value of the differences between x and 1, and then 
subset the dataframe to the minimum of that column to extract the y values. 
Here's a base R and a tidyverse implementation to do that.

#Base R
df<- data.frame(x=c(0,0.2,0.4,0.6,0.8,1.2,1.4),
y= c(0,27,31,32,39,34,25))
df$abs_diff<- abs(df$x-1)

df$y[which.min(df$abs_diff)] #Get first y value closest to x=1
#> [1] 39
mean(df$y[df$abs_diff==min(df$abs_diff)]) #Average all y values closest to x=1
#> [1] 36.5

#tidyverse
rm(list=ls())
library(dplyr)

df<- data.frame(x=c(0,0.2,0.4,0.6,0.8,1.2,1.4),
y= c(0,27,31,32,39,34,25))
df<- df %>% mutate(abs_diff = abs(x-1))

df %>% filter(abs_diff==min(abs_diff)) %>% pull(y) %>% head(1) #Get first y value closest to x=1
#> [1] 39

df %>% filter(abs_diff==min(abs_diff)) %>% pull(y) %>% mean() #Average all y values closest to x=1
#> [1] 36.5

From: rain1...@aim.com 
Sent: Wednesday, May 10, 2023 8:13 AM
To: Alexander Ilich ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

Hi Alex and everyone,

My apologies for the confusion and this double message (I just noticed that the 
example dataset appeared distorted)! Let me try to simplify here again.

My dataframes are structured in the following way: an x column and y column, 
like this:

[inline image: example table of x and y values]


Now, let's say that I want to determine the rate of increase at about x = 1.0, 
relative to the beginning of the period (i.e. 0 at the beginning). We can see 
clearly here that the answer would be y = 43. My question is would it be 
possible to quickly determine the value at around x = 1.0 across the 10 
dataframes that I have like this without having to manually check them? The 
idea is to determine the range of values for y at around x = 1.0 across all 
dataframes. Note that it's not perfectly x = 1.0 in all dataframes - some could 
be 0.99 or 1.01.

I hope that this is clearer!

Thanks,


-Original Message-
From: Alexander Ilich 
To: r-sig-geo@r-project.org ; rain1...@aim.com 

Sent: Tue, May 9, 2023 2:23 pm
Subject: Re: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

I'm currently having a bit of difficulty following. Rather than using your 
actual data, perhaps you could include code to generate a smaller dataset with 
the same structure, with clear definitions of what is contained within each (r 
faq - How to make a great R reproducible example - Stack Overflow: 
https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example).
 You can design that dataset to be small with a known answer, describe how you 
got to that answer, and then others could help determine some code to 
accomplish that task.

Best Regards,
Alex

From: R-sig-Geo  on behalf of rain1290--- via 
R-sig-Geo 
Sent: Tuesday, May 9, 2023 1:01 PM
To: r-sig-geo@r-project.org 
Subject: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

I would like to attempt to determine the difference between the highest and 
lowest rates of increase across a series of dataframes at a specified x value. 
As shown below, the dataframes have basic x and y columns, with emissions 
values in the x column, and precipitation values in the y column. Among the 
dataframes, the idea would be to determine the highest and lowest rates of 
precipitation increase at "approximately" 1 teraton of emissions (TtC) 
relative to the first value of each time series. For example, I want to figure 
out which dataframe has the highest increase at 1 TtC, and which dataframe has 
the lowest increase at 1 TtC. However, I am not sure if there is a way to 
quickly achieve this? Here are the dataframes that I created, followed by an 
example of how each dataframe is structured:
#Dataframe objects created:
CanESMRCP8.5PL<-data.frame(get3.teratons, pland20) 
IPSLLRRCP8.5PL<-data.frame(get6.teratons, pland21)
IPSLMRRCP8.5PL<-data.frame(get9.teratons, pland22)
IPSLLRBRCP8.5PL<-data.frame(get12.teratons, pland23)
MIROCRCP8.5PL<-data.frame(get15.teratons, pland24)
HadGEMRCP8.5PL<-data.frame(get18.teratons, pland25)
MPILRRCP8.5PL<-data.frame(get21.teratons, pland26)
GFDLGRCP8.5PL<-data.frame(get27.teratons, pland27)
GFDLMRCP8.5PL<-data.frame(get30.teratons, pland28)

Re: [R-sig-Geo] Finding the highest and lowest rates of increase at specific x value across several time series in R

2023-05-09 Thread Alexander Ilich
I'm currently having a bit of difficulty following. Rather than using your 
actual data, perhaps you could include code to generate a smaller dataset with 
the same structure, with clear definitions of what is contained within each (r 
faq - How to make a great R reproducible example - Stack 
Overflow).
 You can design that dataset to be small with a known answer, describe how you 
got to that answer, and then others could help determine some code to 
accomplish that task.

Best Regards,
Alex

From: R-sig-Geo  on behalf of rain1290--- via 
R-sig-Geo 
Sent: Tuesday, May 9, 2023 1:01 PM
To: r-sig-geo@r-project.org 
Subject: [R-sig-Geo] Finding the highest and lowest rates of increase at 
specific x value across several time series in R

I would like to attempt to determine the difference between the highest and 
lowest rates of increase across a series of dataframes at a specified x value. 
As shown below, the dataframes have basic x and y columns, with emissions 
values in the x column, and precipitation values in the y column. Among the 
dataframes, the idea would be to determine the highest and lowest rates of 
precipitation increase at "approximately" 1 teraton of emissions (TtC) 
relative to the first value of each time series. For example, I want to figure 
out which dataframe has the highest increase at 1 TtC, and which dataframe has 
the lowest increase at 1 TtC. However, I am not sure if there is a way to 
quickly achieve this? Here are the dataframes that I created, followed by an 
example of how each dataframe is structured:
#Dataframe objects created:
CanESMRCP8.5PL<-data.frame(get3.teratons, pland20) 
IPSLLRRCP8.5PL<-data.frame(get6.teratons, pland21)
IPSLMRRCP8.5PL<-data.frame(get9.teratons, pland22)
IPSLLRBRCP8.5PL<-data.frame(get12.teratons, pland23)
MIROCRCP8.5PL<-data.frame(get15.teratons, pland24)
HadGEMRCP8.5PL<-data.frame(get18.teratons, pland25)
MPILRRCP8.5PL<-data.frame(get21.teratons, pland26)
GFDLGRCP8.5PL<-data.frame(get27.teratons, pland27)
GFDLMRCP8.5PL<-data.frame(get30.teratons, pland28)
#Example of what each of these look like:
>CanESMRCP8.5PL
    get3.teratons   pland20
X1      0.4542249 13.252426
X2      0.4626662  3.766658
X3      0.4715780  2.220986
X4      0.4809204  8.495072
X5      0.4901427 10.206458
X6      0.4993126 10.942797
X7      0.5088599  6.592956
X8      0.5187588  2.435796
X9      0.5286758  2.275836
X10     0.5389284  5.051706
X11     0.5496212  8.313389
X12     0.5600628  9.007722
(... 86 rows in total; the remainder of the printout is truncated in the archive)

Re: [R-sig-Geo] getting data from an nc file

2023-01-06 Thread Alexander Ilich
HDF Viewer can be a good way to look at the structure of nc files to see where 
different information is stored in the file.
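
In R itself, a minimal sketch along those lines (assuming the ncdf4 and terra 
packages, and the file named in the question below):

``` r
library(ncdf4)
library(terra)

nc_data <- nc_open("chess-met_precip_gb_1km_daily_20150101-20150131.nc")
names(nc_data$var) # names of the variables stored in the file
nc_close(nc_data)

# Or read the file directly as a raster; the grid and layers come with it
r <- rast("chess-met_precip_gb_1km_daily_20150101-20150131.nc")
r
```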

From: R-sig-Geo  on behalf of Nick Wray 

Sent: Wednesday, November 9, 2022 5:53 AM
To: r-sig-geo@r-project.org 
Subject: [R-sig-Geo] getting data from an nc file

Hello

I am trying to get rainfall data from the UK chess-met site

https://catalogue.ceh.ac.uk/datastore/eidchub/2ab15bf0-ad08-415c-ba64-831168be7293/precip/

and here there are a large number of nc files eg

"chess-met_precip_gb_1km_daily_20150101-20150131.nc"

I've found various sources on the net for opening nc files, and getting the
data but when I try instructions like

nc_data<-nc_open("chess-met_precip_gb_1km_daily_20150101-20150131.nc")

lon <- ncvar_get(nc_data, "lon")

lon



the code works but all I get is a series of element numbers and no data:

[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]



And similarly for latitude

But if I open the nc file as a raster I get a raster precipitation plot of
Great Britain, which rather suggests that the lat and long values are in
there somewhere

Can anyone help with getting the actual data sets out of an nc file?

Thanks Nick Wray



Re: [R-sig-Geo] Customizing levelplot coloring scheme in r

2022-10-19 Thread Alexander Ilich
You could also try using tmap. Within the tm_raster function you can specify a 
color palette, what value the midpoint of the color palette should be, and even 
set the breaks so that you have different ranges on each side of zero (e.g. go 
from -5 to +10 instead of -10 to +10).
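
For example, a minimal sketch (tmap 3.x syntax; FDifference5 is the raster from 
your code below):

``` r
library(tmap)

tm_shape(FDifference5) +
  tm_raster(palette = "RdBu",  # red -> white -> blue
            midpoint = 0,      # centre the white on zero
            breaks = c(-5, -2.5, 0, 2.5, 5, 7.5, 10)) # asymmetric around 0
```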

From: R-sig-Geo  on behalf of rain1290--- via 
R-sig-Geo 
Sent: Wednesday, October 12, 2022 10:44 AM
To: r-sig-geo@r-project.org 
Subject: [R-sig-Geo] Customizing levelplot coloring scheme in r

I have climate data that I have plotted on a global map using a levelplot map. 
However, I am trying to adjust the default coloring scheme to allow for only 
blue and red shades to appear. Ideally, I am trying to showcase blue for the 
positive values, and red for the negative values, along with a whitish coloring 
near and at 0. The default coloring scheme isn't bad, but I think a 
blue-red-white coloring would allow the plot to appear more visually pleasing. 
Here is the code that I have now to create my current levelplot:

#packages installed


library(raster)
library(ncdf4)
library(maps)
library(maptools)
library(rasterVis)
library(ggplot2)
library(rgdal)
library(sp)
library(gridExtra)
library(grid)
library(RColorBrewer)

#Using levelplot

FPlot10 <- levelplot(FDifference5,margin=F,at=c(seq(-50,150,10)),pretty=TRUE,
par.settings=mapTheme, main="Higher end")
FM10 <- FPlot10 + latticeExtra::layer(sp.lines(world.outlines.sp))

This yields a range of nice default colors, but I'd like to adopt something 
with only shades of red and blue (with white at and around 0). I can do it 
easily in ggplot, but levelplot appears completely different in using color 
commands (if they exist). Is that even possible in levelplot?
Thank you!


Re: [R-sig-Geo] Selecting a range of values in a specific column for R ggplot

2022-10-13 Thread Alexander Ilich
The best way would probably be to create a new column indicating which subset 
each row is in, and then map that column to the color aesthetic in the aes 
call. Alternatively, you could call geom_density multiple times, overriding the 
data argument and changing the color each time.

For example
GEV$set<-  NA #initialize new column as NA's
# Now create code to fill in the set column appropriately (e.g. as "set1", 
"set2", "set3", etc)
ggplot(data=GEV[c(1:4, 17:21, 30:34),], aes(x=RL45, color=set)) +
geom_density() +
xlab("One-day max (mm/day)") +
ggtitle("Global one-day max Return Level (RCP 4.5)") +
xlim(300, 450)
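
A minimal sketch of that fill-in step, which would go before the ggplot call 
above (the row ranges are from your subsets; the set labels are just 
illustrative):

``` r
GEV$set[1:4]   <- "set1" # label each subset of rows
GEV$set[17:21] <- "set2"
GEV$set[30:34] <- "set3"
```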

OR

ggplot() +
geom_density(data=GEV[1:4,], mapping = aes(x=RL45), color="midnightblue") +
geom_density(data=GEV[17:21,], mapping = aes(x=RL45), color="green")+
geom_density(data=GEV[30:34,], mapping = aes(x=RL45), color="red")+
xlab("One-day max (mm/day)") +
ggtitle("Global one-day max Return Level (RCP 4.5)") +
xlim(300, 450)
________
From: rain1...@aim.com 
Sent: Thursday, October 13, 2022 2:21 PM
To: Alexander Ilich ; bfalevl...@gmail.com 
; r-sig-geo@r-project.org 
Subject: Re: [R-sig-Geo] Selecting a range of values in a specific column for R 
ggplot

Finally, let's say that I wanted to add several lines/curves on the same plot 
using different subsets, how would we do this? I tried the following:

newplot1 <- ggplot(data=GEV[17:21,], aes(x=RL45)) + 
geom_density(color="midnightblue") + xlab("One-day max (mm/day)") + 
ggtitle("Global one-day max Return Level (RCP 4.5)") + xlim(300, 450) + 
(data=GEV[1:5,], aes(x=RL45)) + geom_density(color="blue")

I receive this error in the process:

Error: unexpected ',' in "newplot1 <- ggplot(data=GEV[17:21,], aes(x=RL45)) + 
geom_density(color="midnightblue") + xlab("One-day max (mm/day)") + 
ggtitle("Global one-day max Return Level (RCP 4.5)") + xlim(300, 450) + (d"

Here, I am identifying two different subsets to create two different lines on 
the same plot, but for some reason, this strange error occurs. Unless, there is 
a way to specify all of the subsets desired earlier in the command?

Thanks,


-Original Message-
From: Alexander Ilich 
To: bfalevl...@gmail.com ; r-sig-geo@r-project.org 
; rain1...@aim.com 
Sent: Thu, Oct 13, 2022 2:07 pm
Subject: Re: [R-sig-Geo] Selecting a range of values in a specific column for R 
ggplot

Make sure to include the comma after the numbers so that you're selecting rows 
x through y and all columns. For example, GEV[17:21,] not GEV[17:21]

From: R-sig-Geo  on behalf of rain1290--- via 
R-sig-Geo 
Sent: Thursday, October 13, 2022 1:08 PM
To: bfalevl...@gmail.com ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Selecting a range of values in a specific column for R 
ggplot

Thank you so much! Yes, this stopped the error from appearing.
From what I see, though, when I attempt to change the subset from [1:4] to, 
say, [17:21] or [30:34], I end up with the exact same plots. Why could that 
be? The idea would be to make different curves on the same plot by changing 
the subset.


-Original Message-
From: Bede-Fazekas Ákos 
To: r-sig-geo@r-project.org
Sent: Thu, Oct 13, 2022 12:37 pm
Subject: Re: [R-sig-Geo] Selecting a range of values in a specific column for R 
ggplot

Hello,
try
newplot <- ggplot(data = GEV[1:4, ], aes(x = RL45)) +
  geom_density(color = "midnightblue") +
  xlab("Location (mm/day)") +
  ggtitle("Global Location under RCP4.5") +
  xlim(300, 350)
HTH,
Ákos
---
Ákos Bede-Fazekas
Centre for Ecological Research, Hungary

On 2022-10-13 18:27, rain1290--- via R-sig-Geo wrote:
> I am trying to select a range of values (i.e. the first 4 values) in a 
> specific column from a table. The column is called "RL45". I tried the 
> following code to create a plot in ggplot:
> newplot <- ggplot(data = GEV, aes(x = RL45[1:4])) + geom_density(color = 
> "midnightblue") + xlab("Location (mm/day") + ggtitle("Global Location under 
> RCP4.5") + xlim(300, 350)
>
> This results in this strange error:
>
> Error: Aesthetics must be either length 1 or the same as the data (34): x
>
> This is odd, as I selected the first 4 values in that "RL45" column.
> Any thoughts would be greatly appreciated!
> Thank you,

Re: [R-sig-Geo] Selecting a range of values in a specific column for R ggplot

2022-10-13 Thread Alexander Ilich
Make sure to include the comma after the numbers so that you're selecting rows 
x through y and all columns. For example, GEV[17:21,] not GEV[17:21]

From: R-sig-Geo  on behalf of rain1290--- via 
R-sig-Geo 
Sent: Thursday, October 13, 2022 1:08 PM
To: bfalevl...@gmail.com ; r-sig-geo@r-project.org 

Subject: Re: [R-sig-Geo] Selecting a range of values in a specific column for R 
ggplot

Thank you so much! Yes, this stopped the error from appearing.
From what I see, though, when I attempt to change the subset from [1:4] to, 
say, [17:21] or [30:34], I end up with the exact same plots. Why could that 
be? The idea would be to make different curves on the same plot by changing 
the subset.


-Original Message-
From: Bede-Fazekas Ákos 
To: r-sig-geo@r-project.org
Sent: Thu, Oct 13, 2022 12:37 pm
Subject: Re: [R-sig-Geo] Selecting a range of values in a specific column for R 
ggplot

Hello,
try
newplot <- ggplot(data = GEV[1:4, ], aes(x = RL45)) +
  geom_density(color = "midnightblue") +
  xlab("Location (mm/day)") +
  ggtitle("Global Location under RCP4.5") +
  xlim(300, 350)
HTH,
Ákos
---
Ákos Bede-Fazekas
Centre for Ecological Research, Hungary

On 2022-10-13 18:27, rain1290--- via R-sig-Geo wrote:
> I am trying to select a range of values (i.e. the first 4 values) in a 
> specific column from a table. The column is called "RL45". I tried the 
> following code to create a plot in ggplot:
> newplot <- ggplot(data = GEV, aes(x = RL45[1:4])) + geom_density(color = 
> "midnightblue") + xlab("Location (mm/day") + ggtitle("Global Location under 
> RCP4.5") + xlim(300, 350)
>
> This results in this strange error:
>
> Error: Aesthetics must be either length 1 or the same as the data (34): x
>
> This is odd, as I selected the first 4 values in that "RL45" column.
> Any thoughts would be greatly appreciated!
> Thank you,



Re: [R-sig-Geo] Selecting a range of values in a specific column for R ggplot

2022-10-13 Thread Alexander Ilich
The aesthetic is looking for the name of the variable, so try subsetting the 
dataframe in the data portion instead.

newplot <- ggplot(data = GEV[1:4,], aes(x = RL45)) +
  geom_density(color = "midnightblue") +
  xlab("Location (mm/day)") +
  ggtitle("Global Location under RCP4.5") +
  xlim(300, 350)

From: R-sig-Geo  on behalf of rain1290--- via 
R-sig-Geo 
Sent: Thursday, October 13, 2022 12:27 PM
To: r-sig-geo@r-project.org 
Subject: [R-sig-Geo] Selecting a range of values in a specific column for R 
ggplot

I am trying to select a range of values (i.e. the first 4 values) in a specific 
column from a table. The column is called "RL45". I tried the following code to 
create a plot in ggplot:
newplot <- ggplot(data = GEV, aes(x = RL45[1:4])) + geom_density(color = 
"midnightblue") + xlab("Location (mm/day") + ggtitle("Global Location under 
RCP4.5") + xlim(300, 350)

This results in this strange error:

Error: Aesthetics must be either length 1 or the same as the data (34): x

This is odd, as I selected the first 4 values in that "RL45" column.
Any thoughts would be greatly appreciated!
Thank you,


Re: [R-sig-Geo] Raster Data Management Advice

2022-10-13 Thread Alexander Ilich
Thank you everyone for the advice. I have some things to look into.

Thanks,
Alex

From: Michael Sumner 
Sent: Sunday, October 9, 2022 5:01 AM
To: Alexander Ilich 
Cc: r-sig-geo@r-project.org 
Subject: Re: [R-sig-Geo] Raster Data Management Advice

I would set up a polygon of the bounding box (in the native projection) of each 
raster source, and use fields on those polygons to store the details of 
interest: xmin, xmax, ymin, ymax, dimension, resolution, crs, and your other 
details. Then home in on areas of interest for different tasks to see what set 
of overlapping data you have for each.

There's a lot of fanfare about STAC, but it's really just a JSON format with 
some of the information you could store on a simple polygon dataset... with 
STAC, as with so many formats, you'd have to shoehorn your data into that more 
restrictive form (you can always spit out STAC as a side product of your own 
richer summary for less sophisticated uses).

The crux, IMO, is keeping the details of the source's native projection 
independent from the representation you use to query it spatially; just record 
what's there. Further, the GDAL warper app-lib (one level below the 
gdalwarp.exe) is the right tool for doing general reads of any number of 
sources into one specific window of your choosing in any projection (you could 
use the dataset described above to limit which sources get included). You can 
easily see what you'd get by merging any number of sources together, and of 
course more nuanced situations, like a sensible background with more detailed 
layers merged over it, are very valuable.
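
A minimal sketch of that bounding-box index idea, assuming a folder of 
GeoTIFFs (the field names are illustrative, not a fixed schema):

``` r
library(terra)
library(sf)

files <- list.files("rasters", pattern = "\\.tif$", full.names = TRUE)

index <- do.call(rbind, lapply(files, function(f) {
  r <- rast(f)
  e <- ext(r)
  bb <- st_bbox(c(xmin = xmin(e), ymin = ymin(e),
                  xmax = xmax(e), ymax = ymax(e)),
                crs = st_crs(crs(r))) # bounding box in the native projection
  st_sf(path = f,
        res_x = res(r)[1], res_y = res(r)[2],
        crs_wkt = crs(r),
        geometry = st_transform(st_as_sfc(bb), "EPSG:4326")) # common query CRS
}))

# e.g. footprints of all sources with resolution 10 m or finer:
# index[index$res_x <= 10, ]
```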

Cheers, Mike



On Sat, Oct 8, 2022 at 3:35 AM Alexander Ilich <ail...@mail.usf.edu> wrote:
Hi, I was wondering if anyone has some advice on how to organize raster
data so that it is easily queryable by various attributes (e.g. find me all
the rasters of data type bathymetry, collected by this organization with
10m resolution or finer). Currently we have data on a server organized
often by when/where it was collected but that can make it difficult to find
specific rasters that meet certain criteria. I've created a table as a
csv file on github (https://github.com/ailich/WFS_Multibeam_Metadata) where
each row is a raster and it has various column attributes describing it
(e.g. who collected it, what sonar was used, resolution, coordinate system,
etc) and a path to the filename as a temporary solution, but I think some
type of spatial database that would allow for querying and then reading
into R as terra objects, as well as into QGIS and ArcGIS as layers for
visualization would be optimal as multiple project members use these data.
Tools I've come across that seem potentially useful include PostGIS and
Geopackage, but I'm not entirely sure how to properly set them up or if
they'd suit my needs. Any advice would be greatly appreciated.

Thanks,
Alex



--
Michael Sumner
Software and Database Engineer
Australian Antarctic Division
Hobart, Australia
e-mail: mdsum...@gmail.com



[R-sig-Geo] Raster Data Management Advice

2022-10-07 Thread Alexander Ilich
Hi, I was wondering if anyone has some advice on how to organize raster
data so that it is easily queryable by various attributes (e.g. find me all
the rasters of data type bathymetry, collected by this organization with
10m resolution or finer). Currently we have data on a server organized
often by when/where it was collected but that can make it difficult to find
specific rasters that meet certain criteria. I've created a table as a
csv file on github (https://github.com/ailich/WFS_Multibeam_Metadata) where
each row is a raster and it has various column attributes describing it
(e.g. who collected it, what sonar was used, resolution, coordinate system,
etc) and a path to the filename as a temporary solution, but I think some
type of spatial database that would allow for querying and then reading
into R as terra objects, as well as into QGIS and ArcGIS as layers for
visualization would be optimal as multiple project members use these data.
Tools I've come across that seem potentially useful include PostGIS and
Geopackage, but I'm not entirely sure how to properly set them up or if
they'd suit my needs. Any advice would be greatly appreciated.

Thanks,
Alex
