Re: [R] wanting to count instances of values in each cell of a series of simulated symmetric matrices of the same size

2021-06-01 Thread R. Mark Sharp via R-help
Bert, 

You are obviously correct about the diagonals. I was not thinking carefully.
Typically they are expected to be at or near 0.5 in an outbred population but
can theoretically go to 1.0 in completely inbred populations. This was a subset
of a made-up pedigree in which I reused parents, hence the inbreeding.

Space is a concern since I will need to simulate more matrices for the same 
precision as the matrices (breeding populations) increase in size. However, I 
will certainly look into your suggestion. I am doing this on the side so it may 
take a few days.
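
On the space concern, one possibility is to never store the simulated matrices at all and instead fold each one into a running table of counts as it is generated. The following is only a minimal sketch under assumptions: the helper name updateCellCounts and the layout of `counts` (one row per upper-triangle cell, one column per distinct value seen so far) are hypothetical, and matching values by as.character() assumes kinship values are exactly representable fractions such as 0.125 or 0.09375.

```r
# Hypothetical helper: fold one simulated symmetric matrix `m` into the
# running per-cell counts. `counts` has one row per upper-triangle cell
# (in column-major order) and one column per distinct value seen so far;
# the column names hold the values as character strings.
updateCellCounts <- function(counts, m) {
  vals <- m[upper.tri(m)]
  keys <- as.character(vals)
  newKeys <- setdiff(unique(keys), colnames(counts))
  if (length(newKeys) > 0) {
    # Grow the count matrix by one zero column per newly seen value.
    old <- colnames(counts)
    counts <- cbind(counts, matrix(0L, nrow(counts), length(newKeys)))
    colnames(counts) <- c(old, newKeys)
  }
  # Matrix indexing: row = cell number, column = position of its value.
  idx <- cbind(seq_along(vals), match(keys, colnames(counts)))
  counts[idx] <- counts[idx] + 1L
  counts
}

# Made-up 3 x 3 matrices, for illustration only.
m1 <- matrix(c(0.5, 0.25, 0.125,
               0.25, 0.5, 0.125,
               0.125, 0.125, 0.5), nrow = 3)
m2 <- matrix(c(0.5, 0.25, 0.25,
               0.25, 0.5, 0.125,
               0.25, 0.125, 0.5), nrow = 3)

counts <- matrix(0L, nrow = 3, ncol = 0)  # 3 cells above the diagonal
for (m in list(m1, m2, m1)) {
  counts <- updateCellCounts(counts, m)
}
counts
```

Memory then scales with the number of cells times the number of distinct values, not with the number of simulated matrices.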

Thank you for your kind attention. I will provide more definitive feedback 
later.

Mark
R. Mark Sharp, Ph.D.
Data Scientist and Biomedical Statistical Consultant
7526 Meadow Green St.
San Antonio, TX 78251
mobile: 210-218-2868
rmsh...@me.com

Re: [R] wanting to count instances of values in each cell of a series of simulated symmetric matrices of the same size

2021-06-01 Thread Bert Gunter
Come again?! The diagonal values in your example are not all .5.

If space is not an issue, a straightforward approach is to collect all the
matrices into a 3d array and use indexing.
Here is a simple reprex (as you did not provide one in a convenient form,
e.g., via dput()):

x <- matrix(1:9, nrow = 3); y <- x + 10
diag(x) <- diag(y) <- 0
print(x); print(y)
## Now you need to populate a 3 x 3 x 2 array with these matrices.
## How you do this depends on your naming conventions.
## You might use a loop, or ls() and assign(),
## or collect your matrices into a list and use do.call() or ...
## You will *not* want to do this if you have lots of matrices:
list_of_mats <- list(x, y)
arr <- array(do.call(c, list_of_mats), dim = c(3, 3, length(list_of_mats)))
arr
arr[2, 3, ] ## all the values in the [2,3] cell of the matrices;
            ## do whatever you want with them.
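
As one illustration of what could be done with those per-cell values, here is a rough sketch that tabulates the distribution for every cell above the diagonal. The matrices are made up for the example, and the helper makeSym and the names `ut` and `cellTables` are hypothetical, not anything from the original post.

```r
# Hypothetical helper: build a symmetric 3 x 3 matrix from its three
# off-diagonal values (order: [1,2], [1,3], [2,3]) with 0.5 on the diagonal.
makeSym <- function(offDiag) {
  m <- diag(0.5, 3)
  m[upper.tri(m)] <- offDiag
  m[lower.tri(m)] <- t(m)[lower.tri(m)]
  m
}

# Made-up data, for illustration only.
list_of_mats <- list(makeSym(c(0.25, 0.125, 0.125)),
                     makeSym(c(0.25, 0.25, 0.125)),
                     makeSym(c(0.375, 0.125, 0.125)))
arr <- array(unlist(list_of_mats), dim = c(3, 3, length(list_of_mats)))

# Row/column indices of the cells above the diagonal.
ut <- which(upper.tri(arr[, , 1]), arr.ind = TRUE)

# One table() per cell; a list is used because the number of distinct
# values can differ from cell to cell.
cellTables <- lapply(seq_len(nrow(ut)), function(k) {
  table(arr[ut[k, "row"], ut[k, "col"], ])
})
names(cellTables) <- paste(ut[, "row"], ut[, "col"], sep = ",")
cellTables
```

Because the matrices are symmetric, the cells below the diagonal carry no extra information and are skipped.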

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



[R] wanting to count instances of values in each cell of a series of simulated symmetric matrices of the same size

2021-06-01 Thread R. Mark Sharp via R-help
I want to capture the entire distribution of values for each cell in a sequence 
of symmetric matrices of the same size. The diagonal values are all 0.5 so I 
need only the values above or below the diagonal. 

A small example with three of the structures I am wanting to count follows:
       F      G      H      I     J
F 0.6250 0.3750 0.2500 0.1875 0.125
G 0.3750 0.6250 0.2500 0.1875 0.125
H 0.2500 0.2500 0.5000 0.1875 0.125
I 0.1875 0.1875 0.1875 0.5000 0.250
J 0.1250 0.1250 0.1250 0.2500 0.500

       F      G      H      I     J
F 0.5625 0.3125 0.1875 0.1250 0.125
G 0.3125 0.5625 0.1875 0.1250 0.125
H 0.1875 0.1875 0.5000 0.1875 0.125
I 0.1250 0.1250 0.1875 0.5000 0.250
J 0.1250 0.1250 0.1250 0.2500 0.500

        F       G      H       I      J
F 0.50000 0.25000 0.1250 0.09375 0.0625
G 0.25000 0.50000 0.1250 0.09375 0.0625
H 0.12500 0.12500 0.5000 0.18750 0.1250
I 0.09375 0.09375 0.1875 0.50000 0.2500
J 0.06250 0.06250 0.1250 0.25000 0.5000


To be more specific, I have coded up a solution for a single cell with the 
sequence of values (one from each matrix) in a vector. 

I used match() below and it works with a matrix, but I do not know how to do
what is in the if statement with matrices. Since both the number of distinct
values and the values themselves will differ among the various cells, a simple
array structure does not seem appropriate, and I am assuming I will need to use
a list, but I would like to do as much as I can with matrices for speed and
clarity.

#' Counts the number of occurrences of each kinship value seen for a pair of
#' individuals.
#'
#' @examples
#' \donttest{
#' set.seed(20210529)
#' kSamples <- sample(c(0, 0.0675, 0.125, 0.25, 0.5, 0.75), 10000,
#'                    replace = TRUE,
#'                    prob = c(0.005, 0.3, 0.15, 0.075, 0.0375, 0.01875))
#' kVC <- list(kinshipValues = numeric(0),
#'             kinshipCounts = numeric(0))
#' for (kSample in kSamples) {
#'   kVC <- countKinshipValues(kSample, kVC$kinshipValues, kVC$kinshipCounts)
#' }
#' kVC
#' ## $kinshipValues
#' ## [1] 0.2500 0.1250 0.0675 0.7500 0.5000 0.0000
#' ##
#' ## $kinshipCounts
#' ## [1]  301 2592 5096 1322  592   97
#' }
#'
#' @param kValue numeric value being counted (kinship value in
#' \emph{nprcgenekeepr})
#' @param kinshipValues vector of unique values of \code{kValue} seen
#' thus far.
#' @param kinshipCounts vector of the counts of the unique values of
#' \code{kValue} seen thus far.
#' @export
countKinshipValues <- function(kValue, kinshipValues = numeric(0),
                               kinshipCounts = numeric(0)) {
  ## Position of kValue among the values seen so far, or -1 if it is new
  kinshipValue <- match(kValue, kinshipValues, nomatch = -1L)
  if (kinshipValue == -1L) {
    ## New value: record it and start its count at 1
    kinshipValues <- c(kinshipValues, kValue)
    kinshipCounts[length(kinshipCounts) + 1] <- 1
  } else {
    ## Known value: increment its count
    kinshipCounts[kinshipValue] <- kinshipCounts[kinshipValue] + 1
  }
  list(kinshipValues = kinshipValues,
       kinshipCounts = kinshipCounts)
}
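
For a single cell where the whole vector of values is already in hand, the per-value loop can probably be replaced by one vectorized pass. This is a sketch only: the function name countKinshipVector is hypothetical, and it assumes exact equality is an acceptable way to compare kinship values, as match() already does in the function above.

```r
# Hypothetical vectorized variant: count every value in `kValues` at once.
# Returns the same list structure as the loop-based function, with values
# listed in order of first appearance.
countKinshipVector <- function(kValues) {
  kinshipValues <- unique(kValues)
  kinshipCounts <- tabulate(match(kValues, kinshipValues),
                            nbins = length(kinshipValues))
  list(kinshipValues = kinshipValues,
       kinshipCounts = kinshipCounts)
}

# Made-up input, for illustration only.
kVC <- countKinshipVector(c(0.25, 0.125, 0.25, 0.5, 0.125, 0.25))
kVC
```

unique() keeps values in order of first appearance, so the result lines up with what the incremental version would build one value at a time.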

Mark


R. Mark Sharp, Ph.D.
Data Scientist and Biomedical Statistical Consultant
7526 Meadow Green St.
San Antonio, TX 78251
mobile: 210-218-2868
rmsh...@me.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.