Re: [R] Using apply

2018-10-30 Thread Bert Gunter
Indeed.

But perhaps it's also worth noting that if such statistics are calculated
as implementations of the (e.g. ANOVA) formulae still found (sadly) in many
statistics texts, then they shouldn't be calculated that way at all. Rather,
the appropriate matrix methods (e.g. QR decompositions) built into R -- many
of which are already incorporated into R's statistical corpus -- should be
used. To say more would of course be far O/T.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Oct 30, 2018 at 8:44 PM Peter Langfelder 
wrote:

> It should be said that for many basic statistics, there are faster
> functions than apply, for example here you want
>
> sum = colSums(x)
>
> As already said, for sum of squares you would do colSums(x^2).
>
> Many useful functions of this kind are implemented in package
> matrixStats. Once you install it, either look at the package manual or
> type ls("package:matrixStats") to see a list of functions. Most if not
> all have self-explanatory names.
>
> HTH,
>
> Peter
> On Tue, Oct 30, 2018 at 7:28 PM Steven Yen  wrote:
> >
> > I need help with "apply". Below, I have no problem getting the column
> sums.
> > 1. How do I get the sum of squares?
> > 2. In general, where do I look up these functions?
> > Thanks.
> >
> > x<-matrix(1:10,nrow=5); x
> > sum <- apply(x,2,sum); sum
> >
> >
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

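A minimal sketch tying the suggestions above together (colSums() instead of apply(); all function names are from the posts above):

```r
x <- matrix(1:10, nrow = 5)

## column sums: colSums() is equivalent to apply(x, 2, sum), but faster
stopifnot(all(colSums(x) == apply(x, 2, sum)))

## column sums of squares, as suggested above
colSums(x^2)                        # c(55, 330)
apply(x, 2, function(v) sum(v^2))   # same result via apply
```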
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove specific rows from nested list of matrices

2018-11-02 Thread Bert Gunter
If you learn to use dput() to provide useful examples in your posts, you
are more likely to receive useful help. It is rather difficult to make much
sense of your messy text, though some brave soul(s) may try to help.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Nov 2, 2018 at 8:00 AM Ek Esawi  wrote:

> Hi All,
>
> I have a list that is made up of nested lists, as shown below. I want
> to remove all rows in each sub-list that start with an empty space,
> that is, rows whose first entry is blank; for example, in
> [[1]][[1]][[1]] remove row 4, in [[1]][[1]][[3]] remove row 5, and in
> [[1]][[2]][[1]] remove row 6. All rows to keep start with 2 digits / 2
> digits. My formula works on an individual sublist but not on the whole
> list. I know my indexing is wrong, but I don't know how to fix it.
>
>
> > FF
>
> [[1]]
> [The pasted matrices arrived garbled in transit (mismatched curly
> quotes, lost cells) and are omitted; each leaf is a character matrix
> whose valid rows start with a "dd/dd" date string in column 1.]
>
> My formula:
>
> G <- lapply(FF, function(x) lapply(x, function (y) lapply(y,
> function(z)  z[grepl("^[0-9][0-9]/",z[,1]),])))
>
> The error: Error in z[, 1] : incorrect number of dimensions
>
>
>
> Thanks in advance--EK
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

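One way to make the poster's approach robust (the pasted example is too garbled to verify against, so treat this as a sketch): apply the row filter only to leaves that are actually matrices, e.g. with rapply(), which visits every non-list leaf of a nested list. The usual cause of the "incorrect number of dimensions" error is a leaf that is a plain vector rather than a matrix. The list FF below is a hypothetical stand-in for the poster's data.

```r
## Drop rows whose first column does not start with "dd/dd" from every
## matrix leaf of a nested list; non-matrix leaves are returned unchanged.
drop_blank_rows <- function(z) {
  if (is.matrix(z)) z[grepl("^[0-9]{2}/", z[, 1]), , drop = FALSE] else z
}

## hypothetical stand-in for the poster's FF
FF <- list(list(list(matrix(c("30/20", "02/20", " ",
                              "a",     "b",     " "), ncol = 2))))

G <- rapply(FF, drop_blank_rows, how = "replace")
G[[1]][[1]][[1]]   # the blank-first-entry row is gone
```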


Re: [R] Exact Poly-K Test

2018-11-02 Thread Bert Gunter
A search on rseek.org on "exact poly-k test" brought up several hits. Did
you not try this? Are none of the hits suitable?

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Nov 2, 2018 at 1:07 PM Evarite Galois  wrote:

> Hello All,
>
> I am looking for a package with an R implementation of the Exact Poly-K
> Test; any feedback will be greatly appreciated.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Exact Poly-K Test

2018-11-02 Thread Bert Gunter
Oh, and if you wish to fake a famous name for your signature, you should
get it right: it's Évariste Galois.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Nov 2, 2018 at 1:13 PM Bert Gunter  wrote:

> A search on rseek.org on "exact poly-k test" brought up several hits. Did
> you not try this? Are none of the hits suitable?
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Fri, Nov 2, 2018 at 1:07 PM Evarite Galois  wrote:
>
>> Hello All,
>>
>> I am looking for a package with an R implementation of the Exact Poly-K
>> Test; any feedback will be greatly appreciated.
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



Re: [R] date and time data on x axis

2018-11-03 Thread Bert Gunter
See ?identify and ?locator

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Nov 3, 2018 at 6:47 PM snowball0916  wrote:

> Hi, Don
> After trying one month of data, it shows me the attached plot.
>
> The problem is that it's hard to identify the high point in the graph.
> Is it possible, when my cursor moves over some point, to show me
> both the x-axis and y-axis values?
> OR
> Is there another way to reach the same goal?
>
> Thanks very much.
>
>
>
>
>
> From: MacQueen, Don
> Date: 2018-10-30 00:01
> To: snowball0916; r-help
> Subject: Re: [R] date and time data on x axis
> Here's an example of 24 hours of data at one second intervals.
>
> npts <- 24*60*60
>
> df <- data.frame(
>  tm = seq( Sys.time(), by='1 sec', length=npts),
>  yd = round(runif(npts),2)
>  )
>
> head(df)
>
> with(df, plot(tm,yd))
>
> The x axis appears to me to be displayed in a neat and clean way. I don't
> understand what the problem is.
> (The data itself is illegible, but that's a different problem.)
>
> The default axis may not be what you want, but it is neat and clean. To
> choose the axis tick marks and labels yourself, use axis() or axis.POSIXct,
> as Rui mentioned.  help(axis.POSIXct) provides examples of actual use.
>
> I prefer to do as much as possible with base R, so look at this example:
>
> > as.POSIXct( '20181028_10:00:00' , format='%Y%m%d_%H:%M:%S')
> [1] "2018-10-28 10:00:00 PDT"
>
> Therefore
>   xdata <- as.POSIXct(mydata$V1, format='%Y%m%d_%H:%M:%S')
> is perfectly adequate (the lubridate package is not essential here)
>
>
> par() is the function that sets graphical parameters. There are many
> graphical parameters.
> "mar" is the parameter that specifies the sizes of the plot margins  ( see
> ?par )
>
> This expression
>op <- par(mar = c(4, 0, 0, 0) + par("mar"))
> is a way to modify the values of the "mar" parameter.
>
> Type the following commands
>    par('mar')
>    par()$mar                  ## an alternative
>    c(4,0,0,0) + par('mar')
>    par(mar = c(4, 0, 0, 0) + par("mar"))
>    par('mar')                 ## to see that the margins have been changed
>
> --
> Don MacQueen
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
> Lab cell 925-724-7509
>
> On 10/28/18, 8:16 AM, "R-help on behalf of snowball0916" <
> r-help-boun...@r-project.org on behalf of snowball0...@163.com> wrote:
>
> Hi, guys
> How do you guys deal with the date and time data on x axis?
> I have some trouble with it. Could you help with this?
>
> =
> Sample Data
> =
> The sample data look like this:
>
> 20181028_10:00:00 600
> 20181028_10:00:01 500
> 20181028_10:00:02 450
> 20181028_10:00:03 660
> ..
>
> =
> My Code
> =
>
> library(lubridate)
> mydata <- read.table("e:/R_study/graph_test2.txt")
> xdata <- ymd_hms(mydata$V1)
> ydata <- mydata$V2
> plot(xdata, ydata, type="o")
>
>
> =
> Questions:
> =
>
> 1. Why does my x axis not show the correct date-time, like
> "2018-10-28 10:00:00 UTC"?
> 2. If my data is very large (e.g. a value every second for a whole day,
> or even a whole month), how can I display the x axis in a
> neat and clean way?
>
> Thanks very much.
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

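Pulling the advice in this thread together: base-R parsing plus a custom time axis drawn with axis.POSIXct(). The 4-hour tick spacing below is just an illustrative choice, and identify() addresses the cursor question for interactive devices.

```r
## one day of one-second data, as in Don's example
npts <- 24 * 60 * 60
t0   <- as.POSIXct("20181028_10:00:00", format = "%Y%m%d_%H:%M:%S")
df   <- data.frame(tm = seq(t0, by = "1 sec", length.out = npts),
                   yd = round(runif(npts), 2))

## suppress the default axis, then draw tick marks every 4 hours
plot(df$tm, df$yd, type = "l", xaxt = "n", xlab = "time", ylab = "value")
axis.POSIXct(1, x = df$tm, at = seq(t0, by = "4 hours", length.out = 7),
             format = "%H:%M")

## for the cursor question: on an interactive device, identify() lets you
## click points and see their indices/coordinates
## identify(df$tm, df$yd)
```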


Re: [R] HMM-Classification

2018-11-04 Thread Bert Gunter
Indeed! There is even a HMM package!

A web search on "hidden markov models" on rseek.org brought up many
relevant looking hits.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Nov 4, 2018 at 7:31 AM Ingmar Visser  wrote:

> There are several packages that allow fitting hidden Markov models
> (assuming that that is what you mean by HMM ...), which you can find here:
> https://cran.r-project.org/web/packages/
> Best, Ingmar
>
> Ingmar Visser
> Universitair Hoofddocent ontwikkelingspsychologie | Directeur College
> Psychologie
> Afdeling Psychologie | Faculteit Maatschappij- en Gedragswetenschappen |
> Universiteit van Amsterdam
> Bezoek | Nieuwe Achtergracht 129B | Kamer G 1.18
> Post | Postbus 15933 | 1001 NK Amsterdam
> Pakketpost | Valckenierstraat 59 | 1018 XE Amsterdam
> T: +31205256723 | M: +31647260824 | e: i.vis...@uva.nl
>
>
> On Sun, Nov 4, 2018 at 3:06 PM Kabouch Nourdine via R-help <
> r-help@r-project.org> wrote:
>
> > Hi,
> > I would like to use an HMM for classification of a time series (solar
> > radiation). I would like to know what steps I should follow. For the
> > number of states, I have to choose between 3 and 5; I do not have
> > other information.
> > Regards.
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

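For a concrete starting point, a minimal sketch with the CRAN package HMM mentioned above; the states, symbols, and probabilities here are invented placeholders, not tuned for solar-radiation data.

```r
library(HMM)   # install.packages("HMM") first

## two hidden states, two observation symbols -- purely illustrative numbers
hmm <- initHMM(States  = c("low", "high"),
               Symbols = c("dim", "bright"),
               transProbs    = matrix(c(0.8, 0.2,
                                        0.3, 0.7), nrow = 2, byrow = TRUE),
               emissionProbs = matrix(c(0.9, 0.1,
                                        0.2, 0.8), nrow = 2, byrow = TRUE))

obs <- c("dim", "dim", "bright", "bright", "dim")
viterbi(hmm, obs)   # most likely hidden-state sequence for obs
```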


Re: [R] How to add a dummy code for (G)LMER

2018-11-04 Thread Bert Gunter
I am almost certain that no dummy variables are necessary -- but mixed
models questions are always better posed on the r-sig-mixed-models list.

Bert



On Sun, Nov 4, 2018 at 1:38 PM Yune S. Lee  wrote:

> Dear R experts --
>
> I have never needed to add a dummy column; I always obtain statistical
> results via summary(model) for GLMER. However, I was recently asked to add
> a dummy column for interaction variables when performing GLMER. Could
> anyone tell me whether it's necessary to add a dummy column for GLMER, or
> whether R automatically handles it when outputting results?
>
> Best,
> Yune
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

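Bert's point can be checked directly: R expands factors into indicator (dummy) columns itself, via model.matrix(), so no manual dummy column is needed, including for interactions.

```r
f <- factor(c("a", "b", "c", "b"))
model.matrix(~ f)   # intercept plus dummy columns for levels "b" and "c"
##   (Intercept) fb fc
## 1           1  0  0
## 2           1  1  0
## 3           1  0  1
## 4           1  1  0
```

An interaction term such as `~ f1 * f2` is expanded the same way, with product columns created automatically.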


Re: [R] Problem with the matrix() function.

2018-11-07 Thread Bert Gunter
I have no opinion on your queries, but whatever is done, it should be
properly documented, which appears not to be the case presently afaics.

-- Bert



On Wed, Nov 7, 2018 at 12:33 PM Rolf Turner  wrote:

>
> In the course of writing a bit of somewhat convoluted code I recently
> made a silly error that revealed the following phenomenon:
>
> m <- matrix(1:10,nrow=2,ncol=c(5,4))
>
> produces
>
> >      [,1] [,2] [,3] [,4] [,5]
> > [1,]    1    3    5    7    9
> > [2,]    2    4    6    8   10
>
> That is, the nonsense value of c(5,4) for the "ncol" argument is
> accepted, without comment --- the first entry of the given ncol argument
> is used.
>
> It might be argued that this is a reasonable accommodation of the user's
> ineptitude.  I am of the opinion that an error should be thrown if the
> value of ncol is not an integer scalar.
>
> I have also discerned that if ncol is not an integer, it is replaced by
> its floor value.
>
> Is this a Good Thing?
>
> What do others think?
>
> cheers,
>
> Rolf Turner
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

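Until matrix() itself validates its arguments as Rolf suggests, a defensive wrapper is easy to sketch (safe_matrix is of course a made-up name for illustration):

```r
## error out unless nrow and ncol are single whole numbers
safe_matrix <- function(data, nrow, ncol) {
  stopifnot(length(nrow) == 1, length(ncol) == 1,
            nrow == as.integer(nrow), ncol == as.integer(ncol))
  matrix(data, nrow = nrow, ncol = ncol)
}

safe_matrix(1:10, 2, 5)              # fine
try(safe_matrix(1:10, 2, c(5, 4)))   # now an error instead of silent acceptance
```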


Re: [R] Identify row indices corresponding to each distinct row of a matrix

2018-11-07 Thread Bert Gunter
A mess -- due to your continued use of html formatting.

But something like this may do what you want (hard to tell with the mess):

> m <- matrix(1:16,nrow=8)[rep(1:8,2),]
> m
      [,1] [,2]
 [1,]    1    9
 [2,]    2   10
 [3,]    3   11
 [4,]    4   12
 [5,]    5   13
 [6,]    6   14
 [7,]    7   15
 [8,]    8   16
 [9,]    1    9
[10,]    2   10
[11,]    3   11
[12,]    4   12
[13,]    5   13
[14,]    6   14
[15,]    7   15
[16,]    8   16
> vec <- apply(m, 1, paste, collapse = "-")  ## converts rows into a character vector
> vec
 [1] "1-9"  "2-10" "3-11" "4-12" "5-13" "6-14" "7-15" "8-16" "1-9"  "2-10" "3-11" "4-12" "5-13" "6-14"
[15] "7-15" "8-16"
> ## Then maybe:
> tapply(seq_along(vec),vec, I)
$`1-9`
[1] 1 9

$`2-10`
[1]  2 10

$`3-11`
[1]  3 11

$`4-12`
[1]  4 12

$`5-13`
[1]  5 13

$`6-14`
[1]  6 14

$`7-15`
[1]  7 15

$`8-16`
[1]  8 16

> ## gives the row numbers for each unique row

There may well be slicker ways to do this -- if this is actually what you
want to do.

-- Bert



On Wed, Nov 7, 2018 at 7:56 PM li li  wrote:

> Hi all,
>I use the following example to illustrate my question. As you can see,
> in matrix C some rows are repeated and I would like to find the indices of
> the rows corresponding to each of the distinct rows.
>   For example, for the row c(1,9), I have used the "which" function to
> identify the row indices corresponding to c(1,9). Using this approach, in
> order to cover all distinct rows, I need to use a for loop.
>I am wondering whether there is an easier way where a for loop can be
> avoided?
>Thanks very much!
>   Hanna
>
>
>
> > A <- matrix(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16),8,2)
> > B <- rbind(A,A,A)
> > C <- as.data.frame(B[sample(nrow(B)),])
> > C
>    V1 V2
> 1   1  9
> 2   2 10
> 3   3 11
> 4   5 13
> 5   7 15
> 6   6 14
> 7   4 12
> 8   3 11
> 9   8 16
> 10  5 13
> 11  7 15
> 12  2 10
> 13  1  9
> 14  8 16
> 15  1  9
> 16  3 11
> 17  7 15
> 18  4 12
> 19  2 10
> 20  6 14
> 21  4 12
> 22  8 16
> 23  5 13
> 24  6 14
> > T <- unique(C)
> > T
>   V1 V2
> 1  1  9
> 2  2 10
> 3  3 11
> 4  5 13
> 5  7 15
> 6  6 14
> 7  4 12
> 9  8 16
> > i <- 1
> > which(C[,1]==T[i,1] & C[,2]==T[i,2])
> [1]  1 13 15
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

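A slightly slicker variant of the tapply() idea above: split() the row indices by a per-row key, which yields the same grouping in one step.

```r
m   <- matrix(1:16, nrow = 8)[rep(1:8, 2), ]
key <- apply(m, 1, paste, collapse = "-")   # one key string per row
split(seq_len(nrow(m)), key)                # e.g. $`1-9` is c(1, 9)
```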


Re: [R] Identify row indices corresponding to each distinct row of a matrix

2018-11-08 Thread Bert Gunter
Yes -- much better than mine. I didn't know about the MARGIN argument of
duplicated().

-- Bert


On Wed, Nov 7, 2018 at 10:32 PM Jeff Newmiller 
wrote:

> Perhaps
>
> which( ! duplicated( m, MARGIN=1 ) )
>
> ? (untested)
>
> On November 7, 2018 9:20:57 PM PST, Bert Gunter 
> wrote:
> >A mess -- due to your continued use of html formatting.
> >
> >But something like this may do what you want (hard to tell with the
> >mess):
> >
> >> m <- matrix(1:16,nrow=8)[rep(1:8,2),]
> >> m
> >       [,1] [,2]
> >  [1,]    1    9
> >  [2,]    2   10
> >  [3,]    3   11
> >  [4,]    4   12
> >  [5,]    5   13
> >  [6,]    6   14
> >  [7,]    7   15
> >  [8,]    8   16
> >  [9,]    1    9
> > [10,]    2   10
> > [11,]    3   11
> > [12,]    4   12
> > [13,]    5   13
> > [14,]    6   14
> > [15,]    7   15
> > [16,]    8   16
> >> vec <- apply(m, 1, paste, collapse = "-")  ## converts rows into a character vector
> >> vec
> > [1] "1-9"  "2-10" "3-11" "4-12" "5-13" "6-14" "7-15" "8-16" "1-9"  "2-10" "3-11" "4-12" "5-13" "6-14"
> >[15] "7-15" "8-16"
> >> ## Then maybe:
> >> tapply(seq_along(vec),vec, I)
> >$`1-9`
> >[1] 1 9
> >
> >$`2-10`
> >[1]  2 10
> >
> >$`3-11`
> >[1]  3 11
> >
> >$`4-12`
> >[1]  4 12
> >
> >$`5-13`
> >[1]  5 13
> >
> >$`6-14`
> >[1]  6 14
> >
> >$`7-15`
> >[1]  7 15
> >
> >$`8-16`
> >[1]  8 16
> >
> >> ## gives the row numbers for each unique row
> >
> >There may well be slicker ways to do this -- if this is actually what
> >you
> >want to do.
> >
> >-- Bert
> >
> >
> >
> >On Wed, Nov 7, 2018 at 7:56 PM li li  wrote:
> >
> >> Hi all,
> >>I use the following example to illustrate my question. As you can
> >see,
> >> in matrix C some rows are repeated and I would like to find the
> >indices of
> >> the rows corresponding to each of the distinct rows.
> >>   For example, for the row c(1,9), I have used the "which" function
> >to
> >> identify the row indices corresponding to c(1,9). Using this
> >approach, in
> >> order to cover all distinct rows, I need to use a for loop.
> >>I am wondering whether there is an easier way where a for loop can
> >be
> >> avoided?
> >>Thanks very much!
> >>   Hanna
> >>
> >>
> >>
> >> > A <- matrix(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16),8,2)
> >> > B <- rbind(A,A,A)
> >> > C <- as.data.frame(B[sample(nrow(B)),])
> >> > C
> >>    V1 V2
> >> 1   1  9
> >> 2   2 10
> >> 3   3 11
> >> 4   5 13
> >> 5   7 15
> >> 6   6 14
> >> 7   4 12
> >> 8   3 11
> >> 9   8 16
> >> 10  5 13
> >> 11  7 15
> >> 12  2 10
> >> 13  1  9
> >> 14  8 16
> >> 15  1  9
> >> 16  3 11
> >> 17  7 15
> >> 18  4 12
> >> 19  2 10
> >> 20  6 14
> >> 21  4 12
> >> 22  8 16
> >> 23  5 13
> >> 24  6 14
> >> > T <- unique(C)
> >> > T
> >>   V1 V2
> >> 1  1  9
> >> 2  2 10
> >> 3  3 11
> >> 4  5 13
> >> 5  7 15
> >> 6  6 14
> >> 7  4 12
> >> 9  8 16
> >> > i <- 1
> >> > which(C[,1]==T[i,1] & C[,2]==T[i,2])
> >> [1]  1 13 15
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >   [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my phone. Please excuse my brevity.
>

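Jeff's suggestion checks out on the example from upthread; duplicated() and unique() both accept MARGIN = 1 for matrices.

```r
m <- matrix(1:16, nrow = 8)[rep(1:8, 2), ]
which(!duplicated(m, MARGIN = 1))   # rows 1..8 are the first occurrences
unique(m, MARGIN = 1)               # the distinct rows themselves
```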


Re: [R] list contingency tables

2018-11-09 Thread Bert Gunter
Yes, exactly (heh, heh).

?fisher.test

is probably what is wanted.

For arbitrary rxc tables with fixed marginals, this is a difficult problem.
Mehta's efficient network algorithm to solve it can be found by a web
search on "algorithm for Fisher exact test."

-- Bert



On Fri, Nov 9, 2018 at 12:36 PM David Winsemius 
wrote:

> Seems like you are trying to recreate the calculations needed to perform
> an exact test. Why not look at the code for that or even easier, just
> use the function.
>
> --
>
> David.
>
> On 11/8/18 8:05 PM, li li wrote:
> > Hi all,
> >I am trying to list all the 4 by 2 tables with some fixed margins.
> >For example, consider 4 by 2 tables with row margins 1,2,2,1 and
> > column margins 3,3. I was able to do it using the code below. However,
> > as seen below, I had to first count the total number of tables with
> > the specific row margins and column margins in order to create space
> > to store the tables.
> > Is there a way to skip the step of counting the number of tables?
> >Also, wanted to avoid for loops as much as possible since it can be
> > extremely slow and inefficient.
> > Thanks so much in advance for you insight and help.
> > Hanna
> >
> >
> >
> >> library(gtools)
> >> A <- permutations(n=4,r=2,v=0:3, repeats.allowed=TRUE)
> >> B <- apply(A, 1, sum)
> >> rmg <- c(1,2,2,1)
> >> cmg <- c(3,3)
> >> m1 <- t(A[which(B==1),])
> >> m2 <- t(A[which(B==2),])
> >> m3 <- t(A[which(B==2),])
> >>
> >> ##count number of tables with row margins 1,2,2,1 and column margins
> 3,3.
> >> num <- 0
> >> for (i in 1:ncol(m1)){
> > + for (j in 1:ncol(m2)){
> > + for (k in 1:ncol(m3)){
> > + M <- t(cbind(m1[,i], m2[,j], m3[,k]))
> > + M1 <- rbind(M, cmg-apply(M,2,sum))
> > + num <- num+(sum(M1[4,] < 0) == 0)
> > + }}}
> >>
> >> #create space to store the tables
> >> C <- array(NA, dim=c(4,2,num))
> >>
> >> # list all the tables with fixed margins
> >> num <- 0
> >> for (i in 1:ncol(m1)){
> > + for (j in 1:ncol(m2)){
> > + for (k in 1:ncol(m3)){
> > + M <- t(cbind(m1[,i], m2[,j], m3[,k]))
> > + M1 <- rbind(M,cmg-apply(M,2,sum))
> > + if (sum(M1[4,] < 0) == 0) {
> > + num <- num+1
> > +C[,,num] <- M1
> > + }
> > + }}}
> >> C
> > , , 1
> >
> >      [,1] [,2]
> > [1,]    0    1
> > [2,]    0    2
> > [3,]    2    0
> > [4,]    1    0
> >
> > , , 2
> >
> >      [,1] [,2]
> > [1,]    0    1
> > [2,]    1    1
> > [3,]    1    1
> > [4,]    1    0
> >
> > , , 3
> >
> >      [,1] [,2]
> > [1,]    0    1
> > [2,]    1    1
> > [3,]    2    0
> > [4,]    0    1
> >
> > , , 4
> >
> >      [,1] [,2]
> > [1,]    0    1
> > [2,]    2    0
> > [3,]    0    2
> > [4,]    1    0
> >
> > , , 5
> >
> >      [,1] [,2]
> > [1,]    0    1
> > [2,]    2    0
> > [3,]    1    1
> > [4,]    0    1
> >
> > , , 6
> >
> >      [,1] [,2]
> > [1,]    1    0
> > [2,]    0    2
> > [3,]    1    1
> > [4,]    1    0
> >
> > , , 7
> >
> >      [,1] [,2]
> > [1,]    1    0
> > [2,]    0    2
> > [3,]    2    0
> > [4,]    0    1
> >
> > , , 8
> >
> >      [,1] [,2]
> > [1,]    1    0
> > [2,]    1    1
> > [3,]    0    2
> > [4,]    1    0
> >
> > , , 9
> >
> >      [,1] [,2]
> > [1,]    1    0
> > [2,]    1    1
> > [3,]    1    1
> > [4,]    0    1
> >
> > , , 10
> >
> >      [,1] [,2]
> > [1,]    1    0
> > [2,]    2    0
> > [3,]    0    2
> > [4,]    0    1
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

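On the counting question above, a sketch of enumerating the 4x2 tables without the pre-counting pass: only the first column needs to be generated (the second column is then forced by the row margins), and growing a list with lapply() avoids both the preallocated array and the triple for loop.

```r
rmg <- c(1, 2, 2, 1)   # row margins
cmg <- c(3, 3)         # column margins

## candidate first columns: each entry between 0 and its row margin
col1 <- expand.grid(lapply(rmg, function(r) 0:r))
## keep those whose sum matches the first column margin
col1 <- col1[rowSums(col1) == cmg[1], , drop = FALSE]

## second column is forced: rmg - first column (automatically nonnegative)
tables <- lapply(seq_len(nrow(col1)), function(i) {
  a <- as.numeric(col1[i, ])
  cbind(a, rmg - a)
})
length(tables)   # 10 tables with these margins, as in the output above
```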


Re: [R] Reporting binomial logistic regression from R results

2018-11-12 Thread Bert Gunter
Generally speaking, this list is about questions on R programming, not
statistical issues. However, I grant you that your queries are in something
of a gray area intersecting both.

Nevertheless, based on your admitted confusion, I would recommend that you
find a local statistical expert with whom you can consult 1-1 if at all
possible. As others have already noted, your statistical understanding is
muddy, and it can be quite difficult to resolve such confusion in online
forums like this that cannot provide the close back and forth that may be
required (as well as further appropriate study).

Best,
Bert

On Mon, Nov 12, 2018 at 11:09 AM Frodo Jedi 
wrote:

> Dear Peter and Eik,
> I am very grateful to you for your replies.
> My current understanding is that from the GLM analysis I can indeed
> conclude that the response predicted by System A is significantly different
> from that of System B, while the pairwise comparison A vs C leads to non
> significance. Now the Wald test seems to be correct only for Systems B vs
> C, indicating that the pairwise System B vs System C is significant. Am I
> correct?
>
> However, my current understanding is also that I should use contrasts
> instead of the wald test. So the default contrasts is with the System A,
> now I should re-perform the GLM with another base. I tried to use the
> option "contrasts" of the glm:
>
> > fit1 <- glm(Response ~ System, data = scrd, family = "binomial",
> contrasts = contr.treatment(3, base=1,contrasts=TRUE))
> > summary(fit1)
>
> > fit2 <- glm(Response ~ System, data = scrd, family = "binomial",
> contrasts = contr.treatment(3, base=2,contrasts=TRUE))
> > summary(fit2)
>
> > fit3 <- glm(Response ~ System, data = scrd, family = "binomial",
> contrasts = contr.treatment(3, base=3,contrasts=TRUE))
> > summary(fit3)
>
> However, the output of these three summary functions are identical. Why?
> That option should have changed the base, but apparently this is not the
> case.
>
>
> Another analysis I found online (at this link
>
> https://stats.stackexchange.com/questions/60352/comparing-levels-of-factors-after-a-glm-in-r
> )
> to understand the differences between the 3 levels is to use glth with
> Tuckey. I performed the following:
>
> > library(multcomp)
> > summary(glht(fit, mcp(System="Tukey")))
>
> Simultaneous Tests for General Linear Hypotheses
>
> Multiple Comparisons of Means: Tukey Contrasts
>
>
> Fit: glm(formula = Response ~ System, family = "binomial", data = scrd)
>
> Linear Hypotheses:
>              Estimate Std. Error z value Pr(>|z|)
> B - A == 0    -1.2715     0.3379  -3.763 0.000445 ***
> C - A == 0     0.8588     0.4990   1.721 0.192472
> C - B == 0     2.1303     0.4512   4.722  < 1e-04 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> (Adjusted p values reported -- single-step method)
>
>
> Is this Tukey analysis correct?
>
>
> I am a bit confused on what analysis I should do. I am doing my very best
> to study all resources I can find, but I would really need some help from
> experts, especially in using R.
>
>
> Best wishes
>
> FJ
>
>
>
>
>
>
> On Mon, Nov 12, 2018 at 1:46 PM peter dalgaard  wrote:
>
> > Yes, only one of the pairwise comparisons (B vs. C) is right. Also, the
> > overall test has 3 degrees of freedom whereas a comparison of 3 groups
> > should have 2. You (meaning Frodo) are testing that _all 3_ regression
> > coefficients are zero, intercept included. That would imply that all
> > three systems have response probabilities of 0.5, which is not likely
> > what you want.
> >
> > This all suggests that you are struggling with the interpretation of the
> > regression coefficients and their role in the linear predictor. This
> should
> > be covered by any good book on logistic regression.
> >
> > -pd
> >
> > > On 12 Nov 2018, at 14:15 , Eik Vettorazzi  wrote:
> > >
> > > Dear Jedi,
> > > please use the source carefully. A and C are not statistically
> > > different at the 5% level, which can be inferred from the glm output.
> > > Your last two wald.tests don't test what you want, since your model
> > > contains an intercept term. You specified contrasts that test A vs.
> > > B-A, i.e. A - (B-A) == 0 <-> 2*A - B == 0, which is not what you
> > > intended, I think. Have a look at ?contr.treatment and re-read your
> > > source doc to get an idea of what dummy coding and indicator
> > > variables are about.
> > >
> > > Cheers
> > >
> > >
> > > Am 12.11.2018 um 02:07 schrieb Frodo Jedi:
> > >> Dear list members,
> > >> I need some help in understanding whether I am doing correctly a
> > binomial
> > >> logistic regression and whether I am interpreting the results in the
> > >> correct way. Also I would need an advice regarding the reporting of
> the
> > >> results from the R functions.
> > >> I want to report the results of a binomial logistic regression where I
> > want
> > >> to assess difference between the 3 levels of a factor (called System)
> on
> > >> the dependent variable (called Response) taking two values, 0

Re: [R] missRanger package

2018-11-12 Thread Bert Gunter
You have asked what I believe is an incoherent question, and thus are
unlikely to receive any useful replies (of course, I may be wrong about
this...).

Please read and follow the posting guide linked below to ask a question
that can be answered.


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Nov 12, 2018 at 12:03 PM Rebecca Bingert 
wrote:

> Hi,
> does anybody know where I need to insert the censoring in the missRanger
> package?
> Regards,
> Rebecca
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] which element is duplicated?

2018-11-12 Thread Bert Gunter
> match(v, unique(v))
[1] 1 2 2 1
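[Editorial note: a small extension of the same idea also yields the NA-for-first-occurrence variant Duncan asked for. A sketch:

```r
v <- c("a", "b", "b", "a")
first <- match(v, v)          # index of the first occurrence of each value
first                         # 1 2 2 1
first[!duplicated(v)] <- NA   # blank out the first occurrences
first                         # NA NA 2 1
```

This satisfies the stated property that result[i] == j implies v[i] == v[j].]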

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Nov 12, 2018 at 5:08 PM Duncan Murdoch 
wrote:

> The duplicated() function gives TRUE if an item in a vector (or row in a
> matrix, etc.) is a duplicate of an earlier item.  But what I would like
> to know is which item does it duplicate?
>
> For example,
>
> v <- c("a", "b", "b", "a")
> duplicated(v)
>
> returns
>
> [1] FALSE FALSE  TRUE  TRUE
>
> What I want is a fast way to calculate
>
>   [1] NA NA 2 1
>
> or (equally useful to me)
>
>   [1] 1 2 2 1
>
> The result should have the property that if result[i] == j, then v[i] ==
> v[j], at least for i != j.
>
> Does this already exist somewhere, or is it easy to write?
>
> Duncan Murdoch
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] which element is duplicated?

2018-11-12 Thread Bert Gunter
It is not clear what you want in the general case. Perhaps:

> v <- letters[c(2,2,1,2,1,1)]
> wh <- tapply(seq_along(v),factor(v), '[',1)
> w <- wh[match(v,v[wh])]
> w
b b a b a a
1 1 3 1 3 3
> ## and if you want NA's for the first occurrences of unique values
> ## of course:
> w[wh] <- NA
> w
 b  b  a  b  a  a
NA  1 NA  1  3  3

I'd like to see a cleverer solution that vectorizes and avoids the
tapply(), though.

Cheers,
Bert




On Mon, Nov 12, 2018 at 8:33 PM Bert Gunter  wrote:

> > match(v, unique(v))
> [1] 1 2 2 1
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Nov 12, 2018 at 5:08 PM Duncan Murdoch 
> wrote:
>
>> The duplicated() function gives TRUE if an item in a vector (or row in a
>> matrix, etc.) is a duplicate of an earlier item.  But what I would like
>> to know is which item does it duplicate?
>>
>> For example,
>>
>> v <- c("a", "b", "b", "a")
>> duplicated(v)
>>
>> returns
>>
>> [1] FALSE FALSE  TRUE  TRUE
>>
>> What I want is a fast way to calculate
>>
>>   [1] NA NA 2 1
>>
>> or (equally useful to me)
>>
>>   [1] 1 2 2 1
>>
>> The result should have the property that if result[i] == j, then v[i] ==
>> v[j], at least for i != j.
>>
>> Does this already exist somewhere, or is it easy to write?
>>
>> Duncan Murdoch
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] which element is duplicated?

2018-11-12 Thread Bert Gunter
"I'd like to see a cleverer solution that vectorizes..."

and Herve provided it.


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Nov 12, 2018 at 9:43 PM Bert Gunter  wrote:

> It is not clear what you want in the general case. Perhaps:
>
> > v <- letters[c(2,2,1,2,1,1)]
> > wh <- tapply(seq_along(v),factor(v), '[',1)
> > w <- wh[match(v,v[wh])]
> > w
> b b a b a a
> 1 1 3 1 3 3
> > ## and if you want NA's for the first occurrences of unique values
> > ## of course:
> > w[wh] <- NA
> > w
>  b  b  a  b  a  a
> NA  1 NA  1  3  3
>
> I'd like to see a cleverer solution that vectorizes and avoids the
> tapply(), though.
>
> Cheers,
> Bert
>
>
>
>
> On Mon, Nov 12, 2018 at 8:33 PM Bert Gunter 
> wrote:
>
>> > match(v, unique(v))
>> [1] 1 2 2 1
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Mon, Nov 12, 2018 at 5:08 PM Duncan Murdoch 
>> wrote:
>>
>>> The duplicated() function gives TRUE if an item in a vector (or row in a
>>> matrix, etc.) is a duplicate of an earlier item.  But what I would like
>>> to know is which item does it duplicate?
>>>
>>> For example,
>>>
>>> v <- c("a", "b", "b", "a")
>>> duplicated(v)
>>>
>>> returns
>>>
>>> [1] FALSE FALSE  TRUE  TRUE
>>>
>>> What I want is a fast way to calculate
>>>
>>>   [1] NA NA 2 1
>>>
>>> or (equally useful to me)
>>>
>>>   [1] 1 2 2 1
>>>
>>> The result should have the property that if result[i] == j, then v[i] ==
>>> v[j], at least for i != j.
>>>
>>> Does this already exist somewhere, or is it easy to write?
>>>
>>> Duncan Murdoch
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unexpected failure of Cholesky decomposition

2018-11-13 Thread Bert Gunter
Your understanding is wrong. The eigenvalues, not singular values, must be
positive, and they are not.

Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Nov 13, 2018 at 7:39 AM Hoffman, Gabriel 
wrote:

> My understanding is that a Cholesky decomposition should work on any
> square, positive definite matrix.  I am encountering an issue where chol()
> fails and give the error: "the leading minor of order 3 is not positive
> definite"
>
> This occurs on multiple machines and version of R.
>
> Here is a minimal reproducible example:
>
> # initialize matrix
> values = c(1,0.725,0,0,0.725,1,0.692,0,0,0.692,1,0.644,0,0,0.664,1)
> B = matrix(values, 4,4)
>
> # show that singular values are positive
> svd(B)$d
>
> # show that matrix is symmetric
> isSymmetric(B)
>
> # B is symmetric positive definite, but Cholesky still fails
> chol(B)
>
> Is this a numerical stability issue?  How can I predict which matrices
> will fail?
>
> - Gabriel
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glmulti package assistance please

2018-11-15 Thread Bert Gunter
Please do not cross post (see the posting guide). This should go only
to the mixed models list.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Nov 15, 2018 at 3:47 AM Bill Poling  wrote:
>
> Hi, I have removed the pdf which was causing my e-mail to be blocked by 
> moderators, my apologies.
>
> https://www.jstatsoft.org/article/view/v034i12/v34i12.pdf
>
> Original post:
>
> Hello. I am still trying to get some of the examples in this glmulti pdf to 
> work with my data.
>
> I have sent e-mails to author addresses provided but no response or bounced 
> back as in valid.
>
> I am not sure if this is more likely to receive support on r-help or 
> r-sig-mixed-models, hence the double posting, my apologies in advance.
>
> I am windows 10 -- R3.5.1 -- RStudio Version 1.1.456
>
> glmulti: An R Package for Easy Automated Model Selection with (Generalized) 
> Linear Models
>
> pdf Attached:
>
> On page 13 section 3.1 of the pdf they describe a routine to estimate the 
> candidate models possible.
>
> Their data description:
> The number of levels factors have does not affect the number of candidate 
> models, only their complexity. We use a data frame dod, containing as a first 
> column a dummy response variable, the next 6 columns are dummy factors with 
> three levels, and the last six are dummy covariates.
> To compute the number of candidate models when there are between 1 and 6 
> factors and 1 and 6 covariates, we call glmulti with method = "d" and data = 
> dod. We use names(dod) to specify the names of the response variable and of 
> the predictors. We vary the number of factors and covariates, this way:
>
>
> Their routine:
> dd <- matrix(nc = 6, nr = 6)
> for(i in 1:6) for(j in 1:6) dd[i, j] <- glmulti(names(dod)[1],
> +   names(dod)[c(2:(1 + i), 8:(7 + j))], data = dod, method = "d")
>
> My data, I organized it similar to the example, Response, Factor, Factor, 5 
> covariates
>
> Classes 'data.table' and 'data.frame':23141 obs. of  8 variables:
>  $ Editnumber2: num  0 0 1 1 1 1 1 1 1 1 ...
>  $ PatientGender  : Factor w/ 3 levels "F","M","U": 1 1 2 2 2 2 1 1 1 1 ...
>  $ B1 : Factor w/ 14 levels "Z","A","C","D",..: 2 2 3 3 2 2 2 2 2 
> 2 ...
>  $ SavingsReversed: num  -0.139 -0.139 -0.139 -0.139 -0.139 ...
>  $ productID  : int  3 3 3 3 3 3 3 3 1 1 ...
>  $ ProviderID : int  113676 113676 113964 113964 114278 114278 114278 
> 114278 114278 114278 ...
>  $ ModCnt : int  0 0 0 0 1 1 1 1 1 1 ...
>  $ B2 : num  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
>  - attr(*, ".internal.selfref")=
>
> Trying to follow what they did, my routine, Editnumber2 is the response 
> variable:
>
> dd <- matrix(nc = 2, nr = 5)
> for(i in 1:2) for(j in 1:5) dd[i, j] <- glmulti(names(r1)[1], 
> names(r1)[c(2:(1 + i), 7:(6 + j))], data = r1, method = "d")
>
> The error: Error in terms.formula(formula, data = data) :
>   invalid model formula in ExtractVars
>
> I have tried changing the numbers around but get results like this:
>
> Initialization...
> TASK: Diagnostic of candidate set.
> Sample size: 23141
> 2 factor(s).
> 2 covariate(s). <--appears to be missing 3 of the covariates for some reason?
> 0 f exclusion(s).
> 0 c exclusion(s).
> 0 f:f exclusion(s).
> 0 c:c exclusion(s).
> 0 f:c exclusion(s).
> Size constraints: min =  0 max = -1
> Complexity constraints: min =  0 max = -1 Your candidate set contains 250 
> models.
> Error in `[<-`(`*tmp*`, i, j, value = glmulti(names(r1)[1], names(r1)[c(2:(1 
> +  :
>   subscript out of bounds
>
>
> I hope someone can help straighten out my code, thank you.
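[Editorial note: one possible culprit, independent of glmulti itself: with only 8-9 columns in r1, the loop indexes names(r1) past the end, and out-of-range indices yield NA names, which would break the formula glmulti builds. A sketch with hypothetical stand-in column names:

```r
# r1 has 8-9 columns, but j up to 5 pushes the index to 6 + 5 = 11:
nm <- paste0("col", 1:9)   # stand-in for names(r1) (hypothetical)
nm[c(2:3, 7:(6 + 5))]      # "col2" "col3" "col7" "col8" "col9" NA NA
```

Checking that every index in c(2:(1 + i), 7:(6 + j)) stays within length(names(r1)) would rule this out.]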
>
>
> WHP
>
>
>
> Confidentiality Notice This message is sent from Zelis. ...{{dropped:13}}
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glmulti package assistance please

2018-11-15 Thread Bert Gunter
OK. Then post here but *not* on mixed models list. One or the other,
exclusive or.

-- Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Nov 15, 2018 at 7:52 AM Michael Dewey  wrote:
>
> Dear Bert
>
> Since glmulti operates on glm/lm models I think, although I agree about
> not cross-posting, that it was OK here. Perhaps I do not understand the
> full significance of mixed models though.
>
> Michael
>
> On 15/11/2018 15:43, Bert Gunter wrote:
> > Please do not cross post (see te posting guide). This should go only
> > to the mixed models list.
> >
> > -- Bert
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> > On Thu, Nov 15, 2018 at 3:47 AM Bill Poling  wrote:
> >>
> >> Hi, I have removed the pdf which was causing my e-mail to be blocked by 
> >> moderators, my apologies.
> >>
> >> https://www.jstatsoft.org/article/view/v034i12/v34i12.pdf
> >>
> >> Original post:
> >>
> >> Hello. I am still trying to get some of the examples in this glmulti pdf 
> >> to work with my data.
> >>
> >> I have sent e-mails to author addresses provided but no response or 
> >> bounced back as in valid.
> >>
> >> I am not sure if this is more likely to receive support on r-help or 
> >> r-sig-mixed-models, hence the double posting, my apologies in advance.
> >>
> >> I am windows 10 -- R3.5.1 -- RStudio Version 1.1.456
> >>
> >> glmulti: An R Package for Easy Automated Model Selection with 
> >> (Generalized) Linear Models
> >>
> >> pdf Attached:
> >>
> >> On page 13 section 3.1 of the pdf they describe a routine to estimate the 
> >> candidate models possible.
> >>
> >> Their data description:
> >> The number of levels factors have does not affect the number of candidate 
> >> models, only their complexity. We use a data frame dod, containing as a 
> >> first column a dummy response variable, the next 6 columns are dummy 
> >> factors with three levels, and the last six are dummy covariates.
> >> To compute the number of candidate models when there are between 1 and 6 
> >> factors and 1 and 6 covariates, we call glmulti with method = "d" and data 
> >> = dod. We use names(dod) to specify the names of the response variable and 
> >> of the predictors. We vary the number of factors and covariates, this way:
> >>
> >>
> >> Their routine:
> >> dd <- matrix(nc = 6, nr = 6)
> >> for(i in 1:6) for(j in 1:6) dd[i, j] <- glmulti(names(dod)[1],
> >> +   names(dod)[c(2:(1 + i), 8:(7 + j))], data = dod, method = "d")
> >>
> >> My data, I organized it similar to the example, Response, Factor, Factor, 
> >> 5 covariates
> >>
> >> Classes 'data.table' and 'data.frame':23141 obs. of  8 variables:
> >>   $ Editnumber2: num  0 0 1 1 1 1 1 1 1 1 ...
> >>   $ PatientGender  : Factor w/ 3 levels "F","M","U": 1 1 2 2 2 2 1 1 1 1 
> >> ...
> >>   $ B1 : Factor w/ 14 levels "Z","A","C","D",..: 2 2 3 3 2 2 2 
> >> 2 2 2 ...
> >>   $ SavingsReversed: num  -0.139 -0.139 -0.139 -0.139 -0.139 ...
> >>   $ productID  : int  3 3 3 3 3 3 3 3 1 1 ...
> >>   $ ProviderID : int  113676 113676 113964 113964 114278 114278 114278 
> >> 114278 114278 114278 ...
> >>   $ ModCnt : int  0 0 0 0 1 1 1 1 1 1 ...
> >>   $ B2 : num  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
> >>   - attr(*, ".internal.selfref")=
> >>
> >> Trying to follow what they did, my routine, Editnumber2 is the response 
> >> variable:
> >>
> >> dd <- matrix(nc = 2, nr = 5)
> >> for(i in 1:2) for(j in 1:5) dd[i, j] <- glmulti(names(r1)[1], 
> >> names(r1)[c(2:(1 + i), 7:(6 + j))], data = r1, method = "d")
> >>
> >> The error: Error in terms.formula(formula, data = data) :
> >>invalid model formula in ExtractVars
> >>
> >> I have tried changing the numbers around but get results like this:
> >>
> >> Initialization...
> >> TASK: Dia

Re: [R] help with grouping data and calculating the means

2018-11-15 Thread Bert Gunter
On Thu, Nov 15, 2018 at 10:40 AM Boris Steipe  wrote:
>
> Use round() with the appropriate  "digits" argument. Then use unique() to 
> define your groups.

No.
> round(c(.124,.126),2)
[1] 0.12 0.13

As I understand it, the OP said he wanted the last decimal to be ignored.

The OP also did not specify what he wanted to calculate means of. I
assume TK-QUADRANT. It is also not clear whether the calculations are
to be done separately by latitude and longitude, or both together.
I'll assume separately. In that case, the calculation of TK-QUADRANT
means grouped according to 4-decimal-digit values of latitude could be
done as follows (using the provided example data):
(Note: ignore all that follows if my interpretation is incorrect)

> with(df, tapply(TK.QUADRANT, floor(1e4*LAT),mean))
 549249  549749  550249  550749
10158.5 10156.5  9163.5  9161.5

## Note that this assumes positive values of latitude, because:
> floor(c(-1.2,1.2))
[1] -2  1

This could be easily modifed if both positive and negative values were
used: e.g.
> x <-c(-1.2,1.2)
> sign(x)*floor(abs(x))
[1] -1  1

Confession: I suspect that this scale-and-floor() procedure
might fail with lots of decimal places due to the usual issues of
binary representations of decimals. But maybe it fails even here. If
so, I would appreciate someone pointing this out and, if possible,
providing a better strategy.
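[Editorial note: one concrete instance of the binary-representation pitfall suspected above. A sketch:

```r
# 0.29 cannot be represented exactly in binary; the product
# 0.29 * 100 is stored as 28.999999999999996, so flooring
# truncates one unit too far:
print(0.29 * 100, digits = 17)  # 28.999999999999996
floor(0.29 * 100)               # 28, not 29
```

This is why the string-based truncation in the follow-up message is the safer strategy when the decimal digits themselves are what matter.]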

Cheers,
Bert



>
> HTH,
> B.
>
>
> > On 2018-11-15, at 11:48, sasa kosanic  wrote:
> >
> > Dear All,
> >
> > I would very much appreciate the help with following:
> > I need to calculate the mean of  different lat/long points that should be
> > grouped.
> > However I would like that r excludes taking  values that are different in
> > only last decimal.
> > So instead 4 values in the group it would calculate the mean for only 3(
> > excluding the ones that differs in only one decimal).
> > # construct the dataframe
> > `TK-QUADRANT` <- c(9161,9162,9163,9164,10152,10154,10161,10163)
> > LAT <- c(55.07496,55.07496,55.02495,55.02496
> > ,54.97496,54.92495,54.97496,54.92496)
> > LON <-
> > c(8.37477,8.458109,8.37477,8.45811,8.291435,8.291437,8.374774,8.374774)
> > df <- data.frame(`TK-QUADRANT`=`TK-QUADRANT`,LAT=LAT,LON=LON)
> >
> >
> > I would like to group the data and calculate means by group but in a way to
> > exclude every number that differs in only last decimal.
> >
> >
> > Also please see pdf. example-attached .
> >
> > Many thanks!
> > Best wishes,
> > Sasha
> >
> > --
> >
> > Dr Sasha Kosanic
> > Ecology Lab (Biology Department)
> > Room M644
> > University of Konstanz
> > Universitätsstraße 10
> > D-78464 Konstanz
> > Phone: +49 7531 883321 & +49 (0)175 9172503
> >
> > http://cms.uni-konstanz.de/vkleunen/
> > https://tinyurl.com/y8u5wyoj
> > https://tinyurl.com/cgec6tu
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with grouping data and calculating the means

2018-11-15 Thread Bert Gunter
On further thought -- and subject to my prior interpretation -- I
think a foolproof way of truncating to 4 decimal digits is to treat
them as character strings rather than numerics and use regex
operations:

> with(df,tapply(TK.QUADRANT, 
> sub("(\\.[[:digit:]]{4}).*","\\1",as.character(LAT)),mean))
54.9249 54.9749 55.0249 55.0749
10158.5 10156.5  9163.5  9161.5

I should have realized this before!!!!

Cheers,
Bert





Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Thu, Nov 15, 2018 at 12:19 PM Bert Gunter  wrote:
>
> On Thu, Nov 15, 2018 at 10:40 AM Boris Steipe  
> wrote:
> >
> > Use round() with the appropriate  "digits" argument. Then use unique() to 
> > define your groups.
>
> No.
> > round(c(.124,.126),2)
> [1] 0.12 0.13
>
> As I understand it, the OP said he wanted the last decimal to be ignored.
>
> The OP also did not specify what he wanted to calculate means of. I
> assume TK-QUADRANT. It is also not clear whether the calculations are
> to be done separately by latitude and longitude, or both together.
> I'll assume separately. In that case, the calculation of TK-QUADRANT
> means grouped according to 4-decimal-digit values of latitude could be
> done as follows (using the provided example data):
> (Note: ignore all that follows if my interpretation is incorrect)
>
> > with(df, tapply(TK.QUADRANT, floor(1e4*LAT),mean))
>  549249  549749  550249  550749
> 10158.5 10156.5  9163.5  9161.5
>
> ## Note that this assumes positive values of latitude, because:
> > floor(c(-1.2,1.2))
> [1] -2  1
>
> This could be easily modifed if both positive and negative values were
> used: e.g.
> > x <-c(-1.2,1.2)
> > sign(x)*floor(abs(x))
> [1] -1  1
>
> Confession: I suspect that this scale-and-floor() procedure
> might fail with lots of decimal places due to the usual issues of
> binary representations of decimals. But maybe it fails even here. If
> so, I would appreciate someone pointing this out and, if possible,
> providing a better strategy.
>
> Cheers,
> Bert
>
>
>
> >
> > HTH,
> > B.
> >
> >
> > > On 2018-11-15, at 11:48, sasa kosanic  wrote:
> > >
> > > Dear All,
> > >
> > > I would very much appreciate the help with following:
> > > I need to calculate the mean of  different lat/long points that should be
> > > grouped.
> > > However I would like that r excludes taking  values that are different in
> > > only last decimal.
> > > So instead 4 values in the group it would calculate the mean for only 3(
> > > excluding the ones that differs in only one decimal).
> > > # construct the dataframe
> > > `TK-QUADRANT` <- c(9161,9162,9163,9164,10152,10154,10161,10163)
> > > LAT <- c(55.07496,55.07496,55.02495,55.02496
> > > ,54.97496,54.92495,54.97496,54.92496)
> > > LON <-
> > > c(8.37477,8.458109,8.37477,8.45811,8.291435,8.291437,8.374774,8.374774)
> > > df <- data.frame(`TK-QUADRANT`=`TK-QUADRANT`,LAT=LAT,LON=LON)
> > >
> > >
> > > I would like to group the data and calculate means by group but in a way 
> > > to
> > > exclude every number that differs in only last decimal.
> > >
> > >
> > > Also please see pdf. example-attached .
> > >
> > > Many thanks!
> > > Best wishes,
> > > Sasha
> > >
> > > --
> > >
> > > Dr Sasha Kosanic
> > > Ecology Lab (Biology Department)
> > > Room M644
> > > University of Konstanz
> > > Universitätsstraße 10
> > > D-78464 Konstanz
> > > Phone: +49 7531 883321 & +49 (0)175 9172503
> > >
> > > http://cms.uni-konstanz.de/vkleunen/
> > > https://tinyurl.com/y8u5wyoj
> > > https://tinyurl.com/cgec6tu
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide 
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with factor column replacement value issue

2018-11-16 Thread Bert Gunter
As usual, careful reading of the relevant Help page would resolve the confusion.

from ?factor:

"factor(x, exclude = NULL) applied to a factor without NAs is a
no-operation unless there are unused levels: in that case, a factor
with the reduced level set is returned. If exclude is used, since R
version 3.4.0, excluding non-existing character levels is equivalent
to excluding nothing, and when exclude is a character vector, that is
applied to the levels of x. Alternatively, exclude can be a factor with
the same level set as x and will exclude the levels present in
exclude."

In R, subsetting a factor does not change the levels attribute, even if
some levels are not present. One must explicitly remove them, e.g.:

> f <- factor(letters[1:3])
## 3 levels, all present

> f[1:2]
[1] a b
Levels: a b c
## 3 levels, but one empty

> factor(f[1:2], exclude = NULL)
[1] a b
Levels: a b
## Now only two levels
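[Editorial note: the dedicated helper does the same level-dropping in one step. A sketch:

```r
f <- factor(letters[1:3])
g <- droplevels(f[1:2])   # drops the now-empty level "c"
levels(g)                 # "a" "b"
```

Applied to the original question, droplevels(r1$B1) after the 'l' -> 'L' reassignment would remove the empty 'l' level that later surfaced in preProcess().]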


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Fri, Nov 16, 2018 at 7:38 AM Bill Poling  wrote:
>
> Hello:
>
> I am running windows 10 -- R3.5.1 -- RStudio Version 1.1.456
>
> I would like to know why when I replace a column value it still appears in 
> subsequent routines:
>
> My example:
>
> r1$B1 is a Factor: It is created from the first character of a list of CPT 
> codes, r1$CPT.
>
> head(r1$CPT, N= 25)
> [1] A4649 A4649 C9359 C1713 A0394 A0398
> 903 Levels: 0 1 00140 00160 00670 00810 00940 01400 01470 01961 01968 
> 10160 11000 11012 11042 11043 11044 11045 11100 11101 11200 11201 11401 11402 
> ... l8699
>
> str(r1$CPT)
>  Factor w/ 903 levels "0","1",..: 773 773 816 783 739 741 743 739 739 
> 741 ...
>
>
> And I want only those CPT's with leading alpha char in this column so I set 
> the numeric leading char to Z
>
> r1$B1 <- str_sub(r1$CPT,1,1)
>
> r1$B1 <- as.factor(r1$B1) #Redundant
> levels(r1$B1)[levels(r1$B1) %in%  c('1','2','3','4','5','6','7','8','9','0')] 
> <- 'Z'
>
> When I check what I have done I find l & L
>
> unique(r1$B1)
> #[1] A C Z L G Q U J V E S l D P
> #Levels: Z A C D E G J l L P Q S U V
>
> So I change l to L
> r1$B1[r1$B1 == 'l'] <- 'L'
>
> When I check again I have l & L but l = 0
> table(r1$B1)
> #   Z  A  C  D E G  J   l L   
>   P Q S U V
> #19639  1673   546 2 8   147   281 0664 16436   
> 11414
>
> When I go to find those rows as if they existed, they are not accounted for?
>
> tmp <- subset(r1, B1 == "l")
> print(tmp)
> Empty data.table (0 rows) of 9 cols: 
> SavingsReversed,productID,ProviderID,PatientGender,ModCnt,Editnumber2...
>
> And I have actually visually inspected the whole darn column, sheesh!
>
> So I ignore it temporarily.
>
> Now later on it resurfaces in a tutorial I am following for caret pkg.
>
> preProcess(r1b, method = c("center", "scale"),
>thresh = 0.95, pcaComp = NULL, na.remove = TRUE, k = 5,
>knnSummary = mean, outcome = NULL, fudge = 0.2, numUnique = 3,
>verbose = FALSE, freqCut = 95/5, uniqueCut = 10, cutoff = 0.9,
>rangeBounds = c(0, 1))
> # Warning in preProcess.default(r1b, method = c("center", "scale"), thresh = 
> 0.95,  :
> # These variables have zero variances: B1l  
> <-yes this is a remnant of the r1$B1 clean-up
> #   Created from 23141 samples and 22 variables
> #
> #   Pre-processing:
> # - centered (22)
> # - ignored (0)
> # - scaled (22)
>
>
> So my questions are, in consideration of regression modelling accuracy:
>
> Why is this happening?
> How do I remove it?
> Or is it irrelevant and leave it be?
>
> As always, thank you for you support.
>
> WHP
>
>
>
>
>
>
>
>
>
>
>
>
> Confidentiality Notice This message is sent from Zelis. ...{{dropped:13}}
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiplication of regression coefficient by factor type variable

2018-11-17 Thread Bert Gunter
You shouldn't have to do any of what you are doing.

See ?predict.lm  and note the "newdata" argument.
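A sketch of that approach. In the post, y, x1, x2, x3 are free-standing vectors; the assumption here is that they are (or are made) columns of the data frame, so predict() can rebuild the factor dummy columns itself:

```r
# Fit with the factors inside the data frame:
model1 <- lm(y ~ x1 + x2 + x3, data = data)

# Prediction and residual for row i -- no manual coefficient arithmetic:
yhat_i  <- predict(model1, newdata = data[i, , drop = FALSE])
resid_i <- data$y[i] - yhat_i

# For the full fit, residuals are already available:
r_all <- residuals(model1)
```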

Also, you should spend some time studying a linear model text, as your
question appears to indicate some basic confusion (e.g. about
"contrasts" ) about how they work.


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sat, Nov 17, 2018 at 1:24 PM Julian Righ Sampedro
 wrote:
>
> Dear all,
>
> In a context of regression, I have three regressors, two of which are
> categorical variables (sex and education) and have class 'factor'.
>
> y = data$income
> x1 = as.factor(data$sex)  # 2 levels
> x2 = data$age  # continuous
> x3 = as.factor(data$ed)  # 8 levels
>
> for example, the first entries of x3 are
>
> head(x3)
> [1] 5 3 5 5 4 2
> Levels: 1 2 3 4 5 6 7 8
>
> When we call the model, the output looks like this
>
> model1=lm(y ~ x1 + x2 + x3, data = data)
> summary(model1)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
>  -31220   -6300    -594    4429  190731
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept)   1440.66    3809.99   0.378 0.705417
> x1           -4960.88     772.96  -6.418 2.13e-10 ***
> x2             181.45      25.03   7.249 8.41e-13 ***
> x32           2174.95    3453.22   0.630 0.528948
> x33           7497.68    3428.94   2.187 0.029004 *
> x34           8278.97    3576.30   2.315 0.020817 *
> x35          13686.88    3454.93   3.962 7.97e-05 ***
> x36          15902.92    4408.49   3.607 0.000325 ***
> x37          28773.13    3696.77   7.783 1.76e-14 ***
> x38          31455.55    5448.11   5.774 1.03e-08 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 12060 on 1001 degrees of freedom
> Multiple R-squared:  0.2486,Adjusted R-squared:  0.2418
> F-statistic: 36.79 on 9 and 1001 DF,  p-value: < 2.2e-16
>
> Now suppose I want to compute the residuals. To do so I first need to
> compute the prediction by the model. (I use it in a cross validation
> context so it is a partial display of the code)
>
> yhat1 = model1$coef[1] + model1$coef[2]*x1[i] + model1$coef[3]*x2[i] +
> model1$coef[4]*x3[i]
>
> But I get the following warnings
>
> Warning messages:
> 1: In Ops.factor(model1$coef[2], x1[i]) : ‘*’ not meaningful for factors
> 2: In Ops.factor(model1$coef[4], x3[i]) : ‘*’ not meaningful for factors
>
> 1st question: Is there a way to multiply the coefficient by the 'factor'
> without having to transform my 'factor' into a 'numeric' type variable ?
>
> 2nd question: Since x3 is associated with 7 parameters (one for x32, one
> for x33, ... , one for x38), how do I multiply the 'correct' parameter
> coefficient with my 'factor' x3 ?
>
> I have been considering an 'if-then' solution, but to no avail. I also have
> considered splitting my x3 variable into 8 binary variables without
> succeeding. What may be the best approach ? Thank you for your help.
>
> Since I understand this may not be specific enough, I add here the complete
> code
>
> # for n-fold cross validation
> # fit models on leave-one-out samples
> x1= as.factor(data$sex)
> x2= data$age
> x3= as.factor(data$ed)
> yn=data$income
> n = length(yn)
> e1 = e2 = numeric(n)
>
> for (i in 1:n) {
>   # the ith observation is excluded
>   y = yn[-i]
>   x_1 = x1[-i]
>   x_2 = x2[-i]
>   x_3 = x3[-i]
>   x_4 = as.factor(cf4)[-i]
>   # fit the first model without the ith observation
>   J1 = lm(y ~ x_1 + x_2 + x_3)
>   yhat1 = J1$coef[1] + J1$coef[2]*x1[i] + J1$coef[3]*x2[i] + J1$coef[4]*x3[i]
>   # construct the ith part of the loss function for model 1
>   e1[i] = yn[i] - yhat1
>   # fit the second model without the ith observation
>   J2 = lm(y ~ x_1 + x_2 + x_3  + x_4)
>   
> yhat2=J2$coef[1]+J2$coef[2]*x1[i]+J2$coef[3]*x2[i]+J2$coef[4]*x3[i]+J2$coef[5]*cf4[i]
>   e2[i] = yn[i] - yhat2
>  }
>  sqrt(c(mean(e1^2),mean(e2^2))) # RMSE
>
> cf4 is a variable corresponding to groups (using mixtures). What we want
> to demonstrate is that the prediction error after cross-validation is lower
> when we include this latent grouping variable. It works wonders when the
> categorical variables are treated as 'numeric' variables, though the OLS
> estimates are obviously very different.
>
> Thank you in advance for your views on the problem.
>
> Best Regards,
>
> julian
>
> [[alternative HTML version deleted]]
>

Re: [R] [R studio] Plotting of line chart for each columns at 1 page

2018-11-20 Thread Bert Gunter
You need to do some studying! ggplot is built on the grid graphics system,
which is separate from the base graphics system. The par() function is part
of the *base* graphics system and so ignored by ggplot.

Others may offer you solutions using the "faceting" functionality of
ggplot. But you really should read up on this on your own. There are
many good tutorials on ggplot2 available on the web.
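For what it's worth, a minimal faceting sketch (stand-in data; the column layout is an assumption, not the poster's actual spreadsheet): reshape the 38 columns to long format, then let ggplot2 draw one panel per column.

```r
# Hedged sketch: facet_wrap() in place of par(mfrow = c(4, 10)).
library(ggplot2)

JJ <- as.data.frame(matrix(rnorm(500 * 38), ncol = 38))  # stand-in for sheet 1
JJ$Score <- seq_len(nrow(JJ))

# Wide -> long: one row per (Score, country, value)
long <- reshape(JJ, direction = "long",
                varying = list(names(JJ)[1:38]),
                v.names = "value", timevar = "country",
                times = names(JJ)[1:38])

p <- ggplot(long, aes(x = Score, y = value)) +
  geom_line() +
  facet_wrap(~ country, ncol = 10) +  # grid analogous to par(mfrow = c(4, 10))
  labs(x = "Scores", y = NULL)
```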

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Nov 20, 2018 at 10:19 AM Subhamitra Patra <
subhamitra.pa...@gmail.com> wrote:

> Dear R users,
>
> I have one excel file with 5 sheets. The no. of columns vary for each
> sheet. The 1st sheet consists of 38 columns. So, I want to plot 38 separate
> line charts and arrange them in par(mfrow = c(4, 10)) order. Please suggest
> how to do this. I have tried the following code by running a loop
> inside of a sheet, but it is not working. Further, I want to run loops for
> each sheet.
>
> par(mfrow = c(4, 10))
> loop.vector <- 1:38
> for (i in loop.vector)
> x <- JJ[,i]
> library(ggplot2)
>   library(cowplot)
>   plot.mpg <- ggplot(mpg, aes(x,
>   main = paste ("country", i),
>   xlab = "Scores",
>   xlim = c(1,500)
>   y = colnames[i,], colour = factor(cyl))) +
>   geom_line(size=2.5)
> save_plot("mpg.png", plot.mpg,
>   base_aspect_ratio = 1.3)
>
> I want to give my X axis name as scores of (1,500) and Y axis as the
> particular column names for all graphs.
>
> Please suggest.
>
> Thanks in advance.
>
> --
> *Best Regards,*
> *Subhamitra Patra*
> *Phd. Research Scholar*
> *Department of Humanities and Social Sciences*
> *Indian Institute of Technology, Kharagpur*
> *INDIA*
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] detecting measurement of specific id in column in R

2018-11-22 Thread Bert Gunter
Jeff's advice is sound, but as I have a bit of time on my hand, I'll take a
guess at it. If it's wrong, then follow Jeff's advice so that we don't have
to continue to guess -- and do as he describes in any future posts, of
course. Note also that the mail server strips off attachments (except for a
few special types -- see the mailing list instructions for which), so you
need to follow the posting guide to post example data (see ?dput).

It sounds to me as if your data are in a data frame, d, that looks like
this:

Sample_ID   IN    d13c   ppm_CO2   ppm_13CO2  ...
1           v1    x1     y1        z1
1           v2    x2     y2        z2
...
2           vm    xm     ym        zm
...
2           etc.
3
...

If so, then something like

nm <- c("d13c", "ppm_CO2", "ppm_13CO2")
by(d, d$Sample_ID, function(x){ x[, nm] - x[1, "IN"] }, simplify = FALSE )

would do what you seem to request. See ?by and associated links for details.
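A minimal runnable version of this idea, with fabricated numbers in the assumed layout:

```r
# Fabricated example data in the shape guessed above
d <- data.frame(Sample_ID = rep(1:2, each = 3),
                IN        = c(10, 10, 10, 20, 20, 20),
                d13c      = 11:16,
                ppm_CO2   = 21:26,
                ppm_13CO2 = 31:36)

nm  <- c("d13c", "ppm_CO2", "ppm_13CO2")
# For each Sample_ID, subtract that group's first "IN" value
res <- by(d, d$Sample_ID, function(x) x[, nm] - x[1, "IN"], simplify = FALSE)

res[["1"]]$d13c   # 11:13 minus 10 -> 1 2 3
```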

Again, if this is not what you want, do not ask me for further help. I'm
done guessing.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Nov 22, 2018 at 8:14 AM Romy Rehschuh via R-help <
r-help@r-project.org> wrote:

> Dear all,
>
> I hope this is the right way to ask questions.
> I have a problem with R regarding the detection of the measurement of a
> specific sample_id (see example file attached). I have to substract the
> "IN" values (means the air which goes into the chambers) from the values of
> "d13C", "ppm_CO2" and "ppm_13CO2" for every single chamber (=sample ID).
> The "IN" values have to be the ones which were measured* before *the
> measurements of the single chambers in time. I measured "IN" once and then
> up to 10 chambers in a row to save time, then "IN" again, but it can
> change.
> Therefore, searching for the closest "IN" does not work.
>
> Do you have any suggestions? Would it be possible to write a loop for this?
> I would really much appreciate your help!
>
> Best, Vicci
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] warnings when using binomial models and offset

2018-11-23 Thread Bert Gunter
You should post this on the r-sig-mixed-models list, which is (obviously)
specifically concerned with mixed models, and where you are more likely to
find the expertise and help you seek.

Cheers,

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Nov 23, 2018 at 7:57 AM Joana Martelo 
wrote:

> Hello everyone
>
>
>
> I'm trying to model fish capture success using length, velocity and group
> composition as explanatory variables, density as an offset variable, and
> fish.id as the random effect. I'm getting the following warnings:
>
>
>
> Model1 <- glmer(capture ~ length + offset(density) + (1|fish.id),
>                 family = binomial, data = cap)
>
>
>
> Warning messages:
>
> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>
>   Model failed to converge with max|grad| = 0.260123 (tol = 0.001,
> component
> 1)
>
> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>
>   Model is nearly unidentifiable: very large eigenvalue
>
> - Rescale variables?
>
>
>
>
>
> -  I only get the warnings when I use length and group composition,
> not with velocity.
>
> -  I don't get any warning if I don't use the offset.
>
>
>
> I've tried:
>
>
> Model1 <- glmer(capture ~ length + offset(log(density)) + (1|fish.id.c),
>                 family = binomial(link = "cloglog"), data = cap)
>
>
>
> But still get the warning.
>
>
>
> Any ideas of what might be the problem?
>
>
>
> Many thanks!
>
>
>
>
>
> Joana Martelo
>
>
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lme ->Error in getGroups.data.frame(dataMix, groups)

2018-11-24 Thread Bert Gunter
In brief, your random effects formula syntax is wrong. You need to (re)
study ?lme or suitable tutorials for details of how to do what you want --
if you can with lme (e.g. crossed random effects are very difficult in lme,
much easier in lmer).
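For illustration only -- a hedged sketch of how the two nested random effects could be written with lme4's lmer() (variable names taken from the post; the model itself is untested against the poster's data):

```r
# Sketch, assuming d1 and its columns exist as described in the post.
# With Provenance and Site fixed, the nested random effects reduce to
# the Provenance:Genotype and Site:Block interaction groupings.
library(lme4)

fit <- lmer(FinH ~ SoilNkh + Site + dDDSP + dDDSP2 + Provenance +
              Site:Provenance +
              (1 | Provenance:Genotype) + (1 | Site:Block),
            data = d1)
```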

However, you should probably re-post on the r-sig-mixed-models list to
receive better and *more expert* help.

Cheers,
Bert


On Sat, Nov 24, 2018 at 2:31 PM Boy Possen  wrote:

> The basic idea is to create a linear model in R such that FinH is explained
> by SoilNkh, dDDSP, dDDSP2, Provenance, Site, Genotype and Block, where
> SoilNkh, dDDSP and dDDSP2 are continuous covariates, Provenance, Site,
> Genotype and Block are factors, Site and Provenance are fixed and Genotype
> and Block are random. Also, Genotype is nested within Provenance and Block
> within Site.
>
> Since the order the variables go in is important, it should be an ANOVA
> type I with the parameters in the following order:
>
> FinH~SoilNkh,Site,dDDSP,dDDSP2,Provenance,Site:Provenance,Provenance/Genotype,Site/Block
>
>
> For the fixed part I am okay with either:
>
> test31 <-lm(FinH~SoilNkh + Site + dDDSP + dDDSP2 + Provenance +
> Site:Provenance ,data=d1)
>
> test32 <- aov(FinH~SoilNkh + Site + dDDSP + dDDSP2 + Provenance +
> Site:Provenance, data = d1)
>
> When trying to specify the random-part, taking the above text as starting
> point, trouble starts :)
>
> I feel it should be of the form:
>
> test64 <- lme(FinH~SoilNkh + Site + dDDSP + dDDSP2 + Provenance +
> Site:Provenance,
> random = ~1|Provenance/Genotype + ~1|Site/Block,data=d1)
>
> but I can't avoid the error
>
> "Error in getGroups.data.frame(dataMix, groups) : invalid formula for
> groups"
>
> I am lost for clues, really, so any advice would be great! If any data
> should be supplied, I'd be happy to provide of course, but can't (yet)
> figure out how...
>
> Thanks in advance for your time.
>
>
> --
> B.J.H.M. Possen
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] EXAMPLE OF HOW TO USE R FOR EXPONENTIAL DISTRIBUTION & EXPONENTIAL REGRESSION

2018-11-27 Thread Bert Gunter
... but do note that a nonlinear fit to the raw data will give a (somewhat)
different result than a linear fit to the transformed data. In the former,
the errors are additive and in the latter they are multiplicative. Etc.
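A small simulated illustration of that difference (not the thread's data):

```r
# Exponential curve with additive noise
set.seed(42)
x <- seq(1, 10, length.out = 50)
y <- 2 * exp(0.3 * x) + rnorm(50, sd = 1)

# Nonlinear fit to the raw data: additive errors
fit_nls <- nls(y ~ a * exp(b * x), start = list(a = 1, b = 0.1))

# Linear fit to log-transformed data: multiplicative errors on the
# original scale (note this requires y > 0)
fit_lm <- lm(log(y) ~ x)

coef(fit_nls)              # a, b on the original scale
c(exp(coef(fit_lm)[1]),    # back-transformed intercept, comparable to a
  coef(fit_lm)[2])         # slope, comparable to b -- close, not identical
```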

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Nov 27, 2018 at 9:11 AM Sarah Goslee  wrote:

> Hi,
>
> Please also include R-help in your replies - I can't provide
> one-on-one tutorials.
>
> Without knowing where you got your sample code, it's hard to help. But
> what are you trying to do?
>
> It doesn't have to be that complicated:
>
> x <- 1:10
> y <- c(0.00, 0.00,0.0033,0.0009,0.0025,0.0653,0.1142,0.2872,0,1 )
> plot(x, y, pch=20)
>
> # basic straight line of fit
> fit <- glm(y~x)
>
> abline(fit, col="blue", lwd=2)
> exp.lm <- lm(y ~ exp(x))
> lines(1:10, predict(exp.lm, newdata=data.frame(x=1:10)))
>
>
> On Tue, Nov 27, 2018 at 9:34 AM Tolulope Adeagbo
>  wrote:
> >
> > Hello,
> >
> > So I found this example online but there seems to be an issue with the
> "Start" points. The result gives more or less a straight line.
> >
> > # get underlying plot
> > x <- 1:10
> > y <- c(0.00, 0.00,0.0033,0.0009,0.0025,0.0653,0.1142,0.2872,0,1 )
> > plot(x, y, pch=20)
> >
> > # basic straight line of fit
> > fit <- glm(y~x)
> > co <- coef(fit)
> > abline(fit, col="blue", lwd=2)
> >
> > # exponential
> > f <- function(x,a,b) {a * exp(b * x)}
> > fit <- nls(y ~ f(x,a,b), start = c(a=1 , b=c(0,1)))
> > co <- coef(fit)
> > curve(f(x, a=co[1], b=co[2]), add = TRUE, col="green", lwd=2)
> >
> >
> > # exponential
> > f <- function(x,a,b) {a * exp(b * x)}
> > fit <- nls(y ~ f(x,a,b), start = c(a=1, b=1))
> > co <- coef(fit)
> > curve(f(x, a=co[1], b=co[2]), add = TRUE, col="green", lwd=2)
> > # logarithmic
> > f <- function(x,a,b) {a * log(x) + b}
> > fit <- nls(y ~ f(x,a,b), start = c(a=1, b=1))
> > co <- coef(fit)
> > curve(f(x, a=co[1], b=co[2]), add = TRUE, col="orange", lwd=2)
> >
> > # polynomial
> > f <- function(x,a,b,d) {(a*x^2) + (b*x) + d}
> > fit <- nls(y ~ f(x,a,b,d), start = c(a=1, b=1, d=1))
> > co <- coef(fit)
> > curve(f(x, a=co[1], b=co[2], d=co[3]), add = TRUE, col="pink", lwd=2)
> >
> > On Tue, Nov 27, 2018 at 12:28 PM Sarah Goslee 
> wrote:
> >>
> >> Hi,
> >>
> >> Using rseek.org to search for exponential regression turns up lots of
> information, as does using Google.
> >>
> >> Which tutorials have you worked thru already, and what else are you
> looking for?
> >>
> >> Sarah
> >>
> >> On Tue, Nov 27, 2018 at 5:44 AM Tolulope Adeagbo <
> tolulopeadea...@gmail.com> wrote:
> >>>
> >>> Good day,
> >>> Please i nee useful materials to understand how to use R for
> exponential
> >>> regression.
> >>> Many thanks.
> >>
>
>
> --
> Sarah Goslee (she/her)
> http://www.numberwright.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Basic optimization question (I'm a rookie)

2018-11-27 Thread Bert Gunter
Of course, this particular example is trivially solvable by hand: x == y
== p/4, a square.
Note also that optimization problems with equality constraints are generally
solvable by the method of Lagrange multipliers for smooth functions and
constraints, so that numerical methods may not be needed for relatively
simple cases.
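For this particular problem the equality constraint can simply be substituted in, leaving a one-dimensional problem for optimize() -- a minimal sketch:

```r
# For a fixed perimeter P, the constraint 2*(x + y) == P gives
# y = P/2 - x, so the area becomes a function of x alone.
P <- 20
area_given_x <- function(x) x * (P / 2 - x)

opt <- optimize(area_given_x, interval = c(0, P / 2), maximum = TRUE)
opt$maximum    # x near 5 (== P/4, the square mentioned above)
opt$objective  # area near 25
```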

Cheers,
Bert





On Tue, Nov 27, 2018 at 3:19 PM FAIL PEDIA <
soloparapaginas123456...@gmail.com> wrote:

> Hello, and thanks to anyone who takes the time to read this
>
> I'm trying to learn to properly optimize a function with a constraint using
> R. For example, maximize the area of a terrain with a maximum perimeter.
> For this example the function would be:
>
>  Area <- function(x,y){x*y}
>
> The restriction would be the following function:
>
>  Perimeter <- function(x,y){2*(x+y)}
>
> The idea is to give a desired value to "Perimeter" and get the values of x
> & y that maximize the area and respect the constraint.
>
> I've searched online for some time, and only found a video of a dude that
> plotted the functions toggling the values to find the tangent optimum point
> (something useless, because the idea is to make the optimization more
> efficiently than using a paper and a pencil)
>
> Thanks again, and sorry if this question is silly.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating a P for trend

2018-11-29 Thread Bert Gunter
I would suggest that if at all possible, you find a local statistician
(your instructor??) with whom to consult. Much of what you are doing
appears likely to result in irreproducible nonsense.

This list is concerned with R programming, not statistics, although they
sometimes do intersect. So I think your post is off topic here (but others
may disagree).

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Nov 29, 2018 at 8:50 AM Lisa van der Burgh <40760...@student.eur.nl>
wrote:

> Hi all,
>
>
>
> I have a question about calculating a P for trend on my data. Let’s give
> an example that is similar to my own situation first: I have a continuous
> outcome, namely BMI. I want to investigate the effect of a specific
> medicine, let’s call it MedA on BMI. MedA is a variable that is
> categorical, coded as yes/no use of the medication. I also have the
> duration of use of the MedA, divided in three categories: use of MedA for
> 1-30 days, use of MedA for 31-60 days and use of MedA for 61-120 days
> (categories based on literature). I have performed a linear regression
> analyses and it seems like there is some kind of trend: the longer the use
> of MedA, the higher the BMI will be (the betas increase with time of use).
> So an exemplary table:
>
> Outcome: BMI
>
>   MedA use duration        Beta
>   Use for 1-30 days        0.060
>   Use for 31-60 days       0.074
>   Use for 61-120 days      0.081
>
> So, I have created three variables and I modelled them in Rstudio (on a
> multiple imputed dataset using MICE):
>
>
>
> mod1 <- with(imp, lm(BMI ~ MedA_1to30))
> pool_mod1 <- pool(mod1)
> summary(pool_mod1, conf.int = TRUE)
>
> mod2 <- with(imp, lm(BMI ~ MedA_31to60))
> pool_mod2 <- pool(mod2)
> summary(pool_mod2, conf.int = TRUE)
>
> mod3 <- with(imp, lm(BMI ~ MedA_61to120))
> pool_mod3 <- pool(mod3)
> summary(pool_mod3, conf.int = TRUE)
>
>
>
> Now that I have done this, I want to calculate a p for trend. I do know
> what a P for trend measures, but I do not know how to calculate this
> myself. I read something about the partial.cor.trend.test() function from
> the trend package, but I do not know what I should fill in. Because I can
> only fill in an x and y, but I have three time variables. So I do not know
> how to solve this. Can somebody help me?
>
>
>
> If more information is necessary, I am happy to give it to you!
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rmarkdown

2018-11-29 Thread Bert Gunter
1. What error?

2. This is the r-help list. RStudio is a separate product. Requests for
help in RStudio should be directed to their lists.


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Nov 29, 2018 at 12:16 PM Dennis Weygand  wrote:

> When I try to create an rmarkdown file in Rstudio, I get the error below.
> What am I doing wrong?
>
> Sent from Mail for Windows 10
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bootstrapping One- and Two-Sample Hypothesis Tests of Proportion

2018-11-29 Thread Bert Gunter
... but as Duncan pointed out already, I believe, a proportion **is** a
mean -- of 0/1 responses.


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Nov 29, 2018 at 3:30 PM Janh Anni  wrote:

> Hi Rui,
>
> Thanks a lot for responding and I apologize for my late response.  I tried
> using the *boot.two.per* function in the wBoot package which stated that it
> could bootstrap 2-sample tests for both means and proportions but it turned
> out that it only works for the mean.
>
> Thanks again,
> Janh
>
> On Wed, Nov 28, 2018 at 12:38 PM Rui Barradas 
> wrote:
>
> > Hello,
> >
> > What have you tried?
> > Reproducible example please.
> >
> > http://adv-r.had.co.nz/Reproducibility.html
> >
> >
> https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> > https://www.r-bloggers.com/minimal-reproducible-examples/
> >
> >
> > Rui Barradas
> >
> > Às 22:33 de 27/11/2018, Janh Anni escreveu:
> > > Hello R Experts!
> > >
> > > Does anyone know of a relatively straightforward way to bootstrap
> > > hypothesis tests for proportion in R?
> > >
> > > Thanks in advance!
> > >
> > > Janh
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] D-optimum design - optimization model in R

2018-12-04 Thread Bert Gunter
Please do not re-post. To increase your chance of getting a useful answer,
read and follow the posting guide below, which you have not yet done. For
example, what is "F.cube", what packages are you using? Also, this list is
about R programming, not statistics, which is more what your query seems to
be about. Finally, search yourself! -- e.g., "d-optimal designs" on rseek.org
brought up what appeared to be many relevant hits, including a CRAN task
view on experimental design.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Dec 4, 2018 at 3:29 AM Thanh Tran  wrote:

> Dear all,
>
>
>
> I'm trying to use the D-optimum design. In my data, the response is KIC,
> and 4 factors are AC, AV, T, and Temp. A typical second-degree response
> modeling is as follows:
>
>
>
> > data<-read.csv("2.csv", header =T)
>
> > mod <-
>
> lm(KIC~AC+I(AC^2)+AV+I(AV^2)+T+I(T^2)+Temp+I(Temp^2)+AC:AV+AC:T+AC:Temp+AV:T+AV:Temp+T:Temp,
>
> + data = data)
>
>
>
> The result of the model:
>
>
>
> KIC = 4.85 - 2.9 AC + 0.151 AV + 0.1094 T + 0.0091 Temp
>       + 0.324 AC^2 - 0.0156 AV^2 - 0.00106 T^2 - 0.0009 Temp^2
>       + 0.0071 AC:AV - 0.00087 AC:T - 0.00083 AC:Temp
>       - 0.0018 AV:T + 0.0015 AV:Temp - 0.000374 T:Temp
>
>
>
> Based on the above response modelling, I want to determine levels of the
> AC, AV, T, and Temp to have the Maximum value of KIC. The result running in
> Minitab as is shown in Figure 1. In R, I try to compute an D-optimum design
> with the following codes:
>
>
>
> > attach(data)
>
> > F.trig <- F.cube
>
> > F.trip <-
>
> F.cube(KIC~AC+I(AC^2)+AV+I(AV^2)+T+I(T^2)+Temp+I(Temp^2)+AC:AV+AC:T+AC:Temp+AV:T+AV:Temp+T:Temp,
>
> + c(4,4,30,5), # Smallest values of AC, AV, T, and Temp
>
> + c(5,7,50,25), # Highest values of AC,AV,T, and Temp
>
> + c(3,3,3,3)) # Numbers of levels of AC, AV, T, and Temp
>
> > res.trip.D <- od.AA(F.trip,1,alg = "doom", crit = "D",
>
> + graph =1:7, t.max = 4)
>
>
>
> I have the result as shown in Figure 2 but I cannot find out the optimum
> design as shown in Figure 1 using Minitab.
>
>
>
> Does anyone know what the reason for the error might be, or how I can
> solve it? I really appreciate your support and help.
>
>
>
> Best regards,
>
> Nhat Tran
>
>
>
> Ps: I also added a CSV file for practicing R.
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with K-Means output

2018-12-08 Thread Bert Gunter
Please see ?kmeans and note the "cluster" component of the returned value
that would appear to provide the info you seek.

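For example (fabricated data shaped roughly like the poster's rr0), split() on the cluster component yields one data frame per cluster:

```r
# Fabricated stand-in for the poster's rr0
set.seed(213)
rr0 <- data.frame(SavingsReversed = runif(100, 0, 200),
                  ProviderID      = sample(1e5:2e5, 100))
rr0a <- kmeans(rr0, centers = 3)

# One data frame per cluster, via the 'cluster' component of the result
per_cluster <- split(rr0, rr0a$cluster)

sapply(per_cluster, nrow)   # cluster sizes; these sum to 100
```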
-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Dec 8, 2018 at 7:03 AM Bill Poling  wrote:

> Good afternoon. I hope I have provided enough info to get my question
> answered.
>
> I am running windows 10 -- R3.5.1 -- RStudio Version 1.1.456
>
> When running a K-Means clustering routine is it possible to get the actual
> data from each cluster into a DF?
>
> I have reviewed a number of tutorials and unless I missed it somewhere I
> would like to know if it is possible.
>
> https://www.datacamp.com/community/tutorials/k-means-clustering-r
> https://www.guru99.com/r-k-means-clustering.html
> https://datascienceplus.com/k-means-clustering-in-r/
> https://datascienceplus.com/finding-optimal-number-of-clusters/
> http://enhancedatascience.com/2017/10/24/machine-learning-explained-kmeans/
> http://enhancedatascience.com/2017/04/30/r-basics-k-means-r/
>
> For example:
>
> I ran the below and get K-means clustering with 10 clusters of sizes 1511,
> 1610, 702, 926, 996, 1076, 580, 2429, 728, 3797
> Can the 1511 values of SavingsReversed and ProviderID , 1610 values of
> SavingsReversed and ProviderID, etc.. be run out into DF's?
>
> Thank you for your help.
>
> WHP
>
> str(rr0)
> Classes 'data.table' and 'data.frame':14355 obs. of  2 variables:
>  $ SavingsReversed: num  0 0 61 128 160 ...
>  $ ProviderID : num  113676 113676 116494 116641 116641 ...
>  - attr(*, ".internal.selfref")=
>
> head(rr0, n=35)
>     SavingsReversed ProviderID
>  1:            0.00     113676
>  2:            0.00     113676
>  3:           61.00     116494
>  4:          128.25     116641
>  5:          159.60     116641
>  6:          372.66     119316
>  7:           18.79     121319
>  8:           15.64     121319
>  9:            0.00     121319
> 10:           18.79     121319
> 11:           23.00     121319
> 12:           18.79     121319
> 13:            0.00     121319
> 14:           25.86     121319
> 15:           14.00     121319
> 16:          113.00     121545
> 17:           50.00     121545
> 18:         1155.32     121545
> 19:          113.00     121545
> 20:          197.20     121545
> 21:            0.00     121780
> 22:           36.00     122536
> 23:         1171.32     125198
> 24:         1171.32     125198
> 25:           43.00     125303
> 26:            0.00     125881
> 27:           69.64     128435
> 28:          420.18     128435
> 29:          175.18     128435
> 30:           71.54     128435
> 31:           99.85     128435
> 32:            0.00     128435
> 33:           42.75     128435
> 34:          175.18     128435
> 35:          846.45     128435
>
> set.seed(213)
> rr0a <- kmeans(rr0, 10)
> View(rr0a)
> summary(rr0a)
> # Length Class  Mode
> # cluster  14355  -none- numeric
> # centers 20  -none- numeric
> # totss1  -none- numeric
> # withinss10  -none- numeric
> # tot.withinss 1  -none- numeric
> # betweenss1  -none- numeric
> # size10  -none- numeric
> # iter 1  -none- numeric
> # ifault   1  -none- numeric
>
> x1 <- as.data.frame(rr0a$centers)
> sort(x1)
> #SavingsReversed ProviderID
> # 2 75.19665  2773789.2
> # 3 99.31959  4147091.6
> # 5101.21070  3558532.7
> # 4103.41147  3893274.4
> # 1105.38310  2241031.2
> # 8114.61562  3240701.5
> # 10   121.14184  4718727.6
> # 9153.70536  4470878.9
> # 6156.84426  5560636.6
> # 7185.09745   173732.9
> print(rr0a)
> # K-means clustering with 10 clusters of sizes 1511, 1610, 702, 926, 996,
> 1076, 580, 2429, 728, 3797
> #
> # Cluster means:
> #   SavingsReversed ProviderID
> # 1105.38310  2241031.2
> # 2 75.19665  2773789.2
> # 3 99.31959  4147091.6
> # 4103.41147  3893274.4
> # 5101.21070  3558532.7
> # 6156.84426  5560636.6
> # 7185.09745   173732.9
> # 8114.61562  3240701.5
> # 9153.70536  4470878.9
> # 10   121.14184  4718727.6
> #Within cluster sum of squares by cluster:
> # [1] 74529288379846 25846368411171  4692898666512  6277704963344
> 8428785199973 90824041558798  1468798013919 12143462193009  5483877005233
> # [10] 51547955737867
> # (between_SS / total_SS =  98.7 %)
> #
> # Available components:
> #
> # [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
> # [6] "betweenss"    "size"         "iter"         "ifault"

Re: [R] Help with K-Means output

2018-12-08 Thread Bert Gunter
See David Carlson's reply -- and his advice for learning about how to use
lists.

"And I can just join this DF with my original DF used for the KMean,
correct?"

Define "join" . See, e.g.
http://desktop.arcgis.com/en/arcmap/10.3/manage-data/tables/essentials-of-joining-tables.htm
See also ?merge

I consider it to be your job to learn how to work with R's data structures.
There are numerous web tutorials to help you do so. Others may disagree and
reply to such queries.
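
For what it's worth, since kmeans() keeps rows in their input order, the cluster vector can simply be bound onto the clustered data -- no merge key is needed. A minimal sketch with invented data (the rr0-like column names are assumptions, not the poster's actual data):

```r
## Hypothetical stand-in for the poster's rr0: two numeric columns.
set.seed(213)
rr0 <- data.frame(SavingsReversed = runif(100, 0, 1200),
                  ProviderID = sample(113676:128435, 100, replace = TRUE))
km <- kmeans(rr0, centers = 3)

## kmeans() preserves row order, so a plain column assignment lines up.
rr0$cluster <- km$cluster

## One data frame per cluster, as the original question asked:
byCluster <- split(rr0, rr0$cluster)
sapply(byCluster, nrow)  # cluster sizes, matching km$size
```

split() here returns a named list of data frames, one per cluster label, which answers the "data from each cluster into a DF" question directly.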

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Dec 8, 2018 at 8:43 AM Bill Poling  wrote:

> Thank you Bert, I see, so I think this is the process?
>
> set.seed(213)
> rr0a1 <- kmeans(rr0, 10)
>
> summary(rr0a1) #Just the cluster
> #Length Class  Mode
> #cluster  14355  -none- numeric
>
> head(rr0a1$cluster, n=35)
> # [1] 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
>
> Xcluster <- as.data.frame(rr0a1$cluster)
>
> head(Xcluster, n=5)
> #rr0a1$cluster
> # 1 7
> # 2 7
> # 3 7
> # 4 7
> # 5 7
>
> tail(Xcluster, n=5)
> #rr0a1$cluster
> # 14351 6
> # 14352 6
> # 14353 6
> # 14354 6
> # 14355 6
>
> And I can just join this DF with my original DF used for the KMean,
> correct?
> The vertical order is the same?
>
> WHP
>
>
> From: Bert Gunter 
> Sent: Saturday, December 8, 2018 10:46 AM
> To: Bill Poling 
> Cc: R-help 
> Subject: Re: [R] Help with K-Means output
>
> Please see ?kmeans and note the "cluster" component of the returned value
> that would appear to provide the info you seek.
>
> -- Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Dec 8, 2018 at 7:03 AM Bill Poling <mailto:bill.pol...@zelis.com>
> wrote:
> Good afternoon. I hope I have provided enough info to get my question
> answered.
>
> I am running windows 10 -- R3.5.1 -- RStudio Version 1.1.456
>
> When running a K-Means clustering routine is it possible to get the actual
> data from each cluster into a DF?
>
> I have reviewed a number of tutorials and unless I missed it somewhere I
> would like to know if it is possible.
>
> https://www.datacamp.com/community/tutorials/k-means-clustering-r
> https://www.guru99.com/r-k-means-clustering.html
> https://datascienceplus.com/k-means-clustering-in-r/
> https://datascienceplus.com/finding-optimal-number-of-clusters/
> http://enhancedatascience.com/2017/10/24/machine-learning-explained-kmeans/
> http://enhancedatascience.com/2017/04/30/r-basics-k-means-r/
>
> For example:
>
> I ran the below and get K-means clustering with 10 clusters of sizes 1511,
> 1610, 702, 926, 996, 1076, 580, 2429, 728, 3797
> Can the 1511 values of SavingsReversed and ProviderID , 1610 values of
> SavingsReversed and ProviderID, etc.. be run out into DF's?
>
> Thank you for your help.
>
> WHP
>
> str(rr0)
> Classes 'data.table' and 'data.frame':  14355 obs. of  2 variables:
>  $ SavingsReversed: num  0 0 61 128 160 ...
>  $ ProviderID     : num  113676 113676 116494 116641 116641 ...
>  - attr(*, ".internal.selfref")=<externalptr>
>
> head(rr0, n=35)
>     SavingsReversed ProviderID
>  1:            0.00     113676
>  2:            0.00     113676
>  3:           61.00     116494
>  4:          128.25     116641
>  5:          159.60     116641
>  6:          372.66     119316
>  7:           18.79     121319
>  8:           15.64     121319
>  9:            0.00     121319
> 10:           18.79     121319
> 11:           23.00     121319
> 12:           18.79     121319
> 13:            0.00     121319
> 14:           25.86     121319
> 15:           14.00     121319
> 16:          113.00     121545
> 17:           50.00     121545
> 18:         1155.32     121545
> 19:          113.00     121545
> 20:          197.20     121545
> 21:            0.00     121780
> 22:           36.00     122536
> 23:         1171.32     125198
> 24:         1171.32     125198
> 25:           43.00     125303
> 26:            0.00     125881
> 27:           69.64     128435
> 28:          420.18     128435
> 29:          175.18     128435
> 30:           71.54     128435
> 31:           99.85     128435
> 32:            0.00     128435

Re: [R] Generate Range of Correlations Matrix Bernoulli

2018-12-08 Thread Bert Gunter
I have no idea why your post was "rejected," nor even quite what you mean
by that. But I believe your post may not receive any replies because you
have failed to follow the posting guide linked below. I find it
incomprehensible, but maybe others will be able to understand what you
want. It may also be off topic -- this list is about R programming, not
statistics (though they do sometimes intersect) -- but that may just be
because I didn't understand your query.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


‪On Sat, Dec 8, 2018 at 8:55 AM ‫إيمان إسماعيل محمد‬‎ <
emanismail...@gmail.com> wrote:‬

> Hi all,
> I was wondering how I can construct a range of correlation matrices that
> cover the whole space
> for a dependent multivariate Bernoulli distribution (known marginal
> probabilities but unknown correlations).
> I have P's for every variable but unknown correlations between each pair.
> I want to try the range of applicable correlations.
>
> Why was my post rejected?
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Generate Range of Correlations Matrix Bernoulli

2018-12-08 Thread Bert Gunter
Unless you are sending a private message, always include the list (which I
have cc'ed) in your reply, because, as here, individuals do not do private
consulting and may be unwilling or unable to provide the help you seek.

Bert Gunter



‪On Sat, Dec 8, 2018 at 10:33 AM ‫إيمان إسماعيل محمد‬‎ <
emanismail...@gmail.com> wrote:‬

> I want to simulate data from a multivariate Bernoulli distribution.
> I have a probability for each variable but unknown correlation for each pair.
> I want to try all possible correlation values so that I cover the whole space.
> If you could have a look at the attached paper --
> I am looking for an R implementation of that paper.
> I would appreciate any references or links.
>
> Thanks in advance
>
> On Saturday, 8 December 2018 at 7:36 PM, Bert Gunter
> <bgunter.4...@gmail.com> wrote:
>
>> I have no idea why your post was "rejected," nor even quite what you mean
>> by that. But I believe your post may not receive any replies because you
>> have failed to follow the posting guide linked below. I find it
>> incomprehensible, but maybe others will be able to understand what you
>> want. It may also be off topic -- this list is about R programming, not
>> statistics (though they do sometimes intersect) -- but that may just be
>> because I didn't understand your query.
>>
>> Cheers,
>> Bert
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> ‪On Sat, Dec 8, 2018 at 8:55 AM ‫إيمان إسماعيل محمد‬‎ <
>> emanismail...@gmail.com> wrote:‬
>>
>>> Hi all,
>>> I was wondering how can I construct range of correlations matrix that
>>> cover
>>> all space
>>> from dependent Multivariate Bernoulli (known Marginal Probabilities but
>>> unknown correlations)
>>> I have P's for every variable but unknown correlation between each pair
>>> I want to try range of applicable correlations
>>>
>>> Why my post rejected?
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>



Re: [R] how to keep colnames of matrix when put it into a data frame

2018-12-09 Thread Bert Gunter
Your names are not syntactically valid.

Consider:

> mat <- matrix(1:9, nrow = 3)
> colnames(mat) <- letters[1:3]
> mat
     a b c
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
> data.frame(x=1:3,mat)
  x a b c
1 1 1 4 7
2 2 2 5 8
3 3 3 6 9

See ?make.names, and the "Value" section of ?data.frame for how names are
constructed.

Michael's suggestion produces a matrix, not  a data frame. dimnames of
matrices apparently have different rules for validity.
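
In case it helps, data.frame() can also be told to leave such names alone via its check.names argument (documented in ?data.frame), which keeps the poster's parenthesized names intact:

```r
mat <- matrix(1:9, nrow = 3)
colnames(mat) <- paste0("(", 1:3, ")")

## check.names = FALSE skips the make.names() conversion,
## so "(1)", "(2)", "(3)" survive as-is.
d <- data.frame(x = 1:3, mat, check.names = FALSE)
names(d)   # "x" "(1)" "(2)" "(3)"
```

The trade-off is that such columns must then be accessed with backticks or `[[`, e.g. `d[["(1)"]]`, since `(1)` is not a syntactically valid name.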

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Dec 9, 2018 at 7:05 AM Jinsong Zhao  wrote:

> Hi there,
>
> In the following mini-example, I hope to keep the column names of mat, but
> failed.
>
> # mini-example
> > mat <- matrix(1:9, nrow = 3)
> > colnames(mat) <- paste("(", 1:3, ")", sep = "")
> > mat
>      (1) (2) (3)
> [1,]   1   4   7
> [2,]   2   5   8
> [3,]   3   6   9
> > data.frame(x = 1:3, mat)
>   x X.1. X.2. X.3.
> 1 1    1    4    7
> 2 2    2    5    8
> 3 3    3    6    9
>
> Any hints will be really appreciated.
>
> Best,
> Jinsong
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Plotting Very Large lat-lon data in x-y axes graph

2018-12-09 Thread Bert Gunter
Yes, there are many ways to do this. Search rseek.org for "2d density
plots". Also check the CRAN "Spatial" task view. Also see the kde2d
function in the MASS package and especially the examples there that use the
image() function to plot densities.
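
As a concrete sketch of the MASS::kde2d + image() route (the coordinates below are fake clusters standing in for the lightning data):

```r
library(MASS)   # ships with R

## Fake lon/lat clusters standing in for the flash locations.
set.seed(1)
lon <- c(rnorm(800, 104.8, 0.4), rnorm(300, -85, 2))
lat <- c(rnorm(800, 30.2, 0.4), rnorm(300, 29.5, 1.5))

## 2-D kernel density estimate on a 100 x 100 grid, shown as a heat map;
## warmer colours mark regions of higher flash density.
dens <- kde2d(lon, lat, n = 100)
image(dens, col = rev(heat.colors(24)),
      xlab = "Longitude", ylab = "Latitude")
contour(dens, add = TRUE, drawlabels = FALSE)
```

For truly large point sets, hexbin (on CRAN) or a plain 2-D histogram are cheaper alternatives to a full kernel density estimate.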

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Dec 9, 2018 at 7:50 AM Ogbos Okike  wrote:

> Dear Contributors,
>
> I have a data of the form:
> Lat  Lon
> 30.1426 104.7854
> 30.5622 105.0837
> 30.0966 104.6213
> 29.9795 104.8430
> 39.2802 147.7295
> 30.2469 104.6543
> 26.4428 157.7293
> 29.4782 104.5590
> 32.3839 105.3293
> 26.4746 157.8411
> 25.1014 159.6959
> 25.1242 159.6558
> 30.1607 104.9100
> 31.4900 -71.8919
> 43.3655 -74.9994
> 30.0811 104.8462
> 29.0912 -85.5138
> 26.6204 -80.9342
> 31.5462 -71.9638
> 26.8619 97.3844
> 30.2534 104.6134
> 29.9311 -85.3434
> 26.1524 159.6806
> 26.5112 158.0233
> 26.5441 158.0565
> 27.8901 -105.8554
> 30.3175 104.7135
> 26.4822 157.6127
> 30.1887 104.5986
> 29.5058 104.5661
> 26.4010 157.5749
> 30.2281 104.7585
> 31.4556 110.5619
> 30.1700 104.5861
> 26.3911 157.4776
> 30.6493 104.9949
> 30.2209 104.6629
> 26.0488 97.3608
> 30.2142 104.8023
> 30.1806 104.8158
> 25.2107 160.1690
> 30.6708 104.9385
> 30.4152 104.7002
> 30.2446 104.7804
> 29.5760 -85.1535
> 26.4484 92.4312
> 26.3914 157.4189
> 26.3986 157.4421
> 30.4903 -88.2271
> 30.6727 104.8768
> 30.2518 104.6466
> 41.6979 -78.4136
> 33.7575 72.1089
> 26.8333 -80.9485
> 25.3103 124.0978
> 30.1742 104.7554
> 30.6345 104.9739
> 30.2075 104.7960
> 30.2226 104.7517
> 30.5948 105.0532.
> The record is for lightning flashes in the continental U.S. and
> surrounding waters within the latitudinal band between
> 258 and 458N.
>
> I want to display the result in x-y co-ordinate plot. However, the
> data is very large such that when plotted, everything just appeared
> blurred.
>
>
> Is there a way of using color codes to  indicate the regions of higher
> or lower flash densities?
>
> I can attach the plot I generated but I am not sure if the moderator
> will allow it to go with this.
>
> I will send it in a separate email if required.
>
> Thank you so much for sparing your time.
>
> Best
> Ogbos
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Spark DataFrame: replace NULL cell by NA

2018-12-09 Thread Bert Gunter
"...   if("factor" %in% class(x)) x <- as.character(x) ## since ifelse wont
work with factors  "
Nonsense!

> x <- factor(c("a","", "b"))
> x
[1] a   b
Levels:  a b

> levels(x)
[1] ""  "a" "b"

> x <- factor(ifelse(x==(""),NA,x))
> x
[1] 2    <NA> 3
Levels: 2 3
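
One factor-safe alternative, sketched here, is to edit the levels rather than the values, which avoids the integer-code coercion shown above:

```r
x <- factor(c("a", "", "b"))

## Recode the empty-string level to NA: entries at that level become NA
## and the level itself is dropped (see ?levels).
levels(x)[levels(x) == ""] <- NA
levels(x)  # "a" "b" -- the empty level is gone, x[2] is now NA
```

This keeps the original labels "a" and "b" intact instead of replacing them with their level codes.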


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Dec 9, 2018 at 1:07 PM Karim Mezhoud  wrote:

> Dear All,
> ## function to relpace empty cell by NA
> empty_as_na <- function(x){
>   if("factor" %in% class(x)) x <- as.character(x) ## since ifelse wont work
> with factors
>   ifelse(as.character(x)!="", x, NA)
> }
>
> ## connect to spark local
> sc <- spark_connect(master = "local")
> # load an example of dataframe taht has empty cells (needs cgdsr package)
> clinicalData <- cgdsr::getClinicalData(cgds, "gbm_tcga_pub_all")
> ## copy to spark
> clinicalData_tbl <- dplyr::copy_to(sc, clinicalData, overwrite = TRUE)
>
>  # This works
> clinicalData %>% mutate_all(funs(empty_as_na))
> # This Does not works
> clinicalData_tbl %>% mutate_all(funs(empty_as_na))
> Thanks,
> Karim
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] ks.test ; impossible to calculate exact exact value with ex-aequos

2018-12-10 Thread Bert Gunter
"Other than correlation, how to check ressemblence between these two curve"

(As Ted Indicated) Graph them... and look!

There is nothing magical about statistics, which seems to be what you seek.
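
To make the "graph them and look" advice concrete, here is a quick sketch using the poster's own numbers from the thread:

```r
a <- c(3.02040816326531, 7.95918367346939, 10.6162790697674, 4.64150943396226,
       1.86538461538462, 1.125, 1.01020408163265, 1.2093023255814,
       0.292452830188679, 0, 0, 0)
b <- c(2.30769230769231, 4.19252873563218, 5.81924882629108, 6.2248243559719,
       5.02682926829268, 4.50728862973761, 3.61741424802111, 5.05479452054795,
       3.68095238095238, 1.875, 5.25, 0)

## Overlay the two curves; differences in amplitude and shape are
## immediately visible, which no single test statistic conveys.
matplot(cbind(a, b), type = "b", pch = 1:2, lty = 1:2, col = 1:2,
        xlab = "Index (1..12)", ylab = "Value")
legend("topright", c("a (simulated)", "b (empirical)"),
       pch = 1:2, lty = 1:2, col = 1:2)
```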


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Dec 10, 2018 at 3:36 PM Fatma Ell  wrote:

> Thanks a lot for this reply
>
> 'a' is simulated data while 'b' is empirical data.
> Other than correlation, how can I check the resemblance between these two
> curves in terms of:
> Amplitude in each row 1...12
> Evolution and variability from 1 to 12
>
> Thanks !
>
>
> On Monday, 10 December 2018, Ted Harding  wrote:
>
> > On Mon, 2018-12-10 at 22:17 +0100, Fatma Ell wrote:
> > > Dear all,
> > > I'm trying to use ks.test in order to compare two curves. I've 0 values;
> > > I think this is why I have the following warning: impossible to
> > > calculate exact exact value with ex-aequos
> > >
> > > a=c(3.02040816326531, 7.95918367346939, 10.6162790697674,
> > 4.64150943396226,
> > > 1.86538461538462, 1.125, 1.01020408163265, 1.2093023255814,
> > > 0.292452830188679,
> > > 0, 0, 0)
> > > b=c(2.30769230769231, 4.19252873563218, 5.81924882629108,
> > 6.2248243559719,
> > > 5.02682926829268, 4.50728862973761, 3.61741424802111, 5.05479452054795,
> > > 3.68095238095238, 1.875, 5.25, 0)
> > >
> > > ks.test(a,b)
> > >
> > > data:  a and b
> > > D = 0.58333, p-value = 0.0337
> > > alternative hypothesis: two-sided
> > >
> > > Warning message:
> > > In ks.test(a, b) :
> > > impossible to calculate exact exact value with ex-aequos
> > >
> > > Does the p-value is correct ? Otherwise, how to solve this issue ?
> > > Thanks a lot.
> >
> > The warning arises, not because you have "0" values as such,
> > but because there are repeated values (which happen to be 0).
> >
> > The K-S test is designed for continuous random variables, for
> > which the probability of repeated values is (theoretically) zero:
> > theoretically, they can't happen.
> >
> > From the help page ?ks.test :
> >
> > "The presence of ties always generates a warning, since continuous
> > distributions do not generate them. If the ties arose from
> > rounding the tests may be approximately valid, but even modest
> > amounts of rounding can have a significant effect on the
> > calculated statistic."
> >
> >
> >
> > In view of the fact that your sample 'a' has three zeros along with
> > nine other values which are all different from 0 (and all *very*
> > different from 0 except for 0.292452830188679), along with the fact
> > that your sample 'b' has 11 values all *very* different from 0
> > and one final value equal to 0; together also with the fact that
> > in each sample the '0' values occur at the end, this strongly suggests
> > that the data source is not such that a K-S test is suitable.
> >
> > The K-S test is a non-parametric test of whether
> >   a) a given sample comes from a given kind of distribution;
> > or
> >   b) two samples are drawn from the same distribution.
> > In either case, it is assumed that the sample values are drawn
> > independently of each other; if there is some reason why they
> > may not be independent of each other, the test is not valid.
> >
> > You say "I'm trying to use ks.test in order to compare two curves".
> > When I execute
> >   plot(a)
> >   plot(b)
> > on your data, I see (approximately) in each case a rise from a
> > medium value (~2 or ~3) to a higher value (~6 or ~10) followed
> > by a decline down to an exact 0.
> >
> > This is not the sort of situation that the K-S test is for!
> > Hoping this helps,
> > Ted.
> >
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] R plot split screen in uneven panels

2018-12-12 Thread Bert Gunter
?layout
Please read the Help file **carefully** and work through the **examples**.
I cannot explain better than they.
Here is code using layout() that I think does what you want:

m <- matrix(1:2, nrow =1)
layout(m, widths = c(1,2))
plot(1:10, type = "p",main = "The First Plot")
plot(10:1, type = "l", main ="The Second Plot")

Note that both the lattice package and ggplot2 can also do this sort of
thing much more flexibly (and therefore requiring more effort to learn).
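
For completeness: the poster's split.screen() attempt fails because 'figs' must be given in normalised device coordinates -- each row is c(left, right, bottom, top) on a 0..1 scale (see ?split.screen) -- not in centimetres. A sketch of the 1/3 : 2/3 layout done that way:

```r
## Each row: c(left, right, bottom, top) as fractions of the device.
m <- rbind(c(0,   1/3, 0, 1),   # screen 1: first third of the width
           c(1/3, 1,   0, 1))   # screen 2: remaining two thirds
split.screen(figs = m)
screen(1); plot(1:10, type = "p", main = "The First Plot")
screen(2); plot(10:1, type = "l", main = "The Second Plot")
close.screen(all.screens = TRUE)
```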

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Dec 12, 2018 at 7:19 AM Luigi Marongiu 
wrote:

> Dear all,
> I would like to draw two plots in the same device so that there is a
> single row and two columns, with the first column being 1/3 of the
> device's width.
> I am creating a PNG object with width = 30 and height = 20 cm.
> I know that I should use split.screen or layout but I am lost with the
> matrix to pass to the functions.
> For instance, I tried:
> # distance in arbitrary units (so let's say cm) from of corners
> # left, right, bottom, and top counting from bottom left corner
> # that is first panel has the bottom right corner 20 cm from the bottom
> left?
> > m = matrix(c(0,20,40,0, 20,60,40,0), byrow=T, ncol=4)
> > m
>      [,1] [,2] [,3] [,4]
> [1,]    0   20   40    0
> [2,]   20   60   40    0
> > split.screen(m)
> Error in par(split.screens[[cur.screen]]) :
>   invalid value specified for graphical parameter "fig"
> > m[1,]
> [1]  0 20 40  0
> > split.screen(m[1,])
> Error in split.screen(m[1, ]) : 'figs' must specify at least one screen
>
> What should be the syntax for this task?
>
> --
> Best regards,
> Luigi
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] R plot split screen in uneven panels

2018-12-12 Thread Bert Gunter
Incidentally, here is another way to do what (I think) you asked using
layout():

m <- matrix(c(1,2,2), nrow =1)
layout(m)
plot(1:10, type = "p",main = "The First Plot")
plot(10:1, type = "l", main ="The Second Plot")

On my device, the plots use different size fonts, point sizes, etc. and so
aesthetically differ. I do not know why and am too lazy to delve into the
code.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Dec 12, 2018 at 8:39 AM Bert Gunter  wrote:

> ?layout
> Please read the Help file **carefully** and work through the **examples**.
> I cannot explain better than they.
> Here is code using layout() that I think does what you want:
>
> m <- matrix(1:2, nrow =1)
> layout(m, widths = c(1,2))
> plot(1:10, type = "p",main = "The First Plot")
> plot(10:1, type = "l", main ="The Second Plot")
>
> Note that both the lattice package and ggplot2 can also do this sort of
> thing much more flexibly (and therefore requiring more effort to learn).
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Wed, Dec 12, 2018 at 7:19 AM Luigi Marongiu 
> wrote:
>
>> Dear all,
>> I would like to draw two plots in the same device so that there is a
>> single row and two columns, with the first column being 1/3 of the
>> device's width.
>> I am creating a PNG object with width = 30 and height = 20 cm.
>> I know that I should use split.screen or layout but I am lost with the
>> matrix to pass to the functions.
>> For instance, I tried:
>> # distance in arbitrary units (so let's say cm) from of corners
>> # left, right, bottom, and top counting from bottom left corner
>> # that is first panel has the bottom right corner 20 cm from the bottom
>> left?
>> > m = matrix(c(0,20,40,0, 20,60,40,0), byrow=T, ncol=4)
>> > m
>>      [,1] [,2] [,3] [,4]
>> [1,]    0   20   40    0
>> [2,]   20   60   40    0
>> > split.screen(m)
>> Error in par(split.screens[[cur.screen]]) :
>>   invalid value specified for graphical parameter "fig"
>> > m[1,]
>> [1]  0 20 40  0
>> > split.screen(m[1,])
>> Error in split.screen(m[1, ]) : 'figs' must specify at least one screen
>>
>> What should be the syntax for this task?
>>
>> --
>> Best regards,
>> Luigi
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need help with R studio

2018-12-14 Thread Bert Gunter
... and R is an interactive, command-driven language. This means you
have to learn how to use it like a programming language. Please spend
some time with R tutorials to do this -- there are many on the web.

... or see here for a GUI interface to some of R's basic, but most widely
used, functionality:

https://www.rdocumentation.org/packages/Rcmdr/versions/2.5-1


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Dec 14, 2018 at 8:58 AM Jeff Newmiller 
wrote:

> Looks fine to me, though you seem to be confused between what R is (on
> topic in this mailing list) and what RStudio is (a fine software package
> that relies on R but is not really on topic here).
>
> Do find the link in the footer below about posting in R-help and read it
> before posting again.
>
> On December 14, 2018 8:27:07 AM PST, Madhavi Bhat 
> wrote:
> >I am using an HP Spectre with Windows and I have downloaded the latest
> >version of R studio 3.5.1 for windows but it's not working. I need help
> >to fix this issue. I am attaching a screen shot of my R studio. Please
> >help me in this regard.
> >Thank you
> >Madhavi Bhat
>
> --
> Sent from my phone. Please excuse my brevity.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Dealing with special characters at end of line in file

2018-12-15 Thread Bert Gunter
... or used the fixed = TRUE argument.

> z <-"In  Alvarez Cabral street by no. 105.\\000"

> sub("\\000","", z, fixed = TRUE)
[1] "In  Alvarez Cabral street by no. 105."
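
The regex route works too, once the backslash is doubled for both the string parser and the regex engine (four backslashes in the source match one literal backslash):

```r
z <- "In  Alvarez Cabral street by no. 105.\\000"

## "\\\\000$" -> regex \\000$ -> a literal backslash, then "000", at
## end of string; sub() deletes that trailing run.
sub("\\\\000$", "", z)
```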


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Dec 15, 2018 at 7:32 AM J C Nash  wrote:

> I am trying to fix up some image files (jpg) that have comments in them.
> Unfortunately, many have had extra special characters encoded.
>
> rdjpgcom, called from an R script, returns a comment e.g.,
>
> "In  Alvarez Cabral street by no. 105.\\000"
>
> I want to get rid of "\\000", but sub seems
> to be giving trouble.
>
> > sub("\\000", "", ctxt)
> [1] "In  Alvarez Cabral street by no. 105.\\0"
>
> Anyone know how to resolve this?
>
> JN
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Combine lists into a data frame or append them to a text file

2018-12-15 Thread Bert Gunter
FWIW, I had no trouble writing a test case to a file with either version of
your code. As we have no idea what your data look like, I don't know how
anyone can diagnose the problem. But maybe I'm wrong and someone else will
recognize the issue.
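
One guess, offered tentatively: in the for-loop version, MyTables[i] is a length-one *list*, which write.table() tries to coerce with as.data.frame() -- and if that element itself holds pieces of unequal length, you get exactly an "arguments imply differing number of rows" error. Using [[ extracts the element itself. A sketch with invented data:

```r
## Invented stand-in: a list of data frames of different shapes.
MyTables <- list(data.frame(a = 1:3, b = 4:6),
                 data.frame(x = letters[1:2]))

tmp <- tempfile(fileext = ".txt")
for (i in seq_along(MyTables)) {
  ## [[i]] (not [i]) so write.table() sees the data frame directly
  write.table(MyTables[[i]], file = tmp, append = TRUE, quote = TRUE)
}
readLines(tmp)  # headers and rows from both tables, appended in order
```

Note that appending several tables of different widths produces a file that read.table() cannot re-read as one table; it is a log-style dump, not a combined data frame.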

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Dec 15, 2018 at 7:28 PM Ek Esawi  wrote:

> Hi All,
>
> I have an R object that is made up of N lists, each with a different
> number of columns and rows.  I want to combine the N
> lists into a single data frame or write (append) them to a text file.
> I hope the question is clear and doesn't require an example. I am
> hoping to accomplish this using base R functions.
> Below is what I tried but both gave me the same error which I do
> understand, I think, but I don’t know how to fix it. My R object is
> MyTables
>
> lapply(MyTables, function(x) write.table(x, file = "Temp.txt",append =
> TRUE ))
> OR
> for (i in 1:length(MyTables)) {
> write.table(MyTables[i], file = "Temp.txt", append = TRUE, quote = TRUE)
> }
>
> the error
> Error in (function (..., row.names = NULL, check.rows = FALSE,
> check.names = TRUE,  :
>   arguments imply differing number of rows: 51, 8, 30
>
> Thanks--EK
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
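
To make the hint concrete, here is a self-contained sketch (toy matrices, not the poster's tabulizer output) of appending unequal-sized tables to one file. Note the `[[i]]` extraction, which hands write.table() the matrix itself rather than a one-element list; the posted error ("differing number of rows: 51, 8, 30") suggests the elements may themselves be lists of matrices, in which case they would need flattening first (see ?unlist).

```r
# Toy sketch: three tables of different sizes appended to a single text file.
# [[i]] extracts the matrix itself; col.names = FALSE avoids the warning
# about appending column names.
MyTables <- list(matrix(1:10, 5), matrix(1:4, 2), matrix(1:9, 3))
out <- tempfile(fileext = ".txt")
for (i in seq_along(MyTables)) {
  write.table(MyTables[[i]], file = out, append = TRUE,
              quote = TRUE, col.names = FALSE)
}
length(readLines(out))  # 5 + 2 + 3 = 10 lines
```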



Re: [R] Functional data analysis for unequal length and unequal width time series

2018-12-17 Thread Bert Gunter
Specialized: Probably need to email the maintainer. See ?maintainer

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Dec 17, 2018 at 9:27 AM  wrote:

> Dear All,
> I apologize if you have already seen in Stack Overflow. I
> have not got any response from there so I am posting for help here.
>
> I have data on 1318 time series. Many of these series are of unequal
> length. Apart from this also quite a few time points for each of the
> series are observed at different time points. For example consider the
> following four series
>
> t1 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67)
> V1 <- c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792, -0.0656,-
> 0.0273, -0.0589)
> ser1 <- cbind(t1, V1)
>
> t2 <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38)
> V2 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)
> ser2 <- cbind(t2, V2)
>
> t3 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.65,
> 25.88, 25.97, 25.99)
> V3 <- c(0.0897, -0.0533, -0.3497, -0.5684, -0.4294, -0.1109, 0.0352,
> 0.0550, -0.0536, 0.0185, -0.0295, -0.0324)
> ser3 <- cbind(t3, V3)
>
> t4 <- c(24.5, 24.67, 24.71, 24.98, 25.17)
> V4 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231)
> ser4 <- cbind(t4, V4)
>
> Here t1, t2, t3, t4 are the time points and V1, V2, V3, V4 are the
> observations made at over those time points. The time points in the
> actual data are Julian dates so they look like these, just that they
> are much larger decimal figures like 2452450.6225.
>
> I am trying to cluster these time series using a functional data approach,
> for which I am using the "funFEM" package in R. The available examples are
> for equispaced, equal-length time series, so I am not sure how to use
> the package for my data. Initially I tried by making all the time
> series equal in length to the time series having the highest number of
> observations (here equal to ser3) by adding NA's to the time series. So
> following this example I made ser2 as
>
> t2_n <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38, 25.50, 25.55, 25.65,
> 25.88, 25.97, 25.99)
> V2_na <- c(V2, rep(NA, 6))
> ser2_na <- cbind(t2_n, V2_na)
>
> Note that to make t2 equal to length of t3 I grabbed the last 6 time
> points from t3. To make V2 equal in length to V3 I added NA's.
>
> Then I created my data matrix as
>
> dat <- rbind(V1_na, V2_na, V3, V4_na)
>
> The code I used was
>
> require(funFEM)
> basis<- create.fourier.basis(c(min(t3), max(t3)), nbasis = 25)
> fdobj <- smooth.basis(c(min(t3), max(t3)) ,dat, basis)$fd
>
> Note that the range is constructed using the maximum and minimum time
> points of the ser3 series.
>
> res <- funFEM(fdobj, K = 2:9, model = "all", crit = "bic", init =
> "random")
>
> But this gives me an error
>
> Error in svd(X) : infinite or missing values in 'x'.
>
> Can anyone tell please help me on how to deal with this dataset for
> this package or any alternative package?
>
> Sincerely,
> Souradeep
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
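
Beyond emailing the maintainer: the immediate svd() error comes from the NAs that padding introduces. A hedged base-R workaround (a sketch only, not specific to funFEM) is to put the irregular series onto a common time grid by linear interpolation with approx(), so the matrix passed to smooth.basis() contains no missing values. The grid size of 20 below is arbitrary.

```r
# Two of the poster's series, aligned on a shared grid by linear
# interpolation (rule = 2 extends the end values beyond each series' range).
ser1 <- cbind(t = c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67),
              V = c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792,
                    -0.0656, -0.0273, -0.0589))
ser2 <- cbind(t = c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38),
              V = c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264))
grid <- seq(24.5, 25.99, length.out = 20)
dat <- sapply(list(ser1, ser2), function(s)
  approx(s[, 1], s[, 2], xout = grid, rule = 2)$y)
dim(dat)  # 20 x 2: one column per curve, no missing values
```

Whether flat extrapolation is scientifically defensible for these curves is a separate question the poster would need to judge.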



Re: [R] Problem with LM

2018-12-18 Thread Bert Gunter
... Perhaps worth adding is the use of poly() rather than separately
created terms for (non-/orthogonal) polynomials:

lm(y ~ poly(x, degree = 2))  # orthogonal polynomial of degree 2

see ?poly for details.
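
A quick sketch with made-up data showing that the two parameterizations fit the same model:

```r
set.seed(1)
x <- 1:20
y <- 2 + 0.5 * x - 0.03 * x^2 + rnorm(20, sd = 0.2)
fit_raw  <- lm(y ~ x + I(x^2))           # separately created terms
fit_poly <- lm(y ~ poly(x, degree = 2))  # orthogonal polynomial
# Same fitted values; only the coefficient parameterization differs.
all.equal(fitted(fit_raw), fitted(fit_poly))  # TRUE
```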

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Dec 18, 2018 at 6:14 PM rsherry8  wrote:

> Richard,
>
> It is now working.
>
> Thank you very much.
>
> Bob
> On 12/18/2018 7:10 PM, Richard M. Heiberger wrote:
> > ## This example, with your variable names, works correctly.
> >
> > z2 <- data.frame(y=1:5, x=c(1,5,2,3,5), x2=c(1,5,2,3,5)^2)
> > z2
> > class(z2)
> > length(z2)
> > dim(z2)
> >
> > lm(y ~ x + x2, data=z2)
> >
> > ## note that that variable names y, x, x2 are column names of the
> > ## data.frame z2
> >
> > ## please review the definitions and examples of data.frame in
> ?data.frame
> > ## also the argument requirements for lm in ?lm
> >
> > On Tue, Dec 18, 2018 at 6:32 PM rsherry8  wrote:
> >> The values read into z2 came from a CSV file. Please consider this R
> >> session:
> >>
> >>   > length(x2)
> >> [1] 1632
> >>   > length(x)
> >> [1] 1632
> >>   > length(z2)
> >> [1] 1632
> >>   > head(z2)
> >> [1] 28914.0 28960.5 28994.5 29083.0 29083.0 29083.0
> >>   > tail(z2)
> >> [1] 32729.65 32751.85 32386.05 32379.75 32379.15 31977.15
> >>   > lm ( y ~ x2 + x, z2 )
> >> Error in eval(predvars, data, env) :
> >> numeric 'envir' arg not of length one
> >>   > lm ( y ~ x2 + x, as.data.frme(z2) )
> >> Error in as.data.frme(z2) : could not find function "as.data.frme"
> >>   > lm ( y ~ x2 + x, as.data.frame(z2) )
> >> Error in eval(predvars, data, env) :
> >> numeric 'envir' arg not of length one
> >> lm(formula = y ~ x2 + x, data = as.data.frame(z2))
> >>
> >> Coefficients:
> >> (Intercept)   x2x
> >>-1.475e-091.000e+006.044e-13
> >>
> >>   > min(z2)
> >> [1] 24420
> >>   > max(z2)
> >> [1] 35524.85
> >>   > class(z2)
> >> [1] "numeric"
> >>   >
> >>
> >> where x is set to x = seq(1:1632)
> >> and x2 is set to x^2
> >>
> >> I am looking for an interpolating polynomial of the form:
> >>   Ax^2 + Bx + C
> >> I do not think the results I got make sense. I believe that I have a
> >> data type error.  I do not understand why
> >> I need to convert z2 to a data frame if it is already numeric.
> >>
> >> Thanks,
> >> Bob
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Plotting rgb proportions in R

2018-12-18 Thread Bert Gunter
3-d proportions must sum to 1 and are thus actually 2-d, so they are best
plotted as a ternary plot. Several R packages will do this for you, e.g.
package Ternary. Search "ternary plots" on rseek.org for others.
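
Whichever ternary package is used, the first step is the same: rescale each RGB triple so the three components sum to 1. A base-R sketch using two rows of the posted data:

```r
# Convert 0-255 RGB triples to compositional proportions for a ternary plot.
rgb255 <- rbind(c(249, 158, 37), c(188, 98, 7))  # Red, Green, Blue on 0-255
props <- rgb255 / rowSums(rgb255)                # divide each row by its total
rowSums(props)                                   # each row now sums to 1
```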

-- Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Dec 18, 2018 at 3:10 PM Jim Lemon  wrote:

> Hi Tasha,
> I may be right off the track, but you could plot RGB proportions on a
> 3D plot. The easiest way I can think if would be to convert your 0-255
> values to proportions:
>
> rgb_prop<-read.table(text="Red Green Blue pct
> 249 158 37 56.311
> 249 158 68 4.319
> 249 158 98 0.058
> 249 128 7 13.965
> 249 128 37 12.87
> 188 128 37 0.029
> 249 128 68 0.161
> 188 128 68 0.015
> 188 98 7 0.029
> 219 128 7 2.773
> 219 128 37 2.583
> 188 98 68 0.058
> 219 128 68 0.525
> 249 188 37 0.876
> 249 188 68 1.08
> 219 98 7 0.482
> 249 188 98 0.015
> 249 158 7 3.852",header=TRUE)
> rgb_prop$Red<-rgb_prop$Red/255
> rgb_prop$Green<-rgb_prop$Green/255
> rgb_prop$Blue<-rgb_prop$Blue/255
> library(scatterplot3d)
> scatterplot3d(rgb_prop[,1:3],cex.symbols=sqrt(rgb_prop[,4]),
>  color=rgb(rgb_prop[,1],rgb_prop[,2],rgb_prop[,3]),pch=19)
>
> then plot the RGB values on a 3D scatterplot. I have included
> arguments to make the symbols the actual RGB colors that you specify
> and their size proportional to the square root of the percentages.
>
> Jim
>
> On Wed, Dec 19, 2018 at 5:17 AM Tasha O'Hara 
> wrote:
> >
> > Hello,
> >
> > I am trying to plot specific rgb color proportions of a marine specimen
> in
> > a stacked plot using R and I was looking for some help. I have several
> > rgb proportions per specimen (an example of one is below).  I've run into
> > different examples of people using vegan or grDevices. Can anyone help
> with
> > this?
> >
> > RedGreen  Blue   %
> > 249 158 37 56.311
> > 249 158 68 4.319
> > 249 158 98 0.058
> > 249 128 7 13.965
> > 249 128 37 12.87
> > 188 128 37 0.029
> > 249 128 68 0.161
> > 188 128 68 0.015
> > 188 98 7 0.029
> > 219 128 7 2.773
> > 219 128 37 2.583
> > 188 98 68 0.058
> > 219 128 68 0.525
> > 249 188 37 0.876
> > 249 188 68 1.08
> > 219 98 7 0.482
> > 249 188 98 0.015
> > 249 158 7 3.852
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Combine recursive lists in a single list or data frame and write it to file

2018-12-19 Thread Bert Gunter
Does ?unlist not help? Why not?
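
To make the hint concrete, a toy sketch (not the poster's data, whose exact nesting is unknown): flatten one level of nesting with unlist(recursive = FALSE), then append each table to a single text file.

```r
# Toy nested structure: 2 lists holding 3 matrices in total.
nested <- list(list(matrix(1:4, 2), matrix(1:6, 2)), list(matrix(1:9, 3)))
flat <- unlist(nested, recursive = FALSE)  # one flat list of 3 matrices
out <- tempfile(fileext = ".txt")
for (m in flat) {
  write.table(m, file = out, append = TRUE, quote = TRUE, col.names = FALSE)
}
length(flat)  # 3
```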

Bert


On Wed, Dec 19, 2018, 5:13 PM Ek Esawi wrote:

> Hi All—
>
>  I am using the R tabulizer package to extract tables from pdf files.
> The output is a set of lists of matrices. The package extracts tables
> and a lot of extra stuff which is nearly impossible to clean with
> RegEx. So, I want to clean it manually.
> To do so I need to (1) combine all lists in a single list or data
> frame and (2) then write the single entity to a text file to edit it.
> I could not figure out how.
>
> I tried something like this but it did not work.
> lapply(MyTables, function(x)
> lapply(x,write.table(file="temp.txt",append = TRUE)))
>
>  Any help is greatly appreciated.
>
>  Here is my code:
>
> install.packages("rJava");library(rJava)
> install.packages("tabulizer");library(tabulizer)
> MyPath <- "C:/Users/name/Documents/tEMP"
> ExtTable <- function (Path,CalOrd){
>   FileNames <- dir(Path, pattern =".(pdf|PDF)",full.names = TRUE)
>   MyFiles <- lapply(FileNames, function(i) extract_tables(i,method =
> "stream"))
>   if(CalOrd == "Yes"){
> MyOFiles <- gsub("(\\s.*)|(.pdf|.PDF)","",basename(FileNames))
> MyOFiles <- match(MyOFiles,month.name)
> MyNFiles <- MyFiles[order(MyOFiles)]}
>   else
> MyFiles
> }
> MyTables <- ExtTable(Path=MyPath,CalOrd = "No")
>
> Here is a cleaned portion of the output. The whole output consists of 3
> lists, containing 12, 15, and 12 sub-lists respectively.
>
>  [[2]][[2]]
>  [,1][,2][,3][,4]  [,5][,6][,7][,8][,9]
> [,10]
>  [1,] ""  "Avg."  "+_ lo" "n"   "Med."  ""  "Avg."  "+_
> lo" "n"   "Med."
>  [2,] "SiOz"  "44.0"  "1.26"  "375" "44.1"  "Nb""4.8"   "6.3"
>  "58"  "2.7"
>  [3,] "T i O  2"  "0.09"  "0.09"  "561" "0.09"  "Mo(b)" "50""30"
>  "3"   "35"
>  [4,] "A1203" "2.27"  "1.10"  "375" "2.20"  "Ru(b)" "12.4"  "4.1"
>  "3"   "12"
>  [5,] "FeO total" "8.43"  "1.14"  "375" "8.19"  "Pd(b)" "3.9"   "2.1"
>  "19"  "4.1"
>  [6,] "MnO"   "0.14"  "0.03"  "366" "0.14"  "Ag(b)" "6.8"   "8.3"
>  "17"  "4.8"
>  [7,] "MgO"   "41.4"  "3.00"  "375" "41.2"  "Cd(b)" "41""14"
>  "16"  "37"
>  [8,] "CaO"   "2.15"  "1.11"  "374" "2.20"  "In(b)" "12""4"
>  "19"  "12"
>  [9,] "Na20"  "0.24"  "0.16"  "341" "0.21"  "Sn(b)" "54""31"
>  "6"   "36"
> [10,] "K20"   "0.054" "0.11"  "330" "0.028" "Sb(b)" "3.9"   "3.9"
>  "11"  "3.2"
> [11,] "P205"  "0.056" "0.11"  "233" "0.030" "Te(b)" "11""4"
>  "18"  "10"
> [12,] "Total" "98.88" ""  """98.43" "Cs(b)" "10""16"
>  "17"  "1.5"
> [13,] ""  ""  ""  """"  "Ba""33""52"
>  "75"  "17"
> [14,] "Mg-value"  "89.8"  "1.1"   "375" "90.0"  "La""2.60"  "5.70"
>  "208" "0.77"
> [15,] "Ca/AI" "1.28"  "1.6"   "374" "1.35"  "Ce""6.29"  "11.7"
>  "197" "2.08"
> [16,] "AI/Ti" "22""29""361" "22""Pr""0.56"  "0.87"
>  "40"  "0.21"
> [17,] "F e / M n" "60""10""366" "59""Nd""2.67"  "4.31"
>  "162" "1.52"
> [18,] ""  ""  ""  """"  "Sm""0.47"  "0.69"
>  "214" "0.25"
> [19,] "Li""1.5"   "0.3"   "6"   "1.5"   "Eu""0.16"  "0.21"
>  "201" "0.097"
> [20,] "B" "0.53"  "0.07"  "6"   "0.55"  "Gd""0.60"  "0.83"
>  "67"  "0.31"
> [21,] "C" "110"   "50""13"  "93""Tb""0.070"
> "0.064" "146" "0.056"
> [22,] "F" "88""71""15"  "100"   "Dy""0.51"  "0.35"
>  "58"  "0.47"
> [23,] "S" "157"   "77""22"  "152"   "Ho""0.12"  "0.14"
>  "54"  "0.090"
> [24,] "C1""53""45""15"  "75""Er""0.30"  "0.22"
>  "52"  "0.28"
> [25,] "Sc""12.2"  "6.4"   "220" "12.0"  "Tm""0.038"
> "0.026" "40"  "0.035"
> [26,] "V" "56""21""132" "53""Yb""0.26"  "0.14"
>  "201" "0.27"
> [27,] "Cr""2690"  "705"   "325" "2690"  "Lu""0.043"
> "0.023" "172" "0.045"
> [28,] "Co""112"   "10""166" "111"   "Hf""0.27"  "0.30"
>  "71"  "0.17"
> [29,] "Ni""2160"  "304"   "308" "2140"  "Ta""0.40"  "0.51"
>  "38"  "0.23"
> [30,] "Cu""11""9" "94"  "9" "W(b)"  "7.2"   "5.2"
>  "6"   "4.0"
> [31,] "Zn""65""20""129" "60""Re(b)" "0.13"  "0.11"
>  "18"  "0.09"
> [32,] "Ga""2.4"   "1.3"   "49"  "2.4"   "Os(b)" "4.0"   "1.8"
>  "18"  "3.7"
> [33,] "Ge""0.96"  "0.19"  "19"  "0.92"  "Ir(b)" "3.7"   "0.9"
>  "34"  "3.0"
> [34,] "As""0.11"  "0.07"  "7"   "0.10"  "Pt(b)" "7" "-"
>  "1"   "-"
> [35,] "Se""0.041" "0.056" "18"  "0.025" "Au(b)" "0.65"  "0.53"
>  "30"  "0.5"
> [36,] "Br""0.01"  "0.01"  "6"   "0.01"  "Tl(b)" "1.2"   "1.0"
>  "13"  "0.9"
> [37,] "Rb""1,9"   "4.8"   "97"  "0.38"  "Pb""0.16"  "0.11"
>  "17"  "0.16"
> [38,] "Sr""49""60""110" "20""Bi(b)" "1.7"   "0.7"
>  "13"  "1.6"
> [39,] "Y" "4.4"   "5.5"   "86"  "3.1"   "Th*"   "0.71"  "1.2"
>  "71"  "0.22"
> [40,] "Zr""21""42""82"  "8.0"   "U"   

Re: [R] Reformatting output of Forecasts generated by mlp model

2018-12-20 Thread Bert Gunter
The printed output you have shown is meaningless (and maybe mangled, since
you posted in HTML): it is the result of a call to a print method for the
forecast object. You need to examine that object (e.g. via str()) and/or
the print method to see how to extract the data you want in whatever form
you want them. If these remarks don't make sense to you, I suggest you
spend some time with an R tutorial or two to learn about (S3, I assume)
objects and methods and how R uses these for printing output. The important
point is: what you see in printed output may not at all "look like" the
structure of the object from which the output is obtained.

You have also failed to tell us what package(s) you are using, which is
part of the requested minimal info. There appear to be at least two that
seem relevant.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Dec 20, 2018 at 7:12 AM Paul Bernal  wrote:

> Dear friends,
>
> Hope you are doing great. I am using the multilayer perceptron model
> (provided by R's mlp() function) for time series forecasting, but I don't
> know how to reformat the forecast output it generates.
>
> mydata <- dput(datframe$Transits)
>
> > dput(datframe$Transits)
> c(77L, 75L, 85L, 74L, 73L, 96L, 82L, 90L, 91L, 81L, 81L, 77L,
> 84L, 81L, 82L, 86L, 81L, 81L, 83L, 88L, 88L, 92L, 97L, 89L, 96L,
> 94L, 94L, 95L, 92L, 94L, 95L, 95L, 87L, 102L, 94L, 91L, 93L,
> 86L, 96L, 85L, 81L, 84L, 88L, 91L, 89L, 89L, 93L, 83L, 92L, 92L,
> 76L, 98L, 80L, 95L, 89L, 92L, 96L, 86L, 98L, 84L, 90L, 95L, 90L,
> 99L, 85L, 91L, 90L, 88L, 97L, 93L, 97L, 87L, 92L, 87L, 86L, 85L,
> 82L, 90L, 89L, 101L, 94L, 92L, 109L, 101L, 103L, 96L, 89L, 102L,
> 87L, 101L, 100L, 99L, 101L, 98L, 101L, 90L, 106L, 90L, 99L, 105L,
> 91L, 96L, 91L, 96L, 93L, 101L, 105L, 98L, 110L, 100L, 101L, 106L,
> 99L, 111L, 114L, 112L, 113L, 120L, 105L, 111L, 114L, 111L, 118L,
> 115L, 108L, 120L, 119L, 120L, 118L, 117L, 121L, 111L, 114L, 107L,
> 121L, 109L, 106L, 116L, 105L, 119L, 120L, 123L, 126L, 117L, 127L,
> 128L, 132L, 138L, 120L, 132L, 134L, 136L, 144L, 152L, 155L, 146L,
> 155L, 138L, 141L, 146L, 123L, 133L, 123L, 137L, 133L, 143L, 132L,
> 126L, 134L, 129L, 138L, 134L, 132L, 139L, 130L, 152L, 150L, 153L,
> 161L, 152L, 154L, 154L, 138L, 149L, 137L, 144L, 146L, 152L, 140L,
> 151L, 168L, 148L, 157L, 152L, 153L, 166L, 157L, 156L, 166L, 168L,
> 179L, 188L, 190L, 185L, 184L, 185L, 202L, 191L, 175L, 197L, 187L,
> 195L, 204L, 218L, 220L, 212L, 220L, 211L, 221L, 204L, 196L, 209L,
> 205L, 217L, 211L, 212L, 224L, 206L, 225L, 206L, 219L, 232L, 220L,
> 242L, 241L, 261L, 252L, 261L, 269L, 251L, 264L, 261L, 266L, 274L,
> 236L, 270L, 263L, 276L, 276L, 300L, 303L, 301L, 318L, 294L, 308L,
> 308L, 269L, 303L, 302L, 318L, 282L, 311L, 305L, 304L, 309L, 298L,
> 295L, 295L, 281L, 280L, 287L, 313L, 276L, 296L, 307L, 307L, 309L,
> 287L, 286L, 290L, 261L, 285L, 279L, 286L, 284L, 267L, 271L, 259L,
> 268L, 243L, 242L, 237L, 208L, 250L, 237L, 267L, 257L, 276L, 277L,
> 269L, 282L, 264L, 270L, 270L, 251L, 272L, 271L, 288L, 266L, 283L,
> 266L, 270L, 282L, 272L, 264L, 269L, 253L, 269L, 283L, 288L, 275L,
> 301L, 292L, 283L, 287L, 261L, 265L, 269L, 234L, 251L, 261L, 262L,
> 249L, 256L, 255L, 253L, 253L, 233L, 234L, 235L, 217L, 244L, 232L,
> 261L, 236L, 252L, 242L, 252L, 251L, 230L, 240L, 254L, 226L, 267L,
> 245L, 263L, 261L, 286L, 281L, 265L, 274L, 250L, 260L, 265L, 242L,
> 251L, 249L, 251L, 247L, 248L, 234L, 206L, 219L, 194L, 218L, 209L,
> 192L, 207L, 200L, 208L, 208L, 209L, 213L, 216L, 219L, 195L, 217L,
> 217L, 197L, 210L, 211L, 229L, 232L, 227L, 233L, 217L)
>
> TransitsDat <- ts(mydata, start=c(1985,10), end=c(2018,9), frequency=12)
>
> Model <- mlp(TransitsDat)
>
> ModelForecasts <- forecast(Model, h=10)
>
> And the output I get is this:
>
>Jan Feb Mar Apr May
> Jun Jul Aug Sep Oct Nov Dec
> 2018
> 224.9970134 221.7932717 220.2698789
> 2019 223.8440115 219.3309631 221.5382052 221.5720276 222.0963057
> 223.8392450 224.0982199
>
> But I would like to have the results in tabular form, where the first
> column is the date with format mmm-yyyy (for example jan-2019) and the
> second column holds the actual forecasts.
>
> Like this
> Date       Transit Forecast
> Jan-2019   230
> Feb-2019   217
> etc.
>
> Any guidance will be greatly appreciated,
>
> Best regards,
>
> Paul
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/li

Re: [R] Glitch in Kruskal-Wallis test?

2018-12-22 Thread Bert Gunter
... Moreover, you should not analyze proportions in this way, which treats
.5 = 2/4 or .5 = 2000/4000 identically. As David said, you need to work
with a statistician.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Dec 22, 2018 at 7:32 AM David L Carlson  wrote:

> You may need to spend some more time with the statistician who needs to
> see your data. It is not clear if you have a two sample test or a paired
> sample test. Kruskal-Wallis expects data for each observation, not grouped
> data. Without the observations, the test cannot compute the sample size and
> the degrees of freedom. You have run kruskal.test separately on each
> sample. The kruskal.test is designed for comparing two or more samples.
>
> 
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
> -Original Message-
> From: R-help  On Behalf Of Jenny Liu
> Sent: Saturday, December 22, 2018 7:10 AM
> To: Michael Dewey 
> Cc: r-help@r-project.org
> Subject: Re: [R] Glitch in Kruskal-Wallis test?
>
> Hi Michael,
>
> Thank you for your reply! I'm testing the difference in proportions. Temp
> is temperature, and Prop is the proportion of insect pupae that survived at
> that temperature. I was told by a statistician that the K-W was appropriate
> for testing proportions, but perhaps you know of an alternative? I have
> already tested for heteroscedasticity using the Breusch-Pagan test.
>
> Thanks again,
> Jenny
>
>
>
> On Dec 22, 2018 7:38 AM, "Michael Dewey"  wrote:
>
> Dear Jenny
>
> What exactly do you think you are testing here? You are telling K-W you
> have seven groups each with a single value which is not the usual
> situation for K-W.
>
> Michael
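
A small sketch of Michael's point: with a single value per group, the K-W statistic is a function of the ranks alone, so any seven untied values give exactly the same result.

```r
# With one observation in each of k = 7 groups (and no ties), the K-W
# statistic is always k - 1 = 6, whatever the actual values are.
g <- factor(1:7)
h1 <- kruskal.test(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7), g)$statistic
h2 <- kruskal.test(c(10, 25, 31, 48, 52, 66, 79), g)$statistic
c(h1, h2)  # both 6
```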
>
>
> On 22/12/2018 04:58, Jenny Liu wrote:
> > Hi everyone,
> > I have been running a K-W test with the attached data, PupMort1. My code:
> > kruskal.test(Prop~Temp,data=PupMort1)
> > However, I found that I get the exact same result when I change the
> x-values, as
> > in the attached data PupMort2.
> > Test run with PupMort1Kruskal-Wallis rank sum testdata:  Prop by Temp
>
> > Kruskal-Wallis chi-squared = 6, df = 6, p-value = 0.4232
> > Test run with PupMort2Kruskal-Wallis rank sum testdata:  Prop by Temp
>
> > Kruskal-Wallis chi-squared = 6, df = 6, p-value = 0.4232
> > Does anybody know why this is happening?
> > Thank you!
> > Jenny
> >
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Problem with Kruskal–Wallis test

2018-12-25 Thread Bert Gunter
"So, I'm not an expert in R and statistics" 

So you need to seek local help from someone who is. Statistics is usually
off-topic for this list -- it is about R programming primarily. And online
is probably not a good venue for the sort of discussion you need anyway.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Dec 25, 2018 at 5:37 AM Giuseppe Cillis  wrote:

> Dear Michael,
> Thanks for your answer.
> So, I'm not an expert in R and statistics, how can I create this interval
> of confidence of groups?
> Thanks
> Gc
>
> Il giorno sab 22 dic 2018, 13:34 Michael Dewey 
> ha
> scritto:
>
> > Dear Giuseppe
> >
> > If I understand you correctly you have a very large sample size so it is
> > not surprising that you get very small p-values. Even a scientifically
> > uninteresting difference can become statistically significant with large
> > samples. You probably need to define a metric for meaningful differences
> > between groups and calculate a confidence interval for it.
> >
> > Michael
> >
> > On 21/12/2018 15:37, Giuseppe Cillis wrote:
> > > Dear all,
> > > I am a beginner with R (and also with the statistics) for which I hope
> to
> > > be clear.
> > > I should do this non-parametric test on data I extracted from maps.
> > > In practice I have a column that represents the landscape Dynamics of a
> > > certain time period (there are 3 dynamics, each of them marked by the
> > > number 1, 2 or 3) and the other column with the values of a topographic
> > > variable (for example the slope). In all, there are more than 90,000
> > pairs
> > > of values.
> > > Going to do the test in R, for all the dynamics and for all the
> > variables,
> > > I get elevated chi-square values (even on the order of
> > > thousands) and a p-value always < 2.2e-16. Why? Where can the error
> > be? In
> > > Thanks in advance
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> > --
> > Michael
> > http://www.dewey.myzen.co.uk/home.html
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
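
As a rough sketch of Michael's suggestion from the quoted thread (simulated data, two groups only): wilcox.test() with conf.int = TRUE gives a confidence interval for the location shift, which at sample sizes in the tens of thousands is far more informative than the inevitably tiny p-value.

```r
set.seed(1)
a <- rnorm(30000, mean = 10.00)  # e.g. slope values under dynamic 1
b <- rnorm(30000, mean = 10.05)  # a practically negligible shift
wt <- wilcox.test(a, b, conf.int = TRUE)
wt$p.value   # "significant" purely because n is huge
wt$conf.int  # but the estimated shift is tiny, about -0.05
```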



Re: [R] Help converting .txt to .csv file

2018-12-26 Thread Bert Gunter
Inline.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Dec 26, 2018 at 3:04 PM Spencer Brackett <
spbracket...@saintjosephhs.com> wrote:

> Good evening,
>
> I am attempting to analyze the protein expression data contained within
> these two ICGC, TCGA datasets (one for GBM and the other for LGG)
>
> ...
>   When I tried to transfer the files from .txt (via Notepad) to .csv (via
> Excel), the data appeared in the columns as unorganized and random
> script... not like how a typical csv should be arranged at all. I need the
> dataset to be converted into .csv in order to analyze it in R,


Huh?? Why do you think this? A csv is just a comma delimited text file.

R can input pretty much any kind of file, ONCE YOU KNOW THE FORMAT OF WHAT
YOU ARE INPUTTING. This should be provided by the links that you gave. Then
see ?read.table or, more generally, ?scan for how to read the (text) file
into R into whatever data structure you need. See also the R data
import/export manual. Or possibly post to the Bioconductor list where they
specialize in this sort of thing and may already have packages that can
access the repositories and bring in the data in the form you need them.
They also have lots of software there for analysis, too.
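
For instance, if the downloads are tab-separated text (common for ICGC/TCGA exports, though the delimiter should be verified), read.delim() loads them directly with no Excel round-trip. The file contents and column names below are made up for illustration:

```r
f <- tempfile(fileext = ".txt")
writeLines(c("gene\tsample\texpression",  # hypothetical header
             "EGFR\tS1\t7.2",
             "IDH1\tS2\t5.9"), f)
dat <- read.delim(f)  # sep = "\t" and header = TRUE are the defaults
dim(dat)              # 2 rows, 3 columns
```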

Cheers,
Bert






> which is why
> I am hoping someone here might help me in doing that. If not, is there
> perhaps some other way that I could analyze the datasets in R, which again
> are downloaded from the ICGC data portal?
>
> Best,
>
> Spencer Brackett
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Proposed changes to the R Language

2018-12-28 Thread Bert Gunter
I would assume r-devel is where this sort of query should be posted as we
mere users have nothing to say about this.

However, I've seen discussions and talks about better languages for
scientific (but data science?) programming -- Matlab, Julia, Scipy, etc. --
for at least a decade. But with a library of now over 10,000 packages on
CRAN and yet more on Bioconductor and github -- that's a lot of inertia to
overcome.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Dec 28, 2018 at 7:13 AM Jeff Newmiller 
wrote:

> I have no idea. But I hope not... that sounds like a different tool than
> R, just as C++ is a different tool than C.
>
> On December 27, 2018 4:36:42 PM PST, "Angus C Hewitt (DHHS)" <
> angus.hew...@dhhs.vic.gov.au> wrote:
> >Hi Team,
> >
> >Please advise if there are any plans in the pipeline for changing the R
> >language to "B" as proposed in the below-mentioned statistics seminar
> >held at Auckland Uni.
> >
> >https://www.youtube.com/watch?v=88TftllIjaY
> >
> >
> >Kind Regards,
> >
> >Angus Hewitt
> >Senior Analyst | Decision Support
> >System Design, Planning & Decision Making |  Health & Well Being
> >Department of Health and Human Services | 19th floor, 50 Lonsdale
> >Street, Melbourne Victoria 3000
> >t. 9096 5859  | m. 0468 364 744 | e. angus.hew...@dhhs.vic.gov.au
> >
> >
> >
> >   [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my phone. Please excuse my brevity.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Recursive Feature Elimination with SVM

2019-01-02 Thread Bert Gunter
Note: **NOT** reproducible (only you have "data.csv").

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Jan 1, 2019 at 11:14 PM Priyanka Purkayastha <
ppurkayastha2...@gmail.com> wrote:

> This is the code I tried,
>
> library(e1071)
> library(caret)
> library(ROCR)
>
> data <- read.csv("data.csv", header = TRUE)
> set.seed(998)
>
> inTraining <- createDataPartition(data$Class, p = .70, list = FALSE)
> training <- data[ inTraining,]
> testing  <- data[-inTraining,]
>
> while(length(data)>0){
>
> ## Building the model 
> svm.model <- svm(Class ~ ., data = training,
>
> cross=10,metric="ROC",type="eps-regression",kernel="linear",na.action=na.omit,probability
> = TRUE)
> print(svm.model)
>
>
> ## auc  measure ###
>
> #prediction and ROC
> svm.model$index
> svm.pred <- predict(svm.model, testing, probability = TRUE)
>
> #calculating auc
> c <- as.numeric(svm.pred)
> c = c - 1
> pred <- prediction(c, testing$Class)
> perf <- performance(pred,"tpr","fpr")
> plot(perf,fpr.stop=0.1)
> auc <- performance(pred, measure = "auc")
> auc <- auc@y.values[[1]]
> print(length(data))
> print(auc)
>
> #compute the weight vector
> w = t(svm.model$coefs)%*%svm.model$SV
>
> #compute ranking criteria
> weight_matrix = w * w
>
> #rank the features
> w_transpose <- t(weight_matrix)
> w2 <- as.matrix(w_transpose[order(w_transpose[,1], decreasing = FALSE),])
> a <- as.matrix(w2[which(w2 == max(w2)),]) #to get the rows with minimum
> values
> row.names(a) -> remove
> training<- data[,setdiff(colnames(data),remove)]
> }
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Wed, Jan 2, 2019 at 11:18 AM David Winsemius 
> wrote:
>
> >
> > On 1/1/19 5:31 PM, Priyanka Purkayastha wrote:
> > > Thankyou David.. I tried the same, I gave x as the data matrix and y
> > > as the class label. But it returned an empty "featureRankedList". I
> > > get no output when I try the code.
> >
> >
> > If you want people to spend time on this you should post a reproducible
> > example. See the Posting Guide ... and learn to post in plain text.
> >
> >
> > --
> >
> > David
> >
> > >
> > > On Tue, 1 Jan 2019 at 11:42 PM, David Winsemius
> > > mailto:dwinsem...@comcast.net>> wrote:
> > >
> > >
> > > On 1/1/19 4:40 AM, Priyanka Purkayastha wrote:
> > > > I have a dataset (data) with 700 rows and 7000 columns. I am
> > > trying to do
> > > > recursive feature selection with the SVM model. A quick google
> > > search
> > > > helped me get a code for a recursive search with SVM. However, I
> > > am unable
> > > > to understand the first part of the code, How do I introduce my
> > > dataset in
> > > > the code?
> > >
> > >
> > > Generally the "labels" is given to such a machine learning device
> > > as the
> > > y argument, while the "features" are passed as a matrix to the x
> > > argument.
> > >
> > >
> > > --
> > >
> > > David.
> > >
> > > >
> > > > If the dataset is a matrix, named data. Please give me an
> > > example for
> > > > recursive feature selection with SVM. Bellow is the code I got
> for
> > > > recursive feature search.
> > > >
> > > >  svmrfeFeatureRanking = function(x,y){
> > > >
> > > >  #Checking for the variables
> > > >  stopifnot(!is.null(x) == TRUE, !is.null(y) == TRUE)
> > > >
> > > >  n = ncol(x)
> > > >  survivingFeaturesIndexes = seq_len(n)
> > > >  featureRankedList = vector(length=n)
> > > >  rankedFeatureIndex = n
> > > >
> > > >  while(length(survivingFeaturesIndexes)>0){
> > > >  #train the support vector machine
> > > >  svmModel = svm(x[, survivingFeaturesIndexes], y, cost = 10,
> > > > cachesize=500,
> > > >  scale=FALSE, type="C-classification",
>

Re: [R] Accessing Data Frame

2019-01-03 Thread Bert Gunter
I do not know how you define "quick way," but there is an "==" method
for data frames (see ?"==" and the links therein for details), which allows
the straightforward use of basic R functionality:

## using your 'deck' and 'topCard' examples:

> deck [ apply(deck == topCard[rep(1,nrow(deck)), ],1, all),]
   face   suit value
1  king spades    13

> deck [ !apply(deck == topCard[rep(1,nrow(deck)),],1, all), ]
   face   suit value
2 queen spades    12
3  jack spades    11
4   ten spades    10

> topCard <- deck[2, ]
> deck [ !apply(deck == topCard[rep(1, nrow(deck)), ],1, all), ]
   face   suit value
1  king spades    13
3  jack spades    11
4   ten spades    10

This approach can be trivially changed to using only a subset of columns to
define the "filter."
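For instance, a minimal sketch (using the column names from the example
deck above, and matching on suit and value only):

```r
## Match on a subset of columns only -- here suit and value:
cols <- c("suit", "value")
keep <- !apply(deck[cols] == topCard[rep(1, nrow(deck)), cols], 1, all)
deck[keep, ]   # deck minus cards matching topCard on suit and value
```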

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jan 3, 2019 at 9:16 AM Benoit Galarneau 
wrote:

> You are correct, the anti_join is working fine.
> However, I still find it strange there is no "quick" way to find the
> index of an item extracted from the data frame.
>
> This works as it returns the deck without the card no 10.
> aCard = deck[10,]
> cardNo = which(deck$value == aCard$value & deck$suit == aCard$suit)
> deck[-cardNo,]
>
> But I'm still puzzled by the complexity of finding back the index of
> the card with the long statement.
>
> Another approach that "works" is the following, but I still find it
> strange to depend on data frame row names to find the index:
> cardNo <- as.numeric(row.names(aCard))
>
> Apologies if the above questions are strange. I'm coming from the C++
> world with some bias toward objects. Again, since "aCard" is extracted
> from the data frame, I assumed (bias?) there would be a simple way to
> find the item back in the data frame it came from. Some kind of
> indexOf() or similar on the container and item.
>
> Benoit
>
> Ista Zahn  a écrit :
>
> > Hi Benoit,
> >
> > You can select rows from deck matched in aCard using
> >
> > merge(deck, aCard)
> >
> > Selecting rows that don't match is bit more difficult. You could do
> > something like
> >
> > isin <- apply(mapply(function(x, y) x %in% y, deck, topCard),
> >1,
> >all)
> > deck[!isin, ]
> >
> > perhaps.
> >
> > Alternatively, you can use anti_join from the dplyr package:
> >
> > library(dplyr)
> > anti_join(deck, topCard)
> >
> > Best,
> > Ista
> >
> > On Thu, Jan 3, 2019 at 10:38 AM Benoit Galarneau
> >  wrote:
> >>
> >> Hi everyone,
> >> I'm new to the R world.
> >> Probably a newbie question but I am stuck with some concept with data
> frame.
> >> I am following some examples in the "Hands-On Programming with R".
> >>
> >> In short, how can I access/filter items in a data frame using a
> variable.
> >>
> >> One example consists of manipulating elements from a deck of card:
> >>
> >> > deck
> >>  face suit value
> >> 1   king   spades    13
> >> 2  queen   spades    12
> >> 3   jack   spades    11
> >> 4    ten   spades    10
> >> etc.
> >>
> >> Let's say I want to remove or filter out the first card. I know I
> >> could do deck[-1].
> >>
> >> But let's say I have: topCard <- deck[1,]
> >>
> >> topCard is then a list of 3 elements
> >> > topCard
> >>face   suit value
> >> 1 king spades    13
> >>
> >> My question is the following, how can I remove or filter out the deck
> >> using the topCard variable.
> >>
> >> In my programmer's head, something similar to this should "work":
> >> > deck[10,]
> >> face   suit value
> >> 10 four spades 4
> >> > aCard <- deck[10,]
> >> > aCard
> >> face   suit value
> >> 10 four spades 4
> >> > deck[aCard]
> >> Error in `[.default`(deck, aCard) : invalid subscript type 'list'
> >>
> >> Wihout having to specify all elements in the logical tests.
> >>
> >> deck[deck$face == aCard$face & deck$suit == aCard$suit & deck$value ==
> >> aCard$value,]
> >> face   suit value
> >> 10 four spades 4
> >>
> >> ___

Re: [R] BLUPS from lme models

2019-01-03 Thread Bert Gunter
No.

But as this is a statistical issue and not an R programming issue, it is
off topic here. Post on stats.stackexchange.com or other statistical list
and/or spend time with web tutorials.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jan 3, 2019 at 5:55 PM Patrick Connolly 
wrote:

> The bottom of page 276 of the "Gold Book" Modern Applied Statistics by
> Venables and Ripley, 4th edition, the last sentence states:
>
> "Random effects are set either to zero or to their BLUP values."
>
> Am I correct in inferring from that, it amounts respectively to
> removing the random term from the model, or setting it as a fixed
> effect?  To get something meaningful, one needs to choose which random
> effects are relevant to the topic under study?
>
> Thank you.
>
> --
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>___Patrick Connolly
>  {~._.~}   Great minds discuss ideas
>  _( Y )_ Average minds discuss events
> (:_~*~_:)  Small minds discuss people
>  (_)-(_)  . Eleanor Roosevelt
>
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] How to perform Mixed Design ANOVA on MICE imputed dataset in R?

2019-01-04 Thread Bert Gunter
You might wish to post on the r-sig-mixed-models list, which is
specifically devoted to mixed effects models, instead of here. You are more
likely to find both interest and expertise there.
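For what it's worth, the generic mice workflow for pooled analyses is
sketched below; note that ezANOVA does not plug into with()/pool()
directly, which is the crux of the question. (nhanes is an example data
set shipped with mice; the lm() formula is illustrative only.)

```r
library(mice)
imp <- mice(nhanes, m = 5, seed = 2018, printFlag = FALSE)  # impute
fit <- with(imp, lm(bmi ~ age + hyp))  # fit the model on each completed set
pool(fit)                              # pool estimates via Rubin's rules
```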

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Jan 4, 2019 at 7:49 AM Lisa Snel  wrote:

> Dear Ista,
>
> Thank you for your response and the link you have sent me. However, I know
> the basics of the MICE package (how to impute, pool, and do basic
> analyses), but the problem is that I cannot find anything about this
> specific analysis.
>
> Best,
> Lisa
> 
> Van: Ista Zahn 
> Verzonden: vrijdag 4 januari 2019 15:11
> Aan: Lisa Snel
> CC: r-help@r-project.org
> Onderwerp: Re: [R] How to perform Mixed Design ANOVA on MICE imputed
> dataset in R?
>
> Hi Lisa,
>
> The package web page at http://stefvanbuuren.github.io/mice/ has all
> the info you need to get started.
>
> Best,
> Ista
>
> On Fri, Jan 4, 2019 at 3:29 AM Lisa Snel  wrote:
> >
> > Hi all,
> >
> > I have a question about performing a Mixed Design ANOVA in R after
> multiple imputation using MICE. My data is as follows:
> >
> > id <- c(1,2,3,4,5,6,7,8,9,10)
> > group <- c(0,1,1,0,0,1,0,0,0,1)
> > measure_1 <- c(60,80,90,54,60,61,77,67,88,90)
> > measure_2 <- c(55,88,88,55,70,62,78,66,65,92)
> > measure_3 <- c(58,88,85,56,68,62,89,62,70,99)
> > measure_4 <- c(64,80,78,92,65,64,87,65,67,96)
> > measure_5 <- c(64,85,80,65,74,69,90,65,70,99)
> > measure_6 <- c(70,83,80,55,73,64,91,65,91,89)
> > dat <- data.frame(id, group, measure_1, measure_2, measure_3, measure_4,
> measure_5, measure_6)
> > dat$group <- as.factor(dat$group)
> >
> > So: we have 6 repeated measurements of diastolic blood pressure (measure
> 1 till 6). The grouping factor is gender, which is called group. This
> variable is coded 1 if male and 0 if female. Before multiple imputation, we
> have used the following code in R:
> >
> > library(reshape)
> > library(reshape2)
> > datLong <- melt(dat, id = c("id", "group"), measured = c("measure_1",
> "measure_2", "measure_3", "measure_4", "measure_5", "measure_6"))
> > datLong
> >
> > colnames(datLong) <- c("ID", "Gender", "Time", "Score")
> > datLong
> > table(datLong$Time)
> > datLong$ID <- as.factor(datLong$ID)
> >
> > library(ez)
> > model_mixed <- ezANOVA(data = datLong,
> >dv = Value,
> >wid = ID,
> >within = Time,
> >between = Gender,
> >detailed = TRUE,
> >type = 3,
> >return_aov = TRUE)
> > model_mixed
> >
> > This worked perfectly. However, our data is not complete. We have
> missing values, that we impute using MICE:
> >
> > id <- c(1,2,3,4,5,6,7,8,9,10)
> > group <- c(0,1,1,0,0,1,0,0,0,1)
> > measure_1 <- c(60,80,90,54,60,61,77,67,88,90)
> > measure_2 <- c(55,NA,88,55,70,62,78,66,65,92)
> > measure_3 <- c(58,88,85,56,68,62,89,62,70,99)
> > measure_4 <- c(64,80,78,92,NA,NA,87,65,67,96)
> > measure_5 <- c(64,85,80,65,74,69,90,65,70,99)
> > measure_6 <- c(70,NA,80,55,73,64,91,65,91,89)
> > dat <- data.frame(id, group, measure_1, measure_2, measure_3, measure_4,
> measure_5, measure_6)
> > dat$group <- as.factor(dat$group)
> >
> > imp_anova <- mice(dat, maxit = 0)
> > meth <- imp_anova$method
> > pred <- imp_anova$predictorMatrix
> > imp_anova <- mice(dat, method = meth, predictorMatrix = pred, seed =
> 2018, maxit = 10, m = 5)
> >
> > (The imputation gives logged events, because of the made-up data and the
> simple imputation code e.g id used as a predictor. For my real data, the
> imputation was correct and valid)
> >
> > Now I have the imputed dataset of class ‘mids’. I have searched the
> internet, but I cannot find how I can perform the mixed design ANOVA on
> this imputed set, as I did before with the complete set using ezANOVA. Is
> there anyone who can and wants to help me?
> >
> >
> > Best,
> >
> > Lisa
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> 

Re: [R] Fit CMARS with R possible (Packages) ?

2019-01-04 Thread Bert Gunter
rseek.org might be a better place to search if you haven't tried there
already. However, my minimal effort there did not turn up any R software.
Maybe you can do better.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Jan 4, 2019 at 11:31 AM varin sacha via R-help 
wrote:

> Dear R-experts,
>
> We can fit MARS regression using the packages "earth" and/or "mda" or
> others packages.
> However, I am wondering if it is possible to fit a CMARS (Conic
> multivariate adaptive regression splines) using R ?
> I have googled "conic MARS with R software", I did not get anything, so
> Google is not my friend anymore !
>
> If you have any solution, would be highly appreciated.
>
> Best,
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Diff'ing 2 strings

2019-01-05 Thread Bert Gunter
I do not know what you mean in your string context, as diff in Linux finds
lines in files that differ. A reproducible example -- posting guide! --
would be most useful here.

However, something like the following strategy might be useful:

1. Break up your strings into lists of string "chunks" relevant for your
context via strsplit(). Using "" (the empty string) as the "sep" string
would break your strings into individual characters; "\n" would break them
into "lines" separated by the newline character; etc.

2. Compare your lists using e.g. lapply() and probably ?match and friends
like ?setdiff

You should also probably check out the stringr package to see if it
contains what you need. Also, if this is gene sequence related, posting on
the Bioconductor list rather than here is likely to be more fruitful.
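For instance, a minimal sketch of steps 1 and 2 with two made-up strings:

```r
## Line-oriented comparison of two strings, base R only
s1 <- "alpha\nbeta\ngamma"
s2 <- "alpha\nbeta\ndelta"
l1 <- strsplit(s1, "\n")[[1]]
l2 <- strsplit(s2, "\n")[[1]]
setdiff(l1, l2)  # lines only in s1: "gamma"
setdiff(l2, l1)  # lines only in s2: "delta"
```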

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Jan 5, 2019 at 5:58 AM Sebastien Bihorel <
sebastien.biho...@cognigencorp.com> wrote:

> Hi,
>
> Does R include an equivalent of the linux diff command?
>
> Ideally I would like to diff 2 fairly complex strings and extract the
> differences without having to save them on disk and using a system('diff
> file1 file2') command.
>
> Thanks
>
> Sebastien
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Diff'ing 2 strings

2019-01-05 Thread Bert Gunter
It's the "split" argument, not the "sep" string, as you and probably everyone
else already realizes.
And, of course, it could be a regular expression, not literally a character
string.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Jan 5, 2019 at 7:19 AM Bert Gunter  wrote:

> I do not know what you mean in your string context, as diff in Linux finds
> lines in files that differ. A reproducible example -- posting guide! --
> would be most useful here.
>
> However, maybe something of the following strategy might be useful:
>
> 1. Break up your strings into lists of string "chunks" relevant for your
> context via strsplit(). Using "" (the empty string) as the "sep" string
> would break your strings into individual characters; "\n" would break them
> into "lines" separated by the newline character; etc.
>
> 2. Compare your lists using e.g. lapply() and probably ?match and friends
> like ?setdiff
>
> You should also probably check out the stringr package to see if it
> contains what you need. Also, if this is gene sequence related, posting on
> the Bioconductor list rather than here is likely to be more fruitful.
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Jan 5, 2019 at 5:58 AM Sebastien Bihorel <
> sebastien.biho...@cognigencorp.com> wrote:
>
>> Hi,
>>
>> Does R include an equivalent of the linux diff command?
>>
>> Ideally I would like to diff 2 fairly complex strings and extract the
>> differences without having to save them on disk and using a system('diff
>> file1 file2') command.
>>
>> Thanks
>>
>> Sebastien
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



Re: [R] data frame transformation

2019-01-06 Thread Bert Gunter
Like this (using base R only)?

dat<-data.frame(id=id,letter=letter,weight=weight) # using your data

ud <- unique(dat$id)
ul = unique(dat$letter)
d <- with(dat,
  data.frame(
  letter = rep(ul, e = length(ud)),
  id = rep(ud, length(ul))
  ) )

 merge(dat[,c(2,1,3)],d, all.y = TRUE)
## resulting in:

   letter id weight
1       A  1     25
2       A  2     28
3       A  3     14
4       A  4     27
5       A  5     NA
6       B  1     13
7       B  2     14
8       B  3     NA
9       B  4     15
10      B  5      2
11      C  1     NA
12      C  2     NA
13      C  3     NA
14      C  4     NA
15      C  5     25
16      D  1     24
17      D  2     18
18      D  3     NA
19      D  4     29
20      D  5     27
21      E  1     NA
22      E  2      2
23      E  3     20
24      E  4     25
25      E  5     28


Cheers,

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Jan 6, 2019 at 5:16 AM Andras Farkas via R-help <
r-help@r-project.org> wrote:

> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with different
> dimensions and without all the line items here?
>
> we have:
>
> id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)#length of unique IDs may differ
> of course in real data set, usually in magnitude of 1
>
> letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
>
> sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))#number
> of unique "letters" is less than 4000 in the real data set and there are
> no duplicates within the same ID
> weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
>   sample(c(1:30),4),sample(c(1:30),4))#number of unique weights is
> below 50 in the real data set and there are no duplicates within the same ID
>
>
> data<-data.frame(id=id,letter=letter,weight=weight)
>
> #goal is to get the following transformation where a column is added for
> each unique letter and the weight is pulled into the column if the letter
> exist within the ID, otherwise NA
> #so we would get datatransform like below but without the many steps
> described here
>
> datatransfer<-data.frame(data,apply(data[2],2,function(x)
> ifelse(x=="A",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="B",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="C",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="D",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="E",data$weight,NA)))
>
> colnames(datatransfer)<-c("id","weight","letter","A","B","C","D","E")
> much appreciate the help,
>
> thanks
>
> Andras
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] data frame transformation

2019-01-06 Thread Bert Gunter
... and my reordering of column indices was unnecessary:
merge(dat, d, all.y = TRUE)
will do.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Jan 6, 2019 at 5:16 AM Andras Farkas via R-help <
r-help@r-project.org> wrote:

> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with different
> dimensions and without all the line items here?
>
> we have:
>
> id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)#length of unique IDs may differ
> of course in real data set, usually in magnitude of 1
>
> letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
>
> sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))#number
> of unique "letters" is less than 4000 in real data set and they are no
> duplicates within same ID
> weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
>   sample(c(1:30),4),sample(c(1:30),4))#number of unique weights is
> below 50 in real data set and they are no duplicates within same ID
>
>
> data<-data.frame(id=id,letter=letter,weight=weight)
>
> #goal is to get the following transformation where a column is added for
> each unique letter and the weight is pulled into the column if the letter
> exist within the ID, otherwise NA
> #so we would get datatransform like below but without the many steps
> described here
>
> datatransfer<-data.frame(data,apply(data[2],2,function(x)
> ifelse(x=="A",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="B",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="C",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="D",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="E",data$weight,NA)))
>
> colnames(datatransfer)<-c("id","weight","letter","A","B","C","D","E")
> much appreciate the help,
>
> thanks
>
> Andras
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Question

2019-01-08 Thread Bert Gunter
I think it's ?install.packages
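That is, a minimal sketch (bwplot and xyplot come from the lattice
package, as Rich notes below; the xyplot call is just an illustration):

```r
## Functions are not installed individually; install the package
## that provides them, then load it:
install.packages("lattice")
library(lattice)
xyplot(mpg ~ wt, data = mtcars)   # xyplot is now available
```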

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Jan 8, 2019 at 9:50 AM Rich Shepard 
wrote:

> On Tue, 8 Jan 2019, S. Mahmoud Nasrollahi wrote:
>
> > I have got a problem during working with some package in R and in spite
> of
> > trying with R help, internet and any other resources I could not succeed.
> > Indeed when I what to install some function like bwplot, boxplot, xyplot
> I
> > receive this sort of messages: Warning in install.packages : package
> > ‘xyplot’ is not available (for R version 3.5.2) Do you know how I can
> > solve that?
>
>Yep. Those plots are part of the lattice package. You can install
> lattice
> (and latticeExtra if you want) with
>
> > installpkg("lattice")
>
> Happy plotting,
>
> Rich
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] External validation for a hurdle model (pscl)

2019-01-08 Thread Bert Gunter
This list is (mostly) about R programming. Your query is (mostly) about
statistics. So you should post on a statistics site like
stats.stackexchange.com, not here; I am pretty sure you'll receive lots of
answers there.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Jan 8, 2019 at 10:18 AM Maria Eugenia Utgés 
wrote:

> Hi R-list,
> We have constructed a hurdle model some time ago.
> Now we were able to gather new data in the same city (38 new sites), and
> want to do an external validation to see if the model still performs ok.
> All the books and lectures I have read say its the best validation option
> but...
> I have made a (simple) search, but it seems that as having new data for a
> model is rare, have not found anything with the depth enough so as to
> reproduce it/adapt it to hurdle models.
>
> I have predicted the probability for non-zero counts
> nonzero <- 1 - predict(final, newdata = datosnuevos, type = "prob")[, 1]
>
> and the predicted mean from the count component
> countmean <- predict(final, newdata = datosnuevos, type = "count")
>
> I understand that "newdata" is taking into account the new values for the
> independent variables (environmental variables), is it?
>
> So, I have to compare the predicted values of y (calculated with the new
> values of the environmental variables) with the new observed values.
>
> That would be using the model (constructed with the old values), having as
> input the new variables, and having as output a "new" prediction, to be
> contrasted with the "new" observed y.
>
> These comparison would be by means of AUC, correct classification, and/or
> what other options? Results of the external validation would just be a % of
> correct predicted values? plots?
>
> Need some guidance, sorry if the explanation was "basic" but needed to
> write it in my own words so as not to miss any detail.
>
> Thank you very much in advance,
>
> María Eugenia Utgés
>
> CeNDIE-ANLIS
> Buenos Aires
> Argentina
> a
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Resampling 1 time series at another set of (known) irregularly spaced times

2019-01-09 Thread Bert Gunter
John:

Clarification: do you just want an "irregular" subset of your *given* data
values/times, or do you want times placed randomly over the series duration,
for which you will then construct values (which is what Jeff described)?

The former is trivial: see ?sample with the "replace" argument set to FALSE;
you're actually just sampling from the integer vector of time indices here,
so sample.int() would even do. For the latter, I presume you could use
?runif to sample arbitrary times over the time series duration and then
follow Jeff's suggestions to fill in values for these times using methods
to which he referred you.
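A minimal sketch of both options, with a made-up regular series (times t,
values x) and base R only:

```r
set.seed(1)
t <- seq(0, 10, by = 0.1)   # regular times
x <- sin(t)                 # values at those times
## (a) irregular subset of the given observations:
idx <- sort(sample.int(length(t), 25))
sub <- data.frame(t = t[idx], x = x[idx])
## (b) arbitrary new times, values filled in by linear interpolation:
T_new <- sort(runif(25, min(t), max(t)))
x_new <- approx(t, x, xout = T_new)$y
```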

Or have I misunderstood completely?

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Jan 9, 2019 at 3:17 PM Jeff Newmiller 
wrote:

> The key to accomplishing this is to clarify how you want to address
> selecting values between the existing points, but there are many base R
> functions and packages that address this problem. In general the methods
> fall into two categories: interpolation and smoothing. Interpolation
> includes piecewise linear interpolation, splines,
> last-observation-carried-forward, and first-order-extrapolation, all of
> which yield the same values if applied only at the original independent
> values. Smoothing methods such as regression, loess, kriging, and kernel
> interpolation may not have this identity property but you don't need unique
> input values at each independent variable value either.
>
> Read some Task Views, e.g.
>
> https://cran.r-project.org/web/views/NumericalMathematics.html
>
> https://cran.r-project.org/web/views/TimeSeries.html
>
> https://cran.r-project.org/web/views/MissingData.html
>
>
>
> On January 9, 2019 2:55:04 PM PST, John Hillier 
> wrote:
> >Dear All,
> >
> >
> >I would appreciate a quick pointer in the right direction (e.g. www
> >page I could look at, or indicator of which function within a package).
> >
> >
> >The problem: I have a regular time series of values x at times t (i.e.
> >t, x). I would like to sample them at irregular, known times - this is
> >a second time series (T).
> >
> >
> >I can move these data between formats as required (i.e. file, vector,
> >matrix, ts etc )
> >
> >
> >I have been searching around for a while and found many packages to
> >regularise time-series (e.g. xts, lubridate, ...), but not the
> >reverse as I want to.
> >
> >
> >Before you ask, I know it might seem a bit odd, but it is necessary for
> >the particular question I'm asking.
> >
> >
> >Thank you for your time,
> >
> >
> >John
> >
> >
> >-
> >Work days: Mon-Thurs
> >Web page: <http://homepages.lboro.ac.uk/~gyjh5/>
> ><http://www.lboro.ac.uk/departments/geography/staff/john-hillier/>
> >http://www.lboro.ac.uk/departments/geography/staff/john-hillier/
> >Latest research:
> >http://publications.lboro.ac.uk/publications/all/collated/gyjh5.html<
> https://lb-public.lboro.ac.uk/cgi-bin/personcite?username=gyjh5&dobranding=1&hits=10
> >
> >
> >Dr John Hillier
> >Senior Lecturer & NERC Knowledge Exchange Fellow (Insurance Sector)
> >Geography and Environment
> >Loughborough University
> >01509 223727
> >
> >   [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my phone. Please excuse my brevity.
>


Re: [R] Seeking help for using optim for MLE in R

2019-01-10 Thread Bert Gunter
Probably: don't do this.

Use the nnet package (and there may well be others) to fit multinomial
regression. See here for a tutorial:

https://rpubs.com/rslbliss/r_logistic_ws
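As a minimal, hedged sketch of what that looks like (the data frame, its
factor response `y`, and predictors `x1`, `x2` are made up — adjust names to
your data):

```r
library(nnet)  # ships with R

## hypothetical example data: a 3-level factor response and two predictors
d <- data.frame(y  = factor(sample(c("a", "b", "c"), 100, replace = TRUE)),
                x1 = rnorm(100),
                x2 = rnorm(100))

fit <- multinom(y ~ x1 + x2, data = d)  # multinomial logistic regression
summary(fit)  # coefficients for each level relative to the baseline
```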

Cheers,
Bert


On Thu, Jan 10, 2019 at 6:18 AM Naznin Sultana  wrote:

> Hi, I am writing a program for MLE of parameters of multinomial
> distribution
> using optim. But when I run the program outside of a function it gives me
> the likelihood value, but when using it in the optim function it gives the
> error message "Error in X %*% beta : non-conformable arguments".
> If X and beta are non-conformable, how does it give values?
> My data has three dependent variables in the first three columns, and the
> rest of the columns indicate X (the independent variables).
> Please help me out. Here goes my program for k1 categories of multinomial
> distribution:
>
> #data is the data which consists of three dependent varaible in first three
> columns and rest of the columns represent covariates.
>
>
> k1<- length(unique(data[,1]))
> p<- ncol(data)-3
> beta0 <-matrix(-.1,nrow=k1-1,ncol=(p+1)) # starting value
> beta <-as.matrix(beta0)
> beta <-as.matrix(t(beta))
>
>
>
>
> ## likelihood for y1
>
> multin.lik<- function(beta,data) ##beta is a matrix of beta's of order
> ((p+1)*(k-1))
> {
> nr<- nrow(data)
> nc<- ncol(data)
>
> y1<- data[,1]
> y1<- as.matrix(y1,ncol=1)
>
> X<-as.matrix(cbind(1,data[,4:nc])) #matrix of order
> ((n*(p+1)))
> covariates; 1 is added for intercept
>
> LL<- exp(X%*%beta) #LL is of order (n*(k-1))
> L<- as.matrix(cbind(1,LL))  #L is of order (n*k); 1
> is added for ref
> category, L0, L1, L2
> pi<- t(apply(L,1, function(i) i/sum(i)))
>
>
> lgl<- 0
> for (i in 1:nr)
> {
> if (y1[i]==0) {lgl[i]<-
> log(pi[i,1])}
> else if (y1[i]==1) {lgl[i]<-
> log(pi[i,2])}
> else lgl[i]<- log(pi[i,3])
> lgl
> }
> lgL<- sum(lgl)
> return(-lgL)
> }
>
>
> ## parameter estimates
> abc <-optim(beta, multin.lik,data=data,method="SANN",hessian=T)
>


Re: [R] Diff'ing 2 strings

2019-01-10 Thread Bert Gunter
It's the same thing. From ?Rdiff:

"Given two *R* output files, compute differences ignoring headers, footers
and some other differences."
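A rough sketch of using it on two character vectors by way of temporary
files (Rdiff compares files, not in-memory strings, and — if memory serves —
`Log = TRUE` makes it return the differences rather than print them; for
string diffs the diffobj package mentioned elsewhere in this thread may be a
better fit):

```r
f1 <- tempfile(); f2 <- tempfile()
writeLines(c("alpha", "beta", "gamma"), f1)
writeLines(c("alpha", "BETA", "gamma"), f2)

res <- tools::Rdiff(f1, f2, Log = TRUE)  # list with exit status and diff lines
res$out                                  # the differing lines, diff-style
```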

Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jan 10, 2019 at 8:39 AM Sebastien Bihorel <
sebastien.biho...@cognigencorp.com> wrote:

> Yep, I did. Got nothing. It does not come with R 3.4.3, which is the
> version I can use.
>
> R CMD Rdiff comes with this version, but it is a shell command, not an R
> function. It is meant for diff'ing R output.
>
>
> - Original Message -
> From: "Jeff Newmiller" 
> To: r-help@r-project.org, "Sebastien Bihorel" <
> sebastien.biho...@cognigencorp.com>, "Martin Møller Skarbiniks Pedersen" <
> traxpla...@gmail.com>
> Cc: "R mailing list" 
> Sent: Thursday, January 10, 2019 10:49:15 AM
> Subject: Re: [R] Diff'ing 2 strings
>
> Just type
>
> ?Rdiff
>
> it is in the preinstalled packages that come with R.
>
> On January 10, 2019 7:35:42 AM PST, Sebastien Bihorel <
> sebastien.biho...@cognigencorp.com> wrote:
> >From which the diffobj package?
> >
> >
> >From: "Martin Møller Skarbiniks Pedersen" 
> >To: "Sebastien Bihorel" 
> >Cc: "R mailing list" 
> >Sent: Thursday, January 10, 2019 2:35:15 AM
> >Subject: Re: [R] Diff'ing 2 strings
> >
> >
> >
> >On Sat, Jan 5, 2019, 14:58 Sebastien Bihorel < [
> >mailto:sebastien.biho...@cognigencorp.com |
> >sebastien.biho...@cognigencorp.com ] wrote:
> >
> >
> >Hi,
> >
> >Does R include an equivalent of the linux diff command?
> >
> >
> >
> >
> >yes.
> >?rdiff
> >
> >/martin
> >
> >
>
> --
> Sent from my phone. Please excuse my brevity.
>


Re: [R] Reading an excel file

2019-01-10 Thread Bert Gunter
Don't!

Well, I know that being a wiseguy is not helpful, but this "advice" is
actually not entirely unhelpful. Search on "input Excel file" or similar on
rseek.org to bring up many links, including the readxl package, tutorials,
the R data import/export manual, etc. However, Excel files are notoriously
"unstructured," and you would probably be better off converting your data
in tabular form to a .csv or .txt file and reading in from there (using
read.table, read.csv, etc.) . The linked references (and advice from others
with more experience) should be consulted for details.
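For example (hedged — this assumes the readxl package is installed and that
the data sit in a clean rectangular range; the file names are placeholders):

```r
library(readxl)
dat <- read_excel("myfile.xlsx")  # reads the first sheet by default
str(dat)

## or: export the sheet from Excel as CSV, then read it with base R
dat2 <- read.csv("myfile.csv", stringsAsFactors = FALSE)
```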

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jan 10, 2019 at 1:40 PM Bernard Comcast <
mcgarvey.bern...@comcast.net> wrote:

> What is the best way to read in data of any type from an Excel 2016 .xlsx
> file?
>
> Thanks
>
> Bernard
> Sent from my iPhone so please excuse the spelling!"


Re: [R] (no subject)

2019-01-10 Thread Bert Gunter
Please post on R-package-devel, not here. That list is specifically devoted
to such issues. This list is about R programming help.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jan 10, 2019 at 3:06 PM Sam Albers 
wrote:

> Hello all,
>
> I am experiencing some issues with building a package that we are
> hosting on GitHub. The package itself is quite large.  It is a data
> package with a bunch of spatial files stored as .rds files.
>
> The repo is located here: https://github.com/bcgov/bcmaps.rdata
>
> If we clone that package to local machine via:
> git clone https://github.com/bcgov/bcmaps.rdata
>
> The first oddity is that the package installs successfully using this:
>
> $ R CMD INSTALL "./bcmaps.rdata"
>
> But fails when I try to build the package:
>
> $ R CMD build "./bcmaps.rdata"
> * checking for file './bcmaps.rdata/DESCRIPTION' ... OK
> * preparing 'bcmaps.rdata':
> * checking DESCRIPTION meta-information ... OK
> * checking for LF line-endings in source and make files and shell scripts
> * checking for empty or unneeded directories
> * looking to see if a 'data/datalist' file should be added
> Warning in gzfile(file, "rb") :
>   cannot open compressed file 'bcmaps.rdata', probable reason
> 'Permission denied'
> Error in gzfile(file, "rb") : cannot open the connection
> Execution halted
>
>
> The second oddity is that if I remove the . from the Package name in
> the DESCRIPTION file, the build proceeds smoothly:
>
> $ R CMD build "./bcmaps.rdata"
> * checking for file './bcmaps.rdata/DESCRIPTION' ... OK
> * preparing 'bcmapsrdata':
> * checking DESCRIPTION meta-information ... OK
> * checking for LF line-endings in source and make files and shell scripts
> * checking for empty or unneeded directories
> * looking to see if a 'data/datalist' file should be added
> * building 'bcmapsrdata_0.2.0.tar.gz'
>
> I am assuming that R CMD INSTALL builds the package internally, so I
> find it confusing that I am not able to build it myself. Similarly
> confusing: is the . in the package name indicative of anything?
>
> Does anyone have any idea what's going on here? Am I missing something
> obvious?
>
> Thanks in advance,
>
> Sam
>


Re: [R] Fwd: Overlapping legend in a circular dendrogram

2019-01-11 Thread Bert Gunter
This is the 3rd time you've posted this. Please stop re-posting!

Your question is specialized and involved, and you have failed to provide a
reproducible example/data. We are not obliged to respond.

You may do better contacting the maintainer, found by ?maintainer, as
recommended by the posting guide for specialized queries such as this.


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Jan 11, 2019 at 12:47 PM N Meriam  wrote:

> Hi, I'm facing some issues when generating a circular dendrogram.
> The labels on the left, which are my countries, are overlapping with the
> circular dendrogram (middle). The same happens with the labels (regions)
> located on the right.
> I run the following code and I'd like to know what should be changed
> in my code in order to avoid that.
>
> load("hc1.rda")
> library(cluster)
> library(ape)
> library(dendextend)
> library(circlize)
> library(RColorBrewer)
>
> labels = hc1$labels
> n = length(labels)
> dend = as.dendrogram(hc1)
> markcountry=as.data.frame(markcountry1)
> #Country colors
> groupCodes=as.character(as.factor(markcountry[,2]))
> colorCodes=rainbow(length(unique(groupCodes))) #c("blue","red")
> names(colorCodes)=unique(groupCodes)
> labels_colors(dend) <- colorCodes[groupCodes][order.dendrogram(dend)]
>
> #Region colors
> groupCodesR=as.character(as.factor(markcountry[,3]))
> colorCodesR=rainbow(length(unique(groupCodesR))) #c("blue","red")
> names(colorCodesR)=unique(groupCodesR)
>
> circos.par(cell.padding = c(0, 0, 0, 0))
> circos.initialize(factors = "foo", xlim = c(1, n)) # only one sector
> max_height = attr(dend, "height")  # maximum height of the trees
>
> #Region graphics
> circos.trackPlotRegion(ylim = c(0, 1.5), panel.fun = function(x, y) {
>   circos.rect(1:361-0.5, rep(0.5, 361), 1:361-0.1, rep(0.8,361), col =
> colorCodesR[groupCodesR][order.dendrogram(dend)], border = NA)
> }, bg.border = NA)
>
> #labels graphics
> circos.trackPlotRegion(ylim = c(0, 0.5), bg.border = NA,
>panel.fun = function(x, y) {
>
>circos.text(1:361-0.5,
> rep(0.5,361),labels(dend), adj = c(0, 0.5),
>facing = "clockwise", niceFacing =
> TRUE,
>col = labels_colors(dend), cex =
> 0.45)
>
>})
> dend = color_branches(dend, k = 6, col = 1:6)
>
> #Dendrogram graphics
> circos.trackPlotRegion(ylim = c(0, max_height), bg.border = NA,
>track.height = 0.4, panel.fun = function(x, y) {
>  circos.dendrogram(dend, max_height = 0.55)
>})
>
> legend("left",names(colorCodes),col=colorCodes,text.col=colorCodes,bty="n",pch=15,cex=0.8)
>
> legend("right",names(colorCodesR),col=colorCodesR,text.col=colorCodesR,bty="n",pch=15,cex=0.35)
>
> Cheers,
> Myriam
>


Re: [R] randomForest out of bag prediction

2019-01-12 Thread Bert Gunter
Off topic.
But see here:
https://stats.stackexchange.com/questions/61405/random-forest-and-prediction
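In brief (a hedged note, not a substitute for the package docs): with no
newdata, `predict(rf)` returns the out-of-bag predictions, while
`predict(rf, dataX)` treats `dataX` as new data and runs every tree on every
row — including the trees that saw that row during training — so near-perfect
accuracy on the training set is expected, not alarming. A sketch with made-up
data:

```r
library(randomForest)
set.seed(42)
d <- data.frame(y = factor(rep(0:1, each = 100)),
                x = c(rnorm(100), rnorm(100, mean = 1)))
rf <- randomForest(y ~ x, data = d, ntree = 50)

p_oob <- predict(rf)     # out-of-bag predictions
p_fit <- predict(rf, d)  # re-predicts training rows using *all* trees
mean(p_oob == p_fit)     # typically < 1: the two are not the same thing
```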

-- Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Jan 12, 2019 at 9:56 AM Witold E Wolski  wrote:

> Hello,
>
> I am just not sure what the predict.randomForest function is doing...
> I am confused.
>
> I would expect the predictions for these 2 function calls to predict the
> same:
> ```{r}
> diachp.rf <- randomForest(quality~.,data=data,ntree=50, importance=TRUE)
>
> ypred_oob <- predict(diachp.rf)
> dataX <- data %>% select(-quality) # remove response.
> ypred <- predict( diachp.rf, dataX )
>
> ypred_oob == ypred
> ```
> I expected both of these to be out-of-bag predictions, but ypred and
> ypred_oob are actually very different.
>
> > table(ypred_oob , data$quality)
>
> ypred_oob01
> 0 1324  346
> 1  493 2837
> > table(ypred , data$quality)
>
> ypred01
> 0 18170
> 10 3183
>
> What I find even more disturbing is the 100% accuracy for ypred.
> Would you agree that this is rather unexpected?
>
> regards
> Witek
> --
> Witold Eryk Wolski
>


Re: [R] [R-sig-ME] Calculating F values for lme function

2019-01-15 Thread Bert Gunter
Ricardo:
You may do better posting on the r-sig-mixed-models list, which is
specifically devoted to such topics.

FWIW, re calculating F-values for mixed effects models, I think many say:
don't.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Jan 15, 2019 at 8:36 AM Jeff Newmiller 
wrote:

> You should use Reply-All to make sure the discussion continues to include
> the mailing list.
>
> Have you looked at the help for lme?
>
> lme is non-trivial, so it may take some reading. I only have a few of the
> references listed in the help file, and none with me at the moment.
>
> On January 15, 2019 7:36:22 AM PST, RICARDO ALVARADO BARRANTES <
> ricardo.alvar...@ucr.ac.cr> wrote:
> >Thanks for your response; however, my understanding of all this
> >programming is very limited. Is there any source where I can read about
> >the F calculation for those models?
> >
> >Thanks for your time
> >
> >Ricardo
> >
> >El 14-01-2019 16:47, Jeff Newmiller escribió:
> >
> >> Fortunately, nlme is open source [1 [1]][2 [2]], so you can follow
> >along in as much detail as you like.
> >>
> >> Note that capitalization matters in R... NLME is not correct.
> >>
> >> [1]  https://github.com/cran/nlme/blob/master/R/lme.R
> >> [2] https://cran.r-project.org/package=nlme
> >>
> >> On January 14, 2019 1:59:29 PM PST, RICARDO ALVARADO BARRANTES
> > wrote:
> >>
> >>> I have a question related to the function LME in the library NLME.  I
> >>> would like to understand how the F values are calculated, since the
> >>> output only shows the degrees of freedom but doesn't show the sums of
> >>> squares involved in those calculations.
> >>>
> >>> Thanks for your attention.
> >>>
> >>> Ricardo
> >>>
> >>> [[alternative HTML version deleted]]
> >>>
> >>> ___
> >>> r-sig-mixed-mod...@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> >
> >
> >Links:
> >--
> >[1] https://github.com/cran/nlme/blob/master/R/lme.R
> >[2] https://cran.r-project.org/package=nlme
>
> --
> Sent from my phone. Please excuse my brevity.
>


Re: [R] R Companion to Linear Statistical Models by KNNL

2019-01-16 Thread Bert Gunter
See here for relevant comments:

https://stats.stackexchange.com/questions/64406/r-code-for-kutner-et-als-applied-linear-statistical-models


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Jan 16, 2019 at 3:51 PM AbouEl-Makarim Aboueissa <
abouelmakarim1...@gmail.com> wrote:

> Dear All:
>
>
> I am wondering if there is An R Companion to Linear Statistical Models
> *by  *Kutner, Nachtsheim, Neter, and Li. Any help would be appreciated.
>
>
> with many thanks
>
> abou
> __
>
>
> *AbouEl-Makarim Aboueissa, PhD*
>
> *Professor, Statistics and Data Science*
> *Graduate Coordinator*
>
> *Department of Mathematics and Statistics*
> *University of Southern Maine*
>


Re: [R] R: estimating genotyping error rate

2019-01-17 Thread Bert Gunter
"How can I proceed?"

-- By doing your own homework about appropriate methodology and software
instead of asking others to do it for you.

-- and by posting as necessary on the appropriate website, which is most
likely Bioconductor Help, not here.


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jan 17, 2019 at 9:03 AM N Meriam  wrote:

> Hello,
> I have SNP data from genotyping.
> I would like to estimate the error rate between replicated samples using R.
> How can I proceed?
>
> Thanks
> Meriam
>


Re: [R] Kaplan-Meier plot

2019-01-17 Thread Bert Gunter
Have you consulted ?plot.survfit ? There are examples for KM plots there.

Also, obvious question: Have you specfied the censoring properly in your
data and fit?
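One hedged guess at the cause: in recent versions of survival, plot.survfit
only marks censoring times when asked to, via its mark.time argument:

```r
plot(km, mark.time = TRUE)  # draw tick marks at the censoring times
```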


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jan 17, 2019 at 11:39 AM Medic  wrote:

> According to the guidelines (if I'm not mistaken), the code below is
> sufficient (without any specification) to give Kaplan-Meier curves with
> censored data markings on Kaplan-Meier curves. But in my case censored data
> don't appears on the curves?!
>
> library(survival)
> mydata<-read.csv (file="C:/mydata/mydata.csv", header=TRUE, sep=";" )
> # Sic! The separator in my csv file is ";"
>
> dput (mydata, "dputmydata.r")
> #attached
>
> Y <- Surv (mydata$time, mydata$status == 2)
> # 2 -- encodes event
>
> km <- survfit (Y~mydata$stage)
>
> plot (km)


Re: [R] adding a hex sticker to a package

2019-01-21 Thread Bert Gunter
Better posted on r-package-devel list, no?

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Jan 21, 2019 at 9:52 AM Therneau, Terry M., Ph.D. via R-help <
r-help@r-project.org> wrote:

> I've created a hex sticker for survival.  How should that be added to the
> package
> directory?   It's temporarily in man/figures on the github page.
>
> Terry T.
>
> (Actually, the idea was from Ryan Lennon. I liked it, and we found someone
> with actual
> graphical skills to execute it. )
>


Re: [R] Mismatch distribution

2019-01-21 Thread Bert Gunter
"Do not work" does not work (in providing sufficient info). See the Posting
guide  linked below for how to post an intelligible question.

HOWEVER, I suspect you would do better posting on te Bioconductor list
where they are much more likely to know what "fasta" files look like and
might even have software already developed to do what you want. You could
well be trying to reinvent wheels.
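That said, one visible bug is that `Data` is merged into before it exists. A
hedged sketch of the "combine everything" step that avoids merge entirely
(assuming ape/pegas are loaded, the files parse, and the file pattern is
really ".fas"):

```r
files <- list.files(pattern = "\\.fas$")

## pairwise differences from every alignment, pooled into one vector
all_d <- unlist(lapply(files, function(f)
  as.numeric(dist.gene(read.dna(f, "fasta"), method = "pairwise"))))

hist(all_d, prob = TRUE, main = "Mismatch distribution")
lines(density(all_d), col = "blue", lwd = 2)
```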

Cheers,
Bert


On Mon, Jan 21, 2019 at 5:35 PM Myriam Croze 
wrote:

> Hello!
>
> I need your help. I am trying to calculate the pairwise differences between
> sequences from several fasta files.
> I would like for each of my DNA alignments (fasta files), calculate the
> pairwise differences and then:
> - 1. Combine all the data of each file to have one file and one histogram
> (mismatch distribution)
> - 2. calculate the mean for each difference across all the files and again
> make a mismatch distribution plot
>
> Here the script that I wrote:
>
> library("pegas")
> > library("seqinr")
> > library("ggplot2")
> >
> >
>
> > Files <- list.files(pattern="fas")
> > nb_files <- length(Files)
> >
> >
> > for (i in 1:nb_files) {
> > Dist <-  as.numeric(dist.gene(read.dna(Files[i], "fasta"), method
> > = "pairwise",
> >pairwise.deletion = FALSE, variance = FALSE))
> >
> > Data <- merge(Data, Dist, by=c("x"), all=T)
> > }
> >
>
>
> > hist(Data, prob=TRUE)
> > lines(density(Data), col="blue", lwd=2)
> >
>
> However, the script does not work and I do not know what to change to make
> it work.
> Thanks in advance for your help.
>
> Myriam
>
> --
> Myriam Croze, PhD
> Post-doctorante
> Division of EcoScience,
> Ewha Womans University
> Seoul, South Korea
>
> Email: myriam.croz...@gmail.com
>


Re: [R] Tukey Test

2019-01-24 Thread Bert Gunter
In the age of google, Search!

e.g. on "tukey test" at rseek.org

-- Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jan 24, 2019 at 7:51 PM  wrote:

> R-Help
>
>
>
> There is an R library that will perform a Tukey test which prints out the
> Tukey groups (A, B, C, etc) and I don't recall the library. It was
> agriculture or something like that.
>
>
>
> And is there a library that will produce the Tukey, Bonferroni, Scheffe,
> and Dunnett comparison tables?
>
>
>
> Jeff Reichmqn
>
>


Re: [R] duplicates including first occurrence

2019-01-28 Thread Bert Gunter
... Alternatively (but probably less efficiently):

## the indexing logical vector
with(mtcars, wt %in% wt[duplicated(wt)] )
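which you would then use to subset:

```r
idx <- with(mtcars, wt %in% wt[duplicated(wt)])
mtcars[idx, ]  # rows whose wt occurs more than once, first copies included
```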

cheers,
Bert




Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Jan 28, 2019 at 3:53 AM Rui Barradas  wrote:

> Hello,
>
> Simply OR (|) both conditions.
>
> mtcars[duplicated(mtcars$wt) | duplicated(mtcars$wt,fromLast=TRUE),]
> #   mpg cyl  disp  hp drat   wt  qsec vs am gear carb
> #Hornet Sportabout 18.7   8 360.0 175 3.15 3.44 17.02  0  032
> #Duster 36014.3   8 360.0 245 3.21 3.57 15.84  0  034
> #Merc 280  19.2   6 167.6 123 3.92 3.44 18.30  1  044
> #Merc 280C 17.8   6 167.6 123 3.92 3.44 18.90  1  044
> #Maserati Bora 15.0   8 301.0 335 3.54 3.57 14.60  0  158
>
>
> Hope this helps,
>
> Rui Barradas
>
> Às 11:43 de 28/01/2019, Knut Krueger via R-help escreveu:
> > Hi to all
> >
> > I get the results
> >
> > mtcars[duplicated(mtcars$wt,fromLast=TRUE),]
> > Hornet Sportabout 18.7   8 360.0 175 3.15 3.44 17.02  0  032
> > Duster 36014.3   8 360.0 245 3.21 3.57 15.84  0  034
> > Merc 280  19.2   6 167.6 123 3.92 3.44 18.30  1  044
> >
> >
> > mtcars[duplicated(mtcars$wt),]
> >
> > Merc 280  19.2   6 167.6 123 3.92 3.44 18.3  1  044
> > Merc 280C 17.8   6 167.6 123 3.92 3.44 18.9  1  044
> > Maserati Bora 15.0   8 301.0 335 3.54 3.57 14.6  0  158
> >
> >
> > The first occurrence is missing - is there any possibility to get
> >
> > Hornet Sportabout 18.7   8 360.0 175 3.15 3.44 17.02  0  032
> > Merc 280  19.2   6 167.6 123 3.92 3.44 18.30  1  044
> > Merc 280C 17.8   6 167.6 123 3.92 3.44 18.90  1  044
> > Duster 36014.3   8  360 245 3.21 3.57 15.84  0  034
> > Maserati Bora 15.0   8  301 335 3.54 3.57 14.60  0  158
> >
> >
> > Kind regards Knut
> >


Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-28 Thread Bert Gunter
I would say your question is foolish -- you disagree no doubt! -- because
the point of using R (or Octave or C++) is to take advantage of the
packages (= "libraries" in some languages; a library is something different
in R) it (or they) offers to simplify your task. Many of R's libraries are
written in C (or Fortran) and thus **are** fast, as well as having
task-appropriate functionality and UIs.

So I think instead of pursuing this discussion you would do well to search.
I find rseek.org to be especially good for this sort of thing. Searching
there on "demography" brought up what appeared to be many appropriate hits
-- including the "demography" package! -- which you could then examine to
see whether and to what extent they provide the functionality you seek.
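On the speed question raised in the quoted message below: an R loop that updates one individual at a time will be slow for much the same reasons as in Octave, but updating the whole population with vectorized operations is typically fast. A toy sketch, with invented birth and death rates purely for illustration:

```r
# Toy birth-death model; the rates are made up for illustration only.
set.seed(42)
pop <- numeric(10)                    # ages (in days) of the founders
for (day in seq_len(365)) {
  pop  <- pop + 1                     # everyone ages one day, all at once
  born <- sum(runif(length(pop)) < 0.001)  # vectorized birth draw
  pop  <- c(pop, numeric(born))            # newborns enter at age 0
  pop  <- pop[runif(length(pop)) > 1e-5]   # vectorized survival draw
}
length(pop)   # population size after one year
```

No per-person inner loop is needed; each day is a handful of whole-vector operations.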

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Jan 28, 2019 at 4:00 PM Alan Feuerbacher 
wrote:

> On 1/28/2019 4:20 PM, Rolf Turner wrote:
> >
> > On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
> >
> >> Hi,
> >>
> >> I recently learned of the existence of R through a physicist friend
> >> who uses it in his research. I've used Octave for a decade, and C for
> >> 35 years, but would like to learn R. These all have advantages and
> >> disadvantages for certain tasks, but as I'm new to R I hardly know how
> >> to evaluate them. Any suggestions?
> >
> > * C is fast, but with a syntax that is (to my mind) virtually
> >incomprehensible.  (You probably think differently about this.)
>
> I've been doing it long enough that I have little problem with it,
> except for pointers. :-)
>
> > * In C, you essentially have to roll your own for all tasks; in R,
> >practically anything (well ...) that you want to do has already
> >been programmed up.  CRAN is a wonderful resource, and there's more
> >on github.
>  >
> > * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
> >
> > * Why not just bog in and try R out?  It's free, it's readily available,
> >and there are a number of good online tutorials.
>
> I just installed R on my Linux Fedora system, so I'll do that.
>
> I wonder if you'd care to comment on my little project that prompted
> this? As part of another project, I wanted to model population growth
> starting from a handful of starting individuals. This is exponential in
> the long run, of course, but I wanted to see how a few basic parameters
> affected the outcome. Using Octave, I modeled a single person as a
> "cell", which in Octave has a good deal of overhead. The program
> basically looped over the entire population, and updated each person
> according to the parameters, which included random statistical
> variations. So when the total population reached, say 10,000, and an
> update time of 1 day, the program had to execute 10,000 x 365 update
> operations for each year of growth. For large populations, say 100,000,
> the program did not return even after 24 hours of run time.
>
> So I switched to C, and used its "struct" declaration and an array of
> structs to model the population. This allowed the program to complete in
> under a minute as opposed to 24 hours+. So in line with your comments, C
> is far more efficient than Octave.
>
> How do you think R would fare in this simulation?
>
> Alan
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to ref data directly from bls.gov using their api

2019-01-29 Thread Bert Gunter
Please search on "Bureau of Labor Statistics" at rseek.org.  You will find
several packages and other resources there for doing what you want.
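For a direct approach without a wrapper package, a minimal sketch using httr and jsonlite against the public v2 endpoint is below. The series id (civilian unemployment rate) and the omission of an API key are assumptions; registered keys allow larger requests, and the exact shape of the parsed result should be inspected with str().

```r
library(httr)
library(jsonlite)

# build the JSON request body the BLS API expects
payload <- toJSON(list(seriesid  = list("LNS14000000"),
                       startyear = "2017", endyear = "2018"),
                  auto_unbox = TRUE)

# fetch the data anew on every run
resp <- POST("https://api.bls.gov/publicAPI/v2/timeseries/data/",
             body = payload, content_type_json())
dat  <- fromJSON(content(resp, as = "text"))
str(dat$Results)   # inspect the returned structure before plotting
```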

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Jan 29, 2019 at 9:21 AM Evans, Richard K. (GRC-H000) via R-help <
r-help@r-project.org> wrote:

> Hello,
>
> I'd like to generate my own plots of various labor statistics using live
> data available at https://www.bls.gov/bls/api_features.htm
>
> This is 10% an R question and 90 % a bls.gov api query question.  Please
> forgive me for making this request here but I would be truly grateful for
> anyone here on the R mailinglist who can show me how to write a line of R
> code that "fetches" the raw data anew from the bls.gov website every time
> it runs.
>
> Truest Thanks,
> /Rich
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] periodicity

2019-01-30 Thread Bert Gunter
Ummm... ???

A google search on "R function periodicity" immediately brought up the xts
package and others.

and RStudio is **NOT** R. It's an IDE for R (and there are others).
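As a quick sketch of the xts version (after installing the package; the bundled sample_matrix dataset is used here just to have data at hand):

```r
# install.packages("xts")   # if not already installed
library(xts)
data(sample_matrix)                  # daily OHLC data shipped with xts
periodicity(as.xts(sample_matrix))   # reports the frequency of the time index
```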

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Jan 30, 2019 at 11:17 AM Nick Wray via R-help 
wrote:

> I've found references on websites to an R function "periodicity", but
> there's no such built-in function as far as I can see in R studio.  I can't
> find reference to it being part of any package either.  Can anyone help
> with this?
>
> Thanks, Nick Wray
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] periodicity

2019-01-30 Thread Bert Gunter
All:

https://rdrr.io/  and Rdocumentation.org

These seem to be good places for finding info on specific R functions.


-- Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Jan 30, 2019 at 11:27 AM Sarah Goslee 
wrote:

> Hi Nick,
>
> A quick look on rseek.org didn't turn anything up. It would help to
> know what websites you're referring to - they might be loading custom
> code.
>
> Sarah
>
> On Wed, Jan 30, 2019 at 2:17 PM Nick Wray via R-help
>  wrote:
> >
> > I've found references on websites to an R function "periodicity", but
> there's no such built-in function as far as I can see in R studio.  I can't
> find reference to it being part of any package either.  Can anyone help
> with this?
> >
> > Thanks, Nick Wray
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Sarah Goslee (she/her)
> http://www.numberwright.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help converting file to XTS

2019-02-05 Thread Bert Gunter
Hard to say without knowing what dat looks like.

Can you show us a small sample, perhaps via dput( head( dat))  ?
See ?dput, ?head for details.

A guess would be that dat is a data frame and not a character string, but
without seeing the data it is impossible to say.
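If that guess is right, the fix is to build the index from the timestamp column alone rather than from the whole data frame. A sketch -- the column name "Time" is a guess and must be replaced with the real one:

```r
dat <- read.csv("TEL5minint.csv", stringsAsFactors = FALSE)

# convert only the timestamp column, with the format passed to as.POSIXct
idx <- as.POSIXct(dat$Time, format = "%d/%m/%Y %H:%M")

library(xts)
# the remaining (numeric) columns become the xts coredata
x <- xts(dat[, setdiff(names(dat), "Time")], order.by = idx)
```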



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Feb 5, 2019 at 9:24 AM Johannes Møllerhagen 
wrote:

> Hello there! I am a master student working on my master thesis, and I am
> trying to convert some data to xts so I can apply a highfrequency package
> to it.
>
> At the moment I am trying to use a POSIXct function. I am quite new at
> this program and I am having some issues. The file is attached.
>
>
> The current coding is:
>
>
> dat<-read_csv("TEL5minint.csv")
> xts(dat,order.by=as.POSIXct(dat),"%d/%m/%Y %H:%M")
>
>
> And the error is:
>
> Error in as.POSIXct.default(dat) :
>   do not know how to convert 'dat' to class “POSIXct”
>
>
> Any help is appreciated!
>
>
> Kind Regards
>
> Johannes
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problems when merging two data sets

2019-02-05 Thread Bert Gunter
Show us your code, as the posting guide below requests. (Please do read the
posting guide.)
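That said, the "'by' must specify a uniquely valid column" error usually means the two data frames share no column name, so merge() must be told which columns to match on. A sketch -- all column names below are hypothetical and must be replaced with the real ones:

```r
# left join: keep all 7189 occurrence rows; species not on the red list
# get NA in the status column
merged <- merge(occurrences, redlist,
                by.x = "species_name",   # name column in the occurrence table
                by.y = "redlist_name",   # name column in the red-list table
                all.x = TRUE)
```

The differing row counts (7189 vs 426) are not a problem for merge(); only the matching column names matter.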


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Feb 5, 2019 at 10:04 AM sasa kosanic  wrote:

> Dear All,
>
> I would like to merge two data sets however I am doing something wrong...
> 1 data set contains 2 columns of  'species occurrence'(1 column) in Germany
> and  'species names' (2 column).
> and the second one names of 'Red list species'(1 column) and 'species
> status' (2 column).
> so I would like to merge Red list species with species names from the first
> table and to assign the species status.
> I have tried with merge function but got this an error:" 'by' must specify
> a uniquely valid column"
> I also tried with the function left_join, however no success.
>
> Also columns in two data sets are different in size. 1 table has 7189 rows
> and 2 table just 426 rows as we do not have much Red list Species.
>
> I would appreciate your help.
>
> Kind regards,
> Sasha
>
>
> Dr Sasha Kosanic
> Ecology Lab (Biology Department)
> Room M842
> University of Konstanz
> Universitätsstraße 10
> D-78464 Konstanz
> Phone: +49 7531 883321 & +49 (0)175 9172503
>
> http://cms.uni-konstanz.de/vkleunen/
> https://tinyurl.com/y8u5wyoj
> https://tinyurl.com/cgec6tu
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] faster execution of for loop in Fishers test

2019-02-11 Thread Bert Gunter
1. I believe Fisher's exact test is computationally intensive and takes a
lot of time for large structures, so I would say what you see is what you
should expect! (As I'm not an expert on this, confirmation or contradiction
by those who are would be appreciated).

2. Your second question on how to select results based on values in another
vector/column is very basic R. So it appears that you need to spend some
time with an R tutorial or two to learn the basics (unless I have
misinterpreted).
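For completeness, selecting the small p-values from the result matrix is a one-liner with logical indexing -- a sketch using the poster's own result object:

```r
keep <- result < 0.001          # logical matrix, same shape as result
which(keep, arr.ind = TRUE)     # (row, column) positions of small p-values
result[keep]                    # the p-values themselves, as a vector
```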

3. Please do not repost further. No one is obligated to respond to your
posts. Following the posting guide, which you appear to have done,
increases the likelihood, but is of course no guarantee.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Feb 11, 2019 at 5:28 PM Adrian Johnson 
wrote:

> Dear group,
>
> I have two large matrices.
>
> Matrix one: is 24776 x 76 (example toy1 dput object given below)
>
> Matrix two: is 12913 x 76 (example toy2 dput object given below)
>
> Column names of both matrices are identical.
>
> My aim is:
>
> a. Take each row of toy2 and transform vector into UP (>0)  and DN (
> <0 ) categories. (kc)
> b  Test association between kc and every row of toy1.
>
> My code, given below, works but is very slow.
>
> I gave dput objects for toy1, toy2 and result matrix.
>
> Could you suggest/help me how I can make this faster.  Also, how can I
> select values in result column that are less than 0.001 (p < 0.001).
>
> Appreciate your help. Thank you.
> -Adrian
>
> Code:
>
> ===
>
>
>
> result <- matrix(NA,nrow=nrow(toy1),ncol=nrow(toy2))
>
> rownames(result) <- rownames(toy1)
> colnames(result) <- rownames(toy2)
>
> for(i in 1:nrow(toy2)){
> for(j in 1:nrow(toy1)){
> kx = toy2[i,]
> kc <- rep('NC',length(kx))
> kc[ kx >0] <- 'UP'
> kc[ kx <=0 ] <- 'DN'
> xpv <- fisher.test(table(kc,toy1[j,]),simulate.p.value = TRUE)$p.value
> result[j,i] <- xpv
> }
> }
>
>
> ===
>
>
>
> ===
>
>
> > dput(toy1)
> structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, -1, -1, -1,
> -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
> -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
> -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1,
> -1, -1, -1, -1, -1), .Dim = c(10L, 7L), .Dimnames = list(c("ACAP3",
> "ACTRT2", "AGRN", "ANKRD65", "ATAD3A", "ATAD3B", "ATAD3C", "AURKAIP1",
> "B3GALT6", "C1orf159"), c("a", "b", "c", "d", "e", "f", "g")))
>
>
>
> > dput(toy2)
> structure(c(-0.242891119688613, -0.0514058216682132, 0.138447212993773,
> -0.312576648033122, 0.271489918720452, -0.281196468299486,
> -0.0407160143344565,
> -0.328353812845287, 0.151667836674511, 0.408596843743938,
> -0.049351944902924,
> 0.238586287349249, 0.200571558784821, -0.0737604184858411,
> 0.245971526254877,
> 0.24740263959845, -0.161528943131908, 0.197521973013793,
> 0.0402668125708444,
> 0.376323735212088, 0.0731550871764204, 0.385270176969893, 0.28953042756208,
> 0.062587289401188, -0.281187168932979, -0.0202298984561554,
> -0.0848696970309447,
> 0.0349676726358973, -0.520484215644868, -0.481991414222996,
> -0.00698099201388211,
> 0.135503878341873, 0.156983081312087, 0.320223832092661, 0.34582193394074,
> 0.0844455960468667, -0.157825604090972, 0.204758250510969,
> 0.261796072978612,
> -0.19510450641405, 0.43196474472874, -0.211155577453175,
> -0.0921641871215187,
> 0.420950361292263, 0.390261862151936, -0.422273930504427,
> 0.344653684951627,
> 0.0378273248838503, 0.197782027324611, 0.0963124876309569,
> 0.332093167080656,
> 0.128036554821915, -0.41338065859335, -0.409470440033177,
> 0.371490567256253,
> -0.0912549189140141, -0.247451812684234, 0.127741739114639,
> 0.0856254238844557,
> 0.515282940316031, -0.25675759521248, 0.333943163209869, 0.604141413840881,
> 0.0824942299510931, -0.179605710473021, -0.275604207054643,
> -0.113251154591898,
> 0.172897837449258, -0.329808795076691, -0.239255324324506), .Dim = c(10L,
> 7L), .Dimnames = list(c("chr5q23", "chr16q24", "chr8q24", "chr13q11",
> "ch

Re: [R] Package updates fail: how to fix the causes

2019-02-15 Thread Bert Gunter
You *might* do better posting this on r-sig-debian and/or r-sig-fedora,
especially as this is not a question about R programming per se, which
makes it off topic for this list, but more on topic for those lists.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Feb 15, 2019 at 7:58 AM Rich Shepard 
wrote:

> Running R-3.5.2 on Slackware-14.2, using my script that updates installed
> packages found four that failed. My web searches did not find relevant
> hits,
> and only the last build failure is explained by the build failure of a
> specific dependency. The results displayed are:
>
> ERROR: dependency ‘sf’ is not available for package ‘spdep’
> * removing ‘/usr/lib/R/library/spdep’
>
> The downloaded source packages are in
> ‘/tmp/RtmpzEBBCY/downloaded_packages’
> Updating HTML index of packages in '.Library'
> Making 'packages.html' ... done
> Warning messages:
> 1: In install.packages(update[instlib == l, "Package"], l, contriburl =
> contriburl,  :
>installation of package ‘units’ had non-zero exit status
> 2: In install.packages(update[instlib == l, "Package"], l, contriburl =
> contriburl,  :
>installation of package ‘later’ had non-zero exit status
> 3: In install.packages(update[instlib == l, "Package"], l, contriburl =
> contriburl,  :
>installation of package ‘sf’ had non-zero exit status
> 4: In install.packages(update[instlib == l, "Package"], l, contriburl =
> contriburl,  :
>installation of package ‘spdep’ had non-zero exit status
>
> Starting at the top, trying to install 'units' resulted in identifying a
> missing library:
>
> Configuration failed because libudunits2.so was not found. Try installing:
>  * deb: libudunits2-dev (Debian, Ubuntu, ...)
>  * rpm: udunits2-devel (Fedora, EPEL, ...)
>  * brew: udunits (OSX)
>If udunits2 is already installed in a non-standard location, use:
>  --configure-args='--with-udunits2-lib=/usr/local/lib'
>if the library was not found, and/or:
>  --configure-args='--with-udunits2-include=/usr/include/udunits2'
>if the header was not found, replacing paths with appropriate values.
>You can alternatively set UDUNITS2_INCLUDE and UDUNITS2_LIBS manually.
>
> 
> See `config.log' for more details
> ERROR: configuration failed for package ‘units’
> * removing ‘/usr/lib/R/library/units’
>
> The downloaded source packages are in
> ‘/tmp/RtmprbnasH/downloaded_packages’
> Updating HTML index of packages in '.Library'
> Making 'packages.html' ... done
> Warning message:
> In install.packages("units") :
>installation of package ‘units’ had non-zero exit status
>
> Please advise me on how to proceed so these four packages are eventually
> updated or re-installed.
>
> TIA,
>
> Rich
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Taking the Average of a subset of data

2019-02-15 Thread Bert Gunter
Read the posting guide, please, paying particular attention to how to
provide reproducible data, e.g. via ?dput. You are much more likely to get
useful help if you do what it recommends and provide data for people to
work with.

You also should provide code showing us what you tried. You appear not to
have done much homework of your own -- have you gone through some R
tutorials, for example? Which ones?

Cheers,
Bert





Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Feb 15, 2019 at 12:26 PM Isaac Barnhart  wrote:

> Hello all, I have another question. I'm working with the following dataset:
>
>
>
>
>
>
> plot  plant  leaf_number  sen_score  plot_lai  plant_lai  lai_score  leaf_num
> 104 5   1   90  104 1   82  1
> 104 5   2   90  104 1   167 2
> 104 5   3   95  104 1   248 3
> 104 5   4   100 104 1   343 4
> 104 6   1   95  104 1   377 5
> 104 6   2   85  104 1   372 6
> 104 6   3   90  104 1   335 7
> 104 6   4   90  104 1   221 8
> 105 5   1   90  104 1   162 9
> 105 5   2   95  104 2   145 1
> 105 5   3   100 104 2   235 2
> 105 5   4   100 104 2   310 3
> 105 6   1   70  104 2   393 4
> 105 6   2   80  104 2   455 5
> 105 6   3   90  104 2   472 6
> 105 6   4   80  104 2   445 7
> 106 5   1   100 104 2   330 8
> 106 5   2   90  104 2   292 9
> 106 5   3   100 105 1   64  1
> 106 5   4   100 105 1   139 2
> 106 5   10  0   105 1   211 3
> 106 6   1   100 105 1   296 4
> 106 6   2   30  105 1   348 5
> 106 6   3   100 105 1   392 6
> 106 6   4   40  105 1   405 7
> 108 5   1   100 105 1   379 8
> 108 5   2   100 105 1   278 9
> 108 5   3   100 105 2   64  1
> 108 5   4   100 105 2   209 2
>
> (Note: 'plant' and 'leaf' column should be separated. '51' means plant 5,
> leaf 1).
>
>
> This dataset shows two datasets: The left 4 columns are of one
> measurement (leaf senescence), and the right 4 columns are of another (leaf
> area index). I have a large amount of plots, and several plants, more than
> what is listed.
>
>
> I need to sort both datasets (senescence and leaf area index) so that each
> plot has the same number of leaves.
>
>
> This is hard because sometimes plots in the 'senescence' dataset have more
> leaves, and sometimes plots in the 'leaf area index'. Is there a way to
> sort both datasets so that this requirement is met? Like I said, there is
> no way to tell which dataset has the plot with the minimum amount of
> leaves; it can be either one in any case.
>
>
> Any help would be appreciated!
>
>
> Isaac
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Saving and reloading function in a package

2019-02-15 Thread Bert Gunter
Do what we all do and learn how to create packages by reading the "Writing
R Extensions" manual; or spend time with a tutorial that shows how various
packages or IDE's such as the RStudio IDE simplify the process. See also
?package.skeleton and other R functions that can be used to assist you.
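On the specific question of shipping saved .rds objects with a package: files placed under inst/extdata/ in the package source are installed with the package and can be located at run time with system.file(). A sketch -- the package and file names are examples only:

```r
# inside a function of the package; the file lives at
# <pkg source>/inst/extdata/background.rds
bg <- readRDS(system.file("extdata", "background.rds",
                          package = "mypkg", mustWork = TRUE))
# if the saved object was created with splinefun(), it is
# an ordinary function and can be called directly
bg(0.5)
```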

-- Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Feb 15, 2019 at 6:10 PM Dimitrios Stasinopoulos <
stasi...@staff.londonmet.ac.uk> wrote:

>  I would like to put a graphic background to a model diagnostic plot.
> The background is created with plot()/lines() but it takes time.
> My solution was to save the plots as functions using splinefun().
> Those saved function can be put in a .RData file using load()  or  .rds
> using saveRDS().
>
> My question is how I can put those files  in a package and load them
> within a function of the package.
> Any suggestion please?
>
>
> Prof Dimitrios Stasinopoulos
> stasi...@staff.londonmet.ac.uk
>
>
>
>
> --
> London Metropolitan University is a limited company registered in England
> and Wales with registered number 974438 and VAT registered number GB 447
> 2190 51. Our registered office is at 166-220 Holloway Road, London N7 8DB.
> London Metropolitan University is an exempt charity under the Charities
> Act
> 2011. Its registration number with HMRC is X6880.
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove cases with -Inf from a data frame

2019-02-16 Thread Bert Gunter
Sorry, that's

function(x)all(is.finite(x) | is.na(x) )

of course.


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Feb 16, 2019 at 8:25 AM Bert Gunter  wrote:

> Many ways. I assume you know that Inf and -Inf are (special) numeric
> values that can be treated like other numerics. i.e.
>
> > 1 == - Inf
> [1] FALSE
>
> So straightforward indexing (selection) would do it.
> But there is also ?is.infinite and ?is.finite, so
>
> apply(yourdat, 1, function(x)all(is.finite(x)))
>
> would produce the index vector to keep rows with only finite values
> assuming yourdat contains only numeric data. If this is not the case, just
> select the numeric columns to index on, i.e.
>
> apply(yourdat[sapply(yourdat,is.numeric)], 1, function(x)
> all(is.finite(x)))
>
> One possible problem here is handling of NA's:
>
> is.finite(c(-Inf,NA))
> [1] FALSE FALSE
> ... so rows containing NA's but no -Inf's would also get removed. If you
> wish to keep rows with NA's but no -Inf's, then
>
> function(x)(is.finite(x) | is.na(x) )
>
> could be used.
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Feb 16, 2019 at 7:07 AM AbouEl-Makarim Aboueissa <
> abouelmakarim1...@gmail.com> wrote:
>
>> Dear All: good morning
>>
>>
>> I have a log-transformed data frame with some *-Inf* data values.
>>
>> *my question: *how to remove all rows with *-Inf* data value from that
>> data
>> frame?
>>
>>
>> with many thanks
>> abou
>> __
>>
>>
>> *AbouEl-Makarim Aboueissa, PhD*
>>
>> *Professor, Statistics and Data Science*
>> *Graduate Coordinator*
>>
>> *Department of Mathematics and Statistics*
>> *University of Southern Maine*
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove cases with -Inf from a data frame

2019-02-16 Thread Bert Gunter
Many ways. I assume you know that Inf and -Inf are (special) numeric values
that can be treated like other numerics. i.e.

> 1 == - Inf
[1] FALSE

So straightforward indexing (selection) would do it.
But there is also ?is.infinite and ?is.finite, so

apply(yourdat, 1, function(x)all(is.finite(x)))

would produce the index vector to keep rows with only finite values
assuming yourdat contains only numeric data. If this is not the case, just
select the numeric columns to index on, i.e.

apply(yourdat[sapply(yourdat,is.numeric)], 1, function(x) all(is.finite(x)))

One possible problem here is handling of NA's:

is.finite(c(-Inf,NA))
[1] FALSE FALSE
... so rows containing NA's but no -Inf's would also get removed. If you
wish to keep rows with NA's but no -Inf's, then

function(x)(is.finite(x) | is.na(x) )

could be used.
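Putting the pieces together on a tiny made-up data frame (values invented for illustration):

```r
dat  <- data.frame(x = c(1, -Inf, 3, NA), y = c(4, 5, -Inf, 7))
# TRUE for rows where every value is finite or NA
keep <- apply(dat, 1, function(x) all(is.finite(x) | is.na(x)))
dat[keep, ]   # drops rows 2 and 3, keeps the row containing only NA and 7
```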

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Feb 16, 2019 at 7:07 AM AbouEl-Makarim Aboueissa <
abouelmakarim1...@gmail.com> wrote:

> Dear All: good morning
>
>
> I have a log-transformed data frame with some *-Inf* data values.
>
> *my question: *how to remove all rows with *-Inf* data value from that data
> frame?
>
>
> with many thanks
> abou
> __
>
>
> *AbouEl-Makarim Aboueissa, PhD*
>
> *Professor, Statistics and Data Science*
> *Graduate Coordinator*
>
> *Department of Mathematics and Statistics*
> *University of Southern Maine*
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems trying to place a global map with Ncdf data plot

2019-02-17 Thread Bert Gunter
The r-sig-geo list would probably be a better place to post this, as they
specialize in this sort of thing.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Feb 17, 2019 at 3:48 AM rain1290--- via R-help 
wrote:

> Hello there,
>
> I am trying to overlay a global map with ncdf data of precipitation for a
> particular location (using specific coordinates). The file is in ncdf
> format
> (commonly used to store away climate data), and I am currently attempting
> to
> place a global map on plotted precipitation values. However, I am having
> difficulty placing a global map on this plot and am encountering errors. I
> will show you what I have done:
>
> #To create a plot of precipitation data using the following ncdf file - the
> following works fine and shows the distribution of global precipitation
> values (land + water values):
>
> library(ncdf4)
> Can<-"MaxPrecCCCMACanESM2rcp45.nc"
>
>
> >Model<-nc_open(Can)
> >print(Model)
> >attributes(Model$var)
> >$names
> >dat<-ncvar_get(Model, "onedaymax")
> >dat[128,50,1] #View onedaymax for selected latitude, longitude and Year
> >nc_lat<-ncvar_get(Model,attributes(Model$dim)$names[2]) #Retrieve latitude
> >nc_lon<-ncvar_get(Model,attributes(Model$dim)$names[3]) #Retrieve
> longitude
> >print(paste(dim(nc_lat), "latitudes and", dim(nc_lon), "longitudes"))
> >library(maptools)
> >map<-dat[,,5] #Precipitation for all longitudes, latitudes, and Year 5
> >grid<-expand.grid(nc_lon=nc_lon, nc_lat=nc_lat)
> >image(nc_lon,nc_lat,map, ylab="Latitude", xlab="Longitude", main="One-day
> maximum precipitation")
> >levelplot(map~nc_lon*nc_lat, data=grid, at=cutpoints, cuts=11,
> ylab="Latitude", xlab="Longitude", >main="Year 5 one-day maximum
> precipitation (mm/day) for CanESM2 under RCP4.5", pretty=T,
> col.regions=rev(brewer.pal(10, "Spectral")))
>
> #To overlay a global map on the plot that the above code returns. *This is
> where the errors begin:*
>
> >ggplot()+geom_point(aes(x=nc_lon,y=nc_lat,color="onedaymax"),
> size=0.8)+borders("world",
>
> colour="black")+scale_color_viridis(name="onedaymax")+theme_void()+coord_quickmap()
> *Error: Aesthetics must be either length 1 or the same as the data (128):
> x, y, colour*
>
>
> Why doesn't this work? Could it be that I am not including the "time"
> dimension in the ggplot function? The problem, though, is when I try to
> obtain the "time" dimension, like I did for latitude and longitude, I
> receive the following error:
>
> t<-ncvar_get(Model,"time")
> *Error in nc$dim[[idobj$list_index]] :
>   attempt to select more than one element*
>
> If it helps, this is what the variables and dimensions look like in the
> ncdf
> file:
>
> /File MaxPrecCCCMACanESM2rcp45.nc (NC_FORMAT_NETCDF4):
>
> 3 variables (excluding dimension variables):
> double onedaymax[lon,lat,time]  (Contiguous storage)
> units: mm/day
> double fivedaymax[lon,lat,time]  (Contiguous storage)
> units: mm/day
> short Year[time]  (Contiguous storage)
>
> 3 dimensions:
> time  Size:95
> lat  Size:64
> units: degree North
> lon  Size:128
> units: degree East/
>
> Any help would be greatly appreciated
>
> Thanks,
>

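For what it's worth, the ggplot error arises because three objects of different lengths were passed as aesthetics: 128 longitudes, 64 latitudes, and a 128 x 64 matrix. ggplot2 wants a single data frame with one row per grid cell. A hedged sketch with simulated stand-ins for the netCDF objects (in the original post `nc_lon`, `nc_lat`, and `map` come from `ncvar_get()`; `borders("world")` additionally needs the maps package installed):

```r
library(ggplot2)

# Simulated stand-ins for the objects read from the .nc file
nc_lon <- seq(0, 357.1875, length.out = 128)          # 128 longitudes
nc_lat <- seq(-87.5, 87.5, length.out = 64)           # 64 latitudes
map    <- matrix(runif(128 * 64, 0, 50), nrow = 128)  # one year of data

# One row per grid cell; as.vector(map) unrolls column-major,
# which matches expand.grid() varying lon fastest
grid <- expand.grid(lon = nc_lon, lat = nc_lat)
grid$onedaymax <- as.vector(map)

p <- ggplot(grid, aes(x = lon, y = lat, fill = onedaymax)) +
  geom_raster() +
  borders("world", colour = "black") +
  coord_quickmap()
```

Note that 0-360 longitudes from a climate model may still need shifting to the -180..180 convention that `borders("world")` uses.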


Re: [R] Weather station data

2019-02-17 Thread Bert Gunter
Search!

"r-package weather station data"

immediately brought up what looked like several relevant hits.


Bert Gunter



On Sun, Feb 17, 2019 at 7:02 AM Bernard Comcast <
mcgarvey.bern...@comcast.net> wrote:

> Is anyone aware of any R capability to access data at weather stations
> around the globe? An R package perhaps?
>
> Thanks
>
> Bernard
> Sent from my iPhone so please excuse the spelling!
>



Re: [R] Learning to Write R Packages (Libraries) with Documentation

2019-02-17 Thread Bert Gunter
This is off topic for this list. Post to r-package-devel for questions
about writing r packages, package docs, etc. Note especially the use of
namespaces to avoid name clashes.

-- Bert

Bert Gunter



On Sun, Feb 17, 2019 at 11:27 AM Ivo Welch  wrote:

> I would like to put together a set of my collected utility functions and
> share them internally.  (I don't think they are of any broader interest.)
> To do this, I still want to follow good practice.  I am particularly
> confused about writing docs.
>
> * for documentation, how do I refer to '@'-type documentation rather than
> the latex-like format?  I have read descriptions where both are referred to
> as roxygen-type.  I believe that devtools::document() translates the more
> convenient @-type into the latex-like format.
>
> * where do I find current good examples of R functions documented properly
> with the '@' format.   What should be taken from the function itself (name?
> usage?) so as to not repeat myself?
>
> * when I run `document()`, does devtools create a set of documentation
> files that I can also easily import by itself into another R session?  I am
> asking because I want to put a few functions into my .Rprofile, generate
> the documentation, and import it by hand.
>
> * my utility functions currently live in their own environment to avoid
> name conflicts ( such as mywork$read.csv <- cmpfun(function()
> message("specialized")) ).
>
>   - is keeping function collections in environments a good or bad idea in a
> library?
>   - will generating a package automatically compile all the functions, so
> that I should lose the `cmpfun`s ?
>   - to export the functions for others' uses, presumably I should place an
> "#` @export" just before the function.
>
> * is there integration between Rmd and R documentation?  Can/should I use
> Rmd for writing documentation for my functions and have this become
> available through the built-in help system?  Or are the two really
> separate.
>
> /iaw
>
> PS: Yes, I tried to do my homework.  apparently, the R ecosystem has been
> moving fast.  I start reading something, it seems great, but then I find
> out that it does not work.  For example, I tried the "Object Documentation"
> example from Hadley's book from 2015, but I think it is outdated.  (My
> `document()` run seems to want an explicit @name.  Hilary Parker's nice
> tutorial is outdated, too, as are many others.  The popular load.Rd example
> is already in the latex format. etc.)  where should I look for definitive
> documentation for the *current* package writing ecosystem?
>
>

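As a concrete illustration of the '@' (roxygen2) format Ivo asks about — a sketch with a made-up function, not taken from any real package: the `#'` comments sit directly above the function, and `devtools::document()` converts them into the latex-like .Rd format, so you write only the '@' form and never touch .Rd by hand. The name and usage are derived from the function definition itself, so they need not be repeated:

```r
#' Column sums of squares
#'
#' @param x A numeric matrix.
#' @return A numeric vector with one sum of squares per column of \code{x}.
#' @examples
#' colSumSq(matrix(1:10, nrow = 5))
#' @export
colSumSq <- function(x) colSums(x^2)
```

The `@export` tag is indeed what makes the function visible to users of the package; functions without it stay internal to the namespace.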


Re: [R] Learning to Write R Packages (Libraries) with Documentation

2019-02-17 Thread Bert Gunter
Oh, I also would assume that the authoritative, current doc for R package
development is "Writing R Extensions," which of course is part of all R
distros.

-- Bert

Bert Gunter



On Sun, Feb 17, 2019 at 11:35 AM Bert Gunter  wrote:

> This is off topic for this list. Post to r-package-devel for questions
> about writing r packages, package docs, etc. Note especially the use of
> namespaces to avoid name clashes.
>
> -- Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>



Re: [R] R Software

2019-02-18 Thread Bert Gunter
To add to what Rui said, go here:
https://www.r-project.org/

Bert Gunter



On Mon, Feb 18, 2019 at 2:11 PM Rui Barradas  wrote:

> Hello,
>
> I do not speak for the R Foundation, but I believe you may not be aware
> that R is a computer language for statistics, biostatistics, and
> (scientific) graphics.
>
> - R itself does not collect data.
> - Security policies are left to the users.
> - You can program whatever you want, GDPR or ADA compliant or not, since R
> is Turing complete. It is up to the users/developers to comply with the
> law. (I hope they do.)
>
> Regarding this, R is pretty much the same as, for instance, C, C++,
> Fortran, etc. And just like those languages R is used by companies and
> other institutions, government or private, that enforce strong security
> policies.
>
>
> Hope this helps,
>
> Rui Barradas
>
> At 17:32 on 18/02/2019, Evan Lindenberger wrote:
> > Hello,
> >
> > My name is Evan Lindenberger and I work at the Johnson & Wales
> information security office. We received a request for R Software, but I
> have a few questions before we start using R, such as:
> >
> > - What information does R collect?
> > - Does the R Foundation have a written information security policy
> (WISP)?
> > - Is R compliant with GDPR and ADA?
> >
> > If someone could get back to me, that would be greatly appreciated.
> >
> > Thank you.
> >
>



Re: [R] Interaction effects with GAMM

2019-02-19 Thread Bert Gunter
Wrong list. This list is about R programming, not statistical questions on
mixed models. Post on the r-sig-mixed-models list for that.

Bert Gunter



On Tue, Feb 19, 2019 at 10:07 AM Louise Baruël Johansen 
wrote:

> I have a question on how to model interaction terms including smooths in a
> GAMM model (using the mgcv and nlme packages in R).
>
> We have collected longitudinal behavioral and brain imaging data from ~100
> subjects across ~6 time points, and I would like to model main effects of
> age, sex, and brain, as well as two-way (and maybe three-way) interaction
> terms, while correcting for education level and taking random
> effects into account. Is using the ti() setup the way to do this:
>
> M = gamm(behav ~ ti(age) + sex + education + ti(age, by = sex) + brain +
> ti(brain, by = age), random = list(subjectID = ~1+age), data = data)
>
> All help will be appreciated.
>
> Thanks, Louise
>
>
>

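Bert is right that the modelling question belongs on r-sig-mixed-models, but purely as R syntax, the usual mgcv decomposition pairs ti() main-effect smooths with a ti() tensor-product interaction. A hedged sketch on simulated stand-in data (variable names are borrowed from the post; whether this is the right model for the study is a statistical question for the other list):

```r
library(mgcv)

set.seed(1)  # simulated stand-in data, not the poster's real data
n <- 600
d <- data.frame(
  age       = runif(n, 8, 20),
  brain     = rnorm(n),
  sex       = factor(sample(c("F", "M"), n, replace = TRUE)),
  education = rnorm(n),
  subjectID = factor(sample(1:100, n, replace = TRUE))
)
d$behav <- sin(d$age / 3) + 0.5 * d$brain + rnorm(n)

# ti(age) + ti(brain) are the main-effect smooths; ti(age, brain) is then
# the pure age-by-brain interaction. A smooth that varies by a factor such
# as sex would instead use ti(age, by = sex), as in the original post.
m <- gamm(behav ~ ti(age) + ti(brain) + ti(age, brain) + sex + education,
          random = list(subjectID = ~1), data = d)
summary(m$gam)
```

`gamm()` returns both an `$lme` and a `$gam` component; the smooth terms are inspected through the latter.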


Re: [R] character set problem

2019-02-19 Thread Bert Gunter
You might try posting this on r-sig-mac if you don't get resolution here.

-- Bert

Bert Gunter



On Tue, Feb 19, 2019 at 10:55 AM bretschr  wrote:

> Dear R-users,
>
>
> Last week I installed R 3.5.2 on a new MacBook Air.
>
> I got error messages for the wrong locale (character set).
> And simple math proved not to work:
> Upon typing this, I got:
> > 2ˆ2
> Error: unexpected input in "2À"
> >
>
> The character visible as a caret is apparently coded as something very
> different.
>
> Then I changed things according to the FAQ, chapter 7, (switching all
> settings to English),
> and executed the recommended line:
>
> defaults write org.R-project.R force.LANG en_US.UTF-8
>
> The error messages disappeared, but the problem remained.
>
> A fresh install of R 3.5.2 also didn't help:
>
> Here its startup messages, then a line testing the caret:
>
> > R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"
> > Copyright (C) 2018 The R Foundation for Statistical Computing
> > Platform: x86_64-apple-darwin15.6.0 (64-bit)
> >
> > R is free software and comes with ABSOLUTELY NO WARRANTY.
> > You are welcome to redistribute it under certain conditions.
> > Type 'license()' or 'licence()' for distribution details.
> >
> > Natural language support but running in an English locale
> >
> > R is a collaborative project with many contributors.
> > Type 'contributors()' for more information and
> > 'citation()' on how to cite R or R packages in publications.
> >
> > Type 'demo()' for some demos, 'help()' for on-line help, or
> > 'help.start()' for an HTML browser interface to help.
> > Type 'q()' to quit R.
> >
> > [R.app GUI 1.70 (7612) x86_64-apple-darwin15.6.0]
> >
> > [History restored from /Users/fb/.Rapp.history]
> >
> >> 2ˆ2
> > Error: unexpected input in "2À"
> >>
>
>
>
> Does anyone know how to get R (R.app) to interpret a caret as a caret?
>
> Thanks in advance,
>
>
>
> Franklin Bretschneider
> Utrecht University
> Utrecht, The Netherlands
>
>

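The symptom looks consistent with the keyboard layout inserting U+02C6 (ˆ, the "modifier letter circumflex" produced by a dead key) instead of the ASCII caret U+005E that R's parser expects — in which case this would be a macOS keyboard-layout setting rather than an R locale issue. A quick, hedged way to check which character is actually being typed is to paste the suspect character into a string and inspect its code point:

```r
utf8ToInt("^")       # 94  -> true ASCII caret U+005E; R parses this
utf8ToInt("\u02c6")  # 710 -> look-alike U+02C6; R rejects it as input

2^2  # with a genuine ASCII caret this evaluates normally
```

If the pasted character reports 710, the fix lies in the keyboard/input settings, not in R.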

