Re: [R] Resolving installed package updates warning

2018-08-10 Thread Bert Gunter
> What's C++11?

Google it!

-- Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, Aug 10, 2018 at 9:53 AM, Rich Shepard 
wrote:

> On Fri, 10 Aug 2018, MacQueen, Don wrote:
>
>> I would start by trying to install rgdal by itself, rather than as part of
>> a "batch" update. As in
>>
>>  install.packages('rgdal')
>>
>
> Don,
>
>   rgdal was supposed to be installed, but you have a valid point.
>
>> My expectation is that you will see a more complete error message specific
>> to rgdal, which presumably will provide a clue or pointer.
>>
>
>   Boy howdy! This is interesting:
>
> * installing *source* package ‘rgdal’ ...
> ** package ‘rgdal’ successfully unpacked and MD5 sums checked
> configure: R_HOME: /usr/lib/R
> configure: CC: gcc
> configure: CXX: g++
> configure: C++11 support available
> configure: rgdal: 1.3-4
> checking for /usr/bin/svnversion... yes
> configure: svn revision: 766
> checking for gdal-config... /usr/bin/gdal-config
> checking gdal-config usability... yes
> configure: GDAL: 2.3.0
> checking C++11 support for GDAL >= 2.3.0... yes
> checking GDAL version >= 1.11.4... yes
> checking gdal: linking with --libs only... no
> checking gdal: linking with --libs and --dep-libs... no
> In file included from /usr/include/gdal.h:45:0,
>  from gdal_test.cc:1:
> /usr/include/cpl_port.h:187:6: error: #error Must have C++11 or newer.
>  #error Must have C++11 or newer.
>   ^
> In file included from /usr/include/gdal.h:49:0,
>  from gdal_test.cc:1:
> /usr/include/cpl_minixml.h:202:47: error: expected template-name before
> '<' token
>  class CPLXMLTreeCloser: public std::unique_ptr CPLXMLTreeCloserDeleter>
>^
> /usr/include/cpl_minixml.h:202:47: error: expected '{' before '<' token
> /usr/include/cpl_minixml.h:202:47: error: expected unqualified-id before
> '<' token
> In file included from /usr/include/ogr_api.h:45:0,
>  from /usr/include/gdal.h:50,
>  from gdal_test.cc:1:
> /usr/include/ogr_core.h:79:28: error: expected '}' before end of line
> /usr/include/ogr_core.h:79:28: error: expected declaration before end of
> line
> In file included from /usr/include/gdal.h:45:0,
>  from gdal_test.cc:1:
> /usr/include/cpl_port.h:187:6: error: #error Must have C++11 or newer.
>  #error Must have C++11 or newer.
>   ^
> In file included from /usr/include/gdal.h:49:0,
>  from gdal_test.cc:1:
> /usr/include/cpl_minixml.h:202:47: error: expected template-name before
> '<' token
>  class CPLXMLTreeCloser: public std::unique_ptr CPLXMLTreeCloserDeleter>
>^
> /usr/include/cpl_minixml.h:202:47: error: expected '{' before '<' token
> /usr/include/cpl_minixml.h:202:47: error: expected unqualified-id before
> '<' token
> In file included from /usr/include/ogr_api.h:45:0,
>  from /usr/include/gdal.h:50,
>  from gdal_test.cc:1:
> /usr/include/ogr_core.h:79:28: error: expected '}' before end of line
> /usr/include/ogr_core.h:79:28: error: expected declaration before end of
> line
> configure: Install failure: compilation and/or linkage problems.
> configure: error: GDALAllRegister not found in libgdal.
> ERROR: configuration failed for package ‘rgdal’
> * removing ‘/usr/lib/R/library/rgdal’
> * restoring previous ‘/usr/lib/R/library/rgdal’
>
> The downloaded source packages are in
> ‘/tmp/RtmpVGIz3Q/downloaded_packages’
> Updating HTML index of packages in '.Library'
> Making 'packages.html' ... done
> Warning message:
> In install.packages("rgdal") :
>   installation of package ‘rgdal’ had non-zero exit status
>
>   Installed here are:
> gcc-5.5.0-i586-1_slack14.2
> gcc-g++-5.5.0-i586-1_slack14.2
> gcc-gfortran-5.5.0-i586-1_slack14.2
> gcc-gnat-5.5.0-i586-1_slack14.2
> gcc-go-5.5.0-i586-1_slack14.2
> gcc-java-5.5.0-i586-1_slack14.2
> gcc-objc-5.5.0-i586-1_slack14.2
> gccmakedep-1.0.3-noarch-1
>
> and
>
> gdal-2.3.0-i586-1_SBo
>
>   What's C++11?
>
>
> Rich
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] searching for a specific row name in R

2018-08-13 Thread Bert Gunter
These seem to be basic R questions. You should spend time with an R
tutorial or two for this sort of thing. This list is here to help, but you
also need to do homework on your own if you have not already done so.
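
For the follow-up question quoted below (checking a whole list of
identifiers at once), no explicit loop should be needed. A minimal sketch,
assuming the matrix is called x as in the earlier replies; ids is a
hypothetical character vector of the names to look up, and the matrix
contents are made up:

```r
# Toy stand-in for the poster's 1000 x 20 matrix (made-up data)
x <- matrix("", nrow = 1000, ncol = 20)
x[, 2] <- paste0("A", 1:1000)

# Hypothetical identifiers to look up
ids <- c("A501", "A7", "B12")

# TRUE/FALSE for each identifier, no loop
present <- ids %in% x[, 2]
names(present) <- ids
present

# Row positions of the matches (NA where absent)
match(ids, x[, 2])
```

%in% answers "is it there?" for every element of ids in one call, while
match() also says where.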

Cheers,
Bert






Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Aug 13, 2018 at 8:36 PM, Deepa  wrote:

> Hi Don,
>
> When there is a list of identifier names that I want to check, is the only
> way to loop over each entry stored in the list of identifier names, or is
> there any other shortcut?
>
> Many thanks for the response.
>
> On Mon, Aug 13, 2018 at 8:18 PM, MacQueen, Don  wrote:
>
> > Or to return a logical value, i.e., TRUE if the column contains the value,
> > FALSE if it does not:
> >
> >   any( x[,2] == 'A501' )
> >
> > -Don
> > --
> > Don MacQueen
> > Lawrence Livermore National Laboratory
> > 7000 East Ave., L-627
> > Livermore, CA 94550
> > 925-423-1062
> > Lab cell 925-724-7509
> >
> >
> >
> > On 8/13/18, 12:09 AM, "R-help on behalf of Albrecht Kauffmann" <
> > r-help-boun...@r-project.org on behalf of alkau...@fastmail.fm> wrote:
> >
> > Hello Deepa,
> >
> > sum(x[,2] == "A501")
> > or
> > which(x[,2] == "A501")
> > .
> > Best,
> > Albrecht
> >
> >
> > --
> >   Albrecht Kauffmann
> >   alkau...@fastmail.fm
> >
> > Am Mo, 13. Aug 2018, um 07:10, schrieb Deepa Maheshvare:
> > > Hello Everyone,
> > >
> > > I have a 1000 x 20 matrix. The second column of the matrix has the names
> > > of identifiers. How do I check whether a certain identifier is present in
> > > the set of 1000 identifier names in the second column? For
> > > instance, let the names of identifiers be A1,A2,...A1000. I want to
> > > check whether A501 is present. How can this be checked?
> > >
> > > Any help will be highly appreciated.
> > >
> > >


Re: [R] Request for help with R program

2018-08-14 Thread Bert Gunter
R has no "associates". It is open source software with many users and
developers with varying skill levels and interests.

I think you are in over your head ("inexperienced in computer programming")
and should seek local resources at Baylor to help you. This list probably
cannot provide the level of assistance you seek.


Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Aug 13, 2018 at 4:13 PM, Spencer Brackett <
spbracket...@saintjosephhs.com> wrote:

> Good evening,
>
>   I am a high school research student who is partnering with Baylor
> University (TX) on a genomic research project, and I am seeking to use the R
> program to analyze our data, which is from the GDC database. R-3.5.1 is
> currently downloaded onto my Windows PC and I am looking to download the
> CGDS and GAIA packages, and then to subsequently download and analyze the
> genomic data we have extracted via the R packages. I attempted to utilize
> the various help utilities you provide, but I am rather inexperienced with
> computer programming and the R program as a whole. Therefore, I am
> requesting a screen sharing session and/or phone or email correspondence
> with one of your associates so as to achieve my goals. Such would be very
> helpful to my and my mentor's work, and our pending publication.
>
> Many thanks,
>
> Spencer Brackett
>


Re: [R] Changing PDF orientation midstream

2018-08-14 Thread Bert Gunter
1. Probably not. But I'm no pdf expert.

2. This adobe thread may be relevant:

https://forums.adobe.com/thread/1091826

Cheers,
Bert





Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Aug 14, 2018 at 12:46 PM, Stats Student  wrote:

> Hi, I'm wondering whether it is possible to change the orientation of the
> PDF in the middle of the document. In other words, pages 1,2,3 - portrait,
> pages 4,5 - landscape, etc.
>
> This is how I call it -
>
> pdf (file, paper="US") or USr for landscape
>
>
> Thanks!
>


Re: [R] Spline function

2018-08-14 Thread Bert Gunter
If I understand correctly, this is not possible in general.

Suppose that for a bunch of different x's the y's are all constant = 0. What x
would correspond to y = 1?

Or suppose (x,y) pairs trace a sine function over several periods. Then
there is no unique x corresponding to y = .5, say.

Perhaps if you more explicitly specified the nature of your problem (e.g.
is y monotonic in x?) some assistance might be provided.
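
If y does turn out to be strictly monotonic in x over the range of
interest, one possible approach is to solve f(x) = y0 numerically with
uniroot(). A minimal sketch with made-up data; the names f, inv and y0 are
illustrative only, and the "hyman" method is chosen here because it keeps
the interpolant monotone:

```r
# Made-up, strictly increasing data
x <- 1:10
y <- x^2

# Monotone interpolating spline ("hyman" requires monotone y)
f <- splinefun(x, y, method = "hyman")

# Invert numerically: find x in range(x) with f(x) = y0
inv <- function(y0) {
  uniroot(function(xx) f(xx) - y0, interval = range(x), tol = 1e-9)$root
}

x0 <- inv(30)
f(x0)  # should be close to 30
```

uniroot() needs f(x) - y0 to change sign over the interval, so y0 has to
lie within the range of the fitted values. For non-monotonic data there may
be several roots, which is exactly the ambiguity described above.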

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Aug 14, 2018 at 8:48 AM, Tania Morgado Garcia 
wrote:

>  Hello everyone. I'm new to R and I'm using spline functions. With the
> command splinefun (x, y) I get the function of interpolating the values x
> and y. Later, I can evaluate that function for values of x by obtaining the
> respective values of y. The point is that I need the inverse operation,
> with the function, for a value of Y I need to know the value of x. Could
> you please help me?
> A cordial greeting
>


Re: [R] Ordering of facet_wrap() panels

2018-08-15 Thread Bert Gunter
See ?factor.

You can either use ?ordered to create an ordered factor to sort the levels
as you desire or sort them with factor(). e.g.

> f <- factor(letters[3:1])
> f
[1] c b a
Levels: a b c   ## default ordering

> f <- factor(f, levels = letters[3:1])
> f
[1] c b a
Levels: c b a  ## explicit ordering
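
Applied to the facet_wrap() question, the same idea would be to fix the
factor levels before plotting. A sketch with made-up data, assuming ggplot2
is installed; the data frame df and its column bucket are hypothetical
names:

```r
# Made-up data whose rows are not in alphabetical order of 'bucket'
df <- data.frame(bucket = rep(c("mid", "low", "high"), each = 3),
                 x = rep(1:3, 3),
                 y = c(4, 5, 6, 1, 2, 3, 7, 8, 9))

# Default: levels are sorted alphabetically
levels(factor(df$bucket))

# Explicit: state the order wanted for the panels
df$bucket <- factor(df$bucket, levels = c("low", "mid", "high"))
levels(df$bucket)

# Panels then follow the level order (skipped if ggplot2 is absent)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(df, ggplot2::aes(x, y)) +
    ggplot2::geom_point() +
    ggplot2::facet_wrap(~ bucket)
}
```

Setting the levels once in the data keeps the ordering out of the plotting
call, so every plot built from df uses the same panel order.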

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Wed, Aug 15, 2018 at 7:21 AM, Stats Student 
wrote:

> Hi, I am generating multiple charts with facet_wrap() and from what I see,
> R/ggplot sorts the panels by the facet variable. So adding an index to the
> facet variable (1 - bucket, 2 - bucket, etc) does solve the sorting issue
> but it's ugly.
>
> I also read this post which, if I understand correctly, claims that ggplot
> should be using the initial ordering of the data for ordering the charts
> (instead of ordering the data itself).
>
> https://mvuorre.github.io/post/2016/order-ggplot-panel-plots/
>
> Wondering if anyone knows how to direct ggplot to use the initial sorting of
> the data to order the panels.
>
> Thank you.
>


Re: [R] Ordering of facet_wrap() panels

2018-08-15 Thread Bert Gunter
1. Unless there is good reason to keep a reply private, always cc the list.
This allows more brains, possible corrections, etc.

2. Have you read ?factor and ?unique ? Always study the docs carefully.
They are generally terse but complete, especially the base docs, and you
can often find your answers there.

3. Your "solution" may work in this case, but if I understand correctly
what you're after, it won't in general. unique() gives the unique values in
the order they appear, which may not be the order you want:

## want ordering to be "a" < "b" < "c"

> f <- rep(letters[3:1],2)

> factor(f, levels = unique(f))
[1] c b a c b a
Levels: c b a  ## not your desired order

Again, please consult the docs and perhaps a tutorial or two as necessary.

-- Bert



On Wed, Aug 15, 2018 at 8:22 AM, Stats Student 
wrote:

> Many thanks, Bert.
>
> I did -
>
> facet_wrap(~factor(var, levels=unique(var)))
>
> And it seems to be working fine.
> Do you see any issues with this?
>
> I'm fairly new to R so want to make sure I'm not doing something stupid.
>
> Thanks again.
>



Re: [R] exponential day

2018-08-15 Thread Bert Gunter
Please note that R^2 for nonlinear models is nonsense.

Search on "R^2 in nonlinear models" for details, e.g.

http://statisticsbyjim.com/regression/r-squared-invalid-nonlinear-regression/
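
For the fitting question itself (quoted below), a minimal sketch using the
poster's data. It assumes the model y = a*exp(-b*x), takes starting values
from a log-linear fit, and reports the residual standard error rather than
R^2; the object names are illustrative:

```r
# Data from the original post
A <- c(0, 2, 4, 6, 8, 10)
B <- c(1, 0.8, 0.6, 0.4, 0.2, 0.1)

# Log-linear fit: log(y) = log(a) - b*x, used here only for starting values
lin <- lm(log(B) ~ A)
start <- list(a = exp(coef(lin)[[1]]), b = -coef(lin)[[2]])

# Nonlinear least squares on the original scale
fit <- nls(B ~ a * exp(-b * A), start = start)

coef(fit)           # estimates of a and b
summary(fit)$sigma  # residual standard error
```

The two fits weight the errors differently (log scale versus original
scale), so their estimates need not agree; which one is appropriate depends
on how the residuals behave.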

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Aug 15, 2018 at 10:54 AM Francis Boateng <
francis.boat...@versantphysics.com> wrote:

> Thanks Ellison, I will try it.
>
> Francis
>
>
> -Original Message-
> From: S Ellison 
> Sent: Thursday, August 9, 2018 8:12 AM
> To: Francis Boateng ;
> r-help@r-project.org
> Subject: RE: exponential day
>
> > Please, how can I determine parameters from exponential equation
> > Example
> > one:  y = a*exp(-b*x);  how do I determine  a  and  b , as well as
> > R-square from data sets. And also fitting y = a*exp(-b*x) into the
> > data sets Assuming data sets A = (0,2,4,6,8,10) B =
> > (1,0.8,0.6,0.4,0.2,0.1)
>
> For least squares fitting, you could take logs and do a simple linear fit,
> if the residuals are reasonably homoscedastic in the log domain (or if you
> can sort the weighting out properly).
>
> For non-linear least squares, look at ?nlm, ?nls or (if you want to roll
> your own) ?optim
>
> For max likelihood, maybe nlme in the nlme package.
>
> For other ideas, look up 'non-linear fitting with R' on any search engine,
> or check the R Task Views
>
> S Ellison
>
>
>


Re: [R] Converting chr to num

2018-08-18 Thread Bert Gunter
12.6 is not an integer.
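
A minimal sketch of the conversion itself, assuming the values look like
"12.6%"; the vector name pct is made up. The result is a double (numeric),
and an integer would only be possible after rounding:

```r
pct <- c("12.6%", "3%", "100%")

# Strip the percent sign, then convert to numeric (double)
num <- as.numeric(sub("%", "", pct, fixed = TRUE))
num
is.integer(num)

# Only if whole numbers are really wanted: round first
as.integer(round(num))
```

as.numeric() returns NA (with a warning) for any entry that still is not a
valid number after the "%" is removed, so it is worth checking for NAs
afterwards.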

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Aug 18, 2018 at 2:20 PM Jeff Reichman 
wrote:

> R-Help Forum
>
>
>
> How do I convert a chr variable that contains percentages to an integer
>
>
>
> Example 12.6% (chr) to 12.6 (int)
>
>
>
> Jeff
>
>


Re: [R] plotmath and logical operators?

2018-08-20 Thread Bert Gunter
This is clumsy and probably subject to considerable improvement, but does
it work for you:

left <- quote(x >= 3)
right <- quote(y <= 3) ## these can be anything

## the plot:
plot(1)
eval(substitute(mtext(expression(paste(left, " & ",right))), list(left =
left, right = right)))

## Expression evaluation
eval(substitute(with(df,left & right), list(left = left, right = right)))
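
One possible tidier variant, offered only as a sketch: bquote() can splice
the two halves into both the plot label and the logical test. The data
frame df is the one from the original post; label and keep are illustrative
names:

```r
df <- data.frame(x = 1:5, y = 1:5)

left  <- quote(x >= 3)
right <- quote(y <= 3)

# Plotmath label: paste(x >= 3, " & ", y <= 3)
label <- bquote(paste(.(left), " & ", .(right)))

# Logical test: (x >= 3) & (y <= 3), evaluated inside df
keep <- eval(bquote(.(left) & .(right)), df)
keep
```

With these in hand, plot(1); mtext(label) should draw the annotation and
df[keep, ] extracts the matching rows. Like the version above, it still
relies on the condition being supplied as two separate halves.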

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Aug 20, 2018 at 2:00 PM MacQueen, Don via R-help <
r-help@r-project.org> wrote:

> I would like to use plotmath to annotate a plot with an expression that
> includes a logical operator.
>
> ## works well
> tmp <- expression(x >= 3)
> plot(1)
> mtext(tmp)
>
> ## not so well
> tmp <- expression(x >= 3 &  y <= 3)
> plot(1)
> mtext(tmp)
>
> Although the text that's displayed makes sense, it won't be obvious to my
> non-mathematical audience.
>
> I'd appreciate suggestions.
>
>
> I've found a work-around that gets the annotation to look right
>   tmpw <- expression(paste( x >= 3, " & ", y <= 3) )
>   plot(1)
>   mtext(tmpw)
>
>
> But it breaks my original purpose, illustrated by this example:
>
> df <- data.frame(x=1:5, y=1:5)
> tmp <- expression(x >= 3 & y <= 3)
> tmpw <- expression(paste( x >= 3, " & ", y <= 3) )
> with(df, eval(tmp))
> [1] FALSE FALSE  TRUE FALSE FALSE
> with(df, eval(tmpw))
> [1] "FALSE  &  TRUE" "FALSE  &  TRUE" "TRUE  &  TRUE"  "TRUE  &  FALSE"
> "TRUE  &  FALSE"
>
> Thanks
> -Don
>
> --
> Don MacQueen
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
> Lab cell 925-724-7509
>
>
>


Re: [R] plotmath and logical operators?

2018-08-20 Thread Bert Gunter
As I understand it, the problem is:

"A mathematical expression must obey the normal rules of syntax for any R
expression, but it is interpreted according to very different rules than for
normal R expressions."

I believe this means that you cannot do what you wanted to using plotmath.

Cheers,
Bert


On Mon, Aug 20, 2018 at 4:14 PM MacQueen, Don  wrote:

> Thanks Bert!
>
>
>
> It certainly works for the example (and shows a much deeper understanding
> of eval, substitute, etc. than I have). But it doesn't appear to generalize
> very well in the way I need (which of course I didn't think of mentioning
> until after I sent the email -- sorry).
>
>
>
> Suppose subs is any expression that would be valid for the subset argument
> of base::subset, for a given data frame. Then I can extract that subset of
> the data frame by using
>
>    mydf[ with(mydf, eval(subs)), ]
>
> (or similar).
>
>
>
> Then, having plotted some aspect of that subset, I want to annotate the
> plot with the subset specifications.
>
>
>
> I've used this approach to  set up a system that helps me to interactively
> review various subsets of a large set of data. I save the final selected
> subsetting expressions in some sort of data structure, for later use in
> preparing a report using rmarkdown.
>
>
>
> I was hoping to use plotmath to improve the appearance of the annotations
> -- but I now think it's not worth this kind of effort. I think I'm going to
> settle for mtext( as.character(subs) ).
>
>
>
> -Don
>
>
>
> --
>
> Don MacQueen
>
> Lawrence Livermore National Laboratory
>
> 7000 East Ave., L-627
>
> Livermore, CA 94550
>
> 925-423-1062
>
> Lab cell 925-724-7509
>
>
>
>
>
>
>



Re: [R] differing behavior of mean(), median() and sd() with na.rm

2018-08-22 Thread Bert Gunter
Actually, the dissonance is a bit more basic.

After xxx(..., na.rm=TRUE) with all NA's in ... you have numeric(0). So
what you see is actually:

> z <- numeric(0)
> mean(z)
[1] NaN
> median(z)
[1] NA
> sd(z)
[1] NA
> sum(z)
[1] 0
etc.

I imagine that there may be more of these little inconsistencies due to the
organic way R evolved over time. What the conventions should be  can be
purely a matter of personal opinion in the absence of accepted standards.
But I would look to see what accepted standards were, if any, first.

-- Bert


On Wed, Aug 22, 2018 at 7:34 AM Ivan Calandra  wrote:

> Dear useRs,
>
> I have just noticed that when input is only NA with na.rm=TRUE, mean()
> results in NaN, whereas median() and sd() produce NA. Shouldn't it all
> be the same? I think NA makes more sense than NaN in that case.
>
> x <- c(NA, NA, NA)
> mean(x, na.rm=TRUE)
> [1] NaN
> median(x, na.rm=TRUE)
> [1] NA
> sd(x, na.rm=TRUE)
> [1] NA
>
> Thanks for any feedback.
>
> Best,
> Ivan
>
> --
> Dr. Ivan Calandra
> TraCEr, laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
>


Re: [R] differing behavior of mean(), median() and sd() with na.rm

2018-08-22 Thread Bert Gunter
... And FWIW (not much, I agree), note that if z = numeric(0) and sum(z) =
0, then mean(z) = NaN makes sense, as length(z) = 0, so dividing by 0 gives
NaN. So you can see the sorts of issues you may need to consider.
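
To make that concrete, and for anyone who wants NA rather than NaN in the
all-missing case, a small sketch. mean_or_na() is a hypothetical helper,
not a base R function:

```r
z <- numeric(0)

# mean() of an empty vector behaves like sum/length = 0/0
sum(z) / length(z)
is.nan(mean(z))

# Hypothetical helper: NA instead of NaN when nothing is left
mean_or_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else mean(x)
}

mean_or_na(c(NA, NA, NA))
mean_or_na(c(1, NA, 3))
```

Note that is.na() is TRUE for both NA and NaN, while is.nan() is TRUE only
for NaN, so code that tests with is.na() will treat the two results alike.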

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )





Re: [R] lattice barchart() with two variables

2018-08-22 Thread Bert Gunter
No reproducible example (see posting guide below) so minimal help.

Remove the quotes from your formula. Why did you think they should be
there? -- see ?formula.
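
A small base-R sketch of why the quotes matter. The names f_quoted and f_bare are made up for illustration; all.vars() is used only to show what a formula refers to.

```r
# A formula built from quoted strings contains character constants,
# not variable names, so there is nothing to look up in `data`
f_quoted <- 'Year' ~ 'Med'
f_bare   <- Year ~ Med

vars_quoted <- all.vars(f_quoted)  # no variables found
vars_bare   <- all.vars(f_bare)    # the two column names

vars_quoted
vars_bare
```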

Read the relevant portions of ?xyplot carefully (again?). You seemed to
have missed:

"*Primary variables:* The x and y variables should both be numeric in xyplot,
and an attempt is made to coerce them if not. However, if either is a
factor, the levels of that factor are used as axis labels. In the other
four functions documented here [which includes barchart()], **exactly one
of x and y should be numeric, and the other a factor or shingle**. Which of
these will happen is determined by the horizontal argument -- if
horizontal=TRUE, then y will be coerced to be a factor or shingle, otherwise
x. The default value of horizontal is FALSE if x is a factor or shingle,
TRUE otherwise. (The functionality provided by horizontal=FALSE is not
S-compatible.)"

So with the default ... horizontal = FALSE, Med would be treated as a
factor, which I think is precisely the opposite of what you want.

Here is a simple example to indicate how things work:

library(lattice)
y <- runif(5)
x <- factor(letters[1:5])
barchart(y~x)

As for fiddling with the colors and patterns of the bars -- generally a bad
idea, especially fill patterns, btw -- see the "col" argument of
?panel.barchart, which is always where you should look for such info (i.e.
panel.whatever). I don't know whether you can fool with fill patterns* --
it may depend on your graphics device -- but you can google around or see
what trellis.par.get() has available (which can be specified in the
"par.settings" argument list in the call).

* For why fooling with fill patterns is a bad idea, google "moiré patterns".

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Aug 22, 2018 at 8:13 AM Rich Shepard 
wrote:

>I've not before created bar charts, only scatter plots and box plots.
> Checking in Deepayan's book, searching the web, and looking at ?barchart
> has
> not shown me the how to get the results I need.
>
>The dataframe looks like this:
> > head(stage_heights)
>   Year   Med   Max
> 1 1989 91.17 93.32
> 2 1990 91.22 93.43
> 3 1991 91.24 92.89
> 4 1993 91.14 93.02
> 5 1994 93.92 95.74
> 6 1995 94.34 96.85
>
>I want to show Med and Max heights for each Year with each bar having a
> different color (or pattern) and a single x-axis year label.
>
>Trying to follow the example in ?barchart for a single variable
> produced this:
>
> > barchart('Year' ~ 'Med', data=stage_height,
> panel=lattice.getOption('panel.barchart'),
> default.prepanel=lattice.getOption('prepanel.default.barchart'),box.ratio=2)
> Error in eval(substitute(groups), data, environment(formula)) :
>invalid 'envir' argument of type 'closure'
> and no plot was displayed.
>
>I must be missing the obvious and want a pointer to descriptions that
> teach me how to produce bar charts.
>
> Rich
>
>



Re: [R] lattice barchart() with two variables

2018-08-22 Thread Bert Gunter
See inline.

-- Bert



On Wed, Aug 22, 2018 at 9:17 AM Rich Shepard 
wrote:

> On Wed, 22 Aug 2018, Bert Gunter wrote:
>
> > No reproducible example (see posting guide below) so minimal help.
>
> Hi Bert,
>
>I thought the header and six data rows of the dataframe plus the syntax
> of
> the command I used were sufficient. Regardless, here's the dput() output:
>
> structure(list(Year = c(1989L, 1990L, 1991L, 1993L, 1994L, 1995L,
> 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L,
> 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L,
> 2014L, 2015L, 2016L, 2017L, 2018L), Med = c(91.17, 91.22, 91.24,
> 91.14, 93.92, 94.34, 91.32, 91.36, 91.24, 94.33, 94.33, 94, 94.32,
> 94.02, 94.19, 94.05, 94.21, 94.21, 94.32, 94.13, 94.27, 94.34,
> 94.23, 94.25, 94.15, 94.01, 94.09, 94.31, 94.35), Max = c(93.32,
> 93.43, 92.89, 93.02, 95.74, 96.85, 95.86, 94.25, 93.67, 97.42,
> 97.42, 94.99, 96.58, 96.57, 96.32, 95.96, 97.4, 97.28, 96.72,
> 97.43, 95.95, 97.82, 97, 96.6, 96.24, 96.68, 96.96, 96.39, 96.95
> )), class = "data.frame", row.names = c(NA, -29L))
>
>
> > Remove the quotes from your formula. Why did you think they should be
> > there? -- see ?formula.
>
>A prior attempt seemed to suggest the strings needed to be quoted.
>
> > Read the relevant portions of ?xyplot carefully (again?). You seemed to
> > have missed:
>
>I'm trying to create a barchart, not an xyplot.
>

Please see ?xyplot, where you will also see dotplot, barchart, etc.
documented!

>
> > y <- runif(5)
> > x <- factor(letters[1:5])
> > barchart(y~x)
>
>Okay. I see one error in my command that's fixed here:
>
> barchart(stage_heights$Med ~ stage_heights$Year, horizontal=FALSE)
>
> > As for fiddling with the colors and patterns of the bars -- generally a
> > bad idea, especially fill patterns, btw -- see the "col" argument of
> > ?panel.barchart, which is always where you should look for such info
> > (i.e. panel.whatever). I don't know whether you can fool with fill
> > patterns* -- it may depend on your graphics device -- but you can google
> > around or see what trellis.par.get() has available (which can be
> > specified in the "par.settings" argument list in the call).
>
>I need pairs of bars, one each for Med and Max for each year. Color or
> pattern would distinguish the two.
>

?xyplot tells you about the "groups" argument that does exactly this.
Again, please read the relevant sections of ?xyplot carefully.


> > * For why fooling with fill patterns is a bad idea, google "moiré
> > patterns".
>
>I did not think that a solid fill or striped fill would create a moire
> pattern on either a computer screen viewing a .pdf file or on the printed
> page.
>

I agree. But color alone usually is the better classifier and suffices; in
black and white, light gray vs. black would work as well for just two
categories I think.



>
>   Correcting the barchart() command fixed the main issue; getting the
> second set of bars is still eluding me, but I'll continue working on
> fixing this. I'll get the years as the x-axis labels rather than year
> number in sequence from 1 to 29.
>
> Thanks,
>
> Rich
>
>



Re: [R] lattice barchart() with two variables

2018-08-22 Thread Bert Gunter
(I know that you said your post may already be "out of date", but ...)

"Despite additional reading of barchart() examples and help pages I'm still
missing how to get grouping working and use the years in the dataframe as
labels on the x-axis."

But ?barchart says:
"Formally, if groups is specified, then groups along with subscripts is
passed to the panel function, ..."

which, as I already told you, means you should consult ?panel.barchart. The
example therein tells you exactly how the "groups" argument should be
specified and how it works (you can change colors via the "col" argument,
of course). Note that "groups" must be your grouping variable, which means
that you need to reformat your data frame in what is currently referred to
as "tidy" format (aka "long" format as opposed to "wide") -- one variable
per column, one observation per row.  That is:

Year   Value   Summary.Type
1991   91.24   "Med"
1991   92.89   "Max"
... etc.

 groups = Summary.Type, ...
in your call will then do the job.

As an aside, this is a good example of why you should adhere to this format
for data analysis in R.
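
A minimal sketch of the reshaping and the grouped call described above, assuming base R plus lattice. The three rows are the first three from the data posted earlier in the thread. The object name `long` and the grey/black colours are illustrative choices, not from the original.

```r
library(lattice)

# First three rows of the wide data frame posted earlier in the thread
stage_heights <- data.frame(
  Year = c(1989L, 1990L, 1991L),
  Med  = c(91.17, 91.22, 91.24),
  Max  = c(93.32, 93.43, 92.89)
)

# Wide -> long: one row per (Year, Summary.Type) combination
long <- data.frame(
  Year         = rep(stage_heights$Year, times = 2),
  Value        = c(stage_heights$Med, stage_heights$Max),
  Summary.Type = factor(rep(c("Med", "Max"), each = nrow(stage_heights)),
                        levels = c("Med", "Max"))
)

# Year is turned into a factor so its values become the axis labels;
# groups = Summary.Type draws one bar per summary type within each year
p <- barchart(Value ~ factor(Year), data = long, groups = Summary.Type,
              col = c("grey", "black"), xlab = "Year")
print(p)
```

The hand-built `long` data frame could equally come from reshape() or a tidying package; it is spelled out here only to keep the sketch self-contained.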

Cheers,
Bert






Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Aug 22, 2018 at 10:34 AM Rich Shepard 
wrote:

> On Wed, 22 Aug 2018, Rich Shepard wrote:
>
> > Correcting the barchart() command fixed the main issue; getting the
> > second set of bars is still eluding me, but I'll continue working on
> > fixing this. I'll get the years as the x-axis labels rather than year
> > number in sequence from 1 to 29.
>
>Despite additional reading of barchart() examples and help pages I'm
> still
> missing how to get grouping working and use the years in the dataframe as
> labels on the x-axis.
>
>The most recent command version (on the dput output in my previous
> message) is:
>
> med_max <- barchart(stage_heights$Med ~ stage_heights$Year,
> horizontal=FALSE, col = 'black',
>  main = 'Median and Maximum Stage Heights\nUSGS Gauge',
>  ylab = 'Elevation (masl)', xlab = 'Year', groups=TRUE,
>  beside=TRUE, panel = "panel.superbar", prepanel =
> "prepanel.superbar",)
> print(med_max)
>
>I don't think that conditioning into a trellis applies to this barchart
> and I'm not relating the use of scales and labels in a conditioned plot to
> the barchart.
>
>The above command yields an error and I've not found the explanation for
> it:
>
> Error in get(fun, mode = "function", envir = parent.frame()) :
>object 'panel.superbar' of mode 'function' was not found
>
> so I'm definitely not getting the command syntax correct. Help's still
> needed.
>
> Rich
>
>



Re: [R] graphing repeated curves

2018-08-22 Thread Bert Gunter
I do not think this does what the OP wants -- it does not produce
polynomials of the form desired.

John Fox's solution using poly() seems to me to be the right approach, but
I will show what I think is a considerably simpler way to build up the
polynomial expressions just as an example of one way to do this sort of
thing in more general circumstances:

fm <- vector("character",6)
fm[1]<- "mpg ~ hp"
for(i in 2:6)fm[i]<- paste0(fm[i-1]," + I(hp^", i,")")
## yielding:
> fm
[1] "mpg ~ hp"
[2] "mpg ~ hp + I(hp^2)"
[3] "mpg ~ hp + I(hp^2) + I(hp^3)"
[4] "mpg ~ hp + I(hp^2) + I(hp^3) + I(hp^4)"
[5] "mpg ~ hp + I(hp^2) + I(hp^3) + I(hp^4) + I(hp^5)"
[6] "mpg ~ hp + I(hp^2) + I(hp^3) + I(hp^4) + I(hp^5) + I(hp^6)"

Although fm is a character vector, the character strings will be
automatically coerced by lm to formulas (see ?lm), so, e.g.

results <- lapply(fm, lm,data = mtcars)

would yield a list of regressions which could then be summarized, plotted
or whatever (again using lapply). e.g.

> results[[3]]

Call:
FUN(formula = X[[i]], data = ..1)

Coefficients:
(Intercept)           hp      I(hp^2)      I(hp^3)
  4.422e+01   -2.945e-01    9.115e-04   -8.701e-07

One could also choose to do the plotting or whatever within the lapply
call, but I prefer to keep things simple if possible.
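
As a sketch of the "plotted or whatever" step, the fitted curves can be overlaid on the data by predicting over a grid of hp values. The grid (hp_grid) and the colour choice are assumptions for illustration. Raw powers of hp are badly scaled, so the higher-degree fits may produce rank-deficiency warnings.

```r
# Formula strings built as in the message above
fm <- vector("character", 6)
fm[1] <- "mpg ~ hp"
for (i in 2:6) fm[i] <- paste0(fm[i - 1], " + I(hp^", i, ")")

# One fitted model per formula string
results <- lapply(fm, lm, data = mtcars)

# Evenly spaced hp values over the observed range (illustrative grid)
hp_grid <- data.frame(hp = seq(min(mtcars$hp), max(mtcars$hp),
                               length.out = 200))

# Scatterplot of the data, then one fitted curve per polynomial degree
plot(mpg ~ hp, data = mtcars)
for (i in seq_along(results)) {
  lines(hp_grid$hp, predict(results[[i]], newdata = hp_grid), col = i)
}
```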

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Aug 22, 2018 at 4:43 PM Jim Lemon  wrote:

> Hi Richard,
> This may be what you want:
>
> data(mtcars)
> m<-list()
> for(i in 1:6) {
>  rhterms<-paste(paste0("I(hp^",1:i,")"),sep="+")
>  lmexp<-paste0("lm(mpg~",rhterms,",mtcars)")
>  cat(lmexp,"\n")
>  m[[i]]<-eval(parse(text=lmexp))
> }
> plot(mpg~hp,mtcars,type="n")
> for(i in 1:6) abline(m[[i]],col=i)
>
> Jim
>
>
> On Thu, Aug 23, 2018 at 9:07 AM, Richard Sherman 
> wrote:
> > Hi all,
> >
> > I have a simple graphing question that is not really a graphing
> question, but a question about repeating a task.
> >
> > I’m fiddling with some of McElreath’s Statistical Rethinking, and
> there’s a graph illustrating extreme overfitting (a number of polynomial
> terms in x equal to the number of observations), a subject I know well
> having taught it to grad students for many years.
> >
> > The plot I want to reproduce has, in effect:
> >
> > m1 <- lm( y ~ x)
> > m2 <- lm( y ~ x + x^2)
> >
> > …etc., through lm( y ~ x + x^2 + x^3 + x^4 + x^5 + x^6 ), followed by
> some plot() or lines() or ggplot2() call to render the data and fitted
> curves.
> >
> > Obviously I don’t want to run such regressions for any real purpose, but
> I think it might be useful to learn how to do such a thing in R without
> writing down each lm() call individually. It’s not obvious where I’d want
> to apply this, but I like learning how to repeat things in a compact way.
> >
> > So, something like:
> >
> > data( mtcars )
> > d <- mtcars
> > v <- c( 1 , 2 , 3 , 4 , 5 , 6  )
> > m1 <- lm( mpg ~ hp  , data = d )
> >
> > and then somehow use for() with an index or some flavor of apply() with
> the vector v to repeat this process yielding
> >
> > m2 <- lm( mpg ~ hp + I( hp ^2 ) , data=d)
> > m3 <- lm( mpg ~ hp + I( hp^2 ) + I(hp^3) , data=d )
> >
> > … and the rest through m6 <- lm( mpg ~ hp + I(hp^2) + I(hp^3) + I(hp^4)
> + I(hp^5) + I(hp^6) , data=d )
> >
> > But finding a way to index these values including not just each value
> but each value+1 , then value+1 and value+2, and so on escapes me.
> Obviously I don’t want to include index values below zero.
> >
> > ===
> > Richard Sherman
> > rss@gmail.com
> >
>



Re: [R] Unclear about the output from summary of ca.jo from package urca

2018-08-23 Thread Bert Gunter
This is about statistics , not R programming, and so is off topic here.
Your first port of call for this sort of thing should be the package docs,
**including any references** . There are references given. Have you studied
them??

Cheers,

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Aug 23, 2018 at 2:12 AM Ashim Kapoor  wrote:

> Dear All,
>
> I am not sure about the summary of the function ca.jo. I have posted my
> query here :-
>
>
> https://stats.stackexchange.com/questions/363188/interpreting-the-names-used-in-the-output-of-johansen-test-in-package-urca-in-r
>
> I did not receive any reply so I am posting my query here.
>
> Many thanks and best regards,
> Ashim
>
>



Re: [R] Multiple counters in a single for loop

2018-08-24 Thread Bert Gunter
Sort of, but you typically wouldn't need to in R because of vectorization,
which buries the iteration in the underlying C code. Here's an example that
may clarify what I mean:

x <- cbind(1:5,6:10)
x ## a 2 column matrix
## get squares of all elements of x
## method 1
m1 <-x^2

##method 2: square the column vectors
m2 <- x
for (i in 1:2)m2[,i] <- m2[,i]^2
identical(m1,m2)
## of course, one could do this by row vectors, too

## method 3: loop through each element
m3 <- x
ix <- as.matrix(expand.grid(1:5,1:2))
ix
m3[ix]^2 ## matrix indexing of an array. This produces a vector,though.

Note also that there is an "iterators" package in R which implements
Python-like iterators. I don't know how efficient it is, however.

My overall advice would be that you should try to program in R's native
paradigms, which emphasize whole object manipulation through vectorization,
rather than trying to use Python's, especially if efficiency is a
consideration. Feel free to ignore of course.
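
For completeness, a sketch of the closest base-R equivalents of the Python idiom in the question, should an explicit loop still be wanted. The vectors x and y are made up for illustration. Note that R indices start at 1, whereas range(0, len(x)) starts at 0.

```r
x <- c("a", "b", "c")

# Index and value together, the usual stand-in for zip(x, range(0, len(x)))
for (i in seq_along(x)) {
  cat(i, x[i], "\n")
}

# Two parallel vectors walked in step, analogous to zip(x, y);
# mapply() applies the function to corresponding elements
y <- c(10, 20, 30)
pairs <- mapply(function(xi, yi) paste(xi, yi), x, y, USE.NAMES = FALSE)
pairs
```

In this particular case paste(x, y) alone gives the same result, which is the vectorization point made above.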

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Aug 24, 2018 at 6:44 AM Deepa  wrote:

> Hello,
>
> Is there an option to include multiple counters in a single for loop in R?
>
> For instance, in python there is
>
> for i,j in zip(x,range(0,len(x))):
>
>
> Any suggestions?
>
>



Re: [R] Need some help with data-wrangling in R

2018-08-24 Thread Bert Gunter
" list of network blocks till 3rd octet:"

This is incomprehensible to me. If that is so for others, also, I suggest
that you provide a reproducible example (see posting guide) to explain what
you mean.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Aug 24, 2018 at 9:41 AM Amit Govil  wrote:

> Hi,
>
> I have log data in which one of the columns have IP ranges and the next
> column is corresponding ports. Eg:
>
>
> IPRange Port
> 10.78.64.0-10.78.66.255 D, A, C
>
> I need to expand the IPRange column into a list of network blocks till 3rd
> octet:
>
> IPRange IP Port
> 192.100.176.0-192.100.179.255 192.100.176.0/24 A, B, C
> 192.100.176.0-192.100.179.255 192.100.177.0/24 A, B, C
> 192.100.176.0-192.100.179.255 192.100.178.0/24 A, B, C
> 192.100.176.0-192.100.179.255 192.100.179.0/24 A, B, C
>
> How do I do this data transformation in R?
>
> Please assist.
>
> Thanks
> Amit
>
>



Re: [R] lattice barchart() with two variables

2018-08-24 Thread Bert Gunter
For the legend, you can use the full "key" argument for more control. The
docs in ?xyplot for "key" should answer your questions. "col" controls
text color within the "text" component and rectangle color within the
"rectangles" component, for example. I think this should work as an
alternative to specifying the par.settings components, but I haven't
checked.

For the scales, again, the docs provide the answer: the "at" and "labels"
components of the "x" component of the "scales" list can explicitly control
the x-axis labels, e.g.

scales = list(x = list(at = ..., labels = ...)), etc.
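
A minimal sketch of how these two pieces might fit together, assuming lattice and a cut-down long-format stand-in for the stage_heights data (first four years only). The names bar_cols and yrs are illustrative. Converting year to a factor is one way to get one bar position per year, with "at" and "labels" then set explicitly.

```r
library(lattice)

# Cut-down stand-in for the long-format data posted in the thread
stage_heights <- data.frame(
  year  = rep(c(1989L, 1990L, 1991L, 1993L), each = 2),
  value = c(91.17, 93.32, 91.22, 93.43, 91.24, 92.89, 91.14, 93.02),
  type  = factor(rep(c("Med", "Max"), times = 4), levels = c("Med", "Max"))
)

bar_cols <- c("grey", "black")                 # one colour per group
yrs <- levels(factor(stage_heights$year))      # labels for the x-axis

med_max <- barchart(
  value ~ factor(year), data = stage_heights, groups = type,
  horizontal = FALSE, col = bar_cols,
  # Explicit legend: black text, rectangles matching the bar colours
  key = list(space = "right",
             text = list(c("Median", "Maximum"), col = "black"),
             rectangles = list(col = bar_cols)),
  # One tick per year, labelled with the year itself, rotated 90 degrees
  scales = list(x = list(at = seq_along(yrs), labels = yrs, rot = 90)),
  main = "Median and Maximum Stage Heights",
  ylab = "Elevation (masl)", xlab = "Year"
)
print(med_max)
```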

If you are uncomfortable with the R lattice help docs, and you intend to
continue to use lattice plots (a good idea; ggplot is an alternative of
course), Deepayan has written a book that you might wish to get:

http://lmdvr.r-forge.r-project.org/figures/figures.html

There are also numerous web tutorials.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Aug 24, 2018 at 12:38 PM Rich Shepard 
wrote:

> On Wed, 22 Aug 2018, Rich Shepard wrote:
>
> >  More when I have results.
>
>Almost there. I've read the auto.key section in ?barchart and looked at
> examples from stackoverflow on the web without seeing my syntax errors. I
> would like help on two issues:
>
>   1. What I want is to have the legend text in black and the colored
> rectangles match the black and grey of the bars. Instead, I get the legend
> text colored and have no idea where the default colors in the rectangles
> got there.
>
>2. I've not found how to have the years (rather than the sequence of
> years) as the x-axis labels.
>
>Here are the dput() output and the script:
>
> structure(list(year = c(1989L, 1989L, 1990L, 1990L, 1991L, 1991L,
> 1993L, 1993L, 1994L, 1994L, 1995L, 1995L, 1996L, 1996L, 1997L,
> 1997L, 1998L, 1998L, 1999L, 1999L, 2000L, 2000L, 2001L, 2001L,
> 2002L, 2002L, 2003L, 2003L, 2004L, 2004L, 2005L, 2005L, 2006L,
> 2006L, 2007L, 2007L, 2008L, 2008L, 2009L, 2009L, 2010L, 2010L,
> 2011L, 2011L, 2012L, 2012L, 2013L, 2013L, 2014L, 2014L, 2015L,
> 2015L, 2016L, 2016L, 2017L, 2017L, 2018L, 2018L), value = c(91.17,
> 93.32, 91.22, 93.43, 91.24, 92.89, 91.14, 93.02, 93.92, 95.74,
> 94.34, 96.85, 91.32, 95.86, 91.36, 94.25, 91.24, 93.67, 94.33,
> 97.42, 94.33, 97.42, 94, 94.99, 94.32, 96.58, 94.02, 96.57, 94.19,
> 96.32, 94.05, 95.96, 94.21, 97.4, 94.21, 97.28, 94.32, 96.72,
> 94.13, 97.43, 94.27, 95.95, 94.34, 97.82, 94.23, 97, 94.25, 96.6,
> 94.15, 96.24, 94.01, 96.68, 94.09, 96.96, 94.31, 96.39, 94.35,
> 96.95), type = structure(c(2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
> 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
> 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
> 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
> 1L), .Label = c("Max", "Med"), class = "factor")), class = "data.frame",
> row.names = c(NA,
> -58L))
>
> med_max <- barchart(value ~ year, data=stage_heights,
>                     panel = lattice.getOption("panel.barchart"),
>                     default.prepanel = lattice.getOption("prepanel.default.barchart"),
>                     box.ratio = 2, horizontal=FALSE,
>                     auto.key=list(space='right', col=c('black', 'grey')),
>                     groups=factor(type,labels=c('Median','Maximum')),
>                     beside=TRUE,
>                     col = c('grey','black'),
>                     labels=list(c(1989,1990,1991,1992, 1993,1994,
>                                   1995,1996,1997,1998,1999,2000,2001,
>                                   2002,2003,2004,2005,2006,2007,2008,
>                                   2009,2010,2011,2012,2013,2014,2015,
>                                   2016,2017,2018),
>                     scales=list(x=list(rot=90)),
>                     main = 'Median and Maximum Stage Heights',
>                     ylab = 'Elevation (masl)', xlab = 'Year')
> print(med_max)
>
> Rich
>
>



Re: [R] Help with DNA Methylation Analysis

2018-08-26 Thread Bert Gunter
You should probably post this on the Bioconductor list rather than here, as
you would more likely find the expertise you seek there. You are using
Bioconductor packages after all.

https://support.bioconductor.org/

Cheers,
Bert


On Sun, Aug 26, 2018 at 2:09 PM Spencer Brackett <
spbracket...@saintjosephhs.com> wrote:

> Good evening,
>
>   I am attempting to run the following analysis on TCGA data, however
> something is being reported as an error in my arguments... any ideas as to
> what is incorrect in the following? Thanks!
>
> library(TCGAbiolinks)
>
> # Download the DNA methylation data: HumanMethylation450 LGG and GBM.
> path <- "."
>
> query.met <- TCGAquery(tumor = c("LGG","GBM"), "HumanMethylation450",
>                        level = 3)
> TCGAdownload(query.met, path = path)
> met <- TCGAprepare(query = query.met, dir = path,
>                    add.subtype = TRUE, add.clinical = TRUE,
>                    summarizedExperiment = TRUE,
>                    save = TRUE, filename = "lgg_gbm_met.rda")
>
> # Download the expression data: IlluminaHiSeq_RNASeqV2 LGG and GBM.
> query.exp <- TCGAquery(tumor = c("lgg","gbm"),
>                        platform = "IlluminaHiSeq_RNASeqV2", level = 3)
>
> TCGAdownload(query.exp, path = path, type = "rsem.genes.normalized_results")
>
> exp <- TCGAprepare(query = query.exp, dir = path,
>                    summarizedExperiment = TRUE,
>                    add.subtype = TRUE, add.clinical = TRUE,
>                    type = "rsem.genes.normalized_results",
>                    save = T, filename = "lgg_gbm_exp.rda")
>
> To download data on DNA methylation and gene expression…
>
> library(summarizedExperiment)
> # get expression matrix
> data <- assay(exp)
>
> # get sample information
> sample.info <- colData(exp)
>
> # get genes information
> genes.info <- rowRanges(exp)
>
> Following stepwise procedure for obtaining GBM and LGG clinical data…
>
> # get clinical patient data for GBM samples
> gbm_clin <- TCGAquery_clinic("gbm","clinical_patient")
>
> # get clinical patient data for LGG samples
> lgg_clin <- TCGAquery_clinic("lgg","clinical_patient")
>
> # Bind the results, as the columns might not be the same,
> # we will plyr rbind.fill , to have all columns from both files
> clinical <- plyr::rbind.fill(gbm_clin ,lgg_clin)
>
> # Other clinical files can be downloaded,
> # Use ?TCGAquery_clinic for more information
> clin_radiation <- TCGAquery_clinic("lgg","clinical_radiation")
>
> # Also, you can get clinical information from different tumor types.
> # For example sample 1 is GBM, sample 2 and 3 are TGCT
> data <- TCGAquery_clinic(clinical_data_type = "clinical_patient",
>                          samples = c("TCGA-06-5416-01A-01D-1481-05",
>                                      "TCGA-2G-AAEW-01A-11D-A42Z-05",
>                                      "TCGA-2G-AAEX-01A-11D-A42Z-05"))
>
>
> # Searching idat file for DNA methylation
> query <- GDCquery(project = "TCGA-GBM",
>  data.category = "Raw microarray data",
>  data.type = "Raw intensities",
>  experimental.strategy = "Methylation array",
>  legacy = TRUE,
>  file.type = ".idat",
>  platform = "Illumina Human Methylation 450")
>
> **Repeat for LGG**
>
> To access mutational information concerning TMZ methylation…
>
> > mutation <- TCGAquery_maf(tumor = "lgg")
> Getting maf tables
> Source: https://wiki.nci.nih.gov/display/TCGA/TCGA+MAF+Files
> We found these maf files below:
>   MAF.File.Name
> 2 hgsc.bcm.edu_LGG.IlluminaGA_DNASeq.1.somatic.maf
> 3 LGG_FINAL_ANALYSIS.aggregated.capture.tcga.uuid.curated.somatic.maf
>
>   Archive.Name                                                Deploy.Date
> 2 hgsc.bcm.edu_LGG.IlluminaGA_DNASeq_automated.Level_2.1.0.0  10-DEC-13
> 3 broad.mit.edu_LGG.IlluminaGA_DNASeq_curated.Level_2.1.3.0   24-DEC-14
>
> Please, select the line that you want to download: 3
>
> **Repeat this for GBM***
>
> Selecting specified lines to download…
>
> gbm.subtypes <- TCGAquery_subtype(tumor = "gbm")
> lgg.subtypes <- TCGAquery_subtype(tumor = "lgg")
>
>
>
> Downloading data via the Bioconductor package RTCGAtoolbox…
>
> library(RTCGAToolbox)
>
> # Get the last run dates
> lastRunDate <- getFirehoseRunningDates()[1]
> lastAnalyseDate <- getFirehoseAnalyzeDates(1)
>
> # get DNA methylation data, RNAseq2 and clinical data for LGG
> lgg.data <- getFirehoseData(dataset = "LGG",
>   gistic2_Date = getFirehoseAnalyzeDates(1), runDate = lastRunDate,
>   Methylation = TRUE, RNAseq2_Gene_Norm = TRUE, Clinic = TRUE,
>   Mutation = T,
>   fileSizeLimit = 1)
>
> # get DNA methylation data, RNAseq2 and clinical data for GBM
> gbm.data <- getFirehoseData(dataset = "GBM",
>   runDate = lastDate, gistic2_Date = getFirehoseAnalyzeDates(1),
>   Methylation = TRUE, Clinic = TRUE, RNAseq2_Gene_Norm = TRUE,
>   fileSi

Re: [R] Warning: unable to access index for repository...

2018-08-27 Thread Bert Gunter
The main CRAN repository is at:
https://cran.r-project.org/

A full list of repositories can be found under the "Mirrors" link there.
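
A minimal sketch, assuming the aim is to see which repository the session is configured to use and to point it at the main CRAN site instead of the snapshot URL in the warning. Whether a given package is available there for an older R version is not guaranteed. The install call is left commented out because it needs network access.

```r
# Show which repositories this session is currently configured to use
getOption("repos")

# Point the session at the main CRAN site for subsequent installs
options(repos = c(CRAN = "https://cran.r-project.org"))

# install.packages() would now look up packages at that host only
# install.packages("tidyverse")
```

With a single repository configured, the firewall opening only needs to cover that host (or whichever mirror is chosen).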

Cheers,
Bert



On Mon, Aug 27, 2018 at 1:19 PM Tully Holmes  wrote:

> Good afternoon,
>
> I'm trying to install a package with the "install.packages" command in
> RGUI, and get the following error message:
>
>
> > install.packages ("tidyverse")
> Warning: unable to access index for repository
> https://mran.microsoft.com/snapshot/2017-05-01/src/contrib:
>   cannot open URL ''
> Warning: unable to access index for repository
> https://mran.microsoft.com/snapshot/2017-05-01/bin/windows/contrib/3.3:
>   cannot open URL '
>
> https://mran.microsoft.com/snapshot/2017-05-01/bin/windows/contrib/3.3/PACKAGES
> '
> Warning message:
> package ‘tidyverse’ is not available (for R version 3.3.3)
>
>
> Is there any documentation available that might document all the
> destination IP addresses that are involved when the "install.packages"
> command is ran?  Our R server is in a part of the state network that has
> limited access to the internet, so I need to have outbound openings made in
> the firewall that will let the install.packages command access what it
> needs to access .  Earlier I have coordinated with the firewall team to
> open ports 80 and 443 outbound from our R server to the 3 IP addresses that
> correspond to mran.microsoft.com, cran.microsoft.com, and
> www.stats.ox.ac.uk.
> I looked up these 3 IP addresses at network-tools.com.  I think there
> might
> be additional IP addresses involved, because after these openings were
> made, I'm still getting the error message above.
>
> Thanks,
>
> --
>
> Tully Holmes
>
> Business Applications Analyst
>
> State of Wyoming
>
> Wyoming Community College Commission
>
> 2300 Capitol Ave., 5th Floor, Suite B
>
> Cheyenne, WY 82002
>
> 307-777-6832
>
> --
>
> E-Mail to and from me, in connection with the transaction
> of public
> business, is subject to the Wyoming Public Records
> Act and may be
> disclosed to third parties.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] r-data partitioning considering two variables (character and numeric)

2018-08-27 Thread Bert Gunter
Just partition the unique stand_ID's and select on them using %in% , say:

id <- unique(dataGenotype$stand_ID)
tst <- sample(id, floor(length(id)/2))
wh <- dataGenotype$stand_ID %in% tst ## logical vector
test<- dataGenotype[wh,]
train <- dataGenotype[!wh,]

There are a million variations on this theme I'm sure.

-- Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Aug 27, 2018 at 3:54 PM Ahmed Attia  wrote:

> I would like to partition the following dataset (dataGenotype) based
> on two variables; Genotype and stand_ID, for example, for Genotype
> H13: stand_ID number 7 may go to training and stand_ID number 18 and
> 21 may go to testing.
>
> Genotype  stand_ID  Inventory_date  stemC       mheight
> H13       7         5/18/2006       1940.1075   11.33995
> H13       7         11/1/2008       10898.9597  23.20395
> H13       7         4/14/2009       12830.1284  23.77395
> H13       18        11/3/2005       2726.42     13.4432
> H13       18        6/30/2008       12226.1554  24.091967
> H13       18        4/14/2009       14141.68    25.0922
> H13       21        5/18/2006       4981.7158   15.7173
> H13       21        4/14/2009       20327.0667  27.9155
> H15       9         3/31/2006       3570.06     14.7898
> H15       9         11/1/2008       15138.8383  26.2088
> H15       9         4/14/2009       17035.4688  26.8778
> H15       20        1/18/2005       3016.881    14.1886
> H15       20        10/4/2006       8330.4688   20.19425
> H15       20        6/30/2008       13576.5     25.4774
> H15       32        2/1/2006        3426.2525   14.31815
> U21       3         1/9/2006        3660.416    15.09925
> U21       3         6/30/2008       13236.29    24.27634
> U21       3         4/14/2009       16124.192   25.79562
> U21       67        11/4/2005       2812.8425   13.60485
> U21       67        4/14/2009       13468.455   24.6203
>
> And the desired output is the following;
>
> A-training
>
> Genotype  stand_ID  Inventory_date  stemC       mheight
> H13       7         5/18/2006       1940.1075   11.33995
> H13       7         11/1/2008       10898.9597  23.20395
> H13       7         4/14/2009       12830.1284  23.77395
> H15       9         3/31/2006       3570.06     14.7898
> H15       9         11/1/2008       15138.8383  26.2088
> H15       9         4/14/2009       17035.4688  26.8778
> U21       67        11/4/2005       2812.8425   13.60485
> U21       67        4/14/2009       13468.455   24.6203
>
> B-testing
>
> Genotype  stand_ID  Inventory_date  stemC       mheight
> H13       18        11/3/2005       2726.42     13.4432
> H13       18        6/30/2008       12226.1554  24.091967
> H13       18        4/14/2009       14141.68    25.0922
> H13       21        5/18/2006       4981.7158   15.7173
> H13       21        4/14/2009       20327.0667  27.9155
> H15       20        1/18/2005       3016.881    14.1886
> H15       20        10/4/2006       8330.4688   20.19425
> H15       20        6/30/2008       13576.5     25.4774
> H15       32        2/1/2006        3426.2525   14.31815
> U21       3         1/9/2006        3660.416    15.09925
> U21       3         6/30/2008       13236.29    24.27634
> U21       3         4/14/2009       16124.192   25.79562
>
> I tried the following code;
>
> library(caret)
> dataPartitioning <-
> createDataPartition(dataGenotype$stand_ID,1,list=F,p=0.2)
> train = dataGenotype[dataPartitioning,]
> test = dataGenotype[-dataPartitioning,]
>
> Also tried
>
> createDataPartition(unique(dataGenotype$stand_ID),1,list=F,p=0.2)
>
> It did not produce the desired output; the data are partitioned within
> the stand_ID. For example, one row of stand_ID 7 goes to training and
> two rows of stand_ID 7 go to testing. How can I partition the data by
> Genotype and stand_ID together?
>
>
>
> Ahmed Attia
>


Re: [R] r-data partitioning considering two variables (character and numeric)

2018-08-27 Thread Bert Gunter
Sorry, my bad -- careless reading: you need to do the partitioning within
genotype.
Something like:

by(dataGenotype, dataGenotype$Genotype, function(x){
   u <- unique(x$stand_ID)
   tst <- x$stand_ID %in% sample(u, floor(length(u)/2))
   list(test = x[tst,], train = x[!tst,])
   })


This gives a list with one component per Genotype; each component holds the
test and train data frame subsets for that Genotype, split by ID. These lists
of data frames can then be recombined into a single test and a single train
data frame by, e.g., an appropriate rbind() call.


HOWEVER, note that you will need to modify this function to decide what to
do if/when there is only one ID in a Genotype, as Don MacQueen already
pointed out.
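
A minimal sketch of the whole procedure, including the recombination and one possible answer to the single-ID question, might look like the following. The toy data, the helper name split_by_stand and the decision to send a lone stand_ID entirely to train are my assumptions for illustration, not a recommendation.

```r
# Toy data (made up): genotype G2 has only one stand_ID.
dataGenotype <- data.frame(
  Genotype = c("G1", "G1", "G1", "G1", "G2", "G2"),
  stand_ID = c(7, 7, 18, 21, 3, 3),
  stemC    = c(1940.1, 10899.0, 2726.4, 4981.7, 3660.4, 13236.3)
)

# Hypothetical helper: split one genotype's rows by whole stand_IDs.
split_by_stand <- function(x) {
  u <- unique(x$stand_ID)
  n_test <- floor(length(u) / 2)   # 0 when there is a single ID
  # Index into u instead of calling sample(u, ...) directly, because
  # sample() treats a single number n as the range 1:n.
  tst <- x$stand_ID %in% u[sample.int(length(u), n_test)]
  list(test = x[tst, ], train = x[!tst, ])
}

set.seed(1)
parts <- lapply(split(dataGenotype, dataGenotype$Genotype), split_by_stand)

# Recombine the per-genotype pieces into one test and one train data frame.
test  <- do.call(rbind, lapply(parts, `[[`, "test"))
train <- do.call(rbind, lapply(parts, `[[`, "train"))
```

With this choice a genotype that has a single stand_ID contributes nothing to test, which may or may not be what the analysis needs.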

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



Re: [R] TCGA biolinks, DNA methylation

2018-08-29 Thread Bert Gunter
As an aside, the sep = "," can be omitted, as that's the default anyway.

In his response to Sarah, the OP gave us only "the line was found to be
errored," which of course is useless. Perhaps if he provided explicit
information on what the call and the error were...
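
A quick way to convince yourself about the default, using a throwaway file with made-up contents rather than the OP's data:

```r
# Write a tiny comma-separated file to a temporary location (made-up rows).
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,value", "a,1", "b,2"), tmp)

# read.csv() defaults to header = TRUE and sep = ",", so both calls agree.
with_sep    <- read.csv(file = tmp, header = TRUE, sep = ",")
without_sep <- read.csv(file = tmp)
same <- identical(with_sep, without_sep)
print(same)
```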

-- Bert



On Wed, Aug 29, 2018 at 3:34 PM Sarah Goslee  wrote:

> Hi,
>
> If you had an actual gene analysis question I'd suggest the
> BioConductor email list, but you have a plain old ordinary typo:
>
>  the_data <-read.csv(file="LGG_clinical_drug.csv",header=T,sep",")
>
> You're missing the = after the argument sep
>
>  the_data <- read.csv(file = "LGG_clinical_drug.csv", header = TRUE, sep =
> ",")
>
> Using more spaces in your code would make that typo easier to spot.
>
> Sarah
> On Wed, Aug 29, 2018 at 6:06 PM Spencer Brackett
>  wrote:
> >
> > Good evening R users,
> >
> >   I am attempting to carry out DNA methylation analysis on two separate CSV
> > files (LGG and GBM), which I have downloaded onto my R console. To set the
> > path<-"." to be indicative of one or both of the csv files, I utilized the
> > following functions and received the errors shown. How do I set the "." so
> > that I can begin analysis on my files?
> >
> > > the_data <-read.csv(file="LGG_clinical_drug.csv",header=T,sep",")
> > Error: unexpected string constant in "the_data
> > <-read.csv(file="LGG_clinical_drug.csv",header=T,sep",""
> > > the_data<-read.csv(file="GBM_clinical_drug.csv",header=T,sep",")
> > Error: unexpected string constant in
> > "the_data<-read.csv(file="GBM_clinical_drug.csv",header=T,sep",""
> >
> > This is the preliminary portion of the analysis I am trying to run, which I
> > am referring to:
> >
> > library(TCGAbiolinks)
> >
> > # Download the DNA methylation data: HumanMethylation450 LGG and GBM.
> > path <- "."
> >
> > query.met <- TCGAquery(tumor = c("LGG","GBM"), "HumanMethylation450",
> >                        level = 3)
> > TCGAdownload(query.met, path = path)
> > met <- TCGAprepare(query = query.met, dir = path,
> >                    add.subtype = TRUE, add.clinical = TRUE,
> >                    summarizedExperiment = TRUE,
> >                    save = TRUE, filename = "lgg_gbm_met.rda")
> >
> > # Download the expression data: IlluminaHiSeq_RNASeqV2 LGG and GBM.
> > query.exp <- TCGAquery(tumor = c("lgg","gbm"),
> >                        platform = "IlluminaHiSeq_RNASeqV2", level = 3)
> >
> > TCGAdownload(query.exp, path = path,
> >              type = "rsem.genes.normalized_results")
> >
> > exp <- TCGAprepare(query = query.exp, dir = path,
> >                    summarizedExperiment = TRUE,
> >                    add.subtype = TRUE, add.clinical = TRUE,
> >                    type = "rsem.genes.normalized_results",
> >                    save = T, filename = "lgg_gbm_exp.rda")
> >
> > Many thanks,
> >
> > Spencer Brackett
> >
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>


Re: [R] R code for parameter estimates

2018-08-30 Thread Bert Gunter
As you have completely failed to follow procedures described in the posting
guide linked below, your post is unlikely to receive any response.

Cheers,

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Aug 30, 2018 at 3:08 PM KABELI MEFANE via R-help <
r-help@r-project.org> wrote:

> Hello R -helpers
>
>
> Can you please be kind enough to help me with the R code for GEV parameter
> estimates using Bayes? I have done them using MLE and it would really be
> nice to compare. I am trying to model rainfall data; I have used several
> distributions such as lognormal, Burr, Pearson, GEV but the three parameter
> lognormal and log Pearson were a hassle due to the shift parameter. Now I
> want to use Bayes.
>
> Best Regards
> Kabeli Mefane
>
> “Tell me and I'll forget; show me and I may remember; involve me and I'll
> understand.”
>


Re: [R] Resetting bin size in histogram having already changed to relative frequencies

2018-08-31 Thread Bert Gunter
Consult the docs, please. ?hist and the "breaks" argument. Also note the
"freq" argument, which means you should not be computing relative
frequencies manually.
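
A minimal sketch of what that looks like, assuming a bin width of 0.25 chosen purely for illustration. One thing worth checking for yourself: with freq = FALSE the vertical axis shows densities, which are the relative frequencies divided by the bin width, so the two only coincide when the bins have width 1.

```r
set.seed(42)
xvals <- rnorm(1000, 0, 1)

# Explicit break points control the bin size (0.25 is an arbitrary choice).
brks <- seq(floor(min(xvals)), ceiling(max(xvals)), by = 0.25)
h <- hist(xvals, breaks = brks, plot = FALSE)

# Relative frequency per bin, and its relation to the density hist() reports.
rel_freq <- h$counts / sum(h$counts)
check <- all.equal(h$density * diff(h$breaks), rel_freq)
print(check)

# hist(xvals, breaks = brks, freq = FALSE)  # draws the density version
```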

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Aug 31, 2018 at 7:42 AM Nick Wray via R-help 
wrote:

> Hello again.  I am trying to alter the bin size on a histogram where I
> have reset the vertical axis to relative frequency, rather than absolute.
> Beneath is a simple example (not my real data) of this:
>
> xvals<-rnorm(1000,0,1)
> xvals
> hist(xvals)
> h<-hist(xvals,plot=F)
>
> h
> h$counts
> h$counts<-h$counts/sum(h$counts)
> h$counts
> plot(h,freq=T,ylab="Relative Frequency")
>
>
> This gives me a plot with bin sizes of 0.5 and the relative frequency, but
> I cannot reset the bin size as well.  I don't know whether the only way to
> do it is to reset all the h$mids etc as well but this seems horrendously
> complicated and I wonder whether I am missing something simple
>
> Any ideas I would be thankful for   Nick Wray
>


Re: [R] Account for a factor variability in a logistic GLMM in lme4

2018-09-03 Thread Bert Gunter
You should post this on the r-sig-mixed-models list, not here.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Sep 3, 2018 at 7:43 AM Pedro Vaz  wrote:

> We did a field study in which we tried to understand which factors
> significantly explain the probability of a group of animals (5 species in
> total) crossing through 30 wildlife road-crossing structures. The response
> variable is binomial (yes=crossed; no = did not cross) and was recorded by
> animal species. We did about 30 visits to each crossing structure (our
> random factor) in which we recorded the binomial response by each animal
> species and the values of a few predictors.
>
> So, I have this (simplified for better understanding) mixed effects model:
> library (lme4)
>
> Mymodel <- glmer(cross.01 ~ stream.01 + width.m + grass.per +
>                    (1 | structure.id),
>                  data = Mydata, family = binomial)
>
> stream is a factor with 2 levels; width.m is continuous; grass.per is a
> percentage
>
> This is the model in which I assessed crossings by all species combined
> (i.e., cross.01 = 1 when an animal of any species crossed, cross.01 = 0
> when no animal crossed). However, we did one model per species and those
> species-specific models highlight that different species exhibit different
> relationships between crossings and explanatory variables.
>
> My problem: This means that my model above suffers from an additional
> source of variation related to the species level without accounting for it.
> However I cannot recalibrate the above model adding the species level as
> random factor because, in my binomial response, the zero means no species
> crossed (all zeros would have "NA" or, say, "none" for species) and so that
> additional source of variation is only present when the response was 1.
> Just to confirm this, I did add species as a random factor:
>
> (1 | structure.id) + (1 | species)
>
> As expected, the message is "Error: Response is constant"
>
> How can I account for the species variability in my model in lme4?
>
> A few more details:
> - I had 5 mammal species crossing through the 30 road-crossing structures.
> In 134 occasions (i.e., 134 of my records on individual
> crossing-structures), no animal crossed (so, @Dimitris Rizopoulos, no, I
> didn't have the species of the animals which did not cross. A "no cross"
> was a "zero" for that visit to the crossing-structure). In 498 occasions,
> at least one animal of a given species crossed the structure (these were my
> "ones" in my logistic response)
> - A side comment: This is to respond to a reviewer in a paper of mine,
> i.e., I did and presented species-specific and "all combined species"
> models in the draft reviewed but now the reviewer is asking me to control
> for the species variability in the "combined species model". He asked me to
> include a random factor but I realized that is not possible since all my
> zeros would have "none" for the species that crossed. So, is it possible to
> control for the species variability in my model in lme4 in another way? I
> know in nlme including a fitting of variance structures it's not that
> difficult...
> - Every time an animal crossed, the binary response was "one" and I
> recorded the animal species as well. Thus, I have variability between
> species in the "ones" but not in my "zeros" of my logistic model.
>


Re: [R] Display time of PDF plots

2018-09-03 Thread Bert Gunter
1. Plot a random sample of the points (e.g. of rows of a matrix/dataframe
containing "x" and "y" columns)

2. See the hexbin package

3. Check out the Graphics task view on CRAN:
https://cran.r-project.org/web/views/Graphics.html
(though it may be somewhat dated by now)

4. Internet search:  e.g. on "display scatterplots with thousands of
points"
typical hit:
https://stackoverflow.com/questions/7714677/scatterplot-with-too-many-points

5. Search/Post on stats.stackexchange.com instead.
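
As a rough sketch of option 1, with simulated points standing in for the real data and an arbitrary choice of 5000 retained points:

```r
set.seed(123)

# Simulated stand-in for a large scatter plot data set.
n <- 50000
pts <- data.frame(x = rnorm(n), y = rnorm(n))

# Keep a random sample of rows; 5000 is an arbitrary illustration.
keep <- sample(nrow(pts), 5000)
thinned <- pts[keep, ]

# Write the thinned plot to a PDF file in a temporary location.
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file)
plot(thinned$x, thinned$y, pch = ".", xlab = "x", ylab = "y")
dev.off()
```

How many points can be dropped before the overall pattern changes depends on the data, so it is worth comparing the thinned plot against the full one by eye.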

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Sep 3, 2018 at 10:45 AM Rich Shepard 
wrote:

>    This may be an inappropriate forum for this question. If so, please point
> me in a better direction.
>
>    A current project includes scatter plots with thousands of points. Saved
> as PDF files they display slowly using a pdf viewer or when included in the
> PDF output of a LaTeX document.
>
>    Is there a process by which these plots can be 'thinned' so they show the
> same overall patterns but with fewer points so they display more quickly?
>
>    Rasterizing them to .jpg files using 'convert' allows them to load
> immediately, but the bit-mapped resolution is, of course, much lower than
> the vector PDF format.
>
> Rich
>


Re: [R] Round down numeric values with decimals

2018-09-04 Thread Bert Gunter
This is *not* "rounding down."

But this should do it I think:
## (see ?floor)

x <- 3.896e09
k <- floor(log10(x))

> floor(x*10^(-k))*10^k
[1] 3e+09

There may be even slicker ways, but this is as slick as I can muster...

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Sep 4, 2018 at 9:33 AM Nelly Reduan  wrote:

> Hello,
>
> How can I round down numeric values with decimals? For example,
>
> > signif(3.896037e+09, digits = 1)
> [1] 4e+09
>
> The expected result is 3e+09 (and not 4e+09).
>
> > signif(8.68542378e-10, digits = 1)
> [1] 9e-10
>
> The expected result is 8e-10 (and not 9e-10).
>
> Thank you very much for your time.
>
> Have a nice day
>
> Nell
>
>


Re: [R] leave-one-out cross validation in mixed effects logistic model (lme4)

2018-09-04 Thread Bert Gunter
Please post on the r-sig-mixed-models list, where you are more likely to
find the requisite expertise.

However, FWIW, I think the reviewer's request is complete nonsense (naïve
cross validation requires iid sampling). But the mixed models experts are
the authorities on such judgments (and may tell you that my opinion is
complete nonsense!).

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Sep 4, 2018 at 10:16 AM Pedro Vaz  wrote:

>  Hello,
>
> So, I have this (simplified for better understanding) binomial mixed
> effects model [library (lme4)]
>
> Mymodel <- glmer(cross.01 ~ stream.01 + width.m + grass.per +
>                    (1 | structure.id),
>                  data = Mydata, family = binomial)
>
> stream is a factor with 2 levels; width.m is continuous; grass.per is a
> percentage
>
> Now, a reviewer is asking me to apply "a cross-validation procedure (i.e. a
> leave-one-out design coupled with predictive metrics as e.g. AUC) on this
> model"
>
> Does anyone have R-code to do this cross validation in my logistic mixed
> effects model? In the reviewer words: "the model should be evaluated also
> as for their predictive performance, not only for assumptions violation and
> for goodness-of-fit" (which I presented already in the reviewed paper
> draft)
>
> Many thanks in advance,
> pedro
>


Re: [R] Round down numeric values with decimals

2018-09-04 Thread Bert Gunter
Note also that if you wish to include 0 and negative numbers, and your
intent is to truncate to 1 digit towards 0, then you must of course check
for 0 separately and modify what I suggested for x != 0 to:

k <- floor(log10(abs(x)))
ifelse(x <0, ceiling(x*10^(-k)), floor(x*10^(-k))) *10^k

Note that this is all vectorized, so, e.g. ,

> x<- c(-101.8, 101.8)
> k <- floor(log10(abs(x)))
> ifelse(x <0, ceiling(x*10^(-k)), floor(x*10^(-k))) *10^k
[1] -100  100
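
The separate check for 0 mentioned above could be folded into a small function along these lines. The name trunc1 is made up for illustration, and the sketch assumes a numeric vector with no NA values.

```r
# Hypothetical helper: truncate each element to 1 significant digit
# towards 0, leaving exact zeros as zero.
trunc1 <- function(x) {
  out <- numeric(length(x))            # zeros stay zero
  nz <- x != 0                         # log10() is only safe for nonzero values
  k <- floor(log10(abs(x[nz])))
  out[nz] <- ifelse(x[nz] < 0,
                    ceiling(x[nz] * 10^(-k)),
                    floor(x[nz] * 10^(-k))) * 10^k
  out
}

res <- trunc1(c(-101.8, 0, 101.8, 3.896037e+09))
print(res)
```

Values that sit exactly on a power of 10 depend on how log10() and the scaling round in floating point, so those are worth testing separately before relying on this.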

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )




[R] A small thing that amused my small mind

2018-09-05 Thread Bert Gunter
A few days ago, someone asked how to truncate arbitrary numerics to 1
digit towards 0, e.g. -189 should become -100 and 254 should become
200; and all values of magnitude < 1 should become 0.

I proposed a somewhat clumsy solution using floor() and ceiling(), but
in fooling with it a bit more, I realized that a better way to do it
is to use R's trunc() function. It is simpler and works for all cases
AFAICS. Here's a little function to do it -- maybe someone else might
find it amusing/instructive:

poof <- function(x){
   ## truncating to 0
   ## x is a numeric vector
   k <- 10^trunc(log10(abs(x)))
   ifelse(k, trunc(x/k)*k, k)
}

## test it
> x <- c(0,-.036578, .4876, -189, 254)
> poof(x)
[1]    0    0    0 -100  200

Cheers,
Bert



Re: [R] Selecting random subset by ID

2018-09-07 Thread Bert Gunter
?sample

Should get you started

We expect you to first make an effort to learn about and write your
own code, rather than asking us to write it for you.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, Sep 7, 2018 at 11:38 AM David Joubert  wrote:
>
> Hello R users,
>
> I am working with a large dataset, including roughly 50 000 sequential 
> observations (variable "count") for 8000 individuals (variable "id"). The 
> dataset is very unbalanced, meaning that some individuals have few 
> observations and others have many. Because I plan on running Generalized 
> Linear Models for panel data using pglm and the package has file size 
> restrictions, I want to create 4 randomly selected subsets of 2500 
> individuals from the main dataset. What functions and code would I use to do 
> this?
>
> Thanks in advance,
>
> David Joubert
>
>
>


Re: [R] Correctly applying aggregate.ts()

2018-09-07 Thread Bert Gunter
Well, let's see:
"monthly.rain <- aggregate.ts(x = dp['sampdate','prcp'], by = list(month = \
substr(dp$sampdate, 1, 7)), FUN = sum, na.rm = TRUE)"

1. x is a data frame, so why are you using the time series method?
Perhaps you need to study S3 method usage in R.

2. You have improperly subscripted the data frame: it should be dp[,
c('sampdate','prcp')] . Perhaps you need to read about how
subscripting works in R. However, in this case, no subscripting is needed
(see 3.)

3. As you should be using the data frame method, and the month is
obtained as a substring of sampdate, you should use dp[,'prcp'] as
your data frame so that sum() is not applied to the sampdate column.

4. I assume the "\" indicates  ?

Anyway, once you have corrected all that, here's the call:

> monthly.rain <- aggregate(dp[, 'prcp'],
+   list(substr(dp$sampdate,1,7)),
+   FUN = sum, na.rm = TRUE)
> ## yielding
> monthly.rain
  Group.1    x
1 2005-01 4.88
2 2005-02 2.27
3 2005-03 0.06

It's perhaps also worth noting that the formula method (for data
frames) is somewhat more convenient, especially with several grouping
factors in the list:

> monthly.rain <- aggregate(prcp ~ substr(sampdate,1,7), data = dp, FUN = sum,
+   na.rm = TRUE)
> ##yielding
> monthly.rain
  substr(sampdate, 1, 7) prcp
1                2005-01 4.88
2                2005-02 2.27
3                2005-03 0.06
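
Since the original question also asked for the monthly median and maximum, here is one possible sketch that gets all three in a single call. The small data frame is made up in the same shape as dp, and the month column and the do.call(data.frame, ...) flattening step are my additions.

```r
# Made-up daily rainfall in the same shape as the OP's data frame.
dp <- data.frame(
  sampdate = c("2005-01-01", "2005-01-02", "2005-01-03",
               "2005-02-01", "2005-02-02", "2005-03-01"),
  prcp     = c(0.59, 0.08, 0.10, 0.00, 0.35, 0.05)
)

# Year-month key taken from the date string.
dp$month <- substr(dp$sampdate, 1, 7)

# FUN may return several values; aggregate() then stores them as a matrix
# column, which do.call(data.frame, ...) flattens into ordinary columns.
monthly <- aggregate(prcp ~ month, data = dp,
                     FUN = function(v) c(sum = sum(v),
                                         median = median(v),
                                         max = max(v)))
monthly <- do.call(data.frame, monthly)
print(monthly)
```

Note that the formula method drops rows with NA in prcp by default (na.action = na.omit), so no na.rm argument is passed here.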

Cheers,

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Fri, Sep 7, 2018 at 2:19 PM Rich Shepard  wrote:
>
>I've read ?aggregate and several blog posts on using aggregate() yet I
> still haven't applied it correctly to my dataframe. The sample data are:
>
> structure(list(sampdate = c("2005-01-01", "2005-01-02", "2005-01-03",
> "2005-01-04", "2005-01-05", "2005-01-06", "2005-01-07", "2005-01-08",
> "2005-01-09", "2005-01-10", "2005-01-11", "2005-01-12", "2005-01-13",
> "2005-01-14", "2005-01-15", "2005-01-16", "2005-01-17", "2005-01-18",
> "2005-01-19", "2005-01-20", "2005-01-21", "2005-01-22", "2005-01-23",
> "2005-01-24", "2005-01-25", "2005-01-26", "2005-01-27", "2005-01-28",
> "2005-01-29", "2005-01-30", "2005-01-31", "2005-02-01", "2005-02-02",
> "2005-02-03", "2005-02-04", "2005-02-05", "2005-02-06", "2005-02-07",
> "2005-02-08", "2005-02-09", "2005-02-10", "2005-02-11", "2005-02-12",
> "2005-02-13", "2005-02-14", "2005-02-15", "2005-02-16", "2005-02-17",
> "2005-02-18", "2005-02-19", "2005-02-20", "2005-02-21", "2005-02-22",
> "2005-02-23", "2005-02-24", "2005-02-25", "2005-02-26", "2005-02-27",
> "2005-02-28", "2005-03-01", "2005-03-02", "2005-03-03"), prcp = c(0.59,
> 0.08, 0.1, 0, 0, 0.02, 0.05, 0.1, 0, 0.02, 0, 0.05, 0.2, 0, 0,
> 0.5, 0.41, 0.84, 0.01, 0.1, 0.01, 0, 0, 0, 0, 0.21, 0.24, 0.13,
> 1.12, 0.01, 0.09, 0, 0, 0, 0.35, 0.18, 0.65, 0.16, 0, 0, 0, 0,
> 0.55, 0.21, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.17, 0.05,
> 0.01, 0)), row.names = c(NA, 62L), class = "data.frame")
>
>What I need to learn how to do is to calculate monthly sum, median, and
> maximum rainfall amounts from the full data set which has daily rainfall
> amounts. My most current effort to calculate monthly sums uses this syntax:
>
> monthly.rain <- aggregate.ts(x = dp['sampdate','prcp'], by = list(month = \
> substr(dp$sampdate, 1, 7)), FUN = sum, na.rm = TRUE)
>
> (entered on a single line) which produces this result:
>
> head(monthly.rain)
> [1] NA
>
>The sample data has 62 of the 113K rows in the dataframe. A larger set can
> be provided if needed.
>
>An explanation of what I've missed is needed.
>
> Regards,
>
> Rich
>


Re: [R] Correctly applying aggregate.ts()

2018-09-07 Thread Bert Gunter
Clarification: When using the formula interface, no subscripting is needed.

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



Re: [R] Bar Graph

2018-09-11 Thread Bert Gunter
Not quite -- he wanted the frequencies, not the counts. So something
like this (using the adj argument to center the frequencies above each
bar):

bp <- barplot(Number.of.Death, names.arg = Cause.of.Death,
    main = "Bar Graph for Death Data", ylab = "Number of Deaths",
    xlab = "Cause of Death", ylim = c(0, 500))

text(bp, y = Number.of.Death + 30, adj = .5,
    labels = round(Number.of.Death/sum(Number.of.Death), 2))

Cheers,
Bert

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Sep 11, 2018 at 11:02 AM AbouEl-Makarim Aboueissa
 wrote:
>
> Dear All:
>
>
> I do need your help on how to add frequency to bar plot on the top of each
> bar.
>
>
> here is the R code.
>
>
> Number.of.Death <- c(432, 217, 93, 34, 224)  # Number of Death
>
> Cause.of.Death <- c("Heart disease", "Cancer", "Stroke", "Accidents",
> "Other")
>
> barplot(Number.of.Death, names.arg=Cause.of.Death,
> main="Bar Graph for Death Data", ylab="Number of Death",
> xlab="Cause of Death")
>
>
>
> Thank you very much for your help in advance.
>
>
> with many thanks
> abou
> __
>
>
> *AbouEl-Makarim Aboueissa, PhD*
>
> *Professor of Statistics*
> *Graduate Coordinator*
>
> *Department of Mathematics and Statistics*
> *University of Southern Maine*
>


Re: [R] Undesired tick marks on top, right axes

2018-09-11 Thread Bert Gunter
Well, you might try ?xyplot -- in particular the "scales" list section
and in particular there the "tck" parameter. Adding
tck = c(1, 0)
to the "scales ="  list will probably solve your problem.
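
A minimal sketch of where tck would go, in case it helps. It assumes a
stand-in data frame agg.demo holding only the first 24 months of the
posted sums; the point is that tck sits at the top level of the scales
list, beside the x sublist, so it applies to both axes.

```r
library(lattice)

# Stand-in for agg.all: the first 24 monthly sums from the posted data
agg.demo <- data.frame(
  Month = factor(sprintf("%d-%02d", rep(2005:2006, each = 12), rep(1:12, 2))),
  Sum = c(53.51, 24.2, 88.54, 72.85, 77.3, 49.19, 8.77, 5.75,
          27.83, 79.75, 123.89, 168.29, 229.69, 70.91, 74.15, 62.3,
          43.56, 35.08, 3.6, 2.76, 26.83, 47.72, 293.23, 139.84)
)

# tck = c(1, 0): normal tick length on left/bottom, zero length on right/top
p <- xyplot(Sum ~ Month, data = agg.demo, col = "black", type = "h",
            xlab = "Year and Month", ylab = "Precipitation (in)",
            scales = list(tck = c(1, 0),
                          x = list(at = seq(1, 24, by = 6),
                                   cex = 0.7, rot = 90)))
print(p)
```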

-- Bert


On Tue, Sep 11, 2018 at 2:46 PM Rich Shepard  wrote:
>
>Every lattice xyplot() I've created before this one has tick marks on only
> the left and bottom axes. The current plot has sprouted tick marks on the
> top and right side, too, and I want to remove them. I've not found an answer
> to this issue in Deepayan's book or on the web. I would appreciate also
> learning why the extra tick marks appeared. The dput() data are included.
>
>The plotting command;
>
> rain.all.sum <- xyplot(Sum ~ Month, data=agg.all, col = 'black', type = 'h',
> main = 'Monthly Total Precipitation\n2005-2018',
> xlab = 'Year and Month', ylab = 'Precipitation (in)',
> scales = list(x = list(at = seq(1,162,by=6), cex = 0.7, rot = 90)))
>
> plot(rain.all.sum)
>
>The data:
>
> structure(list(Month = structure(1:162, .Label = c("2005-01",
> "2005-02", "2005-03", "2005-04", "2005-05", "2005-06", "2005-07",
> "2005-08", "2005-09", "2005-10", "2005-11", "2005-12", "2006-01",
> "2006-02", "2006-03", "2006-04", "2006-05", "2006-06", "2006-07",
> "2006-08", "2006-09", "2006-10", "2006-11", "2006-12", "2007-01",
> "2007-02", "2007-03", "2007-04", "2007-05", "2007-06", "2007-07",
> "2007-08", "2007-09", "2007-10", "2007-11", "2007-12", "2008-01",
> "2008-02", "2008-03", "2008-04", "2008-05", "2008-06", "2008-07",
> "2008-08", "2008-09", "2008-10", "2008-11", "2008-12", "2009-01",
> "2009-02", "2009-03", "2009-04", "2009-05", "2009-06", "2009-07",
> "2009-08", "2009-09", "2009-10", "2009-11", "2009-12", "2010-01",
> "2010-02", "2010-03", "2010-04", "2010-05", "2010-06", "2010-07",
> "2010-08", "2010-09", "2010-10", "2010-11", "2010-12", "2011-01",
> "2011-02", "2011-03", "2011-04", "2011-05", "2011-06", "2011-07",
> "2011-08", "2011-09", "2011-10", "2011-11", "2011-12", "2012-01",
> "2012-02", "2012-03", "2012-04", "2012-05", "2012-06", "2012-07",
> "2012-08", "2012-09", "2012-10", "2012-11", "2012-12", "2013-01",
> "2013-02", "2013-03", "2013-04", "2013-05", "2013-06", "2013-07",
> "2013-08", "2013-09", "2013-10", "2013-11", "2013-12", "2014-01",
> "2014-02", "2014-03", "2014-04", "2014-05", "2014-06", "2014-07",
> "2014-08", "2014-09", "2014-10", "2014-11", "2014-12", "2015-01",
> "2015-02", "2015-03", "2015-04", "2015-05", "2015-06", "2015-07",
> "2015-08", "2015-09", "2015-10", "2015-11", "2015-12", "2016-01",
> "2016-02", "2016-03", "2016-04", "2016-05", "2016-06", "2016-07",
> "2016-08", "2016-09", "2016-10", "2016-11", "2016-12", "2017-01",
> "2017-02", "2017-03", "2017-04", "2017-05", "2017-06", "2017-07",
> "2017-08", "2017-09", "2017-10", "2017-11", "2017-12", "2018-01",
> "2018-02", "2018-03", "2018-04", "2018-05", "2018-06"), class = "factor"),
>  Sum = c(53.51, 24.2, 88.54, 72.85, 77.3, 49.19, 8.77, 5.75,
>  27.83, 79.75, 123.89, 168.29, 229.69, 70.91, 74.15, 62.3,
>  43.56, 35.08, 3.6, 2.76, 26.83, 47.72, 293.23, 139.84, 103.48,
>  120.91, 85.96, 55.91, 26.56, 29.44, 9.9, 15.38, 33.47, 93.6,
>  105.61, 277.41, 279.38, 144.26, 220.88, 149.75, 82.28, 55.87,
>  5.29, 52.27, 21.1, 64.76, 182.31, 207.13, 196.29, 89.27,
>  187.72, 111.67, 111.72, 38.19, 6.07, 15.25, 52.46, 127.75,
>  208.43, 146.62, 169.34, 94.54, 154.21, 131.39, 151.27, 135.46,
>  9.98, 8.72, 86.67, 142.04, 225.61, 274.93, 196.68, 153.24,
>  263.54, 231.49, 122.23, 58.26, 34.65, 2.96, 28.21, 103.92,
>  217.52, 166.16, 305.27, 168.73, 333.28, 145.68, 101.2, 127.77,
>  15.41, 1.85, 3.49, 245.99, 272.35, 297.05, 177.17, 105.71,
>  118.44, 136.34, 161.01, 53.31, 1.15, 23.43, 200.97, 69.12,
>  158.51, 131.67, 156.95, 266.38, 291.7, 147.15, 101.49, 78.89,
>  26.99, 24.35, 35.76, 210.2, 225.55, 282.85, 153.91, 148.13,
>  187.03, 133.99, 62.28, 17.58, 13.41, 35.58, 47.04, 154.92,
>  317.77, 604.04, 288.91, 210.86, 266.04, 121.62, 78.17, 85.96,
>  29.84, 7.02, 72.13, 404.33, 247.71, 255.5, 138.22, 339.5,
>  368.99, 209.41, 110.08, 63.9, 0.62, 6.97, 133.75, 227.1,
>  312.99, 178.58, 255.8, 155.05, 135.27, 225.55, 15.23, 1.58
>  ), Median = c(0.01, 0, 0, 0.1, 0.1, 0, 0, 0, 0, 0.02, 0.1,
>  0.04, 0.5, 0, 0.1, 0.07, 0, 0, 0, 0, 0, 0, 0.57, 0.03, 0,
>  0.2, 0.055, 0, 0, 0, 0, 0, 0, 0, 0, 0.21, 0.21, 0.02, 0.2,
>  0.11, 0.02, 0, 0, 0, 0, 0, 0.1, 0.165, 0.01, 0.01, 0.15,
>  0.01, 0, 0, 0, 0, 0, 0.02, 0.125, 0, 0.1, 0.08, 0.055, 0.1,
>  0.13, 0.01, 0, 0, 0, 0, 0.2, 0.17, 0.02, 0.07, 0.23, 0.15,
>  0.06, 0, 0, 0, 0, 0.03, 0.1, 0, 0.1, 0.1, 0.2, 0.07, 0, 0.02,
>  0, 0, 0, 0.04, 0.115, 0.24, 0.02, 0.03, 0.01, 0.01, 0.02,
>  0, 0, 0, 0, 0, 0.02, 0, 0, 0.2, 0.18, 0.05, 0, 0, 0, 0, 0,
>  0.08, 0.07, 0.1, 0, 0.01, 0, 0.02, 0, 0, 0, 0, 0, 0, 0.09,
>  0.4, 0.18, 0.1, 0.2, 0, 0, 0, 0, 0

Re: [R] Undesired tick marks on top, right axes

2018-09-11 Thread Bert Gunter
??

Not when I run your code with the tck specification added. Show us
your xyplot invocation. It should be
scales = list(tck = c(1,0), x= etc.)

Bert

On Tue, Sep 11, 2018 at 3:51 PM Rich Shepard  wrote:
>
> On Tue, 11 Sep 2018, Bert Gunter wrote:
>
> > Adding
> > tck = c(1, 0)
> > to the "scales ="  list will probably solve your problem.
>
> Bert,
>
>How interesting. This removed the tick marks on top but left them on the
> right axes. Will think more about this.
>
> Regards,
>
> Rich
>


Re: [R] Undesired tick marks on top, right axes

2018-09-11 Thread Bert Gunter
As I thought, you did not do what I told you to.

Look *carefully* at the two to see your error.

-- Bert


On Tue, Sep 11, 2018 at 5:01 PM Rich Shepard  wrote:
>
> On Tue, 11 Sep 2018, Bert Gunter wrote:
>
> > Not when I run your code with the tck specification added. Show us
> > your xyplot invocation. It should be
> > scales = list(tck = c(1,0), x= etc.)
>
> Bert,
>
>Command:
>
> rain.all.sum <- xyplot(Sum ~ Month, data=agg.all, col = 'black', type = 'h',
> main = 'Monthly Total Precipitation\n2005-2018',
> xlab = 'Year and Month', ylab = 'Precipitation (in)',
> scales = list(x = list(tck = c(1, 0), at = seq(1,162,by=6),
>cex = 0.7, rot = 90)))
>
> rain.all.sum.pdf attached.
>
> Regards,
>
> Rich


Re: [R] Undesired tick marks on top, right axes

2018-09-11 Thread Bert Gunter
You do that. Your error is obvious.
-- Bert
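
For readers following along: one thing that stands out in the quoted
command is that type is given as the single string 'p, h', which is not
one of the type values xyplot's default panel function recognizes, so
nothing gets drawn. Whether that is the error meant here is my
assumption. A sketch with a stand-in data frame agg.demo (first 12
months of the posted sums), passing the types as a character vector:

```r
library(lattice)

# Stand-in for agg.all: the first 12 monthly sums from the posted data
agg.demo <- data.frame(
  Month = factor(sprintf("2005-%02d", 1:12)),
  Sum = c(53.51, 24.2, 88.54, 72.85, 77.3, 49.19, 8.77, 5.75,
          27.83, 79.75, 123.89, 168.29)
)

# Several plot types go in a character vector, one element per type
p <- xyplot(Sum ~ Month, data = agg.demo, col = "black",
            type = c("p", "h"),
            scales = list(tck = c(1, 0),
                          x = list(at = seq(1, 12, by = 6),
                                   cex = 0.7, rot = 90)))
print(p)
```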


On Tue, Sep 11, 2018 at 5:39 PM Rich Shepard  wrote:
>
> On Tue, 11 Sep 2018, Bert Gunter wrote:
>
> > As I thought, you did not do what I told you to.
> > Look *carefully* at the two to see your error.
>
> Bert,
>
>You're correct, of course. After moving the tck parameter in front of the
> x list the right-side ticks are gone. Unfortunately, so are the data: the
> panel is empty.
>
>Corrected command:
>
> rain.all.sum <- xyplot(Sum ~ Month, data=agg.all, col = 'black', type = 'p, h',
> main = 'Monthly Total Precipitation\n2005-2018',
> xlab = 'Year and Month', ylab = 'Precipitation (in)',
> scales = list(tck = c(1,0), x = list(at = seq(1,162,by=6),
>cex = 0.7, rot = 90)))
>
>Tomorrow I'll work on why the panel display disappeared along with the
> right axes tick marks. Parentheses all match according to emacs.
>
> Thanks,
>
> Rich
>


Re: [R] Install R into mac

2018-09-12 Thread Bert Gunter
You have given insufficient information for useful help. R installs
(from the Mac binary) without difficulty on my Mac.

Have your student post with sufficient details to the r-sig-mac list.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Wed, Sep 12, 2018 at 9:10 AM AbouEl-Makarim Aboueissa
 wrote:
>
> Dear All:
>
> One of my students has  mac software  OS X Yosemite, Version 10.10.5. He
> could not install R into his mac laptop. I am not familiar with mac at all.
> Any help will be appreciated.
>
> with thanks
> abou
> __
>
>
> *AbouEl-Makarim Aboueissa, PhD*
>
> *Professor of Statistics*
> *Graduate Coordinator*
>
> *Department of Mathematics and Statistics*
> *University of Southern Maine*
>


Re: [R] Save printed/plotted output to different directory

2018-09-12 Thread Bert Gunter
Don't post. Try it and see.
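
A quick way to try it, as a sketch. The directory and file names are
made up for the test and live under tempdir(), so nothing outside the
session is touched. Note the assumptions: the target directory has to
exist before pdf() is called, and the device has to be closed with
dev.off() before the file is complete.

```r
# Build an absolute path to a scratch "images" directory
out_dir <- file.path(tempdir(), "images")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
out_file <- file.path(out_dir, "filename.pdf")

pdf(out_file)          # open the device on the absolute path
plot(1:10, main = "test plot")
invisible(dev.off())   # close the device so the file is written out

file.exists(out_file)
```

If the real plot is a lattice object created inside a script or
function, wrapping it in print() or plot() between pdf() and dev.off()
is also needed; otherwise the page can come out blank.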

-- Bert


On Wed, Sep 12, 2018 at 9:21 AM Rich Shepard  wrote:
>
>I run analyses in one directory and keep images and textual output in
> other directories. My test involving a pdf output specifying an output
> directory relative to the cwd produced a blank image. The command was like
> this:
> pdf('../images/filename.pdf')
>
>Will R accept an absolute path to an output directory or none at all?
>
> TIA,
>
> Rich
>


Re: [R] Save printed/plotted output to different directory

2018-09-12 Thread Bert Gunter
Insufficient info to diagnose. No code for what you did.

-- Bert


On Wed, Sep 12, 2018 at 9:43 AM Rich Shepard  wrote:
>
> On Wed, 12 Sep 2018, Bert Gunter wrote:
>
> > Don't post. Try it and see.
>
> Bert,
>
>Tried using both $HOME and the full path without success and wondered if
> there was a way to direct output to a different directory that hadn't
> occurred to me.
>
> Rich
>


Re: [R] select and hold missing

2018-09-12 Thread Bert Gunter
"Why it is setting all row values  NA?"

Because the row index is NA. e.g.

> z <- data.frame(a=letters[1:3],b = 1:3); x <- c(TRUE,NA,FALSE)
> z[x,]
      a  b
1     a  1
NA <NA> NA

Change your logical comparison to (using with() to simplify entry):

> dfc[with(dfc, diff > 0 & diff < 100 | is.na(diff)), ]
   week v1 v2 diff
2    w1 NA 42   NA
3    w1 31 32    1
4    w2 31 52   21
5    w2 41 NA   NA
6    w3 51 82   31
7    w2 11 22   11
8    w3 11 12    1
10   w1 31 72   41
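
For anyone who wants to run this end to end, a self-contained sketch
follows. The data are the poster's; the names keep and dfca are just for
illustration, and the parentheses only spell out the precedence that R
already applies (& binds tighter than |).

```r
dfc <- read.table(text = "week v1 v2
w1 11 11
w1 . 42
w1 31 32
w2 31 52
w2 41 .
w3 51 82
w2 11 22
w3 11 12
w4 21 202
w1 31 72
w2 71 52", header = TRUE, as.is = TRUE, na.strings = c("", ".", "NA"))

dfc$diff <- dfc$v2 - dfc$v1

# A logical index must be TRUE or FALSE for every row; an NA index
# produces an all-NA row. is.na(diff) turns those NAs into TRUE.
keep <- with(dfc, is.na(diff) | (diff > 0 & diff < 100))
dfca <- dfc[keep, ]
dfca
```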

Cheers,
Bert


On Wed, Sep 12, 2018 at 1:39 PM Val  wrote:
>
> I have a data
> dfc <- read.table( text= 'week v1 v2
>   w1  11  11
>   w1  .   42
>   w1  31  32
>   w2  31  52
>   w2  41  .
>   w3  51  82
>   w2  11  22
>   w3  11  12
>   w4  21  202
>   w1  31  72
>   w2  71  52', header = TRUE, as.is = TRUE, na.strings=c("",".","NA") )
>
> I want to create this new variable diff = v2-v1  and remove rows based
> on this "diff" value as shown below.
> dfc$diff <-  dfc$v2 - dfc$v1
> I want to remove rows where diff is <= 0 or >= 100 and keep all other
> rows, including those where diff is NA.
> dfca  <- dfc[((dfc$diff) > 0) & ((dfc$diff) < 100), ]
>
>  However, the result is not what I wanted. I want the output as follow,
>   week v1 v2 diff
>   w1   NA 42   NA
>   w1   31 32    1
>   w2   31 52   21
>   w2   41 NA   NA
>   w3   51 82   31
>   w2   11 22   11
>   w3   11 12    1
>   w1   31 72   41
>
> However, I got this. Why it is setting all row values  NA?
>   week v1 v2 diff
>   <NA> NA NA   NA
>   w1   31 32    1
>   w2   31 52   21
>   <NA> NA NA   NA
>   w3   51 82   31
>   w2   11 22   11
>   w3   11 12    1
>   w1   31 72   41
>
> Any help ?
> Thank you.
>


Re: [R] ddply (or other suitable solution) question

2018-09-13 Thread Bert Gunter
What if there is only one read in the id?


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Sep 13, 2018 at 12:11 PM Andras Farkas via R-help
 wrote:
>
> Dear All,
>
> I have data frame:
> set.seed(123.456)
> df <-data.frame(ID=c(1,1,2,2,2,3,3,3,3,4,4,5,5),
> read=c(1,1,0,1,1,1,0,0,0,1,0,0,0),
> int=c(1,1,0,0,0,1,1,0,0,1,1,1,1),
> z=rnorm(13,1,5),
> y=rnorm(13,1,5))
>
> what I would like to achieve (as best as I see it now) is to create multiple 
> lists (and lists within lists using the data in df) that would be based on 
> the groups in the ID column ("top level of list") and "join together" each 
> line item within the group followed by the next line item ("bottom level 
> list"), so would look like this for
>
> [[ID=1]]
> [[1]][[1]]
>   ID read int        z        y
>    1    1   1 5.188935 5.107905
>    1    1   1 1.766866 4.443201
> [[ID=2]]
> [[2]][[1]]
>   ID read int         z         y
>    2    0   0 -4.690685 3.7695883
>    2    1   0  7.269075 0.6904414
> [[ID=2]]
> [[2]][[2]]
>   ID read int        z          y
>    2    1   0 7.269075  0.6904414
>    2    1   0 3.132321 -0.5298133
> [[ID=3]]
> [[3]][[1]]
>   ID read int          z         y
>    3    1   1 -0.4753574 -0.902355
>    3    0   1  5.4756283 -2.473535
> [[ID=3]]
> [[3]][[2]]
>   ID read int        z           y
>    3    0   1 5.475628 -2.47353489
>    3    0   0 5.390667 -0.03958639
>
>
> hoping the example is clear enough... all your help is appreciated,
>
> thanks,
>
>
>
> Andras
>


Re: [R] ddply (or other suitable solution) question

2018-09-13 Thread Bert Gunter
Modulo my earlier question, it seems that you just want to replicate the
interior rows (all but the first and last) within an id if there are more
than 2 rows. If this is incorrect, ignore the rest of this post.

Otherwise...

(I assume the data frame is listed in ID order, whatever that is)

set.seed(123.456)
df <-data.frame(ID=c(1,1,2,2,2,3,3,3,3,4,4,5,5),
read=c(1,1,0,1,1,1,0,0,0,1,0,0,0),
int=c(1,1,0,0,0,1,1,0,0,1,1,1,1),
z=rnorm(13,1,5),
y=rnorm(13,1,5))

yielded on my Mac and R version 3.5.1

> df
   ID read int          z           y
1   1    1   1 -1.8023782  1.55341358
2   1    1   1 -0.1508874 -1.77920567
3   2    0   0  8.7935416  9.93456568
4   2    1   0  1.3525420  3.48925239
5   2    1   0  1.6464387 -8.83308578
6   3    1   1  9.5753249  4.50677951
7   3    0   1  3.3045810 -1.36395704
8   3    0   0 -5.3253062 -4.33911853
9   3    0   0 -2.4342643 -0.08987457
10  4    1   1 -1.2283099 -4.13002224
11  4    0   1  7.1204090 -2.64445615
12  5    0   1  2.7990691 -2.12519634
13  5    0   1  3.0038573 -7.43346655

## The following doubles up the interior rows within each ID
> ix <- tapply(seq_len(nrow(df)), df$ID,
+   function(x){
+     lenx <- length(x)
+     if(lenx > 2)
+       c(x[1], rep(x[2:(lenx-1)], each = 2), x[lenx])
+     else x
+   }
+ )
> ix
$`1`
[1] 1 2

$`2`
[1] 3 4 4 5

$`3`
[1] 6 7 7 8 8 9

$`4`
[1] 10 11

$`5`
[1] 12 13

## now use the ix list to break up df:

> lapply(ix, function(i)df[i,])
$`1`
  ID read int          z         y
1  1    1   1 -1.8023782  1.553414
2  1    1   1 -0.1508874 -1.779206

$`2`
    ID read int        z         y
3    2    0   0 8.793542  9.934566
4    2    1   0 1.352542  3.489252
4.1  2    1   0 1.352542  3.489252
5    2    1   0 1.646439 -8.833086

$`3`
    ID read int         z           y
6    3    1   1  9.575325  4.50677951
7    3    0   1  3.304581 -1.36395704
7.1  3    0   1  3.304581 -1.36395704
8    3    0   0 -5.325306 -4.33911853
8.1  3    0   0 -5.325306 -4.33911853
9    3    0   0 -2.434264 -0.08987457

$`4`
   ID read int         z         y
10  4    1   1 -1.228310 -4.130022
11  4    0   1  7.120409 -2.644456

$`5`
   ID read int        z         y
12  5    0   1 2.799069 -2.125196
13  5    0   1 3.003857 -7.433467

I leave it to you to modify the lapply() function to break up each id
data frame into sublists of pairs if that is what you wish to do.
Assuming again that this is actually what you want.
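
In case a starting point is useful, here is one possible sketch of that
last step. It assumes the goal is a list per ID whose elements are
two-row data frames pairing each row with the next one; pairs_by_id is a
made-up name, and an ID with a single row is returned as a one-element
list (my choice, since the question of what to do there is still open).

```r
set.seed(123)
df <- data.frame(ID   = c(1,1,2,2,2,3,3,3,3,4,4,5,5),
                 read = c(1,1,0,1,1,1,0,0,0,1,0,0,0),
                 int  = c(1,1,0,0,0,1,1,0,0,1,1,1,1),
                 z    = rnorm(13, 1, 5),
                 y    = rnorm(13, 1, 5))

# Top level: one element per ID. Bottom level: consecutive row pairs.
pairs_by_id <- lapply(split(df, df$ID), function(d) {
  n <- nrow(d)
  if (n < 2) return(list(d))            # nothing to pair with
  lapply(seq_len(n - 1), function(i) d[c(i, i + 1), ])
})

# Second pair for ID 3 (original rows 7 and 8)
pairs_by_id[["3"]][[2]]
```

This works directly on the split data frames, so the doubled-up index
in ix is not needed for this variant.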

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

Re: [R] Question on Binom.Confint

2018-09-13 Thread Bert Gunter
In what package?
Binomial confidence interval functions are in several.

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Sep 13, 2018 at 6:38 PM Guo, Fang (Associate)
 wrote:
>
> Hi,
>
> I have a question with the function Binom.Confint(x,n,"method"=lrt). For 
> likelihood ratio test, I'd like to ask how you define the upper limit when 
> the frequency of successes is zero. Thanks!
>
>
> Fang Guo
> Associate
>
> CORNERSTONE RESEARCH
> 699 Boylston Street, 5th Floor
> Boston, MA 02116-2836
> 617.927.3042 direct
> fa...@cornerstone.com<mailto:fa...@cornerstone.com>
>
> www.cornerstone.com<http://www.cornerstone.com/>
>
>


Re: [R] sink() output to another directory

2018-09-13 Thread Bert Gunter
I find your "explanation" confusing. You appear to be misusing
print(). Please read ?print carefully. You print objects in R, not
files. Objects in R do not have "/" in their names (without some
trickery). See ?make.names .
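
For comparison, a sketch of the usual pattern: the path (slashes and
all) goes to sink() as a string, and print() receives an R object. The
directory, file name and the data frame wx below are stand-ins, written
under tempdir() so the example is harmless to run.

```r
# Scratch output directory; it has to exist before sink() opens the file
out_dir <- file.path(tempdir(), "stat-summaries")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
out_file <- file.path(out_dir, "precip-summary.txt")

# Stand-in object to summarize
wx <- data.frame(prcp = c(0.59, 0.08, 0.10, 0, 0.35))

sink(out_file)        # divert printed output to the file
print(summary(wx))    # print an object, not a file name
sink()                # restore output to the console

readLines(out_file)
```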

-- Bert






Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Sep 13, 2018 at 7:12 PM Rich Shepard  wrote:
>
> On Thu, 13 Sep 2018, Henrik Bengtsson wrote:
>
> >> sink('stat-summaries/estacada-se-precip.txt')
> >> print(summary(estacada_se_wx))
> >> sink()
> >>
> >> while accepting:
> >>
> >> pdf('../images/rainfall-estacada-se.pdf')
> >>   
> >> plot(rain_est_se)
> >> dev.off()
> >>
> >>Changing the sink() file to
> >> './stat-summaries/estacada-se-precip.txt'
> >>
> >> generates the same error
> >
> > "same error" as what? (ambiguity is the reason for not being able to
> > help you - all the replies in this thread this far are correct and on
> > the spot)
> >
> > BTW, not that it should matter, what is your operating system and version 
> > of R?
>
> Henrik,
>
>As I wrote in earlier messages:
>
> sink('stat-summaries/estacada-wnw-precip.txt')
> print(summary(estacada_se_wx))
> sink()
>
> results in
>
> 24: sink('stat-summaries/estacada-wnw-precip.txt')
> 25: print(/
> ^
> Does not matter if I use single or double quotes.
>
>The message that print() doesn't like the forward slash results when I
> specify 'stat-summaries/estacada-wnw-precip.txt' or
> './stat-summaries/estacada-wnw-precip.txt'.
>
>Running R-3.5.1 on Slackware-14.2.
>
> Rich
>



Re: [R] how to plot gridded data

2018-09-13 Thread Bert Gunter
You may wish to consider posting on r-sig-geo, where you may be more
likely to find expertise for this sort of thing.
-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Sep 13, 2018 at 7:08 PM lily li  wrote:
>
> Hi Petr,
>
> I have merged the data using cbind. The dataset is like this:
> DF
> lat1_lon1  lat1_lon2  lat1_lon3  ...  lat2_lon1
>   1.20   1.30  2.11  ... 1.28
>   1.50   1.81  3.12  ... 2.34
>   2.41   2.22  1.56  ... 2.50
>   3.11   4.21  2.12  ... 3.21
>
> The other file is a shapefile, which I can open using readOGR. It then shows
> a polygon according to geographical latitude and longitude in degrees. How
> can I overlay the values in DF onto the polygon? Note that DF has the
> coordinates for a rectangular box that includes the shapefile, but is
> larger. I don't know how to do this. Thanks for your help.
>
> On Wed, Sep 12, 2018 at 3:22 PM, PIKAL Petr  wrote:
>
> > Hi
> >
> > 1. Read files/lines into R ?read.table, ?readLines
> > 2. Merge files according to your specification ?merge, ?rbind
> > 3. Plot values by suitable command(s) ?plot, ?ggplot
> > 4. If you want more specific answer, please post more specific question,
> > preferably with concise and clear example.
> > 5. Avoid posting in HTML
> >
> > Cheers
> > Petr
> >
> > > -Original Message-
> > > From: R-help  On Behalf Of lily li
> > > Sent: Wednesday, September 12, 2018 8:55 AM
> > > To: R mailing list 
> > > Subject: [R] how to plot gridded data
> > >
> > > Hi R users,
> > >
> > > I have a question about plotting gridded data. I have the files
> > separately, but do
> > > not know how to combine them. For example, each txt file has daily
> > > precipitation data at a specific grid cell, named pr_lat_lon.txt. How to
> > plot all
> > > txt files for one surface (which is rectangular in this case), or how to
> > combine
> > > the txt files together? Thanks.
> > >
> >
> >
>



Re: [R] Question on Binom.Confint

2018-09-14 Thread Bert Gunter
Then it's binom.confint (case matters in R -- PLEASE DO A TUTORIAL OR
TWO!) and there is no "lrt"  option. So no idea what you're referring
to.
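
For what it is worth, the upper limit of a likelihood-ratio interval at zero
successes has a closed form. The sketch below uses base R only and follows one
textbook convention; it is not necessarily what any function in the binom
package does. The values of n and conf are illustrative.

```r
n    <- 10      # number of trials (illustrative)
conf <- 0.95    # confidence level (illustrative)

# With x = 0 the likelihood is (1 - p)^n and the MLE is p = 0, so the
# likelihood-ratio statistic is -2 * n * log(1 - p). The upper limit is
# the p at which that statistic equals the chi-square(1) quantile.
q     <- qchisq(conf, df = 1)
upper <- 1 - exp(-q / (2 * n))

c(lower = 0, upper = upper)
```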

-- Bert


On Fri, Sep 14, 2018 at 6:53 AM Guo, Fang (Associate)
 wrote:
>
> I used library(binom).
>
> -----Original Message-
> From: Bert Gunter [mailto:bgunter.4...@gmail.com]
> Sent: Thursday, September 13, 2018 10:04 PM
> To: Guo, Fang (Associate) 
> Cc: r-help-requ...@r-project.org; R-help 
> Subject: Re: [R] Question on Binom.Confint
>
> In what package?
> Binomial confidence interval functions are in several.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and 
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Thu, Sep 13, 2018 at 6:38 PM Guo, Fang (Associate)  
> wrote:
> >
> > Hi,
> >
> > I have a question with the function Binom.Confint(x,n,"method"=lrt). For 
> > likelihood ratio test, I'd like to ask how you define the upper limit when 
> > the frequency of successes is zero. Thanks!
> >
> >
> > Fang Guo
> > Associate
> >
> > CORNERSTONE RESEARCH
> > 699 Boylston Street, 5th Floor
> > Boston, MA 02116-2836
> > 617.927.3042 direct
> > fa...@cornerstone.com<mailto:fa...@cornerstone.com>
> >
> > www.cornerstone.com<http://www.cornerstone.com/>
> >
> >



Re: [R] New to R

2018-09-14 Thread Bert Gunter
Others may help, but I suggest first going through an R tutorial or
two to learn about R's basic data structures, i/o, etc. This list can
help, but cannot substitute for such homework. Some tutorial
recommendations can be found here:
https://www.rstudio.com/online-learning/#r-programming

There are many more, of course.

See also:
?read.table (etc. in the Help page)
?write.table
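
A minimal sketch of the ?write.table route, assuming the downloaded prices sit
in a data-frame element of a list (the quoted post's l.out$df.tickers suggests
this). The list below is a hypothetical stand-in so the example runs without
the BatchGetSymbols package or a network connection; its column names and
values are made up.

```r
# Hypothetical stand-in for the list returned by the download step.
l.out <- list(
  df.control = data.frame(ticker = c("SPY", "VCR")),
  df.tickers = data.frame(
    ticker      = c("SPY", "SPY", "VCR"),
    ref.date    = as.Date(c("2018-09-12", "2018-09-13", "2018-09-13")),
    price.close = c(289.1, 290.8, 180.2)
  )
)

str(l.out)                      # a list; its elements are data frames
prices <- l.out$df.tickers      # extract the data frame with `$`
is.data.frame(prices)

# Assumption: write to tempdir(); replace with any writable path.
csv_file <- file.path(tempdir(), "tickers.csv")
write.csv(prices, csv_file, row.names = FALSE)
```

write.csv() is a thin wrapper around write.table() with comma separators.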

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Fri, Sep 14, 2018 at 1:56 PM Jim Blackburn
 wrote:
>
> I am newly subscribed to r-project.
>
>
> I have recently plunged into R on a totally self-taught basis (may not have 
> been the smartest decision!)
>
>
>
> I am attempting to download tickers as a time series.  I can successfully 
> create RDA files but I want to convert them to CSV.  Following is the code I 
> have created so far.
>
>
>
> if (!require(BatchGetSymbols)) install.packages('BatchGetSymbols')
> library(BatchGetSymbols)
>
> tickers <- c('SPY','VCR', 'RPG')
> first.date <- Sys.Date()-365
> last.date <- Sys.Date()
>
> l.out <- BatchGetSymbols(tickers = tickers,
>                          first.date = first.date,
>                          last.date = last.date,
>                          cache.folder = file.path("c://Users/Owner/Documents/R",
>                                                   'BGS_Cache') )
>
> print(l.out$df.control)
> print(l.out$df.tickers)
>
>
>
>
>
>
>
> I can print(l.out) and see that it contains all the data, but it is not a 
> data.frame
>
>
>
> Can anyone help with creating a data.frame and then converting to CSV?
>
>
>
> Any help is GREATLY appreciated!
>
>
>
> Thanks
>
>
>
> Jim
>
>



Re: [R] bootstrap sample for clustered data

2018-09-16 Thread Bert Gunter
I can't make any sense of your post. Id 3 occurs 6 times, and 2 and 5 occur
twice each in your example. How do you get (1,1,2,2,3,3,4,4,5,5) out of
that? In other words, specify the mapping of old id's to new.

Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Sep 16, 2018 at 11:51 AM Liu, Lei  wrote:

> Hi there,
>
> I tried to generate bootstrap samples for clustered data. Here is some
> code I found in the web to do the work:
>
> id=c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
> y=c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2)
> x=c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1 )
>
> xx=data.frame(id, x, y)
>
> boot.cluster <- function(x, id){
>
>   boot.id <- sample(unique(id), replace=T)
>   out <- lapply(boot.id, function(i) x[id%in%i,])
>
>   return( do.call("rbind",out) )
>
> }
>
> boot.pro=boot.cluster(xx, xx$id)
>
> Now I have the output
>
>id x   y
> 5   3 0 0.4
> 6   3 0 1.0
> 51  3 0 0.4
> 61  3 0 1.0
> 9   5 1 0.5
> 10  5 1 2.0
> 52  3 0 0.4
> 62  3 0 1.0
> 3   2 1 0.4
> 4   2 1 0.3
>
> However, the id variable is the original id, while I want to take the new
> id as (1, 1, 2, 2, 3, 3, 4, 4, 5, 5) for later analysis. Can anyone show me
> how to do it? Of note, the same original id may have duplicates since the
> bootstrap sample is drawn with replacement. Thanks a lot!
>
> Lei
>
>



Re: [R] bootstrap sample for clustered data

2018-09-16 Thread Bert Gunter
(I neglected to cc this to the list -- Bert)


On Sun, Sep 16, 2018 at 1:36 PM Bert Gunter  wrote:

> You can do a mixed effects model using the existing id's without recoding.
>
> But if you insist, is this the sort of thing you want?
>
> set.seed(-12345) # for reproducibility
>
> id <- factor(sample(2:5, 10, rep=TRUE))
> id
> new.id <- factor(id,labels = seq_along(levels(id)))
> new.id
>
> Note: There's a slightly slicker way to do this, but it bypasses the
> factor() API, and I prefer not to do that.
>
> Cheers,
> Bert
>
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sun, Sep 16, 2018 at 12:52 PM Liu, Lei  wrote:
>
>> Sorry for the confusion. I just want to recode the id variable to 1 to 5
>> in the bootstrapped sample. This way I can do e.g., a mixed effects model
>> using the new id as the cluster. Thanks!
>>
>> Lei
>>
>>
>>


Re: [R] bootstrap sample for clustered data

2018-09-16 Thread Bert Gunter
Unless there is good reason not to -- which is not the case here --
**always** cc the list. I have done that here.

"Can you help me with it?"
Nope. I'm not a private consultant, and I already made an attempt to do so,
which you seem to have completely ignored. So I'm done.
By the way, "Unfortunately it couldn’t work for my case" is a completely
meaningless comment. You need to explicitly show what you did and what
error messages you received. Read the posting guide below for how to post
an intelligible question.
Finally, if you think this is a mixed model issue -- which I believe you
are confused about, but as I can't penetrate your comments, maybe I'm wrong
-- post on the r-sig-mixed-models list, not here. The same comments about
posting an intelligible question apply there.
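
For archive readers, a minimal sketch of the recoding described in the quoted
message below: each drawn cluster gets a fresh label, so a cluster drawn twice
counts as two clusters. It modifies the boot.cluster() function from the
thread; the new.id column name follows the poster's desired output, and the
seed is arbitrary.

```r
id <- c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5)
y  <- c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2, 2.2, 3)
x  <- c(0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1)
xx <- data.frame(id, x, y)

boot.cluster <- function(x, id) {
  boot.id <- sample(unique(id), replace = TRUE)
  # Label rows by the position of the draw, not by the original id.
  out <- lapply(seq_along(boot.id), function(k) {
    rows <- x[id %in% boot.id[k], , drop = FALSE]
    rows$new.id <- k
    rows
  })
  do.call("rbind", out)
}

set.seed(1)  # arbitrary, for reproducibility
boot.xx <- boot.cluster(xx, xx$id)
boot.xx
```

The original id column is kept alongside new.id, so it remains possible to
see which source cluster each resampled cluster came from.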

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Sep 16, 2018 at 5:34 PM Liu, Lei  wrote:

> Hi Bert,
>
> Thanks for your help. Unfortunately it couldn’t work for my case. Please
> see my code below. Here id is the cluster. Note different clusters have
> different number of subjects, some have 2, some have 3.
>
> id=c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5)
> y=c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2, 2.2, 3)
> x=c(0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1 )
>
> xx=data.frame(id, x, y)
>
> boot.cluster <- function(x, id){
>
>   boot.id <- sample(unique(id), replace=T)
>   out <- lapply(boot.id, function(i) x[id%in%i,])
>
>   return( do.call("rbind",out) )
>
> }
>
> boot.xx=boot.cluster(xx, xx$id)
>
> Here is the boot.xx dataset:
>
>    id x   y
> 5   3 0 0.4
> 6   3 0 1.0
> 7   3 0 0.9
> 1   1 0 0.5
> 2   1 0 0.6
> 11  5 1 2.2
> 12  5 1 3.0
> 3   2 1 0.4
> 4   2 1 0.3
> 13  1 0 0.5
> 21  1 0 0.6
>
> You can see that some clusters (ids) appears multiple times (e.g., id 1
> appears in two places – 4 rows), since bootstrap does a sample *with
> replacement*, we could have the same cluster multiple times. Thus, we
> cannot do a mixed effects model using this data, as we should assume all
> the clusters are different in this new data. Instead, I will reorganize the
> data as below. This is the step I need help:
>
>    new.id x   y
> 5       1 0 0.4
> 6       1 0 1.0
> 7       1 0 0.9
> 1       2 0 0.5
> 2       2 0 0.6
> 11      3 1 2.2
> 12      3 1 3.0
> 3       4 1 0.4
> 4       4 1 0.3
> 13      5 0 0.5
> 21      5 0 0.6
>
> Can you help me with it? Thanks a lot!
>
> Lei
>
>
>

Re: [R] Applying by() when groups have different lengths

2018-09-17 Thread Bert Gunter
Inline.

Bert



On Mon, Sep 17, 2018 at 11:54 AM Rich Shepard 
wrote:

>My dataframe has 113K rows split by a factor into 58 separate
> data.frames,
> with a different numbers of rows (see error output below).
>
>I cannot think of a way of proving a sample of data; if a sample for a
> MWE
> is desired advice on produing one using dput() is needed.
>

This is gibberish. What does "proving a sample of data" mean? etc. Please
proofread and edit.

>
>To summarize each group within this dataframe I'm using by() and getting
> an error because of the different number of rows:
>

> > by(rainfall_by_site, rainfall_by_site[, 'name'], function(x) {
> + mean.rain <- mean(rainfall_by_site[, 'prcp'])
> + })
>

You are misspecifying your function. It has argument x, but you do not use
x in your function. Also the assignment at the end is unnecessary and
probably wrong for your use case. Please go through a tutorial on how to
write functions in R.

You are probably also misusing by(), but as you did not provide sufficient
information -- head(your_data_frame) or similar would have told us its
structure, rather than having us guess -- nor a reproducible example, it's
hard (for me) to figure out your intent. **PLEASE** follow the posting
guide and provide such information. You have been requested to do this
several times already.

Here is the sort of thing I think you wanted to do:

> set.seed(54321) ## for reproducibility
> df <- data.frame(f = sample(LETTERS[1:3], 12, rep = TRUE), y = runif(12))
> df
   f  y
1  B 0.04529991
2  B 0.65272100
3  A 0.99406601
4  A 0.67763735
5  A 0.91854517
6  C 0.46244494
7  A 0.57141480
8  A 0.45193882
9  B 0.16770701
10 B 0.06826135
11 A 0.89691069
12 C 0.27383703

> by(df, df$f, function(x)mean(x$y))
df$f: A
[1] 0.7517521
--
df$f: B
[1] 0.2334973
--
df$f: C
[1] 0.368141

Note that you do not first break up the df into separate df's, which sounds
like what you tried to do.

However, note that if all you want to do is summarize a *single* numeric
column by a factor, you do not need to use by() at all, which is designed
to work on (several columns of) the whole data frame simultaneously. For a
single column, tapply() is all you need (or, as Duncan noted, functionality
in the dplyr package).

> with(df,tapply(y,f,mean))
A B C
0.7517521 0.2334973 0.3681410
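
Applied to the original call, the fix is to use the function's argument x (the
subset for one group) and to pass the unsplit data frame. A minimal sketch,
assuming columns named name and prcp as in the quoted call; the data values
are invented:

```r
# Invented data; only the column names come from the original post.
rainfall <- data.frame(
  name = c("A", "A", "A", "B", "B", "C"),
  prcp = c(0.1, 0.3, 0.2, 0.0, 0.4, 1.0)
)

# Groups may have different numbers of rows; by() handles that.
site_means <- by(rainfall, rainfall$name, function(x) mean(x$prcp))
site_means

# Same result for a single column, as a named vector.
with(rainfall, tapply(prcp, name, mean))
```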

Finally, if I have misunderstood your intent, my apologies. I tried.

-- Bert



mean.rain <- by(rainfall_by_site, rainfall_by_site[, 'name'], function(x) {
+ mean.rain <- mean(rainfall_by_site[, 'prcp'])
+ })

> Error in (function (..., row.names = NULL, check.rows = FALSE, check.names
> = TRUE,  :
>arguments imply differing number of rows: 4900, 1085, 1894, 2844, 3520,
>   647, 239, 3652, 3701, 3063, 176, 4713, 4887, 119, 165, 1221, 3358, 1457,
>   4896, 166, 690, 1110, 212, 1727, 227, 236, 1175, 1485, 186, 769, 139,
> 203,
>   2727, 4357, 1035, 1329, 1454, 973, 4536, 208, 350, 125, 3437, 731, 4894,
>   2598, 2419, 752, 427, 136, 685, 4849, 914, 171
>
>My web searches have not found anything relevant; perhaps my search
> terms
> (such as 'R: apply by() with different factor row numbers') can be
> improved.
>
>The help pages found using apropos('by') appear the same: ?by,
> ?by.data.frame, ?by.default and provide no hint on how to work with unequal
> rows per factor.
>
>How can I apply by() on these data.frames?
>
> Rich
>


Re: [R] Applying by() when groups have different lengths [RESOLVED]

2018-09-17 Thread Bert Gunter
"  I did not pick up on by() doing the splitting for me when I read the
help..."

From ?by:
"A data frame is split by row into data frames subsetted by the values of
one or more factors, and function FUN is applied to each subset in turn."

I do not understand how it could be more clearly stated than that. Care to
elaborate?
Did you run the examples? You should **always** do so.
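
The sentence quoted from ?by can be checked directly. A small sketch, using an
invented data frame, showing that by() gives the same per-group values as
splitting by hand with split() and applying the function to each piece:

```r
df <- data.frame(f = c("A", "B", "A", "C", "B", "A"),
                 y = c(1, 2, 3, 4, 6, 8))

# by() splits df by rows according to df$f, then applies FUN to each piece.
via_by <- by(df, df$f, function(d) mean(d$y))

# The same thing done by hand.
pieces    <- split(df, df$f)
via_split <- sapply(pieces, function(d) mean(d$y))

via_split
```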

-- Bert


On Mon, Sep 17, 2018 at 12:56 PM Rich Shepard 
wrote:

> On Mon, 17 Sep 2018, MacQueen, Don wrote:
>
> > I'm also going to guess that maybe your object rainfall_by_site has
> > already been split into separate data frames (because of its name). But
> > by() does the splitting internally, so you should be passing it the
> > original unsplit data frame.
>
> Don,
>
>I did not pick up on by() doing the splitting for me when I read the
> help
> file and a few web sites!
>
>Using the unsplit data.frame did the job; e.g.,
>
> rainfall[, "name"]: Sandy 1.4 NE
> [1] 0.1636066
> 
> rainfall[, "name"]: Sandy 1.7 SSW
> [1] 0.2021324
> 
> rainfall[, "name"]: Sherwood 3.3 SE
> [1] 0.1461752
>
>Now I know how to properly apply by() to an unsplit dataframe. Thanks
> for
> the insightful lesson.
>
> Best regards,
>
> Rich
>


Re: [R] How to vectorize this function

2018-09-20 Thread Bert Gunter
Also:

What package does polya() come from? And "gamma" (as a numeric value) is
undefined (it is a function).
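
Because y, gamma and polya() are not defined in the post, the poster's formula
cannot be vectorized here. As a hedged illustration of the general pattern
only, the sketch below turns a nested scalar if/else into nested ifelse()
calls. branch_scalar() and branch_vec() are made-up names and the arithmetic
is a placeholder, not the implied-volatility formula.

```r
# Scalar version: nested if/else, works on one pair of values at a time.
branch_scalar <- function(a, b) {
  if (a >= 0) {
    if (b <= a) a - b else a + b
  } else {
    if (b <= -a) -a - b else b
  }
}

# Vectorized version: same branching, element by element.
branch_vec <- function(a, b) {
  ifelse(a >= 0,
         ifelse(b <= a, a - b, a + b),
         ifelse(b <= -a, -a - b, b))
}

a <- c(-2, -1, 0, 1, 2)
b <- c( 1,  3, 0, 2, 1)
branch_vec(a, b)
```

One caveat: ifelse() evaluates every branch for every element, so an
expression such as sqrt() of a negative number in an unused branch can still
produce NaN warnings.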


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Sep 20, 2018 at 9:06 AM David L Carlson  wrote:

> Your function takes an argument "F" that is never used and uses an object
> "y" which is not defined. Give us some data to use for testing different
> approaches along with the answer you expect. It may be possible to use two
> ifelse() functions instead of the loop.
>
> 
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
> -Original Message-
> From: R-help  On Behalf Of Lynette Chang
> Sent: Thursday, September 20, 2018 10:09 AM
> To: r-help@r-project.org
> Subject: [R] How to vectorize this function
>
> Hello everyone,
>
>  I’ve a function with five input argument and one output number.
>   impVolC <- function(callM, K, T, F, r)
>
>  I hope this function can take five vectors as input, then return one
> vector as output. My vectorization ran into problems with the nested
> if-else operation. As a result, I have to write another for loop to call
> this function. Can anyone suggest some methods to overcome it? I put my
> code below, thanks.
>
> impVolC <- function(callM, K, T, F, r){
>
>
>  if(y >= 0){
>  call0 <- K*exp(-r*T)*(exp(y)*polya(sqrt(2*y)) - 0.5)
>  if(callM <= call0){
>sig <- 1/sqrt(T)*(sqrt(gamma + y) - sqrt(gamma - y))
>  }else{
>sig <- 1/sqrt(T)*(sqrt(gamma + y) + sqrt(gamma - y))
>  }
>  }else{
>  call0 <- K*exp(-r*T)*(exp(y)/2 - polya(-sqrt(-2*y)))
>  if(callM <= call0){
>sig <- 1/sqrt(T)*(-sqrt(gamma + y) + sqrt(gamma - y))
>  }else{
>sig <- 1/sqrt(T)*(sqrt(gamma + y) + sqrt(gamma - y))
>  }
>  }
>  sig
> }
>
> for(i in 1:length(call)){
>  sigV[i] <- impVolC(callM = call[i], K = df$Strike[i], T = T, F = F, r =
> r_m) }
>


Re: [R] Fitting Production Curves

2018-09-21 Thread Bert Gunter
This list doesn't do statistics -- it does R programming, though statistics
does occur incidentally sometimes in that context. Not in your post
though. You should post on a statistics site like stats.stackexchange.com
for statistics questions.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Sep 20, 2018 at 10:38 PM mikorym via R-help 
wrote:

> Hi All,
>
> By a production curve I mean for example the output of a mine, peak oil
> production or the yield of a farm over time within the same season. It is
> this last example that we should take as the prototypical case.
>
> What I would like to do is to fit a curve that inherits qualities of the
> discrete production data (such as area of the curve equaling the total
> production for the season). Fitting a curve with least squares (such as a
> Gaussean or Hubbert) presents some issues (with regards to accuracy of
> inherited features). My next logical attempt would be to fit a sum of
> curves, such as a Fourier or Wavelet sum. Perhaps there is something
> simpler or more flexible in the way I am thinking?
>
> My questions are:
>
> 1. What would be an effective approach to fit generalised production
> curves?
> 2. If a Wavelet sum is one of the best approaches, what would be a good
> way of implementing such curve fitting (including calculated coefficients)
> in R?
> 3. Is there anything else or another way that I should rather be thinking
> about this instead?
>
> Best regards
> Phillip-Jan van Zyl
> MSc Mathematics, Stellenbosch
>


Re: [R] For Loop

2018-09-22 Thread Bert Gunter
Bob:

Please, please spend some time with an R tutorial or two before you post
here. This list can help, but I think we assume that you have already made
an effort to learn basic R on your own. Your question is about as basic as
it gets, so it appears to me that you have not done this. There are many
many R tutorials out there. Some suggestions, by no means comprehensive,
can be found here:
https://www.rstudio.com/online-learning/#r-programming

Others will no doubt respond, but you can answer it yourself after only a
few minutes with most R tutorials.
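
For reference, a minimal sketch of how the loop quoted below could be written
without a for loop. It assumes c1 is a numeric vector of positive values (the
sample values here are made up) and relies on log() defaulting to the natural
logarithm, so base = exp(1) is not needed:

```r
# Hypothetical input; any numeric vector of positive values will do.
c1 <- c(100, 105, 103, 110)

# log(c1[i+1] / c1[i]) equals log(c1[i+1]) - log(c1[i]),
# so the whole loop collapses to a lagged difference of logs.
s <- diff(log(c1))

# Equivalent form that mirrors the original ratio more literally.
s_ratio <- log(c1[-1] / c1[-length(c1)])
```

Neither form requires s to exist beforehand, because the whole vector is
created in one assignment.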

Cheers,
Bert




On Sat, Sep 22, 2018 at 2:16 PM rsherry8  wrote:

>
> It is my impression that good R programmers make very little use of the
> for statement. Please consider  the following
> R statement:
>  for( i in 1:(len-1) )  s[i] = log(c1[i+1]/c1[i], base = exp(1) )
> One problem I have found with this statement is that s must exist before
> the statement is run. Can it be written without using a for
> loop? Would that be better?
>
> Thanks,
> Bob
>



Re: [R] error "The system cannot find the file specified..."

2018-09-22 Thread Bert Gunter
You probably want pattern = "\\.PDF" , as "." has a special meaning for
regex's. However, that really shouldn't make any difference.

Obvious questions:
1. dir() returns a vector of file names. Are they pdf's "PDF" or "pdf"
(case matters!) ?
2. extract_tables() almost certainly wants the full path names to the
files, not just the file names, if your working directory isn't set to the
directory containing the files. So what does getwd() give?
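
A sketch of both points, using a throwaway directory under tempdir() in place
of the real folder (the file names are made up, apart from "April 24.PDF" from
the error message). list.files() is the same function as dir(); full.names
and ignore.case are its documented arguments:

```r
# Stand-in for the real folder: three empty files in a temporary directory.
d <- file.path(tempdir(), "pdf_demo")
dir.create(d, showWarnings = FALSE)
invisible(file.create(file.path(d, c("April 24.PDF", "May 1.pdf", "notes.txt"))))

# "\\." matches a literal dot, "$" anchors at the end of the name,
# ignore.case = TRUE accepts both .PDF and .pdf,
# full.names = TRUE returns paths that work from any working directory.
files <- list.files(d, pattern = "\\.pdf$", ignore.case = TRUE, full.names = TRUE)

basename(files)      # the two PDF names, without the directory part
file.exists(files)   # all TRUE, so a reader function can open them

# In the loop, each element of the result list would then be filled with
# A[[i]] <- extract_tables(files[i])   # [[ ]] rather than [ ] for list elements
```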

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Sep 22, 2018 at 4:22 PM Ek Esawi  wrote:

> Hi All,
>
> I am using the R Tabulizer package to extract tables from a set of pdf
> files. Tabulizer creates a list of data frames; each corresponds to a
> table in a file. My aim is to create a list of lists, one for each
> file. I have 8 files.
> The code below kept giving me the error "Error in
> normalizePath(path.expand(path), winslash, mustWork) : path[1]="April
> 24.PDF": The system cannot find the file specified". But when i used
> table_extract (file) for individual files, it works perfectly.
>
> Any help is greatly appreciated.
>
>
> EK
>
>
> path = "C:/Users/name/Documents/TextMining/"
> file.names <- dir(path, pattern =".PDF")
> A <- vector("list", length(file.names))
> for(i in 1:length(file.names)){
>   A[i] <- extract_tables(file.names[i])}
>



Re: [R] For Loop

2018-09-23 Thread Bert Gunter
"... I learned to say "try it and see" in many different ways. "

Version 2: *Never* parallelize your computations   except when you
should.

;-)
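
On Duncan's point, quoted below, that Vectorize() changes a function's
interface so that outer() can use it, a small sketch may help. The function f
is made up for illustration; it only works on single values because if()
expects a length-one condition:

```r
# A scalar-only function: if() cannot handle a vector condition.
f <- function(x, y) if (x > y) x - y else 0

# Vectorize() wraps f so that it is applied element by element (via mapply).
vf <- Vectorize(f)

# outer() passes two long vectors to its FUN argument in a single call,
# so it needs the vectorized wrapper rather than f itself.
m <- outer(1:3, 1:3, vf)
m
```

A hand-written loop or an expression such as pmax(x - y, 0) would do the same
job with less overhead, which is the "performance penalty" side of the
trade-off.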

-- Bert



On Sun, Sep 23, 2018 at 1:20 PM Duncan Murdoch 
wrote:

> On 23/09/2018 4:00 PM, Wensui Liu wrote:
> > Very insightful. Thanks, Duncan
> >
> > Based on your opinion, is there any benefit to use the parallelism in
> > the corporate computing environment where the size of data is far more
> > than million rows and there are multiple cores in the server.
>
> I would say "try it and see".  Sometimes it probably helps a lot,
> sometimes it's probably detrimental.
>
> Duncan Murdoch
>
> P.S. I last worked in a corporate computing environment 40 years ago
> when I was still wet behind the ears, so you'd probably want to ask
> someone else.  However, more recently I worked in an academic
> environment where I learned to say "try it and see" in many different
> ways.  You just got the basic one today.
>
>
> >
> > Actually the practice of going concurrency or not is more related to my
> > production tasks instead of something academic.
> >
> > Really appreciate your thoughts.
> >
> > On Sun, Sep 23, 2018 at 2:42 PM Duncan Murdoch wrote:
> >
> > On 23/09/2018 3:31 PM, Jeff Newmiller wrote:
> >
> > [lots of good stuff deleted]
> >
> >  > Vectorize is
> >  > syntactic sugar with a performance penalty.
> >
> > [More deletions.]
> >
> > I would say Vectorize isn't just "syntactic sugar".  When I use that
> > term, I mean something that looks nice but is functionally equivalent.
> >
> > However, Vectorize() really does something useful:  some functions
> > (e.g. outer()) take other functions as arguments, but they assume the
> > argument is a vectorized function.  If it is not, they fail, or
> > generate garbage results.  Vectorize() is designed to modify the
> > interface to a function so it acts as if it is vectorized.
> >
> > The "performance penalty" part of your statement is true.  It will
> > generally save some computing cycles to write a new function using a
> > for loop instead of using Vectorize().  But that may waste some
> > programmer time.
> >
> > Duncan Murdoch
> > (writing as one of the authors of Vectorize())
> >
> > P.S. I'd give an example of syntactic sugar, but I don't want to
> > bruise some other author's feelings :-).
> >
>



Re: [R] reading data problem

2018-09-24 Thread Bert Gunter
*Perhaps* useful questions (perhaps *not*, though):

1. What is your OS? What is your R version?
2. How do you know that your data has 151 rows?
3. Are there stray characters -- perhaps a stray eof -- in your data? Have
you checked around row 96 to see what's there?
4. Are the data you did get in R what you expect?
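
Questions 2 and 3 can be checked from within R. The sketch below uses a small
made-up file instead of for_R_graphs.csv; the apostrophe and the "#" in it are
examples of characters that read.table() treats specially by default (as a
quote and as a comment marker), which is one plausible, but unconfirmed,
reason for rows going missing without an error:

```r
# Made-up stand-in for the real file.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name,value",
             "1,alpha,10",
             "2,O'Brien,20",
             "3,gamma #2,30"), tmp)

# How many data rows are physically in the file (minus the header line)?
raw <- readLines(tmp)
n_expected <- length(raw) - 1

# Fields per line with quoting and comments switched off;
# any line that differs from the rest deserves a closer look.
fields <- count.fields(tmp, sep = ",", quote = "", comment.char = "")

# Read with the same settings, so ' and # are taken as ordinary characters.
a <- read.table(tmp, header = TRUE, sep = ",", quote = "",
                comment.char = "", stringsAsFactors = FALSE)

c(expected = n_expected, read = nrow(a))
```

If the two counts agree with these settings but not with the defaults, the
quote or comment characters are the likely culprit.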

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Sep 24, 2018 at 11:27 AM greg holly  wrote:

> Hi Dear all;
>
> I have a dataset with 151*291 dimension. After making data read into R I am
> getting a data with 96*291 dimension. Even though  I have no error message
> from R I could not understand the reason why I cannot get data correctly?
>
> Here are my codes to make read the data
> a<-read.table("for_R_graphs.csv", header=T, sep=",")
> a<-read.table("for_R_graphs.txt", header=T, sep="\t")
>
> Regards,
>
> Greg
>



Re: [R] reading data problem

2018-09-24 Thread Bert Gunter
One more question:

5. Have you tried shutting down, restarting R, and rereading?

-- Bert

On Mon, Sep 24, 2018 at 11:36 AM Bert Gunter  wrote:

> *Perhaps* useful questions (perhaps *not*, though):
>
> 1. What is your OS? What is your R version?
> 2. How do you know that your data has 151 rows?
> 3. Are there stray characters -- perhaps a stray eof -- in your data? Have
> you checked around row 96 to see what's there?
> 4. Are the data you did get in R what you expect?
>
> -- Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Sep 24, 2018 at 11:27 AM greg holly  wrote:
>
>> Hi Dear all;
>>
>> I have a dataset with 151*291 dimension. After making data read into R I
>> am
>> getting a data with 96*291 dimension. Even though  I have no error message
>> from R I could not understand the reason why I cannot get data correctly?
>>
>> Here are my codes to make read the data
>> a<-read.table("for_R_graphs.csv", header=T, sep=",")
>> a<-read.table("for_R_graphs.txt", header=T, sep="\t")
>>
>> Regards,
>>
>> Greg
>>
>



Re: [R] Summarizing R script

2018-09-26 Thread Bert Gunter
All suggestions made by others here are useful, but I would suggest that
computer scientists are probably a better -- or at least valuable
additional -- resource for this sort of knowledge than R programmers. A web
search on "self-documenting code" and/or "reproducible research" should
yield lots of relevant hits. For R specifically, the CRAN "Reproducible
Research" task view should be useful.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Sep 26, 2018 at 8:39 AM MacQueen, Don via R-help <
r-help@r-project.org> wrote:

> I wonder if the lintr package might be helpful.
>
> --
> Don MacQueen
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
> Lab cell 925-724-7509
>
>
>
> On 9/26/18, 7:00 AM, "R-help on behalf of Spencer Brackett" <
> r-help-boun...@r-project.org on behalf of spbracket...@saintjosephhs.com>
> wrote:
>
> R users,
>
>   Is anyone aware of the proper procedure for summarizing a script (your
> complete list of functions, arguments, and error codes within your R
> console) for, say, a formal report or publication?
>
> Many thanks,
>
> Best wishes,
>
> Spencer Brackett
>
> -- Forwarded message -
> From: CHATTON Anne via R-help 
> Date: Wed, Sep 26, 2018 at 6:03 AM
> Subject: [R] Problems to obtain standardized betas in multiply-imputed
> data
> To: r-help@r-project.org 
>
>
> Dear all,
>
> I am having problems in obtaining standardized betas on a
> multiply-imputed
> data set. Here are the codes I used :
> imp = mice(data, 5, maxit=10, seed=42, print=FALSE)
> FitImp <- with(imp,lm(y ~ x1 + x2 + x3))
> Up to here everything is fine. But when I ask for the standardized
> coefficients of the multiply-imputed regressions using this command :
> sdBeta <- lm.beta(FitImp)
> I get the following error message:
> Error in b * sx : argument non numérique pour un opérateur binaire
> [English: non-numeric argument to binary operator]
>
> Can anyone help me with this please?
>
> Anne
>



Re: [R] subset only if f.e a column is successive for more than 3 values

2018-09-27 Thread Bert Gunter
1. I assume the values are integers, not floats/numerics (which would make
it more complicated).

2. Strategy: Take differences (e.g. see ?diff) and look for >3 1's in a
row.

I don't have time to work out details, but perhaps that helps.
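
One way the details might be filled in, as a sketch only. It works on the row
positions returned by which() rather than on the printed row names, and it
reads "more than three times without gap" as a run of more than 3 consecutive
rows; both are assumptions about what is wanted:

```r
# Row positions that pass the filter (airquality ships with R).
idx <- which(airquality$Temp > 80)

# A new run starts wherever the step to the next position is not exactly 1.
run_id <- cumsum(c(1, diff(idx) != 1))

# Length of the run that each position belongs to.
run_len <- ave(idx, run_id, FUN = length)

# Keep only positions that sit in runs of more than 3 consecutive rows.
keep <- idx[run_len > 3]
res <- airquality[keep, c("Ozone", "Temp")]
head(res)
```

Changing the threshold in run_len > 3 adjusts the minimum run length.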

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Sep 27, 2018 at 7:49 AM Knut Krueger 
wrote:

> Hi to all
>
> I need a subset for values if there are f.e 3 values successive in a
> column of a Data Frame:
> Example from the subset help page:
>
> subset(airquality, Temp > 80, select = c(Ozone, Temp))
> 29 45   81
> 35 NA   84
> 36 NA   85
> 38 29   82
> 39 NA   87
> 40 71   90
> 41 39   87
> 42 NA   93
> 43 NA   92
> 44 23   82
> .
>
> I would like to get only
>
> ...
> 40 71   90
> 41 39   87
> 42 NA   93
> 43 NA   92
> 44 23   82
> 
>
> because the left column is ascending more than f.e three times without gap
>
> Any hints for a package or do I need to build a own function?
>
> Kind Regards Knut
>



Re: [R] Access function as text from package by name

2018-09-28 Thread Bert Gunter
Do you mean:
?get
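
Since get() looks up a plain name and does not parse "::" itself, a string
such as "graphics::box" would first have to be split. A sketch of one way to
do that, where fun_text is a made-up helper name and not an existing function:

```r
# Hypothetical helper: return the source of a function given its name as a
# string, with or without a "pkg::" or "pkg:::" prefix.
fun_text <- function(name) {
  parts <- strsplit(name, ":::?")[[1]]   # splits on "::" as well as ":::"
  f <- if (length(parts) == 2) {
    # Looking in the namespace finds exported and internal objects alike.
    get(parts[2], envir = asNamespace(parts[1]), mode = "function")
  } else {
    get(name, mode = "function")
  }
  deparse(f)
}

txt <- fun_text("graphics::box")
head(txt, 3)
```

Whether this counts as simpler than deparse(eval(parse(text = ...))) is a
matter of taste; it does avoid evaluating arbitrary text.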



On Thu, Sep 27, 2018, 11:44 PM Sigbert Klinke 
wrote:

> Hi,
>
> I guess I was not clear enough: the name of the function is stored as
> string. Solutions which use the object directly do not help unfortunately.
>
> Thanks Sigbert
>
> Am 27.09.2018 um 12:30 schrieb Sigbert Klinke:
> > Hi,
> >
> > I want to have a function, e.g. graphics::box, as text.
> > Currently I'm using
> >
> > deparse(eval(parse(text='graphics::box')))
> >
> > It is important that '::' and ':::' can be used in the name.
> >
> > Is there a simpler way?
> >
> > Thanks
> >
> > Sigbert
> >
> >
> >
>
>
> --
> https://hu.berlin/sk
> https://hu.berlin/mmstat3
>



Re: [R] Installing packages in bulk

2018-10-01 Thread Bert Gunter
Your syntax is wrong.

pkgs

character ***vector*** of the names of packages

See the example at the end of ?install.packages.

"a","b", "c"  is **not** a vector
c("a", "b", "c")  **is** a vector.
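
If the list already exists as one comma-separated string, it can be turned
into a proper character vector rather than retyped. A sketch, using only the
two package names visible in the quoted warning (the full list is truncated
there) and leaving the actual install call commented out:

```r
# One long string, as in the quoted install.packages('BH, ...') call.
pkg_string <- "BH,covr"

# Split on commas and strip any stray spaces: now a character vector.
pkgs <- trimws(strsplit(pkg_string, ",")[[1]])

# Optionally skip packages that are already present on this host.
to_install <- setdiff(pkgs, rownames(installed.packages()))

# install.packages(to_install)   # uncomment to run the installation
```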

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Oct 1, 2018 at 3:04 PM Rich Shepard 
wrote:

>For a new installation of R-3.5.1 I want to install all packages on
> another host. I prepared a file, R-libraries.R, which contains
> install.packages('BH, ...') for the entire list.
>
>When I source() this file on the new host and select a CRAN mirror I see
> the message, "Warning message:
> package 'BH,covr, 
> [...truncated]
>
>Do I need a space after each comma or is there something else wrong with
> my syntax?
>
> Rich
>



Re: [R] arulesSequences Package

2018-10-01 Thread Bert Gunter
"But if I run in R (alone) it works just fine.  Any idea why I can't seem
to
run in RStudio or what I might have to do in RStudio?"

This is the R-Help list; RStudio is a totally separate software product,
Any questions you have on RStudio software should be directed to their
site,not here.

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Oct 1, 2018 at 5:03 PM Jeff Reichman 
wrote:

> I was able to figure out how to create my object using the "read_baskets"
> function but when I run the "cspade" function I receive the following error
> (running in RStudio)
> ...
> 4 -> 1 2 -> 1 -> 1 -- 1 1
> 3 -> 1 2 -> 1 -> 1 -- 1 1
> 3 4 -> 1 2 -> 1 -> 1 -- 1 1
> 3 4 -> 1 -> 1 -> 1 -- 1 1
>  NA MB [2.4s]
> reading sequences ...Error in file(con, "r") : cannot open the connection
> In addition: Warning message:
> In file(con, "r") :
>   cannot open file
>
> 'C:\Users\reichmanj\Documents\R\R-3.5.1\library\arulesSequences\misc\cspade1
> f8c30bb5148.out': No such file or directory
> >
> > cspade> as(s1, "data.frame")
> Error in .class1(object) : object 's1' not found
>
> But if I run in R (alone) it works just fine.  Any idea why I can't seem
> to
> run in RStudio or what I might have to do in RStudio?
>
> -Original Message-
> From: R-help  On Behalf Of Jeff Reichman
> Sent: Monday, October 1, 2018 4:48 PM
> To: r-help@r-project.org
> Subject: [R] arulesSequences Package
>
> R-Help Forum
>
> For anyone who has used the "arulesSequences" Package how do I start with
> raw data (*.csv file) and convert into a sequence object to run the cSPADE
> function.  The package example use zaki dataset which has already been
> converted.  When I state raw data how should I structure my csv file?
>
> A web example  or RPubs link?
>
> For example
> Seq_1   {E,B}, {C}, {T}
> Seq_2   {E,M}, {C}, {B,E,V,T}
> Seq_3   {E},{C},{T}
>
> Jeff Reichman
>



Re: [R] Logic operators...more than one??

2018-10-03 Thread Bert Gunter
Inline.
-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Oct 3, 2018 at 4:02 PM David Doyle  wrote:

> I'm sure this is a simple question but I'm not sure where to find the
> answer.
>

Good R tutorials abound on the internet. One resource is:
https://www.rstudio.com/online-learning/#r-programming

but there are tons more if you search.

>
> I want to remove some of the data.  For example when my Location column is
> MW-09, MW-10, or MW-11.
>
> It works fine if I ONLY list ONE of the locations as in:
>
> SampledWells <- MyData[ MyData$Location != "MW-09", ]
>
> But if I try to do more than one (as shown below), I don't get an error but
> I also don't get my SampledWells
> SampledWells <- MyData[ MyData$Location != "MW-09", "MW-10", ]
>
> Thoughts??
>

?"%in%"

as in:

SampledWells <- MyData[ !(MyData$Location %in% c("MW-09", "MW-10", "MW-11")),
]
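
Note that an %in% test on its own selects the listed wells; since the goal
stated in the question is to remove them, the test has to be negated with !.
A small self-contained sketch, where the contents of MyData are made up for
illustration:

```r
# Made-up data with the same column name as in the question.
MyData <- data.frame(
  Location = c("MW-08", "MW-09", "MW-10", "MW-11", "MW-12"),
  Value    = 1:5,
  stringsAsFactors = FALSE
)

wells_to_drop <- c("MW-09", "MW-10", "MW-11")

# %in% gives TRUE for rows whose Location is in the list;
# ! flips that, so only the other rows are kept.
SampledWells <- MyData[!(MyData$Location %in% wells_to_drop), ]
SampledWells
```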





> Thank you for your time
> David
>



Re: [R] Strange paradox

2018-10-05 Thread Bert Gunter
This list is about R programming. Statistics questions, which this is, are
generally off topic here. Try posting on a statistics list like
stats.stackexchange.com instead.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Oct 5, 2018 at 1:48 AM CHATTON Anne via R-help 
wrote:

> Hello,
>
> I am currently analysing two nested models using the same sample. Both the
> simpler model (Model 1 ~ x1 + x2) and the more complex model (Model 2 ~ x1
> + x2 + x3 + x4) yield the same adjusted R-square. Yet the p-value
> associated with the deviance statistic is highly significant (p=0.0047),
> suggesting that the confounders (x3 and x4) account for the prediction of
> the dependent variable.
>
> Does anyone have an explanation of this strange paradox?
>
> Thank you for any suggestion.
>
> Anne
>



Re: [R] How can I create a loop for this? Please help me

2018-10-05 Thread Bert Gunter
Two things:

1. As your query is about spatial data, you may do better posting (in plain
text, **not** html, which often gets mangled on these plain text lists) on
the r-sig-geo  list.

2. These lists can help, but do not replace, your obligation to do your own
homework. There are many good R tutorials on the web that you can look to
for help. Some recommendations, by no means all that you may wish to check,
can be found here:

https://www.rstudio.com/online-learning/#r-programming

There are also both tutorials and "Vignettes" (the latter in the packages
themselves) specifically for spatial data analysis and visualization.
Searching on "R tutorial spatial data" brought up several.
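
As a pointer for the looping part of the question quoted below, the usual
idiom is to collect the pieces in a list via lapply() instead of creating
numbered variables. The sketch uses plain data frames as stand-ins for
Sent2Field2@data and CoordsSent2_F2 (their real contents are not shown in the
post), so the sizes and values here are assumptions:

```r
# Stand-ins: 37 data columns (1 id column + 3 blocks of 12) and 5 points.
dat    <- as.data.frame(matrix(seq_len(5 * 37), nrow = 5))
coords <- data.frame(long = 1:5, lat = 6:10)

# First column of each 12-column block: 2, 14, 26, ...
starts <- seq(2, ncol(dat), by = 12)

pieces <- lapply(starts, function(s) {
  cols <- c(1, s:min(s + 11, ncol(dat)))   # column 1 plus one block
  out  <- cbind(coords, dat[, cols, drop = FALSE])
  # With the sp package loaded, the original script's
  # coordinates(out) <- ~long+lat step would go here.
  out
})
names(pieces) <- paste0("Sent2F2_", seq_along(pieces))

names(pieces)
```

Individual tables are then available as pieces$Sent2F2_1, pieces[["Sent2F2_2"]]
and so on, and the loop stops by itself when the columns run out.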

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Oct 5, 2018 at 8:12 AM Ivy Pieters  wrote:

> Hey there,
> I am very much a newbie in the R world. I have to work with R for my
> internship. I really hope that someone can help me out here, since it costs
> me ages to run and adjust the same script over and over again.
>
> I have a SpatialPointsDataFrame table (sent2field2@data) that I would
> like to split into different SpatialPointsDataFrame tables. The first table
> needs to consist out of [1:13] columns, the second table needs to consist
> of [,c(1,14:25)] the third needs to consist of [,c(1,26:37)]. The name of
> the tables need to be Sent2Field2_1, Sent2Field2_2, Sent2Field2_3, etc etc…
> So only 1,2,3,4 etc need to change within the name. This loop needs to go
> on until there are no columns left anymore in the dataset. Then the next
> step needs to add coordinates to the table (see script below). The name of
> the tables need to be Sent2F2_1, Sent2F2_2, Sent2F2_3, etc etc… So only
> 1,2,3,4 etc need to change within the name. The last step projects the
> table with the added coordinates into a SpatialPointsDataFrame. See the
> script below: (up to now I am adjusting the names manually and running it
> time after time, i am getting crazy, but I really don’t know how to make a
> loop) For now the separate steps are working fine. I really hope someone
> can help me out. Looking forward to anyones reply. Thank you already in
> advance.
>
> #field2
> Sent2Field2_1<-Sent2Field2@data[,1:13]
> Sent2F2_1<-cbind(CoordsSent2_F2,Sent2Field2_1)
> coordinates(Sent2F2_1) <- ~long+lat
>
> Sent2Field2_2<-Sent2Field2@data[,c(1,14:25)]
> Sent2F2_2<-cbind(CoordsSent2_F2,Sent2Field2_2)
> coordinates(Sent2F2_2) <- ~long+lat
>
> Sent2Field2_3<-Sent2Field2@data[,c(1,26:37)]
> Sent2F2_3<-cbind(CoordsSent2_F2,Sent2Field2_3)
> coordinates(Sent2F2_3) <- ~long+lat
>
> Sent2Field2_4<-Sent2Field2@data[,c(1,38:49)]
> Sent2F2_4<-cbind(CoordsSent2_F2,Sent2Field2_4)
> coordinates(Sent2F2_4) <- ~long+lat
>
> Sent2Field2_5<-Sent2Field2@data[,c(1,50:61)]
> Sent2F2_5<-cbind(CoordsSent2_F2,Sent2Field2_5)
> coordinates(Sent2F2_5) <- ~long+lat
>
> etc
> etc
>
>
>
>



Re: [R] Smoothing by group - Panel data - exponential/loess

2018-10-07 Thread Bert Gunter
1. This doesn't make much sense:

smoothdf <- data.frame(
  x = 1:n,
  y = as.vector(smooth(dat$g)),
  method = "smooth()"
)

What do you think the "method" invocation does (data.frame has no "method"
argument)?

2. Show us what you have tried -- it depends on what graphics system you
use. In lattice and ggplot2, for example, it's pretty basic. In base
graphics, see ?scatter.smooth, for example -- it would have to be called
for each separate group, of course. See e.g. ?tapply and friends for
processing by separate groups. I'd also suggest that you spend time with a
basic R tutorial or two, where this sort of thing is usually covered.

3. Numerous packages also do this: searching on "plot smooth curves by
groups" on rseek.org brought up lots of stuff.
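
To illustrate the "process by separate groups" idea, here is a sketch using
ave(), a relative of tapply() that returns a vector the same length as its
input. The data are the first two groups from the post; exp_smooth is a
made-up helper implementing simple exponential smoothing, and alpha = 0.5 is
an arbitrary choice. With only three time points per group, any smoother is
of limited use, so treat this purely as a pattern:

```r
# First two groups of the posted data.
dat <- data.frame(
  gid  = c(1L, 1L, 1L, 2L, 2L, 2L),
  year = c(2030L, 2050L, 2100L, 2030L, 2050L, 2100L),
  g    = c(9.24e-05, 0.0001133, 6.3e-05, 7.72e-10, 1.41e-09, 4.68e-09)
)

# Hypothetical helper: s[1] = x[1], s[t] = alpha * x[t] + (1 - alpha) * s[t-1].
exp_smooth <- function(x, alpha = 0.5) {
  Reduce(function(prev, cur) alpha * cur + (1 - alpha) * prev,
         x, accumulate = TRUE)
}

# ave() applies the helper within each gid and puts the results back in place.
# Rows are assumed to be ordered by year within each group.
dat$g_smooth <- ave(dat$g, dat$gid, FUN = exp_smooth)
dat
```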

Cheers,
Bert





Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Oct 7, 2018 at 10:59 AM Miluji Sb  wrote:

> Dear all,
>
> I have panel data for a series (g) for three time periods. The variable is
> likely autocorrelated. I would like to generate a new variable using
> exponential/loess smoothing by group (gid).
>
> For time series, I could have done something like this;
>
> smoothdf <- data.frame(
>   x = 1:n,
>   y = as.vector(smooth(dat$g)),
>   method = "smooth()"
> )
>
> But confused about the panel data setting. Apologies if this a stat
> question along with an R query. Any help will be greatly appreciated.
>
> ### data
> structure(list(gid = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L,
> 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L,
> 9L), year = c(2030L, 2050L, 2100L, 2030L, 2050L, 2100L, 2030L,
> 2050L, 2100L, 2030L, 2050L, 2100L, 2030L, 2050L, 2100L, 2030L,
> 2050L, 2100L, 2030L, 2050L, 2100L, 2030L, 2050L, 2100L, 2030L,
> 2050L, 2100L), g = c(9.24e-05, 0.0001133, 6.3e-05, 7.72e-10,
> 1.41e-09, 4.68e-09, 0.0001736, 0.0002286, 0.0001541, 1.87e-15,
> 3.76e-15, 8.52e-15, 0.0001822, 0.0002391, 0.0001348, 3.69e-06,
> 8.11e-06, 8.63e-06, 3.41e-06, 7.32e-06, 7.18e-06, 8.47e-07, 1.83e-06,
> 1.84e-06, 1.13e-06, 2.37e-06, 2.15e-06)), class = "data.frame", row.names =
> c(NA,
> -27L))
>
>
> Sincerely,
>
> Milu
>



Re: [R] cluster samples using self organizing map in R

2018-10-10 Thread Bert Gunter
Search!

the rseek.org site gives many hits for "self organizing maps", including
the som package among others.
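
For the code quoted below, one possible route from unit clusters to sample
clusters is to look up each sample's winning unit. The sketch assumes the
kohonen package is installed and that the fitted object stores those winning
units in its unit.classif component, as kohonen objects do to my knowledge;
the seed is arbitrary:

```r
if (requireNamespace("kohonen", quietly = TRUE)) {
  set.seed(1)   # SOM training is random; fixed only for repeatability
  iris.sc  <- scale(iris[, 1:4])
  iris.som <- kohonen::som(iris.sc,
                           grid = kohonen::somgrid(xdim = 3, ydim = 3,
                                                   topo = "hexagonal"),
                           rlen = 100, alpha = c(0.05, 0.01))

  # Cluster the 9 codebook vectors, as in the original post.
  unit_cluster <- cutree(hclust(dist(iris.som$codes[[1]])), 3)

  # Each sample inherits the cluster of the unit it was mapped to.
  sample_cluster <- unname(unit_cluster[iris.som$unit.classif])

  table(sample_cluster, iris$Species)
}
```

This gives one cluster label per sample (150 in total) rather than one per
grid unit.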

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Oct 9, 2018 at 11:14 PM A DNA RNA  wrote:

> Dear All,
>
> How can I use Self Organizing Map (SOM) results to cluster samples? I have
> tried following but this gives me only the clustering of grids, while I
> want to cluster (150) samples:
>
> library(kohonen)
> iris.sc <- scale(iris[, 1:4])
> iris.som <- som(iris.sc, grid=somgrid(xdim = 3, ydim=3, topo="hexagonal"),
>rlen=100, alpha=c(0.05,0.01))
> ##hierarchical clustering
> groups <- 3
> iris.hc <- cutree(hclust(dist(iris.som$codes[[1]])), groups)
> iris.hc
> #V1 V2 V3 V4 V5 V6 V7 V8 V9
> #1  1  2  1  1  2  3  3  2
>
>
> Can anyone help me with this please?
> --
> Tina
>



Re: [R] Code "tags"

2018-10-10 Thread Bert Gunter
"get organized around R programming" is rather vague.  Nor do I know what
you mean by "tagging" code snippets.

A standard answer would be to write your "code" as documented functions in
a package (e.g. "LeslieMisc"). RStudio -- a wholly separate software
product -- has various tools that may also be relevant. RMarkdown allows
you to write documents in which you embed executable R code, which is
useful for producing "vignettes" to illustrate how code works. Roxygen is
useful for producing package docs with a minimum of pain. Any of these --
and others, like just writing inline comments -- may be useful.
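
To make the "documented functions" idea concrete, here is a small sketch of a snippet stored as a roxygen-commented function. The function name and the concept terms are hypothetical; the assumption is that roxygen2's @concept tag is used to attach free-form index terms, which would play the role of "tags".

```r
#' Running mean of a numeric vector
#'
#' @param x numeric vector
#' @param k window width (positive integer, at most length(x))
#' @return numeric vector of length(x) - k + 1
#' @concept smoothing
#' @concept loops
#' @export
running_mean <- function(x, k = 3) {
  stopifnot(is.numeric(x), k >= 1, k <= length(x))
  cs <- cumsum(c(0, x))                      # cumulative sums with a leading 0
  (cs[(k + 1):length(cs)] - cs[1:(length(cs) - k)]) / k
}

running_mean(1:5, k = 3)
# [1] 2 3 4
```

Once such functions live in an installed package, help.search() (or ??) can look them up by title or concept, so one file no longer has to be saved under several names.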

-- Bert



On Wed, Oct 10, 2018 at 12:43 PM Leslie Rutkowski <
leslie.rutkow...@gmail.com> wrote:

> Hi all,
>
> I'm trying to get organized about my R programming and I'm looking for a
> way to "tag" snippets of handy code that I go back to time and again. At
> the moment, I just plop .R files into a folder and rely on the file name to
> guide me. In desperation, I've taken to saving the same chunk as multiple
> .R files with different "tags" (e.g. "Loops with titles.R" = "CLT
> example.R").
>
> Any tips would be appreciated.
>
> Thanks,
> Leslie
>



Re: [R] Genuine relative paths with R

2018-10-10 Thread Bert Gunter
I haven't followed this discussion closely, so this may be off base, but in
response to your point 2., note that you can set the working directory via
setwd() in your .Rprofile file. Of course, users can always find the
working directory by calling getwd(), so I'm not sure what
you mean here. However, as I said, if my comments are useless, please
ignore them and do not waste time responding.
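
For reference, a runnable sketch of the two approaches Gabor describes in the quoted message (source() with chdir = TRUE, and a per-project R option). The folder, file, and option names are made up for illustration, and a temporary directory stands in for a real project.

```r
# Hypothetical project folder containing one script
proj <- file.path(tempdir(), "demo_project")
dir.create(proj, showWarnings = FALSE)
writeLines("script_dir <- getwd()", file.path(proj, "script.R"))

# Approach 1: chdir = TRUE evaluates the script with its own folder as working directory
old_wd <- getwd()
env <- new.env()
source(file.path(proj, "script.R"), chdir = TRUE, local = env)
same_dir    <- normalizePath(env$script_dir) == normalizePath(proj)
wd_restored <- getwd() == old_wd           # the caller's directory is put back afterwards

# Approach 2: the project root comes from an option, normally set in .Rprofile
options(demo.root = proj)                  # hypothetical option name
root <- getOption("demo.root", ".")        # falls back to the current directory
data_file <- file.path(root, "data", "input.csv")
```

In both cases the script itself only ever builds paths relative to its own folder or to root, so no absolute path is hard-coded in it.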

Cheers,
Bert



On Wed, Oct 10, 2018 at 6:53 AM Olivier GIVAUDAN <
olivier_givau...@hotmail.com> wrote:

> Hi Gabor,
>
>   1.  By definition the relative path (I'm excluding the absolute path
> solution for obvious reasons) depends on the current working directory:
> What if my R script is not located along this current working directory? It
> won't work.
>   2.  What should I write as an option in this .Rprofile file? An absolute
> path to the project's root? Plus I don't want the users to choose their
> working directory: this technicality should be kept hidden from them and be
> automatic.
>
> Best regards,
>
> Olivier
>
> From: Gabor Grothendieck
> Sent: Saturday, 6 October 2018 23:33
> To: olivier_givau...@hotmail.com; r-help@r-project.org
> Subject: Re: [R] Genuine relative paths with R
>
> 1. Assuming you are starting the script from within R, if you want to
> keep all the files used by the script together with the script itself
> then just source the script using the absolute or relative path to the
> script using source(..., chdir = TRUE) as shown below and the script
> will run in the directory containing the script. We used an absolute
> path below but a relative path will work too.  In either case the
> script itself will not need to use absolute paths and it is portable
> to other machines.
>
>   source("/path/to/script.R", chdir = TRUE)
>
> If your script is on your PATH then this would work:
>
>   source(Sys.which("script.R"), chdir = TRUE)
>
> 2. Another approach is to define an R option, say root, using the R
> options() function to define the root directory of your project.  You
> can have a different R option for each project.  Place the options()
> statements to set these R options for your various projects in your
> .Rprofile, say, and in the script use:
>
> root <- getOption("root", ".")
>
> to cause it to retrieve the value of the R option root if it is
> defined and use the current directory otherwise.  Use a different name
> for each project. If the user does not define the R option root it
> will be up to them to change directory first.  Again there will be no
> use of absolute paths in the script itself and it is portable to other
> machines.
>
> What is particularly convenient about this is that it documents where
> all the projects are on the machine right in the .Rprofile file so
> one always knows where they are.
>
> On Sat, Oct 6, 2018 at 8:25 AM Olivier GIVAUDAN
>  wrote:
> > Dear R users,
> >
> > I would like to work with genuine relative paths in R for obvious
> reasons: if I move all my scripts related to some project as a whole to
> another location of my computer or someone else's computer, I want my
> scripts to continue to run seamlessly.
> >
> > What I mean by "genuine" is that it should not be necessary to hardcode
> one single absolute path (making the code obviously not "portable" - to
> another place - anymore).
> >
> > For the time being, I found the following related posts, unfortunately
> never conclusive or even somewhat off-topic:
> >
> https://stackoverflow.com/questions/1815606/rscript-determine-path-of-the-executing-script
> >
> https://stackoverflow.com/questions/47044068/get-the-path-of-current-script/47045368
> >
> http://r.789695.n4.nabble.com/Script-auto-detecting-its-own-path-td2719676.html
> >
> > So I found 2 workarounds, more or less satisfactory:
> >
> >
> >   1.  Either create a variable "ScriptPath" in the first lines of each
> of my R scripts and run a batch (or shell, etc.) to replace every single
> occurrence of "ScriptPath <-" by "ScriptPath <- [Absolute path of the R
> script]" in all the R scripts located in the folder (and possibly
> subfolders) of the batch file.
> >   2.  Or create an R project file with RStudio and use the package
> "here" to get the absolute path of the R project file and put all the R
> scripts related to this project in the R project directory, as often
> recommended.
> >
> > But I am really wondering why R doesn't have (please tell me if I'm
> wrong) this basic feature as many other languages have it (batch, shell, C,
> LaTeX, SAS with macro-variables, etc.)?
> > Do you know whether the language will have this kind of function in a
> near future? What are the obstacles / what is the reasoning for not having
> it already?
> >
> > Do you know other workarounds?
> >
> > Best regards,
> >
> > Olivier
> >

Re: [R] Set attributes for object known by name

2018-10-10 Thread Bert Gunter
Well, it can be done without a temporary variable, but I'm not sure you
would want to.
Anyway...

## simplified example
> a <- 1
> vname <- "a"
> eval(substitute(attr(x,"b") <- "hi", list( x = as.name(vname))))
> a
[1] 1
attr(,"b")
[1] "hi"
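
If a temporary variable is acceptable, the get()/assign() route from Peter's quoted message can be wrapped in a small helper. This is only a sketch; set_attr_by_name is a hypothetical name, not an existing R function.

```r
# Set attribute `which` to `value` on the object whose name is stored in `vname`
set_attr_by_name <- function(vname, which, value, envir = parent.frame()) {
  x <- get(vname, envir = envir)       # fetch the object by name
  attr(x, which) <- value              # modify the local copy
  assign(vname, x, envir = envir)      # write it back under the same name
  invisible(x)
}

varname <- "myvarname"
assign(varname, data.frame(A = 1:5, B = 2:6))
set_attr_by_name(varname, "NewAtt2", "MyAtt2")
attr(get(varname), "NewAtt2")
# [1] "MyAtt2"
```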


Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Oct 10, 2018 at 9:48 PM Peter Langfelder 
wrote:

> oops, I think the right code would be
>
> x = get(varname)
> attr(x, "foo") = "bar"
> assign(varname, x)
>
> On Wed, Oct 10, 2018 at 9:30 PM Peter Langfelder <
> peter.langfel...@gmail.com>
> wrote:
>
> > I would try something like
> >
> > x = get(myvarname)
> > attr(x, "foo") = "bar"
> > assign(varname, x)
> >
> > HTH,
> >
> > Peter
> >
> > On Wed, Oct 10, 2018 at 9:15 PM Marc Girondot via R-help <
> > r-help@r-project.org> wrote:
> >
> >> Hello everybody,
> >>
> >> Does anyone have a solution for setting an attribute when the variable is
> >> known only by name?
> >>
> >> Thanks a lot
> >>
> >> Marc
> >>
> >> Let's look at this example:
> >>
> >> # The variable name is stored as characters.
> >>
> >> varname <- "myvarname"
> >> assign(x = varname, data.frame(A=1:5, B=2:6))
> >> attributes(myvarname)
> >>
> >> $names
> >> [1] "A" "B"
> >> $class
> >> [1] "data.frame"
> >> $row.names
> >> [1] 1 2 3 4 5
> >>
> >> # perfect
> >>
> >> attributes(get(varname))
> >>
> >> # It works also
> >>
> >> $names
> >> [1] "A" "B"
> >> $class
> >> [1] "data.frame"
> >>
> >> $row.names
> >> [1] 1 2 3 4 5
> >>
> >> attributes(myvarname)$NewAtt <- "MyAtt"
> >>
> >> # It works
> >>
> >> attributes(get(varname))$NewAtt2 <- "MyAtt2"
> >> Error in attributes(get(varname))$NewAtt2 <- "MyAtt2" :
> >>   could not find function "get<-"
> >>
> >> # Error...
> >>



Re: [R] Package updates for new versions

2018-10-11 Thread Bert Gunter
?maintainer

in R accesses the package description file to provide maintainer info. No
need to fool around with git or other software development repositories.
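
A short illustration, using the stats package only because it ships with every R installation; substitute the name of any installed package.

```r
# maintainer() reads the Maintainer field of an installed package's DESCRIPTION file
m <- maintainer("stats")
m

# The same field is available through packageDescription()
packageDescription("stats")$Maintainer
```

For a package that is not installed, maintainer() returns NA rather than an address.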

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Oct 11, 2018 at 9:44 AM Amit Mittal 
wrote:

> You can contact them as Marc suggested by opening an account at bit
> (bitbucket/Atlassian) or git (github) Most prefer strongly to have a
> primary at git before going to Atlassian other code checkers so you can
> discover the package team there esp if any future development is in the
> pipes.
>
> Don't worry about it you dont have to be a founder or developer to have an
> account there :)
>
> BR
>
> On Thu, Oct 11, 2018 at 9:11 PM Rich Shepard 
> wrote:
>
> >This is a question to better my understanding of the relationship
> > between
> > core R versions and packages that work with them. It's not a complaint or
> > criticism.
> >
> >Installed here is R-3.5.1. There are two packages that allow
> integration
> > of R and GRASS that are not yet available for 3.5.1: rpy2 and spgrass7.
> >
> >Appreciating that package maintainers have day jobs that take priority
> > over volunteer package maintenance, I ask only for thoughts on when
> copies
> > of those two packages _might_ be available for 3.5.1.
> >
> > Regards,
> >
> > Rich
> >
> >
> --
>
> __
>
> Amit Mittal
> Pursuing Ph.D. in Finance and Accounting
> Indian Institute of Management, Lucknow
> Visit my SSRN author page:
> http://ssrn.com/author=2665511
> * Top 10% Downloaded Author on SSRN
> Mob: +91 7525023664
>
> This message has been sent from a mobile device. I may contact you again.
>
> _
>



Re: [R] Wavelet Package for Time Series

2018-10-13 Thread Bert Gunter
I cannot provide any specific info, but as a matter of policy, packages on
CRAN *must* meet their required maintenance standards (compatibility with
current R version, run on different OS's, etc.)  or they are removed from
CRAN. This is not necessarily so for packages on other repos, e.g. github.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Oct 13, 2018 at 1:17 PM mikorym via R-help 
wrote:

> Hi R-help
>
> Are there any R users that can tell me which of the wavelet packages are
> the "best" to use? By this I mean ones that are being maintained, or e.g.,
> work well with ggplot2 or otherwise have specific advantages to use.
>
> I have seen for example that WaveThresh has an accompanying book.
>
> In terms of being up to date, however, it seems like python may be a
> better option. Although, sometimes more recent does not mean better.
>
> My first use case would be to model time series data.
>
> Thank you in advance,
> Phillip
>



Re: [R] limit bar graph output

2018-10-14 Thread Bert Gunter
If I understand correctly, just subset your sorted data.

e.g. :

x <- runif(50)
##  50 unsorted values

sort(x, decreasing = TRUE)[1:10]
## the 10 biggest


-- Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Oct 14, 2018 at 7:13 PM Jeff Reichman 
wrote:

> R-Help Forum
>
> I'm using the following code to reorder (from highest to lowest) my miRNA
> counts.  But there are 500 plus and I only need the first (say) 15-20.  How
> do I limit ggplot to only the first 20 miRNA counts
>
> ggplot(data = corr.m, aes(x = reorder(miRNA, -value), y = value, fill =
> variable)) +
>   geom_bar(stat = "identity")
>
> Jeff
>



Re: [R] Finding unique terms

2018-10-15 Thread Bert Gunter
Here is a base R solution:
"dat" is the data frame as in Robert's solution.

> aggregate(dat[,3:6], by= dat[1], FUN = sum, na.rm = TRUE)
  STUDENT_ID   PO1M PO1T PO2M PO2T
1    AA15285 287.80  350   37   50
2    AA15286 240.45  330   41   50
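
Another base R option is rowsum(), sketched here on a trimmed copy of the posted data (only the ID and the four score columns are kept, so columns are selected by name rather than by position).

```r
dat <- data.frame(
  STUDENT_ID = rep(c("AA15285", "AA15286"), times = c(6, 5)),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)

# One row per student, one column per score; NAs are dropped before summing
sums <- rowsum(dat[, c("PO1M", "PO1T", "PO2M", "PO2T")],
               group = dat$STUDENT_ID, na.rm = TRUE)
sums
```

The student IDs end up as row names of the result instead of as a column.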

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Oct 15, 2018 at 6:42 PM Robert Baer  wrote:

>
>
> On 10/11/2018 5:12 PM, roslinazairimah zakaria wrote:
> > Dear r-users,
> >
> > I have this data:
> >
> > structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
> > 2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
> >  COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
> >  4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
> >  "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class =
> > "factor"),
> >  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
> >  82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
> >  100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
> >  41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
> >  X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
> >  NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
> > "COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class =
> > "data.frame", row.names = c(NA,
> > -11L))
> >
> > I want to combine the same Student ID and add up all the values for PO1M,
> > PO1T,...,PO2T obtained by the same ID.
> >
> > How do I do that?
> > Thank you for any help given.
> >
> oops!  Forgot to clean up after my cut and paste. Solution with dplyr
> looks like this:
> # Create sums by student ID
> library(dplyr)
> dat %>%
>group_by(STUDENT_ID) %>%
>summarize(sum.PO1M = sum(PO1M, na.rm = TRUE),
>  sum.PO1T = sum(PO1T, na.rm = TRUE),
>  sum.PO2M = sum(PO2M, na.rm = TRUE),
>  sum.PO2T = sum(PO2T, na.rm = TRUE))
>



Re: [R] Fw: inconsistency in pbmclapply...

2018-10-16 Thread Bert Gunter
As I think you have been told before, most attachments don't make it through
the mail server. Use ?dput instead to include data.
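
A small sketch of what that looks like in practice. LYGH_demo is a made-up stand-in for the real object; for a large list, dput() of the first element or two is usually enough.

```r
LYGH_demo <- list(c(0.7, 0.4, 0.3), c(1.55, 0.15))   # made-up stand-in

# dput() writes R code that recreates the object; capture it as text
txt <- capture.output(dput(LYGH_demo))
cat(txt, sep = "\n")          # this text can be pasted into a plain-text email

# A reader rebuilds the object by evaluating the pasted text
rebuilt <- eval(parse(text = txt))
```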

Also, post in plain text, not html.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Oct 16, 2018 at 8:03 AM akshay kulkarni 
wrote:

> dear members,
>  I am attaching the object LYGH as used in my
> expression with pbmclapply so that you can recreate the expression:
>
> LYG1 <- pbmclapply(LYGH, FUN = auto.arima, mc.cores = detectCores())
>
> very many thanks for your help and time.
> yours sincerely,
> AKSHAY M KULKARNI
> 
> From: akshay kulkarni 
> Sent: Tuesday, October 16, 2018 2:21 PM
> To: R help Mailing list
> Subject: Fw: inconsistency in pbmclapply...
>
>
> dear members,
>  however, "ts must have more than one observation"
> error is only found for the first entry of LYG1 ( LYG1[[20]], LYG1[[200]],
> [[elided Hotmail spam]]
> 
> From: R-help  on behalf of akshay kulkarni <
> akshay...@hotmail.com>
> Sent: Tuesday, October 16, 2018 1:57 PM
> To: R help Mailing  list
> Subject: [R] inconsistency in pbmclapply...
>
> dear members,
>  I am using parallel processing with
> pbmclapply. below is the code. LYG1[[1]] got from pbmclapply is not the
> same as LYG1[[1]] got from direct application of auto.arima. what may be
> wrong?
>
>   debug at #20: LYG1 <- pbmclapply(LYGH, FUN =
> auto.arima, mc.cores = detectCores())
> Browse[2]>
>   |=   |  68%, ETA
> 00:21
> debug at #21: LYG2 <- pbmclapply(LYGH, FUN = ets, mc.cores =
> detectCores())
> Browse[2]> LYG1[[1]]
> [1] "Error in ts(x) : 'ts' object must have one or more observations\n"
> attr(,"class")
> [1] "try-error"
> attr(,"condition")
> 
> Warning message:
> In pbmclapply(LYGH, FUN = auto.arima, mc.cores = detectCores()) :
>   scheduled cores encountered errors in user code
> Browse[2]> LYGH[[1]]
>  [1] 0.7 0.4 0.3 0.15000 0.25000
> 0.95000
>  [7] 1.0 0.3 0.65000 0.2 0.6
> 0.1
> [13] 0.001412873 1.55000 0.15000 0.3 0.45000
> 0.35000
> [19] 0.15000 2.35000 0.25000 0.1 3.7
> 3.95000
> [25] 3.05000 0.9 0.4 1.05000 1.1
> 1.95000
> [31] 2.0 0.65000 0.7 0.25000 5.25000
> 0.8
> [37] 0.001412873
> Browse[2]> auto.arima(LYGH[[1]])
> Series: LYGH[[1]]
> ARIMA(0,1,0)
>
> sigma^2 estimated as 2.303:  log likelihood=-66.1
> AIC=134.2   AICc=134.31   BIC=135.78
>
> very many thanks for your time and effort.
> yours sincerely,
> AKSHAY M KULKARNI
>



Re: [R] Matching multiple search criteria (Unlisting a nested dataset, take 2)

2018-10-16 Thread Bert Gunter
The problem wasn't the data tibbles. You posted in html -- which you were
explicitly warned against -- and that corrupted your text (e.g. some quotes
became "smart quotes", which cannot be properly cut and pasted into R).

Bert


On Tue, Oct 16, 2018 at 2:47 PM Nathan Parsons 
wrote:

> Argh! Here are those two example datasets as data frames (not tibbles).
> Sorry again. This apparently is just not my day.
>
>
> th <- structure(list(status_id = c("x1047841705729306624", "x1046966595610927105",
> "x1047094786610552832", "x1046988542818308097", "x1046934493553221632",
> "x1047227442899775488"), created_at = c("2018-10-04T13:31:45Z",
> "2018-10-02T03:34:22Z", "2018-10-02T12:03:45Z", "2018-10-02T05:01:35Z",
> "2018-10-02T01:26:49Z", "2018-10-02T20:50:53Z"), text = c(
> "Technique is everything with olympic lifts ! @ Body By John https://t.co/UsfR6DafZt",
> "@Subtronics just went back and rewatched ur FBlice with ur CDJs and let me tell you man. You are the fucking messiah",
> "@ic4rus1 Opportunistic means short-game. As in getting drunk now vs. not being hung over tomorrow vs. not fucking up your life ten years later.",
> "I tend to think about my dreams before I sleep.",
> "@MichaelAvenatti @SenatorCollins So,  if your client was in her 20s, attending parties with teenagers, doesn't that make her at the least immature as hell, or at the worst, a pedophile and a person contributing to the delinquency of minors?",
> "i wish i could take credit for this"), lat = c(43.6835853, 40.284123,
> 37.7706565, 40.431389, 31.1688935, 33.9376735), lng = c(-70.3284118,
> -83.078589, -122.4359785, -79.9806895, -100.0768885, -118.130426
> ), county_name = c("Cumberland County", "Delaware County", "San Francisco County",
> "Allegheny County", "Concho County", "Los Angeles County"), fips = c(23005L,
> 39041L, 6075L, 42003L, 48095L, 6037L), state_name = c("Maine",
> "Ohio", "California", "Pennsylvania", "Texas", "California"),
> state_abb = c("ME", "OH", "CA", "PA", "TX", "CA"), urban_level = c("Medium Metro",
> "Large Fringe Metro", "Large Central Metro", "Large Central Metro",
> "NonCore (Nonmetro)", "Large Central Metro"), urban_code = c(3L,
> 2L, 1L, 1L, 6L, 1L), population = c(277308L, 184029L, 830781L,
> 1160433L, 4160L, 9509611L)), class = "data.frame", row.names = c(NA,
> -6L))
>
>
> st <- structure(list(terms = c("me abused depressed", "me hurt depressed",
> "feel hopeless depressed", "feel alone depressed", "i feel helpless",
> "i feel worthless")), row.names = c(NA, -6L), class = c("tbl_df",
> "tbl", "data.frame"))
>
> On Tue, Oct 16, 2018 at 2:39 PM Nathan Parsons  >
> wrote:
>
> > Thanks all for your patience. Here’s a second go that is perhaps more
> > explicative of what it is I am trying to accomplish (and hopefully in
> plain
> > text form)...
> >
> >
> > I’m using the following packages: tidyverse, purrr, tidytext
> >
> >
> > I have a number of tweets in the following form:
> >
> >
> > th <- structure(list(status_id = c("x1047841705729306624",
> > "x1046966595610927105",
> >
> > "x1047094786610552832", "x1046988542818308097", "x1046934493553221632",
> >
> > "x1047227442899775488"), created_at = c("2018-10-04T13:31:45Z",
> >
> > "2018-10-02T03:34:22Z", "2018-10-02T12:03:45Z", "2018-10-02T05:01:35Z",
> >
> > "2018-10-02T01:26:49Z", "2018-10-02T20:50:53Z"), text = c("Technique is
> > everything with olympic lifts ! @ Body By John https://t.co/UsfR6DafZt",
> >
> > "@Subtronics just went back and rewatched ur FBlice with ur CDJs and let
> > me tell you man. You are the fucking messiah",
> >
> > "@ic4rus1 Opportunistic means short-game. As in getting drunk now vs. not
> > being hung over tomorrow vs. not fucking up your life ten years later.",
> >
> > "I tend to think about my dreams before I sleep.", "@MichaelAvenatti
> > @SenatorCollins So, if your client was in her 20s, attending parties with
> > teenagers, doesn't that make her at the least immature as hell, or at the
> > worst, a pedophile and a person contributing to the delinquency of
> minors?",
> >
> > "i wish i could take credit for this"), lat = c(43.6835853, 40.284123,
> >
> > 37.7706565, 40.431389, 31.1688935, 33.9376735), lng = c(-70.3284118,
> >
> > -83.078589, -122.4359785, -79.9806895, -100.0768885, -118.130426
> >
> > ), county_name = c("Cumberland County", "Delaware County", "San Francisco
> > County",
> >
> > "Allegheny County", "Concho County", "Los Angeles County"), fips =
> > c(23005L,
> >
> > 39041L, 6075L, 42003L, 48095L, 6037L), state_name = c("Maine",
> >
> > "Ohio", "California", "Pennsylvania", "Texas", "California"),
> >
> > state_abb = c("ME", "OH", "CA", "PA", "TX", "CA"), urban_level =
> c("Medium
> > Metro",
> >
> > "Large Fringe Metro", "Large Central Metro", "Large Central Metro",
> >
> > "NonCore (Nonmetro)", "Large Central Metro"), urban_code = c(3L,
> >
> > 2L, 1L, 1L, 6L, 1L), population = c(277308L, 184029L, 830781L,
> >
> > 1160433L, 4160L

Re: [R] Matching multiple search criteria (Unlisting a nested dataset, take 2)

2018-10-16 Thread Bert Gunter
OK, as no one else has offered a solution, I'll take a whack at it.

Caveats: This is a brute force attempt using R's basic regular expression
engine. It is inelegant and barely tested, so likely to be at best
incomplete and buggy, and at worst, incorrect. But maybe Nathan or someone
else on the list can fix it up. So if (when) it breaks, complain on the
list to give someone (almost certainly not me) the opportunity.

The basic idea is that the tweets are just character strings and the search
phrases are just character vectors all of whose elements must match
"appropriately" -- i.e. they must match whole words -- in the character
strings. So my desired output from the code is a list indexed by the search
phrases, each of whose components is a logical vector whose length is the
number of tweets and whose elements are TRUE iff all the words in the search
phrase match somewhere in the tweet.

Here's the code (using the data Nathan provided):

> words <- sapply(st[[1]],strsplit,split = " +" )
## convert the phrases to a list of character vectors of the words
## Result:
> words
$`me abused depressed`
[1] "me"        "abused"    "depressed"

$`me hurt depressed`
[1] "me"        "hurt"      "depressed"

$`feel hopeless depressed`
[1] "feel"      "hopeless"  "depressed"

$`feel alone depressed`
[1] "feel"      "alone"     "depressed"

$`i feel helpless`
[1] "i"        "feel"     "helpless"

$`i feel worthless`
[1] "i"         "feel"      "worthless"

> expand.words <- function(z) lapply(z, function(x) paste0(c("^ *", " ", " "), x, c(" ", " ", " *$")))
## function to create regexes for words when they are at the beginning,
## middle, or end of tweets

> wordregex <- lapply(words,expand.words)
##Result
## too lengthy to include
##
> tweets <- th$text
##extract the tweets
> findin <- function(x,y)
   ## x is a vector of regex patterns
   ## y is a character vector
   ## value = vector, vec, with length(vec) == length(y) and vec[i] == TRUE
   ## iff any of x matches y[i]
{ apply(sapply(x,function(z)grepl(z,y)), 1,any)
}

## add a matching "tweet" to the tweet vector:
> tweets <- c(tweets," i  worthless yxxc ght feel")

> ans <- lapply(wordregex, function(z) apply(sapply(z, function(x) findin(x, tweets)), 1, all))
## Result:
> ans
$`me abused depressed`
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE

$`me hurt depressed`
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE

$`feel hopeless depressed`
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE

$`feel alone depressed`
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE

$`i feel helpless`
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE

$`i feel worthless`
[1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE

## None of the tweets match any of the phrases except for the last tweet
## that I added.

## Note: you need to add capabilities to handle upper and lower case. See,
## e.g. ?casefold
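
A more compact variant of the same idea, offered as an untuned sketch: it swaps the three hand-built patterns for the regex word boundary \b and handles case via ignore.case = TRUE. phrase_in_tweets is a hypothetical helper, the demo tweets are made up, and the search words are assumed to contain no regex metacharacters. Unlike the space-delimited patterns above, \b also treats punctuation as a word boundary.

```r
phrase_in_tweets <- function(phrase, tweets) {
  words <- strsplit(phrase, " +")[[1]]
  # One logical column per word: does the word occur as a whole word in each tweet?
  hits <- vapply(words,
                 function(w) grepl(paste0("\\b", w, "\\b"), tweets,
                                   ignore.case = TRUE, perl = TRUE),
                 logical(length(tweets)))
  hits <- matrix(hits, nrow = length(tweets))   # keep a matrix even for a single tweet
  rowSums(hits) == length(words)                # TRUE iff every word matched
}

demo_tweets <- c("I feel so WORTHLESS today",
                 "feeling worthless",
                 "i wish i could take credit for this")
phrase_in_tweets("i feel worthless", demo_tweets)
# [1]  TRUE FALSE FALSE

# All phrases at once, giving a list indexed by phrase as above:
# ans2 <- lapply(setNames(st[[1]], st[[1]]), phrase_in_tweets, tweets = th$text)
```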

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Oct 16, 2018 at 3:03 PM Bert Gunter  wrote:

> The problem wasn't the data tibbles. You posted in html -- which you were
> explicitly warned against -- and that corrupted your text (e.g. some quotes
> became "smart quotes", which cannot be properly cut and pasted into R).
>
> Bert
>
>
> On Tue, Oct 16, 2018 at 2:47 PM Nathan Parsons 
> wrote:
>
>> Argh! Here are those two example datasets as data frames (not tibbles).
>> Sorry again. This apparently is just not my day.
>>
>>
>> th <- structure(list(status_id = c("x1047841705729306624",
>> "x1046966595610927105",
>>
>> "x1047094786610552832", "x1046988542818308097", "x1046934493553221632",
>>
>> "x1047227442899775488"), created_at = c("2018-10-04T13:31:45Z",
>>
>> "2018-10-02T03:34:22Z", "2018-10-02T12:03:45Z", "2018-10-02T05:01:35Z",
>>
>> "2018-10-02T01:26:49Z", "2018-10-02T20:50:53Z"), text = c("Technique is
>> everything with olympic lifts ! @ Body By John https://t.co/UsfR6DafZt",
>>
>> "@Subtronics just went back and rewatched ur FBlice with ur CDJs and let
>> me
>> tell you man. You are the fucking messiah",
>>
>> "@ic4rus1 Opportunistic means short-game. As in getting drunk now vs. not
>> being hung over tomorrow vs. not fucking up your life ten years later.",
>>
>> &quo

Re: [R] Simulating data with nested random effects

2018-10-17 Thread Bert Gunter
This would almost certainly fit better on the r-sig-mixed-models
list, rather than here. You are more likely to get authoritative responses
about this specialized statistical topic there.

Also -- these are **plain text** mailing lists. Please do not post in html.
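
For readers comparing against the quoted function: in a model of the form y ~ time + (1 | site/subject), the site effect is conventionally drawn once per site and the subject effect once per subject, and both are then indexed out to the observation level. The sketch below shows only that structure; all numbers and names are made up, and it is not offered as a corrected version of the poster's function.

```r
set.seed(42)
J <- 4; K <- 5; L <- 3                     # sites, subjects per site, time points (made up)
sigma.site <- 2; sigma.subj <- 1; sigma.y <- 0.5   # made-up standard deviations

subject <- rep(1:(J * K), each = L)        # subject id of each observation
site_of_subject <- rep(1:J, each = K)      # each subject belongs to exactly one site
site <- site_of_subject[subject]           # site id of each observation
time <- rep(seq(0, 2, length.out = L), J * K)

a.site <- rnorm(J, 0, sigma.site)          # one random intercept per site
a.subj <- rnorm(J * K, 0, sigma.subj)      # one random intercept per subject
y <- rnorm(J * K * L, 10 + a.site[site] + a.subj[subject] + 1.5 * time, sigma.y)

fake <- data.frame(y, time, subject = factor(subject), site = factor(site))
```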

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Oct 17, 2018 at 5:45 PM Peter Wijeratne 
wrote:

> Hello,
>
> I would like to simulate nested data, where my mixed effects model fitted
> to real data has the form:
>
> y ~ time + (1 | site/subject)
>
> I then take the hyper-parameters from this model to simulate fake data,
> using this function:
>
> create_fake <- function(J,K,L,HP,t){
>     # J : number of sites
>     # K : number of subjects / site
>     # L : number of years
>     # HP: hyperparameters from fit, y ~ time + (1 | site/subject)
>     # t : fractional effectiveness of treatment
>     time <- rep(seq(0,2,length=L), J*K)
>     subject <- rep(1:(J*K), each=L)
>     site <- sample(rep (1:J, K))
>     site1 <- factor(site[subject])
>     treatment <- sample(rep (0:1, J*K/2))
>     treatment1 <- treatment[subject]
>     # time coefficient
>     g.0.true <- as.numeric( HP['g.0.true'] )
>     # treatment coefficient
>     g.1.true <- -as.numeric(t)*g.0.true
>     # intercept
>     mu.a.true <- as.numeric( HP['mu.a.true'] )
>     # fixed effects
>     b.true <- (g.0.true + g.1.true*treatment)
>     # random effects
>     sigma.y.true <- as.numeric( HP['sigma.y.true'] ) # residual std dev
>     sigma.a.true <- as.numeric( HP['sigma.a.true'] ) # site std dev
>     sigma.a0.true <- as.numeric( HP['sigma.a0.true'] ) # site:person std dev
>     a0.true <- rnorm(J*K, 0, sigma.a0.true)
>     a.true <- rnorm(J*K, mu.a.true + a0.true, sigma.a.true)
>     y <- rnorm(J*K*L, a.true[subject] + b.true[subject]*time, sigma.y.true)
>     return(data.frame( y, time, subject, treatment1, site1 ))
> }
>
> I then fit models of the form:
>
> y ~ time + time:treatment1 + (1 | site1/subject)
>
> To the fake data. However, this approach can (but not always) produce a
> 'site' standard deviation approximately a factor of 10 less than in the
> real data.
>
> My question is - is my simulation function correct?
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
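
A hedged note on the question quoted above. In create_fake(), a.true is
drawn with rnorm(J*K, ...), i.e. once per subject, so no value is shared by
the subjects of one site. That may be why the fitted 'site' standard
deviation comes out small, but I have not checked this against the real
data. Below is a minimal sketch of one way to simulate
y ~ time + (1 | site/subject). All parameter values and the names
sigma.site, sigma.subj and site.of.subject are made up for illustration;
they are not the poster's.

```r
set.seed(1)

## Made-up design and hyper-parameters (assumptions, not fitted values)
J <- 4; K <- 5; L <- 3                 # sites, subjects per site, time points
sigma.site <- 2                        # site std dev
sigma.subj <- 1                        # subject-within-site std dev
sigma.y    <- 0.5                      # residual std dev
mu <- 10; slope <- -1                  # intercept and time coefficient

## Bookkeeping: which site each subject is in, which subject each row is
site.of.subject <- rep(seq_len(J), each = K)
subject <- rep(seq_len(J * K), each = L)
time    <- rep(seq(0, 2, length.out = L), times = J * K)

## Nested random effects: ONE draw per site, ONE draw per subject
a.site <- rnorm(J, 0, sigma.site)
a.subj <- rnorm(J * K, 0, sigma.subj)
intercept <- mu + a.site[site.of.subject] + a.subj   # one value per subject

y <- rnorm(J * K * L, intercept[subject] + slope * time, sigma.y)

fake <- data.frame(y, time,
                   subject = factor(subject),
                   site    = factor(site.of.subject[subject]))
```

The key difference from the quoted function is that a.site has length J, so
every subject in a site shares the same site deviation.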



Re: [R] Matching multiple search criteria (Unlisting a nested dataset, take 2)

2018-10-17 Thread Bert Gunter
ave a booth in katy at the real craft wives of katy fest
> @nolabelbrewco cmon yall!everything is better when you top it with
> tias!order today we ship to all 50 ",
> "dolly is so baddd"), lat = c(43.6835853, 40.284123, 37.7706565,
> 40.431389, 31.1688935, 33.9376735, 34.0207895, 44.900818, 29.7926,
> 32.364145), lng = c(-70.3284118, -83.078589, -122.4359785, -79.9806895,
> -100.0768885, -118.130426, -118.4119065, -89.5694915, -95.8224,
> -86.2447285), county_name = c("Cumberland County", "Delaware County",
> "San Francisco County", "Allegheny County", "Concho County",
> "Los Angeles County", "Los Angeles County", "Marathon County",
> "Harris County", "Montgomery County"), fips = c(23005L, 39041L,
> 6075L, 42003L, 48095L, 6037L, 6037L, 55073L, 48201L, 1101L),
> state_name = c("Maine", "Ohio", "California", "Pennsylvania",
> "Texas", "California", "California", "Wisconsin", "Texas",
> "Alabama"), state_abb = c("ME", "OH", "CA", "PA", "TX", "CA",
> "CA", "WI", "TX", "AL"), urban_level = c("Medium Metro",
> "Large Fringe Metro", "Large Central Metro", "Large Central Metro",
> "NonCore (Nonmetro)", "Large Central Metro", "Large Central Metro",
> "Small Metro", "Large Central Metro", "Medium Metro"), urban_code = c(3L,
> 2L, 1L, 1L, 6L, 1L, 1L, 4L, 1L, 3L), population = c(277308L,
> 184029L, 830781L, 1160433L, 4160L, 9509611L, 9509611L, 127612L,
> 4233913L, 211037L), linenumber = 1:10), row.names = c(NA,
> 10L), class = "data.frame")
>
> ## Clean tweets - basically just remove everything we don’t need from the
> text including punctuation and urls
> th %>%
> mutate(linenumber = row_number(),
> text = str_remove_all(text, "[^\x01-\x7F]"),
> text = str_remove_all(text, "\n"),
> text = str_remove_all(text, ","),
> text = str_remove_all(text, "'"),
> text = str_remove_all(text, "&"),
> text = str_remove_all(text, "<"),
> text = str_remove_all(text, ">"),
> text = str_remove_all(text, "http[s]?://[[:alnum:].\\/]+"),
> text = tolower(text)) -> th
>
> ## Create search function that looks for each search term in the provided
> string, evaluates if all three search terms have been found, and returns a
> logical
> srchr <- function(df) {
> str_detect(df, "olympic") -> a
> str_detect(df, "technique") -> b
> str_detect(df, "lifts") -> c
> ifelse(a == TRUE & b == TRUE & c == TRUE, TRUE, FALSE)
> }
>
> ## Evaluate tweets for presence of search term
> th %>%
> mutate(flag = map_chr(text, srchr)) -> th_flagged
>
> As far as I can tell, this works. I have to manually enter each set of
> search terms into the function, which is not ideal. Also, this only
> generates a True/False for each tweet based on one search term - I end up
> with an evaluatory column for each search term that I would then have to
> collapse together somehow. I’m sure there’s a more elegant solution.
>
> --
>
> Nate Parsons
> Pronouns: He, Him, His
> Graduate Teaching Assistant
> Department of Sociology
> Portland State University
> Portland, Oregon
>
> 503-725-9025
> 503-725-3957 FAX
> On Oct 16, 2018, 7:20 PM -0700, Bert Gunter ,
> wrote:
>
> OK, as no one else has offered a solution, I'll take a whack at it.
>
> Caveats: This is a brute force attempt using R's basic regular expression
> engine. It is inelegant and barely tested, so likely to be at best
> incomplete and buggy, and at worst, incorrect. But maybe Nathan or someone
> else on the list can fix it up. So if (when) it breaks, complain on the
> list to give someone (almost certainly not me) the opportunity.
>
> The basic idea is that the tweets are just character strings and the
> search phrases are just character vectors all of whose elements must match
> "appropriately" -- i.e. they must match whole words -- in the character
> strings. So my desired output from the code is a list indexed by the search
> phrases, each of whose components is a logical vector of length the number
> of tweets each of whose elements = TRUE iff all the words in the search
> phrase match somewhere in the tweet.
>
> Here's the code(using the data Nathan provided):
>
> > words <- sapply(st[[1]],strsplit,split = " +" )
> ## convert the phra

Re: [R] Matching multiple search criteria (Unlisting a nested dataset, take 2)

2018-10-18 Thread Bert Gunter
All (especially Nathan): **Please feel free to ignore this post without
response.** It just represents a bit of OCD-ness on my part that may or may
not be of interest to anyone else.

Purpose of this post: To give an alternative considerably simpler and
considerably faster solution to the problem than those which I offered
previously. It may or may not be what the OP asked for, but the improvement
exercise was instructive to me. Notation as previously in this thread.

New solution:

getwords <- function(x)
   strsplit(gsub("(^[[:space:]]+)|([[:space:]]+$)", "", tolower(x)), split = " +")
## split lower-cased text into a vector of "words"
## I made this a bit fancier to handle some "corner" cases, but the
## previous simpler version may well suffice.

'%allin%' <- function(x, table)prod(match(x,table, nomatch = 0L)) > 0L
## a convenience function/operator that improves efficiency.

## lists of  search word vectors as before
phrasewords <- getwords(st$terms)
tweets <- getwords(c(th$text, " i  worthless yxxc ght feel"))
## the tweets + one additional

## simpler approach just using indexing for the bookkeeping that nested
## _apply loops previously were used for
ans <- expand.grid(phrases = seq_along(phrasewords),tweets =
seq_along(tweets), Result = FALSE)
ans$Result <- apply(ind,1,function(r)phrasewords[[r[1]]] %allin%
tweets[[r[2]]])

## ans is a data frame in which the first column indexes phrases and the
## second tweets.
## The ith row of ans$Result == TRUE iff all the words in the phrase
## indexed by the ith row of the phrase column are contained in the tweet
## indexed by that row's tweet column.

This was way faster than my previous offerings.

Note also that just the matching phrases and tweets can be extracted as
usual by:

> ans[ans[,3],]
   phrases tweets Result
42       6      7   TRUE
## all the words in the 6th search phrase appeared in the 7th tweet.
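
Since st and th are not defined in this message, here is a self-contained
sketch of the same idea on made-up phrases and tweets. It is an
illustration under assumptions, not the code that was run above: it uses
trimws() and all(x %in% table) in place of the gsub() and prod(match())
versions, and mapply() in place of apply() over the rows.

```r
## Split lower-cased, trimmed text into vectors of "words"
getwords <- function(x) strsplit(trimws(tolower(x)), split = " +")

## TRUE iff every element of x occurs somewhere in table
`%allin%` <- function(x, table) all(x %in% table)

## Made-up example data (assumption: not the poster's data)
phrases <- c("i feel worthless", "feel alone depressed")
tweets  <- c("I feel so worthless today",
             " i  worthless yxxc ght feel",
             "feel fine")

pw <- getwords(phrases)
tw <- getwords(tweets)

## One row per (phrase, tweet) pair; phrases vary fastest
ans <- expand.grid(phrases = seq_along(pw), tweets = seq_along(tw))
ans$Result <- mapply(function(p, t) pw[[p]] %allin% tw[[t]],
                     ans$phrases, ans$tweets)

ans[ans$Result, ]   # only the matching (phrase, tweet) pairs
```

Word order and extra spaces do not matter here, because each tweet is
reduced to its set of whole words before matching.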

** I promise to natter on about this no longer! **

Cheers,
Bert


On Wed, Oct 17, 2018 at 7:50 PM Bert Gunter  wrote:

>
> If you wish to use R, you need to at least understand its basic data
> structures and functionality. Expecting that mimicry of code in special
> packages will suffice is, I believe, an illusion. If you haven't already
> done so, you should go through a basic R tutorial or two (there are many on
> the web; some recommendations, by no means necessarily "the best",  can be
> found here:
> https://www.rstudio.com/online-learning/#r-programming).
>
> Having said that, I realized that my previous "solution" using regular
> expressions was more complicated than it needed to be and somewhat foolish
> (so much for all my "expertise"). A simpler and better approach is simply
> to break up both the tweet texts and your search phrases into vectors of
> their "words" (i.e. character strings surrounded by spaces) using
> strsplit(), and then using R's built-in matching capabilities with %in%.
> This is quite straightforward, pretty robust (no regex's to wrestle with),
> and does not require "herculean efforts" to understand. The only wrinkle is
> some bookkeeping with the "apply" family of functions. These are, as you
> may know, the functional programming way of handling iteration (loops), but
> they are what I would consider part of "basic" R functionality and worth
> spending the time to learn about.
>
> Herewith my better, simpler proposal, using your example data as before:
>
> getwords <- function(x)strsplit(tolower(x),split = " +")
> ## split text into a vector of lower-cased "words"
>
> phrasewords <- structure(getwords(st$terms), names = st$terms)
> ## named list of your search word vectors
>
> tweets <- getwords(c(th$text, " i  worthless yxxc ght feel"))
> ## the tweets + one additional that should match the last phrase
>
> ans <- lapply(phrasewords, function(x) apply(sapply(tweets,function(y)x
> %in% y), 2, all))
> ## a list indexed by the search phrases,
> ## with each component a vector of logicals with vec[i] == TRUE iff
> ## the ith tweet contains all the words in the search phrase
>
> > ans
> $`me abused depressed`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> $`me hurt depressed`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> $`feel hopeless depressed`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> $`feel alone depressed`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> $`i feel helpless`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> $`i feel worthless`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
>
> -- Bert
>
> On Wed, Oct 17, 2018 at 9:20 AM Nathan Parsons 
> wrote:
>
>> I

Re: [R] Matching multiple search criteria (Unlisting a nested dataset, take 2)

2018-10-18 Thread Bert Gunter
Sorry. Typo.  The last line should be:

ans$Result <- apply(ans,1,function(r)phrasewords[[r[1]]] %allin%
tweets[[r[2]]])

-- Bert




Re: [R] Project in emacs + ess

2018-10-18 Thread Bert Gunter
Wrong list. This list is about R programming. You should address this to an
emacs support list or, better yet, to an ESS list. Here's one place you might
start:

https://www.r-bloggers.com/using-r-with-emacs-and-ess/

Other resources can be found by a web search on "R ess".

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Oct 18, 2018 at 9:12 PM Naresh Gurbuxani <
naresh_gurbux...@hotmail.com> wrote:

> I have switched from RStudio to emacs.  In emacs, how can I create a
> project like in RStudio?
>
>
> Within the project directory, I would like to create different directories
> for code, data, results, figures, documents, etc.  In RStudio project,
> relative references work well.  For example, an Sweave document in document
> directory can use command source('code/mycode.R').  In emacs, this does not
> work "out of the box".  In document folder, the command needs to be
> source('../code/mycode.R').  This is minor effort, but a better method must
> exist.
>
>
> Thanks,
>
> Naresh


Re: [R] Importing SAS datasets into R efficiently

2018-10-19 Thread Bert Gunter
Have you looked at the "foreign" package?
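
A minimal sketch of what that could look like, under stated assumptions:
the file name mydata.xpt is hypothetical, and the data must first have been
exported from SAS in XPORT (transport) format, since as far as I know
foreign does not read native .sas7bdat files without a SAS installation
(read.ssd() calls SAS to do the conversion). Whether this is faster than
haven for a given dataset is something to time, not something I can
promise.

```r
library(foreign)

xpt <- "mydata.xpt"              # hypothetical path to a SAS XPORT file

if (file.exists(xpt)) {
  str(lookup.xport(xpt))         # inspect the members/variables first
  dat <- read.xport(xpt)         # data frame (or list of data frames)
  print(system.time(read.xport(xpt)))   # time it to compare with haven
}
```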

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Oct 19, 2018 at 6:48 AM Jomy Jose  wrote:

> Hi
>
> Is there an efficient way to import SAS datasets into R? At present, using
> the haven package takes a long time. Is there a smart workaround?
>
> Thanks in advance
> Jose


Re: [R] (no subject)

2018-10-20 Thread Bert Gunter
Jeremie's suggestion of course will fail if some of the off-diagonal
elements are the same as those on the diagonals.

The request doesn't make a lot of sense to me, but if "m" is the matrix,

 m[row(m) != col(m)]

reliably extracts the vector of non-diagonal entries, which can then be
dimensioned as desired.
Or upper.tri() and lower.tri() can be used to separately extract the upper
and lower triangle entries via logical indexing.
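
A small sketch of both points, using the 3 x 3 example from the thread plus
a made-up matrix bb with a repeated value to show where the %in% approach
goes wrong:

```r
aa <- matrix(1:9, 3, 3)

## Logical indexing: keep entries whose row index differs from the column index
off <- aa[row(aa) != col(aa)]      # column-major order: 2 3 4 6 7 8
matrix(off, 2, 3)                  # re-dimension as desired

## Upper and lower triangles separately
up  <- aa[upper.tri(aa)]           # 4 7 8
low <- aa[lower.tri(aa)]           # 2 3 6

## Made-up matrix where an off-diagonal entry equals a diagonal entry
bb <- matrix(c(1, 1, 3, 4, 5, 6, 7, 8, 9), 3, 3)
by.value <- as.numeric(bb)[!as.numeric(bb) %in% diag(bb)]  # drops bb[2, 1] too
by.index <- bb[row(bb) != col(bb)]                         # keeps all six
```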

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Oct 20, 2018 at 8:48 AM Jeremie Juste 
wrote:

>
>
> Hello,
>
> Be sure to include the mailing list when you reply. That improves your
> chances of obtaining a good answer, and everyone benefits.
>
>
> Maybe something like this?
>
> aa <- matrix(1:9,3,3)
> matrix(as.numeric(aa)[!as.numeric(aa) %in% diag(aa)],2,3)
>
>      [,1] [,2] [,3]
> [1,]    2    4    7
> [2,]    3    6    8
>
> Hope it helps,
>
> Jeremie
>
> malika yassa  writes:
>
> >  Hello,
> > yes, I want to extract the non-diagonal part.
> > For example, I have this matrix
> >      [,1] [,2] [,3]
> > [1,]    1    4    7
> > [2,]    2    5    8
> > [3,]    3    6    9
> > and the result should be
> >
> >      [,1] [,2] [,3]
> > [1,]    2    4    7
> > [2,]    3    6    8
> >
> > On Saturday, 20 October 2018 at 15:21:53 UTC+2, Jeremie Juste <
> jeremieju...@gmail.com> wrote:
> >
> >  malika yassa via R-help  writes:
> >
> > Hello,
> >
> > Can you specify what you mean by deleting exactly?
> > Do you want to have zero in the diagonal or do you want to extract the
> > non-diagonal part?
> >
> > Besides your matrix is not a square matrix. Do you really want to
> > extract the non-diagonal part of a non square matrix?
> >
> > Best regards,
> >
> > Jeremie
> >
> >> Hello, please can you help me? I have this matrix:
> >> m <- matrix(1:12, nrow = 3)
> >>
> >> I want to delete the diagonal values of this matrix.
> >> Can anyone do this? Thanks.


Re: [R] studio server on High SIerra

2018-10-20 Thread Bert Gunter
RStudio is a separate product from R. Post on the RStudio site, not R-help.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Oct 20, 2018 at 8:12 PM Fuchs Ira  wrote:

> Can I run Rstudio Server on OSX 10.13 (High Sierra). If so, can someone
> please point me to install directions? I found an old post that talks about
> using homebrew but I can't find the rstudio-server brew to install.


Re: [R] match() question or needle haystack problem for a data.frame

2018-10-22 Thread Bert Gunter
Re-read ?match and note the examples for %in%
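
For instance, on a cut-down, made-up version of the data in the question
(the real needles and Haystack vectors are much longer), a sketch of the
%in% approach might look like this:

```r
## Small made-up subsets of the vectors in the question
needles  <- c(14390, 14391, 15195)
Haystack <- c(14390, 14391, 15187, 15195, 16717)

Mydata <- data.frame(DATA1 = Haystack, Data2 = seq_along(Haystack))

## One TRUE/FALSE per row: is this row's DATA1 among the needles?
keep <- Mydata$DATA1 %in% needles

## Logical row index: keep only the matching rows
reduced <- Mydata[keep, ]

## %in% is documented as shorthand for this match() call
same <- match(Mydata$DATA1, needles, nomatch = 0L) > 0L
```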

-- Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Oct 22, 2018 at 7:38 AM Knut Krueger 
wrote:

>
> Hi to all
>
> I would like to reduce "Mydata" to only those rows where Mydata$DATA1 is in
> needles.
>
>
>
> needles =c(14390, 14391, 14392, 14427, 14428, 14429, 14430, 14431,
> 14432, 14433, 14434, 14435, 14436, 14437, 14439, 14440, 14441, 15195,
> 15196, 15197, 15198, 15199, 15200, 15201, 15202, 15203, 15204, 15205,
> 15206, 15207, 15208, 15209, 17615, 17616, 17617, 17618, 17619, 17620,
> 17621, 17622, 17623, 17624, 17625, 17626, 17627, 17628, 17629, 17630,
> 17631, 17679, 17680, 17681, 17682, 17683, 17823, 17824, 17825, 17826,
> 17827, 17828, 17829, 17830, 17831, 17862, 17863, 17864, 17865, 17866,
> 17867, 17868, 17869, 17870, 17871, 17872, 17873, 17874, 17875, 17876,
> 17877, 17878, 17879, 17880, 17881, 17882, 17883, 19255, 19256, 19257,
> 19258, 21289, 21290, 21291, 21292, 22890, 22891, 22892, 22893, 22894,
> 22895, 22896, 22897, 22898, 22899, 22900, 22901, 22902, 40428, 40429,
> 40430, 40431, 40432, 40433, 40434, 40435, 40436, 40437)
>
> Haystack =c(14390, 14391, 14392, 14427, 14428, 14429, 14430, 14431,
> 14432, 14433, 14434, 14435, 14436, 14437, 14439, 14440, 14441, 15187,
> 15188, 15195, 15196, 15197, 15198, 15199, 15200, 15201, 15202, 15203,
> 15204, 15205, 15206, 15207, 15208, 15209, 16717, 16718, 17615, 17616,
> 17617, 17618, 17619, 17620, 17621, 17622, 17623, 17624, 17625, 17626,
> 17627, 17628, 17629, 17630, 17631, 17679, 17680, 17681, 17682, 17683,
> 17817, 17818, 17823, 17824, 17825, 17826, 17827, 17828, 17829, 17830,
> 17831, 17862, 17863, 17864, 17865, 17866, 17867, 17868, 17869, 17870,
> 17871, 17872, 17873, 17874, 17875, 17876, 17877, 17878, 17879, 17880,
> 17881, 17882, 17883, 17886, 19255, 19256, 19257, 19258, 21289, 21290,
> 21291, 21292, 22890, 22891, 22892, 22893, 22894, 22895, 22896, 22897,
> 22898, 22899, 22900, 22901, 22902, 40428, 40429, 40430, 40431, 40432,
> 40433, 40434, 40435, 40436, 40437, 40710, 40711, 49127, 49128, 52768)
>
> Mydata =data.frame (DATA1=Haystack, Data2=c(1:length(Haystack)))
>
>
>
> match(Mydata$DATA1, needles, nomatch=NA) does find all data which are in
> needles - the others are set to the nomatch value.
>
> But I cannot find out how to reduce the data.frame - maybe match() is
> not helpful for that.
>
> Kind regards Knut


Re: [R] match() question or needle haystack problem for a data.frame

2018-10-22 Thread Bert Gunter
I suggest you spend a bit of time with an R tutorial or two and, in
particular learn about "logical indexing," as this basic R construct seems
to be mysterious to you.
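
As a brief illustration of what "logical indexing" means, on made-up
values: a logical vector used as an index keeps the positions that are TRUE.
The NA behaviour at the end is the one wrinkle worth knowing about.

```r
x <- c(a = 5, b = 12, c = 7, d = 20)

big <- x > 6          # logical vector, same length as x
x[big]                # elements where big is TRUE
x[!big]               # the complement
sum(big)              # TRUE counts as 1, so this counts the matches

## The same idea selects rows of a data frame
df <- data.frame(id = 1:4, val = c(5, 12, NA, 20))
with.na    <- df[df$val > 6, ]         # an NA in the index yields an NA row
without.na <- df[which(df$val > 6), ]  # which() drops the NA
```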

Cheers,
Bert




On Mon, Oct 22, 2018 at 8:22 AM Knut Krueger 
wrote:

> Am 22.10.18 um 17:01 schrieb Eric Berger:
> > v <- match(Mydata$DATA1, needles, nomatch=NA)
> > found <- Mydata[ !is.na(v), ]
> > missing <- Mydata[ is.na(v), ]
>
> Thank you it is working, additionally as Bert suggested, it seems that
>
> Mydata[Mydata$DATA1 %in% needles,]
>
> is doing the same.
>
> Kind Regards Knut


Re: [R] Transformations in Tidyverse (renaming and ordering columns)

2018-10-22 Thread Bert Gunter
For clarity's sake:

** Stop posting in HTML.**


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Oct 22, 2018 at 4:55 PM Joel Maxuel  wrote:

> For clarity's sake.  More show (with example closer to reality), less tell.
> :^)
>
> ## Current:
>
> > library(knitr)
> > library(tidyverse)
> ── Conflicts ── tidyverse_conflicts() ──
> x dplyr::filter() masks stats::filter()
> x dplyr::lag()    masks stats::lag()
> > library(tibble)
> > library(dplyr)
> >
> > testset <- as_tibble(tribble(~SN, ~Section, ~Order, ~Observation, ~Seq, ~Label, ~Value,
> +  2, "For Reporting Quarter", 1, "One", 1, "Western", 163,
> +  2, "For Reporting Quarter", 1, "One", 2, "Northern", 105,
> +  2, "For Reporting Quarter", 1, "One", 3, "Eastern", 121,
> +  2, "For Reporting Quarter", 1, "One", 4, "Southern", 74,
> +  2, "For Reporting Quarter", 2, "Two", 1, "Western", 147,
> +  2, "For Reporting Quarter", 2, "Two", 2, "Northern", 100,
> +  2, "For Reporting Quarter", 2, "Two", 3, "Eastern", 106,
> +  2, "For Reporting Quarter", 2, "Two", 4, "Southern", 70,
> +  2, "For Reporting Quarter", 3, "Three", 1, "Western", 119,
> +  2, "For Reporting Quarter", 3, "Three", 2, "Northern", 82,
> +  2, "For Reporting Quarter", 3, "Three", 3, "Eastern", 90,
> +  2, "For Reporting Quarter", 3, "Three", 4, "Southern", 65))
> > testset %>% select(Observation, Label, Value) %>% spread(key=Observation, value=Value)
> # A tibble: 4 x 4
>   Label      One Three   Two
>   <chr>    <dbl> <dbl> <dbl>
> 1 Eastern    121    90   106
> 2 Northern   105    82   100
> 3 Southern    74    65    70
> 4 Western    163   119   147
> >
>
> ## Intended:
>
> # A tibble: 4 x 4
>   For Reporting Quarter   One   Two Three
>   <chr>                 <dbl> <dbl> <dbl>
> 1 Western                 163   147   119
> 2 Northern                105   100    82
> 3 Eastern                 121   106    90
> 4 Southern                 74    70    65
>
> ##
>
> Unfortunately I don't know how to get there from here.  Section, Order and
> Seq are there to assist with getting the data to the right output
> programmatically, however I don't know how to make use of them.
>
> Hope this helps.
>
> --
> Cheers,
> Joel Maxuel
>
>
> On Mon, Oct 22, 2018 at 6:18 PM Jeff Newmiller 
> wrote:
>
> > If you are willing to work in the context of LaTeX output then perhaps
> you
> > will find the "tables" package useful. However, while you think you have
> > communicated clearly enough regarding what you want to accomplish, I do
> > not, so either someone else will intuit what you want or you will create
> a
> > mock-up of what you want your output to look like to remove the
> guesswork.
> >
> >
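
One possible way to get from the "Current" to the "Intended" layout quoted
above, sketched in base R rather than the tidyverse so that it runs without
extra packages. This is an illustration under assumptions, not a tested
answer from the thread: the idea is to turn Observation and Label into
factors whose level order is taken from the Order and Seq columns, so that
reshaping no longer sorts them alphabetically. The helper names obs.levels,
lab.levels, wide and out are mine.

```r
## The example data from the question, rebuilt as a plain data frame
testset <- data.frame(
  Section     = "For Reporting Quarter",
  Order       = rep(1:3, each = 4),
  Observation = rep(c("One", "Two", "Three"), each = 4),
  Seq         = rep(1:4, times = 3),
  Label       = rep(c("Western", "Northern", "Eastern", "Southern"), times = 3),
  Value       = c(163, 105, 121, 74, 147, 100, 106, 70, 119, 82, 90, 65),
  stringsAsFactors = FALSE
)

## Level order comes from the helper columns, not from the alphabet
obs.levels <- unique(testset$Observation[order(testset$Order)])
lab.levels <- unique(testset$Label[order(testset$Seq)])

## Cross-tabulate: rows = Label, columns = Observation (one value per cell)
wide <- tapply(testset$Value,
               list(factor(testset$Label, lab.levels),
                    factor(testset$Observation, obs.levels)),
               sum)

## Back to a data frame, using the Section text as the first column name
out <- data.frame(Label = rownames(wide), wide,
                  row.names = NULL, check.names = FALSE,
                  stringsAsFactors = FALSE)
names(out)[1] <- unique(testset$Section)
out
```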


Re: [R] calculate the integral

2018-10-24 Thread Bert Gunter
Is this homework? This list has a no homework policy.

Also, please stop posting in html.

Bert Gunter


On Wed, Oct 24, 2018 at 6:40 AM malika yassa via R-help <
r-help@r-project.org> wrote:

> Hello, please can anyone help me? I find it difficult to calculate this
> integral. I have this program:
>
> x<-rnorm(10,0,1)
> y<-rexp(10,2)
> z<-exp(10,3)
> s<-vector()
> for (  j in 1:10)
> s[j]<-x[j+1]+x[j]
> s1[i]<-s[j]/2
> f<-function(y,u){exp(y-u)}
> sapply(x, function(i){
>  z[j] integrate(f,lower=s1[j],upper=,s1[j+1] x=i)$value
>
> For each value of x I will have a vector. I can't calculate this. Thank you
> in advance.


Re: [R] remove text from nested list

2018-10-25 Thread Bert Gunter
1. Please learn how to use dput() to provide examples to responders.
There's not much we can do with a text printout (at least without some work
that I don't care to do).

2. Do you know what mylist[[c(1,2,1)]] means? If not, read ?Extract and
note in particular:
"[[ can be applied recursively to lists, so that if the single index i is a
vector of length p, alist[[i]] is equivalent to alist[[i1]]...[[ip]] providing
all but the final indexing results in a list."

As your intent is unclear -- no reproducible example showing the desired
result -- I would suggest just using list indexing to access the matrices
you wish to change. But maybe this does not satisfy your vague request.

Also, something seems screwy in the example you showed: For example, the
[[1]][[2]][[1]] component indicates a 2 x 5 matrix, but I see only 3
columns of text. Am I missing something?

Cheers,
Bert
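
For readers following the thread, here is a minimal sketch of the recursive
indexing described above, plus one way to edit text in every matrix of a
nested list. The list below is a made-up stand-in (the original post gave no
dput() output), and the pattern is only an illustrative assumption about what
should be removed.

```r
# Hypothetical nested list of character matrices (not the poster's data)
mylist <- list(
  list(
    list(matrix(c("12/30 12/30", "01/02 01/02", "ABAB", "CACA"), nrow = 2)),
    list(matrix(c("01/02 01/02", "ThankYou"), nrow = 1))
  )
)

# A vector index inside [[ ]] walks down the list one level per element,
# so these two expressions select the same matrix
a <- mylist[[c(1, 2, 1)]]
b <- mylist[[1]][[2]][[1]]
identical(a, b)

# Apply a substitution to every leaf, keeping the list structure;
# assigning into m[] keeps each matrix's dimensions
pattern <- "^[0-9]{2}/[0-9]{2} "   # assumed: drop a leading duplicated date
cleaned <- rapply(
  mylist,
  function(m) {
    m[] <- sub(pattern, "", m)
    m
  },
  how = "replace"
)
cleaned[[c(1, 1, 1)]]
```

To change only selected matrices, index them directly, for example
mylist[[c(1, 2, 1)]], and assign the modified matrix back.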



On Thu, Oct 25, 2018 at 6:04 PM Ek Esawi  wrote:

> Hi All—
>
> I have a list that contains multiple sub-lists and each sub-list
> contains multiple  sub(sub-lists), each of the sub(sub-lists) is made
> up of matrices of text. I want to replace some of the text in some
> parts in the matrices on the list. I tried gsub and stringr,
> str_remove, but nothing seems to work
>
> I tried:
>
> lapply(mylist, function(x) lapply(x, function(y)
> gsub("[0-9][0-9]/[0-9[0-9].*com","",y)))
> lapply(mylist, function(x) str_remove(x,"[0-9][0-9]/[0-9[0-9].*com"))
>
> Any help is greatly appreciated.
>
>
>
> mylist—this is just an example
>
> [[1]]
> [[1]][[1]]
> [[1]][[1]][[1]]
> [,1]  [,2]  [,3]  [,4] [,5]
> [1,] "12/30 12/30"  "ABABABABABAB"  "8.00"
> [2,] "01/02 01/02"  "”.   “99"
> [3,] "01/02 01/02"  "CACACACACACC” "55.97"
>
> [[1]][[1]][[2]]
> [,1]  [,2]
> [1,] "12/30 12/30" "DDD” “29"
> [2,] "12/30 12/30"  :GGG” “333”
>
> [[1]][[2]]
> [[1]][[2]][[1]]
> [,1]  [,2]  [,3] [,4]  [,5]
> [1,]  "01/02 01/02" "ThankYou" “23”
> [2,] "01/02 01/02"  "Standard data"  "251"
>


Re: [R] source() fails in same directory as script: cannot find file

2018-10-27 Thread Bert Gunter
Your messages got through fine (to me, anyway). I suspect people are just
failing to read through the threads.

-- Bert
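
For anyone hitting the same "cannot find file" error, the checks suggested in
the quoted discussion below (wrong directory, wrong name, wrong case) can be
sketched as follows. The helper find_script() and the file names are
hypothetical, chosen for illustration; they are not taken from the thread.

```r
# Hypothetical helper: does `name` exist in `dir`, and if not, which files
# there differ only by case or by a couple of characters?
find_script <- function(name, dir = getwd()) {
  present <- list.files(dir)
  other <- present[present != name]
  # Edit distance between the requested name and each other file name
  d <- if (length(other)) utils::adist(name, other)[1, ] else integer(0)
  list(
    exists    = name %in% present,
    case_only = other[tolower(other) == tolower(name)],
    near      = other[d <= 2]
  )
}

# Demonstration in a temporary directory with a deliberately misnamed file
demo_dir <- tempfile("scripts")
dir.create(demo_dir)
file.create(file.path(demo_dir, "read_data.R"))

res <- find_script("read-data.R", dir = demo_dir)
res$exists
res$near
```

Comparing the requested name against list.files() output character by
character is often quicker than rereading the path by eye.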

On Sat, Oct 27, 2018 at 6:35 AM Ista Zahn  wrote:

> On Fri, Oct 26, 2018 at 10:42 PM Jeff Newmiller wrote:
> >
> > I haven't seen mention of what OS or filesystem types are involved, but
> > it superficially looks like it might be one of those cases where the
> > filesystem is case-sensitive... check that all your directory and file
> > names are capitalized correctly.
>
> Just out of curiosity -- are my messages to r-help not going through?
> I correctly diagnosed the problem as a simple typo in my message on
> Oct 26 at 2:54 PM, but then the discussion just continued as if that
> never happened.
>
> >
> > On October 26, 2018 1:11:19 PM PDT, Jeremie Juste <jeremieju...@gmail.com> wrote:
> > >Hello,
> > >
> > >I suspect the error is in the file input-summerize.R.
> > >
> > >Try creating a new file input-summerize2.R containing only
> > >print("hello"), for instance, and check whether
> > >
> > >> setwd("~/documents/white-papers/geochemistry/willamette-river-mercury/scripts")
> > >> source("input-summerize2.R")
> > >
> > >works.
> > >
> > >Hope it helps,
> > >
> > >Jeremie
> > >
> > >
> > >Rich Shepard  writes:
> > >
> > >> On Fri, 26 Oct 2018, Ista Zahn wrote:
> > >>
> > >>> I'm confused. It seems the error is that the file can't be found; if
> > >>> so, what does it matter what is in the file?
> > >>
> > >> Ista,
> > >>
> > >>   Beats me.
> > >>
> > >>> As far as I can see you are either a) not in the directory you think
> > >>> you are, or b) the file is not named what you think it is.
> > >>
> > >>   Yes, the error seems to be that R cannot find the file, but it's in
> > >> the same directory and the file does exist:
> > >>
> > >>> getwd()
> > >> [1] "/home/rshepard/documents/white-papers/geochemistry/willamette-river-mercury/scripts"
> > >>
> > >> ~/documents/white-papers/geochemistry/willamette-river-mercury/scripts]$ ls input-summerize.R
> > >> input-summerize.R
> > >>
> > >>   So, R is running in the scripts/ directory and the script is there,
> > >> too.
> > >>
> > >>   This is why I asked for help as the error makes no sense to me.
> > >>
> > >> Regards,
> > >>
> > >> Rich
> > >>
> > --
> > Sent from my phone. Please excuse my brevity.
> >


Re: [R] MSE Cross-validation with factor interactions terms MARS regression

2018-10-29 Thread Bert Gunter
I did no analysis of your code or thought process, but noticed that you had
the following two successive lines in your code:


y=Testing$wage

y=Wage[-sam,]$wage

This obviously makes no sense, so maybe you should fix this first and then
proceed.

-- Bert
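
As a rough illustration of a repeated hold-out loop with a single assignment
to y, the sketch below uses the built-in mtcars data and lm() as stand-ins,
so it runs without the ISLR or earth packages; it is not the poster's model.
It also assumes the intent is the usual mean squared error, that is, the
errors are squared before averaging, and that the model should be fitted on
the training rows only.

```r
set.seed(1)                      # assumed seed, only for repeatability

n <- nrow(mtcars)
p <- 0.667                       # training fraction, as in the quoted code
mse <- numeric(200)              # one MSE per repetition

for (i in seq_along(mse)) {
  sam <- sample(n, floor(p * n), replace = FALSE)
  training <- mtcars[sam, ]
  testing  <- mtcars[-sam, ]

  # Fit on the training rows only (stand-in model, not MARS)
  fit <- lm(mpg ~ wt + hp, data = training)

  ypred <- predict(fit, newdata = testing)
  y <- testing$mpg               # assigned once

  # Square the errors first, then average
  mse[i] <- mean((y - ypred)^2)
}

mean(mse)
```

Replacing lm() with earth() and mtcars with Wage should follow the same
pattern, but that substitution is untested here.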


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Oct 29, 2018 at 1:46 PM varin sacha via R-help wrote:

>
> Dear R-experts,
> I am having trouble while doing crossvalidation with a MARS regression
> including an interaction term between a factor variable (education) and 1
> continuous variable (age). How could I solve my problem ?
>
> Here below my reproducible example.
>
> ###
> install.packages("ISLR")
> library(ISLR)
> install.packages("earth")
> library(earth)
>
> a<-as.factor(Wage$education)
>
> # Create a list to store the results
> lst<-list()
>
> # This statement does the repetitions (looping)
> for(i in 1 :200) {
> n=dim(Wage)[1]
> p=0.667
> sam=sample(1 :n,floor(p*n),replace=FALSE)
> Training =Wage [sam,]
> Testing = Wage [-sam,]
> mars5<-earth(wage~age+education+year+age*a, data=Wage)
> ypred=predict(mars5,newdata=Testing)
> y=Testing$wage
> y=Wage[-sam,]$wage
> MSE = mean(y-ypred)^2
> MSE
> lst[i]<-MSE
> }
>
> mean(unlist(lst))
> summary(mars5)
> ###
>