date:20130816


On Aug 15, 2013, at 11:45 AM, BoonFei Tan wrote:

 Say I have a dataframe for plotting scatterplot. The dataframe would be
 organized in the following fashion (in CSV format):
 
 name ABC EFG
 1    32  45
 2    56  67
 to, say 200 000 entries
 
 I am going to first do a scatterplot, after which I am going to subset a
 portion of the dataset into A using alphahull and export them as XYZ. The
 script for doing this:
 
 #plot first plot containing all data
 plot(x = X$ABC,
 y = X$EFG,
 pch=20,)

What's X and why do you have a comma followed by a right-paren?

 #subset data using ahull. choose 4 points on the plot
 A <- ahull(locator(4, type="p", pch=20), alpha=1)

It is courteous to put the library call that would load the package that has 
`ahull`.

 #exporting subset
 XYZ <- {}for (i in 1:nrow(X)) { if (inahull(A, c(X$ABC[i],X$EFG[i])))
 XYZ <- rbind(X,X[i,])}

Are there some missing linefeeds here?
Did you really want XYZ <- rbind(X,X[i,])} and not XYZ <- rbind(XYZ,X[i,])} ?
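
A loop that grows a data frame with rbind() can usually be replaced by one logical index. The sketch below is only an illustration with made-up data: in_region() is a hypothetical stand-in for the inahull(A, p) test, so that the example runs without the alphahull package or an interactive locator() call.

```r
# Hypothetical stand-in for inahull(A, p): TRUE when the point p = c(x, y)
# lies inside the unit square. Replace it with the real membership test.
in_region <- function(p) {
  p[1] >= 0 && p[1] <= 1 && p[2] >= 0 && p[2] <= 1
}

# Made-up data with the column names used in the post.
X <- data.frame(name = c("a", "b", "c", "d"),
                ABC  = c(0.2, 1.5, 0.7, 0.9),
                EFG  = c(0.4, 0.3, 0.9, 2.0),
                stringsAsFactors = FALSE)

# One TRUE/FALSE per row, then a single subset instead of rbind() in a loop.
inside <- vapply(seq_len(nrow(X)),
                 function(i) in_region(c(X$ABC[i], X$EFG[i])),
                 logical(1))
XYZ <- X[inside, ]
```

Building the index first also makes it easy to inspect which rows were selected before anything is exported.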

Perhaps you should learn to post in plain text, as suggested in the Posting 
Guide.

 
 I am getting the following message if the number of data points in the
 subset that I choose is too large:
 Error in if (p[2]  a + b * p[1]) { :
 missing value where TRUE/FALSE needed
 
 Does anyone know what might be causing the problem? Any help is much
 appreciated!
 

My vote would be ... missing values.
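
One way to check that guess, sketched with made-up data (whether the NA comes from X or arises inside inahull() is not confirmed by the thread): a comparison involving NA yields NA, and if() then stops with exactly this message. complete.cases() shows how many rows have a missing coordinate.

```r
# Made-up data; column names follow the post.
X <- data.frame(ABC = c(32, NA, 56), EFG = c(45, 67, NA))

# if() needs a single TRUE or FALSE; NA > 40 is NA, so this raises the error.
msg <- tryCatch(
  if (X$ABC[2] > 40) "above" else "below",
  error = function(e) conditionMessage(e)
)

# Count the rows with a missing coordinate, then drop them before testing points.
ok        <- complete.cases(X[, c("ABC", "EFG")])
n_missing <- sum(!ok)
X_clean   <- X[ok, ]
```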

-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Issue installing Packages

2013-08-16 Thread Alexandre Khelifa

Hi Guys,

Hope you are doing good. I am using R (3.0.1 - 32 bits) extensively for my
work but I have been having an issue for the last days.

I would like to download (and update) the packages RODBC, forecast and
gdata but I cannot download the binary file from the CRAN Mirrors.
I have tried several of them but the file cannot download completely and
freezes at 99%.

Thus, I cannot install it on my R console.
I have checked with several co-workers and they all have the same issues.

Please let me know what I can do.
Please also find attached a copy of the issue while downloading the windows
binary file.

Thanks a lot for your help, and the R support. It is an AMAZING tool.

Regards,

Alexandre Khelifa

[R] problem installing RDCOMEvents

2013-08-16 Thread Kishor Tappita

Dear R-Users,

I am getting the error below when I try to install RDCOMEvents
from Omegahat. All the dependencies were installed. Please help me
resolve this issue.

> library(RDCOMEvents)
Loading required package: RDCOMServer
Loading required package: SWinRegistry

Attaching package: 'SWinRegistry'


The following object(s) are masked from package:base :

 append

Loading required package: Ruuid
Loading required package: RDCOMClient
Loading required package: SWinTypeLibs
Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source
= keep.source) :
  in 'RDCOMEvents' methods for export not found: findConnectionPoint,
createCOMEventServer
In addition: There were 18 warnings (use warnings() to see them)
Error: package/namespace load failed for 'RDCOMEvents'

> sessionInfo()
R version 2.9.1 (2009-06-26)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] SWinTypeLibs_0.5-1 RDCOMClient_0.92-0 RDCOMServer_0.6-2  Ruuid_1.22.0
[5] SWinRegistry_0.3-3


Thanks,
Regards,
Kishor


Re: [R] to match samples by minute

2013-08-16 Thread PIKAL Petr

Hi

You will get only a general answer without some example data. See the Posting Guide.

If I understand correctly you need to reshape your df to have a structure

unixtime, valuefactor1, valuefactor2

and after that valuefactor1 - valuefactor2 should give you the desired result.

One possible way is to split your data frame into two, e.g.

df1 <- df[df$factor1=="bla",]
df2 <- df[df$factor1!="bla",]


and then merge

df.m <- merge(df1, df2, by="unixtime", all=TRUE)
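
As a minimal sketch of the split-and-merge idea with made-up data (the column names unixtime, value and factor1 follow the description; the values and the level "bla" are placeholders):

```r
# Made-up samples: three for level "bla", three for the other level,
# with only partly overlapping timestamps.
df <- data.frame(
  unixtime = c(60, 120, 180, 60, 180, 240),
  value    = c(10, 11, 12, 7, 9, 8),
  factor1  = c("bla", "bla", "bla", "other", "other", "other"),
  stringsAsFactors = FALSE
)

df1 <- df[df$factor1 == "bla", c("unixtime", "value")]
df2 <- df[df$factor1 != "bla", c("unixtime", "value")]

# all = TRUE keeps timestamps present in only one group (filled with NA).
df.m <- merge(df1, df2, by = "unixtime", all = TRUE, suffixes = c(".1", ".2"))

# Difference per timestamp; NA where one of the two samples is missing.
df.m$diff <- df.m$value.1 - df.m$value.2
```

With all = TRUE the unmatched minutes stay visible as NA; dropping that argument keeps only the timestamps present in both groups.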

Regards
Petr


 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
 On Behalf Of Zhang Weiwu
 Sent: Thursday, August 15, 2013 6:31 PM
 To: r-help@r-project.org
 Subject: [R] to match samples by minute
 
 
 Perhaps this is simple and common, but it took me quite a while to
 admit I cannot solve it in a simple way.
 
 The data frame `df` has the following columns:
 
 unixtime, value, factor
 
 Now I need a matrix of:
 
 unixtime, value-difference-between-factor1-and-factor2
 
 The naive solution is:
 
 df[df$factor == factor1,] - df[df$factor == factor2,]
 
 It won't work, because factor1 has 1000 valid samples, factor2 has 1400
 valid samples. The invalid samples are dropped on-site, i.e. removed
 before piped into R.
 
 To solve it, I got 2 ideas.
 
 1. create a new data.frame with 24*60 records, each record represent a
 minute in the day, because sampling is done once per minute. Now fit
 all records into their 'slots' by their nearest minute.
 
 2. pair each record with another that has similar unixtime but
 different factor.
 
 Both ideas require a for loop over individual records. It feels too C-like
 to write a program that way. Is there a professional way to do it in R?
 If not, I'd even prefer to rewrite the sampler (in C) not to discard
 invalid samples on-site, than to mangle R.
 
 Thanks.
 

Re: [R] to match samples by minute

2013-08-16 Thread PIKAL Petr

Hi

 -----Original Message-----
 From: Weiwu Zhang [mailto:zhangwe...@realss.com]
 Sent: Friday, August 16, 2013 9:55 AM
 To: PIKAL Petr
 Cc: r-help@r-project.org
 Subject: Re: [R] to match samples by minute

 2013/8/16 PIKAL Petr petr.pi...@precheza.cz:
  You will get only a general answer without some example data. See the
 Posting Guide.

 Thanks. Yes, I do expect a general answer, because I feel this problem of
 unmatched samples is ubiquitous; I just don't have a good
 Google keyword to dig for it myself.

  df.m <- merge(df1, df2, by="unixtime", all=TRUE)

 Thanks, this is a good general answer indeed. See, I only need a
 keyword (merge) to push me to the right track. In my particular case, I

I am not sure if it is the right track. It depends on my understanding (and 
this can be wrong) of your explanation.

 need to cast my unixtime into number-of-minutes before merging it.

It depends on what unixtime is. At least str(df) can help.

Regards
Petr

[R] GUI tools for R

2013-08-16 Thread ashz

Hi,

I wish to build a GUI for my R script. What are the best and easiest tools
to use, and which ones have good documentation or books?

Thanks






Re: [R] to match samples by minute

2013-08-16 Thread Weiwu Zhang

2013/8/16 PIKAL Petr petr.pi...@precheza.cz:
 You will get only a general answer without some example data. See the Posting Guide.

Thanks. Yes, I do expect a general answer, because I feel this problem of
unmatched samples is ubiquitous; I just don't have a good
Google keyword to dig for it myself.

 df.m <- merge(df1, df2, by="unixtime", all=TRUE)

Thanks, this is a good general answer indeed. See, I only need a
keyword (merge) to push me to the right track. In my particular case,
I need to cast my unixtime into number-of-minutes before merging it.
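
A rough sketch of that step, assuming unixtime is stored as seconds since the epoch (for a POSIXct column, as.numeric() would give the same number first). The timestamps and values below are made up, and the helper column name minute is an assumption:

```r
# Two samplers that fire at slightly different seconds within each minute.
df1 <- data.frame(unixtime = c(1376640003, 1376640058, 1376640121),
                  value    = c(10, 11, 12))
df2 <- data.frame(unixtime = c(1376640001, 1376640119),
                  value    = c(7, 9))

# Snap every sample to its nearest whole minute.
df1$minute <- round(df1$unixtime / 60)
df2$minute <- round(df2$unixtime / 60)

# Merge on the minute key instead of the raw timestamp.
df.m <- merge(df1[, c("minute", "value")], df2[, c("minute", "value")],
              by = "minute", all = TRUE, suffixes = c(".1", ".2"))
df.m$diff <- df.m$value.1 - df.m$value.2
```

If two samples of the same group can fall into the same minute, they would need to be aggregated before the merge; the sketch does not handle that case.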


Re: [R] to match samples by minute

2013-08-16 Thread Weiwu Zhang

2013/8/16 PIKAL Petr petr.pi...@precheza.cz:
 It depends what unixtime is. At least str(df) can help.

Thanks. Indeed, following your suggestion I found it easier to use than
summary() for debugging. Now I can properly handle POSIXct thanks to
the help I got from this list a few weeks ago:)

 I am not sure if it is the right track. It depends on my understanding (and 
 this can
 be wrong) of your explanation.

I know the efficiency you can obtain through the method you suggested,
as demonstrated by my several previous newbie posts this method. And I
know your effort is helpful for the community too.


Re: [R] Issue installing Packages

2013-08-16 Thread Uwe Ligges


Works for me.

If this happens for several mirrors and more than one package, I 
believe it is a local/internal problem of your network setup. Please ask 
your IT staff.


Uwe Ligges







Re: [R] problem installing RDCOMEvents

2013-08-16 Thread Prof Brian Ripley


Are you serious: 'R version 2.9.1 (2009-06-26)' ?

Please see the posting guide (see the footer of this message), and

1) Update your R.

2) If this still does not work, ask the package maintainer.






--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] คำถาม (Question)

2013-08-16 Thread Boonchai Oua-arunkij

I am working on text mining of Thai-language text in R. I have a problem and
would like to ask two questions, please.
1. RStudio can read Thai text, but not completely. Can this be fixed?
2. I need code that produces a graph. How can I write it? Please advise.
Thank you very much.


[R] Error in cmp function, COMPoissonReg package

2013-08-16 Thread Bündgens, Silke, Dipl.-Psych.

Hi,
I would like to calculate a Conway-Maxwell-Poisson
(http://artax.karlin.mff.cuni.cz/r-help/library/compoisson/html/00Index.html)
regression with the cmp function of the COMPoissonReg package.
However, when I try to run the model, ...
cmp_model = cmp(formula = Vergleich ~ c_0001 * Art_Vergleich, data = Mindset_AV_long_sel)
...I get an error, although I get the output as well:
# (Intercept)     c_00012  Art_Vergleich2  c_00012:Art_Vergleich2
    0.5978370  -0.3881165      -0.4567584               0.8161324
Error in xmat %*% par[1:length(betainit)] :
  requires numeric/complex matrix/vector arguments
I tried to continue with the output I got, but at some point in the procedure
suggested by Sellers and Lotze, I get another error and no further output:

Error in xmat %*% betahat :
  requires numeric/complex matrix/vector arguments



Is it, maybe, because my two predictors are dichotomous and not continuous? Or 
is it just a stupid mistake somewhere I should be embarrassed about?



Is there any other way to calculate a regression with the COM poisson 
distribution? That might be the quicker solution.



I appreciate your help! And sorry, if the post is redundant and/or stupid - 
this is my first post.



Best,



Silke




[R] Adding additional points to ggplot2

2013-08-16 Thread Chris89

Hi!
I am having difficulty adding additional points to a plot using ggplot2.

I want to plot both original and estimated values in the
same graph. In general I would use plot() and then lines(),
but I do not know how to do it with ggplot.

Thanks!

Regards,
Chris




Re: [R] problem installing RDCOMEvents

2013-08-16 Thread Kishor Tappita

Dear Prof Ripley,

Thanks for your reply. I tried with R versions 3.0.1 and 2.15.2
but had problems installing dependencies such as
SWinRegistry and SWinTypeLibs. With R version 2.9.1, I was able to install
all the dependencies but am having a problem installing
RDCOMEvents.

As suggested by you I will contact the package maintainer.

Thanks,
Regards,
Kishor



[R] Output from unix PS

Hi,

When I view this output from the Unix command ''ps'', the columns seem to
be properly aligned, but it is not read properly into the data frame. The
aggregate function throws

Error in aggregate.data.frame(data, by = list(COMMAND), FUN = sum) :
  object 'COMMAND' not found

Is there a recommendation for massaging this?

data = read.table("D:\\p..txt",sep="\t")

agg<-aggregate(data,by=list(COMMAND),FUN=sum)

684524  0.0  0.0  12348   872 ?SAug09   0:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
684528  0.0  0.0  12348   860 ?SAug09   0:00 hald-addon-keyboard: listening on /dev/input/event1
684532  0.0  0.0  12348   864 ?SAug09   0:00 hald-addon-keyboard: listening on /dev/input/event0
root  4540  0.0  0.0  10256   704 ?SAug09   1:02 hald-addon-storage: polling /dev/sr0
root  4576  0.0  0.0   8540   492 ?Ss   Aug09   0:00 /usr/bin/hidd --server
root  4619  0.0  0.0 122008  1540 ?Ssl  Aug09   0:00 automount
root  4636  0.0  0.0  26348   524 ?Ss   Aug09   0:00 ./hpiod
root  4641  0.0  0.0 154876  6428 ?SAug09   0:00 /usr/bin/python ./hpssd.py
root  4654  0.0  0.0  63544  1212 ?Ss   Aug09   0:00 /usr/sbin/sshd
root  4663  0.0  0.0 134208  2744 ?Ss   Aug09   0:00 cupsd
root  4677  0.0  0.0  21668   896 ?Ss   Aug09   0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root  4695  0.0  0.0  66968  2324 ?Ss   Aug09   0:00 sendmail: accepting connections
smmsp 4703  0.0  0.0  57716  1760 ?Ss   Aug09   0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root  4713  0.0  0.0   6480   372 ?Ss   Aug09   0:00 gpm -m /dev/input/mice -t exps2

Thanks.



This e-Mail may contain proprietary and confidential information and is sent 
for the intended recipient(s) only.  If by an addressing or transmission error 
this mail has been misdirected to you, you are requested to delete this mail 
immediately. You are also hereby notified that any use, any form of 
reproduction, dissemination, copying, disclosure, modification, distribution 
and/or publication of this e-mail message, contents or its attachment other 
than by its intended recipient/s is strictly prohibited.

Visit us at http://www.polarisFT.com


Re: [R] GUI tools for R

Hi,

I am not an expert, but I came across shiny.rstudio.org, which is built on
Twitter Bootstrap. Others will know whether there is a proper web
framework that R can integrate with.


Thanks.






Re: [R] Output from unix PS

2013-08-16 Thread Sarah Goslee

You don't tell us your OS details or provide an actual reproducible
example, but the error suggests that COMMAND is not one of the column
names in your data frame.

Have you checked that? When I run ps on my Fedora system, the column
name is CMD and not COMMAND.
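
A quick way to check, sketched with a small made-up data frame (the RSS column and its values are assumptions). Note also that by = list(COMMAND) looks for a free-standing object called COMMAND, so the grouping column has to be referenced through the data frame:

```r
# Made-up stand-in for the imported ps output.
ps <- data.frame(COMMAND = c("cupsd", "sshd", "cupsd"),
                 RSS     = c(100, 200, 50),
                 stringsAsFactors = FALSE)

# Inspect what was actually imported before aggregating.
names(ps)
has_command <- "COMMAND" %in% names(ps)

# Sum a numeric column per command; the grouping vector is taken from ps.
agg <- aggregate(ps["RSS"], by = list(COMMAND = ps$COMMAND), FUN = sum)
```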

Sarah




-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] Output from unix PS

My code is not complete. But the headers I see are

USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
684517  0.0  0.0  12348   864 ?SAug08   0:00 hald-addon-keyboard: listening on /dev/input/event1
684521  0.0  0.0  12348   860 ?SAug08   0:00 hald-addon-keyboard: listening on /dev/input/event0
root  4529  0.0  0.0  10256   704 ?SAug08   1:57 hald-addon-storage: polling /dev/sr0

The last column value doesn't follow any rule.

Thanks,
Mohan






Re: [R] Output from unix PS

2013-08-16 Thread Sarah Goslee

What does str(yourdata) return?

How about including
dput(head(yourdata, 20)) in your email to create a reproducible example?

Sarah

On Fri, Aug 16, 2013 at 9:08 AM,  mohan.radhakrish...@polarisft.com wrote:
 My code is not complete. But the headers I see are

 USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
 684517  0.0  0.0  12348   864 ?SAug08   0:00
 hald-addon-keyboard: listening on /dev/input/event1
 684521  0.0  0.0  12348   860 ?SAug08   0:00
 hald-addon-keyboard: listening on /dev/input/event0
 root  4529  0.0  0.0  10256   704 ?SAug08   1:57
 hald-addon-storage: polling /dev/sr0

 The last column value doesn't follow any rule.

 Thanks,
 Mohan





Re: [R] Output from unix PS


Sarah Goslee
  to:
mohan.radhakrishnan
 16-08-2013 06:01 PM


 You don't tell us your OS details or provide an actual reproducible
 example, but the error suggests that COMMAND is not one of the column
 names in your data frame.

 Have you checked that? When I run ps on my Fedora system, the column
 name is CMD and not COMMAND.

 Sarah

 On Fri, Aug 16, 2013 at 7:30 AM,  mohan.radhakrish...@polarisft.com
 wrote:
 Hi,

 When I view this output from the unix command ''ps'' the columns seem to
 properly aligned but it is not read properly into the data frame. The
 aggregate function throws

 Error in aggregate.data.frame(data, by = list(COMMAND), FUN = sum) :
   object 'COMMAND' not found

 Is there a recomendation to massage this ?

 data = read.table(D:\\p..txt,sep=\t)

 agg-aggregate(data,by=list(COMMAND),FUN=sum)

 684524  0.0  0.0  12348   872 ?SAug09   0:00
 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
 684528  0.0  0.0  12348   860 ?SAug09   0:00
 hald-addon-keyboard: listening on /dev/input/event1
 684532  0.0  0.0  12348   864 ?SAug09   0:00
 hald-addon-keyboard: listening on /dev/input/event0
 root  4540  0.0  0.0  10256   704 ?SAug09   1:02
 hald-addon-storage: polling /dev/sr0
 root  4576  0.0  0.0   8540   492 ?Ss   Aug09
 0:00 /usr/bin/hidd --server
 root  4619  0.0  0.0 122008  1540 ?Ssl  Aug09   0:00
 automount
 root  4636  0.0  0.0  26348   524 ?Ss   Aug09   0:00 ./hpiod
 root  4641  0.0  0.0 154876  6428 ?SAug09
 0:00 /usr/bin/python ./hpssd.py
 root  4654  0.0  0.0  63544  1212 ?Ss   Aug09
 0:00 /usr/sbin/sshd
 root  4663  0.0  0.0 134208  2744 ?Ss   Aug09   0:00 cupsd
 root  4677  0.0  0.0  21668   896 ?Ss   Aug09   0:00 xinetd
 -stayalive -pidfile /var/run/xinetd.pid
 root  4695  0.0  0.0  66968  2324 ?Ss   Aug09   0:00
 sendmail:
 accepting connections
 smmsp 4703  0.0  0.0  57716  1760 ?Ss   Aug09   0:00
 sendmail:
 Queue runner@01:00:00 for /var/spool/clientmqueue
 root  4713  0.0  0.0   6480   372 ?Ss   Aug09   0:00 gpm
 -m /dev/input/mice -t exps2

 Thanks.



 --
 Sarah Goslee
 http://www.functionaldiversity.org




Re: [R] Output from unix PS

2013-08-16 Thread Sarah Goslee

Both of these make it entirely clear that your attempt to import your
ps results as a data frame didn't do what you thought.

Instead of being separate columns, you have one factor column (because
the default behavior is to import strings as factors).

There's no column named COMMAND, just a column named
USER...PID..CPU..MEMVSZ...RSS.TTY..STAT.START...TIME.COMMAND

This is why it's good practice to look at your data after import. Also
why it's good practice to not just copy and paste bits of data into
your R-help messages: it's very hard to see that kind of problem from
just the pasted bits.

Based on your read.table command, you assumed that ps output was
tab-delimited, but I think it's variable numbers of spaces. You're
better off looking into read.fwf instead as a way to take your
human-readable output and import it as a data frame.
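
A minimal sketch of one way to get from raw ps lines to a data frame in base R. It assumes the usual `ps aux` layout of ten whitespace-separated fields followed by a free-form COMMAND, and it splits on whitespace rather than using read.fwf, so it does not depend on exact column widths. The sample lines and the helper name parse_ps_line are illustrative only.

```r
# Illustrative ps output; in practice this would come from readLines() on the saved file.
ps_lines <- c(
  "USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND",
  "root      4677  0.0  0.0  21668   896 ?        Ss   Aug09   0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid",
  "apache   17344  0.0  0.0 174440  2372 ?        S    Aug11   0:00 /usr/sbin/httpd",
  "apache   17345  0.0  0.0 174440  2372 ?        S    Aug11   0:00 /usr/sbin/httpd"
)

n_fixed <- 10  # assumption: ten fields precede COMMAND

# Hypothetical helper: keep the first ten tokens, glue the rest back into COMMAND.
parse_ps_line <- function(line) {
  tokens <- strsplit(trimws(line), "[[:space:]]+")[[1]]
  c(tokens[seq_len(n_fixed)],
    paste(tokens[-seq_len(n_fixed)], collapse = " "))
}

fields <- do.call(rbind, lapply(ps_lines[-1], parse_ps_line))
ps <- as.data.frame(fields, stringsAsFactors = FALSE)
names(ps) <- strsplit(trimws(ps_lines[1]), "[[:space:]]+")[[1]]

# Convert the numeric columns; everything else stays character.
num_cols <- c("PID", "%CPU", "%MEM", "VSZ", "RSS")
ps[num_cols] <- lapply(ps[num_cols], as.numeric)

str(ps)  # check the import before aggregating

# Sum resident memory per command.
agg <- aggregate(RSS ~ COMMAND, data = ps, FUN = sum)
```

Aggregating a single numeric column by COMMAND avoids asking sum() to work on character columns such as USER or TTY.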

Sarah

On Fri, Aug 16, 2013 at 9:38 AM,  mohan.radhakrish...@polarisft.com wrote:
 This is the output.

 str(data)
 'data.frame':   78 obs. of  1 variable:
  $
 USER...PID..CPU..MEMVSZ...RSS.TTY..STAT.START...TIME.COMMAND:
 Factor w/ 78 levels 684504  0.0  0.0  31680  4596 ?Ss
 Aug08   0:01 hald,..: 17 18 19 20 21 22 23 24 25 26 ...




 dput(head(data,20))
 structure(list
 (USER...PID..CPU..MEMVSZ...RSS.TTY..STAT.START...TIME.COMMAND =
 structure(c(17L,
 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 15L, 29L,
 30L, 31L, 32L, 33L, 1L, 34L), .Label = c(684504  0.0  0.0  31680
 4596 ?Ss   Aug08   0:01 hald,
 684513  0.0  0.0  12348   872 ?SAug08   0:00
 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket,
 684517  0.0  0.0  12348   864 ?SAug08   0:00
 hald-addon-keyboard: listening on /dev/input/event1,
 684521  0.0  0.0  12348   860 ?SAug08   0:00
 hald-addon-keyboard: listening on /dev/input/event0,
 apache   17344  0.0  0.0 174440  2372 ?SAug11
 0:00 /usr/sbin/httpd,
 apache   17345  0.0  0.0 174440  2372 ?SAug11
 0:00 /usr/sbin/httpd,
 apache   17346  0.0  0.0 174440  2372 ?SAug11
 0:00 /usr/sbin/httpd,
 apache   17347  0.0  0.0 174440  2372 ?SAug11
 0:00 /usr/sbin/httpd,
 apache   17348  0.0  0.0 174440  2372 ?SAug11
 0:00 /usr/sbin/httpd,
 apache   17349  0.0  0.0 174440  2372 ?SAug11
 0:00 /usr/sbin/httpd,
 apache   17350  0.0  0.0 174440  2372 ?SAug11
 0:00 /usr/sbin/httpd,
 apache   17351  0.0  0.0 174440  2372 ?SAug11
 0:00 /usr/sbin/httpd,
 avahi 4830  0.0  0.0  23296  1316 ?Ss   Aug08   0:00
 avahi-daemon: running [DCS-PRO-POL-APP2.local],
 avahi 4831  0.0  0.0  23172   340 ?Ss   Aug08   0:00
 avahi-daemon: chroot helper,
 dbus  4361  0.0  0.0  21380  1004 ?Ss   Aug08   0:00
 dbus-daemon --system,
 gdm   5030  0.0  0.2 222128 17436 ?Ss   Aug08
 0:00 /usr/libexec/gdmgreeter,
 root  4292  0.0  0.0  0 0 ?S   Aug08   0:00
 [rpciod/6],
 root  4293  0.0  0.0  0 0 ?S   Aug08   0:00
 [rpciod/7],
 root  4294  0.0  0.0  0 0 ?S   Aug08   0:00
 [rpciod/8],
 root  4295  0.0  0.0  0 0 ?S   Aug08   0:00
 [rpciod/9],
 root  4296  0.0  0.0  0 0 ?S   Aug08   0:00
 [rpciod/10],
 root  4297  0.0  0.0  0 0 ?S   Aug08   0:00
 [rpciod/11],
 root  4298  0.0  0.0  0 0 ?S   Aug08   0:00
 [rpciod/12],
 root  4299  0.0  0.0  0 0 ?S   Aug08   0:00
 [rpciod/13],
 root  4300  0.0  0.0  0 0 ?S   Aug08   0:00
 [rpciod/14],
 root  4301  0.0  0.0  0 0 ?S   Aug08   0:00
 [rpciod/15],
 root  4322  0.0  0.0  10184   804 ?Ss   Aug08   0:00
 rpc.statd,
 root  4346  0.0  0.0  55204   764 ?Ss   Aug08   0:00
 rpc.idmapd,
 root  4376  0.0  0.0  10456   788 ?Ss   Aug08
 0:00 /usr/sbin/hcid,
 root  4382  0.0  0.0   5960   544 ?Ss   Aug08
 0:00 /usr/sbin/sdpd,
 root  4448  0.0  0.0  0 0 ?S   Aug08   0:00
 [krfcommd],
 root  4485  0.0  0.0  31440  1412 ?Ssl  Aug08   0:00 pcscd,
 root  4495  0.0  0.0   3824   568 ?Ss   Aug08
 0:00 /usr/sbin/acpid,
 root  4505  0.0  0.0  21720  1060 ?SAug08   0:00
 hald-runner,
 root  4529  0.0  0.0  10256   704 ?SAug08   1:57
 hald-addon-storage: polling /dev/sr0,
 root  4565  0.0  0.0   8540   488 ?Ss   Aug08
 0:00 /usr/bin/hidd --server,
 root  4608  0.0  0.0  54424  1520 ?Ssl  Aug08   0:00
 automount,
 root  4625  0.0  0.0  26348   520 ?Ss   Aug08   0:00 ./hpiod,
 root  4630  0.0  0.0 154892  6428 ?SAug08
 0:00 /usr/bin/python ./hpssd.py,
 root  4643  0.0  0.0  62652  1208 ?Ss   Aug08
 0:00 /usr/sbin/sshd,
 root  4652  0.0  0.0 133456  2740 ?Ss   Aug08   0:00 cupsd,
 root  4666  0.0  0.0

Re: [R] Output from unix PS

This is the output.

 str(data)
'data.frame':   78 obs. of  1 variable:
 $
USER...PID..CPU..MEMVSZ...RSS.TTY..STAT.START...TIME.COMMAND:
Factor w/ 78 levels 684504  0.0  0.0  31680  4596 ?Ss
Aug08   0:01 hald,..: 17 18 19 20 21 22 23 24 25 26 ...




 dput(head(data,20))
structure(list
(USER...PID..CPU..MEMVSZ...RSS.TTY..STAT.START...TIME.COMMAND =
structure(c(17L,
18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 15L, 29L,
30L, 31L, 32L, 33L, 1L, 34L), .Label = c(684504  0.0  0.0  31680
4596 ?Ss   Aug08   0:01 hald,
684513  0.0  0.0  12348   872 ?SAug08   0:00
hald-addon-acpi: listening on acpid socket /var/run/acpid.socket,
684517  0.0  0.0  12348   864 ?SAug08   0:00
hald-addon-keyboard: listening on /dev/input/event1,
684521  0.0  0.0  12348   860 ?SAug08   0:00
hald-addon-keyboard: listening on /dev/input/event0,
apache   17344  0.0  0.0 174440  2372 ?SAug11
0:00 /usr/sbin/httpd,
apache   17345  0.0  0.0 174440  2372 ?SAug11
0:00 /usr/sbin/httpd,
apache   17346  0.0  0.0 174440  2372 ?SAug11
0:00 /usr/sbin/httpd,
apache   17347  0.0  0.0 174440  2372 ?SAug11
0:00 /usr/sbin/httpd,
apache   17348  0.0  0.0 174440  2372 ?SAug11
0:00 /usr/sbin/httpd,
apache   17349  0.0  0.0 174440  2372 ?SAug11
0:00 /usr/sbin/httpd,
apache   17350  0.0  0.0 174440  2372 ?SAug11
0:00 /usr/sbin/httpd,
apache   17351  0.0  0.0 174440  2372 ?SAug11
0:00 /usr/sbin/httpd,
avahi 4830  0.0  0.0  23296  1316 ?Ss   Aug08   0:00
avahi-daemon: running [DCS-PRO-POL-APP2.local],
avahi 4831  0.0  0.0  23172   340 ?Ss   Aug08   0:00
avahi-daemon: chroot helper,
dbus  4361  0.0  0.0  21380  1004 ?Ss   Aug08   0:00
dbus-daemon --system,
gdm   5030  0.0  0.2 222128 17436 ?Ss   Aug08
0:00 /usr/libexec/gdmgreeter,
root  4292  0.0  0.0  0 0 ?S   Aug08   0:00
[rpciod/6],
root  4293  0.0  0.0  0 0 ?S   Aug08   0:00
[rpciod/7],
root  4294  0.0  0.0  0 0 ?S   Aug08   0:00
[rpciod/8],
root  4295  0.0  0.0  0 0 ?S   Aug08   0:00
[rpciod/9],
root  4296  0.0  0.0  0 0 ?S   Aug08   0:00
[rpciod/10],
root  4297  0.0  0.0  0 0 ?S   Aug08   0:00
[rpciod/11],
root  4298  0.0  0.0  0 0 ?S   Aug08   0:00
[rpciod/12],
root  4299  0.0  0.0  0 0 ?S   Aug08   0:00
[rpciod/13],
root  4300  0.0  0.0  0 0 ?S   Aug08   0:00
[rpciod/14],
root  4301  0.0  0.0  0 0 ?S   Aug08   0:00
[rpciod/15],
root  4322  0.0  0.0  10184   804 ?Ss   Aug08   0:00
rpc.statd,
root  4346  0.0  0.0  55204   764 ?Ss   Aug08   0:00
rpc.idmapd,
root  4376  0.0  0.0  10456   788 ?Ss   Aug08
0:00 /usr/sbin/hcid,
root  4382  0.0  0.0   5960   544 ?Ss   Aug08
0:00 /usr/sbin/sdpd,
root  4448  0.0  0.0  0 0 ?S   Aug08   0:00
[krfcommd],
root  4485  0.0  0.0  31440  1412 ?Ssl  Aug08   0:00 pcscd,
root  4495  0.0  0.0   3824   568 ?Ss   Aug08
0:00 /usr/sbin/acpid,
root  4505  0.0  0.0  21720  1060 ?SAug08   0:00
hald-runner,
root  4529  0.0  0.0  10256   704 ?SAug08   1:57
hald-addon-storage: polling /dev/sr0,
root  4565  0.0  0.0   8540   488 ?Ss   Aug08
0:00 /usr/bin/hidd --server,
root  4608  0.0  0.0  54424  1520 ?Ssl  Aug08   0:00
automount,
root  4625  0.0  0.0  26348   520 ?Ss   Aug08   0:00 ./hpiod,
root  4630  0.0  0.0 154892  6428 ?SAug08
0:00 /usr/bin/python ./hpssd.py,
root  4643  0.0  0.0  62652  1208 ?Ss   Aug08
0:00 /usr/sbin/sshd,
root  4652  0.0  0.0 133456  2740 ?Ss   Aug08   0:00 cupsd,
root  4666  0.0  0.0  21668   896 ?Ss   Aug08   0:00 xinetd
-stayalive -pidfile /var/run/xinetd.pid,
root  4684  0.0  0.0  66968  2320 ?Ss   Aug08   0:00 sendmail:
accepting connections,
root  4702  0.0  0.0   6480   376 ?Ss   Aug08   0:00 gpm
-m /dev/input/mice -t exps2,
root  4712  0.0  0.0 174440  3800 ?Ss   Aug08
0:00 /usr/sbin/httpd,
root  4767  0.0  0.0 135536  2800 ?Ss   Aug08   0:00 smbd -D,
root  4770  0.0  0.0 107776  1544 ?Ss   Aug08   0:00 nmbd -D,
root  4777  0.0  0.0 135536  1400 ?SAug08   0:00 smbd -D,
root  4788  0.0  0.0  18756   448 ?Ss   Aug08
0:00 /usr/sbin/atd,
root  4892  0.0  0.0  18440   480 ?SAug08
0:00 /usr/sbin/smartd -q never,
root  4897  0.0  0.0   3816   488 tty1 Ss+  Aug08
0:00 /sbin/mingetty tty1,
root  4898  0.0  0.0   3816   484 tty2 Ss+  Aug08
0:00 /sbin/mingetty tty2,
root  4899  0.0  0.0   3816   488 tty3 Ss+  Aug08
0:00 /sbin/mingetty tty3,
root  4900  0.0  0.0

Re: [R] Adding additional points to ggplot2

2013-08-16 Thread Ista Zahn

Hi Chris,

Just add a second geom_point() call and override the y axis mapping.
For example:

library(ggplot2)
ggplot(mtcars, aes(x=hp, y=mpg)) +
  geom_point() +
  geom_point(aes(y=qsec), color="red")

Best,
Ista

On Fri, Aug 16, 2013 at 5:45 AM, Chris89 <chris...@stud.ntnu.no> wrote:
 Hi!
 I am having difficulty adding additional points to a plot using ggplot2.

 The case is that I want to plot both original and estimated values in the
 same graph. In general I would use
 plot and then lines, but I do not know how to do it with ggplot...

 Thanks!

 Regards,
 Chris





Re: [R] Plotting Multiple Factors By Dates With Lattice

2013-08-16 Thread Rich Shepard


On Thu, 15 Aug 2013, Rich Shepard wrote:


Now I see the source of my error: I quoted the data file name! Removing
the quotation marks produces the plots.


  Thanks to A.K. and Dennis Murphy I understand how to plot the data in
these data sets. However, I am not getting the colors within the plot to
match those in the key despite reading about using color palettes and
experimenting with palettes and various numbers of colors. Since I don't
see what I'm doing incorrectly I'd appreciate having someone point this out
to me.

  Data set:


dput(bdf)
structure(list(sampdate = structure(c(11156, 11156, 11156, 11156, 
11156, 12241, 12241, 12241, 12241, 12241, 12977, 12977, 12977, 
12977, 12977, 13327, 13327, 13327, 13327, 13327, 14866, 14866, 
14866, 14866, 14866, 14866, 14866, 15168, 15168, 15168, 15168, 
15168, 15168, 15170, 15170, 15170, 15170, 15170, 15170, 15170, 
15532, 15532, 15532, 15532, 15532, 15532), class = "Date"), func_feed_grp =
structure(c(1L, 
2L, 3L, 6L, 7L, 1L, 2L, 3L, 6L, 7L, 1L, 2L, 3L, 6L, 7L, 1L, 2L, 
3L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 6L, 7L), .Label = c(" Filterer\n", 
" Gatherer  ", " Grazer", " Omnivore  ", " Parasite  ", 
" Predator  ", " Shredder  "), class = "factor"), pct = c(0.0351, 
0.7054, 0.0442, 0.1078, 0.1074, 0.157, 0.7039, 0.0023, 0.0456, 
0.0912, 0.0293, 0.6634, 0.0055, 0.0552, 0.2466, 0.0414, 0.4776, 
0.1033, 0.2012, 0.1765, 0.0811, 0.5785, 0.0284, 0.0131, 0.0018, 
0.0736, 0.2234, 0.0041, 0.9011, 0.0563, 0.01, 0.0037, 0.0247, 
0.0385, 0.8469, 0.0147, 5e-04, 0.0197, 0.0688, 0.0109, 0.1275, 
0.503, 0.0257, 8e-04, 0.1464, 0.1966)), .Names = c("sampdate", 
"func_feed_grp", "pct"), row.names = c(NA, -46L), class = "data.frame")


  The command I'm using to plot this data frame:

xyplot(pct ~ sampdate, data = bdf, groups = func_feed_grp, type = 'l', col =
rainbow(8), key = simpleKey(text = levels(bdf$func_feed_grp), space =
'right'))

  I've used rainbow(6) through rainbow(12) and cannot get it correct. A
sample plot is attached. Notice for the left-most points (year 2000) there
are 5 functional feeding groups in the data: gatherers (approximately 70
percent), predators and shredders (approximately 10 percent each), grazers
and scrapers (approximately 4 percent each). According to the key parasites
are approximately 70 percent of those data and gatherers approximately 10
percent. That's not correct.

Rich

sample.plot.pdf
Description: Adobe PDF document

Re: [R] Output from unix PS

2013-08-16 Thread jim holtman

Here is what I would do to read your data:


 psAll <- readLines("/temp/node1-ps-1.txt")
 ps <- read.table(text = substring(psAll, 1, 65) # parse the first part
     , header = TRUE
     , as.is = TRUE
     , fill = TRUE
     , check.names = FALSE
     )
 commands <- substring(psAll, 66)[-1]  # remove header
 ps <- cbind(ps, COMMAND = commands)

 head(ps)
  USER PID %CPU %MEM   VSZ RSS TTY STAT START TIME       COMMAND
1 root   1    0    0 10372 692   ?   Ss Aug09 0:02      init [5]
2 root   2    0    0     0   0   ?    S Aug09 0:00 [migration/0]
3 root   3    0    0     0   0   ?   SN Aug09 0:00 [ksoftirqd/0]
4 root   4    0    0     0   0   ?    S Aug09 0:00  [watchdog/0]
5 root   5    0    0     0   0   ?    S Aug09 0:00 [migration/1]
6 root   6    0    0     0   0   ?   SN Aug09 0:00 [ksoftirqd/1]




On Fri, Aug 16, 2013 at 9:08 AM, mohan.radhakrish...@polarisft.com wrote:

 My code is not complete. But the headers I see are

 USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
 684517  0.0  0.0  12348   864 ?SAug08   0:00
 hald-addon-keyboard: listening on /dev/input/event1
 684521  0.0  0.0  12348   860 ?SAug08   0:00
 hald-addon-keyboard: listening on /dev/input/event0
 root  4529  0.0  0.0  10256   704 ?SAug08   1:57
 hald-addon-storage: polling /dev/sr0

 The last column value doesn't follow any rule.

 Thanks,
 Mohan





Re: [R] Output from unix PS


Sarah Goslee
  to:
mohan.radhakrishnan
 16-08-2013 06:01 PM




Cc:
r-help









 You don't tell us your OS details or provide an actual reproducible
 example, but the error suggests that COMMAND is not one of the column
 names in your data frame.

 Have you checked that? When I run ps on my Fedora system, the column
 name is CMD and not COMMAND.

 Sarah

 On Fri, Aug 16, 2013 at 7:30 AM,  mohan.radhakrish...@polarisft.com
 wrote:
  Hi,
 
  When I view this output from the unix command ''ps'' the columns seem to
  be properly aligned, but it is not read properly into the data frame. The
  aggregate function throws
 
  Error in aggregate.data.frame(data, by = list(COMMAND), FUN = sum) :
    object 'COMMAND' not found
 
  Is there a recommendation for massaging this?
 
  data = read.table("D:\\p..txt",sep="\t")
 
  agg<-aggregate(data,by=list(COMMAND),FUN=sum)
 
  684524  0.0  0.0  12348   872 ?SAug09   0:00
  hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
  684528  0.0  0.0  12348   860 ?SAug09   0:00
  hald-addon-keyboard: listening on /dev/input/event1
  684532  0.0  0.0  12348   864 ?SAug09   0:00
  hald-addon-keyboard: listening on /dev/input/event0
  root  4540  0.0  0.0  10256   704 ?SAug09   1:02
  hald-addon-storage: polling /dev/sr0
  root  4576  0.0  0.0   8540   492 ?Ss   Aug09
  0:00 /usr/bin/hidd --server
  root  4619  0.0  0.0 122008  1540 ?Ssl  Aug09   0:00
 automount
  root  4636  0.0  0.0  26348   524 ?Ss   Aug09   0:00 ./hpiod
  root  4641  0.0  0.0 154876  6428 ?SAug09
  0:00 /usr/bin/python ./hpssd.py
  root  4654  0.0  0.0  63544  1212 ?Ss   Aug09
  0:00 /usr/sbin/sshd
  root  4663  0.0  0.0 134208  2744 ?Ss   Aug09   0:00 cupsd
  root  4677  0.0  0.0  21668   896 ?Ss   Aug09   0:00 xinetd
  -stayalive -pidfile /var/run/xinetd.pid
  root  4695  0.0  0.0  66968  2324 ?Ss   Aug09   0:00
 sendmail:
  accepting connections
  smmsp 4703  0.0  0.0  57716  1760 ?Ss   Aug09   0:00
 sendmail:
  Queue runner@01:00:00 for /var/spool/clientmqueue
  root  4713  0.0  0.0   6480   372 ?Ss   Aug09   0:00 gpm
  -m /dev/input/mice -t exps2
 
  Thanks.
 
 
 
 --
 Sarah Goslee
 http://www.functionaldiversity.org








-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] Repeated measures Cox regression ??coxph??

2013-08-16 Thread Göran Broström


Sorry I'm late with this.

On 07/26/2013 02:02 PM, Terry Therneau wrote:

Two choices. If this were a linear model, do you like the GEE
approach or a mixed effects approach? Assume that subject is a
variable containing a per-subject identifier.

GEE approach: add "+ cluster(subject)" to the model statement in
coxph. Mixed models approach: add "+ (1|subject)" to the model
statement in coxme.
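
The two model statements can be sketched as follows. This is only an illustration on simulated data: the variable names (id, x, time, status) and the simulation settings are assumptions, not John's data, and the coxme fit is attempted only if that package happens to be installed.

```r
library(survival)

set.seed(1)
n      <- 50                                  # hypothetical number of subjects
id     <- rep(seq_len(n), each = 2)           # two observations per subject
frail  <- rep(rnorm(n), each = 2)             # shared per-subject random intercept
x      <- rep(rbinom(n, 1, 0.5), each = 2)    # binary covariate, constant within subject
time   <- rexp(2 * n, rate = exp(x + frail))  # event times, true coefficient = 1
status <- rep(1, 2 * n)                       # no censoring in this toy example
d <- data.frame(id, x, time, status)

# GEE-style approach: same point estimate as an unclustered fit,
# with a robust (sandwich) variance that accounts for the clustering.
fit_gee <- coxph(Surv(time, status) ~ x + cluster(id), data = d)

# Mixed-model approach: random intercept per subject, if coxme is available.
if (requireNamespace("coxme", quietly = TRUE)) {
  fit_mixed <- coxme::coxme(Surv(time, status) ~ x + (1 | id), data = d)
}

summary(fit_gee)
```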


Note that the 'cluster' approach ignores the clustering regarding the 
regression parameter estimates. It tries to correct the optimistic 
variance estimate given by ignoring the clustering, but it does nothing 
about the bias that may be introduced.



When only a very few subjects have multiple events, the mixed model
(random effect) approach may not be reliable, however.  Multiple
events per group are the fuel for estimation of the variance of the
random effect, and with few of these the profile likelihood of the
random effect will be very flat.  You can get essentially a random
estimate of the variance of the subject effect.  I'm still getting
my arms around this issue, and it has taken me a long time.


John had exactly two observations per subject, and given that a frailty 
model is reasonable, the bias may be substantial if ignoring it. I made 
a small simulation study to convince myself: frailty variance = 1, one 
binary covariate (constant within subjects) and beta coefficient = 1. 
With 20 subjects, the bias for coxme was -0.004, for coxph (with 
'cluster', but it doesn't matter) -0.294 (based on 1000 replicates). 
(The bias for the frailty standard deviation was -0.108, but who cares 
when we regard it as just a nuisance?)


Of course this doesn't prove anything, but it makes me worried; it is 
easy to understand the frailty model, but what is the 'GEE' model in 
this survival case? Why should it be used in John's case?



Frailty is an alternate label for random effects when all we have
is a random intercept.  Multiple labels for the same idea adds
confusion, but nothing else.


The term frailty was (to my knowledge) coined by Vaupel, Manton &
Stallard in a 1979 paper in 'Demography'. They used it to describe 
heterogeneity in demographic data, and what could happen if it was 
ignored. Just for the record.


Göran


Terry Therneau

On 07/25/2013 08:14 PM, Marc Schwartz wrote:

On Jul 25, 2013, at 4:45 PM, David
Winsemius <dwinsem...@comcast.net> wrote:


On Jul 25, 2013, at 12:27 PM, Marc Schwartz wrote:


On Jul 25, 2013, at 2:11 PM, John
Sorkin <jsor...@grecc.umaryland.edu> wrote:


Colleagues, Is there any R package that will allow one to
perform a repeated measures Cox Proportional Hazards
regression? I don't think coxph is set up to handle this type
of problem, but I would be happy to know that I am not
correct. I am doing a study of time to hip joint replacement.
As each person has two hips, a given person can appear in the
dataset twice, once for the left hip and once for the right
hip, and I need to account for the correlation of data from a
single individual. Thank you, John



John,

See Terry's 'coxme' package:

http://cran.r-project.org/web/packages/coxme/index.html


When I looked over the description of coxme, I was concerned it
was not really designed with this in mind. Looking at Therneau
and Grambsch, I thought section 8.4.2 in the 'Multiple Events per
Subject' Chapter fit the analysis question well. There they
compared the use of coxph( ...+cluster(ID),,...) with coxph(
...+strata(ID),,...). Unfortunately I could not tell for sure
which one was being described as superior but I think it was the
cluster() alternative. I seem to remember there are discussions
in the archives.


David,

I think that you raise a good point. The example in the book (I had
to wait to get home to read it) is potentially different however,
in that the subjects' eyes were randomized to treatment or
control, which would seem to suggest comparable baseline
characteristics for each pair of eyes, as well as an active
intervention on one side where a difference in treatment effect
between each eye is being analyzed.

It is not clear from John's description above if there is one hip
that will be treated versus one as a control and whether the extent
of disease at baseline is similar in each pair of hips. Presumably
the timing of hip replacements will be staggered at some level,
even if there is comparable disease, simply due to post-op recovery
time and surgical risk. In cases where the disease between each hip
is materially different, that would be another factor to consider,
however I would defer to orthopaedic physicians/surgeons from a
subject matter expertise consideration. It is possible that the
bilateral hip replacement data might be more of a parallel to
bilateral breast cancer data, if each breast were to be tracked
separately.

I have cc'd Terry here, hoping that he might jump in and offer some
insights into the pros/cons of using coxme versus coxph with either
a cluster or strata based approach, or perhaps even a

Re: [R] Plotting Multiple Factors By Dates With Lattice

2013-08-16 Thread Richard M. Heiberger

The major problem is all the padding and the LF in the level names.
This repair is based on the ?gsub example on ## trim trailing white space.

levels(bdf$func_feed_grp)
## [1]  Filterer\n  GathererGrazer  Omnivore
 
## [5]  ParasitePredatorShredder  
levels(bdf$func_feed_grp) <- sub("^[[:space:]]+", "", sub("[[:space:]]+$",
"", levels(bdf$func_feed_grp) )) ## white space, POSIX-style
levels(bdf$func_feed_grp)
## [1] "Filterer" "Gatherer" "Grazer"   "Omnivore" "Parasite" "Predator" "Shredder"
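
A small self-contained sketch of the same repair on a toy factor, in case it helps to see it in isolation. The factor f and its padded labels are made up for illustration; a single gsub() with an alternation trims both ends in one pass.

```r
# Toy factor with leading/trailing padding and a stray newline in one level.
padded <- c(" Filterer\n", " Gatherer  ", " Grazer")
f <- factor(padded, levels = padded)

# Strip leading and trailing white space (POSIX class covers spaces, tabs, newlines).
levels(f) <- gsub("^[[:space:]]+|[[:space:]]+$", "", levels(f))

levels(f)
```

Assigning to levels() relabels the factor in place, so the underlying integer codes (and therefore the data) are untouched.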


I switched the key to lines to match the graph.

You need to control the color choice differently when using a key.
It is discussed in ?xyplot in the paragraph
  Note that 'simpleKey' uses the default settings (see
  'trellis.par.get') to determine the graphical parameters in
  the key, so the resulting legend will be meaningful only if
  the same settings are used in the plot as well.  The
  'par.settings' argument may be useful to temporarily modify
  the default settings for this purpose.


xyplot(pct ~ sampdate, data = bdf, groups = func_feed_grp, type = 'l',
   key = simpleKey(text = levels(bdf$func_feed_grp), space ='right',
                   points=FALSE, lines=TRUE),
   par.settings=list(superpose.symbol=list(col=rainbow(7)),
                     superpose.line=list(col=rainbow(7))))
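
As a hedged, self-contained illustration of the point in ?xyplot, the sketch below uses made-up data (d, with groups a, b, c) and sets the line colors once through par.settings, so that the plot and the key read the same superpose.line settings. It uses auto.key, which is a convenience wrapper around simpleKey, rather than the exact call above.

```r
library(lattice)

# Hypothetical data: three groups, five points each.
d <- data.frame(
  x = rep(1:5, times = 3),
  y = c(1:5, (1:5) * 2, (1:5) * 3),
  g = factor(rep(c("a", "b", "c"), each = 5))
)

cols <- rainbow(nlevels(d$g))  # one color per group

# Colors go in par.settings, not in col=, so the key picks them up too.
p <- xyplot(y ~ x, data = d, groups = g, type = "l",
            auto.key = list(space = "right", points = FALSE, lines = TRUE),
            par.settings = list(superpose.line = list(col = cols)))

# print(p)  # draws the plot; needed explicitly when run from a script
```

Passing col= directly to xyplot changes the plotted lines but not the key, which is what produces a mismatched legend.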


Rich



On Fri, Aug 16, 2013 at 10:49 AM, Rich Shepard <rshep...@appl-ecosys.com> wrote:

 On Thu, 15 Aug 2013, Rich Shepard wrote:

  Now I see the source of my error: I quoted the data file name! Removing
 the quotation marks produces the plots.


   Thanks to A.K. and Dennis Murphy I understand how to plot the data in
 these data sets. However, I am not getting the colors within the plot to
 match those in the key despite reading about using color palettes and
 experimenting with palettes and various numbers of colors. Since I don't
 see what I'm doing incorrectly I'd appreciate having someone point this out
 to me.

   Data set:

  dput(bdf)

 structure(list(sampdate = structure(c(11156, 11156, 11156, 11156, 11156,
 12241, 12241, 12241, 12241, 12241, 12977, 12977, 12977, 12977, 12977,
 13327, 13327, 13327, 13327, 13327, 14866, 14866, 14866, 14866, 14866,
 14866, 14866, 15168, 15168, 15168, 15168, 15168, 15168, 15170, 15170,
 15170, 15170, 15170, 15170, 15170, 15532, 15532, 15532, 15532, 15532,
 15532), class = "Date"), func_feed_grp =
 structure(c(1L, 2L, 3L, 6L, 7L, 1L, 2L, 3L, 6L, 7L, 1L, 2L, 3L, 6L, 7L,
 1L, 2L, 3L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 6L, 7L), .Label = c(" Filterer\n",
 " Gatherer  ", " Grazer", " Omnivore  ", " Parasite  ",
 " Predator  ", " Shredder  "), class = "factor"), pct =
 c(0.0351, 0.7054, 0.0442, 0.1078, 0.1074, 0.157, 0.7039, 0.0023, 0.0456,
 0.0912, 0.0293, 0.6634, 0.0055, 0.0552, 0.2466, 0.0414, 0.4776, 0.1033,
 0.2012, 0.1765, 0.0811, 0.5785, 0.0284, 0.0131, 0.0018, 0.0736, 0.2234,
 0.0041, 0.9011, 0.0563, 0.01, 0.0037, 0.0247, 0.0385, 0.8469, 0.0147,
 5e-04, 0.0197, 0.0688, 0.0109, 0.1275, 0.503, 0.0257, 8e-04, 0.1464,
 0.1966)), .Names = c("sampdate", "func_feed_grp", "pct"), row.names = c(NA,
 -46L), class = "data.frame")

   The command I'm using to plot this data frame:

 xyplot(pct ~ sampdate, data = bdf, groups = func_feed_grp, type = 'l', col
 =
 rainbow(8), key = simpleKey(text = levels(bdf$func_feed_grp), space =
 'right'))

   I've used rainbow(6) through rainbow(12) and cannot get it correct. A
 sample plot is attached. Notice for the left-most points (year 2000) there
 are 5 functional feeding groups in the data: gatherers (approximately 70
 percent), predators and shredders (approximately 10 percent each), grazers
 and scrapers (approximately 4 percent each). According to the key parasites
 are approximately 70 percent of those data and gatherers approximately 10
 percent. That's not correct.

 Rich




Re: [R] Plotting Multiple Factors By Dates With Lattice

2013-08-16 Thread Rich Shepard


On Fri, 16 Aug 2013, Richard M. Heiberger wrote:


The major problem is all the padding and the LF in the level names.
This repair is based on the ?gsub example on ## trim trailing white space.


Rich

  Thanks. I thought of removing white space (didn't notice the spurious
newline) in emacs but did not see that it made a difference. Now I know it
does I'll clean up all the files.


I switched the key to lines to match the graph.


  OK.


You need to control the color choice differently when using a key.
It is discussed in ?xyplot in the paragraph
 Note that 'simpleKey' uses the default settings (see
 'trellis.par.get') to determine the graphical parameters in
 the key, so the resulting legend will be meaningful only if
 the same settings are used in the plot as well.  The
 'par.settings' argument may be useful to temporarily modify
 the default settings for this purpose.


  I'll carefully read it.


xyplot(pct ~ sampdate, data = bdf, groups = func_feed_grp, type = 'l',
  key = simpleKey(text = levels(bdf$func_feed_grp), space ='right',
                  points=FALSE, lines=TRUE),
  par.settings=list(superpose.symbol=list(col=rainbow(7)),
                    superpose.line=list(col=rainbow(7))))


  Valuable lessons learned here. Again, thanks.

Rich


Re: [R] regex challenge

2013-08-16 Thread William Dunlap

The following makes the name converter function an argument to ff (and restores 
the colon operator to the list of formula operators), but I'm not sure what you 
need the converter to do.

ff <- function(expr, convertName = function(name) paste0(toupper(name), "z")) {
    if (is.call(expr) && is.name(expr[[1]]) &&
        is.element(as.character(expr[[1]]), c("~","+","-","*","/","%in%","(", ":"))) {
        for(i in seq_along(expr)[-1]) {
            expr[[i]] <- Recall(expr[[i]], convertName = convertName)
        }
    } else if (is.name(expr)) {
        expr <- as.name(convertName(expr))
    }
    expr
}
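
A short usage sketch, assuming the function reads as reconstructed here (the archive stripped the quotes, `<-` and `&&`). The formula a and the "log_" converter are hypothetical; the point is that names inside calls not in the operator list, such as `==`, are left alone.

```r
# ff() as in the message, with the operator list pulled out for readability.
ff <- function(expr, convertName = function(name) paste0(toupper(name), "z")) {
    ops <- c("~", "+", "-", "*", "/", "%in%", "(", ":")
    if (is.call(expr) && is.name(expr[[1]]) &&
        is.element(as.character(expr[[1]]), ops)) {
        # Recurse into the arguments of a formula operator.
        for (i in seq_along(expr)[-1]) {
            expr[[i]] <- Recall(expr[[i]], convertName = convertName)
        }
    } else if (is.name(expr)) {
        # A bare variable name: apply the converter.
        expr <- as.name(convertName(expr))
    }
    expr
}

a <- y ~ age + (sex == "Female") * sbp   # hypothetical formula

renamed  <- ff(a)                        # default converter: upper-case plus "z"
prefixed <- ff(a, convertName = function(name) paste0("log_", as.character(name)))

print(renamed)
print(prefixed)
```

Because the converter's result is passed through as.name(), it has to return a single string usable as a symbol; returning an arbitrary expression would need parse() or a call object in place of as.name().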

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of Frank Harrell
 Sent: Thursday, August 15, 2013 7:47 PM
 To: RHELP
 Subject: Re: [R] regex challenge
 
 Bill, that is very impressive.  The only problem I'm having is that I want
 the paste0(toupper(...)) to be a general function that returns a
 character string that is a legal part of a formula object that can't be
 converted to a 'name'.
 
 Frank
 
 
 ---
 Oops, I left ( out of the list of operators.
 
 
 ff <- function(expr) {
     if (is.call(expr) && is.name(expr[[1]]) &&
         is.element(as.character(expr[[1]]),
                    c("~","+","-","*","/","%in%","("))) {
         for(i in seq_along(expr)[-1]) {
             expr[[i]] <- Recall(expr[[i]])
         }
     } else if (is.name(expr)) {
         expr <- as.name(paste0(toupper(as.character(expr)), "z"))
     }
     expr
 }
 
   ff(a)
 CATz + (AGEz + Heading(Females) * (sex == Female) * SBPz) *
  Heading() * Gz + (AGEz + SBPz) * Heading() * TRIOz ~ Heading() *
  COUNTRYz * Heading() * SEXz
 
 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com
 
 
   -Original Message-
   From: [hidden email] [mailto:[hidden email]] On Behalf
   Of William Dunlap
   Sent: Thursday, August 15, 2013 6:03 PM
   To: Frank Harrell; RHELP
   Subject: Re: [R] regex challenge
  
   Try this one
  
   ff <- function (expr)
   {
       if (is.call(expr) && is.name(expr[[1]]) &&
           is.element(as.character(expr[[1]]),
                      c("~", "+", "-", "*", "/", ":", "%in%"))) {
           # the above list should cover the standard formula operators.
           for (i in seq_along(expr)[-1]) {
               expr[[i]] <- Recall(expr[[i]])
           }
       }
       else if (is.name(expr)) {
           # the conversion itself
           expr <- as.name(paste0(toupper(as.character(expr)), "z"))
       }
       expr
   }
  
ff(a)
   CATz + (age + Heading(Females) * (sex == Female) * sbp) *
   Heading() * Gz + (age + sbp) * Heading() * TRIOz ~ Heading() *
   COUNTRYz * Heading() * SEXz
  
   Bill Dunlap
   Spotfire, TIBCO Software
   wdunlap tibco.com
  
  
-Original Message-
From: [hidden email] [mailto:[hidden email]] On Behalf
Of Frank Harrell
Sent: Thursday, August 15, 2013 4:45 PM
To: RHELP
Subject: Re: [R] regex challenge
   
    I really appreciate the excellent ideas from Bill Dunlap and Greg Snow.
    Both suggestions almost work perfectly.  Greg's recognizes expressions
    such as sex=='female' but not ones such as age > 21, age < 21, a - b >
    0, and possibly other legal R expressions.  Bill's idea is similar to
    what Duncan Murdoch suggested to me.  Bill's doesn't catch the case when
    a variable appears both in an expression and as a regular variable (sex
    in the example below):
   
f <- function(formula) {
   trms <- terms(formula)
   variables <- as.list(attr(trms, "variables"))[-1]
   ## the 'variables' attribute is stored as a call to list(),
   ## so we changed the call to a list and removed the first element
   ## to get the variables themselves.
   if (attr(trms, "response") == 1) {
     ## terms does not pull apart right hand side of formula,
     ## so we assume each non-function is to be renamed.
     responseVars <- lapply(all.vars(variables[[1]]), as.name)
     variables <- variables[-1]
   } else {
     responseVars <- list()
   }
   ## omit non-name variables from list of ones to change.
   ## This is where you could expand calls to certain functions.
   variables <- variables[vapply(variables, is.name, TRUE)]
   variables <- c(responseVars, variables) # all are names now
   names(variables) <- vapply(variables, as.character, "")
   newVars <- lapply(variables, function(v) as.name(paste0(toupper(v), "z")))
   formula(do.call("substitute", list(formula, newVars)),
           env=environment(formula))
}
   
a <- cat + (age + Heading("Females") * (sex == "Female") * sbp) *
 Heading() * g + (age + sbp) * Heading() * trio ~ Heading() *
 country * Heading() * sex
f(a)
   
Output:
   
CATz + (AGEz + Heading("Females") * (SEXz == "Female") * SBPz) *
 Heading() * Gz + (AGEz + SBPz) * Heading() * TRIOz ~ Heading() *
 COUNTRYz * Heading() * SEXz
   
The method also

Re: [R] คำถาม


On Aug 16, 2013, at 12:50 AM, Boonchai Oua-arunkij wrote:

 I working with the R Text Mining in Thai Language. I got a problem and i want 
 to ask something, please.1.Program R Studio can read Thai language but it's 
 not complete. It's can fix?

This is not the support list for R Studio.

 2.I need to run code to be a graft. How i can writing the code? please 
 suggest me.

I think something was lost in translation with the word "graft". Perhaps the 
answer is source(), but other possibilities would be merge() or rbind(). An 
example (as suggested in the Posting Guide) would help greatly here.

 Thank you very much
 Please send me a manual as well, and one on doing text mining.
 
   [[alternative HTML version deleted]]
 

-- 

David Winsemius
Alameda, CA, USA


[R] about plantbreeding library

2013-08-16 Thread Waqas Shafqat

Sir, I have successfully installed the plantbreeding library following the
procedure on the web,

but the problem is that the plantbreeding library does not work.

I have tried it in both versions, i.e. RGui 3.0.0 and RGui 3.0.1.

Please guide me.


Re: [R] regex challenge

2013-08-16 Thread Frank Harrell

Thanks Bill.  The problem is one of the results of convertName might be 
'Heading("Age in Years")*age'  (this is for the tables package), and 
as.name converts this to `Heading(...)*age` and the backticks cause 
the final formula to have a mixture of regular elements and ` ` quoted 
expression elements, making the formula invalid.

Best,
Frank
---

The following makes the name converter function an argument to ff (and 
restores the colon operator to the list of formula operators), but I'm 
not sure what you need the converter to do.


ff <- function(expr, convertName = function(name) paste0(toupper(name), "z")) {
    if (is.call(expr) && is.name(expr[[1]]) &&
        is.element(as.character(expr[[1]]),
                   c("~", "+", "-", "*", "/", "%in%", "(", ":"))) {
        for(i in seq_along(expr)[-1]) {
            expr[[i]] <- Recall(expr[[i]], convertName = convertName)
        }
    } else if (is.name(expr)) {
        expr <- as.name(convertName(expr))
    }
    expr
}
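
For the backtick problem described at the top of this message, one possible direction is to parse the converter's result instead of passing it to as.name(). The sketch below is only an illustration under that assumption: ffLang and addHeading are hypothetical names that do not appear in the thread, and the converter is assumed to return a string that parses as a single R expression.

```r
# Formula operators whose arguments are searched for variable names.
formulaOps <- c("~", "+", "-", "*", "/", "%in%", "(", ":")

# Hypothetical variant of ff(): the converter returns a string holding any
# legal R expression, which is parsed into a language object rather than
# being forced into a single (backquoted) name.
ffLang <- function(expr, convertName) {
  if (is.call(expr) && is.name(expr[[1]]) &&
      as.character(expr[[1]]) %in% formulaOps) {
    for (i in seq_along(expr)[-1]) {
      expr[[i]] <- ffLang(expr[[i]], convertName)
    }
  } else if (is.name(expr)) {
    expr <- parse(text = convertName(as.character(expr)))[[1]]
  }
  expr
}

# Hypothetical converter: 'age' gets a heading, everything else is upcased.
addHeading <- function(name) {
  if (name == "age") 'Heading("Age in Years") * age' else toupper(name)
}

small <- y ~ age + sbp
res <- ffLang(small, addHeading)
print(res)
```

Because the replacement is a call rather than a name, the printed formula should contain no backquoted elements. Whether this is acceptable for the tables package is not something the thread confirms.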

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: [hidden email] [mailto:[hidden email]] On Behalf
 Of Frank Harrell
 Sent: Thursday, August 15, 2013 7:47 PM
 To: RHELP
 Subject: Re: [R] regex challenge

 Bill, that is very impressive.  The only problem I'm having is that I want
 the paste0(toupper(...)) to be a general function that returns a
 character string that is a legal part of a formula object that can't be
 converted to a 'name'.

 Frank


 ---
 Oops, I left "(" out of the list of operators.


 ff <- function(expr) {
  if (is.call(expr) && is.name(expr[[1]]) &&
      is.element(as.character(expr[[1]]),
                 c("~", "+", "-", "*", "/", "%in%", "("))) {
   for(i in seq_along(expr)[-1]) {
    expr[[i]] <- Recall(expr[[i]])
   }
  } else if (is.name(expr)) {
   expr <- as.name(paste0(toupper(as.character(expr)), "z"))
  }
  expr
 }

   > ff(a)
 CATz + (AGEz + Heading("Females") * (sex == "Female") * SBPz) *
  Heading() * Gz + (AGEz + SBPz) * Heading() * TRIOz ~ Heading() *
  COUNTRYz * Heading() * SEXz

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com



Re: [R] Memory limit on Linux?

2013-08-16 Thread Stackpole, Chris

Greetings,

Just a follow up on this problem. I am not sure where the problem lies, but we 
think it is the user's code and/or CRAN plugin that may be the cause. We have 
been getting pretty familiar with R recently and we can allocate and load large 
datasets into 10+GB of memory. One of our other users runs a program at the 
start of every week and claims he regularly gets 35+GB of memory (indeed, when 
we tested it on this week's data set it was just over 30GB). So it is clear 
that this problem is not a problem with R, the system, or any artificial limits 
that we can find.

So why is there a difference between one system and the other in terms of usage 
on what should be the exact same code? Well first off, I am not convinced it is 
the same dataset even though that is the claim (I don't have access to verify 
for various reasons). Second, he is using some libraries from the CRAN repos. 
We have already found an instance a few months ago where we had a bad compile 
that was behaving weird. I reran the compile for that library and it 
straightened out. I am wondering if this is the possibility again. The user is 
researching the library sets now.

In short, we don't have a solution yet to this explicit problem but at least I 
know for certain it isn't the system or R. Now that I can take a solid stance 
on those facts I have good ground to approach the user and politely say "Let's 
look at how we might be able to improve your code."

Thanks to everyone who helped me debug this issue. I do appreciate it.

Chris Stackpole 


[R] Envelope curve for scatterplot

2013-08-16 Thread frauke

Hi, 

to be concise, let me start with my problem: I have a scatterplot that I
want to fit an envelope curve to. The picture of the scatterplot is below. 
http://r.789695.n4.nabble.com/file/n4673965/day_DryPond1_1_rid03.jpg 

I have 140 of these plots that I need to compare. Rather than visually
comparing all the plots, I would like to compare the parameters of the curves
enveloping them. 
Possible parameters could be the x-value where the curve starts dropping
down, the y-value that is the asymptote of the enveloping curve, and some
parameter describing the curve itself. 

After some googling, I still have no idea how to approach this. Could
anybody please give me a hint where to start?

I attached the data. The plot was generated using the code below (I plotted
the 2nd column against columns 15, 16, 17).
  plot(output[,2],(1-output[,15]),xlab="Precip [inch]",
       ylab="Eff.",ylim=c(-1,1),xlim=c(0,8),pch=18)
  points(output[,2],(1-output[,16]),col=2,pch=18)
  points(output[,2],(1-output[,17]),col=3,pch=18)
DryPond1_1_rid03_24.txt
http://r.789695.n4.nabble.com/file/n4673965/DryPond1_1_rid03_24.txt  

Any help would be much appreciated. 

Thank you! Frauke
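
One possible starting point, offered only as a rough sketch: bin the x-values and take the maximum y in each bin, which gives a piecewise estimate of the upper envelope that a curve could later be fitted to. The data below are simulated stand-ins (precip and eff are made-up names, not the columns of the attached file), and the bin width of 0.5 is an arbitrary assumption.

```r
# Simulated stand-in for the precipitation/efficiency data (assumption).
set.seed(42)
precip <- runif(500, min = 0, max = 8)
eff    <- runif(500, min = -1, max = exp(-precip / 4))

# Bin the x-axis and keep the largest y-value in each bin.
breaks <- seq(0, 8, by = 0.5)
bins   <- cut(precip, breaks = breaks, include.lowest = TRUE)
upper  <- tapply(eff, bins, max)

# Bin midpoints, for plotting or for fitting a curve to the envelope.
mids <- head(breaks, -1) + 0.25

if (interactive()) {
  plot(precip, eff, pch = 18, xlab = "Precip [inch]", ylab = "Eff.")
  lines(mids, upper, col = 2, lwd = 2)
}
```

The (mids, upper) pairs could then be passed to a curve-fitting routine such as nls() to obtain comparable parameters per plot, though the choice of curve is left open here.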






Re: [R] A question about using delayedAssign

2013-08-16 Thread Duncan Murdoch


On 13-08-14 9:11 PM, Gang Peng wrote:

I run the examples in delayedAssign:

msg <- "old"
delayedAssign("x", msg)
msg <- "new!"
x

If I run these four commands together, x is new. If I run the first two
commands first and then run the last two commands, x is old.

I just cannot figure out why.


You aren't telling us everything.  What did you do in between running 
the first two and the last two?  Presumably something you did forced the 
evaluation of x.  That is what causes the behaviour you saw.
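
A minimal sketch of the two situations, assuming nothing beyond base R. The variable y and the names first and second are introduced here only for illustration; what actually touched x in the original session is unknown.

```r
# Case 1: nothing touches x before msg changes, so the promise is
# evaluated only at the final line and sees the new value.
msg <- "old"
delayedAssign("x", msg)
msg <- "new!"
first <- x

# Case 2: something evaluates y early, which forces the promise
# while msg is still "old"; the later change has no effect.
msg <- "old"
delayedAssign("y", msg)
y                 # forces the promise here
msg <- "new!"
second <- y

print(c(first = first, second = second))
```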


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plotting Multiple Factors By Dates With Lattice

You could remove the white space also by:
library(stringr)
levels(bdf$func_feed_grp)
#[1] Filterer     Gatherer    Grazer      Omnivore  
#[5]  Parasite    Predator    Shredder  


levels(bdf$func_feed_grp) <- str_trim(levels(bdf$func_feed_grp))
levels(bdf$func_feed_grp)
#[1] "Filterer" "Gatherer" "Grazer"   "Omnivore" "Parasite" "Predator" "Shredder"
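
For readers without stringr, the same cleanup can probably be done in base R. This is a sketch on a made-up factor (grp is a hypothetical stand-in for bdf$func_feed_grp); trimws() needs R 3.2.0 or later, and on older versions gsub("^\\s+|\\s+$", "", x) should behave similarly.

```r
# Made-up factor with padded level names and a stray newline (assumption).
grp <- factor(c("Filterer   ", " Gatherer", "Grazer\n", "Filterer   "))

# Trim leading and trailing whitespace from the level names only;
# the underlying integer codes are left untouched.
levels(grp) <- trimws(levels(grp))

print(levels(grp))
print(as.character(grp))
```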

A.K.



- Original Message -
From: Rich Shepard rshep...@appl-ecosys.com
To: R help r-help@r-project.org
Cc: 
Sent: Friday, August 16, 2013 12:01 PM
Subject: Re: [R] Plotting Multiple Factors By Dates With Lattice

On Fri, 16 Aug 2013, Richard M. Heiberger wrote:

 The major problem is all the padding and the LF in the level names.
 This repair is based on the ?gsub example on ## trim trailing white space.

Rich

  Thanks. I thought of removing white space (didn't notice the spurious
newline) in emacs but did not see that it made a difference. Now I know it
does I'll clean up all the files.

 I switched the key to lines to match the graph.

  OK.

 You need to control the color choice differently when using a key.
 It is discussed in ?xyplot in the paragraph
          Note that 'simpleKey' uses the default settings (see
          'trellis.par.get') to determine the graphical parameters in
          the key, so the resulting legend will be meaningful only if
          the same settings are used in the plot as well.  The
          'par.settings' argument may be useful to temporarily modify
          the default settings for this purpose.

  I'll carefully read it.

 xyplot(pct ~ sampdate, data = bdf, groups = func_feed_grp, type = 'l',
        key = simpleKey(text = levels(bdf$func_feed_grp), space ='right',
                        points=FALSE, lines=TRUE),
        par.settings=list(superpose.symbol=list(col=rainbow(7)),
                          superpose.line=list(col=rainbow(7))))

  Valuable lessons learned here. Again, thanks.

Rich


Re: [R] Plotting GAM fit using RGL

2013-08-16 Thread Duncan Murdoch


On 13-08-15 1:15 PM, David Winsemius wrote:


On Aug 15, 2013, at 2:23 AM, Lucas Holland wrote:


Hello all,

I’ve fitted a bivariate smoothing model (with GAM) to some data, using two 
explanatory variables, x and y.  Now I’d like to add the surface corresponding 
to my fit to a 3D scatterplot generated using plot3d().

My approach so far is to create a grid of x and y values and the corresponding 
predicted values and to try to use surface3d with that grid.

grid - expand.grid(x = seq(-1,1,length=20),
y = seq(-1,1, length=20))

grid$z - predict(fit.nonparametric, newdata=grid)

surface3d(grid$x, grid$y, matrix(grid$z, nrow=length(grid$x), 
ncol=length(grid$y)))


?surface3d
# Should be:

  surface3d( unique(grid$x), unique(grid$y),
             z = matrix(grid$z, nrow=length(unique(grid$x)),
                        ncol=length(unique(grid$y))))


Or you could make x and y into matrices as well.  In this case you'll 
get the same result, but if x or y weren't strictly increasing 
sequences, there'd be a difference.
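
A sketch of the all-matrices form, under the assumption that the grid comes from expand.grid() as in the original post. The z-values here come from a made-up function standing in for predict(fit.nonparametric, newdata = grid), and the rgl call is guarded so the block also runs where rgl is not installed.

```r
xs <- seq(-1, 1, length.out = 20)
ys <- seq(-1, 1, length.out = 20)
grid <- expand.grid(x = xs, y = ys)

# Stand-in for the model predictions (assumption, not the poster's fit).
grid$z <- with(grid, x^2 - y^2)

# expand.grid() varies x fastest, so filling column-wise puts x along the
# rows and y along the columns, which is the layout surface3d() expects.
nx <- length(xs)
ny <- length(ys)
xmat <- matrix(grid$x, nrow = nx, ncol = ny)
ymat <- matrix(grid$y, nrow = nx, ncol = ny)
zmat <- matrix(grid$z, nrow = nx, ncol = ny)

if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::surface3d(xmat, ymat, zmat, col = "lightblue")
}
```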


Duncan Murdoch





This, however, plots a number of surfaces that do not look like the fitted 
surface obtained by vis.gam(fit.nonparametric), which actually looks a lot like 
the „truth“ (I’m using simulated data so I know the true regression surface).

I think I’m using surface3d wrong but I can’t seem to spot my mistake.



Always look at the Arguments section of help pages carefully.




Re: [R] Memory limit on Linux?


On Aug 16, 2013, at 10:19 AM, Stackpole, Chris wrote:

 Greetings,
 
 Just a follow up on this problem. I am not sure where the problem lies, but 
 we think it is the users code and/or CRAN plugin that may be the cause. We 
 have been getting pretty familiar with R recently and we can allocate and 
 load large datasets into 10+GB of memory. One of our other users runs a 
 program at the start of every week and claims he regularly gets 35+GB of 
 memory (indeed, when we tested it on this week's data set it was just over 
 30GB). So it is clear that this problem is not a problem with R, the system, 
 or any artificial limits that we can find.
 
 So why is there a difference between one system and the other in terms of 
 usage on what should be the exact same code? Well first off, I am not 
 convinced it is the same dataset even though that is the claim (I don't have 
 access to verify for various reasons). Second, he is using some libraries 
 from the CRAN repos. We have already found an instance a few months ago where 
 we had a bad compile that was behaving weird. I reran the compile for that 
 library and it straightened out. I am wondering if this is the possibility 
 again. The user is researching the library sets now.
 
 In short, we don't have a solution yet to this explicit problem

You may consider this to be an explicit problem, but it doesn't read like 
something that is explicit to me. If you load an object that takes 10GB and 
then make a modification to it, there will be two or three versions of it in 
memory, at least until the garbage collector runs. Presumably your external 
*NIX methods of assessing memory use will fail to understand this fact of 
R-life.


 but at least I know for certain it isn't the system or R. Now that I can take 
 a solid stance on those facts I have good ground to approach the user and 
 politely say Let's look at how we might be able to improve your code.
 
 Thanks to everyone who helped me debug this issue. I do appreciate it.

-- 

David Winsemius
Alameda, CA, USA


Re: [R] Weighted SUR/NSUR

2013-08-16 Thread Ariel

Arne Henningsen-3 wrote
 Is it possible to run SUR with weights using systemfit? I mean weighted
 seemingly unrelated regression (weighted SUR)
 
 Currently, systemfit cannot estimate (SUR) models with
 observation-specific weights :-(
 
 or weighted nonlinear unrelated regression (weighted NSUR).
 
 We are still not yet finished with implementing nonlinear models in
 systemfit (see http://www.systemfit.org/) :-(

I recently had a student come to me with a very similar (okay, identical)
problem as the OP.  I had to learn PROC MODEL, anyway, so I thought I’d poke
around in R while I was at it.  I have nothing to add about any problems
with or the lack of maturity of the estimation procedure for nlsystemfit(),
but I do have some ideas about observation-level weights.

It took me awhile to make the leap from the fairly straightforward linear
weighted least squares (for example, see  Weisberg's Applied Linear
Regression textbook equation 5.8) to understanding how weighting worked in
nonlinear least squares.  The R help forum certainly came in handy:
https://stat.ethz.ch/pipermail/r-help/2004-November/060424.html.  I can add
weights into a nonlinear regression by simply multiplying both the response
and the nonlinear function by the square root of the desired weights. 
Here’s a toy example, where I compare a model fit using the “weights”
argument in nls() with a model where I put the weights in “by hand” :

DNase1 = subset(DNase, Run == 1)
fit2 = nls(density ~ Asym/(1 + exp((xmid - log(conc))/scal)),
           data = DNase1,
           start = list(Asym = 3, xmid = 0, scal = 1),
           weights = rep(1:8, each = 2))
summary(fit2)

# Take the square root of the weights for fitting "by hand"
sw = sqrt(rep(1:8, each = 2))
fit3 = nls(sw*density ~ sw*(Asym/(1 + exp((xmid - log(conc))/scal))),
           data = DNase1,
           start = list(Asym = 3, xmid = 0, scal = 1))
summary(fit3)

# The predicted values for fit3 need to be divided by the square root of
# the weights (sw); the residuals are weighted residuals
predict(fit2)
predict(fit3)/sw

It seems like this weighted approach could be easily extended to the model
formulas for a system of nonlinear equations (it would be similar for linear
equations) to be fit with systemfit.  
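
The linear case mentioned in passing can be checked the same way. This is a small sketch on simulated data (all names and numbers are made up for illustration): multiplying the response and every column of the design matrix, including the intercept, by the square root of the weights should reproduce the coefficients from lm()'s weights argument.

```r
# Simulated heteroscedastic data (assumption, for illustration only).
set.seed(1)
x <- 1:20
y <- 2 + 0.5 * x + rnorm(20, sd = sqrt(x))
w <- 1 / x            # observation-level weights
sw <- sqrt(w)

# Weighted least squares via the weights argument.
fitW <- lm(y ~ x, weights = w)

# The same fit "by hand": scale the response, the intercept column (sw)
# and the predictor by sqrt(w), and drop the automatic intercept.
fitHand <- lm(I(sw * y) ~ 0 + sw + I(sw * x))

print(rbind(weights_arg = unname(coef(fitW)),
            by_hand     = unname(coef(fitHand))))
```

This only confirms the single-equation identity; whether the cross-equation covariance estimate in a system is affected is exactly the open question raised below.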

 Parresol (2001) in his paper titled Additivity of nonlinear biomass
 equations has run weighted NSUR using PROC MODEL (SAS institute Inc.1993).
 I was wondering if r can do that.

It turned out I had to use this weighting approach in PROC MODEL, as well,
when each equation in the system had a different set of weights.  The
estimates I get when fitting the Parresol example mentioned by the OP using
nlsystemfit and PROC MODEL are within spitting distance of each other, so I
feel like I am at least making the same mistakes in both software packages.  

I'm wondering if my logic is sound or if I'm missing some complication that
occurs when working with systems of equations.  I’ve seen several folks
looking to fit weighted systems of equations in R with systemfit, and this
approach might get them what they need.

Ariel




Re: [R] A question about using delayedAssign

2013-08-16 Thread William Dunlap

Change
   delayedAssign("x", msg)
to
   delayedAssign("x", { cat("Assigning 'msg' to 'x' now\n") ; msg })
and you will see the message when the delayed assignment is triggered.
You could add print(sys.calls()) to that to see the call stack if it isn't
obvious.

> msg <- "old"
> delayedAssign("x", { cat("Assigning 'msg' to 'x' now\n") ; print(sys.calls()) ; msg })
> f <- function(p) paste(x, p)
> f("qwerty")
Assigning 'msg' to 'x' now
[[1]]
f("qwerty")

[[2]]
paste(x, p)

[[3]]
print(sys.calls())

[1] "old qwerty"
> x
[1] "old"


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of Gang Peng
 Sent: Wednesday, August 14, 2013 6:12 PM
 To: r-help@r-project.org
 Subject: [R] A question about using delayedAssign
 
 I run the examples in delayedAssign:
 
 msg <- "old"
 delayedAssign("x", msg)
 msg <- "new!"
 x
 
 If I run these four commands together, x is new. If I run the first two
 commands first and then run the last two commands, x is old.
 
 I just cannot figure out why.
 
 Thanks.
 Gang
 

Re: [R] Multi Correspondence Analysis

Hi,

You can load the dataset using:
library(XLConnect)
wb <- loadWorkbook("excel_data.xlsx")
dat1 <- readWorksheet(wb, sheet="excel data", region="A1:DA101")
# region can be specified to read a subset of the dataset.
# Here, I read the full dataset.

dim(dat1)
#[1] 100 105
str(dat1)
#'data.frame':    100 obs. of  105 variables:
# $ cid : num  17226 26226 32226 47226 48226 ...
# $ q14a_1  : chr  "6" "5" "6" "6" ...
# $ q14a_2  : chr  "6" "7" "6" "5" ...
# $ q14a_3  : chr  "6" "6" "6" "5" ...



There are a lot of missing values.  


Another option would be to save the file as .csv and read it with read.csv().  In that 
case, only the active sheet will be saved.

#For example: after saving the file as excel_data.csv

dat2 <- read.csv("excel_data.csv",header=TRUE,stringsAsFactors=FALSE)

dim(dat2)
#[1] 100 105


Regarding the multi correspondence analysis, the link below may help you.
http://gastonsanchez.wordpress.com/2012/10/13/5-functions-to-do-multiple-correspondence-analysis-in-r/

A.K.





Hi everyone, 

I am new with R and I need some help. 

I have the following data 

excel_data.xlsx

And I would like to upload it in R and run a multi correspondence analysis. 

Any help would be really appreciated. 

Thanks in advance, 

mils


Re: [R] about plantbreeding library

2013-08-16 Thread Marc Girondot


Le 16/08/13 19:48, Waqas Shafqat a écrit :

Sir, I have successfully installed the plantbreeding library following the
procedure on the web,

but the problem is that the plantbreeding library does not work.

I have tried it in both versions, i.e. RGui 3.0.0 and RGui 3.0.1.

Please guide me.


You should read the posting guide: 
http://www.r-project.org/posting-guide.html
Give a reproducible example to show what you are trying to do and why it does 
not work.

Sincerely

Marc Girondot

--
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot


Re: [R] Plotting GAM fit using RGL


On Aug 16, 2013, at 10:55 AM, Duncan Murdoch wrote:

 On 13-08-15 1:15 PM, David Winsemius wrote:
 
 On Aug 15, 2013, at 2:23 AM, Lucas Holland wrote:
 
 Hello all,
 
 I’ve fitted a bivariate smoothing model (with GAM) to some data, using two 
 explanatory variables, x and y.  Now I’d like to add the surface 
 corresponding to my fit to a 3D scatterplot generated using plot3d().
 
 My approach so far is to create a grid of x and y values and the 
 corresponding predicted values and to try to use surface3d with that grid.
 
 grid - expand.grid(x = seq(-1,1,length=20),
y = seq(-1,1, length=20))
 
 grid$z - predict(fit.nonparametric, newdata=grid)
 
 surface3d(grid$x, grid$y, matrix(grid$z, nrow=length(grid$x), 
 ncol=length(grid$y)))
 
 ?surface3d
 # Should be:
 
  surface3d( unique(grid$x), unique(grid$y),
             z = matrix(grid$z, nrow=length(unique(grid$x)),
                        ncol=length(unique(grid$y))))
 
 Or you could make x and y into matrices as well.  In this case you'll get the 
 same result, but if x or y weren't strictly increasing sequences, there'd be 
 a difference.

Thanks for increasing my knowledge on this point. And for providing rgl to the 
world. After looking at the Details section of the help page more carefully 
than I had previously, I wondered: Has anyone ever done a projection of a Klein 
bottle into rgl?

I didn't find one and my initial efforts with surface3d failed. (I managed to 
crash that session with a misguided call to the global replace function.)

I did get success with misc3d's parametric3d with a parametrisation attributed 
to Robert Israel:

require(rgl); require(misc3d)

x = function(u,v){-(2/15)*cos(u)*(3*cos(v)-30*sin(u)+90*cos(u)^4*sin(u)-
    60*cos(u)^6*sin(u)+5*cos(u)*cos(v)*sin(u))}

y = function(u,v){-(1/15)*sin(u)*(3*cos(v)-3*cos(u)^2*cos(v)-
    48*cos(u)^4*cos(v)+48*cos(u)^6*cos(v)-60*sin(u)+
    5*cos(u)*cos(v)*sin(u)-5*cos(u)^3*cos(v)*sin(u)-
    80*cos(u)^5*cos(v)*sin(u)+80*cos(u)^7*cos(v)*sin(u))}

z = function(u,v){ (2/15)*(3+5*cos(u)*sin(u))*sin(v) }

parametric3d(x,y,z, seq(0,pi,length=100), seq(0,2*pi,length=100) )

-- 
David.

 Duncan Murdoch
 
 
 
 This however plots a number of surfaces that do not look like the fitted 
 surface obtained by vis.gam(fit.nonparametric which actually looks a lot 
 like the „truth“ (I’m using simulated data so I know the true regression 
 surface).
 
 I think I’m using surface3d wrong but I can’t seem to spot my mistake.
 
 
 Always look at the Arguments section of help pages carefully.
 
 

David Winsemius
Alameda, CA, USA


Re: [R] Adding additional points to ggplot2

2013-08-16 Thread zelfortin

Hi,

you could also use a factor variable to differentiate your observed and 
estimated values and change shape and/or color based on that factor. 

e.g.

# df is a placeholder for your data frame
ggplot(df, aes(x=X, y=Y, shape=factor(Type))) + geom_point()  # For changing shapes
ggplot(df, aes(x=X, y=Y, color=factor(Type))) + geom_point()  # For changing colors
ggplot(df, aes(x=X, y=Y, color=factor(Type), shape=factor(Type))) +
  geom_point()  # For changing both colors and shapes

Ista also gave a good solution, but if you ever have more than two sets of 
points/lines to plot on the same graph, this is a simpler and faster 
way of doing it. Also, if your data are spread across different columns and you 
do not have a factor, you can use the melt() function in the package 
reshape2.  Your data will be melted into one column, with each value beside its 
variable, i.e. the column name, which can be used as a factor.
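
To show the long layout that melt() is meant to produce, here is a sketch that builds it by hand in base R rather than calling reshape2, so it runs without extra packages. The data frame wide and its columns observed and estimated are made up for illustration, and the ggplot2 call is guarded in case the package is not installed.

```r
# Made-up wide data: one column per series (assumption).
wide <- data.frame(X         = 1:5,
                   observed  = c(2.1, 3.9, 6.2, 8.1, 9.8),
                   estimated = c(2, 4, 6, 8, 10))

# Base-R stand-in for melt(): stack the two value columns into one column
# and record which column each value came from as a factor.
long <- data.frame(
  X    = rep(wide$X, times = 2),
  Y    = c(wide$observed, wide$estimated),
  Type = factor(rep(c("observed", "estimated"), each = nrow(wide)))
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(long, ggplot2::aes(x = X, y = Y,
                                          color = Type, shape = Type)) +
    ggplot2::geom_point()
  if (interactive()) print(p)
}
```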


Cheers.

JM 

Le vendredi 16 août 2013 05:45:21 UTC-4, Chris89 a écrit :

 Hi! 
 I am having a difficulty adding additional points to a plot using 
 ggplot2.. 

 The case is that I want to plot both original and estimated values in the 
 same graph, and general I would use 
 plot and then lines, but I do not know how to do it with ggplot... 

 Thanks! 

 Regards, 
 Chris 




[R] Is it possible to avoid copying arrays when calling list()?

2013-08-16 Thread MRipley

Usually R is pretty good about not copying objects when it doesn't need 
to.  However, the list() function seems to make unnecessary copies.  For 
example:


> system.time(x <- double(10^9))
   user  system elapsed
  1.772   4.280   7.017
> system.time(y <- double(10^9))
   user  system elapsed
  2.564   3.368   5.943
> system.time(z <- list(x,y))
   user  system elapsed
  5.520   6.748  12.304

I have a function where I create two large arrays, manipulate them in 
certain ways, and then return both as a list.  I'm optimizing the 
function, so I'd like to be able to build the return list quickly.  The 
two large arrays drop out of scope immediately after I make the list and 
return it, so copying them is completely unnecessary.


Is there some way to do this?  I'm not familiar with manipulating lists 
through the .Call interface, and haven't been able to find much about 
this in the documentation.  Might it be possible to write a fast (but 
possibly unsafe) list function using .Call that doesn't make copies of 
the arguments?


PS A few things I've tried.  First, this is not due to triggering 
garbage collection -- even if I call gc() before list(x,y), it still 
takes a long time.


Also, I've tried rewriting the function by creating the list at the 
beginning as in:

result <- list(x=double(10^9), y=double(10^9))
and then manipulating result$x and result$y but this made my code run 
slower, as R seemed to be making other unnecessary copies while 
manipulating elements of a list like this.


I've considered (though not implemented) creating an environment rather 
than a list, and returning the environment, but I'd rather find a simple 
way of creating a list without making copies if possible.
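
One way to check whether list() really duplicates its arguments, rather than inferring it from timings, is sketched below. This is only an illustration on much smaller vectors. It assumes an R build with memory profiling enabled (checked through capabilities("profmem")), and whether a copy is reported may depend on the R version.

```r
# Sketch: detect copies made while building a list (small vectors for speed).
x <- double(1e6)
y <- double(1e6)

can_trace <- capabilities("profmem")  # tracemem() needs memory profiling support
if (can_trace) {
  invisible(tracemem(x))
  invisible(tracemem(y))
}

# If list() duplicates x or y, R prints a "tracemem[...]" line at this point.
z <- list(x, y)

if (can_trace) {
  untracemem(x)
  untracemem(y)
}

# Either way, the list elements hold the same values as the originals.
identical(z[[1]], x)
```

If no "tracemem" line appears, the extra time is probably spent somewhere other than in copying the vectors.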



[R] Randomly drop a percent of data from a data.frame

2013-08-16 Thread Christopher Desjardins

Hi,
I have the following data.

> set.seed(6245)
> data <- data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5))
> round(data,digits=3)
      x1     x2     x3     x4
1  0.482  1.320 -0.859 -0.142
2 -0.753 -0.041 -0.063  0.886
3  0.028 -0.256 -0.069  0.354
4 -0.086  0.475  0.244  0.781
5  0.690 -0.181  1.274  1.633
What I would like to do is drop 20% of the data, but I want this 20% to
come only from x3 and x4. The missing values don't have to be spread evenly,
i.e. I don't need to drop 2 from x3 and 2 from x4, nor make sure that an
observation has missing data on only one variable. I just want to drop 20%
of the data through x3 and x4 only.  In other words,

      x1     x2     x3     x4
1  0.482  1.320 -0.859     NA
2 -0.753 -0.041 -0.063  0.886
3  0.028 -0.256     NA  0.354
4 -0.086  0.475     NA  0.781
5  0.690 -0.181     NA  1.633

OR

      x1     x2     x3     x4
1  0.482  1.320     NA -0.142
2 -0.753 -0.041 -0.063  0.886
3  0.028 -0.256     NA     NA
4 -0.086  0.475  0.244     NA
5  0.690 -0.181  1.274  1.633

OR

      x1     x2     x3     x4
1  0.482  1.320 -0.859 -0.142
2 -0.753 -0.041 -0.063     NA
3  0.028 -0.256 -0.069     NA
4 -0.086  0.475  0.244     NA
5  0.690 -0.181  1.274     NA

ETC. are all fine.

Any ideas how I can do this?
Chris


Re: [R] Memory limit on Linux?

2013-08-16 Thread Stackpole, Chris

 From: David Winsemius [mailto:dwinsem...@comcast.net] 
 Sent: Friday, August 16, 2013 12:59 PM
 Subject: Re: [R] Memory limit on Linux?
[snip] 
  In short, we don't have a solution yet to this explicit problem

 You may consider this to be an explicit problem but it doesn't read like
 something that is explicit to me. If you load an object that takes 10GB and
 then make a modification to it, there will be two or three versions of it in
 memory, at least until the garbage collector runs. Presumably your external
 *NIX methods of assessing memory use will fail to understand this fact of
 R-life.

Hrm. Maybe "explicit" was the wrong word. Maybe "specific" would have been a
better choice. Sorry.

What I was trying to imply is that we can't replicate this exact problem
with anything else, or in any other form, except with this user's particular
code/dataset. So the problem is very narrow in scope and related to the user's
code/dataset, and therefore not to R in general. Where this odd behavior is
coming from is still undetermined, but I have at least narrowed the band of
possibilities down significantly.

Thanks!

Chris Stackpole
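
The "two or three versions in memory" point in the quoted reply can be illustrated with a small sketch. The sizes and names below are made up for illustration and are far smaller than the 10GB object under discussion.

```r
# Sketch of R's copy-on-modify behaviour (illustrative sizes only).
x <- numeric(1e6)            # about 8 MB of doubles
y <- x                       # no copy yet: x and y refer to the same data
y[1] <- 1                    # modifying y forces a full copy to be made

print(object.size(x), units = "MB")
print(object.size(y), units = "MB")

# Temporary copies that are no longer referenced are only released
# when the garbage collector runs.
invisible(gc())
```

External tools that watch the process size see the sum of all such copies, which is why they can report more memory than the objects themselves seem to need.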


Re: [R] A question about using delayedAssign

2013-08-16 Thread William Dunlap

Are you using a GUI like RStudio to run R?  If so, it may be looking
at the values of things after each command to update its workspace
window, and the looking will trigger the delayed assignments.

(I cannot reproduce what you show using command line R on Linux.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
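
The effect described above can be reproduced without a GUI. In this minimal sketch, the call to force() stands in for whatever inspection a workspace window might perform; that stand-in is an assumption made for illustration, not a description of how RStudio works internally.

```r
# Sketch: a promise created by delayedAssign() is evaluated at first access.
msg <- "old"
delayedAssign("x", msg)
msg <- "new!"
x_value <- x                 # first access happens after msg was changed

msg <- "old"
delayedAssign("y", msg)
invisible(force(y))          # stands in for a GUI inspecting the workspace
msg <- "new!"
y_value <- y                 # the promise was already forced while msg was "old"

x_value
y_value
```

Here x_value reflects the later value of msg, while y_value keeps the value msg had when y was first looked at.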


[R] Which skewness does pelpe3() from package lmom use?

2013-08-16 Thread frauke

Hi everyone,

I was trying to fit a Log Pearson III distribution through some maxima data.
I got thrown off because my results in Excel (using a frequency factor
table) are different from my results using pelpe3() in the R-package lmom.
The only reason I can think of is the skewness.

The Pearson III distribution has three parameters: Location (mean), scale
(standard deviation) and shape (skewness). The command pelpe3() returns the
same values for the first two parameters as I had computed in Excel.
However, the skewness is much higher.

Excel uses the type II skewness. *Does anybody know what type of skewness
pelpe3() uses and why?*

To be complete, I included some data.
data.txt http://r.789695.n4.nabble.com/file/n4673984/data.txt
This table gives the maximum daily rainfall for each of 22 years. I want
to fit the distribution through those maximum rainfall events. These are the
parameters that I generated for the Pearson III distribution in Excel and R:

                    Excel   R
location (mean)     0.68    0.68
shape (skewness)    1.04    1.6
scale (stand dev)   0.38    0.4

Any help would be greatly appreciated!

Thank you!

Frauke
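
For comparison, Excel's SKEW() can be reproduced in base R as below. The rainfall values are invented for illustration and are not the poster's data. The optional second part assumes that pelpe3() is meant to be given sample L-moments from samlmu(), which would make its skewness an L-moment-based estimate rather than a product-moment one; that reading should be checked against the lmom documentation.

```r
# Excel's SKEW(): adjusted Fisher-Pearson coefficient ("type 2" skewness).
excel_skew <- function(x) {
  n <- length(x)
  s <- sd(x)                                   # sample sd, n - 1 denominator
  n / ((n - 1) * (n - 2)) * sum(((x - mean(x)) / s)^3)
}

# Made-up annual maxima, NOT the data from the post.
rain <- c(0.31, 0.42, 0.45, 0.52, 0.55, 0.61, 0.66, 0.70, 0.85, 1.10, 1.75)

excel_skew(rain)

# Optional comparison with an L-moment fit, run only if lmom is installed.
if (requireNamespace("lmom", quietly = TRUE)) {
  fit <- lmom::pelpe3(lmom::samlmu(rain))
  print(fit)
}
```

Running both on the real data would show directly whether the difference comes from the estimation method rather than from a different definition of the skewness parameter.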


Re: [R] Randomly drop a percent of data from a data.frame

Hi,
Maybe this helps:
#data1 (changed `data` to `data1`)
set.seed(6245)
data1 <- data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5))
data1 <- round(data1,digits=3)

data2 <- data1

data1[,3:4] <- lapply(data1[,3:4], function(x){
  x1 <- match(x, sample(unlist(data1[,3:4]), round(0.8*length(unlist(data1[,3:4])))))
  x[is.na(x1)] <- NA
  x})
data1
#      x1     x2     x3     x4
#1  0.482  1.320     NA -0.142
#2 -0.753 -0.041 -0.063  0.886
#3  0.028 -0.256 -0.069  0.354
#4 -0.086  0.475  0.244  0.781
#5  0.690 -0.181  1.274  1.633


#or
data2[,3:4] <- lapply(data2[,3:4], function(x){
  x1 <- match(x, sample(unlist(data2[,3:4]), round(0.8*length(unlist(data2[,3:4])))))
  x[is.na(x1)] <- NA
  x})
data2
#      x1     x2     x3     x4
#1  0.482  1.320 -0.859 -0.142
#2 -0.753 -0.041     NA     NA
#3  0.028 -0.256 -0.069  0.354
#4 -0.086  0.475  0.244  0.781
#5  0.690 -0.181  1.274  1.633
A.K.




Re: [R] Randomly drop a percent of data from a data.frame



Hi,
Suppose the dataset had an odd number of columns:
set.seed(6458)
data2 <- data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5))
n <- prod(dim(data2))
n
#[1] 15
dummy <- rep(F,n/2)
dummy[sample(1:(n/2),n*.2)] <- T
dummy
#[1]  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE

data2[,c("x2", "x3")][matrix(dummy, nc = 2)] <- NA
#Error in `[<-.data.frame`(`*tmp*`, matrix(dummy, nc = 2), value = NA) :
#  unsupported matrix index in replacement
#In addition: Warning message:
#In matrix(dummy, nc = 2) :
#  data length [7] is not a sub-multiple or multiple of the number of rows [4]

I might do:
n1 <- 2*nrow(data2) ##for 2 columns
dummy <- rep(FALSE,n1)
dummy[sample(1:n1,n1*.2)] <- TRUE
data2[,c("x2","x3")][matrix(dummy,nc=2)] <- NA
data2
#           x1         x2         x3
#1 -0.55899744  0.6622481 -0.3305958
#2  0.12776368         NA         NA
#3 -1.09734838  0.2069539 -0.6997853
#4  0.75919499 -0.5683809  0.4752002
#5 -0.03063141 -0.7549605  2.6038635


A.K.
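
The idea in this thread can be wrapped in a small helper that works for any number of columns. This is a sketch only: drop_cells() is a hypothetical name, and it reads "20%" as 20% of all cells in the data frame (as in the examples in the original question), which is an interpretation rather than something stated in this reply.

```r
# Hypothetical helper: set a fraction of ALL cells to NA, touching only `cols`.
drop_cells <- function(df, cols, frac = 0.2) {
  n_drop  <- round(frac * prod(dim(df)))       # number of cells to blank out
  n_avail <- nrow(df) * length(cols)           # cells we are allowed to touch
  stopifnot(n_drop <= n_avail)
  mask <- rep(FALSE, n_avail)
  mask[sample.int(n_avail, n_drop)] <- TRUE    # exactly n_drop TRUE entries
  block <- as.matrix(df[cols])
  block[matrix(mask, ncol = length(cols))] <- NA
  df[cols] <- as.data.frame(block)
  df
}

set.seed(6245)
data <- round(data.frame(x1 = rnorm(5), x2 = rnorm(5),
                         x3 = rnorm(5), x4 = rnorm(5)), digits = 3)
dropped <- drop_cells(data, c("x3", "x4"))
sum(is.na(dropped))
```

Because the mask is sized from the selected columns rather than from half of the data frame, the same call also works when the data frame has an odd number of columns.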


Re: [R] Randomly drop a percent of data from a data.frame

2013-08-16 Thread Christopher Desjardins

Hi,
Thanks for the help. What I actually ended up doing was writing a couple of
for loops, and I ended up getting something that works.
Thanks.
Chris



Re: [R] A question about using delayedAssign

Hi Bill,

Thanks. According to the output, the assignment was triggered immediately
after 'delayedAssign'. So strange.

> msg <- "old"
> delayedAssign("x", { cat("Assigning 'msg' to 'x' now\n") ; msg })
> msg <- "new!"
> x
Assigning 'msg' to 'x' now
[1] "new!"
> msg <- "old"
> delayedAssign("x", { cat("Assigning 'msg' to 'x' now\n") ; msg })
Assigning 'msg' to 'x' now
> msg <- "new!"
> x
[1] "old"

Best,
Gang


2013/8/16 William Dunlap wdun...@tibco.com

 Change
    delayedAssign("x", msg)
 to
    delayedAssign("x", { cat("Assigning 'msg' to 'x' now\n") ; msg })
 and you will see the message when the delayed assignment is triggered.
 You could add print(sys.calls()) to that to see the call stack if it isn't
 obvious.

  > msg <- "old"
  > delayedAssign("x", { cat("Assigning 'msg' to 'x' now\n") ; print(sys.calls()) ; msg })
  > f <- function(p) paste(x, p)
  > f("qwerty")
  Assigning 'msg' to 'x' now
  [[1]]
  f("qwerty")

  [[2]]
  paste(x, p)

  [[3]]
  print(sys.calls())

  [1] "old qwerty"
  > x
  [1] "old"


 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com


  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf
  Of Gang Peng
  Sent: Wednesday, August 14, 2013 6:12 PM
  To: r-help@r-project.org
  Subject: [R] A question about using delayedAssign
 
  I run the examples in delayedAssign:

  msg <- "old"
  delayedAssign("x", msg)
  msg <- "new!"
  x

  If I run these four commands together, x is "new!". If I run the first two
  commands first and then run the last two commands, x is "old".
 
  I just cannot figure out why.
 
  Thanks.
  Gang
 

Re: [R] A question about using delayedAssign

Hi Duncan,

I did nothing between running the first two and the last two. The following
is the output:

> msg <- "old"
> delayedAssign("x", msg)
> msg <- "new!"
> x
[1] "new!"
> msg <- "old"
> delayedAssign("x", msg)
> msg <- "new!"
> x
[1] "old"

Thanks,
Gang


2013/8/16 Duncan Murdoch murdoch.dun...@gmail.com

 On 13-08-14 9:11 PM, Gang Peng wrote:

 I run the examples in delayedAssign:

 msg <- "old"
 delayedAssign("x", msg)
 msg <- "new!"
 x

 If I run these four commands together, x is "new!". If I run the first two
 commands first and then run the last two commands, x is "old".

 I just cannot figure out why.


 You aren't telling us everything.  What did you do in between running the
 first two and the last two?  Presumably something you did forced the
 evaluation of x.  That is what causes the behaviour you saw.

 Duncan Murdoch




Re: [R] Issue installing Packages

2013-08-16 Thread Alexandre Khelifa

Hi,

Works well now. Don't know if any server was down or if it was something
wrong with my firewall...

Thanks a lot for the quick response though.

Alexandre

On Fri, Aug 16, 2013 at 2:53 AM, Uwe Ligges lig...@statistik.tu-dortmund.de
wrote:

Works for me.

If this happens for several mirrors and more than one package, I believe
it is a local/internal problem of your network setup. Please ask your IT
staff.

Uwe Ligges

On 15.08.2013 20:54, Alexandre Khelifa wrote:

Hi Guys,

Hope you are doing well. I am using R (3.0.1 - 32 bits) extensively for my
work but I have been having an issue for the last few days.

I would like to download (and update) the packages RODBC, forecast and
gdata but I cannot download the binary files from the CRAN mirrors.
I have tried several of them but the file does not download completely and
freezes at 99%.

Thus, I cannot install it on my R console.
I have checked with several co-workers and they all have the same issue.

Please let me know what I can do.
Please also find attached a copy of the issue while downloading the
Windows binary file.

Thanks a lot for your help, and the R support. It is an AMAZING tool.

Regards,

Alexandre Khelifa


Re: [R] Randomly drop a percent of data from a data.frame

2013-08-16 Thread Richard Kwock

Try this:

data <- data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5))
data <- round(data,digits=3)

#get the total counts
n = prod(dim(data))

#set up a dummy array/matrix
dummy <- rep(FALSE, n/2)
dummy[sample(1:(n/2), n*.2)] <- TRUE

# 5x2 dummy matrix with TRUE and FALSE
matrix(dummy, nc = 2)

#subset the TRUE indices in x3 and x4 and replace with NAs
data[,c("x3", "x4")][matrix(dummy, nc = 2)] <- NA

data

#      x1     x2     x3     x4
#1 -1.310  0.659     NA  0.510
#2 -3.003 -0.004     NA     NA
#3  0.584  0.310     NA -0.087
#4  1.644 -2.792 -0.390 -0.382
#5 -1.791  0.840  1.137  0.820

Richard



Re: [R] A question about using delayedAssign

I see. I am using RStudio.

Thanks,
Gang



Re: [R] Is it possible to avoid copying arrays when calling list()?

If you don't want to copy the data, you can use environments. You can first
define x and y in the global environment and then, inside the function, use
get() to retrieve them from the global environment. Note that a plain `<-`
inside the function only changes a local copy; to change x and y in the
global environment you need assign() or `<<-`.

Best,
Gang
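
A minimal sketch of the environment idea follows. It uses a fresh environment created inside the function instead of the global environment, which is a variation on the suggestion above; make_arrays() and the field names are illustrative, and how many copies this actually avoids has not been measured here.

```r
# Sketch: return an environment instead of a list (names are illustrative).
make_arrays <- function(n) {
  res <- new.env()
  res$x <- double(n)
  res$y <- double(n)
  res$x[1] <- 1              # manipulate the vectors through the environment
  res$y[n] <- 2
  res                        # environments are passed by reference
}

out <- make_arrays(1e6)
c(out$x[1], out$y[1e6])
```

The caller then reads out$x and out$y much as it would with a list, although an environment has no element order and is not copied on assignment.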



[R] bySum error in ffbase package ?

2013-08-16 Thread Steve Chen

Hi all,

Since I upgraded to R 3.0.1 and also upgraded the ffbase package, I get the
following error when using the bySum() function in ffbase.

For example:

> library(ffbase)
> bySum(iris$Sepal.Length,iris$Species)

Error in bySum(iris$Sepal.Length, iris$Species) :
  REAL() can only be applied to a 'numeric', not a 'symbol'


Any idea ?

Steve Chen
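
While the cause of the error is unclear from the post, the same per-group sum can be obtained for in-memory data with base R. This is a workaround sketch, not a fix for ffbase, and it does not help with data that must stay in ff objects.

```r
# Base R per-group sum (not a fix for the ffbase error itself).
sums <- tapply(iris$Sepal.Length, iris$Species, sum)
sums
```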


Re: [R] Is it possible to avoid copying arrays when calling list()?