from:"jim holtman"

Re: [R] Bug (?) in read.fwf

2007-11-08 Thread jim holtman

Have you tried as.is = TRUE?
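
A minimal sketch of that suggestion, assuming a small fixed-width file written to a temporary path (the file contents and the widths are made up for illustration). read.fwf() passes extra arguments such as as.is on to read.table():

```r
# Hypothetical fixed-width file: a 4-character text field, then a 2-digit number
path <- tempfile()
writeLines(c("abc 12", "def 34"), path)

# as.is = TRUE asks read.table() (called by read.fwf) to leave strings alone
d <- read.fwf(path, widths = c(4, 2), as.is = TRUE, strip.white = TRUE)
str(d)
```

If the first column still comes back as a factor in your version of R, that would point at how the arguments are forwarded rather than at your call.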

On Nov 8, 2007 6:20 AM,  [EMAIL PROTECTED] wrote:
 Hi,

 I'm trying to use read.fwf

 temp = read.fwf("Raw data.txt", widths = c(11, 21, 10, rep(16, 6)),
                 skip = 2, n = 2, stringsAsFactors = FALSE, strip.white = TRUE)

 but no matter what I do the strings are turned into factors.  I believe
 it's the n=2 parameter that causes the problem as it seems to work
 without this.  Am I missing something?

 Thanks in advance,

 David Jessop


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] pattern matching accross multiple matrices

2007-11-08 Thread jim holtman

You are putting your results back into A which might change things
as you execute.  This might be a faster way:

result <- matrix(NA, dim(A)[1], dim(A)[2])

# now compute the cases
result[(A == 1) & (D == 1) & (P == 1)] <- "Case1"
result[(A == -1) & (D == -1) & (P == -1)] <- "Case2"
...
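
A self-contained sketch of the same idea on tiny matrices. The values in A, D and P are hypothetical, and the case labels are assumed to be character strings:

```r
# Three small matrices of the same shape (made-up data)
A <- matrix(c(1, -1,  1, 0), nrow = 2)
D <- matrix(c(1, -1, -1, 0), nrow = 2)
P <- matrix(c(1, -1,  1, 0), nrow = 2)

# Separate result matrix, so A is never overwritten while it is being tested
result <- matrix(NA_character_, nrow(A), ncol(A))

# Each comparison is done on the whole matrix at once; & combines them cell by cell
result[A == 1 & D == 1 & P == 1] <- "Case1"
result[A == -1 & D == -1 & P == -1] <- "Case2"
result
```

Cells that match none of the patterns stay NA, which makes it easy to spot anything you forgot to cover.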

On Nov 8, 2007 12:27 PM, Martin Tomko [EMAIL PROTECTED] wrote:
 Hi all,

 I have a set of patterns which can occur in a series of (3) matrices. I
 want to identify those and create a fourth one with the identifiers of
 the cases.

 Something like:

 for (i in 1:l) {
   for (j in 1:w) {

     A[A[i,j]==1 & D[i,j]==1 & P[i,j]==1] <- "Case1";
     A[A[i,j]==-1 & D[i,j]==-1 & P[i,j]==-1] <- "Case2";

     etc
   }
 }

 The code seems to run, but it is very slow. Could anyone please suggest
 a better approach? I was thinking that the 3 matrices could be stacked in a
 cube, and each column of the cube searched for a pattern, but I am not sure
 how to do that...

 Thanks
 Martin


Re: [R] Calculate percentages in a table of data

2007-11-08 Thread jim holtman

This will do the calculations and the plot:

> x <- scan(textConnection("255 0 255 0 255 255 255 0 255 0
+ 255 255 255 255 0 255 255 0 255 0
+ 255 255 255 255 255 255 255 0 255 0
+ 255 255 255 255 0 255 255 0 255 0
+ 255 255 0 255 255 255 255 0 255 0
+ 255 255 255 0 255 0 0 255 0 255"), what=0)
Read 60 items
> x <- matrix(x, ncol=10, byrow=TRUE)
> num.zeros <- apply(x, 1, function(z) sum(z == 0))
> num.zeros * 10
[1] 40 30 20 30 30 40
> plot(num.zeros * 10, type='o')
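
Multiplying by 10 gives a percentage only because each row has exactly 10 cells. A sketch that works for any number of columns, using the first three rows of the data above:

```r
x <- matrix(c(255, 0, 255, 0, 255, 255, 255, 0, 255, 0,
              255, 255, 255, 255, 0, 255, 255, 0, 255, 0,
              255, 255, 255, 255, 255, 255, 255, 0, 255, 0),
            ncol = 10, byrow = TRUE)

# x == 0 is a logical matrix; the mean of each row is the fraction of zeros
pct.zeros <- rowMeans(x == 0) * 100
pct.zeros
```

For SVG output you could wrap the plot() call between svg("zeros.svg") and dev.off(), assuming your R build provides the svg() device (the file name is only an example).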



On Nov 8, 2007 1:54 PM, Luca Penasa [EMAIL PROTECTED] wrote:
 Hi everybody,
 I'm a newbie, but I hope someone can help me with this work...
 I'll try to explain what I need to do as best I can, but my English is
 not good...
 I've imported a big table of data; the table looks something like this:

 255 0 255 0 255 255 255 0 255 0
 255 255 255 255 0 255 255 0 255 0
 255 255 255 255 255 255 255 0 255 0
 255 255 255 255 0 255 255 0 255 0
 255 255 0 255 255 255 255 0 255 0
 255 255 255 0 255 0 0 255 0 255

 For every row I need to calculate the number of cells with 255 and the
 number of cells with 0. From these values I would like to obtain the
 percentage of 0s present in the row. After that I want to plot the data in
 a graph showing the variation of this percentage along the rows...

 What I want to obtain is an array of this type:
 40
 30
 20
 30
 30
 40

 Can someone give me a hint on how to obtain this? Maybe somebody can
 suggest the functions I could use...

 What software do you suggest for plotting the data? I was thinking of
 gnuplot so I can plot the graph in SVG format...

 Please help me

 thank you

 Luca Penasa,
 geology student,
 University of Padua, Italy







Re: [R] How to create an array of list?

2007-11-08 Thread jim holtman

I think something like this is what you are after.  This will create 7
pairs of lists with the parameters that I think you want.  I don't
have the data (if you want to send it to me, I may be able to test it),
so you will have to test it yourself.

# create a list for the results
result <- vector('list', 7)
for (n in 1:7){
    # initialize the pair of lists in the result
    result[[n]] <- vector('list', 2)
    for (ii in 1:2){
        sublist <- vector('list', 3)  # for the parameters
        for (jj in 1:3){
            if (cc[n, ii, jj] == 0) sublist[[jj]] <- levels(MyModel[, jj])
            else sublist[[jj]] <- cc[n, ii, jj]
        }
        names(sublist) <- names(MyModel)
        result[[n]][[ii]] <- sublist
    }
}
str(result)  # see what it looks like
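
For anyone who wants to run this without the original data, here is a sketch that rebuilds cc and MyModel from the printouts posted in this thread (the reconstruction and the helper name build_spec are my assumptions) and builds the same nested list with lapply():

```r
# MyModel: 96 rows, three factors, laid out as in the posted listing
MyModel <- data.frame(
  Trust = factor(rep(c("T", "U"), each = 48)),
  Sex   = factor(rep(rep(c("F", "M"), each = 24), times = 2)),
  Freq  = factor(rep(rep(c("Hi", "Lo", "No"), each = 8), times = 4))
)

# cc: 7 contrasts x 2 sides x 3 factors; "0" means "use all levels"
cc <- array("0", dim = c(7, 2, 3))
cc[c(1, 5, 6, 7), 1, 1] <- "U"
cc[c(1, 5, 6, 7), 2, 1] <- "T"
cc[2, , 2] <- c("M", "F")
cc[3:7, 1, 3] <- c("Lo", "No", "Hi", "Lo", "No")
cc[3:7, 2, 3] <- c("Hi", "Hi", "Hi", "Lo", "No")

# Hypothetical helper: one named list of factor settings for contrast n, side ii
build_spec <- function(n, ii) {
  spec <- lapply(1:3, function(jj) {
    if (cc[n, ii, jj] == "0") levels(MyModel[[jj]]) else cc[n, ii, jj]
  })
  names(spec) <- names(MyModel)
  spec
}

result <- lapply(1:7, function(n) lapply(1:2, function(ii) build_spec(n, ii)))
str(result[[1]])
```

result[[n]][[1]] and result[[n]][[2]] are then the two lists you would hand to contrast() for the n-th comparison.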



On Nov 8, 2007 6:31 PM, Gang Chen [EMAIL PROTECTED] wrote:
 Thanks again for the response!

 For example, I want to run the following

   contrast(fit.lme, list(Trust="U", Sex=levels(Model$Sex),
 Freq=levels(Model$Freq)), list(Trust="T", Sex=levels(Model$Sex),
 Freq=levels(Model$Freq)))

 The 2nd and 3rd arguments are two lists that I'm trying to construct
 based on the data frame 'Model'. Of course I could provide the two
 lists explicitly as the above command. However for a general usage, I
 would like to build the two lists from the user's input. That is how
 the issue of creating an array of list came about. In the example I
 provided, it would run 7 separate contrasts like the one shown above,
 each of which contains 2 lists, and each list has 3 named components
 (Trust, Sex, and Freq) each of which is of unequal components
 (depending on the contrast specification). And that is why I wanted
 to have an array of 7 X 2 X 3.

 Hope this is clearer. Any better solutions?

 Thanks,
 Gang





Re: [R] How to create an array of list?

2007-11-08 Thread jim holtman

I am still not sure what you expect as output.  Can you provide an
example of what you think that you need.  What is it that you are
trying to construct?  How do you then plan to use them?  There might
be other ways of going about it if we knew what the intent was -- what
is the structure that you are trying to create?  The code that you
have is probably having problems with the number of elements in the
replacement, so to see what the alternatives are, can you give an
explicit example of what you would like as an outcome and then how you
intend to use it.
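
If a real 7 x 2 x 3 array is what you want, one possible route (a sketch, not necessarily what your code needs) is to give a list a dim attribute, so that each cell can hold a vector of any length. The values assigned below are only examples:

```r
# A list with 42 empty slots, arranged as a 7 x 2 x 3 array
clist <- array(vector("list", 7 * 2 * 3), dim = c(7, 2, 3))

# Use [[ ]] with all three indices to set or get a single cell
clist[[1, 1, 1]] <- "U"
clist[[1, 1, 2]] <- c("F", "M")
clist[[1, 1, 3]] <- c("Hi", "Lo", "No")

# Single brackets pull out a plain list of the three cells for contrast 1, side 1
lengths(clist[1, 1, ])
```

Whether this or a plain nested list is more convenient depends on how the pieces are passed on afterwards.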

On Nov 8, 2007 5:19 PM, Gang Chen [EMAIL PROTECTED] wrote:
 Thanks for the response!

 I want to create those lists so that I could use them in a function
 ('contrast' in contrast package) as arguments.

 Any suggestions?

 Thanks,
 Gang

 On Nov 8, 2007, at 5:12 PM, jim holtman wrote:

  Can you tell us what you want to do, and not how you want to do it.
  Without the data it is hard to see.  Some of your indexing probably
  does not have the correct number of parameters when trying to do the
  replacement.  An explanation of what you expect the output to be would
  be useful in determining what the script might look like.
 
  On Nov 8, 2007 4:51 PM, Gang Chen [EMAIL PROTECTED] wrote:
  I have trouble creating an array of lists. For example, I want to do
  something like this:
 
  clist <- array(data=NA, dim=c(7, 2, 3));
  for (n in 1:7) {
    for (ii in 1:2) {
      for (jj in 1:3) {
        if (cc[n, ii, jj] == 0) { clist[n, ii, ][[jj]] <-
          list(levels(MyModel[,colnames(MyModel)[jj]])); }
        else  { clist[n, ii, ][[jj]] <- cc[n, ii, jj]; }
        names(clist[n, ii, ][[jj]]) <- colnames(MyModel)[jj];
      }
    }
  }
 
  but I get an error:
 
  Error in `*tmp*`[n, ii, ] : incorrect number of dimensions
 
  Is it because each list has different number of components? The two
  variables involved in the loop, character matrix cc and dataframe
  MyModel are shown below:
 
  cc
  , , 1
 
   [,1] [,2]
  [1,] U  T
  [2,] 0  0
  [3,] 0  0
  [4,] 0  0
  [5,] U  T
  [6,] U  T
  [7,] U  T
 
  , , 2
 
   [,1] [,2]
  [1,] 0  0
  [2,] M  F
  [3,] 0  0
  [4,] 0  0
  [5,] 0  0
  [6,] 0  0
  [7,] 0  0
 
  , , 3
 
   [,1] [,2]
  [1,] 0  0
  [2,] 0  0
  [3,] Lo Hi
  [4,] No Hi
  [5,] Hi Hi
  [6,] Lo Lo
  [7,] No No
 
  MyModel
 Trust Sex Freq
  1  T   F   Hi
  2  T   F   Hi
  3  T   F   Hi
  4  T   F   Hi
  5  T   F   Hi
  6  T   F   Hi
  7  T   F   Hi
  8  T   F   Hi
  9  T   F   Lo
  10 T   F   Lo
  11 T   F   Lo
  12 T   F   Lo
  13 T   F   Lo
  14 T   F   Lo
  15 T   F   Lo
  16 T   F   Lo
  17 T   F   No
  18 T   F   No
  19 T   F   No
  20 T   F   No
  21 T   F   No
  22 T   F   No
  23 T   F   No
  24 T   F   No
  25 T   M   Hi
  26 T   M   Hi
  27 T   M   Hi
  28 T   M   Hi
  29 T   M   Hi
  30 T   M   Hi
  31 T   M   Hi
  32 T   M   Hi
  33 T   M   Lo
  34 T   M   Lo
  35 T   M   Lo
  36 T   M   Lo
  37 T   M   Lo
  38 T   M   Lo
  39 T   M   Lo
  40 T   M   Lo
  41 T   M   No
  42 T   M   No
  43 T   M   No
  44 T   M   No
  45 T   M   No
  46 T   M   No
  47 T   M   No
  48 T   M   No
  49 U   F   Hi
  50 U   F   Hi
  51 U   F   Hi
  52 U   F   Hi
  53 U   F   Hi
  54 U   F   Hi
  55 U   F   Hi
  56 U   F   Hi
  57 U   F   Lo
  58 U   F   Lo
  59 U   F   Lo
  60 U   F   Lo
  61 U   F   Lo
  62 U   F   Lo
  63 U   F   Lo
  64 U   F   Lo
  65 U   F   No
  66 U   F   No
  67 U   F   No
  68 U   F   No
  69 U   F   No
  70 U   F   No
  71 U   F   No
  72 U   F   No
  73 U   M   Hi
  74 U   M   Hi
  75 U   M   Hi
  76 U   M   Hi
  77 U   M   Hi
  78 U   M   Hi
  79 U   M   Hi
  80 U   M   Hi
  81 U   M   Lo
  82 U   M   Lo
  83 U   M   Lo
  84 U   M   Lo
  85 U   M   Lo
  86 U   M   Lo
  87 U   M   Lo
  88 U   M   Lo
  89 U   M   No
  90 U   M   No
  91 U   M   No
  92 U   M   No
  93 U   M   No
  94 U   M   No
  95 U   M   No
  96 U   M   No
 
  Thanks,
  Gang
 

Re: [R] Problem with R version 2.6.0

2007-11-09 Thread jim holtman

Have you tried using 'setwd'?  I have no problem with changing
directories and executing scripts.  Can you provide an example of the
script that you are trying to execute?  How does it crash?  Does it
do it only when you 'source' it?  More information is needed.
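
A small sketch of what I mean by using 'setwd' and then running a script. The folder and script names are hypothetical, and a temporary directory stands in for your own folder outside the R installation:

```r
work.dir <- file.path(tempdir(), "my_project")   # hypothetical working folder
dir.create(work.dir, showWarnings = FALSE)
old.dir <- setwd(work.dir)                       # setwd() returns the previous directory

writeLines("answer <- 6 * 7", "myscript.R")      # hypothetical one-line script
source("myscript.R")                             # run it from the new working directory
answer

setwd(old.dir)                                   # go back to where we started
```

If this works from the console but opening the script from the GUI still crashes, that narrows the problem down to the script editor rather than the working directory.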

On Nov 9, 2007 10:21 AM, Dimitri Liakhovitski [EMAIL PROTECTED] wrote:
 I just installed R 2.6.0 (had R 2.5 before).
 Here is my problem. Usually, when I work with R I first go to
 File -> Change dir and browse to a folder that sits OUTSIDE of the
 folder C:\Program Files\R\R-2.6.0 and then create my script there
 (and open and re-open it there). I never had any problems with R 2.4
 or R 2.5.
 However, after I installed R 2.6.0, R crashes every time I try to open
 a script - if I work outside the R folder. Interestingly, no problems
 when I work in the folder C:\Program Files\R\R-2.6.0 (and create my
 new folders and subfolders there).
 Any advice?
 Dimitri


Re: [R] How to more efficently read in a big matrix

2007-11-09 Thread jim holtman

If they are all numeric, then read it in with:

x <- scan('yourfile', what=0)  # assuming blank separators

This will create a single vector of the values.  The values come in in
row order if that is how your data file is laid out, so you could just
add dimensions of

dim(x) <- c(487, 238305)

The rows and columns are then transposed, but if you have enough memory
you can transpose them back, or just leave the data as is and change your
processing to swap the rows/cols.  This should let you read it in
in the fastest manner and then play with it.
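
A toy version of the same steps, assuming a purely numeric file with 2 rows and 3 columns written to a temporary path:

```r
path <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6"), path)    # 2 rows x 3 columns on disk

x <- scan(path, what = 0, quiet = TRUE)  # one long vector, in file (row) order
dim(x) <- c(3, 2)                        # number of columns first, so x is transposed
m <- t(x)                                # optional: back to 2 rows x 3 columns
m
```

matrix(x, ncol = 3, byrow = TRUE) gives the same m in one step; with a very large object it may be worth checking which form copies less.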

On Nov 9, 2007 11:52 PM, affy snp [EMAIL PROTECTED] wrote:
 Hi Jim,

 Thanks a lot! I am currently running it on my laptop but without any
 success. I could upload it to a server with 8Gb of memory,
 and it might be better to go from there.

 Actually, I could have the whole file split into two parts,
 one with the 2nd column to the 95th column, the other with
 the rest of the columns. However, I need all rows for the
 two parts.

 The file is in txt format and around 480Mb, which is very large.
 Yes, it is all numeric values.

 I appreciate it!

 Allen






 On Nov 9, 2007 11:46 PM, jim holtman [EMAIL PROTECTED] wrote:
  If they are all numeric, you can use 'scan' to read them in.  With
  that amount of data, you will need almost 1GB to contain the single
  object.  If you want to do any processing, you will probably need a
  machine with at least 3-4GB of physical memory, preferably a 64-bit
  version of R.  What type of computer are you using?  Do you really
  need all the data in at once, or can you process it in smaller batches
  (e.g., 20,000 rows at a time)?  So a little more detail on what you
  actually want to do with the data would be useful, since it does
  create a very large object.  BTW how large is the file you are reading
  and what is its format?  Have you considered a database with this
  amount of data?
 
 
  On Nov 9, 2007 11:39 PM, affy snp [EMAIL PROTECTED] wrote:
   Dear list,
  
   I need to read in a big table with 487 columns and 238,305 rows (row names
   and column names are supplied). Is there a code to read in the table in
   a fast way? I tried the read.table() but it seems that it takes forever :(
  
   Thanks a lot!
  
   Best,
  Allen
  

Re: [R] How to more efficently read in a big matrix

2007-11-09 Thread jim holtman

Your data is mixed: numeric and characters/factors.  You can use
skip=1 to skip the header line, but it looks like the rest is mixed.
In your example there are only 5 columns; are you just showing the
first 5 columns?  If the pattern that you show continues, then you
would have a scan like:

scan('yourfile', what=list('', 0, '', 0, ''), skip=1)

You can extend the 'what' to the number of columns that you have; e.g.

what=c(rep(c(list(''), list(0)), times=243), list(''))
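
Here is a sketch on the four rows posted in this thread, assuming the file really does alternate signal and call columns after the probe id. The temporary file and the variable names are mine:

```r
path <- tempfile(fileext = ".txt")
writeLines(c(
  "probe_set WM_806_Signal_A WM_806_call WM_1716_Signal_A WM_1716_call",
  "SNP_A-1909444 1.59 B 1.48 B",
  "SNP_A-2237149 2.24 B 1.87 B",
  "SNP_A-2118217 2.04 AB 1.70 AB",
  "SNP_A-1866065 1.80 NoCall 1.39 A"
), path)

n.pairs <- 2   # would be 243 for the full 487-column file
what <- c(list(""), rep(list(0, ""), times = n.pairs))  # id, then (numeric, character) pairs

cols <- scan(path, what = what, skip = 1, quiet = TRUE)          # one vector per column
names(cols) <- scan(path, what = "", nlines = 1, quiet = TRUE)   # header line as names

# Keep only the numeric columns as a matrix, with the probe ids as row names
signal <- do.call(cbind, cols[sapply(cols, is.numeric)])
rownames(signal) <- cols[[1]]
signal
```

The call columns are still available in cols if you need them later; dropping them early keeps the big object purely numeric.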



On Nov 10, 2007 12:29 AM, affy snp [EMAIL PROTECTED] wrote:
 Hi Jim,

 I tried scan() first and got

 > x <- scan(file="243_47mel_withnormal_expression_log2.txt", what=0)
 Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
   scan() expected 'a real', got 'probe_set'

 So I guess it requires the file be numeric. But I do have row names
 and header.

 The real file looks like (I am listing the header and first 4 rows of the 
 file):

 probe_set      WM_806_Signal_A  WM_806_call  WM_1716_Signal_A  WM_1716_call
 SNP_A-1909444  1.59             B            1.48              B
 SNP_A-2237149  2.24             B            1.87              B
 SNP_A-2118217  2.04             AB           1.70              AB
 SNP_A-1866065  1.80             NoCall       1.39              A

 So how can I get rid of the header and row.names to use scan()?

 Thanks!

 Allen




 On Nov 10, 2007 12:18 AM, jim holtman [EMAIL PROTECTED] wrote:
  Here is an example of reading in a file of 3M numbers (11MB of text
  file) on my laptop:

  > system.time(x <- scan('/tempyy', what=0))
  Read 3000000 items
     user  system elapsed
     6.22    0.16    6.53
  > str(x)
   num [1:3000000] 1 2 3 4 5 6 7 8 9 10 ...
  > gc()
            used (Mb) gc trigger (Mb) max used (Mb)
  Ncells  169954  4.6     350000  9.4   350000  9.4
  Vcells 3102277 23.7    7803840 59.6  7200206 55.0
  > object.size(x)
  [1] 24000024

  This took about 7 seconds.  You have about 40X more data, so it should
  be interesting to see how it scales up.  The object size is 24MB, so
  40X more is about 1GB.
 
 

Re: [R] How to more efficently read in a big matrix

2007-11-09 Thread jim holtman

It sounds like the data is not all numeric; you have a 'factor' in
your read statement.  It also sounds like some of your lines are
incomplete in the number of columns, since you are trying to read
in a 'B' as a numeric.  So if you have a character column, then one
way of doing it is:

x <- scan('yourfile', what=c(list(''), rep(list(0), 486)))

This will read the first column in as a character and the other 486 as
numeric.

On Nov 10, 2007 12:19 AM, affy snp [EMAIL PROTECTED] wrote:
Thanks Jim.

I tried:

> A<-read.table(file="243_47mel_withnormal_expression_log2.txt",
+ header=TRUE,row.names=1,colClasses=c('factor', rep('numeric',486)))

by specifying colClasses but it did not work.

The error message I got is:

> A<-read.table(file="243_47mel_withnormal_expression_log2.txt",header=TRUE,row.names=1,colClasses=c('factor',
rep('numeric',486)))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  scan() expected 'a real', got 'B'

Let me try what you suggested.

Thanks!

Allen


Re: [R] how to emerge two tables by taking the ave.

2007-11-11 Thread jim holtman

Here is the way to read the data and convert it.  Your data was a
dataframe with the first column being the id:

> x <- read.table(textConnection("id b1   b2   b3
+ a1 2    4    6
+ a2 1    2   NA
+ a3 4    6   NA"), header=TRUE)
> y <- read.table(textConnection("id b1   b2   b3
+ a1 NA   4    4
+ a2 2    2   NA
+ a3 1    2    2"), header=TRUE)
> # look at what x & y are:
> str(x)
'data.frame':   3 obs. of  4 variables:
 $ id: Factor w/ 3 levels "a1","a2","a3": 1 2 3
 $ b1: int  2 1 4
 $ b2: int  4 2 6
 $ b3: int  6 NA NA
> str(y)
'data.frame':   3 obs. of  4 variables:
 $ id: Factor w/ 3 levels "a1","a2","a3": 1 2 3
 $ b1: int  NA 2 1
 $ b2: int  4 2 2
 $ b3: int  4 NA 2
> # to convert to matrix, get rid of first column
> x <- as.matrix(x[,-1])
> y <- as.matrix(y[,-1])
> z <- mapply(function(a,b)mean(c(a,b), na.rm=TRUE), x, y)
> dim(z) <- dim(x)
> z
     [,1] [,2] [,3]
[1,]  2.0    4    5
[2,]  1.5    2  NaN
[3,]  2.5    4    2
> is.na(z) <- is.nan(z)
> z
     [,1] [,2] [,3]
[1,]  2.0    4    5
[2,]  1.5    2   NA
[3,]  2.5    4    2
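
With tables as large as the ones you describe, calling a function once per cell may be slow. A vectorized sketch of the same averaging, using the two small matrices from this thread typed in directly:

```r
x <- matrix(c(2, 1, 4,   4, 2, 6,   6, NA, NA), nrow = 3)
y <- matrix(c(NA, 2, 1,  4, 2, 2,   4, NA, 2),  nrow = 3)

# One row per cell, one column per table; average across the two, skipping NAs
z <- rowMeans(cbind(c(x), c(y)), na.rm = TRUE)
dim(z) <- dim(x)

# Cells that are NA in both tables come out as NaN; turn those back into NA
z[is.nan(z)] <- NA
z
```

This only works on matrices. On a data frame, mapply() walks over whole columns rather than single cells, so convert with as.matrix() first.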




On Nov 11, 2007 10:47 PM, affy snp [EMAIL PROTECTED] wrote:
 Hi,Jim. I created two txt files as:

 x.txt

 id b1   b2   b3
 a1 2    4    6
 a2 1    2   NA
 a3 4    6   NA

 y.txt
 id b1   b2   b3
 a1 NA   4    4
 a2 2    2   NA
 a3 1    2    2


 I tried it one more time but got different z:

 > x<-read.table(file="x.txt",header=TRUE,row.names=1,na.strings = "NA")
 Warning message:
 In read.table(file = "x.txt", header = TRUE, row.names = 1, na.strings = "NA") :
   incomplete final line found by readTableHeader on 'x.txt'
 > x
    b1 b2 b3
 a1  2  4  6
 a2  1  2 NA
 a3  4  6 NA
 > y<-read.table(file="y.txt",header=TRUE,row.names=1,na.strings = "NA")
 Warning message:
 In read.table(file = "y.txt", header = TRUE, row.names = 1, na.strings = "NA") :
   incomplete final line found by readTableHeader on 'y.txt'
 > y
    b1 b2 b3
 a1 NA  4  4
 a2  2  2 NA
 a3  1  2  2
 > z <- mapply(function(a,b)mean(c(a,b), na.rm=TRUE), x, y)
 > dim(z) <- dim(x)
 Error in dim(z) <- dim(x) :
   dims [product 9] do not match the length of object [3]
 > z <- mapply(function(a,b)mean(c(a,b), na.rm=TRUE), x, y)
 > z
       b1       b2       b3
 2.000000 3.333333 4.000000
 


 Allen

 On Nov 11, 2007 10:41 PM, jim holtman [EMAIL PROTECTED] wrote:
  What did your text files look like?  It would appear that there was
  not a line feed on the last line of the file.  Also what does 'str' of
  x and y show?  It appears that one is a data frame and one is a
  matrix.  That might be causing some of the problems.
 
 
  On Nov 11, 2007 10:30 PM, affy snp [EMAIL PROTECTED] wrote:
   Hi Jim,
  
   Thanks a lot! I am wondering why I ended up getting the result as follows:
  
x-read.table(file=x.txt,header=TRUE,row.names=1,na.strings = NA)
   Warning message:
   In read.table(file = x.txt, header = TRUE, row.names = 1, na.strings = 
   NA) :
incomplete final line found by readTableHeader on 'x.txt'
x
 b1 b2 b3
   a1  2  4  6
   a2  1  2 NA
   a3  4  6 NA
y-as.matrix(read.table(file=y.txt,header=TRUE,row.names=1,na.strings 
= NA))
   Warning message:
   In read.table(file = y.txt, header = TRUE, row.names = 1, na.strings = 
   NA) :
incomplete final line found by readTableHeader on 'y.txt'
y
 b1 b2 b3
   a1 NA  4  4
   a2  2  2 NA
   a3  1  2  2
z <- mapply(function(a,b) mean(c(a,b), na.rm=TRUE), x, y)
z
     b1   b2   b3 <NA> <NA> <NA> <NA> <NA>
   2.33 3.50 3.50 2.75 3.50 4.00 2.75 4.00
   <NA>
   4.00
dim(z) <- dim(x)
z
   [,1] [,2] [,3]
   [1,] 2.33 2.75 2.75
   [2,] 3.50 3.50 4.00
   [3,] 3.50 4.00 4.00
is.na(z) <- is.nan(z)
z
   [,1] [,2] [,3]
   [1,] 2.33 2.75 2.75
   [2,] 3.50 3.50 4.00
   [3,] 3.50 4.00 4.00
   
  
  
   Allen
  
  
   On Nov 11, 2007 5:27 PM, jim holtman [EMAIL PROTECTED] wrote:
Here is one way of doing it:
   
 x
      [,1] [,2] [,3]
 [1,]    2    4    6
 [2,]    1    2   NA
 [3,]    4    6   NA
 y
      [,1] [,2] [,3]
 [1,]   NA    4    4
 [2,]    2    2   NA
 [3,]    1    2    2
 z <- mapply(function(a,b) mean(c(a,b), na.rm=TRUE), x, y)
 dim(z) <- dim(x)
 z
      [,1] [,2] [,3]
 [1,]  2.0    4    5
 [2,]  1.5    2  NaN
 [3,]  2.5    4    2
 # to change it to NA
 is.na(z) <- is.nan(z)
 z
      [,1] [,2] [,3]
 [1,]  2.0    4    5
 [2,]  1.5    2   NA
 [3,]  2.5    4    2
   


   
   
On Nov 11, 2007 4:52 PM, affy snp [EMAIL PROTECTED] wrote:
 Dear list,

 I am new to R and very inexperienced. Sorry for the trouble.
 I have two txt files and want to merge them by taking the average.
 More specifically, for example, the txt file1, with row names and 
 column names,
 consists of 238000 rows and 196 columns. Each column corresponds
 to a sample. The data is mixed with numeric or NA. So what I plan to
 do is:

 (1) Take the 1st column from txt file 1

Re: [R] help in long loops

2007-11-12 Thread jim holtman

What happens if you have multiple matches in the comparison between
the content_feat and ob_feat?  Why don't you just iterate through the
content_feat and use 'match' to find the corresponding match in
ob_feat?  This should speed it up.  Also why are you using 'as.matrix'
when the values in the 'if' statement are objects of size 1?  Are any
of the objects dataframes?  If so, convert them to matrices for
efficiency.
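
A minimal sketch of the 'match' idea, assuming small stand-in objects (the columns key, v and w are hypothetical; in the original post the key is in column 2 and the payload spans many columns). match() gives, for every row of content_feat, the position of the first row of ob_feat with the same key, so the inner loop over 11402 rows is not needed:

```r
# hypothetical stand-ins for the poster's objects
content_feat <- data.frame(id = 1:3, key = c("k2", "k9", "k1"), v = c(10, 20, 30))
ob_feat      <- data.frame(id = 1:4, key = c("k1", "k2", "k3", "k4"), w = c(1, 2, 3, 4))

# position in ob_feat of the first matching key, NA where there is none
j   <- match(content_feat$key, ob_feat$key)
hit <- !is.na(j)

# result pre-filled with -1, as in the original function
complete_feat <- matrix(-1, nrow = nrow(content_feat), ncol = 2)
complete_feat[hit, 1] <- content_feat$v[hit]   # values from the matched content rows
complete_feat[hit, 2] <- ob_feat$w[j[hit]]     # values from the matching ob_feat rows
complete_feat
```

Like the 'break' in the original loop, match() keeps only the first match, so duplicate keys in ob_feat would still need a decision about which row wins.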

On Nov 12, 2007 12:09 PM, Mahmudul Haque [EMAIL PROTECTED] wrote:
 hi,

 Please help me out in the following case. It seems to be stuck
 somewhere (7 hours have already passed). What I want is to combine 4 matrices into one
 matrix of the desired length.

 final_matrix <- function(ob_feat, content_feat, link_feat, link_feat_transformed) {
   complete_feat <- matrix(rep(-1), nrow=11402, ncol=278)

   for (i in 1:8944) {
     q <- c(0)
     for (j in 1:11402) {
       if (as.matrix(content_feat[i,2]) == as.matrix(ob_feat[j,2])) {
         complete_feat[i,1] = as.matrix(ob_feat[j,2])
         complete_feat[i,2:97] = as.matrix(content_feat[i,3:98])
         complete_feat[i,98:99] = as.matrix(ob_feat[j,3:4])
         complete_feat[i,100:140] = as.matrix(link_feat[j,3:43])
         complete_feat[i,141:278] = as.matrix(link_feat_transformed[j,3:140])
         q <- 1
       }
       if (q == 1)
         break
     }
   }
   for (i in 8945:11402) {
     complete_feat[i,1] = as.matrix(ob_feat[i,2])
     complete_feat[i,98:99] = as.matrix(ob_feat[i,3:4])
     complete_feat[i,100:140] = as.matrix(link_feat[i,3:43])
     complete_feat[i,141:278] = as.matrix(link_feat_transformed[i,3:140])
   }

   list(complete_feat=complete_feat)
 }

 kind regards,

 mahmudul haque


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] How to pool a group of samples and take the ave.

2007-11-12 Thread jim holtman

Try something like this:

myAvg <- rowMeans(A[,48:243])
B <- A[1:47,] / myAvg

On Nov 12, 2007 1:37 PM, affy snp [EMAIL PROTECTED] wrote:
 Dear list,

 Hi! I have a table A, 238304 rows and 243 columns (representing
 samples). First of all, I would like to pool a group of samples
 from 48th column to 243rd column and take the average across
 them and make a single column,saying as the reference column.

 Second, I want to use each column of first 47 columns in table
 A divided by the reference column and end up with a new table
 B with 238304 rows and 47 columns.

 Is there any simple code which especially could do something like
   reference_column <- (A[,48]+A[,49]+...A[,243])/196
 and B <- A[,1:47]/reference_column?

 Thank you very much for your help!

 Best,
  Allen


Re: [R] How to pool a group of samples and take the ave.

2007-11-12 Thread jim holtman

Was supposed to be:

B <- A[,1:47] / myAvg
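
A small worked example of the corrected statement, assuming a 4 x 5 stand-in for A in which columns 3:5 play the role of columns 48:243 (the name ref is made up). Dividing a matrix by a vector whose length equals the number of rows recycles the vector down each column, so every entry in row i is divided by ref[i]:

```r
# stand-in for A, filled column by column
A <- matrix(c( 2,  4,  6,  8,
              10, 20, 30, 40,
               1,  2,  3,  4,
               1,  2,  3,  4,
               4,  8, 12, 16), nrow = 4)

ref <- rowMeans(A[, 3:5])   # one pooled reference value per row: 2 4 6 8
B   <- A[, 1:2] / ref       # row i of the first columns divided by ref[i]
B
```

If the pooled columns contain NA, rowMeans(..., na.rm = TRUE) averages over the values that are present; whether that is the right treatment depends on the data.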


Re: [R] time plotting problem

2007-11-12 Thread jim holtman

Now your first data point is 9/26/09; is it supposed to be 9/26/06?
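
A quick way to see the effect of that first entry, sketched with a shortened copy of the posted dates (the variable names are made up). With the format "%m/%d/%y", a two-digit year of 09 is read as 2009, so a single such value stretches the date axis over about three years:

```r
days <- c("9/26/09", "9/27/06", "9/28/06", "10/3/06")
d <- as.Date(days, format = "%m/%d/%y")

format(d, "%Y")    # year of each parsed date
range(d)           # earliest and latest date
diff(range(d))     # span of the axis in days
```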

On Nov 12, 2007 1:47 PM, John Kane [EMAIL PROTECTED] wrote:
 I am completely misunderstanding how to handle dates.
 I want to plot a couple of data series against some
 dates.  Simple example 1 below works fine.
 Unfortunately I have multiple observations per day (no
 time breakdowns) and observations across years.
 (example 2 very simplistic version )

 Can anyone suggest a quick fix or point me to
 something to read?  I thought that zoo might do it but
 I seem to be missing something there too.

 Any suggestions gratefully received.


 Example 1 consecutive dates same year.
 =
 x <- "days
 9/26/09
 9/27/06
 9/28/06
 9/29/06
 9/29/06
 9/29/06
 10/1/06
 10/1/06
 10/2/06
 10/3/06"

 mydata <- read.table(textConnection(x), header=TRUE,
 as.is=TRUE); mydata

 mydates <- as.Date(mydata[,1], "%m/%d/%y"); mydates
 mynums <- rnorm(10)
 plot(mydates, mynums)
 
 Example 2 (things go blooy!)
 non-consecutive dates different years.

 =
 x <- "days
 9/26/09
 9/27/06
 9/28/06
 9/29/06
 9/29/06
 9/29/06
 10/1/07  # <- year changes
 10/1/07
 10/2/07
 10/3/07"

 mydata <- read.table(textConnection(x), header=TRUE,
 as.is=TRUE); mydata

 mydates <- as.Date(mydata[,1], "%m/%d/%y"); mydates
 mynums <- rnorm(10)
 plot(mydates, mynums)


Re: [R] update matrix with subset of it where only row names match

2007-11-12 Thread jim holtman

Here is one way of doing it that uses the row and column names:

 # create test data
 mat1 <- matrix(0, nrow=10, ncol=3)
 dimnames(mat1) <- list(paste('row', 1:10, sep=''), LETTERS[1:3])
 mat2 <- matrix(1:3, ncol=1, dimnames=list(c('row3', 'row7', 'row5'), "B"))
 mat2
      B
 row3 1
 row7 2
 row5 3
 # create indexing matrix
 indx <- cbind(match(rownames(mat2), rownames(mat1)), match(colnames(mat2),
  colnames(mat1)))
 indx
      [,1] [,2]
 [1,]    3    2
 [2,]    7    2
 [3,]    5    2
 mat1[indx] <- mat2
 mat1
  A B C
row1  0 0 0
row2  0 0 0
row3  0 1 0
row4  0 0 0
row5  0 3 0
row6  0 0 0
row7  0 2 0
row8  0 0 0
row9  0 0 0
row10 0 0 0


On Nov 12, 2007 4:54 PM, Martin Waller [EMAIL PROTECTED] wrote:
 I guess this has a simple solution:

 I have matrix 'mat1' which has row and column names, e.g.:

        A   B   C
 row1   0   0   0
 row2   0   0   0
 
 rown   0   0   0

 I have a another matrix 'mat2', essentially a subset of 'mat1' where the
 rownames are all in 'mat1' e.g.:

        B
 row3   5
 row8   6
 row54  7


 I want to insert the values of matrix mat2 for column B (in reality it
 could be some or all of column names A, B or C, etc.) (same name in both
 matrices if that matters - rownames of mat2 guaranteed to be in mat1)
 into matrix mat1 where the rownames match, so final desired result is:

 matrix mat1:
        A   B   C
 row1   0   0   0
 row2   0   0   0
 row3   0   5   0
 ...
 row8   0   6   0
 ...
 row54  0   7   0
 ..
 rown   0   0   0

 My solution was (along the lines of):

 mat1[rownames(mat2)%in%rownames(mat1),"B"]=mat2[,"B"]

 Is there a better way? It doesn't 'feel' right?

 Thanks - hope I explained it right (its late and I had a little drink
 about an hour ago,etc).


 Martin


Re: [R] Cleaning database: grep()? apply()?

2007-11-13 Thread jim holtman

Here is how to whittle it down for the first two parts of your
question.  I am not sure exactly what you are after in the third part.  Is
it that you want specific DATEs or do you want the ratio of the
DATE[max]/DATE[min]?

 x <- read.table(textConnection("CODE    NAME    DATE    DATA1
+ 4813    'ADVANCED TELECOM'    1987    0.013
+ 3845    'ADVANCED THERAPEUTIC SYS LTD'    1987    10.1
+ 3845    'ADVANCED THERAPEUTIC SYS LTD'    1989    2.463
+ 3845    'ADVANCED THERAPEUTIC SYS LTD'    1988    1.563
+ 2836    'ADVANCED TISSUE SCI  -CL A'    1987    0.847
+ 2836    'ADVANCED TISSUE SCI  -CL A'    1989    0.872
+ 2836    'ADVANCED TISSUE SCI  -CL A'    1988    0.529"), header=TRUE)
 # matches on things to delete
 delete_indx <- grep("-CL A$|-OLD$|-ADS$", x$NAME)
 # delete them
 x <- x[-delete_indx,]
 x
  CODE NAME DATE  DATA1
1 4813 ADVANCED TELECOM 1987  0.013
2 3845 ADVANCED THERAPEUTIC SYS LTD 1987 10.100
3 3845 ADVANCED THERAPEUTIC SYS LTD 1989  2.463
4 3845 ADVANCED THERAPEUTIC SYS LTD 1988  1.563
 # I assume you want to use NAME to check for ranges of data
 date_range <- tapply(x$DATE, x$NAME, function(dates) diff(range(dates)))
 date_range
            ADVANCED TELECOM ADVANCED THERAPEUTIC SYS LTD
                           0                            2
  ADVANCED TISSUE SCI  -CL A
                          NA
 # delete ones with less than 3 years
 names_to_delete <- names(date_range[date_range < 2])
 # delete those entries
 x <- x[!(x$NAME %in% names_to_delete),]
 x
  CODE NAME DATE  DATA1
2 3845 ADVANCED THERAPEUTIC SYS LTD 1987 10.100
3 3845 ADVANCED THERAPEUTIC SYS LTD 1989  2.463
4 3845 ADVANCED THERAPEUTIC SYS LTD 1988  1.563
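
For the third part, one possible reading is a per-company ratio DATA1(1989) / DATA1(1987). The sketch below follows that assumption only; ratio_one, ratios and RATIO1 are made-up names, and it assumes every remaining company has exactly one row for each of the two years:

```r
# the rows left after the two deletion steps above
x <- data.frame(
  CODE  = c(3845, 3845, 3845),
  NAME  = rep("ADVANCED THERAPEUTIC SYS LTD", 3),
  DATE  = c(1987, 1989, 1988),
  DATA1 = c(10.1, 2.463, 1.563),
  stringsAsFactors = FALSE
)

# one output row per company: DATA1 in 1989 divided by DATA1 in 1987
ratio_one <- function(d) {
  data.frame(CODE = d$CODE[1], NAME = d$NAME[1],
             RATIO1 = d$DATA1[d$DATE == 1989] / d$DATA1[d$DATE == 1987],
             stringsAsFactors = FALSE)
}
ratios <- do.call(rbind, lapply(split(x, x$NAME), ratio_one))
rownames(ratios) <- NULL
ratios
```

Ratios for other data variables could be added as further columns inside ratio_one in the same way.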




On Nov 13, 2007 2:34 PM, Jonas Malmros [EMAIL PROTECTED] wrote:
 Dear R users,

 I have a huge database and I need to adjust it somewhat.

 Here is a very little cut out from database:

 CODE    NAME                            DATE    DATA1
 4813    ADVANCED TELECOM                1987    0.013
 3845    ADVANCED THERAPEUTIC SYS LTD    1987    10.1
 3845    ADVANCED THERAPEUTIC SYS LTD    1989    2.463
 3845    ADVANCED THERAPEUTIC SYS LTD    1988    1.563
 2836    ADVANCED TISSUE SCI  -CL A      1987    0.847
 2836    ADVANCED TISSUE SCI  -CL A      1989    0.872
 2836    ADVANCED TISSUE SCI  -CL A      1988    0.529

 What I need is:
 1) Delete all cases containing -CL A (and also -OLD, -ADS, etc) at the end
 2) Delete all cases that have less than 3 years of data
 3) For each remaining case compute ratio DATA1(1989) / DATA1(1987)
 [and then ratios involving other data variables] and output this into
 new database consisting of CODE, NAME, RATIOs.

 Maybe someone can suggest an effective way to do these things? I
 imagine the first one would involve grep(), and 2 and 3 would involve
 apply family of functions, but I cannot get my mind around the actual
 code to perform this adjustments. I am new to R, I do write code but
 usually it consists of for-functions and plotting. I would much
 appreciate your help.
 Thank you in advance!
 --
 Jonas Malmros
 Stockholm University
 Stockholm, Sweden


Re: [R] update matrix with subset of it where only row names match

2007-11-13 Thread jim holtman

Let's take a look at your solution:

 mat1 <- matrix(0, nrow=10, ncol=3)
 dimnames(mat1) <- list(paste('row', 1:10, sep=''), LETTERS[1:3])
 mat2 <- matrix(1:3, ncol=1, dimnames=list(c('row3', 'row7', 'row5'), "B"))
 mat2
      B
 row3 1
 row7 2
 row5 3
 mat1[rownames(mat2)%in%rownames(mat1),"B"]=mat2[,"B"]
Error in mat1[rownames(mat2) %in% rownames(mat1), "B"] = mat2[, "B"] :
  number of items to replace is not a multiple of replacement length

 rownames(mat2)%in%rownames(mat1)
[1] TRUE TRUE TRUE
 mat2[,"B"]
row3 row7 row5
   1    2    3

I got an error using your statement with %in%.  This is
because it produces a vector of 3 TRUE values, as you can see above.
That logical vector is recycled over all 10 rows of the matrix, so 10
rows are selected for only 3 replacement values and you get the error
message.  What you want to provide is the index value of the rows to
replace.  What you would need in this case is the following statement:

 mat1[match(rownames(mat2), rownames(mat1)),"B"]=mat2[,"B"]

Now your solution would have to be changed every time you wanted a
different column replaced.  My solution determined which of the column
names matched in the objects.
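
The difference between the two index types can be sketched as follows, using the same test data (present, pos and mat1b are names made up here). %in% answers "is each name present?", match() answers "where is each name?", and only the second is a usable row index for the assignment:

```r
mat1 <- matrix(0, nrow = 10, ncol = 3,
               dimnames = list(paste("row", 1:10, sep = ""), LETTERS[1:3]))
mat2 <- matrix(1:3, ncol = 1,
               dimnames = list(c("row3", "row7", "row5"), "B"))

present <- rownames(mat2) %in% rownames(mat1)      # TRUE TRUE TRUE
pos     <- match(rownames(mat2), rownames(mat1))   # 3 7 5

mat1[pos, "B"] <- mat2[, "B"]

# because both matrices carry dimnames, indexing by name is another option
mat1b <- matrix(0, nrow = 10, ncol = 3, dimnames = dimnames(mat1))
mat1b[rownames(mat2), colnames(mat2)] <- mat2
```

The name-based form also covers the case where mat2 has several of the columns A, B, C, as long as every row and column name of mat2 exists in mat1.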

In R, there are a number of ways of doing things.  As to which is
'better', it all depends.  In most cases it is probably a matter of
'style' or what a person is used to.  'Better' does come into play
when you are talking about performance and there might be a factor of
10X, 100X or 1000X depending on how you used some statements.  I
happen to like to try to break things down into some simple steps so
if I have to go back later, I think I might be able to understand it
again.

If you are coming from a C/Java background, then one of the hard things to
get your mind around is to think in terms of 'vectorized' operations
and also the difference in some of the ways that you create/manipulate
data structures in R vs. some other languages.

HTH

On Nov 13, 2007 4:44 PM, Martin Waller [EMAIL PROTECTED] wrote:


 OK - I see that, and thanks for your response, but (and excuse my
 ignorance, less than 2 months in R...) can you help me to see why this
 is 'better' (whatever that means, if anything)?  From a newbie (at least
 my) POV, it seems less clear than my original solution. Again, please
 bear in mind I am relatively new so please be patient if I'm not seeing
 something that's obvious to yourself. I have a genuine desire to learn.


 Martin


Re: [R] problems with splinefun()

2007-11-14 Thread jim holtman

Exactly what values do you want right away?

You can do:

result <- spline(.)

and then reference 'result$x' and 'result$y'.  Can you be more
specific on your request and provide an example of what you are
currently doing (with data) and what you expect the results to be.
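
As a guess at what is happening, sketched on made-up data (xx, yy, f and s are illustrative names): splinefun() returns a function, so typing the result at the prompt prints its source, and it has to be called with the points of interest. spline() returns a list with components x and y:

```r
xx <- 1:10
yy <- xx^2

f <- splinefun(xx, yy)       # f is a function, not a set of values
f(c(2.5, 7.5))               # call it to get interpolated values

s <- spline(xx, yy, n = 5)   # a list with components x and y
s$x
s$y
```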

On Nov 14, 2007 5:31 AM, david csongor [EMAIL PROTECTED] wrote:
 I am working with the function: splinefun() ...
 When plugging in the variables, I get the source of the function printed, as
 though I had entered only 'splinefun'. The only way to get the values
 is by
 spline(xxx,yyy, n=length(xxx)/10, ties = mean)$x  and  spline(xxx,yyy,
 n=length(xxx)/10, ties = mean)$y.
 I'm just wondering if there is something wrong with the package or if
 I'm doing something wrong... Is there a way to get the values right
 away? Has this happened to anyone?

 Thanks in advance for the ever great help one gets asking stuff this way!!!

 /David


Re: [R] How to get row numbers of a subset of rows

2007-11-14 Thread jim holtman

Here is a way of doing it using 'rle':

 x <- read.table(textConnection("   SNP            Chromosome  PhysicalPosition
+ 1 SNP_A-1909444  1   7924293
+ 2 SNP_A-2237149  1   8173763
+ 3 SNP_A-4303947  1   8191853
+ 4 SNP_A-2236359  1   8323433
+ 5 SNP_A-2205441  1   8393263
+ 6 SNP_A-1909445  1   7924293
+ 7 SNP_A-2237146  2   8173763
+ 8 SNP_A-4303946  2   8191853
+ 9 SNP_A-2236357  2   8323433
+ 10 SNP_A-2205442 2   8393263"), header=TRUE)
 # use rle to get the 'runs'
 y <- rle(x$Chromosome)
 # create dataframe with start/ends and values
 start <- head(cumsum(c(1, y$lengths)), -1)
 index <- data.frame(values=y$values, start=start, end=start + y$lengths - 1)

 index
   values start end
 1      1     1   6
 2      2     7  10



On Nov 14, 2007 10:56 AM, affy snp [EMAIL PROTECTED] wrote:
 Hello list,

 I read in a txt file using

 B <- read.table(file="data.snp", header=TRUE, row.names=NULL)

 by specifying the row.names=NULL so that the rows are numbered.
 Below is an example after how the table looks like using
 B[1:10,1:3]


   SNP            Chromosome  PhysicalPosition
 1 SNP_A-1909444  1   7924293
 2 SNP_A-2237149  1   8173763
 3 SNP_A-4303947  1   8191853
 4 SNP_A-2236359  1   8323433
 5 SNP_A-2205441  1   8393263
 6 SNP_A-1909445  1   7924293
 7 SNP_A-2237146  2   8173763
 8 SNP_A-4303946  2   8191853
 9 SNP_A-2236357  2   8323433
 10 SNP_A-2205442 2   8393263

 I am wondering if there is a way to return the start and end row numbers
 for a subset of rows.

 For example, If I specify B[,2]=1, I would like to get
 start=1 and end=6

 if B[,2]=2, then start=7 and end=10

 Is there any way in R to quickly do this?

 Thanks a bunch!

 Allen


Re: [R] How to get row numbers of a subset of rows

2007-11-14 Thread jim holtman

That works for the specific value of '1', but you would have to repeat
it for other values in the column.  If you had 100 different ranges in
that column, what would you do?  Here is another solution using
'range' on the same data:

 tapply(seq_len(nrow(x)), x$Chromosome, range)
$`1`
[1] 1 6

$`2`
[1]  7 10
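
The two approaches can be compared side by side on the Chromosome column alone (chrom, runs and rng are names made up for this sketch). rle() reports every consecutive run, while tapply() with range reports the first and last row of each value whether or not the rows are contiguous; they agree when each value forms a single block:

```r
chrom <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2)

# consecutive runs via rle()
r     <- rle(chrom)
end   <- cumsum(r$lengths)
start <- end - r$lengths + 1
runs  <- data.frame(values = r$values, start = start, end = end)

# first and last row per value via tapply(), bound into one matrix
rng <- do.call(rbind, tapply(seq_along(chrom), chrom, range))
colnames(rng) <- c("start", "end")

runs
rng
```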


On Nov 14, 2007 12:04 PM, Bert Gunter [EMAIL PROTECTED] wrote:
 Am I missing something? ...

 Why not: range(seq(nrow(B))[B[,2]==1] ) ?? ## note: == not =

 Alternatively, and easily generalized (to start with a frame which is a
 subset of the original and any subset of rows, contiguous or not)

 range(as.numeric(row.names(B)[B[,2]==1]))

 Again, am I missing something that makes this obvious solution impossible?
 (Wouldn't be the first time.)

 Bert Gunter
 Genentech Nonclinical Statistics




Re: [R] enumeration variable by groups

2007-11-14 Thread jim holtman

Here is a way to do it:

 x <- scan(textConnection("1 48  1 45  2 50  2 42  1 41  2 51  1 52  1 43  2 52"),
  what=0L)
Read 18 items
 x <- matrix(x, ncol=2, byrow=TRUE)
 colnames(x) <- c('gender', 'score')
 x
      gender score
 [1,]      1    48
 [2,]      1    45
 [3,]      2    50
 [4,]      2    42
 [5,]      1    41
 [6,]      2    51
 [7,]      1    52
 [8,]      1    43
 [9,]      2    52
 # split out categories
 y <- split(seq_len(nrow(x)), x[,1])
 # combine into new matrix
 x.new <- do.call('rbind', lapply(y, function(.row) cbind(x[.row,],
  index=seq_along(.row))))
 x.new
      gender score index
 [1,]      1    48     1
 [2,]      1    45     2
 [3,]      1    41     3
 [4,]      1    52     4
 [5,]      1    43     5
 [6,]      2    50     1
 [7,]      2    42     2
 [8,]      2    51     3
 [9,]      2    52     4
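
An alternative sketch of the same result using ave(), which is a different technique from the split/lapply code above (the data are retyped from the post). ave() applies seq_along within each gender and returns the counter in the original row order; sorting by gender afterwards gives the requested layout:

```r
x <- cbind(gender = c(1, 1, 2, 2, 1, 2, 1, 1, 2),
           score  = c(48, 45, 50, 42, 41, 51, 52, 43, 52))

# running counter within each gender, in the original row order
index <- ave(x[, "score"], x[, "gender"], FUN = seq_along)
x.new <- cbind(x, index = index)

# sort by gender; order() keeps tied rows in their original sequence
x.new <- x.new[order(x.new[, "gender"]), ]
x.new
```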





On Nov 14, 2007 12:58 PM, lamack lamack [EMAIL PROTECTED] wrote:





 Dear all, How can I create an enumeration variable by groups?

 I have:
 gender score
 1      48
 1      45
 2      50
 2      42
 1      41
 2      51
 1      52
 1      43
 2      52

 and I would like to get:

 gender score index
 1      48    1
 1      45    2
 1      41    3
 1      52    4
 1      43    5
 2      50    1
 2      42    2
 2      51    3
 2      52    4
 best regards

Re: [R] Romoving elements from a vector. Looking for the opposite of c(), New user

2007-11-15 Thread jim holtman

You can also check out the 'set' operations: setdiff, intersect, union.
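
A short sketch of how setdiff() compares with the %in% approach quoted below, on made-up vectors (kept and uniq are illustrative names). Both drop the elements found in z, but the set operations also discard duplicates, which matters if repeated observations must be kept:

```r
x <- c(1, 1, 2, 3, 4)
y <- c(5, 6)
z <- c(2, 4)

x    <- c(x, y)         # append y
kept <- x[!x %in% z]    # drops everything found in z, keeps repeated values
uniq <- setdiff(x, z)   # same removal, but the result has no duplicates

kept
uniq
```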

On Nov 15, 2007 12:08 PM, John Kane [EMAIL PROTECTED] wrote:
 I think you've read Thomas's request in reverse, and
 what he wants is:
 x[!x %in% z]

 Thanks for the %in% approach BTW.

 --- Charilaos Skiadas [EMAIL PROTECTED] wrote:

 
  On Nov 15, 2007, at 9:15 AM, Thomas Fröjd wrote:

   Hi

   I have three vectors say x, y, z. One of them, x contains observations
   on a variable. To x I want to append all observations from y and
   remove all from z. For appending c() is easily used

   x <- c(x,y)

   But how do I remove all observations in z from x? You can say I am
   looking for the opposite of c().

  If you are looking for the opposite of c, provided you want to remove
  the first part of things, then perhaps this would work:

  z <- c(x,y)
  z[-(1:length(x))]

  However, if you wanted to remove all appearances of elements of x
  from c(x,y), regardless of whether those elements appear in the x
  part or in the y part, I think you would want:

  z[!z %in% x]
 
  Probably there are other ways.
 
  Welcome to R!
 
   Best regards
 
  Haris Skiadas
  Department of Mathematics and Computer Science
  Hanover College
 


Re: [R] Problems working with large data

2007-11-15 Thread jim holtman

A little more information might be useful.  If your matrix is numeric,
then a single copy will require about 250MB of memory.  What type of
system are you on and how much memory do you have?  When you say you
are having problems, what are they?  Is it a problem reading the data
in?  Are you getting allocation errors?  Is your system paging?  If
you have 2GB of memory, you should be fine depending on how many
copies of the data you have.
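
The figure of about 250MB can be checked with a little arithmetic, assuming 8 bytes per numeric (double) value and ignoring the small per-object header (rows, cols, bytes and small are names made up here):

```r
rows  <- 581012
cols  <- 55
bytes <- rows * cols * 8   # 8 bytes per double
bytes / 2^20               # size of one copy in MiB, roughly 244

# the same rule on a matrix small enough to allocate anywhere
small <- matrix(0, nrow = 1000, ncol = 55)
as.numeric(object.size(small))   # a little over 1000 * 55 * 8 bytes
```

Functions that copy or transform the data internally hold several such copies at once, which is why the number of copies matters more than the size of one.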

On Nov 15, 2007 10:53 AM,  [EMAIL PROTECTED] wrote:

 Hi,

 I'm working with a numeric matrix with 55 columns and 581012 rows and I'm
 having problems with memory allocation when using some of the functions in R: for
 example lda, rda (library MASS), princomp (package mva) and mvnorm.etest
 (energy package). I've read the tips on using less memory in help(read.table) and
 managed to use some of these functions, but haven't been able to work with
 mvnorm.etest.

 I would like to know the best way to solve this problem, as well as how to do
 it faster.

 Best regards,

 Pedro Marques


Re: [R] read complicated file

2007-11-15 Thread jim holtman

 -198
 175 -17912  -47 27  186 -18030  0   -25   
   -91 164 117 -155188 149 -28 24  5   20  
 -31 52  -78 45  -133-63 -77 75  -183
 130 -119-47 -8  -40 64  209 166 48  -65   
   -244111 110 -106-248-21 -1732   -38 111 
 30  -174257
 59  -180-73 -278-124-22 107 164 73  160   
   -136-37 119 -10 100 -4  0   182 152 35  
 256 70  148 -9  -4  0   49  128 -44 
 21  36  143 -114-59 -1107   -40 -80 -70   
   99  27  -27 184 293 257 -83 44  101 65  
 -68 -167158 94  -39 130 59  -34934  
 47  -10870  141 55  138 -20 -83 81  -15   
   74  -107140 -280107 -32583  125 -64 200 
 -122123 -280
 21
 ...

 The first bit up to END can be skipped. That's the first 90 lines.

 Then I need to do something like this:
 while data still exist in the file
 {
 skip 3 lines
 scan 81 values into temp
 scan 82nd value, which is 11, 12, 21, 22. Depending on value, temp is
 added to one of these vars
 }

 The data are written in clumps.
 Each clump has 3 lines with info I don't need.
 Then it has 81 values which are the actual data I want to read into
 some variable temp
 Then the 82nd value tells me which of 4 variables to add temp onto.

 Any tips on how to approach this using scan() greatly appreciated. I
 know I can use skip as an argument to scan.

 Thanks very much for any help!

 Bill





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] overlapping intervals

2008-02-01 Thread jim holtman

Here is one way of doing it:

> c.i <- read.table(textConnection("17130612   17587118
+ 17712302   18221688
+ 21225764   21387314
+ 25012714   30748348
+ 33852816   34480192
+ 36012944   36209144
+ 36252300   36280276
+ 36737468   36971144
+ 43693832   43878548"))

> d.i <- read.table(textConnection("17712302   18100404
+  21203780   21387314
+  25012714   30748348
+  33852816   34384588
+  34794536   35996440"))
> closeAllConnections()

> # setup data.frame for comparing
> x <- rbind(data.frame(t=c.i$V1, oper=1, type='c'),
+ data.frame(t=c.i$V2, oper=-1, type='c'),
+ data.frame(t=d.i$V1,oper=1, type='d'),
+ data.frame(t=d.i$V2, oper=-1, type='d'))

> # put in time order
> x <- x[order(x$t),]
> # determine overlaps
> x$over <- cumsum(x$oper)
> x
          t oper type over
1  17130612    1    c    1
10 17587118   -1    c    0
2  17712302    1    c    1
19 17712302    1    d    2
24 18100404   -1    d    1
11 18221688   -1    c    0
20 21203780    1    d    1
3  21225764    1    c    2
12 21387314   -1    c    1
25 21387314   -1    d    0
4  25012714    1    c    1
21 25012714    1    d    2
13 30748348   -1    c    1
26 30748348   -1    d    0
5  33852816    1    c    1
22 33852816    1    d    2
27 34384588   -1    d    1
14 34480192   -1    c    0
23 34794536    1    d    1
28 35996440   -1    d    0
6  36012944    1    c    1
15 36209144   -1    c    0
7  36252300    1    c    1
16 36280276   -1    c    0
8  36737468    1    c    1
17 36971144   -1    c    0
9  43693832    1    c    1
18 43878548   -1    c    0
> # assuming that c & d don't overlap themselves, over == 2 indicates an overlap
> overlap <- which(x$over == 2)
> # print overlaps
> for (i in overlap){
+ print(x[i + c(-1,0,1,2),])
+ }
          t oper type over
2  17712302    1    c    1
19 17712302    1    d    2
24 18100404   -1    d    1
11 18221688   -1    c    0
          t oper type over
20 21203780    1    d    1
3  21225764    1    c    2
12 21387314   -1    c    1
25 21387314   -1    d    0
          t oper type over
4  25012714    1    c    1
21 25012714    1    d    2
13 30748348   -1    c    1
26 30748348   -1    d    0
          t oper type over
5  33852816    1    c    1
22 33852816    1    d    2
27 34384588   -1    d    1
14 34480192   -1    c    0
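
If only the overlapping regions themselves are needed, a different approach is to compare every c interval with every d interval and keep the pairs whose intersection is non-empty. The following is a minimal sketch of that idea, not the method used above; the column names start and end and the object names pairs, lo, hi and overlaps are made up for illustration, and only the first few intervals from the question are used.

```r
# First three 'c' intervals and first two 'd' intervals from the question
c.i <- data.frame(start = c(17130612, 17712302, 21225764),
                  end   = c(17587118, 18221688, 21387314))
d.i <- data.frame(start = c(17712302, 21203780),
                  end   = c(18100404, 21387314))

# Every combination of a row of c.i with a row of d.i
pairs <- expand.grid(ci = seq_len(nrow(c.i)), di = seq_len(nrow(d.i)))

# The intersection of two intervals runs from the larger start
# to the smaller end; it is empty when that start exceeds that end
lo <- pmax(c.i$start[pairs$ci], d.i$start[pairs$di])
hi <- pmin(c.i$end[pairs$ci],   d.i$end[pairs$di])

overlaps <- data.frame(pairs, start = lo, end = hi)[lo <= hi, ]
overlaps
```

This does nrow(c.i) * nrow(d.i) comparisons, so it is only sensible for modest numbers of intervals; the sorted cumsum approach above scales better.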



On Feb 1, 2008 4:03 PM, mohamed nur anisah [EMAIL PROTECTED] wrote:
 hi!!

  Below are two sets of intervals, c and d, which are unequal in length. Each
 has 2 columns giving the start and end of an interval. How can I find where
 the intervals in c overlap the intervals in d? Please help me sort out this
 problem. Thanks in advance.

 c   d
  17130612   17587118 17712302   18100404
 17712302   18221688 21203780   21387314
 21225764   21387314 25012714   30748348
 25012714   30748348 33852816   34384588
 33852816   34480192 34794536   35996440
 36012944   36209144
 36252300   36280276
 36737468   36971144
 43693832   43878548






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Plotting 3 vectors on one graph.

2008-02-02 Thread jim holtman

What you want to do is to use 'plot' for the initial vector and then
lines to add the other two.  You will have to set the range of the
y-axis in the initial call to plot.  The sequence would probably
look like this:

plot(a, ylim=range(a, b, c), col='black')
lines(b, col='red')
lines(c, col='green')
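
As an alternative sketch, matplot() from base graphics draws all of the series in one call and works out a y-axis range that covers them. The values of a, b and c below are made-up data for illustration only.

```r
# Hypothetical example data; any numeric vectors of equal length will do
a <- c(1, 3, 2, 5, 4)
b <- c(2, 2, 4, 4, 6)
c <- c(0, 1, 1, 3, 2)

# cbind() puts one series in each column; matplot() draws every column
# against the row index and sizes the y-axis to fit all of them
m <- cbind(a, b, c)
cols <- c("black", "red", "green")
matplot(m, type = "l", lty = 1, col = cols, xlab = "Index", ylab = "Value")
legend("topleft", legend = colnames(m), lty = 1, col = cols)
```

The plot()/lines() sequence above gives more control over each series; matplot() is simply shorter when the vectors have the same length.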

On Feb 1, 2008 7:06 PM, cvandy [EMAIL PROTECTED] wrote:

 I'm an R newbie and am trying to plot 3 vectors, say a,b,c.  I have
 downloaded 3 R manuals and searched your forum.  There are plenty of X vs Y
 examples, but cannot find how to plot 3, or more vectors one one graph.  I'm
 sure I overlooked something.
 Thanks for any help.
 CHV




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Ignore error t.test in a loop

2008-02-02 Thread jim holtman

?try

e.g.,

for(i in x) try(t.test(...))
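
A slightly fuller sketch of the same idea, which also writes the error message to a file as the question asks. The data in groups, the log file location and the names p_values and log_file are all invented for illustration.

```r
# Hypothetical data: the second group is constant, so t.test() fails on it
groups <- list(g1 = c(5.1, 4.9, 5.6, 5.8, 6.0),
               g2 = c(3, 3, 3, 3, 3),
               g3 = c(2.2, 2.9, 3.1, 2.5, 2.7))

log_file <- tempfile(fileext = ".txt")   # assumed location for the log
p_values <- rep(NA_real_, length(groups))
names(p_values) <- names(groups)

for (nm in names(groups)) {
  # silent = TRUE stops try() from printing the error itself
  res <- try(t.test(groups[[nm]]), silent = TRUE)
  if (inherits(res, "try-error")) {
    # record the failure and carry on with the next iteration
    cat(nm, ":", conditionMessage(attr(res, "condition")), "\n",
        file = log_file, append = TRUE)
  } else {
    p_values[nm] <- res$p.value
  }
}
p_values
```

Groups that failed keep their NA in p_values, so they are easy to pick out afterwards.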



On Feb 2, 2008 7:43 AM, My Coyne [EMAIL PROTECTED] wrote:
 Hi,

 I place a t.test in a loop and would like to continue to process the loop
 even when t.test encounter error.  How do I do that?For example, in one
 iteration, the data is completely constant and t.test gives error, the
 entire program terminates.  I would like to write the information out to a
 file, and the loop should continue.



 Thanks



 My D. Coyne






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] calculation fraction/ratio

2008-02-02 Thread jim holtman

Is this what you want?

> x <- read.table(textConnection("Index A
+ 1 1
+ 1 2
+ 1 3
+ 2 4
+ 2 3
+ 3 7
+ 3 9
+ 3 3
+ 3 1"), header=TRUE)
> data.frame(x, value=ave(x$A, x$Index, FUN=function(z) z / sum(z)))
  Index A     value
1     1 1 0.1666667
2     1 2 0.3333333
3     1 3 0.5000000
4     2 4 0.5714286
5     2 3 0.4285714
6     3 7 0.3500000
7     3 9 0.4500000
8     3 3 0.1500000
9     3 1 0.0500000


On Feb 1, 2008 6:19 PM, YIHSU CHEN [EMAIL PROTECTED] wrote:
 Dear R users

 I wonder if there is a quick way to calculate the ratio/fraction of a
 list/data frame.  For example, if I have a data frame with two fields:
 Index and A.  I would like to know the fractions of A's within the same
 Index.   That is, for Index =1, three fractions will be 1/(1+2+3)=0.17,
 2/(1+2+3)=0.33, and 3/(1+2+3)=0.5.   Likewise for Index =2 and Index 3.
 So, I then generate a new vector of 0.17, 0.33, 0.5... ,etc.


  Index A
  1     1
  1     2
  1     3
  2     4
  2     3
  3     7
  3     9
  3     3
  3     1

 Thank you so much

 Yihsu





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


[R] Call for papers for CMG'08

2008-02-02 Thread jim holtman

UseR,

I am on the conference committee for CMG'08 (Computer Measurement
Group - www.cmg.org).  At the last several conferences I have
presented papers, and workshops, on the use of R in computer
performance analysis and visualization.  I am sending this out to see
if anyone would be interested in submitting a paper for our next
conference that will be held in Las Vegas in December, 2008. If so,
you can check the website for details, or ask me.

I am especially interested in papers on the visualization of data
(since this will be one of the hot tracks at the conference), and
especially using R, since this is an opportunity to introduce an
audience whose job it is to analyze data to the power of R.

If you don't want to submit a paper, but have some ideas, or
experiences, for using R as it relates to computer performance, please
let me know since I might be able to use some of the material to show
how other organizations are using R in this context.

Please feel free to contact me with any questions or comments.

Jim Holtman
[EMAIL PROTECTED]
+1 513 646 9390


What is the problem you are trying to solve?


Re: [R] precision in seq

2008-02-04 Thread jim holtman

FAQ 7.31
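
A minimal sketch of what that FAQ entry comes down to, added here for illustration: most decimal fractions such as 0.05 cannot be stored exactly in binary floating point, so compare within a tolerance rather than with ==. The tolerance 1e-8 below is an arbitrary choice.

```r
s <- seq(0, 1, 0.05)

# Exact comparison depends on the last bit of the stored values
s[20] == 0.95

# all.equal() compares within a small tolerance; wrap it in isTRUE()
# because it returns a description rather than FALSE on a mismatch
isTRUE(all.equal(s[20], 0.95))

# The same idea for a whole vector: which elements are close to 0.95?
which(abs(s - 0.95) < 1e-8)
```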

On 2/4/08, Eric Elguero [EMAIL PROTECTED] wrote:
 Hi everybody,

 this is a warning more than a question.

 I noticed that seq produces approximate results:

  seq(0,1,0.05)[19]==0.9
 [1] TRUE
  seq(0,1,0.05)[20]==0.95
 [1] FALSE
  seq(0,1,0.05)[21]==1
 [1] TRUE

  seq(0,1,0.05)[20]-0.95
 [1] 1.110223024625157e-16

 I do not understand why 0.9 and 1 are correct (within some
 tolerance or strictly exact?)  and 0.95 is not.

 this one works:

  ((0:20)/20)[20]==0.95
 [1] TRUE

 Eric Elguero




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] counting identical data in a column

2008-02-04 Thread jim holtman

Is this what you want?

> x <- read.table(textConnection("  chrN start end
+ 1 chr1  11122333  11122633
+ 2 chr1  11122333  11122633
+ 3 chr3  11122333  11122633
+ 8 chr3 111273334 111273634
+ 7 chr2  12122334  12122634
+ 4 chr1  21122377  21122677
+ 5 chr2  33122355  33122655
+ 6 chr2  33122355  33122655"), header=TRUE)
> x$count <- ave(x$start, x$start, FUN=length)
> x
  chrN     start       end count
1 chr1  11122333  11122633     3
2 chr1  11122333  11122633     3
3 chr3  11122333  11122633     3
8 chr3 111273334 111273634     1
7 chr2  12122334  12122634     1
4 chr1  21122377  21122677     1
5 chr2  33122355  33122655     2
6 chr2  33122355  33122655     2



On 2/4/08, joseph [EMAIL PROTECTED] wrote:
 Hi Peter
 I have the following data frame with chromosome name, start and end positions:
   chrN start end
 1 chr1  11122333  11122633
 2 chr1  11122333  11122633
 3 chr3  11122333  11122633
 8 chr3 111273334 111273634
 7 chr2  12122334  12122634
 4 chr1  21122377  21122677
 5 chr2  33122355  33122655
 6 chr2  33122355  33122655
 I would like to count the positions that have the same start and add a new 
 column with the count number;
 the new data frame should look like this:
   chrN     start       end count
 1 chr1  11122333  11122633     3
 2 chr1  11122333  11122633     3
 3 chr3  11122333  11122633     3
 8 chr3 111273334 111273634     1
 7 chr2  12122334  12122634     1
 4 chr1  21122377  21122677     1
 5 chr2  33122355  33122655     2
 6 chr2  33122355  33122655     2
 Can you please show me how to achieve this?
 Thanks
 Joseph


  
 



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] precision in seq

2008-02-05 Thread jim holtman

If you want 0,0.05,0.1,...0.95,1.00  then think about encoding as characters:

> sprintf("%.2f", seq(0, 1, 0.05))
 [1] "0.00" "0.05" "0.10" "0.15" "0.20" "0.25" "0.30" "0.35" "0.40" "0.45" "0.50" "0.55"
[13] "0.60" "0.65" "0.70" "0.75" "0.80" "0.85" "0.90" "0.95" "1.00"


then you won't have the problem of dealing with floating point
numbers, and still have the ability to later convert the character
strings back to numeric for processing.  Character strings will give
you the exact matches that you were expecting (but won't get) with
floating point.
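
A small sketch of how that matching might look in practice. The vector x and the names grid, key_x and key_grid are made up for illustration; the last element of x is deliberately computed rather than typed in.

```r
x    <- c(0.95, 0.3, 0.123, 19 * 0.05)   # hypothetical values to look up
grid <- seq(0, 1, 0.05)

# Match on the formatted strings instead of on the doubles
key_x    <- sprintf("%.2f", x)
key_grid <- sprintf("%.2f", grid)
key_x %in% key_grid

# An equivalent numeric encoding: work in whole hundredths
round(x * 100) %in% seq(0, 100, by = 5)
```

Both forms round to two decimals first, so a value such as 0.951 would also count as a match for 0.95; whether that is acceptable depends on the data.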

On 2/5/08, Eric Elguero [EMAIL PROTECTED] wrote:
 thank you to all who answered.


  0+0.05+
 + 0.05+0.05+0.05+0.05+0.05+0.05+
 + 0.05+0.05+0.05+0.05+0.05+0.05+
 + 0.05+0.05+0.05+0.05+0.05+0.05 - 0.95
 [1] 3.330669e-16

  seq(0,1,0.05)[20] - 0.95
 [1] 1.110223e-16

  0+19*0.05 - 0.95
 [1] 1.110223e-16

 so this is the way seq calculates. I would have guessed
 that addition was more accurate than multiplication,
 but that is not the case.

 this one however bothers me:
  19/20-0.95
 [1] 0


 I noticed this problem when I tried to extract rows of a matrix
 according to whether values of some vector were in the set
 (0,0.05,...,0.95,1), with something like x%in%seq(0,1,0.05)
 Now I understand that I should not use this construction
 unless x is of type integer. Would you agree?

 Eric Elguero




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Vector loop

2008-02-05 Thread jim holtman

Not too sure of exactly what you want to do with the loop.  Here is
one that prints out the values:

> x <- 1:10
> for (i in x) print(i)
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10



On 2/5/08, mohamed nur anisah [EMAIL PROTECTED] wrote:
 hi,

  I'm in my learning process of doing a programming with for loop. How to 
 make a loop of a vector of length 10 where elements are 1,2,3,4,5,6,7,8,9,10. 
 Any suggestion needed!! Many thanks.

  Cheers,
  Anisah





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] error message from apply()

2008-02-05 Thread jim holtman

The error message was coming from the call to colMeans where 'x' was
not a matrix; it was a vector that resulted from the 'apply' call.
Did you intend to use 'mean' instead like this example:

> data2_1 <- matrix(c(0.9584190, 0.2710325, -0.9495618, -0.1301772, -0.7539687,
+ 0.5344464, -0.8205933, 0.1581723, -0.5351588, 0.04448065, 0.9936430,
+ 0.2278786, -0.8160700, -0.3314779, -0.4047975, 0.1168152, -0.7458182, -
+ 0.2231588, -0.5051651, -0.74871174, 0.9450363, 0.4797723, -0.9033313, -
+ 0.5825065, 0.8523742, 0.7402795, -0.7134312, -0.8162558, 0.6345438, -
+ 0.05704138), 3,10)

> num <- apply(data2_1, 2, function(x) {sum(x > (mean(x, na.rm = TRUE) +
+ 1*sd(x, na.rm = TRUE)), na.rm = TRUE)})
> num
 [1] 0 1 1 1 0 0 1 1 0 0



On Feb 5, 2008 8:43 PM, Ng Stanley [EMAIL PROTECTED] wrote:
 Hi,

 I keep getting the error message. Please help.

 Error in colMeans(x, na.rm = TRUE) :   'x' must be an array of at least two
 dimensions

 The codes are:

 data2_1 <- matrix(c(0.9584190, 0.2710325, -0.9495618, -0.1301772, -0.7539687,
 0.5344464, -0.8205933, 0.1581723, -0.5351588, 0.04448065, 0.9936430,
 0.2278786, -0.8160700, -0.3314779, -0.4047975, 0.1168152, -0.7458182, -
 0.2231588, -0.5051651, -0.74871174, 0.9450363, 0.4797723, -0.9033313, -
 0.5825065, 0.8523742, 0.7402795, -0.7134312, -0.8162558, 0.6345438, -
 0.05704138), 3,10)

 num <- apply(data2_1, 2, function(x) {sum(x > (colMeans(x, na.rm = TRUE) +
 1*sd(x, na.rm = TRUE)), na.rm = TRUE)})





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] how to plot an user-defined function

2008-02-05 Thread jim holtman

Your function 'll' only returns a single value when passed a vector:

> x <- seq(0,2,.1)
> ll(x)
[1] -7.571559


'plot' expects to pass a vector to the function and have it return a
vector of the same length; e.g.,

> sin(x)
 [1] 0.00000000 0.09983342 0.19866933 0.29552021 0.38941834 0.47942554 0.56464247 0.64421769 0.71735609
[10] 0.78332691 0.84147098 0.89120736 0.93203909 0.96355819 0.98544973 0.99749499 0.99957360 0.99166481
[19] 0.97384763 0.94630009 0.90929743


So you either have to rewrite your function, or have a loop that will
evaluate the function at each individual point and then plot it.
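
A minimal sketch of the second option, evaluating the function one point at a time with sapply(). The values of theta and s are invented here, since the original post does not show them, and ll_vec and tau_grid are names made up for this example.

```r
# Hypothetical data; 'theta' and 's' are not given in the question
theta <- c(0.5, 1.2, -0.3, 0.8)
s     <- c(0.4, 0.6, 0.5, 0.7)

# The function as posted: takes ONE tau and returns ONE number
ll <- function(tau) {
  w  <- 1 / (s^2 + tau^2)
  mu <- sum(theta * w) / sum(w)
  -1/2 * sum((theta - mu)^2 * w - log(w))
}

# Wrapper that calls ll() once per element, so the result has
# the same length as the input
ll_vec <- function(tau) sapply(tau, ll)

tau_grid <- seq(0, 2, by = 0.1)
plot(tau_grid, ll_vec(tau_grid), type = "l",
     xlab = "tau", ylab = "log-likelihood")
```

Because ll_vec returns one value per input, plot(ll_vec, 0, 2) should also work where plot(ll, 0, 2) did not.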

On Feb 5, 2008 7:06 PM, John Smith [EMAIL PROTECTED] wrote:
 Dear R-users,

 Suppose I have defined a likelihood function as ll(tau), how can I plot this
 likelihood function by calling it by plot?

 I want to do it like this:

 ll <- function(tau)
  {
    w <- 1 / (s^2 + tau^2)
    mu <- sum(theta * w) / sum(w)
    -1/2*sum((theta-mu)^2*w -log(w))
  }
 plot(ll, 0, 2)



 But have the following error:
 Error in xy.coords(x, y, xlabel, ylabel, log) :
  'x' and 'y' lengths differ
 In addition: Warning messages:
 1: In s^2 + tau^2 :
  longer object length is not a multiple of shorter object length
 2: In theta * w :
  longer object length is not a multiple of shorter object length
 3: In (theta - mu)^2 * w :
  longer object length is not a multiple of shorter object length


 Thanks





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] error message from apply()

2008-02-05 Thread jim holtman

Your matrix only has 3 rows, so when you do 'apply(data2_1,2,...)' you
are extracting columns which only have a length of 3, while thr has a
length of 10:

> str(data2_1)
 num [1:3, 1:10]  0.958  0.271 -0.950 -0.130 -0.754 ...
> str(thr)
 num [1:10]  1.060  0.528  0.104  0.925 -0.256 ...

That is why you get the error message of a size mismatch.

On Feb 5, 2008 10:21 PM, Ng Stanley [EMAIL PROTECTED] wrote:
 Replacing colMeans by mean removed the warning messages. Thanks

 However, when I precompute thr, and pass it to function(x), the error
 returns. Using the shorter data2_1, doesn't give any warnings. What is
 happening ?

 data2_1 <- matrix(c(0.9584190, 0.2710325, -0.9495618, -0.1301772, -0.7539687,
 0.5344464, -0.8205933, 0.1581723, -0.5351588, 0.04448065, 0.9936430,
 0.2278786, -0.8160700, -0.3314779, -0.4047975, 0.1168152, -0.7458182, -
 0.2231588, -0.5051651, -0.74871174, 0.9450363, 0.4797723, -0.9033313, -
 0.5825065, 0.8523742, 0.7402795, -0.7134312, -0.8162558, 0.6345438, -
 0.05704138), 3,10)
 # data2_1 <- matrix(c(0.9584190, 0.2710325, -0.9495618, -0.1301772, -
 # 0.7539687, 0.5344464, -0.8205933, 0.1581723, -0.5351588), 3,3)

 thr <- colMeans(data2_1, na.rm = TRUE) + sd(data2_1, na.rm = TRUE)

 num <- apply(data2_1, 2, function(x) {
     sum(x > (thr), na.rm = TRUE)
 })







-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] inserting text lines in a dat frame

2008-02-06 Thread jim holtman

Try this and see if it is what you want:

x <- read.table(textConnection("   V1    V2 V3
1 chr1 11255 55
2 chr1 11320 29
3 chr1 11400 45
4 chr2 21680 35
5 chr2 21750 84
6 chr2 21820 29
7 chr2 31890 46
8 chr3 32100 29
9 chr3 52380 29
10 chr3 66450 46"), header=TRUE)
# write the two browser lines that start the file
cat("browser position chr1:1-1\nbrowser hide all\n", file='tempxx.txt')
# for each chromosome, append its two track lines followed by its rows
result <- lapply(split(x, x$V1), function(.chro){
    cat(sprintf("track type=wiggle_0 name=sample description=%s_sample visibility=full\nvariableStep chrom=%s span=1\n",
        as.character(.chro$V1[1]), as.character(.chro$V1[1])),
        file="tempxx.txt", append=TRUE)
    # quote=FALSE keeps the chromosome names unquoted in the output
    write.table(.chro, sep="\t", file="tempxx.txt", append=TRUE,
        quote=FALSE, col.names=FALSE, row.names=FALSE)
})



On Feb 5, 2008 11:22 PM, joseph [EMAIL PROTECTED] wrote:





 Hi Jim
  I am trying to prepare a bed file to load as a custom track on the UCSC
 genome browser.
 I have a data frame that looks like the one below.
  x
       V1    V2 V3
 1  chr1 11255 55
 2  chr1 11320 29
 3  chr1 11400 45
 4  chr2 21680 35
 5  chr2 21750 84
 6  chr2 21820 29
 7  chr2 31890 46
 8  chr3 32100 29
 9  chr3 52380 29
 10 chr3 66450 46
 I would like to insert the following 4 lines at the beginning:
 browser position chr1:1-1
 browser hide all
 track type=wiggle_0 name=sample description=chr1_sample visibility=full
 variableStep chrom=chr1 span=1
 and then insert 2 lines before each chromosome:
 track type=wiggle_0 name=sample description=chr2_sample visibility=full
 variableStep chrom=chr2 span=1
 The final result should be tab delimited file that looks like this:
 browser position chr1:1-1
 browser hide all
 track type=wiggle_0 name=sample description=chr1_sample visibility=full
 variableStep chrom=chr1 span=1
 chr1 11255 55
 chr1 11320 29
 chr1 11400 45
 track type=wiggle_0 name=sample description=chr2_sample visibility=full
 variableStep chrom=chr2 span=1
 chr2 21680 35
 chr2 21750 84
 chr2 21820 29
 track type=wiggle_0 name=sample description=chr3_sample visibility=full
 variableStep chrom=chr3 span=1
 chr3 32100 29
 chr3 32170 45
 chr3 32240 45
 Any kind of help or guidance will be much appreciated.
 Joseph

 



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] error message from apply()

2008-02-06 Thread jim holtman

Is 'thr' supposed to be the mean and sd of all the values in data2_1?
If so, then

thr <- mean(data2_1, na.rm=TRUE) + sd(data2_1,na.rm=TRUE)

I am not exactly sure of what is the problem that you are trying to
solve.  You just have to make sure that the object you are creating
by precomputing has the right structure to do what you want.
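
If instead each column is meant to be compared with its own threshold, one possible sketch is below. It assumes the intent is "column mean plus column standard deviation", which is a guess at what was wanted, and it computes the column SDs explicitly with apply() rather than relying on what sd() does when handed a whole matrix.

```r
data2_1 <- matrix(c(0.9584190, 0.2710325, -0.9495618, -0.1301772, -0.7539687,
                    0.5344464, -0.8205933, 0.1581723, -0.5351588, 0.04448065,
                    0.9936430, 0.2278786, -0.8160700, -0.3314779, -0.4047975,
                    0.1168152, -0.7458182, -0.2231588, -0.5051651, -0.74871174,
                    0.9450363, 0.4797723, -0.9033313, -0.5825065, 0.8523742,
                    0.7402795, -0.7134312, -0.8162558, 0.6345438, -0.05704138),
                  3, 10)

# One threshold per column (length 10, matching the 10 columns)
thr <- colMeans(data2_1, na.rm = TRUE) + apply(data2_1, 2, sd, na.rm = TRUE)

# sweep() subtracts thr[j] from every element of column j, so each column
# is compared with its own threshold; colSums() then counts the TRUEs
num <- colSums(sweep(data2_1, 2, thr) > 0, na.rm = TRUE)
num
```

The point is that thr is indexed by column, so it has to be applied column by column; inside apply(data2_1, 2, ...) the function only sees one column of length 3 and has no way to know which element of thr belongs to it.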

On Feb 6, 2008 12:56 AM, Stanley Ng [EMAIL PROTECTED] wrote:
 Now I understand why 3 by 3 data2_1 works and not the 3x10 data2_1.

 How can I precompute thr and pass it safely to function(x) for the column
 operation ?






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] matrix loop

2008-02-06 Thread jim holtman

What exactly are you intending the loop to do?  Why do you have the
'as.matrix' in the middle of the loop?  Where was 'y' defined?   Does
this do what you want?

> outer(1:5, 1:10, "+")
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    2    3    4    5    6    7    8    9   10    11
[2,]    3    4    5    6    7    8    9   10   11    12
[3,]    4    5    6    7    8    9   10   11   12    13
[4,]    5    6    7    8    9   10   11   12   13    14
[5,]    6    7    8    9   10   11   12   13   14    15
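
If the explicit loop is the point of the exercise, a minimal sketch of a working version is below. It assumes the goal is y[i, j] = m[i] + n[j]; the key changes are creating y before the loop and not overwriting it inside the loop.

```r
m <- 1:5
n <- 1:10

# Create the result matrix first so that y[i, j] has somewhere to go
y <- matrix(NA_integer_, nrow = length(m), ncol = length(n))

for (i in seq_along(m)) {
  for (j in seq_along(n)) {
    y[i, j] <- m[i] + n[j]
  }
}

# Check against the vectorised version
all(y == outer(m, n, "+"))
```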



On Feb 6, 2008 7:52 PM, mohamed nur anisah [EMAIL PROTECTED] wrote:
 Dear list,

  I'm trying to make a loop of a (5x10) matrix and below are my codes. Could 
 anybody help me figure out why my loop is not working. Thanks in advance!!


  m<-1:5
 n<-1:10
 for(i in 1:length(m))
 { for(j in 1:length(n))
  {
  y[i,j]=sum(i,j)
  y<-as.matrix(y[i,j])
  }
  }
  cheers,
  Anisah






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Subsetting a data.frame degenerates at one column?

2008-02-08 Thread jim holtman

try:

input[,targets, drop=FALSE]

see:

?"["

for an explanation.
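
A short illustration of the difference, using the first three rows of 'input' from the question (row names left out for brevity):

```r
# First three rows of 'input' from the question
input <- data.frame(disk1 = c(0, 0, 0), disk2 = c(1, 1, 1),
                    disk3 = c(0, 0, 0), disk4 = c(4, 5, 5))

targets <- c("disk2")

one_col_vec <- input[, targets]                # collapses to a plain vector
one_col_df  <- input[, targets, drop = FALSE]  # stays a data frame
one_col_alt <- input[targets]                  # list-style indexing also keeps a data frame

class(one_col_vec)
class(one_col_df)
class(one_col_alt)
```

Either of the last two forms should work in a general plotting framework, since the result is a data frame no matter how many names are in 'targets'.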


On 2/8/08, Allen S. Rout [EMAIL PROTECTED] wrote:

 Greetings.

 At the moment, I'm applying R to some AIX 'nmon' output, trying to get
 a handle on some disk performance metrics.  In case anyone's
 interested:

 http://docs.osg.ufl.edu/tsm/pdf/

 some of them are more edifying than others. (ahem)

 I'm trying to develop a somewhat general framework for plotting these
 measures, in the hopes that it's of some use to people other than me.
 One obstacle I encounter is that, when I select one column out of a
 data.frame, the result is no longer a data.frame.  So, say I've got,
 in data frame 'input'

  disk1 disk2 disk3 disk4
 T 0 1 0 4
 T0001 0 1 0 5
 T0002 0 1 0 5
 T0003 0 2 0 4
 T0004 0 2 0 3
 T0005 0 1 0 3
 T0006 0 0 0 3

 and somewhere I've noted a list

 targets <- c('disk2','disk3')

 I can say

 input[,targets]
  disk2 disk3
 T 1 0
 T0001 1 0
 T0002 1 0
 T0003 2 0
 T0004 2 0
 T0005 1 0
 T0006 0 0

 but if

 targets <- c('disk2')
 input[,targets]
 [1] 1 1 1 2 2 1 0

 Ick.

 I've been reading through the indexing and data.frame docs, and remain
 unsatisfied so far.  Where is my thinking going wrong?



 - Allen S. Rout


Re: [R] FW: merge multiple csv files

2008-02-08 Thread jim holtman

Don't have your data, but something like this is close:

# something like the following.  read into a list for easier processing
allFiles <- Sys.glob("sample*.csv")
results <- lapply(allFiles, function(.file){
    # extract number from file name
    num <- as.integer(sub("^.*?([[:digit:]]+).*", "\\1", .file, perl=TRUE))
    # skip the 5 descriptive rows; sep="," because the files are csv
    .in <- read.table(.file, skip=5, sep=",")
    .in$obs <- num
    .in
})

# combine into a single dataframe
result <- do.call(rbind, results)

# now do your processing for average
z <- split(result, result[,1])  # split by first column
do.call(rbind, lapply(z, function(.avg){
    data.frame(x=.avg[1,1], y=mean(.avg[,2]))
}))
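
Note that splitting on the first column alone averages equal x values across all files. If the repeats should only be averaged within a file, something like the following might be closer. This is a sketch on a toy stand-in for 'result'; the column names V1 and V2 are what read.table would assign, and all rows except the (4, 10) and (4, 12) pair are made up:

```r
# Toy stand-in for 'result'; only the (4, 10), (4, 12) pair for file 5
# comes from the question, the other rows are invented for illustration.
result <- data.frame(
  V1  = c(4, 4, 7, 4),
  V2  = c(10, 12, 3, 20),
  obs = c(5L, 5L, 5L, 11L)
)

# Average V2 within each (file, x) pair rather than across all files
averaged <- aggregate(V2 ~ obs + V1, data = result, FUN = mean)
names(averaged) <- c("location", "x", "y")
averaged
```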



On 2/8/08, Gator Connection [EMAIL PROTECTED] wrote:





 Dear list: I have a folder that contains more than 50 csv files labeled
 sequentially, like sample01.csv to sample50.csv. For each file the first 5
 rows are descriptive of the data collected (useful but not needed in the data
 merge). Each file then starts the data at row 6 and has 2 variables, x and y.
 In order to know which file an observation is from, I'd like to have a new
 variable location; for example, if the data are from file sample11.csv, then
 the location for that obs is 11. Another difficulty is that two observations
 might actually be repeated; for example, sample05.csv might contain (4, 10)
 and (4, 12). I'd like to average them into (4, 11).  Any suggestions are
 welcome. Jack


Re: [R] how to extract characters from a character string

2008-02-08 Thread jim holtman

This should do it for you:

> x
[1] "32?35.421 N"
> sub("^.*?([[:digit:].]+) N", "\\1", x, perl=TRUE)
[1] "35.421"
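
Since the goal is to calculate distances, the extracted text probably needs converting to a number afterwards. A small sketch of that extra step (sub() is vectorized, so the same line should work on a whole column of such strings):

```r
x <- "32?35.421 N"

# Pull out the digits-and-dots run just before " N", then convert to numeric
minutes <- as.numeric(sub("^.*?([[:digit:].]+) N", "\\1", x, perl = TRUE))
minutes
is.numeric(minutes)
```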



On 2/8/08, Weidong Gu [EMAIL PROTECTED] wrote:
 Hi, I ran into a problem when I compiled a dataset with UTM coordinates.
 For calculating distances between sites, I need to reformat the
 coordinates from, for example,



 32?35.421 N, to 35.421, i.e. I need to delete all digits before symbol ?
 and a space and N at the end of the string. What functions should I use?




 Thanks in advance.





 Weidong Gu,

 Department of Medicine
 University of Alabama, Birmingham







Re: [R] Can I index a dataframe with a reference from/to a second dataframe?

2008-02-08 Thread jim holtman

, 1L, 8L, 7L, 9L, 23L, 10L, 28L, 11L, 12L,
31L, 30L, 17L, 16L, 4L, 5L, 3L, 25L, 22L, 20L, 24L, 21L,
26L, 27L, 19L, 2L, 18L, 32L, 33L), .Label = c(Abies balsamea,
Acer pensylvanicum, Acer rubrum, Acer saccharum, Acer
 spicatum,
Amelanchier, Betula alleghaniensis, Betula papyrifera,
Cornus alternifolia, Cornus canadensis, Diervilla lonicera,
Dirca palustris, Fagus grandifolia, Fraxinus americana,
Fraxinus nigra, Lonicera canadensis, Ostrya virginiana,
Picea glauca, Picea mariana, Pinus resinosa, Pinus strobus,
Populus tremuloides, Prunus serotina, Prunus virginiana,
Quercus rubra, Ribes , Sorbus americana, Thuja occidentalis,

Tilia americana, Tsuga canadensis, Ulmus americana,
Viburnum acerifolium, Viburnum lantanoides), class = factor),
Cname = structure(c(7L, 27L, 30L, 2L, 3L, 6L, 31L, 33L, 1L,
8L, 10L, 15L, 11L, 21L, 4L, 14L, 20L, 17L, 18L, 23L, 25L,
24L, 29L, 26L, 12L, 16L, 13L, 5L, 9L, 28L, 32L, 22L, 19L), .Label =
 c(Alternate-leaved Dogwood,
American Basswood, American Beech, American Elm, American
 Mountain-ash,
Balsam Fir, Black Ash, Black Cherry, Black Spruce,
Bunchberry, Bush Honeysuckle, Choke Cherry, Currant,
Eastern Hemlock, Eastern White Cedar, Eastern White Pine,
Fly Honeysuckle, Hard Maple, Hobblebush, Ironwood,
Leatherwood, Maple-leaved Viburnum, Mountain Maple,
Northern Red Oak, Red Maple, Red Pine, Serviceberry,
Striped Maple, Trembling Aspen, White Ash, White Birch,
White Spruce, Yellow Birch), class = factor)), .Names =
 c(spp,
 spp.orig, OPL, form, Type, keep, Sname, Cname), row.names
 = c(1,
 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
 26, 27, 28, 29, 30, 31, 32, 33, 34), class =
 data.frame)

 Thanks, DaveT.
 *
 Silviculture Data Analyst
 Ontario Forest Research Institute
 Ontario Ministry of Natural Resources
 [EMAIL PROTECTED]
 http://ofri.mnr.gov.on.ca


Re: [R] Vector Size

2008-02-08 Thread jim holtman

How much memory do you have on your system?  What type of system do
you have?  There is information in the archive about generating a
sequence like this without having to have it all in memory at once.
BTW, your matrix will require about 525MB to store a single copy as
integers (about 1GB as doubles), so you will probably need at least
2-3X that to create it and do something with it.
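
To make the size estimate concrete, and to show one way of stepping through the combinations without holding them all in memory, here is a rough sketch. The helper next_comb() is hypothetical (it is not from any package or from the archive); it just returns the next combination in lexicographic order:

```r
n_comb <- choose(53, 6)          # number of 6-number combinations
n_comb * 6 * 4 / 2^20            # MB for one integer copy (4 bytes per value)

# Hypothetical helper: given one combination of k values out of 1..n,
# return the next one in lexicographic order, or NULL after the last.
next_comb <- function(cur, n) {
  k <- length(cur)
  i <- k
  # find the rightmost position that can still be incremented
  while (i >= 1 && cur[i] == n - k + i) i <- i - 1
  if (i < 1) return(NULL)
  cur[i] <- cur[i] + 1L
  if (i < k) cur[(i + 1):k] <- cur[i] + seq_len(k - i)
  cur
}

# Walk through all combinations of 3 out of 5, one at a time
cur <- 1:3
count <- 0
while (!is.null(cur)) {
  count <- count + 1             # process 'cur' here instead of storing it
  cur <- next_comb(cur, 5)
}
count == choose(5, 3)
```

Also note that combn() returns one combination per column, so if the full matrix does fit, t(combn(datos, 6)) gives one combination per row; wrapping it in matrix(..., byrow=TRUE) would scramble the rows.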

On Feb 8, 2008 7:28 PM, Oscar A [EMAIL PROTECTED] wrote:

 Hello everybody!!
 I'm from Colombia (South America) and I'm new on R.  I've been trying to
 generate all of the possible combinations for a 6 number combination with
 numbers that ranges from 1 to 53.

 I've used the following commands:

 datos<-c(1:53)
 M<-matrix(data=(combn(datos,6,FUN=NULL,simplify=TRUE)),nrow=22957480,ncol=6,byrow=TRUE)

 Once the commands are executed, the program shows the following:

 Error: CANNOT ALLOCATE A VECTOR OF SIZE 525.5 Mb


 How can I fix this problem?

Re: [R] error in the function

2008-02-09 Thread jim holtman

A quick look at it shows you would be trying to access y[n+1] in the
last pass of the inner loop; that is beyond the number of entries in 'y',
so you get an NA, and an NA is not legal as an 'if' condition.
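
A small demonstration of the problem on a made-up vector, as a sketch; the fix shown (stopping the index at n - 1) is only one possibility and assumes y[j+1] is really what the comparison needs:

```r
y <- c(10, 20, 30)     # made-up values, just to show the indexing
n <- length(y)

y[n + 1]               # one past the end gives NA, not an error

# An NA condition is what makes 'if' stop with an error
res <- tryCatch(if (y[n + 1] > 15) "bigger" else "smaller",
                error = function(e) conditionMessage(e))
res

# Stopping at n - 1 keeps y[j + 1] inside the vector
inside <- vapply(seq_len(n - 1), function(j) !is.na(y[j + 1]), logical(1))
all(inside)
```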

On Feb 9, 2008 6:07 PM, mohamed nur anisah [EMAIL PROTECTED] wrote:
 Dear lists,

  i want to find the non-overlapping interval values with this code:

  mysetdiff=function(x,y){
  m=length(x)
  n=length(y)
   bx = logical(m)
   by = logical(n)
for(i in 1:m){
 for(j in 1:n){
  if(x[i]=y[j+1]){
bx[i] = T
by[j] = T
NA= NA
  }
}
  }
 sx = x[!bx]
   sy = y[!by]
  s=c(sx,sy)
 return(s)
  }

  Below is my dataset. When I called my function with the
 code mysetdiff(f,e), an error occurred: Error in if (x[i] = y[j + 1]) { :
 missing value where TRUE/FALSE needed. How am I going to fix my function so
 that I can get the values of my non-overlapping intervals? Any suggestions?
 Thanks a bunch!!

   e
  [1]  17130612  17712302  21225764  25012714  33852816  36012944  36252300
  [8]  36737468  43693832  44148616  45318876  45852632  53258208  58530988
 [15]  60437872  72516480  79673224  93128744  94269896  95868704  99651504
 [22] 113688560 131101008 132955984 135891280 141318144 148257888 156158176
 [29] 157797616 162055856 168221296 173125232 176267104 182826240 183742528
 [36] 184401728 190671888 196639616  17587118  18221688  21387314  30748348
 [43]  34480192  36209144  36280276  36971144  43878548  44496056  45740012
 [50]  46752088  53700056  58603536  60691012  72757696  80077728  93181480
 [57]  94474624  97418088 106596368 120128352 132462320 132980744 135998880
 [64] 142259520 151591840 156920960 157838176 162743136 168466848 173167936
 [71] 176338384 182930096 184149776 185735712 190910576
  f
  [1]  17712302  21203780  25012714  33852816  34794536  36012944  37891284
  [8]  43693832  44148616  45852632  53289188  61573112  63664928  72516480
 [15]  79673224  94474624  95868704  99651504 113688560 125159688 127388568
 [22] 131101008 154599216 176267104 181504912 182562720 182826240 183742528
 [29] 196841904  18100404  21387314  30748348  34384588  35996440  36252300
 [36]  37942556  43878548  44496056  46752088  53700056  62637560  63969952
 [43]  72757696  80077728  94617360  97144032 106596368 120128352 127220456
 [50] 127504536 132462320 154717312 176338384 181836032 182687824 182930096
 [57] 184149776





Re: [R] Length problem

2008-02-11 Thread jim holtman

You were asking for the length of coppie[1], which is a one-element
piece of coppie, so it is of course 1.  Did you mean to say length(coppie)?
length(data[,4]) is asking how many elements are in that column, which
seems to be 5.  Also, your statement

coppie <- c(data[4:length(data)])

seems strange.  What did you intend to do?
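
To see what each of these lengths refers to, here is a sketch on a small stand-in for 'data' (five rows and five columns; the values are made up and only the shape matters):

```r
# Stand-in for 'data'; values are invented, only the shape matters
data <- data.frame(a = 1:5, b = 6:10, c = 11:15, d = 16:20, e = 21:25)

coppie <- c(data[4:length(data)])   # columns 4 onwards, as a plain list

length(data)          # number of columns, not rows
nrow(data)            # number of rows
length(coppie)        # number of columns kept
length(coppie[1])     # a one-element sub-list
length(coppie[[1]])   # the first kept column itself
```

If the aim was a data frame with the remaining columns, data[, 4:ncol(data)] would probably be the more usual way to write it.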

On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote:

   Hi all
   I have this problem:
   In my database .dta, called data, I have five rows
   data<-read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
   # From this database I would like to create another
   coppie<-c(data[4:length(data)])
   but I find this

   # Length of  original data
   length(data[,4])
   5   RIGHT!!
   # Length of new data
   length(coppie[1])
   1  WHY??
   Thank you all for your help
   Paolo Grillo

Re: [R] help with bwplot

2008-02-12 Thread jim holtman

Not the most straightforward way, but I think it gets the job done:

# stringsAsFactors=TRUE keeps Scale a factor (no longer the default since R 4.0)
x <- read.table(textConnection("Ageclass Scale Mean Sex
1  21-40 BP 40.26667 female
2  41-60 BP 34.10714 female
3  61-79 BP 37.3     female
4  21-40 GH 30.25000 female
5  41-60 GH 39.00926 female
6  61-79 GH 49.3     female
7  21-40 MH 56.5     female
8  41-60 MH 62.42857 female
9  61-79 MH 72.72727 female
10 21-40 PF 25.86111 female
11 41-60 PF 42.42063 female
12 61-79 PF 52.17172 female
13 21-40 RE 38.09524 female
14 41-60 RE 42.85714 female
15 61-79 RE 42.42424 female
16 21-40 RP 20.0     female
17 41-60 RP 25.89286 female
18 61-79 RP 15.90909 female
19 21-40 SF 51.7     female
20 41-60 SF 63.9     female
21 61-79 SF 57.95455 female
22 21-40 VT 32.1     female
23 41-60 VT 36.96429 female
24 61-79 VT 33.18182 female
25 21-40 BP 35.0     male
26 41-60 BP 37.75000 male
27 61-79 BP 36.0     male
28 21-40 GH 42.16667 male
29 41-60 GH 41.89062 male
30 61-79 GH 41.4     male
31 21-40 MH 72.0     male
32 41-60 MH 66.60417 male
33 61-79 MH 75.2     male
34 21-40 PF 41.85185 male
35 41-60 PF 55.31250 male
36 61-79 PF 47.0     male
37 21-40 RE 37.03704 male
38 41-60 RE 54.16667 male
39 61-79 RE 46.7     male
40 21-40 RP 27.8     male
41 41-60 RP 28.12500 male
42 61-79 RP 20.0     male
43 21-40 SF 61.1     male
44 41-60 SF 66.40625 male
45 61-79 SF 60.0     male
46 21-40 VT 38.9     male
47 41-60 VT 30.93750 male
48 61-79 VT 42.0     male"), header=TRUE, stringsAsFactors=TRUE)
# setup the plot for the max range
plot(0, type='n', ylim=range(x$Mean), xlim=range(as.integer(x$Scale)),
    xaxt='n', ylab="Mean", xlab="Scale")
# plot the axis
axis(1, at=seq_along(levels(x$Scale)), labels=levels(x$Scale))
# split the data
x.s <- split(x, list(x$Ageclass, x$Sex))
# plot the data
invisible(lapply(seq_along(x.s), function(.grp){
    lines(as.integer(x.s[[.grp]]$Scale), x.s[[.grp]]$Mean, col=.grp,
        type='o')
}))
legend('topleft', legend=names(x.s), lwd=3, col=seq_along(x.s))
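
Since the subject mentions lattice, an xyplot() call with 'groups' might be a shorter route to the same picture. This is only a sketch, using a dozen rows copied from the data (BP and GH scales) so that it runs on its own:

```r
library(lattice)

# Twelve rows copied from the data (BP and GH scales only)
x <- data.frame(
  Ageclass = rep(c("21-40", "41-60", "61-79"), times = 4),
  Scale    = rep(rep(c("BP", "GH"), each = 3), times = 2),
  Mean     = c(40.26667, 34.10714, 37.3, 30.25, 39.00926, 49.3,
               35.0, 37.75, 36.0, 42.16667, 41.89062, 41.4),
  Sex      = rep(c("female", "male"), each = 6),
  stringsAsFactors = TRUE
)

# One panel per sex, one connected line per age class
p <- xyplot(Mean ~ Scale | Sex, data = x, groups = Ageclass,
            type = "o", auto.key = list(columns = 3))
print(p)
```

This puts the three age-class lines for each sex in their own panel rather than all six lines on one set of axes.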




On Feb 12, 2008 12:21 PM, Tom Cohen [EMAIL PROTECTED] wrote:

  Dear list,

  I have the following data set, for which I want to plot the Scale variable
  on the x-axis and Mean on the y-axis for each Ageclass and for each sex. The
  Mean value of each Ageclass for each sex would be connected by a line. In
  total, there should be 6 lines, of which three represent the Mean values of
  each Ageclass for the respective sex. Are there any easy ways to do this in R?


    Ageclass Scale Mean     Sex
 1  21-40    BP    40.26667 female
 2  41-60    BP    34.10714 female
 3  61-79    BP    37.3     female
 4  21-40    GH    30.25000 female
 5  41-60    GH    39.00926 female
 6  61-79    GH    49.3     female
 7  21-40    MH    56.5     female
 8  41-60    MH    62.42857 female
 9  61-79    MH    72.72727 female
 10 21-40    PF    25.86111 female
 11 41-60    PF    42.42063 female
 12 61-79    PF    52.17172 female
 13 21-40    RE    38.09524 female
 14 41-60    RE    42.85714 female
 15 61-79    RE    42.42424 female
 16 21-40    RP    20.0     female
 17 41-60    RP    25.89286 female
 18 61-79    RP    15.90909 female
 19 21-40    SF    51.7     female
 20 41-60    SF    63.9     female
 21 61-79    SF    57.95455 female
 22 21-40    VT    32.1     female
 23 41-60    VT    36.96429 female
 24 61-79    VT    33.18182 female
 25 21-40    BP    35.0     male
 26 41-60    BP    37.75000 male
 27 61-79    BP    36.0     male
 28 21-40    GH    42.16667 male
 29 41-60    GH    41.89062 male
 30 61-79    GH    41.4     male
 31 21-40    MH    72.0     male
 32 41-60    MH    66.60417 male
 33 61-79    MH    75.2     male
 34 21-40    PF    41.85185 male
 35 41-60    PF    55.31250 male
 36 61-79    PF    47.0     male
 37 21-40    RE    37.03704 male
 38 41-60    RE    54.16667 male
 39 61-79    RE    46.7     male
 40 21-40    RP    27.8     male
 41 41-60    RP    28.12500 male
 42 61-79    RP    20.0     male
 43 21-40    SF    61.1     male
 44 41-60    SF    66.40625 male
 45 61-79    SF    60.0     male
 46 21-40    VT    38.9     male
 47 41-60    VT    30.93750 male
 48 61-79    VT    42.0     male
  Thanks for any help,
 Tom



Re: [R] how to specify modes of certain fields in read.table

2008-02-12 Thread jim holtman

If you want to use colClasses, then do:

read.table(, colClasses=rep('numeric', 50))
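
With 50 columns it may be easier to name only the columns whose class should be forced and let read.table guess the rest. A sketch on a two-column stand-in for the file, using values from the question (the text= argument is used here only so the example runs without a file):

```r
# Two-column stand-in for the 50-column file; values are from the question
txt <- "X Y
641673.78807 3607080.78438
641436.56207 3607108.30543"

# Name only the columns to force; the others are left for read.table to guess
d <- read.table(text = txt, header = TRUE,
                colClasses = c(X = "numeric", Y = "numeric"))

sapply(d, class)
print(d, digits = 12)   # the stored values are complete; only printing was short
```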

On Feb 12, 2008 5:40 PM, Weidong Gu [EMAIL PROTECTED] wrote:

 I have a data file with 50 columns. Among them, there are two
 coordinates, X and Y

 X             Y
 641673.78807  3607080.78438
 641436.56207  3607108.30543
 641165.28042  3607136.82957
 640879.58373  3607116.20568

 When I use read.table, it rounds X and Y to at most 8 digits, as follows:

 641673.8  3607081
 641436.6  3607108
 641165.3  3607137
 640879.6  3607116
 640683.5  3607105



 My question is how to specify these two columns in read.table. Maybe
 colClasses helps but I have 50 columns...



 Thanks



 Weidong Gu,

 Department of Medicine
 University of Alabama, Birmingham
 1900 University Blvd., Birmingham, Alabama 35294
 Email: [EMAIL PROTECTED]
 PH: (205)-975-9053





Re: [R] how to specify modes of certain fields in read.table

2008-02-12 Thread jim holtman

It is just printing them out with that significance; the numbers are stored
with about 15 digits.  If you want more, use 'options':

> x <- scan(textConnection("641673.78807
+
+ 3607080.78438
+
+ 641436.56207
+
+ 3607108.30543
+
+ 641165.28042
+
+ 3607136.82957
+
+ 640879.58373
+
+ 3607116.20568
+
+ "), what=0)
Read 8 items
> x
[1]  641673.8 3607080.8  641436.6 3607108.3  641165.3 3607136.8  640879.6 3607116.2
> options(digits=20)
> x
[1]  641673.78807 3607080.78438  641436.56207 3607108.30543  641165.28042 3607136.82957  640879.58373
[8] 3607116.20568
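
If changing the global option is more than is wanted, the number of digits can also be set for a single call. A few alternatives, as a sketch on two of the values above:

```r
x <- c(641673.78807, 3607080.78438)

print(x, digits = 12)        # more digits for this one call only
sprintf("%.5f", x)           # fixed number of decimals, returned as character
format(x, nsmall = 5)        # at least 5 decimals, returned as character

# Or change the session default and put it back afterwards
old <- options(digits = 12)
x
options(old)
```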




Re: [R] conflict within packages

2008-02-12 Thread jim holtman

you can use:

package::getNames()

to reference the one that you want.
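
The two contributed packages are not named in the question, so as a stand-in this sketch masks stats::median with a local function of the same name; the pkg::fun form then reaches the package version regardless of what is found first on the search path:

```r
# Mask a package function with a local one of the same name
median <- function(x) "local version"

masked   <- median(c(1, 5, 9))          # finds the local definition first
explicit <- stats::median(c(1, 5, 9))   # explicitly asks for the one in 'stats'

masked
explicit
```

With both packages loaded, pkgA::getNames() and pkgB::getNames() (hypothetical package names) can be used side by side without detaching anything.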

On Feb 12, 2008 3:45 PM, Elizabeth Purdom [EMAIL PROTECTED] wrote:

 Hi,
 I am trying to use two contributed packages, both of which have a
 function 'getNames'. So if I load them both they obviously conflict.
 Currently I manually detach one package and then reload the other to be
 able to use one function right after another. Is there anything else I
 can do?
 Best,
 Elizabeth


Re: [R] Reorder data frame columns by negating list of names

2008-02-12 Thread jim holtman

try this:

> x <- read.table(textConnection("   a  b  c  d  e  f  g  h
+ 1 1  6 11 16 21 26 31 36
+ 2 2  7 12 17 22 27 32 37
+ 3 3  8 13 18 23 28 33 38
+ 4 4  9 14 19 24 29 34 39
+ 5 5 10 15 20 25 30 35 40"), header=TRUE)
> # initial columns
> init.cols <- c('b', 'd', 'h')
> # now get the remaining
> remaining <- setdiff(colnames(x), init.cols)
> x[,c(init.cols, remaining)]
   b  d  h a  c  e  f  g
1  6 16 36 1 11 21 26 31
2  7 17 37 2 12 22 27 32
3  8 18 38 3 13 23 28 33
4  9 19 39 4 14 24 29 34
5 10 20 40 5 15 25 30 35
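
If this is needed often, the same idea could be wrapped in a small function. The name move_first() is made up for this sketch; it is not an existing function:

```r
# Hypothetical helper: move the named columns to the front,
# keeping the remaining columns in their original order
move_first <- function(df, cols) {
  stopifnot(all(cols %in% names(df)))
  df[, c(cols, setdiff(names(df), cols)), drop = FALSE]
}

xxx <- data.frame(matrix(1:40, ncol = 8))
names(xxx) <- letters[1:8]

reordered <- move_first(xxx, c("b", "d", "h"))
names(reordered)
```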



On Feb 12, 2008 12:19 PM, Thompson, David (MNR)
[EMAIL PROTECTED] wrote:
 Hello,

 I would like to reorder columns in a data frame by their names as
 demonstrated below:

 Take this data frame:
 xxx <- data.frame(matrix(1:40, ncol=8))
 names(xxx) <- letters[1:8]
 xxx
  a  b  c  d  e  f  g  h
1 1  6 11 16 21 26 31 36
2 2  7 12 17 22 27 32 37
3 3  8 13 18 23 28 33 38
4 4  9 14 19 24 29 34 39
5 5 10 15 20 25 30 35 40

 and reorder the columns like this:
 xxx[,c( c('b', 'd', 'h'), c('a', 'c', 'e', 'f', 'g') )]
   b  d  h a  c  e  f  g
1  6 16 36 1 11 21 26 31
2  7 17 37 2 12 22 27 32
3  8 18 38 3 13 23 28 33
4  9 19 39 4 14 24 29 34
5 10 20 40 5 15 25 30 35

 where I only have to name the columns that I'm interested in moving to
 the first few positions, something like:
 xxx[,c( c('b', 'd', 'h'), -c('b', 'd', 'h') )]
Error in -c("b", "d", "h") : invalid argument to unary operator

 Suggestions? and Thank you, DaveT.
 *
 Silviculture Data Analyst
 Ontario Forest Research Institute
 Ontario Ministry of Natural Resources
 [EMAIL PROTECTED]
 http://ofri.mnr.gov.on.ca


Re: [R] summary statistics

2008-02-12 Thread jim holtman

Here is one way of doing it (not exactly sure if 'mode' makes sense
with your data):

> x <- read.table(textConnection("RM   mgl
+ 1  215 0.9285714
+ 2  215 0.7352941
+ 3  215 1.6455696
+ 4  215 0.600
+ 5   sc 1.833
+ 6   sc 0.833
+ 7   sc 2.5438596
+ 8   sc 0.250
+ 9  202        NA
+ 10 202 0.550
+ 11 202 0.8148148
+ 12 202 1.667
+ 13 198 0.5038760
+ 14 198 0.3823529
+ 15 198 0.760
+ 16 198 0.480
+ 17  hc 3.1818182
+ 18  hc 3.7254902
+ 19  hc 4.375
+ 20  hc 2.6415094
+ 21 190 0.350
+ 22 190 0.440
+ 23 190 0.650
+ 24 190 0.500
+ 25  bc 9.000
+ 26  bc 5.000
+ 27  bc 4.000
+ 28  bc 3.200
+ 29 185 0.7386364
+ 30 185 0.500
+ 31 185 1.1538462
+ 32 185 0.600
+ 33 179 1.8181818
+ 34 179 1.198
+ 35 179 2.500
+ 36 179 2.000
+ 37 148 2.083
+ 38 148 2.333
+ 39 148 3.100
+ 40 148 2.2142857
+ 41 119 2.444
+ 42 119 2.3275862
+ 43 119 4.7142857
+ 44 119 1.7692308
+ 45  61 2.889
+ 46  61 3.250
+ 47  61 4.750
+ 48  61 2.6337449"), header=TRUE)
> # compute the stats
> x.stats <- by(x, x$RM, function(.rm){
+ c(mean=mean(.rm$mgl, na.rm=TRUE), median=median(.rm$mgl, na.rm=TRUE))
+ })
> do.call(rbind, x.stats)
         mean    median
119 2.8138868 2.3860153
148 2.4327381 2.2738095
179 1.8790455 1.9090909
185 0.7481206 0.6693182
190 0.485 0.470
198 0.5315572 0.4919380
202 1.0104938 0.8148148
215 0.9773588 0.8319327
61  3.3806584 3.069
bc  5.300 4.500
hc  3.4809545 3.4536542
sc  1.3651316 1.333




On Feb 12, 2008 11:57 AM, stephen sefick [EMAIL PROTECTED] wrote:
 below is my data frame.  I would like to compute summary statistics
 for mgl for each river mile (mean, median, mode).  My apologies in
 advance-  I would like to get something like the SAS print out of PROC
 Univariate.  I have performed an ANOVA and a tukey LSD and I would
 just like the summary statistics.
 thanks

 stephen

 RM   mgl
 1  215 0.9285714
 2  215 0.7352941
 3  215 1.6455696
 4  215 0.600
 5   sc 1.833
 6   sc 0.833
 7   sc 2.5438596
 8   sc 0.250
 9  202        NA
 10 202 0.550
 11 202 0.8148148
 12 202 1.667
 13 198 0.5038760
 14 198 0.3823529
 15 198 0.760
 16 198 0.480
 17  hc 3.1818182
 18  hc 3.7254902
 19  hc 4.375
 20  hc 2.6415094
 21 190 0.350
 22 190 0.440
 23 190 0.650
 24 190 0.500
 25  bc 9.000
 26  bc 5.000
 27  bc 4.000
 28  bc 3.200
 29 185 0.7386364
 30 185 0.500
 31 185 1.1538462
 32 185 0.600
 33 179 1.8181818
 34 179 1.198
 35 179 2.500
 36 179 2.000
 37 148 2.083
 38 148 2.333
 39 148 3.100
 40 148 2.2142857
 41 119 2.444
 42 119 2.3275862
 43 119 4.7142857
 44 119 1.7692308
 45  61 2.889
 46  61 3.250
 47  61 4.750
 48  61 2.6337449


 --
 Let's not spend our time and resources thinking about things that are
 so little or so large that all they really do for us is puff us up and
 make us feel like gods.  We are mammals, and have not exhausted the
 annoying little problems of being mammals.

-K. Mullis


Re: [R] indices of rows containing one or more elements >0

2008-02-12 Thread jim holtman

Is this what you are after?

> test <- matrix(c(0,2,0,1,3,5), 3,2)
> (x <- which(test > 0, arr.ind=TRUE))
     row col
[1,]   2   1
[2,]   1   2
[3,]   2   2
[4,]   3   2
> unique(x[, 'row'])
[1] 2 1 3
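
If the rows are wanted in sorted order, counting the positive entries per row is another option. A sketch on the same matrix:

```r
test <- matrix(c(0, 2, 0, 1, 3, 5), 3, 2)

any_pos  <- which(rowSums(test > 0) > 0)     # rows with at least one element > 0
any_pos2 <- which(apply(test > 0, 1, any))   # same idea, spelled with any()
all_pos  <- which(apply(test > 0, 1, all))   # rows where every element is > 0

any_pos
all_pos
```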



On Feb 12, 2008 9:40 PM, Ng Stanley [EMAIL PROTECTED] wrote:
 Hi,

 Given test <- matrix(c(0,2,0,1,3,5), 3,2)

  test[test>0]
 [1] 2 1 3 5

 These are values >0

  which(test>0)
 [1] 2 4 5 6

 These are array indices of those values >0

  which(apply(test>0, 1, all))
 [1] 2

 This gives the row whose elements are all >0

 I can't seem to get indices of rows containing one or more elements >0


Re: [R] regular expression for na.strings / read.table

2008-02-12 Thread jim holtman

Here is one way of doing it:

> # read the file in as lines, do the convert and then re-read
> x <- readLines(textConnection("  X1 X.789 LNM. X78 X56  X89 X56.1 X100
+ 1  2   700  AUW  78  56   89    56  100
+ 2  3   400  TOC  78  56   89    56   10
+ 3  4   389  RMN  78  56   89    56  *89
+ 4  5   400  LNM  78  56 *452    56  100
+ 5  6   200  UTC  78 *40   89    56  100
+ 6  7   100  GAT  78  56    8    56 *100
+ 7  8    79 *LNM  78  56    9    56  100
+ 8  9    89  TCG  78  56  800    56 *100
+ 9 10   78*  LNM  78  56   89    56  100"))
> x.c <- gsub("\\*[[:alnum:]]*|[[:alnum:]]*\\*", "NA", x)
> x.new <- read.table(textConnection(x.c), header=TRUE)
> closeAllConnections()
>
> x.new
  X1 X.789 LNM. X78 X56 X89 X56.1 X100
1  2   700  AUW  78  56  89    56  100
2  3   400  TOC  78  56  89    56   10
3  4   389  RMN  78  56  89    56   NA
4  5   400  LNM  78  56  NA    56  100
5  6   200  UTC  78  NA  89    56  100
6  7   100  GAT  78  56   8    56   NA
7  8    79 <NA>  78  56   9    56  100
8  9    89  TCG  78  56 800    56   NA
9 10    NA  LNM  78  56  89    56  100
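
An alternative that avoids editing the raw lines, offered only as a sketch: read every column as character, set any cell containing a star to NA, and let type.convert() work out the column types again. Four of the rows above are used here (without the row numbers) so the example stands on its own:

```r
txt <- "X1 X.789 LNM. X78 X56 X89 X56.1 X100
4 389 RMN 78 56 89 56 *89
5 400 LNM 78 56 *452 56 100
8 79 *LNM 78 56 9 56 100
10 78* LNM 78 56 89 56 100"

# Read every column as character so the stars survive the import
raw <- read.table(text = txt, header = TRUE, colClasses = "character")

# Set any cell containing a star to NA, then let R re-guess each column type
raw[] <- lapply(raw, function(col) {
  col[grepl("*", col, fixed = TRUE)] <- NA
  type.convert(col, as.is = TRUE)
})
str(raw)
```

Because grepl() returns TRUE or FALSE for every cell, it does not run into the NULL problem described in the question.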


On Feb 12, 2008 9:30 AM,  [EMAIL PROTECTED] wrote:

 Dear all,

 I am working with a csv file.
 Some data of the file are not valid and they are marked with a star '*'.
 For example : *789.

 I have attached with this email a example file (test.txt) that looks like
 the data I have to work with.


 I see 2 possibilities, neither of which I can manage in R:

 1-first and easiest solution:
 Read the data with read.csv in R, and define as NA strings all cells
 containing a star (*).
 Something which would look like this ...

 
 DATA<-read.csv("test.txt",na.strings=list(length(grep("\\*",DATA,value=T))==0))

  DATA
   X1 X.789 LNM. X78 X56  X89 X56.1 X100
 1  2   700  AUW  78  56   89    56  100
 2  3   400  TOC  78  56   89    56   10
 3  4   389  RMN  78  56   89    56  *89
 4  5   400  LNM  78  56 *452    56  100
 5  6   200  UTC  78 *40   89    56  100
 6  7   100  GAT  78  56    8    56 *100
 7  8    79 *LNM  78  56    9    56  100
 8  9    89  TCG  78  56  800    56 *100
 9 10   78*  LNM  78  56   89    56  100


 ...but which would work (the stars are still there)! Does anyone know how to do
 that?

 2-Second solution:
 - first read the file with DATA<-read.csv("test.txt")
 - then replace all fields containing a * with NA by applying the following
 function to the object DATA:
 DATA_cleaned<-apply(DATA,c(1,2),function(x){if(length(grep("\\*",x,value=TRUE))==1){x<-NA}})
  DATA_cleaned
       X1   X.789 LNM. X78  X56  X89  X56.1 X100
  [1,] NULL NULL  NULL NULL NULL NULL NULL  NULL
  [2,] NULL NULL  NULL NULL NULL NULL NULL  NULL
  [3,] NULL NULL  NULL NULL NULL NULL NULL  NA
  [4,] NULL NULL  NULL NULL NULL NA   NULL  NULL
  [5,] NULL NULL  NULL NULL NA   NULL NULL  NULL
  [6,] NULL NULL  NULL NULL NULL NULL NULL  NA
  [7,] NULL NULL  NA   NULL NULL NULL NULL  NULL
  [8,] NULL NULL  NULL NULL NULL NULL NULL  NA
  [9,] NULL NA    NULL NULL NULL NULL NULL  NULL

 The stars have disappeared, but so has everything else!
 The problem comes from the fact that if a field does not contain any *, the
 command
 if(length(grep("\\*",x,value=T))==1) returns NULL instead of FALSE!

 If you have any idea, please let me know!

 Many thanks,

 Jessica
 

 Jessica Gervais
 Mail: [EMAIL PROTECTED]

 Resource Centre for Environmental Technologies,
 Public Research Centre Henri Tudor,
 Technoport Schlassgoart,
 66 rue de Luxembourg,
 P.O. BOX 144,
 L-4002 Esch-sur-Alzette, Luxembourg

 (See attached file: test.txt)
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Matching Problem

2008-02-12 Thread jim holtman

Here is one way of doing it:

> MyData <- c("Test1","Test2","I(Test1^2)","I(Test2^3)","I(Test1.Test2^2)")
> x <- gsub("^(.*\\(|)([^^)]*|.*).*", "\\2", MyData)
> x
[1] "Test1"       "Test2"       "Test1"       "Test2"       "Test1.Test2"
> unique(x)
[1] "Test1"       "Test2"       "Test1.Test2"
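
If the single pattern is hard to read, the same result can be reached in two plain steps. This is only a sketch of an alternative, and it assumes the wrapper is always a literal I( at the start of the string; the names my_data, no_prefix and base_names are mine.

```r
my_data <- c("Test1", "Test2", "I(Test1^2)", "I(Test2^3)", "I(Test1.Test2^2)")

# Step 1: drop a leading "I(" where present
no_prefix <- sub("^I\\(", "", my_data)

# Step 2: drop everything from the first "^" to the end of the string
base_names <- sub("\\^.*$", "", no_prefix)

unique_names <- unique(base_names)
print(unique_names)
```

Strings without the I( wrapper pass through both steps unchanged.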



On Feb 12, 2008 5:44 AM, Tom.O [EMAIL PROTECTED] wrote:

 Hi

 I have this vector of strings.

 MyData <- c("Test1","Test2","I(Test1^2)","I(Test2^3)","I(Test1.Test2^2)")
 where I want to extract only the text after I( and before ^ so that the
 string returned only contains c("Test1","Test2","Test1.Test2")

 I am not very skilled in the use of matching patterns, so bear with me, but I
 believe I should use gsub('^.\\(', '', MyData) for removing the I( and
 gsub('\\^.+', '', MyData) for the end, but there's got to be a more elegant
 way that does the trick in one go.

 So I would appreciate it if anyone could give me some advice.

 Thanks Tom

Re: [R] shaded area graph and extra plot

2008-02-12 Thread jim holtman

Use 'xlim=c(1993,2008)' in your second plot to setup the same range.
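
A small self-contained sketch of the idea, with made-up series standing in for indexSp and indexSu (years_a, series_a and so on are placeholder names). Since the first plot in the question uses xaxs="i", I pass the same setting to the second plot as well; otherwise R pads the second x range by 4% and the two scales drift apart.

```r
years_a <- 1994:2007                 # first series covers 1994-2007
years_b <- 1996:2007                 # second series starts two years later
set.seed(1)
series_a <- 200 + cumsum(rnorm(length(years_a), sd = 10))
series_b <- runif(length(years_b), 0, 50)
shared_xlim <- c(1993, 2008)

# First series, left axis
plot(years_a, series_a, type = "l", xlim = shared_xlim, xaxs = "i",
     ylim = c(0, 400), yaxs = "i", xlab = "Year", ylab = "Series A")
usr_first <- par("usr")

# Second series drawn over the first, with the same x range and axis style
par(new = TRUE)
plot(years_b, series_b, type = "p", col = 2, lwd = 2,
     xlim = shared_xlim, xaxs = "i", axes = FALSE, ann = FALSE)
usr_second <- par("usr")
axis(4)
```

Comparing usr_first[1:2] with usr_second[1:2] is a quick way to confirm that both layers share the same x coordinates.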

On Feb 12, 2008 10:39 AM, Luis Ridao Cruz [EMAIL PROTECTED] wrote:
 R-help,

 I'm using the code below to plot a shaded area graph.

 At the same time I want to plot a second series on the y-axis (from
 par(new=T) on)
 but as the two series have different x-axis range (first 1994:2007 and
 second 1996:2007)
 the corresponding x's do not match.

 How can this be sorted out?

 Thanks in advance

 #
 plot.new()
 plot.window(xlim=c(1993,2008), xaxs="i", ylim=c(0,400), yaxs="i")

 x=1994:2007
 xx = c(1994, x, 2007)

 yy1 = c(0, indexSp[,"Xhat5Sp"]+indexSp[,"seA"], 0 )
 yy2 = c(0, indexSp[,"Xhat5Sp"]-indexSp[,"seA"], 0 )

 polygon(xx, yy1, col="grey", lty=0)
 polygon(xx, yy2, col="white", lty=0)
 lines(x, indexSp[,"Xhat5Sp"], type="l")

 axis(1)
 axis(2)

 par(new=T)
 plot(1996:2007, c(0,0,indexSu[,"Xhat5Su"]), type="p", col=2, lwd=2,
 cex=1,ann=T,axes=F)
 axis(4)
 #


Re: [R] rolling sum (like in Rmetrics package)

2008-02-13 Thread jim holtman

Have you tried 'filter'?

> x <- 1:20
> filter(x,filter=rep(1,5))
Time Series:
Start = 1
End = 20
Frequency = 1
 [1] NA NA 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 NA NA
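
The call above uses a centred window of 5. For the trailing window of 20 described in the question, something along these lines might do; it is a sketch with an arbitrary example vector, and the names k, roll_filter and roll_cumsum are mine. I write stats::filter explicitly in case another package masks filter.

```r
x <- 1:25          # example data; any numeric vector works
k <- 20            # window length from the question
n <- length(x)

# One-sided filter: position i holds sum(x[(i - k + 1):i]); the first k - 1 are NA
trailing <- stats::filter(x, rep(1, k), sides = 1)
roll_filter <- as.numeric(trailing[k:n])

# The same sums from differences of the cumulative sum
cs <- cumsum(x)
roll_cumsum <- cs[k:n] - c(0, cs[seq_len(n - k)])

print(roll_filter)
```

Both vectors have n - k + 1 entries: sum(x[1:20]), sum(x[2:21]), and so on.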




On 2/13/08, joshv [EMAIL PROTECTED] wrote:


 Hello, I'm new to R and would like to know how to create a vector of
 rolling
 sums. (I have seen the Rmetrics package and the rollMean function and I
 would like to do the same thing except Sum instead of Mean.)  I imagine
 someone has done this, I just can't find it anywhere.

 Example:
 x <- somevector   #where x is 'n' entries long

 #what I would like to do is:

 x1 <- x[1:20]
 output1 <- sum(x1)

 x2 <- x[2:21]
 output2 <- sum(x2)

 x3 <- ...

 output <- c(output1, output2, ...)


 Thanks,
 JV

Re: [R] write output in a custom format

2008-02-14 Thread jim holtman

Here is a start.  You basically have to iterate through your data and
use 'cat' to write it out:

particle <- list(dose=c(1,100.0,0),pos=data.frame(x=c(0,1,0,1),y=c(0,1,0,1)))
output <- file("/tempxx.txt", "w")
cat(particle$dose, "\n", file=output, sep=" ")
for (i in 1:nrow(particle$pos)){
    cat(particle$pos$x[i], particle$pos$y[i], "\n", file=output, sep=" ")
}
cat("#\n", file=output, sep=" ")
close(output)

Here is what the file looks like:

1 100 0
0 0
1 1
0 0
1 1
#
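
To append several blocks through one open connection, the steps above could be wrapped in a small helper. This is only a sketch: write_particle is a name I made up, it uses writeLines instead of cat (which also avoids the trailing blank that cat leaves before each newline), and it writes to a temporary file rather than /tempxx.txt.

```r
# Hypothetical helper: write one particle block to an open connection
write_particle <- function(p, con) {
  writeLines(paste(p$dose, collapse = " "), con)   # dose line, e.g. "1 100 0"
  writeLines(paste(p$pos$x, p$pos$y), con)         # one "x y" line per position
  writeLines("#", con)                             # block delimiter
}

particle <- list(dose = c(1, 100.0, 0),
                 pos = data.frame(x = c(0, 1, 0, 1), y = c(0, 1, 0, 1)))

out_path <- tempfile(fileext = ".txt")
con <- file(out_path, "w")
write_particle(particle, con)
write_particle(particle, con)   # second call appends to the same open file
close(con)

written <- readLines(out_path)
print(written)
```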


On 2/14/08, baptiste Auguié [EMAIL PROTECTED] wrote:
 Hi,


 I need to create a text file in the following format,

  1 100.0 0
   0 0
   1 1
   0 0
   1 1
  #
  1 100.0 0
   0 0
   0 1
   1 0
   1 1
 ...

 where # is part of the format and not a R comment.

 Each block (delimited by #) consists of a first line with three
 values, call it dose, and a list of (x,y) coordinates which are a
 matrix or data.frame,


  particle <- list(dose=c(1,100.0,0),pos=data.frame(x=c(0,1,0,1),y=c(0,1,0,1)))
 
  print(particle)



 I'd like to establish a connection to a file and append to it a
 particle block in the format above, or even write the whole file at
 once.

 Because different lines have a different number of elements, I
 couldn't get write.table to work in this case, and my attempts at sink
 (), dump(), writeLines(), writeChar() all turn into really dirty
 solutions. I have this feeling I'm overlooking a simple solution.

 Any help welcome,


 baptiste

 _

 Baptiste Auguié

 Physics Department
 University of Exeter
 Stocker Road,
 Exeter, Devon,
 EX4 4QL, UK

 Phone: +44 1392 264187

 http://newton.ex.ac.uk/research/emag
 http://projects.ex.ac.uk/atto


Re: [R] Replacing columns in a data frame using a previous condition

2008-02-14 Thread jim holtman

Is this what you want to do?

> x <- data.frame(a=1:10, b=1:10, c=1:10, d=1:10)
> z <- cbind(c=11:20, d=11:20)
> z
       c  d
 [1,] 11 11
 [2,] 12 12
 [3,] 13 13
 [4,] 14 14
 [5,] 15 15
 [6,] 16 16
 [7,] 17 17
 [8,] 18 18
 [9,] 19 19
[10,] 20 20
> x[,colnames(z)] <- z[, colnames(z)]
> x
    a  b  c  d
1   1  1 11 11
2   2  2 12 12
3   3  3 13 13
4   4  4 14 14
5   5  5 15 15
6   6  6 16 16
7   7  7 17 17
8   8  8 18 18
9   9  9 19 19
10 10 10 20 20
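
Applied to the three steps in the question, the same name-based assignment might look like the sketch below. The toy data frame geno and its column names rs_a and rs_b are invented stand-ins for GERU; the recoded values are written straight back into the matching columns, so the dimensions do not change.

```r
# Toy stand-in for the real data: one id column and two marker columns
geno <- data.frame(
  ped  = rep("USA5854", 10),
  rs_a = c(4, 4, 1, 4, 0, 1, 0, 2, 0, 3),   # 5 distinct values
  rs_b = c(1, 1, 2, 2, 0, 0, 2, 1, 0, 1),   # 3 distinct values
  stringsAsFactors = FALSE
)

# Step 1: marker columns with more than 4 categories
marker_cols <- setdiff(names(geno), "ped")
n_categories <- sapply(geno[marker_cols], function(col) length(unique(col)))
to_recode <- names(n_categories)[n_categories > 4]

# Steps 2 and 3: recode 2 -> 3 and assign back by column name
geno[to_recode] <- lapply(geno[to_recode], function(col) replace(col, col == 2, 3))

print(geno)
```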



On 2/14/08, Jorge Iván Vélez [EMAIL PROTECTED] wrote:
 Dear R-list,

 I'm working with a data frame whose dimensions are

  dim(GERU)
 [1] 3468  318

 and which looks like

  GERU[1:10,1:10]
        ped ind par1 par2 sex sta rs7696470 rs7696470.1 rs1032896 rs1032896.1
 1  USA5854   2    0    0   2   1         4           4         1           1
 2  USA5854   3    1    2   1   1         4           4         1           1
 3  USA5854   4    1    2   2   2         1           4         1           3
 4  USA5854   5    1    2   1   2         4           2         2           1
 5  USA5855   1    0    0   1   1         0           0         0           0
 6  USA5855   2    0    0   2   2         1           0         0           0
 7  USA5855   3    1    2   1   2         0           2         0           0
 8  USA5855   4    1    2   1   1         2           0         2           1
 9  USA5855   5    1    2   1   2         0           1         0           0
 10 USA5856   1    0    0   1   1         3           3         3           3

 What I would like to do is:

 1. Identify which columns (from 6 to 318) have more than 4 categories (I
 solved that). In GERU these would be rs7696470 and rs7696470.1.
 2. Using the columns from step 1, replace the entries equal to 2 with 3. For
 example, rs7696470 would be 4,4,1,4,0,1,0,3,0,3 and so on.
 3. Once the entries are replaced, I need to rewrite the columns in GERU.

 Here is what I've done:

  # Function to identify columns with more than 4 categories
  tx=function(x) ifelse(dim(table(x))>4,1,0)

  # Identifying the columns
  M4=apply(GUPN[,-c(1:6)],2,tx)
  names(which(MR==1))# Step 1
   [1] "rs335322"     "rs335322.1"   "rs186750"     "rs186750.1"   "rs1565901"    "rs1565901.1"  "rs1565902"
   [8] "rs1565902.1"  "rs11131334"   "rs11131334.1" "rs1948616"    "rs1948616.1"  "rs4484334"    "rs4484334.1"
  [15] "rs1497921"    "rs1497921.1"  "rs1391320"    "rs1391320.1"  "rs1497913"    "rs1497913.1"  "rs996208"
  [22] "rs996208.1"
  # Step 2
  REPLACE=GUPN[,names(which(AR==1))]
  RES=apply(REPLACE,2,function(x) ifelse(x==2,3,x))
  RES[1:10,1:5]
     rs335322 rs335322.1 rs186750 rs186750.1 rs1565901
  1         1          3        3          3         3
  2         1          1        3          3         3
  3         3          3        1          3         3
  4         1          3        3          3         3
  5         0          0        0          0         0
  6         0          0        0          0         0
  7         0          0        0          0         0
  8         0          0        0          0         0
  9         0          0        0          0         0
  10        1          3        3          3         1

 Now, the problem I have is replacing the columns in GERU with the columns in
 RES (step 3). At the end, the dimension of the new data set should still be
 3468x318. Any help would be greatly appreciated.

 Thanks you so much,


 Jorge


Re: [R] write output in a custom format

2008-02-14 Thread jim holtman

There is nothing wrong with a loop for handling this case.  Most of
your time is probably going to be spent writing out the files.  If you
don't want 'for' loops, you can use 'lapply', but I am not sure what
type of performance improvement you will see.  You are having to
make decisions on each particle on how to write it. You can also use
awk/perl as you indicated, but you would have to write the data out
for those programs.  You might take a test run and see.  I would guess
that by the time you format it for awk and then run awk, you could
have done the whole thing in R.  But it is your choice and there are
plenty of tools to choose from.
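
If you want to try the 'lapply' route, one possible shape is to build all the lines in memory and write them with a single call. This is a sketch under my own assumptions: format_particle is a made-up helper, the 1000 identical particles are dummy data, and the output goes to a temporary file.

```r
# Hypothetical helper: turn one particle into its lines of text
format_particle <- function(p) {
  c(paste(p$dose, collapse = " "),   # dose line
    paste(p$pos$x, p$pos$y),         # one "x y" line per position
    "#")                             # block delimiter
}

# Dummy input: 1000 particles with the structure from the thread
particles <- lapply(1:1000, function(i) {
  list(dose = c(1, 100.0, 0),
       pos = data.frame(x = c(0, 1, 0, 1), y = c(0, 1, 0, 1)))
})

# Format every block, then write the whole file in one go
all_lines <- unlist(lapply(particles, format_particle))
out_path <- tempfile(fileext = ".txt")
writeLines(all_lines, out_path)

length(all_lines)
```

Whether this is noticeably faster than the explicit loop is something to measure on your own data; system.time() around both versions would show it.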

On 2/14/08, baptiste Auguié [EMAIL PROTECTED] wrote:
 Thanks for the input! It does work fine, however I'll have to do
 another loop to repeat this whole process quite a few times (10^3,
 10^4 particles maybe), so I was hoping for a solution without loop.
 Maybe I could reshape all the values into a big array, dump it to a
 file and replace some values using system(awk...). I just don't
 really know how to format the data, having different number of values
 for some lines. Would that be a sensible thing to do?

 thanks,

 baptiste





Re: [R] Retrieving data frames from a for loop

2008-02-14 Thread jim holtman

Use a 'list' to capture the data within the loop:

> result <- vector('list', 20)  # preallocate
> tab <- data.frame(x=1:20)
> for (i in 1:20) {
+
+ g<-sample(rep(LETTERS[1:2],each=10))
+ result[[i]] <-data.frame(tab,g)
+
+ }
> # you can now access the combinations like this:
> result[[1]]
    x g
1   1 B
2   2 A
3   3 B
4   4 B
5   5 B
6   6 B
7   7 A
8   8 B
9   9 A
10 10 B
11 11 A
12 12 B
13 13 B
14 14 A
15 15 A
16 16 A
17 17 A
18 18 B
19 19 A
20 20 A
> result[[5]]
    x g
1   1 B
2   2 A
3   3 B
4   4 B
5   5 A
6   6 A
7   7 B
8   8 A
9   9 B
10 10 A
11 11 B
12 12 A
13 13 B
14 14 B
15 15 B
16 16 A
17 17 A
18 18 A
19 19 A
20 20 B
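
If you would rather refer to each combination by name, the list elements can be named, which gives much the same convenience that assign() was meant to provide. A sketch using lapply instead of the explicit loop; the names combination1 to combination20 are my choice.

```r
tab <- data.frame(x = 1:20)   # stand-in for the pre-existing data frame
set.seed(42)                  # only so that the example is repeatable

# One shuffled grouping per element, collected in a list
result <- lapply(1:20, function(i) {
  data.frame(tab, g = sample(rep(LETTERS[1:2], each = 10)))
})
names(result) <- paste0("combination", 1:20)

# Access by name or by position
head(result[["combination5"]])
head(result[[5]])
```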




On Thu, Feb 14, 2008 at 6:42 PM, Judith Flores [EMAIL PROTECTED] wrote:
 Dear R-helpers,

   I need to retrieve the data frames generated in a
 for loop. What I have looks something like this:

 where tab is a pre-existing data frame.

 for (i in 1:20) {

 g<-sample(rep(LETTERS[1:2],each=10))
 combination<-data.frame(tab,g)

 }

   I tried to name every single combination by doing
 this:

 assign(paste('combination',i), combination)

  without success.

 I need to retrieve every combination separately.

 Thank you once again for your help.


  
 

Re: [R] how to specify the location of tick mark on x axies

2008-02-16 Thread jim holtman

I think what you want for your last statement is:

lines(pts, y2)

This uses the value of the tick marks to plot your line.
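
Put together, a shortened version of the example might look like this (only the first eight values of each series, with y2 rounded, so the numbers are illustrative):

```r
y1 <- c(13, 20, 22, 19, 10, 16, 8, 4)                    # bar heights
y2 <- c(13, 23.3, 18.1, 15.0, 12.6, 10.7, 9.1, 7.8)      # fitted values
x_labels <- c(0, 1, 6, 11, 16, 21, 26, 31)               # category labels

# barplot() returns the x coordinates of the bar midpoints
pts <- barplot(y1, ylim = c(0, 40), names.arg = x_labels, col = "white")

# Draw the line and its points at the midpoints, not at the label values
lines(pts, y2)
points(pts, y2, pch = 16)
```

The labels in names.arg are just text under the bars; the actual x positions are the values in pts.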



On Feb 16, 2008 6:53 AM, Xin [EMAIL PROTECTED] wrote:
 hi,

   I did barplot. My data are:

 y1<-c(13, 20, 22, 19, 10, 16, 8, 4, 3, 5, 7, 4, 0, 4, 4, 2, 4, 2, 2, 5, 1)
 y2<-c(13, 23.29568698, 18.1385593, 14.97159795, 12.57640037, 10.65752306,
 9.079421331, 7.7625489, 6.653641903, 5.714125735, 4.914645265,
 4.232117758, 3.647980094, 3.147064034, 2.716830439, 2.346823055,
 2.02826436, 1.753747752, 1.516997668, 1.31267921, 1.136244845
 )
 x<-c(0, 1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56, 61, 66, 71,
 76, 81, 86, 91, 96)

 pts=barplot(y1,ylim=c(0,40),axes=TRUE,names.arg=x,border=TRUE,col="white")
 axis(side=1,at=pts, labels=F, tick=T)

 This gives an x axis with tick marks exactly at the middle of the bars.

 Then I want to add a line to the barplot. I used

 lines(x,y2)

 But the data points of the line are plotted at the beginning of each category
 on the x axis. I want them plotted at the middle of each category.

 Can you help?

 Xin



 - Original Message -
 From: jim holtman [EMAIL PROTECTED]
 To: Xin [EMAIL PROTECTED]
 Sent: Saturday, February 16, 2008 11:43 AM
 Subject: Re: [R] how to specify the location of tick mark on x axies


  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
  Can you provide an example of what you are doing and what you want.
 
  On Feb 16, 2008 6:14 AM, Xin [EMAIL PROTECTED] wrote:
  Dear:
 
 I want to plot barplot and let bar be in the middle of each x axis
  category.
 
Do you have this experience?
 
Many Thanks!
 
Xin

Re: [R] How to estimate weekly Variance

2008-02-16 Thread jim holtman

  
  
  
  
 
 

 Felipe D. Carrillo
  Fishery Biologist
  US Fish & Wildlife Service
  California, USA



  
 

Re: [R] predicting memory usage

2008-02-18 Thread jim holtman

If this is numeric, then for just storing one copy, you will require
86000 * 2500 * 8 = 1.7GB of memory.  You should have 3-4X that if you
want to analyze it, so you might need about 6GB of physical memory and
a 64-bit version of R.  Is there some other alternative?  Do you need
all the values at once, or can you use a database to access the
portions you want?
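
The arithmetic can be checked in R itself. The sketch below also measures a small matrix with object.size() to confirm the 8 bytes per numeric value; the 860 x 25 test size is an arbitrary choice of mine.

```r
n_rows <- 86000
n_cols <- 2500
bytes_per_double <- 8

# Size of one copy of the full numeric array
total_bytes <- n_rows * n_cols * bytes_per_double
total_gb <- total_bytes / 1e9
print(total_gb)

# Cross-check on a small matrix: bytes per cell, including a small fixed overhead
small <- matrix(0, nrow = 860, ncol = 25)
bytes_per_cell <- as.numeric(object.size(small)) / length(small)
print(bytes_per_cell)
```

If the values fit in integers, storage.mode(small) <- "integer" roughly halves the footprint, since integers take 4 bytes each.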

On 2/18/08, Federico Calboli [EMAIL PROTECTED] wrote:
 Hi All,

 is there a way of predicting memory usage?

 I need to build an array of 86000 by 2500 numbers (or I might create
 a list of 2 by 2500 arrays 43000 long). How much memory should I
 expect to use/need?

 Cheers,

 Fede

 --
 Federico C. F. Calboli
 Department of Epidemiology and Public Health
 Imperial College, St. Mary's Campus
 Norfolk Place, London W2 1PG

 Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

 f.calboli [.a.t] imperial.ac.uk
 f.calboli [.a.t] gmail.com


Re: [R] Huge number

2008-02-18 Thread jim holtman

If you want to compute (157+221)!, sum up the logs:

> a <- 1:(157+221)
> sum(log10(a))
[1] 811.8165

This is about 6.55e811, which exceeds the range of floating point
numbers (1.797693e+308).  You might check out the Brobdingnag package.
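
Base R also has lfactorial(), which returns the natural log of a factorial directly. Ratios of factorials, which I assume is what the formula in the paper ends up needing, can then be combined on the log scale and exponentiated only at the end. The sketch below uses the counts for gene 'a' from the table; it does not reproduce the paper's full equation.

```r
# log10 of 378! without ever forming the factorial itself
n <- 157 + 221
log10_fact <- lfactorial(n) / log(10)
exponent <- floor(log10_fact)
mantissa <- 10^(log10_fact - exponent)
cat(sprintf("%d! is about %.2fe%d\n", n, mantissa, exponent))

# A ratio of factorials, (x+y)! / (x! y!), computed on the log scale
x <- 13
y <- 30
log_ratio <- lfactorial(x + y) - lfactorial(x) - lfactorial(y)
ratio <- exp(log_ratio)
print(ratio)
```

For counts as large as 666 and 1387 the ratio itself may still overflow, in which case the remaining factors of the formula should be added on the log scale before calling exp().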

On Feb 18, 2008 6:23 PM, Hyojin Lee [EMAIL PROTECTED] wrote:
 Hi,
 I'm trying to calculate p-values to find out which genes are definitely
 expressed when comparing situation A to situation B.
 I got this data (this is a part of the data) from the whole organism, and each
 number is an expression value (that is, gene 'a' is 13 in situation A, and it
 becomes 30 in situation B).
 To find the probability, I'm going to use the Audic-Claverie method (The
 significance of digital gene expression profiles, 1997).

 But using this equation p(x|y), I have to calculate (x+y)! first, and I
 can't calculate (157+221)! or (666+1387)! in R.
 That's probably a problem of handling huge numbers. How could I
 calculate the p-value for this data with R?


       A       B
 Total 5874641 6295980
 a     13      30
 b     36      39
 c     0       5
 d     40      61
 e     16      20
 f     13      11
 g     3       3
 h     9       5
 i     12      35
 j     157     221
 k     17      39
 l     6       17
 m     666     1387
 n     2       5









 The significance of digital gene expression profiles.

 Audic S, Claverie JM.


Re: [R] Interpolation between 2 vectors

2008-02-19 Thread jim holtman

check out the 'approx' function.
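
For instance, treating the position in the shorter vector as the x value, approx() can resample it to the length of the longer one by linear interpolation. The random vectors below are stand-ins for the real data.

```r
set.seed(1)
long_vec <- rnorm(13112)    # stand-in for the longer vector
short_vec <- rnorm(10909)   # stand-in for the shorter vector

# n = ... asks approx() for that many equally spaced points across the x range
stretched <- approx(seq_along(short_vec), short_vec, n = length(long_vec))$y

length(stretched)
```

The first and last values are preserved; everything in between is interpolated linearly, so use spline() instead if a smooth curve matters.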

On Feb 19, 2008 12:44 PM, Dani Valverde [EMAIL PROTECTED] wrote:
 Hello,
 I have two vectors, one with 13112 points and the other one with 10909.
 I wonder if there is a way to interpolate the data so the shorter
 vector has the same number of points as the longer one.
 Best,
 Dani

 --
 Daniel Valverde Saubí

 Grup de Biologia Molecular de Llevats
 Facultat de Veterinària de la Universitat Autònoma de Barcelona
 Edifici V, Campus UAB
 08193 Cerdanyola del Vallès- SPAIN

 Centro de Investigación Biomédica en Red
 en Bioingeniería, Biomateriales y
 Nanomedicina (CIBER-BBN)

 Grup d'Aplicacions Biomèdiques de la RMN
 Facultat de Biociències
 Universitat Autònoma de Barcelona
 Edifici Cs, Campus UAB
 08193 Cerdanyola del Vallès- SPAIN
 +34 93 5814126


Re: [R] Problem Using the %in% command

2008-02-20 Thread jim holtman

With the format you have, we have to split out the genes separated by
commas and then do 'table'.  Here is one way of doing it:

> x <- readLines(textConnection("  Function                   x
+ Function1   gene5, gene19, gene22, gene23
+ Function2  gene1, gene7, gene19
+ Function3   gene2, gene3, gene7, gene23"))
> closeAllConnections()
> # funny data; split it up. get rid of header
> x <- x[-1]
> # split on blanks
> x.b <- strsplit(x, "[[:blank:]]+")
> # recombine into a 'long' format
> x.c <- lapply(x.b, function(z) cbind(z[1], unlist(strsplit(z[-1], ","))))
> x.c <- do.call(rbind, x.c)
> table(list(x.c[,1], x.c[,2]))
           .2
.1          gene1 gene19 gene2 gene22 gene23 gene3 gene5 gene7
  Function1     0      1     0      1      1     0     1     0
  Function2     1      1     0      0      0     0     0     1
  Function3     0      0     1      0      1     1     0     1
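
If you want the TRUE/FALSE layout from your message, %in% works once each row has been split into individual gene names; applied to the unsplit column it compares the whole string "gene5, gene19, ..." against single names, which is why everything came back FALSE. A sketch, with the data typed in by hand and all_genes held as a plain character vector rather than a data frame:

```r
func_gen <- data.frame(
  Function = c("Function1", "Function2", "Function3"),
  x = c("gene5, gene19, gene22, gene23",
        "gene1, gene7, gene19",
        "gene2, gene3, gene7, gene23"),
  stringsAsFactors = FALSE
)
all_genes <- c("gene1", "gene2", "gene3", "gene5",
               "gene7", "gene19", "gene22", "gene23")

# Split each row at commas (plus any following blanks)
gene_lists <- strsplit(func_gen$x, ",[[:blank:]]*")

# One row per function, one column per gene
membership <- t(sapply(gene_lists, function(g) all_genes %in% g))
dimnames(membership) <- list(func_gen$Function, all_genes)
print(membership)
```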



On 2/20/08, Paul Christoph Schröder [EMAIL PROTECTED] wrote:
 I'm sorry if I didn't write it the right way. I'm just starting in the world
 of R and it's not that easy at the beginning.
 I wrote it again with code and comments. I hope it is understandable now. Do
 you think I should post it again in this form?

 func_gen<-read.delim(file, header=T) #contains functions (rows) and genes
 # (column); func_gen is a data.frame

 #It looks like this:
 #  Function                   x
 # Function1   gene5, gene19, gene22, gene23
 # Function2  gene1, gene7, gene19
 # Function3   gene2, gene3, gene7, gene23

 # Duplicates of genes exist between different functions. This is why the
 # read.delim command was used instead of the read.table command: because
 # of the "duplicate 'row.names' are not allowed" error.

 all_genes #contains all genes from above data frame; all_genes is a
 # data.frame
 #It looks like this:
 # Genes
 # gene1
 # gene2
 # gene3
 # gene5
 # gene7
 # gene19
 # gene22
 # gene23

 func_gen[,2] %in% all_genes #this should result in a true-false matrix
 # Like this:
 # Function    gene1  gene2  gene3  gene5  gene7  gene19  gene22  gene23
 # Function1   F      F      F      T      F      T       T       T
 # Function2   T      F      F      F      T      T       F       F
 # Function3   F      T      T      F      T      F       F       T

 #and instead I obtain a true-false matrix with only FALSE-values.

 Thanks in advance!
 Paul


 --
 Paul C. Schröder
 PhD-Student
 Division of Proteomics, Genomics & Bioinformatics
 Center for Applied Medicine (CIMA)
 University of Navarra
 Avda. Pio XII, 55
 E-31008 Pamplona, Spain

 Tel: +34 948 194700, ext 5023
 email: [EMAIL PROTECTED]





 jim holtman escribió:
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 It is hard to give a solution if we don't have the problem statement,
 or an example of the data structures you are using.

 On Feb 20, 2008 6:57 AM, Paul Christoph Schröder [EMAIL PROTECTED] wrote:

 Hello all!

 I have the following problem with the %in% command:

 1) I have a data frame that consists of functions (rows) and genes
 (columns). The whole has been loaded with the read.delim command
 because of gene-duplications between the different rows.
 2) Now, there is another data frame that contains all the genes (only
 the genes and without duplicates) from all the functions of the above
 data frame.

 What I want to do now is to use the %in% command to obtain a
 TRUE-FALSE data frame. This should be a data frame where, for every
 function, some genes are TRUE and some are FALSE depending on whether
 they were in the specific function when matched against the all genes
 data frame.

 The main problem I have is the way the genes are stored in the first data
 frame. I used the unlist command to separate them at the commas ",".
 But every time I do the match between the first and second data frame it
 returns FALSE for every gene in every function.

 Can anyone please give me a hint on how to handle the problem?
 Thank you very much in advance!

 Paul






Re: [R] variable syntax problem

2008-02-21 Thread jim holtman

Exactly what do you mean by additional text?  Have you tried paste?
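
In case it helps, this is the kind of thing I mean: build the title string with paste (or paste0) and hand the result to main=. The tissue names below are placeholders, and I have left out the gene.graph call since I cannot run it here.

```r
tissues <- c("liver", "kidney", "brain")   # placeholder values

# One title per tissue: fixed text plus the value of the variable
titles <- paste0("PDE1A (red=prostate, blue=", tissues, ")")
print(titles)

# Inside the loop the same idea would be used as
#   main = paste("PDE1A (red=prostate, blue=", tissues[i], ")", sep = "")
```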

On 2/21/08, Paul Hammer [EMAIL PROTECTED] wrote:
 dear members,

 i would like to write a variable in a plot title (main=) but i don't
 know the right syntax:(...i tried a lot of different ways without success.

 here my example:

 y=30
 z=33
 for (i in 10:length(tissue)) {
 png(filename = tissues[i], width = 1024, height = 768, pointsize = 12,
 bg = "white")
 gene.graph("ENSG0115252", rma.affy, gps=list(1:3, y:z),
 type="mean-int", gp.col=c("red", "blue"), by.order=TRUE,
 scale.to.gene=FALSE, use.symbol=TRUE, use.mt=FALSE, *main="PDE1A
 (red=prostate, blue=tissues[i])"*, ylab="intensity / probeset",
 exon.y=1, exon.height=1, exon.bg.col="#c3c3c3",
 exon.bg.border.col="black", show.introns=TRUE)
 y=y-3
 z=z-3
 dev.off() }

 when i write main=tissues[i] the value is written right. but i would
 like to have an additional text...

 thanks
 paul

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] variable syntax problem

2008-02-21 Thread jim holtman

?assign
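
A rough sketch of how assign() could be used here. The tissue names and
the stored value are placeholders (nchar() stands in for the result of
splicing.index(), which is not available outside the poster's setup):

```r
# Hypothetical tissue names.
tissues <- c("liver", "kidney", "lung")

for (i in seq_along(tissues)) {
  value <- nchar(tissues[i])  # placeholder for the real computation
  # assign() creates a variable whose name is built at run time.
  assign(paste("PSA_SI_", tissues[i], sep = ""), value)
}

# get() retrieves a variable by its constructed name.
get("PSA_SI_kidney")
```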

On 2/21/08, Paul Hammer [EMAIL PROTECTED] wrote:
 Paul Hammer schrieb:
  jim holtman schrieb:
  Exactly what do you mean by additional text?  Have you tried paste?
 
  On 2/21/08, Paul Hammer [EMAIL PROTECTED] wrote:
 
  dear members,
 
  i would like to write a variable in a plot title (main=) but i don't
  know the right syntax:(...i tried a lot of different ways without success.
 
  here my example:
 
  y=30
  z=33
  for (i in 10:length(tissue)) {
  png(filename = tissues[i], width = 1024, height = 768, pointsize = 12,
  bg = white)
  gene.graph(ENSG0115252, rma.affy, gps=list(1:3, y:z),
  type=mean-int, gp.col=c(red, blue), by.order=TRUE,
  scale.to.gene=FALSE, use.symbol=TRUE, use.mt=FALSE, *main=PDE1A
  (red=prostate, blue=tissues[i])*, ylab=intensity / probeset,
  exon.y=1, exon.height=1, exon.bg.col=#c3c3c3,
  exon.bg.border.col=black, show.introns=TRUE)
  y=y-3
  z=z-3
  dev.off() }
 
  when i write main=tissues[i] the value is written right. but i would
  like to have an additional text...
 
  thanks
  paul
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 
  thank you jim,
 
  that was what i meant :)

 now i would like to name a variable after the value of another variable...

 example:

 for (i in 10:length(tissue)) {
 PSA_SI_tissues[i] = splicing.index(rma.affy, "ENSG0142515",
 tissue, c("prostate",tissues[i]), vector.out=FALSE)
 }

 with paste it does not work

 for (i in 10:length(tissue)) {
 paste("PSA_SI_",tissues[i]) = splicing.index(rma.affy,
 "ENSG0142515", tissue, c("prostate",tissues[i]), vector.out=FALSE)
 }

 any suggestions?

 thanks
 paul


[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] Unable to create/index a zoo irregular timeseries

2008-02-21 Thread jim holtman

You need to convert to POSIXct, since a POSIXlt object is stored as a list
of 9 components (which is why str() reports POSIXlt[1:9]).
So do the following:

miedate <- as.POSIXct(strptime(as.character(pressione[,1]),
    format="%d-%m-%Y %H:%M:%S"))

There is a newsletter (I forget the issue) that you might want to
refer to on using 'dates'.
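
A small base-R sketch of the conversion, using the three time stamps
from the question. The tz = "UTC" argument is an assumption added to
keep the example reproducible; the zoo call is shown only as a comment
because it needs the zoo package and the poster's data:

```r
stamps <- c("07-01-2008 08:00:00", "07-01-2008 18:00:00",
            "08-01-2008 08:00:00")

# strptime() returns POSIXlt, which is a list of date-time components.
lt <- strptime(stamps, format = "%d-%m-%Y %H:%M:%S", tz = "UTC")

# as.POSIXct() gives one number per time stamp, which can be ordered.
ct <- as.POSIXct(lt)

class(lt)
class(ct)
order(ct)

# With zoo installed, something along these lines should then work:
# zoo::zoo(as.matrix(pressione[, 2:3]), ct)
```

Note that as.matrix() is used in the commented line because matrix() on
a data frame does not produce a numeric matrix.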




On 2/21/08, vittorio [EMAIL PROTECTED] wrote:
 In the text file pressione2008.csv I have the following

 Data,MAX,MIN,Note
 07-01-2008 08:00:00, 135, 90, Eccessi feste, inizio dieta
 07-01-2008 18:00:00, 135, 85, 
 08-01-2008 08:00:00, 125, 75, 

 which is a collection of blood pressure data at different times of the day.
 I would like to build an irregular time series with MIN & MAX blood pressure,
 but being a real newbie with zoo I obtain the following

 > library(zoo)
 > pressione <- data.frame(read.csv("pressione2008.csv"))

 > miedate <- strptime(as.character(pressione[,1]), format="%d-%m-%Y %H:%M:%S")

 > miedate
 [1] "2008-01-07 08:00:00" "2008-01-07 18:00:00" "2008-01-08 08:00:00"

 > str(miedate)
  POSIXlt[1:9], format: "2008-01-07 08:00:00" "2008-01-07 18:00:00" ...

 > ts <- as.zoo(matrix(pressione[,2:3],ncol=2), miedate)
 > ts
 Error in Ops.POSIXt(freq, d) : * not defined for POSIXt objects

 > ts <- zoo(matrix(pressione[,2:3],ncol=2), miedate)
 Error in order(x, ..., na.last = na.last, decreasing = decreasing) :
  unimplemented type 'list' in 'orderVector1'
 In addition: Warning message:
 In zoo(matrix(pressione[, 2:3], ncol = 2), miedate) :
  some methods for zoo objects do not work if the index entries
 in 'order.by' are not unique

 

 Please help

 Ciao
 Vittorio

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] variable syntax problem

2008-02-21 Thread jim holtman

Also consider using a 'list' to store the results.
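
A minimal sketch of the list approach. The tissue names are made up and
nchar() is only a stand-in for the real splicing.index() call:

```r
# Hypothetical tissue names.
tissues <- c("liver", "kidney", "lung")

PSA_SI <- list()
for (tissue_name in tissues) {
  # Store each result under the tissue's name instead of creating
  # a separately named variable for it.
  PSA_SI[[tissue_name]] <- nchar(tissue_name)  # placeholder computation
}

names(PSA_SI)
PSA_SI[["lung"]]
```

Keeping the results in one list makes it easy to loop over them later
with lapply() or sapply().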



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] Save a group of matrix

2008-02-21 Thread jim holtman

Look at using a list to store the data, something like this:

> results <- list()
> for (year in 2002:2008){
+ results[[as.character(year)]] <- matrix(year,10,10)
+ }
> results
$`2002`
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
 [2,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
 [3,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
 [4,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
 [5,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
 [6,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
 [7,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
 [8,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
 [9,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
[10,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002

$`2003`
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
 [2,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
 [3,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
 [4,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
 [5,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
 [6,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
 [7,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
 [8,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
 [9,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
[10,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003

$`2004`
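
If the data already carries a year column, the same list can be built
without an explicit loop. This is only a sketch: the 'Year' column and
the toy values are assumptions, not taken from the poster's file:

```r
# Hypothetical data with a 'Year' column.
Data <- data.frame(Year = rep(2002:2004, each = 4), Numero = 1:12)

# split() cuts the data frame into one piece per year.
by_year <- split(Data, Data$Year)

# lapply() runs the per-year computation and returns a named list.
results <- lapply(by_year, function(d) matrix(d$Numero, nrow = 2))

names(results)
results[["2003"]]
```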


On 2/21/08, Alfonso Pérez Rodríguez [EMAIL PROTECTED] wrote:
 It seems that it is not possible to send an R file in the messages, so I
 am resending the message with the script included.


 Hello, I'm creating a loop to work with vegan, to get a species abundance
 curve. Below is the script I've created; I also sent an Excel file to test
 what it can do.
 I have a database covering 20 years. Each year we sampled 19 strata, and
 in each stratum we carried out some sampling. With the script below I have
 managed to calculate the species abundance curve for each stratum, but only
 for one year. I want to be able to do this for each of the 20 years sampled,
 separately, obtaining one independent matrix for each year, but I don't
 know how to do it. I'm sure it's very simple, but I have not found the way.

 If someone can help me I would be very grateful, thank you


 SCRIPT

 library(reshape)
 library(vegan)
 Input="D:/R/Analisis aprendizaje/Input"
 setwd(Input)
 Data=read.table("PruebasRNA3.csv",header=T,sep=";",dec=".")

 Estr=unique(Data$ESTRATO)
 LEstr=length(Estr)

 Results= matrix(nrow=20, ncol=LEstr)
 Results[is.na(Results)]=0

 for(i in 1:LEstr)
  {
  Datasel=Data[Data$ESTRATO==Estr[i],]
  SubData=data.frame(Datasel$PESCA, Datasel$Sp, Datasel$Numero)
  TransData <- reshape(SubData, v.names="Datasel.Numero",
                       idvar="Datasel.PESCA",
                       timevar="Datasel.Sp", direction="wide")
  TransData[is.na(TransData)] <- 0
  SAC=specaccum(TransData,"random",permutations=100)
  # str(SAC): with this function I can see the structure of my data and
  # request the columns I am interested in, which in this case would be
  # 3 to 5 (sites, richness and sd)
  Pesc=length(SAC$richness)
  for (j in 1:Pesc)
  {
  Results[j,i]=SAC$richness[j]
  }
  }
 Results
 write.table(Results,file="D:/R/Analisis aprendizaje/Output/Results.txt")



 Alfonso Pérez Rodríguez
 Instituto de Investigaciones Marinas
 C/ Eduardo Cabello nº 6
 C.P. 36208 Vigo (España)
 Tlf.- 986231930 Extensión 241
 e-mail: [EMAIL PROTECTED]


 


  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] How to get names of a list into df:s?

2008-02-21 Thread jim holtman

Here is one way of doing it:

> lapply(names(g), function(z)cbind(x=g[[z]], var1=z))
[[1]]
  x var1
1 1    a
2 2    a
3 3    a

[[2]]
  x var1
1 4    b
2 5    b
3 6    b

[[3]]
  x var1
1 7    c
2 8    c
3 9    c
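
The result above loses the list names ($a, $b, $c). One possible
variant that keeps them, offered as a sketch rather than the only way,
is to walk over the data frames and their names together with Map():

```r
g <- list(a = 1:3, b = 4:6, c = 7:9)
g <- lapply(g, function(x) as.data.frame(x))

# Map() passes each data frame together with its own name, and keeps
# the names of 'g' on the result.
out <- Map(function(df, nm) cbind(df, var1 = nm), g, names(g))

names(out)
out$b
```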


On Thu, Feb 21, 2008 at 1:22 PM, Lauri Nikkinen [EMAIL PROTECTED] wrote:
 R users,

 I have a simple lapply question.

 g <- list(a=1:3, b=4:6, c=7:9)
 g <- lapply(g, function(x) as.data.frame(x))
 lapply(g, function(x) cbind(x, var1 = rep(names(g), each=nrow(x))[1:nrow(x)]))

 I get

 $a
   x var1
 1 1    a
 2 2    a
 3 3    a

 $b
   x var1
 1 4    a
 2 5    a
 3 6    a

 $c
   x var1
 1 7    a
 2 8    a
 3 9    a

 And I would like to have

 $a
   x var1
 1 1    a
 2 2    a
 3 3    a

 $b
   x var1
 1 4    b
 2 5    b
 3 6    b

 $c
   x var1
 1 7    c
 2 8    c
 3 9    c

 How should I modify my lapply clause to achieve this?

 Best regards,
 Lauri

  sessionInfo()
 R version 2.6.1 (2007-11-26)
 i386-apple-darwin8.10.1

 locale:
 C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] Problem with cut

2008-02-22 Thread jim holtman

One way of finding out is to look at the code for cut.default.  Here
is the result of tracing through it where it determines where the cuts
are for 12 equal spacings:

D(2)
 [1] 149.804 166.170 182.536 198.902 215.268 231.634 248.000 264.366
280.732 297.098 313.464 329.830
[13] 346.196

As you can see one of the breakpoints is at 329.830 that is why 330 is
in the (330,346] category.  The statements in the function that do
this are:

if (length(breaks) == 1) {
    if (is.na(breaks) | breaks < 2)
        stop("invalid number of intervals")
    nb <- as.integer(breaks + 1)
    dx <- diff(rx <- range(x, na.rm = TRUE))
    if (dx == 0)
        dx <- abs(rx[1])
    breaks <- seq.int(rx[1] - dx/1000, rx[2] + dx/1000, length.out = nb)
}

You can see there is a small fudge factor applied to both ends to make
sure all the data is included.  That is what causes the perceived
problem.
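
The breakpoints can be reproduced by hand from the code quoted above.
This sketch mirrors that snippet only; other R versions may place the
interior breakpoints slightly differently. The second part shows one
possible workaround, supplying explicit breaks (the particular vector
seq(138, 346, by = 16) is just an illustrative choice):

```r
vv <- seq(150, 346, by = 4)

# Recompute the 13 breakpoints as in the quoted code.
rx <- range(vv)
dx <- diff(rx)
breaks <- seq(rx[1] - dx/1000, rx[2] + dx/1000, length.out = 13)
round(breaks, 3)

# 330 lies just above the 12th breakpoint, so it lands in the last interval.
as.integer(cut(330, breaks))

# Supplying breaks explicitly puts the boundaries exactly where wanted.
own_breaks <- seq(138, 346, by = 16)
as.character(cut(330, own_breaks))
```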

On Fri, Feb 22, 2008 at 8:21 AM,  [EMAIL PROTECTED] wrote:
 Hi All,

 I might have misunderstood how cut works, but the following behaviour
 surprises me.

 vv <- seq(150, 346, by= 4)
 cc <- cut(vv, 12)
 cc[vv == 330]
 Results [1] (330,346]

 I would have expected 330 to fall into the (313,330] category.

 Can you please advise what I am doing wrong?

 Many Thanks,
 Jussi Lehto

Re: [R] Problem with cut

2008-02-22 Thread jim holtman

You can also get more detail on where the intervals are with 'dig.lab':

> cc <- cut(vv, 12, dig.lab=6)
> str(cc)
 Factor w/ 12 levels "(149.804,166.17]",..: 1 1 1 1 1 2 2 2 2 3 ...
> cc
 [1] (149.804,166.17]  (149.804,166.17]  (149.804,166.17]
(149.804,166.17]  (149.804,166.17]
 [6] (166.17,182.536]  (166.17,182.536]  (166.17,182.536]
(166.17,182.536]  (182.536,198.902]
[11] (182.536,198.902] (182.536,198.902] (182.536,198.902]
(198.902,215.268] (198.902,215.268]
[16] (198.902,215.268] (198.902,215.268] (215.268,231.634]
(215.268,231.634] (215.268,231.634]
[21] (215.268,231.634] (231.634,248] (231.634,248]
(231.634,248] (231.634,248]
[26] (248,264.366] (248,264.366] (248,264.366]
(248,264.366] (264.366,280.732]
[31] (264.366,280.732] (264.366,280.732] (264.366,280.732]
(280.732,297.098] (280.732,297.098]
[36] (280.732,297.098] (280.732,297.098] (297.098,313.464]
(297.098,313.464] (297.098,313.464]
[41] (297.098,313.464] (313.464,329.83]  (313.464,329.83]
(313.464,329.83]  (313.464,329.83]
[46] (329.83,346.196]  (329.83,346.196]  (329.83,346.196]
(329.83,346.196]  (329.83,346.196]
12 Levels: (149.804,166.17] (166.17,182.536] (182.536,198.902]
(198.902,215.268] ... (329.83,346.196]


On Fri, Feb 22, 2008 at 10:59 AM, Henrique Dallazuanna [EMAIL PROTECTED] 
wrote:
 It is to show the categories that contain '330'

 On 22/02/2008, Heinz Tuechler [EMAIL PROTECTED] wrote:
  At 15:22 22.02.2008, Henrique Dallazuanna wrote:
   Try this:
   
   grep(330, levels(cc), value=T)
 
 
  Could you please explain in a little more detail,
   how this answers the original question?
 
  I would have expected 330 to fall into (313,330] category.
Can you please advice what do I do wrong?
 
 
  Thank you
 
 
   Heinz

 
 
 
   On 22/02/2008, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Hi All,

  I might have misunderstood how cut works, but the following behaviour
  surprises me.

  vv <- seq(150, 346, by= 4)
  cc <- cut(vv, 12)
  cc[vv == 330]
  Results [1] (330,346]

  I would have expected 330 to fall into the (313,330] category.

  Can you please advise what I am doing wrong?

  Many Thanks,
  Jussi Lehto



 __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.


   
   
   --
   Henrique Dallazuanna
   Curitiba-Paraná-Brasil
   25° 25' 40 S 49° 16' 22 O
   
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.
 
 
 


 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] Corrected : Efficient writing of calculation involving each element of 2 data frames

2008-02-22 Thread jim holtman

Take a look at the 'embed' function.  With it you can create a matrix
in which each column holds a lagged (shifted) copy of the data.  You
would want to do embed(your.data,100).
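
A sketch of how embed() could replace the loop. The helper name
rolling_cor() and the simulated data are made up for illustration; each
row of the embedded matrix is one window, and the correlation is
computed from row means:

```r
# Hypothetical helper: Pearson correlation over a sliding window.
rolling_cor <- function(x, y, window) {
  ex <- embed(x, window)   # row i holds x[i + window - 1], ..., x[i]
  ey <- embed(y, window)
  mx <- rowMeans(ex)
  my <- rowMeans(ey)
  cov_xy <- rowMeans(ex * ey) - mx * my
  var_x  <- rowMeans(ex^2) - mx^2
  var_y  <- rowMeans(ey^2) - my^2
  cov_xy / sqrt(var_x * var_y)
}

set.seed(42)
x <- rnorm(500)
y <- x + rnorm(500)

fast <- rolling_cor(x, y, 100)

# Reference: the explicit loop from the question, for comparison.
slow <- sapply(100:500, function(i) cor(x[(i - 99):i], y[(i - 99):i]))

all.equal(fast, slow)
```

The one-pass variance formula used here can lose precision when the
values are large relative to their spread, so it is worth checking the
result against cor() on a few windows of the real data.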

On Fri, Feb 22, 2008 at 4:15 PM, Vikas N Kumar
[EMAIL PROTECTED] wrote:
 Hi

 I have 2 data.frames each of the same number of rows (approximately 3 or
 more entries).
 They also have the same number of columns, lets say 2.
 One column has the date, the other column has a double precision number. Let
 the column names be V1, V2.

 Now I want to calculate the correlation of the 2 sets of data, for the last
 100 days for every day available in the data.frames.

 My code looks like this :
 # Let df1, and df2 be the 2 data frames with the required data
 ## begin code snippet

 my_corr <- c();
 for ( i_start in 100:nrow(df1))
   my_corr[i_start-99] <-
 cor(x=df1[(i_start-99):i_start,"V2"],y=df2[(i_start-99):i_start,"V2"])
 ## end of code snippet

 This runs very slowly, and takes more than an hour to run if I have to
 calculate correlation between 10 data sets leaving me with 45 runs of this
 snippet or taking more than 30 minutes to run.

 Is there an efficient  way to write  this piece of code where I can get it
 to run faster ?

 If I do something similar in Excel, it is much faster. But I have to use R,
 since this is a part of a bigger program.

 Any help will be appreciated.

 Thanks and Regards
 Vikas






 --
 http://www.vikaskumar.org/

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] Fixed effects

2008-02-22 Thread jim holtman

help.search('fixed effect') produces these matches.  Does one of them do
what you want?

Help files with alias or concept or title matching 'fixed effect'
using fuzzy matching:



fixef(lme4)                 Extract Fixed Effects
lmer(lme4)                  Fit (Generalized) Linear Mixed-Effects Models
fixed.effects(nlme)         Extract Fixed Effects
fixed.effects.lmList(nlme)  Extract lmList Fixed Effects
lme(nlme)                   Linear Mixed-Effects Models
lmeStruct(nlme)             Linear Mixed-Effects Structure
nlme(nlme)                  Nonlinear Mixed-Effects Models
nlmeStruct(nlme)            Nonlinear Mixed-Effects Structure



On Fri, Feb 22, 2008 at 7:42 PM, Petros Andreou
[EMAIL PROTECTED] wrote:
 Hello everyone!

 I would really appreciate it if someone knew where could I find the command
 in R in order to run a fixed effects regression.
 What format should my data have?

 I have looked through the manual and I could not find anything

 Thank you in advance,


 Petros

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] Aranda-Ordaz

2008-02-23 Thread jim holtman

I have no idea if it is helpful, but a quick google search turned up:

LINKINF # Two S-PLUS functions to compute influence diagnostics ... The
assumed logit link is embedded within the Aranda-Ordaz parametric #
family of link functions. # Written by John Yick and Andy H. Lee ...
phase.hpcc.jp/mirrors/stat/S/linkinf


On Sat, Feb 23, 2008 at 10:11 AM, o ha wang [EMAIL PROTECTED] wrote:
 Hi all,

  Does anyone know R code or SAS code for Aranda-Ordaz link family?

  thanks,
  xiao yue


 -

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: [R] color area between two time-series via polygon()?

2008-02-24 Thread jim holtman

I think you have to change one statement in your program:

 xx <- cbind(time(z[,1]),rev(time(z[,2])))
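
A self-contained version of the corrected idea, as a sketch. It uses
c() instead of cbind() to build plain coordinate vectors, and draws to
a null PDF device (pdf(NULL)) only so that the example runs without
opening a window or writing a file:

```r
set.seed(1)
z <- ts(matrix(rnorm(200), 100), start = c(1961, 1), frequency = 12)

# x coordinates: time forward along series 1, then backward along series 2.
tt <- as.numeric(time(z))
xx <- c(tt, rev(tt))

# y coordinates: series 1 forward, series 2 reversed, so the outline closes.
yy <- c(as.vector(z[, 1]), rev(as.vector(z[, 2])))

pdf(NULL)  # null device; drop this line to plot on screen
plot(z, plot.type = "single", lty = 1:2)
polygon(xx, yy, col = "gray", border = "red")
dev.off()
```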



On Sun, Feb 24, 2008 at 7:44 PM,  [EMAIL PROTECTED] wrote:
 Hi all,



 I would like to color the area between two time series. I tried it by
 using the polygon() function, but it keeps drawing lines between the
 beginning and end points.

 Is there another, more appropriate function, or how could I close the
 polygon at the end and the beginning of the time series (e.g., by drawing
 a straight line)?



 The following doesn't plot a polygon between the two time-series:

 z <- ts(matrix(rnorm(200), 100), start=c(1961, 1), frequency=12)

 plot(z, plot.type="single", lty=1:2)

 xx <- cbind(time(z[,1]),rev(z[,2]))

 yy <- cbind(as.vector(z[,1]),rev(as.vector(z[,2])))

 polygon(xx,yy, col="gray", border = "red")



 I would like to make it look like this (but then for time series)

 n <- 100
 xx <- c(0:n, n:0)
 yy <- c(c(0,cumsum(stats::rnorm(n))), rev(c(0,cumsum(stats::rnorm(n)))))
 plot   (xx, yy, type="n", xlab="Time", ylab="Distance")
 polygon(xx, yy, col="gray", border = "red")



 Thanks for your help,

 Jan




[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?  Tell me what you want to
do, not how you want to do it.

Re: [R] Graph Axis

2008-02-25 Thread jim holtman

You have to convert your date column to the Date class:

x <- read.table("/tempxx.txt", header=TRUE, as.is=TRUE)
x$Date <- as.Date(x$Date, "%d/%m/%Y")
plot(x$Date, x$Rate, type='l')


On 2/25/08, Khadija Mohammedali [EMAIL PROTECTED] wrote:

 Hi, I have data of exchange rates and time, and am trying to draw a graph that
 will show the rates on the y axis and dates on the x axis. I am using the
 following code:
 plot(rate, type='l', xlab='Date', ylab='Rate', main='£ to Euro rate over 5 years')
 This gives me the graph I want, although I want to display the dates on the
 x axis, even if it's just 2002, 2003,...2008.
 Attached is my data. Hope you can help.


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?  Tell me what you want to
do, not how you want to do it.

Re: [R] Graph Axis

2008-02-25 Thread jim holtman

Plot the x-axis with one fewer data point (diff() drops one value):

plot(x$Date[-1], returns, type='l')
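
A self-contained sketch of both steps. The four dates and rates are
invented for illustration (the poster's file is not available), and
pdf(NULL) is used only so the example runs without a display:

```r
# Made-up sample data in the same day/month/year layout.
x <- data.frame(Date = c("01/02/2008", "04/02/2008",
                         "05/02/2008", "06/02/2008"),
                Rate = c(1.34, 1.33, 1.35, 1.36),
                stringsAsFactors = FALSE)

x$Date <- as.Date(x$Date, "%d/%m/%Y")

# diff() returns one fewer value than its input, so drop the first date.
returns <- diff(log(x$Rate))

pdf(NULL)  # null device; drop this line to plot on screen
plot(x$Date[-1], returns, type = "l", xlab = "Date", ylab = "Log return")
dev.off()
```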

On Mon, Feb 25, 2008 at 5:39 PM, Khadija Mohammedali [EMAIL PROTECTED] wrote:
 Hi Jim

 Thank you for your quick response. This worked great. I am having the same
 problem again. I have moved on to calculating returns from rates and want to
 plot returns on the y axis and again dates on the x axis. The code I am
 using to calculate returns is as follows:
 rate <- x$Rate
 returns <- diff(log(rate))
 If I do:
 plot(returns, type="l") I get the graph I want, but I am having problems
 with the x axis again. Modifying the code below doesn't work, as I now
 have one fewer element in returns. Hope I am making sense.  Your help is
 much appreciated.


  Date: Mon, 25 Feb 2008 16:32:44 -0500
  From: [EMAIL PROTECTED]
  To: [EMAIL PROTECTED]
  Subject: Re: [R] Graph Axis
  CC: r-help@r-project.org


 
  You have to convert your date column to the Date class:
 
  x <- read.table("/tempxx.txt", header=TRUE, as.is=TRUE)
  x$Date <- as.Date(x$Date, "%d/%m/%Y")
  plot(x$Date, x$Rate, type='l')
 
 
  On 2/25/08, Khadija Mohammedali [EMAIL PROTECTED] wrote:
  
   Hi I have data of exchange rates and time, and am trying to draw a graph
 that will show the rates on the y axis and dates on the x axis. I am using
 the following code: plot(rate, type='l', xlab='Date', ylab='Rate', main='£
 to Euro rate over 5 years')This gives me the graph I want although I want to
 display the dates on the x axis, even if its just 2002, 2003,...2008.
 Attached is my data. Hope you can help.
  
  
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.
  
  
  
 
 
  --
  Jim Holtman
  Cincinnati, OH
  +1 513 646 9390
 
  What is the problem you are trying to solve? Tell me what you want to
  do, not how you want to do it.


 



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?  Tell me what you want to
do, not how you want to do it.

Re: [R] combining 40,000 with 40,000 data frame (different tact)

2008-02-26 Thread jim holtman

?rbind
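
A minimal sketch with two tiny data frames standing in for the two
years (column names and values are made up). rbind() stacks them as
long as the column names match:

```r
# Hypothetical stand-ins for the two years of data.
year1 <- data.frame(time = 1:3, temp = c(10.1, 10.4, 9.8))
year2 <- data.frame(time = 4:6, temp = c(11.0, 10.7, 10.9))

# Append the second year below the first.
both <- rbind(year1, year2)

nrow(both)
both
```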

On 2/26/08, stephen sefick [EMAIL PROTECTED] wrote:
 I have not been able to find anything to do what I want, so I am going
 to tack to the left.  I have two continuous time series for two years
 with the same fourteen variables.  I would like to simply append the
 second year to the first.  They both have the same column headings
 etc.  Just like taping two pieces of paper together for a long number
 series.
 Thanks

 Stephen

 --
 Let's not spend our time and resources thinking about things that are
 so little or so large that all they really do for us is puff us up and
 make us feel like gods.  We are mammals, and have not exhausted the
 annoying little problems of being mammals.

-K. Mullis

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?  Tell me what you want to
do, not how you want to do it.

Re: [R] numeric format

2008-02-26 Thread jim holtman

Those are parameters to 'print';  what you want is something like:

> x <- data.frame(a=runif(10))
> print(x)
 a
1  0.713705394
2  0.715496609
3  0.629578524
4  0.184360667
5  0.456639418
6  0.008667156
7  0.260985437
8  0.270915631
9  0.689128652
10 0.302484280
> print(x,scientific=F, digits=4)
  a
1  0.713705
2  0.715497
3  0.629579
4  0.184361
5  0.456639
6  0.008667
7  0.260985
8  0.270916
9  0.689129
10 0.302484
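
To also drop the index column and fix the number of decimal places, one
option (a sketch, not the only way) is row.names = FALSE in print()
together with formatC() for a fixed number of decimals. The three
values are taken from the output above:

```r
x <- data.frame(a = c(0.713705394, 0.008667156, 0.302484280))

# digits= controls significant digits; row.names = FALSE hides the index.
print(x, digits = 4, row.names = FALSE)

# formatC() with format = "f" gives exactly 4 decimal places, as text.
x_fmt <- data.frame(a = formatC(x$a, format = "f", digits = 4))
print(x_fmt, row.names = FALSE)
```

Note that the formatted column is character, so it is for display only;
keep the numeric column for further calculations.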



On 2/26/08, cvandy [EMAIL PROTECTED] wrote:

 Hi!
 I'm an R newbie and this should be a trivial problem, but I can't make it
 work and cannot find what I'm doing wrong in the literature.
 I entered the command:
 table <- data.frame(x, scientific=F, digits=4)
 table
 This prints a column of x with 16 useless decimal places after the decimal
 point.  Also, it prints an unwanted index number (1-20) in the left column.
 How do I get rid of the index column and how do I control the number of
 decimal places?
 Thanks in advance.
 CHV
 --


Re: [R] plot y1 and y2 on one graph

2008-02-27 Thread jim holtman

This should do what you want:

x <- 1:10
y1 <- x+runif(10)*2
y2 <- seq(0,50,length.out=10)+rnorm(10)*10


plot(y1~x, bty='c')
par(new=TRUE)  # plot on the same graph
plot(y2~x, col='red', axes=FALSE, bty='c', xlab='', ylab='')
axis(4, col.axis='red', col='red')
mtext("y2", 4, col='red', line=-2)



On Wed, Feb 27, 2008 at 5:05 PM, milton ruser [EMAIL PROTECTED] wrote:
 Dear all

 I have a code like

 x <- 1:10
 y1 <- x+runif(10)*2
 y2 <- seq(0,50,length.out=10)+rnorm(10)*10

 par(mfrow=c(1,2))
 plot(y1~x)
 plot(y2~x)

 Now I would like to plot y1 and y2 on the same graph, with its two scales
 (y1 on left and y2 on right side).

 Any help is welcome.

 Kind regards

 Miltinho

 Brazil


Re: [R] write.csv +RMySQL request

2008-02-28 Thread jim holtman

?capture.output

myoutput <- capture.output(write.csv(...))
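
A slightly fuller sketch of the same idea, assuming a small made-up data frame; it collapses the captured lines into one string (for example, for a single database field) and parses it back to check the round trip:

```r
# Small stand-in data frame (made-up values).
df <- data.frame(id = 1:3, score = c(2.5, 3.7, 1.2))

# With no file given, write.csv() prints to the console;
# capture.output() collects those lines as a character vector.
csv_lines <- capture.output(write.csv(df, row.names = FALSE))

# Collapse to a single string with embedded newlines.
csv_text <- paste(csv_lines, collapse = "\n")

# Round trip: parse the string back into a data frame.
df_back <- read.csv(text = csv_text)
```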

On Thu, Feb 28, 2008 at 7:34 PM, Tristan Casey [EMAIL PROTECTED] wrote:
Hello,

I am relatively new to R and learning its ins and outs. As part of a website
I am building, I need to read and write csv files directly from an SQL
database. Basically I want to convert R variables (dataframes) into CSV
format, store them as another R variable (as a properly formatted text string
suitable for csv reading) and then send this to one row in a database.

The SQL part is fine, the problem arises because I cannot capture the output
of write.csv! It posts to the terminal when file="" is used, however I also
want to store it. Does anyone have any ideas?

Thanks in advance!


Re: [R] Getting multiple tables when using table(dataframe) to tabulate data

2008-02-28 Thread jim holtman

Is this what you want?

> tapply(x$count, list(x$delta_ts, x$status), sum)
           ASSIGNED CLOSED NEW RESOLVED
2008-02-21        2     NA   2        0
2008-02-22        0      0   6        1
2008-02-23        2      1  12        0
2008-02-24        7      4  16        2
2008-02-25        2      6  22        5
2008-02-26        6      8  38        3
2008-02-27       NA      3  56        5
2008-02-28       NA      3  56        5
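
If zeros are preferred over NA for date/status combinations that never occur, xtabs() is one alternative. A minimal sketch, using only the rows for 2008-02-21 and 2008-02-27 from the data posted below:

```r
# Subset of the posted data: the rows for two of the dates.
x <- data.frame(
  delta_ts = c("2008-02-27", "2008-02-27", "2008-02-27",
               "2008-02-21", "2008-02-21", "2008-02-21", "2008-02-21"),
  status   = c("CLOSED", "NEW", "RESOLVED",
               "ASSIGNED", "ASSIGNED", "NEW", "RESOLVED"),
  count    = c(3, 56, 5, 1, 1, 2, 0)
)

# Sum 'count' within each date/status cell; empty cells become 0.
tab <- xtabs(count ~ delta_ts + status, data = x)
tab
```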


On Thu, Feb 28, 2008 at 8:22 PM, obradoa [EMAIL PROTECTED] wrote:

 I am having a hard time tabulating data in a dataframe, and getting a single
 table for an answer. I am trying to tabulate all counts for given
 status on a given date.


 I have a data frame such as:


      delta_ts   status count
 1  2008-02-27   CLOSED     3
 2  2008-02-27      NEW    56
 3  2008-02-27 RESOLVED     5
 4  2008-02-21 ASSIGNED     1
 5  2008-02-21 ASSIGNED     1
 6  2008-02-21      NEW     2
 7  2008-02-21 RESOLVED     0
 8  2008-02-22 ASSIGNED     0
 9  2008-02-22   CLOSED     0
 10 2008-02-22      NEW     6
 11 2008-02-22 RESOLVED     1
 12 2008-02-23 ASSIGNED     2
 13 2008-02-23   CLOSED     1
 14 2008-02-23      NEW    12
 15 2008-02-23 RESOLVED     0
 16 2008-02-24 ASSIGNED     7
 17 2008-02-24   CLOSED     4
 18 2008-02-24      NEW    16
 19 2008-02-24 RESOLVED     2
 20 2008-02-25 ASSIGNED     2
 21 2008-02-25   CLOSED     6
 22 2008-02-25      NEW    22
 23 2008-02-25 RESOLVED     5
 24 2008-02-26 ASSIGNED     6
 25 2008-02-26   CLOSED     8
 26 2008-02-26      NEW    38
 27 2008-02-26 RESOLVED     3
 28 2008-02-28   CLOSED     3
 29 2008-02-28      NEW    56
 30 2008-02-28 RESOLVED     5


 When I do table on that frame I get a long list that looks like this:


  table(data)
 , , count = 0

             status
 delta_ts     ASSIGNED CLOSED NEW RESOLVED
   2008-02-21        0      0   0        1
   2008-02-22        1      1   0        0
   2008-02-23        0      0   0        1
   2008-02-24        0      0   0        0
   2008-02-25        0      0   0        0
   2008-02-26        0      0   0        0
   2008-02-27        0      0   0        0
   2008-02-28        0      0   0        0


 and so on all the way up to

 , , count = 56

             status
 delta_ts     ASSIGNED CLOSED NEW RESOLVED
   2008-02-21        0      0   0        0
   2008-02-22        0      0   0        0
   2008-02-23        0      0   0        0
   2008-02-24        0      0   0        0
   2008-02-25        0      0   0        0
   2008-02-26        0      0   0        0
   2008-02-27        0      0   1        0
   2008-02-28        0      0   1        0



 What I actually want is for my counts to be properly tabulated in one single
 table that looks something like this.


 delta_ts     ASSIGNED CLOSED NEW RESOLVED
   2008-02-21        2      5   9       15

 and so on...


 Any ideas what I am doing wrong?

 Thanks!
 --
 View this message in context: 
 http://www.nabble.com/Getting-multiple-tables-when-using-table%28dataframe%29-to-tabulate-data-tp15750098p15750098.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] Large loops hang?

2008-02-28 Thread jim holtman

If you really want to do a loop, then preallocate your storage.  You
were dynamically allocating each time through (or thereabouts):

> system.time({
+ res <- numeric(10)
+ for (i in 1:10) {
+     x <- rnorm(2)
+     res[i] <- x[2] - x[1]
+ }
+ })
   user  system elapsed
   2.75    0.02    3.10
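
For this particular statistic the loop can be dropped altogether. A minimal sketch, assuming each replication needs only two independent standard normal draws; the count n is an assumption matching the 10^5 case mentioned in the question:

```r
n <- 1e5        # assumed number of replications
set.seed(42)    # arbitrary seed, only for reproducibility

# Draw every value at once: one row per replication, two columns.
draws <- matrix(rnorm(2 * n), ncol = 2)

# Same quantity as x[2] - x[1] in the loop, for all replications together.
res <- draws[, 2] - draws[, 1]
```

When the per-replication work cannot be vectorized, preallocating as shown above remains the fallback.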

On Thu, Feb 28, 2008 at 3:30 PM, Minimax [EMAIL PROTECTED] wrote:
 Dear useRs,

 Suppose we have loop:

 res <- c()
 for (i in 1:10) {
     x <- rnorm(2)
     res <- c(res,x[2]-x[1])
 }

 and this loop for 10^5 cases runs about - for example 5 minutes.

 When I add one zero (10^6) this loop will not end overnight but probably
 hangs. This occurs regardless of calculated statistics in such
 simulation, always above 10^5 times. Nested loops do not help.

 Any suggestions for collecting a larger amount of Monte Carlo data?

 Regards

 Minimax


Re: [R] Fwd: Re: How to create following chart for visualizing multivariate time series

2008-02-28 Thread jim holtman

Try something like this:

require(grDevices) # for colours
x <- y <- seq(-4*pi, 4*pi, len=27)
r <- sqrt(outer(x^2, y^2, "+"))
image(x, y, r, col=gray((0:32)/32))
colors <- colorRampPalette(c('red', 'yellow', 'blue'))  # create your color spectrum
image(x,y,r, col=colors(100))


On Thu, Feb 28, 2008 at 9:28 PM, Megh Dal [EMAIL PROTECTED] wrote:
 I used ?image function to do that, like below :

 require(grDevices) # for colours
 x <- y <- seq(-4*pi, 4*pi, len=27)
 r <- sqrt(outer(x^2, y^2, "+"))
 image(x, y, r, col=gray((0:32)/32))

 However, my next problem is to add a color palette describing the colors [as shown
 in the following link]. If anyone here can tell me how to do that, it will be good
 for me.

 Regards,




 Megh Dal [EMAIL PROTECTED] wrote:  Hi all,

 Can anyone here please tell me whether it is possible to produce the chart
 displayed at http://www.datawolf.blogspot.com/ in R for visualizing
 multivariate time series? If possible, how?


 Regards,



Re: [R] setwd on other computer?

2008-02-29 Thread jim holtman

I can do it under Windows for network mounted files that are on some
other system: e.g.,

setwd("p:/APPS")
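
A minimal sketch of the pattern. The network paths in the comments are hypothetical examples, not locations from this thread; the runnable part uses a temporary directory as a stand-in, since any directory the operating system can reach behaves the same way:

```r
# Hypothetical network locations (assumptions, shown for illustration only):
#   setwd("p:/APPS")                  # drive letter mapped to a network share
#   setwd("//fileserver/share/APPS")  # UNC path written with forward slashes

# Runnable stand-in: a directory that is guaranteed to exist.
target <- tempdir()
old <- setwd(target)   # setwd() returns the previous working directory
inside <- getwd()      # the new working directory
setwd(old)             # restore the original directory when finished
```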

On 2/29/08, Paul Hammer [EMAIL PROTECTED] wrote:
 hi members,

 is it possible to set the working directory (e.g. via setwd()) on a computer
 other than the one where R was started?

 thanks
 paul


Re: [R] while loop syntax help

2008-02-29 Thread jim holtman

Does this give the answer that you want?

> x <- c(5,5,7,6,5,4,3)
> result <- NULL
> for (i in 1:(length(x) - 2)){
+ if ((x[i + 1] < x[i]) && (x[i + 2] < x[i])) result <- c(result, i)
+ }
> result
[1] 3 4 5
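
The same test can be written without a loop. A minimal sketch under the same reading of the question (both of the next two values are smaller than the current one):

```r
x <- c(5, 5, 7, 6, 5, 4, 3)

# Candidate positions: every index that still has two successors.
i <- seq_len(max(length(x) - 2, 0))

# Keep the positions where both following values are smaller.
result <- i[x[i + 1] < x[i] & x[i + 2] < x[i]]

# The first such position, which is what the question asks for.
first_drop <- result[1]
```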




On 2/29/08, zack holden [EMAIL PROTECTED] wrote:

 Dear list,
 I'm trying to write my first looping function in R. After many hours of 
 searching help files and previous posts, I'm at my wits' end. Please forgive my
 programming ignorance...any help is greatly appreciated.

 I need to sort through a vector (x) and identify the point at which 2 
 successive values become smaller than the previous value.

 I've written a while statement that I think should work. It should
 basically say: if value 1 > value 2 and also > value 3, then == row(Value 1).
 Else, go to the next value. However, output returns NULL, no matter how
 I've modified the syntax.

 Thanks in advance for any help.

 Zack
 #
 x <- c(5,5,7,6,5,4,3)
 x <- data.frame(x)
 y <- length(x)-2
 counter <- 1
 output = c()
 while(counter <= y) {

 counter1 <- counter+1
 counter2 <- counter+2
 if(x[counter,1] > x[counter1,1] || x[counter1,1] > x[counter2,1]){
 output = x[counter, ]
 } else {
 counter = counter+1
 }
 counter = y}

Re: [R] can the matrix size limit be increased?

2008-02-29 Thread jim holtman

You only have 1 row in your matrix, so what you are getting printed
out is not an empty matrix, but the header.  If you print the
transpose you get:

> head(t(tst))
     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4
[5,]    5
[6,]    6


The default is to print at most 99999 values (the max.print option).  Your
data is there, you just have to wait for the headers to print out, or at
least make a matrix with more than one row.
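
A minimal sketch of how to confirm the data is present without printing the whole object; it rebuilds the 1 x 158991 test matrix from the question:

```r
# Same shape as the test matrix in the question.
tst <- matrix(seq_len(158991), nrow = 1)

dims <- dim(tst)                  # number of rows and columns
first_six <- tst[1, 1:6]          # index a few elements directly
limit <- getOption("max.print")   # current cap on printed entries
as_column <- head(t(tst))         # transposed view, one value per row

# The cap can be raised if needed, e.g. options(max.print = 200000),
# but inspecting dim() or a slice is usually more practical.
```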

On Fri, Feb 29, 2008 at 2:25 PM, Robert Leach [EMAIL PROTECTED] wrote:
 Hi there,

 I'm brand new to R, so let me know if this question is not
 appropriate for this list.  I've been reading through the
 documentation and have tried a number of things, but am pretty much
 stuck so far.  Here's the session info:

   sessionInfo()
 R version 2.6.2 (2008-02-08)
 i386-apple-darwin8.10.1

 locale:
 C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 loaded via a namespace (and not attached):
 [1] rcompgen_0.1-17


 So I seem to be hitting a limit on matrix size.  First I read in my
 data into a list and it's OK:

   mz <- scan("data.column3.txt", list(0))
 Read 158991 records
   mz
 [[1]]
  [1] 0.000000e+00 0.000000e+00 1.003393e+01 3.651888e+00 0.000000e+00
  [6] 0.000000e+00 3.067042e+00 1.277249e+00 1.984366e+00 3.644203e+01
 [11] 1.172925e+02 1.933753e+02 2.020940e+02 1.570501e+02 8.990829e+01
 ...

 But when I try to put it into a matrix like this, I don't get an
 error, but the matrix appears empty...

   MZ=matrix(mz[[1]],nrow=1)
   MZ
       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
      [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18]
      [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
 ...

 When I did a subset of my data, it was fine.  I did a manual binary
 search and determined the cutoff to be 100,000 elements.  So if I do
 just 99,999 elements, it looks as I would expect:

   mz <- scan("data.column3.txt", list(0), 99999)
 Read 99999 records
   tst <- matrix(mz[[1]],nrow=1)
   tst
      [,1] [,2]     [,3]     [,4] [,5] [,6]     [,7]     [,8]     [,9]    [,10]
 [1,]    0    0 10.03393 3.651888    0    0 3.067042 1.277249 1.984366 36.44203
         [,11]    [,12]    [,13]    [,14]   [,15]    [,16]    [,17]    [,18]
 [1,] 117.2925 193.3753 202.0940 157.0501 89.9083 26.44127 17.05373 53.40315
         [,19]    [,20]    [,21]    [,22]    [,23]    [,24]    [,25]    [,26]
 [1,] 65.20086 37.33463 17.71247 27.37268 41.83289 48.46916 58.94969 76.05099
 ...

 If I do 100,000, I get the same empty appearance.  I've assumed that
 there must be a limitation on the number of elements in a matrix.  Is
 that right?  If so, how do I increase the maximum number of
 elements?  I tried another machine's installation of R and it
 apparently doesn't have a 99,999 element limit.  I've tried using:

 R --max-mem-size=2G
 R --max-vsize=20
 R --max-nsize=20
 R --max-vsize=20 --max-nsize=20 --max-ppsize=20
 R --max-vsize=10M

 I still end up with the empty-looking matrix when I try these.  How
 do I get my installation to work like the installation on another
 computer I tried where I was able to have larger matrices?

 Oh yeah, I also tried this, just to rule out problems with my data:

   tst <- matrix(seq(1,158991),nrow=1,ncol=158991)
   tst
       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11] [,12] [,13] [,14]
      [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
      [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38]
 ...

 Thanks,
 Rob

 Robert W. Leach
 Scientific Programmer
 Center for Computational Research
 Center of Excellence in Bioinformatics
 University at Buffalo
 http://www.ccr.buffalo.edu/


Re: [R] Newbie: Incorrect number of dimensions

2008-03-01 Thread jim holtman

It would be helpful if you provided commented, minimal,
self-contained, reproducible code.

What does str(all_differ) say?  That will tell you the structure of
the object that you are trying to work with.
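
A minimal sketch of what str() might reveal, assuming the object turns out to be a plain numeric matrix with named columns; the matrix below is a made-up stand-in, not output from multtest:

```r
# Made-up stand-in for a matrix of raw and adjusted p-values.
adjp <- matrix(c(0.001, 0.20, 0.03,
                 0.004, 0.50, 0.09),
               ncol = 2,
               dimnames = list(NULL, c("rawp", "BY")))

str(adjp)   # reports a numeric matrix with 3 rows and 2 columns

# [[1]] on a matrix extracts a single number, which has no columns,
# so indexing it with [, "BY"] fails with the error from the question.
err <- tryCatch(adjp[[1]][, "BY"], error = function(e) conditionMessage(e))

# Indexing the matrix itself by column name works.
keep <- which(adjp[, "BY"] <= 0.01)
```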

On Sat, Mar 1, 2008 at 3:35 AM, Keizer_71 [EMAIL PROTECTED] wrote:

  dim(data.sub)
 [1] 1   140

 #extracting all differentially express genes##
 library(multtest)
 two_side <- (1-pt(abs(data.sub),50))*2
 diff <- mt.rawp2adjp(two_side)
 all_differ <- diff[[1]][37211:1,]
 all_differ

 #list of differentially expressed genes##
  probe.names <-
 + all_differ[[2]][all_differ[[1]][,"BY"]<=0.01]

 Error in all_differ[[1]][, "BY"] : incorrect number of dimensions

 Hi,

 I am pretty new to R. What I am trying to do is find all differentially
 expressed genes and produce a list of them. Am I doing
 something wrong?

 I keep getting "incorrect number of dimensions". How do I find out the correct
 dimensions?

 thanks,
 Keizer

 --
 View this message in context: 
 http://www.nabble.com/Newbie%3A-Incorrect-number-of-dimensions-tp15773090p15773090.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] Fwd: Re: How to create following chart for visualizing multivariate time series

2008-03-01 Thread jim holtman

If you want color, then a slight addition to Henrique's solution will do it:

x <- y <- seq(-4*pi, 4*pi, len=27)
r <- sqrt(outer(x^2, y^2, "+"))
library(lattice)
colkey <- colorRampPalette(c('red','yellow','green'))(32)
levelplot(r, colorkey=list(col=colkey),
   col.regions=(col=colkey))

On Sat, Mar 1, 2008 at 6:08 PM, Henrique Dallazuanna [EMAIL PROTECTED] wrote:
 This works for me:

 x <- y <- seq(-4*pi, 4*pi, len=27)
 r <- sqrt(outer(x^2, y^2, "+"))
 library(lattice)
 levelplot(r, colorkey=list(col=gray((0:32)/32)),
col.regions=(col=gray((0:32)/32)))

 'r' is a matrix for you?

 On 01/03/2008, David Winsemius [EMAIL PROTECTED] wrote:
  Henrique Dallazuanna [EMAIL PROTECTED] wrote in
  news:[EMAIL PROTECTED]:
 
  library(lattice)
  levelplot(r, colorkey=list(col=gray((0:32)/32)),
   col.regions=(col=gray((0:32)/32)))
 
  When I try that example, I get an error, even after updating lattice.
 
   levelplot(r, colorkey=list(col=gray((0:32)/32)),
  +  col.regions=(col=gray((0:32)/32)))
  Error in UseMethod("levelplot") : no applicable method for "levelplot"
 
  If I simply change colorkey=FALSE to colorkey=TRUE in the first levelplot
  help page example, I have what looks to me as success.
 
  levelplot(z~x*y, grid, cuts = 50, scales=list(log="e"), xlab="",
ylab="", main="Weird Function", sub="with log scales",
colorkey = TRUE,
region = TRUE)
 
  --
  David Winsemius
 
  
   On 29/02/2008, Megh Dal [EMAIL PROTECTED] wrote:
   Hi Jim, I think you did not get my point. I did not want to put
   red-blue color there. I want to add a palette which will describe
   the values of r. Please have a look at the following:
   http://bp0.blogger.com/_k3l6qPzizGs/RvDVglPknRI/AKo/itlWOvuuO
   tI/s1600-h/pairwise_kl_window60.png. Please see how a color palette
   is added on the right side of this plot describing the value of red
   color, value of blue color etc.
  
 Is there any solution?
  
 Regards,
  
  


 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O


Re: [R] an efficient pairwise matrix cell's comparison function

2008-03-02 Thread jim holtman

Does this do what you want?

> A <- matrix(sample(0:2, 25, TRUE), ncol=5)
> B <- matrix(1:25, ncol=5)
> C <- ifelse(A == 0, 0, B)
> A
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    2    1
[2,]    1    0    1    1    0
[3,]    0    0    1    0    2
[4,]    0    1    2    0    0
[5,]    1    2    1    2    2
> B
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    6   11   16   21
[2,]    2    7   12   17   22
[3,]    3    8   13   18   23
[4,]    4    9   14   19   24
[5,]    5   10   15   20   25
> C
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    6   11   16   21
[2,]    2    0   12   17    0
[3,]    0    0   13    0   23
[4,]    0    9   14    0    0
[5,]    5   10   15   20   25
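
For very large matrices, multiplying by a logical mask is an alternative that avoids the extra work ifelse() does. A minimal sketch, using the small A and B from the question below and applying the stated rule (C is 0 where A is 0, otherwise B):

```r
# A and B as given in the question, entered row by row.
A <- rbind(c(0, 0, 3, 5),
           c(0, 0, 6, 4),
           c(0, 0, 5, 0))
B <- rbind(c(6, 0, 0, 5),
           c(0, 4, 3, 5),
           c(1, 0, 0, 9))

# (A != 0) is TRUE/FALSE, which acts as 1/0 in arithmetic.
C_mask <- B * (A != 0)

# Reference result from the ifelse() approach shown above.
C_ifelse <- ifelse(A == 0, 0, B)
```

One caveat: if B contains NA where A is 0, the mask version yields NA there while ifelse() yields 0.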



On Sun, Mar 2, 2008 at 7:11 AM, Diogo André Alagador
[EMAIL PROTECTED] wrote:
 To all,



 I am undergoing an analysis involving big matrices of about 3x200 which
 I have to handle in a more efficient way. So I would like some advice to
 build such efficient function to deliver the following result:



 -  starting with 2 matrices of the same dimension (eg. A and B)

       0  0  3  5         6  0  0  5
 A=    0  0  6  4    B=   0  4  3  5
       0  0  5  0         1  0  0  9

 -  the function should deliver a C matrix (same dimension too),
 whose entry at each position C(i,j) is determined by comparing A and B:

  if A(i,j)=0, then C(i,j)=0,

  if A(i,j)!=0, then C(i,j)=B(i,j)

       6  0  0  5
 C=    0  0  3  5
       0  0  0  0



 Although not an expert I could build a function with 2 cycles (reading
 columns and rows) which is not quick. Maybe you can help me in this
 challenge.



 Much thanks in advance,




 Diogo André Alagador
 Biodiversity & Global Change Lab, Museo Nacional de Ciencias Naturales,
 CSIC, Madrid, España
 Forest Research Centre, Instituto Superior de Agronomia, Universidade
 Técnica de Lisboa, Lisboa, Portugal



Re: [R] R data Export to Excel

2008-03-02 Thread jim holtman

If you are asking how to convert to multiple columns in Excel, look at
the "Text to Columns" option, which I think is on the Data tab.
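
Another option is to write a genuinely comma-separated file in the first place, since write.table() separates fields with spaces by default. A minimal sketch with made-up data; the second probe ID and both variances are invented for illustration:

```r
# Stand-in for the probe table (second ID and all variances are made up).
probes <- data.frame(ProbeID  = c("224588_at", "201234_s_at"),
                     Variance = c(21.58, 3.07))

# write.csv() uses commas, so Excel splits the columns on opening.
# row.names = FALSE drops the leading index column.
out <- file.path(tempdir(), "variance.csv")
write.csv(probes, file = out, row.names = FALSE)

# Read the file back to confirm that it has two columns.
check <- read.csv(out)
```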

On Sun, Mar 2, 2008 at 9:59 PM, Keizer_71 [EMAIL PROTECTED] wrote:

 Here is my R Code

 x <- 1:2
 y <- 2:141
 data.matrix <- data.matrix(data[,y])#create data.matrix
 variableprobe <- apply(data.matrix[x,],1,var)
 variableprobe #output variance across probesets
 hist(variableprobe) #displaying histogram of variableprobe
 write.table(cbind(data[1],
 Variance=apply(data[,y],1,var)),file='c://variance.csv')
 #export as a .csv file.

 Output in Excel
 all in 1 column.

 ProbeID Variance
 1 224588_at 21.5825745738848

 How do I separate them so that I can have three columns

 ProbeID  Variance
 1   224588_at   21.582.

 thanks,
 Kei


 --
 View this message in context: 
 http://www.nabble.com/R-data-Export-to-Excel-tp15796903p15796903.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] help for the first poster- a simple question

2008-03-03 Thread jim holtman

FAQ 7.31  (You need to understand what floating point numbers are)
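
In practice this means comparing within a tolerance rather than with ==. A minimal sketch using the numbers from the question; the 1e-12 threshold is an arbitrary assumption:

```r
ld <- sqrt(1 * 0.05 * 0.95 * 0.05 * 0.95)
target <- 0.05 * 0.95

# Exact comparison is unreliable for floating point results.
exact <- (target - ld) == 0

# all.equal() compares within a small relative tolerance.
close_enough <- isTRUE(all.equal(target, ld))

# An explicit absolute tolerance works as well.
within_tol <- abs(target - ld) < 1e-12
```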

On 3/3/08, Xuejun Qin [EMAIL PROTECTED] wrote:
 Hi, there,
 I cannot  get accurate value  for calculation.
 for example:
 ld <- sqrt(1*0.05*0.95*0.05*0.95)
 0.05*0.95-ld=-6.938894e-18
 0.05*0.95-ld==0 is False.

 I met this problem in my program, how can I handle it. Thanks.


 xj.


Re: [R] Tapply for Group Specific Means and Proportions

2008-03-03 Thread jim holtman

 1 3 3 3 3 ...
  $ TreeHt   : num  6 6 6 6 8 8 7 7 7 7 ...
   test <- sort((tapply(Final$TreeHt, INDEX=interaction(Final$testdate,
 Final$testtime),  FUN=mean, na.rm=TRUE)))
   data.frame(test)
   test
 28Mar96.0752  6.00
 28Mar96.1014  7.00
 28Mar96.0924  7.33
 29Mar96.0835  8.928571
 28Mar96.0954 10.00


Re: [R] simulation study using R

2008-03-03 Thread jim holtman

What is the format of the data you are storing (single value,
multivalued vector, matrix, dataframe, ...)?  This will help formulate
a solution.  What do you plan to do with the data?  Are you going to
do further analysis, write it to flat files, store it in a data base,
etc.?  How big are the data objects you are manipulating?

On Mon, Mar 3, 2008 at 7:05 PM, Davood Tofighi [EMAIL PROTECTED] wrote:
 Dear All,

 I am running a Monte Carlo simulation study and have some questions on how
 to manage data storage efficiently at the end of each 1000 replication loop.
 I have three conditions coded using the FOR {} loops and a FOR loop that
 generates data for each condition, performs analysis, and computes a
 statistic 1000 times. Therefore, for each condition, I will have 1000
 statistic values. My question is what's the best way to store the 1000
 statistic for each condition. Any suggestion on how to manage such
 simulation studies is greatly appreciated.
 Thanks,

 --
 Davood Tofighi
 Department of Psychology
 Arizona State University


Re: [R] simulation study using R

2008-03-04 Thread jim holtman

One of the things you might take a look at is the 'filehash' package.
It is an easy way of storing/retrieving R objects.  I have an
application where my objects are matrices of about the same size and I
can quickly store the data and then come back later with a different
script to do further analysis.
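
A minimal sketch of the store-now, analyse-later pattern. It uses base R's saveRDS() and readRDS() instead of filehash, and the condition count, sample size and the four statistics are placeholders rather than the poster's actual design:

```r
set.seed(1)     # arbitrary seed, only for reproducibility
n_cond <- 3     # placeholder number of conditions
n_rep  <- 1000  # replications per condition

# One preallocated 1000 x 4 matrix per condition, kept in a list.
results <- vector("list", n_cond)
for (cond in seq_len(n_cond)) {
  stats <- matrix(NA_real_, nrow = n_rep, ncol = 4,
                  dimnames = list(NULL, c("mean", "sd", "min", "max")))
  for (r in seq_len(n_rep)) {
    y <- rnorm(20, mean = cond)   # placeholder data generation
    stats[r, ] <- c(mean(y), sd(y), min(y), max(y))
  }
  results[[cond]] <- stats
}

# Save everything to one file and read it back in a later session.
path <- file.path(tempdir(), "sim_results.rds")
saveRDS(results, path)
restored <- readRDS(path)
```

A list of matrices keeps each condition separate while still allowing lapply() or sapply() over conditions for summaries and plots.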

On 3/3/08, Davood Tofighi [EMAIL PROTECTED] wrote:
 Thanks for your reply. For each condition, I will have a matrix or data
 frame of 1000 rows and 4 columns. I also have a total of 64 conditions for
 now. So, in total, I will have 64 matrices or data frames of 1000 rows and 4
 columns. The format of data I would like to store would be data frames or
 matrices. I also would like to store the data for later use, e.g., a plot of
 the empirical distribution of the chi^2, or to compute the power of Chi^2
 across 1000 reps for each condition.

 On Mon, Mar 3, 2008 at 7:03 PM, jim holtman [EMAIL PROTECTED] wrote:
  What is the format of the data you are storing (single value,
  multivalued vector, matrix, dataframe, ...)?  This will help formulate
  a solution.  What do you plan to do with the data?  Are you going to
  do further analysis, write it to flat files, store it in a data base,
  etc.?  How big are the data objects you are manipulating?
 
 
 
 
  On Mon, Mar 3, 2008 at 7:05 PM, Davood Tofighi [EMAIL PROTECTED] wrote:
   Dear All,
  
   I am running a Monte Carlo simulation study and have some questions on how
   to manage data storage efficiently at the end of each 1000-replication loop.
   I have three conditions coded using FOR {} loops and a FOR loop that
   generates data for each condition, performs analysis, and computes a
   statistic 1000 times. Therefore, for each condition, I will have 1000
   statistic values. My question is what's the best way to store the 1000
   statistics for each condition. Any suggestion on how to manage such
   simulation studies is greatly appreciated.
   Thanks,
  
   --
   Davood Tofighi
   Department of Psychology
   Arizona State University
  
  
 
 
 
  --
  Jim Holtman
  Cincinnati, OH
  +1 513 646 9390
 
  What is the problem you are trying to solve?
 



 --
 Davood Tofighi
 Department of Psychology
 Arizona State University
 P.O. BOX 871104
 Tempe, AZ 85287-1104
 Tel.:480-727-7884


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] memory constraints in ubuntu gutsy

2008-03-04 Thread jim holtman

What type of data do you have?  Will it be numeric or factors?  If it
is all numeric, then you will need over 4GB just to hold one copy of
the object (700,000 * 800 * 8).  That is to hold the final object; I
don't know how much additional space is required during the
processing.
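
The arithmetic behind that estimate, as a quick sketch. It assumes every column is stored as a double (8 bytes per value) and ignores any per-object overhead; the variable names are just for illustration.

```r
rows <- 700000             # "almost 700,000 rows"
cols <- 800                # "over 800 columns"
bytes_per_double <- 8      # R stores numeric values as 8-byte doubles

total_bytes <- rows * cols * bytes_per_double
total_bytes                # 4,480,000,000 bytes
total_bytes / 2^30         # roughly 4.17 GiB for a single copy

# Sanity check on a small numeric matrix: bytes per cell is just over 8
small <- matrix(0, nrow = 1000, ncol = cols)
as.numeric(object.size(small)) / (1000 * cols)
```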

What are you going to do with all of it at once?  Can you read it in
in parts, store it in a database, and then just retrieve the columns
you need for processing?  Your machine is probably not large enough
to hold a single copy, and you would have to be using a 64-bit
version of R.
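
A rough sketch of the "read it in parts" idea using base R only. The small temporary file, its column names, and the chunk size are invented for the example, and the database step is left out: the loop keeps running totals where a real script might append each chunk to a database table.

```r
# Build a small tab-delimited file standing in for the real 700,000 x 800 one
path <- tempfile(fileext = ".txt")
demo <- data.frame(id = 1:10, x = (1:10) / 2, y = letters[1:10])
write.table(demo, path, sep = "\t", row.names = FALSE, quote = FALSE)

# Option 1: read only the columns you need; "NULL" in colClasses skips a column
wanted <- read.delim(path, colClasses = c("integer", "numeric", "NULL"))

# Option 2: read the file in chunks through an open connection
con <- file(path, open = "r")
header <- strsplit(readLines(con, n = 1), "\t")[[1]]
chunk_size <- 4
total_x   <- 0
rows_seen <- 0
repeat {
  chunk <- tryCatch(
    read.table(con, nrows = chunk_size, sep = "\t", col.names = header,
               colClasses = c("integer", "numeric", "character")),
    error = function(e) NULL)          # read.table errors once no lines remain
  if (is.null(chunk)) break
  total_x   <- total_x + sum(chunk$x)  # process the chunk, then let it go
  rows_seen <- rows_seen + nrow(chunk)
  if (nrow(chunk) < chunk_size) break  # a short chunk means end of file
}
close(con)
```

Only one chunk is in memory at a time, so peak memory use depends on chunk_size and not on the size of the file.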

On 3/4/08, Randy Griffiths [EMAIL PROTECTED] wrote:
 Hello All,

 I have a very large data set (1.1GB) that I am trying to read into R. The
 file is tab delimited and contains headers; there are over 800 columns and
 almost 700,000 rows. I am using the Ubuntu 7.10 Gutsy Gibbon version of R. I
 am using Kernel Linux 2.6.22-14-generic. I have 3.1GB of RAM with the AMD
 Athlon(tm) 64 Processor 3200+. I downloaded R using the instructions from
 cran under Linux-Ubuntu.

 I need to be able to read the whole data set into R, but when I try right
 now, it will only use 4.2GB of the swap space (50% of the 8.5GB currently
 available) and won't go any further. I am new to Linux, but anxious to
 learn. Is there a memory constraint with this build of R? or is this
 something that can be fixed with hardware (like more RAM)? I thought that a
 64bit version of R would be able to handle data of this magnitude. Is there
 a different version of Linux that is better for reading in large data sets
 such as this one?

 I know that databases can be used for large data, but I need to run
 discriminant analysis or randomForest on all of the variables.

 Any of your suggestions would be very much appreciated.

 Sincerely,

 Randy Griffiths




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

