date:20091101

[R] counting frequencies across two columns

2009-11-01 Thread Jason Priem

I've got a data frame describing comments on an electronic journal, 
wherein each row is a unique comment, like so:


 commentID  author articleID
1 1   smith 2
2 2   jones 3
3 3 andrews 2
4 4   jones 1
5 5 johnson 3
6 6   smith 2

I want know the number of unique authors per article.  I can get a table 
of article frequencies with table(articleID), but I can't figure out how 
to count frequencies in a different column.  I'm sure there's an easy 
way, but I guess I'm too new at this to find it.  Thanks for your help!


Jason Priem
PhD student, School of Information and Library Science, University of 
North Carolina-Chapel Hill


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] counting frequencies across two columns

2009-11-01 Thread milton ruser

Hi Jason,

As your example is not reproducible, may be something like:

myFreq-data.frame(table(articleID, author))

if you want to know only those articles with 1 author, you can try

subset(myFreq, Freq==1)

or something like.

bests

milton

On Sun, Nov 1, 2009 at 2:20 AM, Jason Priem pr...@email.unc.edu wrote:

 I've got a data frame describing comments on an electronic journal, wherein
 each row is a unique comment, like so:

  commentID  author articleID
 1 1   smith 2
 2 2   jones 3
 3 3 andrews 2
 4 4   jones 1
 5 5 johnson 3
 6 6   smith 2

 I want know the number of unique authors per article.  I can get a table of
 article frequencies with table(articleID), but I can't figure out how to
 count frequencies in a different column.  I'm sure there's an easy way, but
 I guess I'm too new at this to find it.  Thanks for your help!

 Jason Priem
 PhD student, School of Information and Library Science, University of North
 Carolina-Chapel Hill

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] counting frequencies across two columns

2009-11-01 Thread Patrick Connolly

On Sun, 01-Nov-2009 at 01:20AM -0500, Jason Priem wrote:

 I've got a data frame describing comments on an electronic journal,  
 wherein each row is a unique comment, like so:

  commentID  author articleID
 1 1   smith 2
 2 2   jones 3
 3 3 andrews 2
 4 4   jones 1
 5 5 johnson 3
 6 6   smith 2

Let's call that dataframe x


 I want know the number of unique authors per article.  I can get a table  
 of article frequencies with table(articleID), but I can't figure out how  
 to count frequencies in a different column.  I'm sure there's an easy  
 way, but I guess I'm too new at this to find it.  

I'm not clear what you require, but maybe it's this:

 with(x, table(articleID, author))

articleID andrews johnson jones smith
1   0   0 1 0
2   1   0 0 2
3   0   1 1 0

Is that anything like what you're after?  


-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] wilcox.test construction in r

2009-11-01 Thread jomni


Hi,  I am very confused with constructing the wilcox.test in R.
I have two populations 'original' and 'test'.
I want to know if the 'test' is generally 'lower' than original.
I use alpha of 0.05.

So do I write the function as wilcox.test(original, test, alternative=l)?
or wlcox.test(original, test, alternative = g)?
or wilcox.test(test, original, alternative=g)?
or wilcox.test(test, original, alternative=l)?

How do I interpret the p-value given my criteria?
Do I reject null when p-value less than 0.05? 
or greater than 0.95?

Not a statistics major here so I'm really confused. 
Need some help.
Thanks.
-- 
View this message in context: 
http://old.nabble.com/wilcox.test-construction-in-r-tp26148779p26148779.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] wilcox.test construction in r

2009-11-01 Thread Stefan Grosse

On Sun, 1 Nov 2009 00:47:50 -0700 (PDT) jomni jom...@gmail.com wrote:

J So do I write the function as wilcox.test(original, test,
J alternative=l)? or wlcox.test(original, test, alternative = g)?
J or wilcox.test(test, original, alternative=g)?
J or wilcox.test(test, original, alternative=l)?

J How do I interpret the p-value given my criteria?
J Do I reject null when p-value less than 0.05? 
J or greater than 0.95?

The interpretation of the p depends on how you have tested the
hypothesis.

J Not a statistics major here so I'm really confused. 

You don't need to be that but please read the documentation and try the
given examples in the documentation.

If you would have typed example(wilcox.test) you would have seen for
example:

wlcx.t ## Two-sample test.
wlcx.t ## Hollander  Wolfe (1973), 69f.
wlcx.t ## Permeability constants of the human chorioamnion (a placental
wlcx.t ##  membrane) at term (x) and between 12 to 26 weeks gestational
wlcx.t ##  age (y).  The alternative of interest is greater
permeability 
wlcx.t ##  of the human chorioamnion for the term
pregnancy. 
wlcx.t x - c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91,
1.64, 0.73, 1.46)

wlcx.t y - c(1.15, 0.88, 0.90, 0.74, 1.21)

wlcx.t wilcox.test(x, y, alternative = g)# greater

Wilcoxon rank sum test

data:  x and y 
W = 35, p-value = 0.1272
alternative hypothesis: true location shift is greater than 0 


This I think makes it very easy to interprete. Here it is tested as the
text says whether x is greater than y. So if you want to test the
hypothesis that x is smaller than y so you do
wilcox.test(x,y,alternative=less)
then the lower your p is the higher is the probability that the samples
are different. hence p0.05 would match your confidence level. Now the
surprising news:
wilcox.test(y,x,alternative=greater)
would work as well! 

If you are in doubt create an x and an y where you are sure that x is
smaller than y.

One final remark: if you have ties (several identical values in one
sample) you should use wilcox_test of the coin package.

hth
Stefan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] re ct.hclust and horizontal dendrograms

2009-11-01 Thread _nico_


As from subject: is it possible to use the rect.hclust (or something
equivalent) with horizontal dendrograms?

thanks
nico
-- 
View this message in context: 
http://old.nabble.com/rect.hclust-and-horizontal-dendrograms-tp26148886p26148886.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] convert list to Dataframe

2009-11-01 Thread onyourmark


Hi. I have a huge list called twitter:

 dim(twitter)
NULL
 str(twitter)
List of 1
 $ :Classes 'PlainTextDocument', 'TextDocument', 'character'  atomic
[1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For
Governance From Campaigner-in-chief: President obama jumps  campaign 09 
tuesday.. http://bit.ly/2eHMaN;Florida;USA;FL;;;27.6648274;-81.5157535
12210;10:47:37;20;10;2009;David_Stringer;William Hague heading  Washington 
meets  Gen. Jim Jones, Sen. John McCain  others. Will Obama team raise
worries  EU ties?;London, England;United Kingdom;Greater
London;Westminster;;51.5001524;-0.1262362
12355;10:47:53;20;10;2009;Singsabit;RT @Drudge_Report PAPER: Excuses wearing
thin  Obama, media pals... http://tinyurl.com/yfw6cd9;So.
California;USA;CA;;;36.778261;-119.4179324
12407;10:47:59;20;10;2009;obamavideonews;Obama News Obama   Afghanistan
troop decision timing (AFP) : AFP - Pres.. http://bit.ly/3KPUr8 #obama
#video;USA;USA37.09024;-95.712891 ...
  .. ..- attr(*, Author)= chr(0) 
  .. ..- attr(*, DateTimeStamp)= POSIXlt[1:9], format: 2009-10-31
04:46:56
  .. ..- attr(*, Description)= chr(0) 
  .. ..- attr(*, Heading)= chr(0) 
  .. ..- attr(*, ID)= chr 1
  .. ..- attr(*, Language)= chr en
  .. ..- attr(*, LocalMetaData)= list()
  .. ..- attr(*, Origin)= chr(0) 
 - attr(*, CMetaData)=List of 3
  ..$ NodeID  : num 0
  ..$ MetaData:List of 2
  .. ..$ create_date: POSIXlt[1:9], format: 2009-10-31 04:46:56
  .. ..$ creator: Named chr 
  .. .. ..- attr(*, names)= chr LOGNAME
  ..$ Children: NULL
  ..- attr(*, class)= chr MetaDataNode
 - attr(*, DMetaData)='data.frame':   1 obs. of  1 variable:
  ..$ MetaID: num 0
 - attr(*, class)= chr [1:3] VCorpus Corpus list

It contains tweets but in many languages. The columns are separated by
semi-colons. I am using the tm package and it is a corpus.

It looks like this:

547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1   day
:p;Huddersfield/Lincoln;United
Kingdom;Kirklees;Kirklees;;53.6468475;-1.7727296
547283;06:37:17;21;10;2009;fabiomafra;alguém traz mais lenha pro computador
da facool? BOM DIA.;Belo Horizonte - MG -
BR;Brazil;MG;;;-19.8157306;-43.9542226
547284;06:37:17;21;10;2009;romanotr;Вау, Репортеры без границ опубликовали
список стран со свободой слова, из 173 Грузия на 81 месте опережая Украину.
Успехи,успехи...;Portugal Aveiro;Portugal;Aveiro;;;40.6411848;-8.6536169
547285;06:37:18;21;10;2009;Y_T_;Playing: Beth Orton lt\;Someone's
Daughtergt\;;Kanazawa, Japan;Japan;Ishikawa
Prefecture;;;36.5613254;136.6562051
Error: invalid input
'547286;06:37:18;21;10;2009;Atogey;æ”¯æŒä½ ï¼Œå›½å®¶éœ€è¦ä»–ä»¬ï¼Œä½†æ˜¯å›½å®¶çš„æœªæ¥ä¸èƒ½é ä»–ä»¬â€¦RT
@zuola ï¿¿æˆ‘è§‰å¾— @wenyunc

I want to convert it to fields or columns and so I thought I should
convert it to a dataframe. I tried

 twitterDF-as.data.frame(twitter)
Error in sort.list(y) : 
  invalid input
'547286;06:37:18;21;10;2009;Atogey;æ”¯æŒä½ ï¼Œå›½å®¶éœ€è¦ä»–ä»¬ï¼Œä½†æ˜¯å›½å®¶çš„æœªæ¥ä¸èƒ½é ä»–ä»¬â€¦RT
@zuola ï¿¿æˆ‘è§‰å¾— @wenyunchao
ä¸€ç‚¹éƒ½ä¸ä¹è§‚ã€‚çœŸæ£çš„ä¹è§‚åº”è¯¥æ˜¯ï¼šä½ å…³æˆ‘åˆæ€Žä¹ˆæ ·ï¼Œåæ£æ”¿æ²»æ–—äº‰ä¸ä¼šä¸¢æŽ‰æ€§å‘½ï¼Œè€åå‡ºæ¥åŽæ›´æ˜¯ä¸€æ¡å¥½æ±‰ã€‚åŒ—é£Žè¿˜æ˜¯èˆä¸å¾—*éœ¸åœ°ä½ã€è‚‰ã€ä¹¦ã€å¥³äººå’Œç½‘ç»œçš„ï¼Œä¸è¿‡ç‰¢é‡Œä¸ä¼šæä¾›è¿™äº›ã€‚å¦â€¦;å±±è¥¿ï¼Œæµ™æ±Ÿ;China;Zhejiang;;;28.695035;119.751054'
in 'utf8towcs'
 

Can anyone suggest what I can do? 

P.S. Actually, I would love to remove all the non-English tweets but I have
no clue about how to do that.

-- 
View this message in context: 
http://old.nabble.com/convert-list-to-Dataframe-tp26148889p26148889.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to properly shade the background panels of an xyplot?

2009-11-01 Thread Deepayan Sarkar

On Fri, Oct 30, 2009 at 8:33 PM, Ottorino-Luca Pantani
ottorino-luca.pant...@unifi.it wrote:
 Dear R users,
 this is a follow up of this message
 http://tolstoy.newcastle.edu.au/R/e6/help/09/05/13897.html
 I'm reproducing the core of it for convenience.

 //
 / data(Oats, package = MEMSS) /
 / tp1.oats - xyplot(yield ~ nitro | Variety + Block, /
 /                      data = Oats, /
 /                      panel = function(x, y, subscripts, ...) { /
 /                                # How to normalize my heatmap metric with
 /
 /                                # the value of the panel that has maximum
 average ? /
 /                                # metric = eval(mean(y)/
 max(mean-of-each-panel) /
 /                                metric = eval(mean(y)/max(Oats$yield)) /
 /                                panel.fill(col = gray(metric)) /
 /                                panel.lines(x,y) /
 /                          } /
 /                        ) /
 / print(tp1.oats) /
 //

 xyplot(yield ~ nitro | Variety + Block,

       data = Oats,
       max.mean = max(with(Oats, tapply(yield, list(Variety, Block),
 mean))),
       panel = function(x, y, subscripts, max.mean, ...) {
           metric = mean(y)/max.mean
           panel.fill(col = gray(metric))
           panel.lines(x,y)
       })


 or

 xyplot(yield ~ nitro | Variety + Block,

       data = Oats,
       aux.env = new.env(parent = emptyenv()),
       prepanel = function(x, y, aux.env, ...) {
           aux.env$max.mean.y - max(aux.env$max.mean.y, mean(y))
           list()
       },
       panel = function(x, y, subscripts, aux.env, ...) {
           metric = mean(y) / aux.env$max.mean.y
           panel.fill(col = gray(metric))
           panel.lines(x,y)
       })

 -Deepayan

 The result is a trellis object in which the background colour of the panels
 is an outcome of the data contained in the panel itself. After all, this is
 what panel = function (x,y . is meant for, right ?

 But what, if I want to highlight some panels ? Arbitrarily or conditioned by
 another variable.

 Say I want to shade in gray only the upper right panels (Block VI, Victory
 and Marvellous varieties )

See ?which.packet, which will tell you the current levels of the
conditioning variables. So something like

  panel = function(x, y, subscripts, aux.env, ...) {
  wp - which.packet()
  if (levels(Oats$Variety)[wp[1]] %in% c(Victory,
Marvellous) || ...)
  panel.fill(col = gray(metric))
 panel.lines(x,y)
 })

-Deepayan


 Given a data frame like this, with a variable intended to set the colour of
 the panel background
 /
 data(Oats, package = MEMSS)
 /Oats1 - cbind.data.frame(Oats,
                            Highlight =
                            ifelse(Oats$Block == VI 
                                   Oats$Variety %in% c(Victory,
 Marvellous ),
                                   gray,
                                   transparent)
                            )

 which is a possible code ?

 I (more or less) know how to manage the data in the panel,
 but I cannot imagine how to do it with variables external to the panel
 itself.

 I suppose that the panel functions are not useful here.
 I'm wandering through par.settings, themes and panel.fill, but still without
 success.
 Any hint ?

 --
 Ottorino-Luca Pantani, Università di Firenze
 Dip. Scienza del Suolo e Nutrizione della Pianta
 P.zle Cascine 28 50144 Firenze Italia
 Ubuntu 8.04.3 LTS -- GNU Emacs 23.0.60.1 (x86_64-pc-linux-gnu, GTK+ Version
 2.12.9)
 ESS version 5.5 -- R 2.9.2

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] counting frequencies across two columns

2009-11-01 Thread David Winsemius



On Nov 1, 2009, at 1:59 AM, Patrick Connolly wrote:


On Sun, 01-Nov-2009 at 01:20AM -0500, Jason Priem wrote:


I've got a data frame describing comments on an electronic journal,
wherein each row is a unique comment, like so:

commentID  author articleID
1 1   smith 2
2 2   jones 3
3 3 andrews 2
4 4   jones 1
5 5 johnson 3
6 6   smith 2


Let's call that dataframe x



I want know the number of unique authors per article.  I can get a  
table
of article frequencies with table(articleID), but I can't figure  
out how

to count frequencies in a different column.  I'm sure there's an easy
way, but I guess I'm too new at this to find it.


I'm not clear what you require, but maybe it's this:


with(x, table(articleID, author))


articleID andrews johnson jones smith
   1   0   0 1 0
   2   1   0 0 2
   3   0   1 1 0

Is that anything like what you're after?


You've had two guesses so far and my guess increments the count.

Were you attempting to specify this?

df1 - read.table(textConnection(commentID  author articleID
1 1   smith 2
2 2   jones 3
3 3 andrews 2
4 4   jones 1
5 5 johnson 3
6 6   smith 2), header=T)

 lapply( lapply(tapply(df1$author, df1$articleID, I), unique) ,  
length)

$`1`
[1] 1

$`2`
[1] 2

$`3`
[1] 2

Or delivered in matrix form (and using Connolly's approach as  
intermediate:


 apply( with(df1, table(articleID, author)), 1, function(x) sum(x0) )
1 2 3
1 2 2




--
~ 
.~ 
.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

  ___Patrick Connolly

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] convert list to Dataframe

2009-11-01 Thread David Winsemius


Three suggestions:

-- drop the idea of using a dataframe. It's only appropriate when the  
data is rectangular.

-- look at strsplit for separating at @ characters.
-- post the output of dput() on your sample, since email is probably  
not capable of rendering this data without creating distortions.


--
David

On Nov 1, 2009, at 7:43 AM, onyourmark wrote:



Hi. I have a huge list called twitter:


dim(twitter)

NULL

str(twitter)


This looks to have been converted into an R object through soe process  
on some unspecified input. You should describe that process, and hte  
only unambiguous method of doing so is by including the code.




List of 1
$ :Classes 'PlainTextDocument', 'TextDocument', 'character'  atomic
[1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed  
Lessons For
Governance From Campaigner-in-chief: President obama jumps  campaign  
09

tuesday.. http://bit.ly/2eHMaN;Florida;USA;FL;;;27.6648274;-81.5157535
12210;10:47:37;20;10;2009;David_Stringer;William Hague heading   
Washington

meets  Gen. Jim Jones, Sen. John McCain  others. Will Obama team raise
worries  EU ties?;London, England;United Kingdom;Greater
London;Westminster;;51.5001524;-0.1262362
12355;10:47:53;20;10;2009;Singsabit;RT @Drudge_Report PAPER: Excuses  
wearing

thin  Obama, media pals... http://tinyurl.com/yfw6cd9;So.
California;USA;CA;;;36.778261;-119.4179324
12407;10:47:59;20;10;2009;obamavideonews;Obama News Obama
Afghanistan

troop decision timing (AFP) : AFP - Pres.. http://bit.ly/3KPUr8 #obama
#video;USA;USA37.09024;-95.712891 ...
 .. ..- attr(*, Author)= chr(0)
 .. ..- attr(*, DateTimeStamp)= POSIXlt[1:9], format: 2009-10-31
04:46:56
 .. ..- attr(*, Description)= chr(0)
 .. ..- attr(*, Heading)= chr(0)
 .. ..- attr(*, ID)= chr 1
 .. ..- attr(*, Language)= chr en
 .. ..- attr(*, LocalMetaData)= list()
 .. ..- attr(*, Origin)= chr(0)
- attr(*, CMetaData)=List of 3
 ..$ NodeID  : num 0
 ..$ MetaData:List of 2
 .. ..$ create_date: POSIXlt[1:9], format: 2009-10-31 04:46:56
 .. ..$ creator: Named chr 
 .. .. ..- attr(*, names)= chr LOGNAME
 ..$ Children: NULL
 ..- attr(*, class)= chr MetaDataNode
- attr(*, DMetaData)='data.frame':   1 obs. of  1 variable:
 ..$ MetaID: num 0
- attr(*, class)= chr [1:3] VCorpus Corpus list

It contains tweets but in many languages. The columns are  
separated by

semi-colons. I am using the tm package and it is a corpus.

It looks like this:


It is difficult to see any connection with what you have above.



547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1   day
:p;Huddersfield/Lincoln;United
Kingdom;Kirklees;Kirklees;;53.6468475;-1.7727296
547283;06:37:17;21;10;2009;fabiomafra;alguém traz mais lenha pro  
computador

da facool? BOM DIA.;Belo Horizonte - MG -
BR;Brazil;MG;;;-19.8157306;-43.9542226
547284;06:37:17;21;10;2009;romanotr;Вау, Репортеры  
без границ опубликовали
список стран со свободой слова, из 173  
Грузия на 81 месте опережая Украину.
Успехи,успехи...;Portugal Aveiro;Portugal;Aveiro;;; 
40.6411848;-8.6536169

547285;06:37:18;21;10;2009;Y_T_;Playing: Beth Orton lt\;Someone's
Daughtergt\;;Kanazawa, Japan;Japan;Ishikawa
Prefecture;;;36.5613254;136.6562051
Error: invalid input
'547286;06:37:18;21;10;2009;Atogey;æ”¯æŒä½  
ï¼Œå›½å®¶éœ€è¦ä»– 
ä»¬ï¼Œä½†æ˜¯å›½å®¶çš„æœªæ 
¥ä¸èƒ½é ä»–ä»¬â€¦RT

@zuola ï¿¿æˆ‘è§‰å¾— @wenyunc

I want to convert it to fields or columns and so I thought I should
convert it to a dataframe. I tried


twitterDF-as.data.frame(twitter)

Error in sort.list(y) :
 invalid input
'547286;06:37:18;21;10;2009;Atogey;æ”¯æŒä½  
ï¼Œå›½å®¶éœ€è¦ä»– 
ä»¬ï¼Œä½†æ˜¯å›½å®¶çš„æœªæ 
¥ä¸èƒ½é ä»–ä»¬â€¦RT

@zuola ï¿¿æˆ‘è§‰å¾— @wenyunchao
ä¸€ç‚¹éƒ½ä¸ä¹è§‚ã€‚çœŸæ  
£çš„ä¹è§‚åº”è¯¥æ˜¯ï¼šä½ å… 
³æˆ‘åˆæ€Žä¹ˆæ ·ï¼Œåæ £æ”¿æ²»æ– 
—äº‰ä¸ä¼šä¸¢æŽ‰æ€§å‘½ï¼Œè€å  
å‡ºæ¥åŽæ›´æ˜¯ä¸€æ¡å¥½æ±‰ã€ 
‚åŒ—é£Žè¿˜æ˜¯èˆä¸å¾—*éœ¸åœ°ä½ã 
€è‚‰ã€ä¹¦ã€å 
¥³äººå’Œç½‘ç»œçš„ï¼Œä¸è¿‡ç‰ 
¢é‡Œä¸ä¼šæä¾›è¿™äº›ã€‚å¦â 
€¦;å±±è¥¿ï¼Œæµ™æ±Ÿ;China;Zhejiang;;; 
28.695035;119.751054'

in 'utf8towcs'




Can anyone suggest what I can do?

P.S. Actually, I would love to remove all the non-English tweets but  
I have

no clue about how to do that.

--


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] convert list to Dataframe

2009-11-01 Thread onyourmark


Hello. The fields are separated by a ';'. I think that the data is
rectangular in the sense that there are about 15 fields for each row. Some
of the fields are empty. In the dput() display below, it seems that the rows
are delimited by '  ' .
Any idea from this?

Here is the end of the output for dput(twitter)

4927861;05:04:14;28;10;2009;HOYTSTHEATRES;GameStop Brings  15K  Manage
Holiday Rush [Black Friday]
http://bit.ly/2d3OJg;Australia;Australia-25.274398;133.775136;, 
4927863;05:04:14;28;10;2009;padden;Rachel  master chef  cook 
anytime!;Sydney, Australia;Australia;NSW;;;-33.867139;151.207114, 
4927878;05:04:17;28;10;2009;GSpotMagazine;The penalty  success   bored 
attentions  people  formerly snubbed you. -Mary Wilson Little
#quote;UK;United Kingdom55.378051;-3.435973, 
4927885;05:04:20;28;10;2009;super_assassin;@triplejsr flight  conchords,
pleaaase :) thanks rosie
xx;Australia;Australia-25.274398;133.775136, 
4927893;05:04:21;28;10;2009;SLMFE;Gestern:Achso,ja okey,um 5 nach las ich
jemanden komen der dir die Akupunkturnadel(zb 5!im Ohr!)entfernt..Um 10 n.
kommt immer noch keiner..;Germany;Germany51.165691;10.451526, 
4927901;05:04:23;28;10;2009;mikesemple;HHS Secretary pushes health care
reform  rural America: By Christopher Smart The health-care crisis  ..
http://bit.ly/49Iqcu;London;United Kingdom;Greater
London;Westminster;;51.5001524;-0.1262362, 
4927913;05:04:26;28;10;2009;coax_k;Facebook Headquarters  Studio O+A: San
Francisco based interior design firm Studio O+A  designed  ..
http://bit.ly/hdqWp;Sydney;Australia;NSW;;;-33.867139;151.207114;
), Author = character(0), DateTimeStamp = structure(list(sec =
56.404713898, 
min = 46L, hour = 4L, mday = 31L, mon = 9L, year = 109L, 
wday = 6L, yday = 303L, isdst = 0L), .Names = c(sec, min, 
hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, 
POSIXlt), tzone = GMT), Description = character(0), Heading =
character(0), ID = 1, Language = en, LocalMetaData = list(), Origin =
character(0), class = c(PlainTextDocument, 
TextDocument, character))), CMetaData = structure(list(NodeID = 0, 
MetaData = structure(list(create_date = structure(list(sec =
56.4059998989105, 
min = 46L, hour = 4L, mday = 31L, mon = 9L, year = 109L, 
wday = 6L, yday = 303L, isdst = 0L), .Names = c(sec, 
min, hour, mday, mon, year, wday, yday, isdst
), class = c(POSIXt, POSIXlt), tzone = GMT), creator =
structure(, .Names = LOGNAME)), .Names = c(create_date, 
creator)), Children = NULL), .Names = c(NodeID, MetaData, 
Children), class = MetaDataNode), DMetaData = structure(list(
MetaID = 0), .Names = MetaID, row.names = c(NA, -1L), class =
data.frame), class = c(VCorpus, 
Corpus, list))




onyourmark wrote:
 
 Hi. I have a huge list called twitter:
 
 dim(twitter)
 NULL
 str(twitter)
 List of 1
  $ :Classes 'PlainTextDocument', 'TextDocument', 'character'  atomic
 [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons
 For Governance From Campaigner-in-chief: President obama jumps  campaign
 09  tuesday.. http://bit.ly/2eHMaN;Florida;USA;FL;;;27.6648274;-81.5157535
 12210;10:47:37;20;10;2009;David_Stringer;William Hague heading  Washington 
 meets  Gen. Jim Jones, Sen. John McCain  others. Will Obama team raise
 worries  EU ties?;London, England;United Kingdom;Greater
 London;Westminster;;51.5001524;-0.1262362
 12355;10:47:53;20;10;2009;Singsabit;RT @Drudge_Report PAPER: Excuses
 wearing thin  Obama, media pals... http://tinyurl.com/yfw6cd9;So.
 California;USA;CA;;;36.778261;-119.4179324
 12407;10:47:59;20;10;2009;obamavideonews;Obama News Obama   Afghanistan
 troop decision timing (AFP) : AFP - Pres.. http://bit.ly/3KPUr8 #obama
 #video;USA;USA37.09024;-95.712891 ...
   .. ..- attr(*, Author)= chr(0) 
   .. ..- attr(*, DateTimeStamp)= POSIXlt[1:9], format: 2009-10-31
 04:46:56
   .. ..- attr(*, Description)= chr(0) 
   .. ..- attr(*, Heading)= chr(0) 
   .. ..- attr(*, ID)= chr 1
   .. ..- attr(*, Language)= chr en
   .. ..- attr(*, LocalMetaData)= list()
   .. ..- attr(*, Origin)= chr(0) 
  - attr(*, CMetaData)=List of 3
   ..$ NodeID  : num 0
   ..$ MetaData:List of 2
   .. ..$ create_date: POSIXlt[1:9], format: 2009-10-31 04:46:56
   .. ..$ creator: Named chr 
   .. .. ..- attr(*, names)= chr LOGNAME
   ..$ Children: NULL
   ..- attr(*, class)= chr MetaDataNode
  - attr(*, DMetaData)='data.frame':   1 obs. of  1 variable:
   ..$ MetaID: num 0
  - attr(*, class)= chr [1:3] VCorpus Corpus list
 
 It contains tweets but in many languages. The columns are separated by
 semi-colons. I am using the tm package and it is a corpus.
 
 It looks like this:
 
 547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1   day
 :p;Huddersfield/Lincoln;United
 Kingdom;Kirklees;Kirklees;;53.6468475;-1.7727296
 547283;06:37:17;21;10;2009;fabiomafra;alguém traz mais lenha pro
 computador da facool? BOM DIA.;Belo Horizonte - MG -
 BR;Brazil;MG;;;-19.8157306;-43.9542226

Re: [R] convert list to Dataframe

2009-11-01 Thread Duncan Murdoch


On 01/11/2009 7:43 AM, onyourmark wrote:

Hi. I have a huge list called twitter:


It's a list, but more importantly it's a VCorpus and a Corpus.  You 
should use the functions appropriate to those classes to extract the 
strings making up the data, declare their encoding properly (or convert 
them to your native encoding), then use read.delim() on a textConnection 
to read them in.


Duncan Murdoch




dim(twitter)

NULL

str(twitter)

List of 1
 $ :Classes 'PlainTextDocument', 'TextDocument', 'character'  atomic
[1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For
Governance From Campaigner-in-chief: President obama jumps  campaign 09 
tuesday.. http://bit.ly/2eHMaN;Florida;USA;FL;;;27.6648274;-81.5157535
12210;10:47:37;20;10;2009;David_Stringer;William Hague heading  Washington 
meets  Gen. Jim Jones, Sen. John McCain  others. Will Obama team raise

worries  EU ties?;London, England;United Kingdom;Greater
London;Westminster;;51.5001524;-0.1262362
12355;10:47:53;20;10;2009;Singsabit;RT @Drudge_Report PAPER: Excuses wearing
thin  Obama, media pals... http://tinyurl.com/yfw6cd9;So.
California;USA;CA;;;36.778261;-119.4179324
12407;10:47:59;20;10;2009;obamavideonews;Obama News Obama   Afghanistan
troop decision timing (AFP) : AFP - Pres.. http://bit.ly/3KPUr8 #obama
#video;USA;USA37.09024;-95.712891 ...
  .. ..- attr(*, Author)= chr(0) 
  .. ..- attr(*, DateTimeStamp)= POSIXlt[1:9], format: 2009-10-31

04:46:56
  .. ..- attr(*, Description)= chr(0) 
  .. ..- attr(*, Heading)= chr(0) 
  .. ..- attr(*, ID)= chr 1

  .. ..- attr(*, Language)= chr en
  .. ..- attr(*, LocalMetaData)= list()
  .. ..- attr(*, Origin)= chr(0) 
 - attr(*, CMetaData)=List of 3

  ..$ NodeID  : num 0
  ..$ MetaData:List of 2
  .. ..$ create_date: POSIXlt[1:9], format: 2009-10-31 04:46:56
  .. ..$ creator: Named chr 
  .. .. ..- attr(*, names)= chr LOGNAME
  ..$ Children: NULL
  ..- attr(*, class)= chr MetaDataNode
 - attr(*, DMetaData)='data.frame':   1 obs. of  1 variable:
  ..$ MetaID: num 0
 - attr(*, class)= chr [1:3] VCorpus Corpus list

It contains tweets but in many languages. The columns are separated by
semi-colons. I am using the tm package and it is a corpus.

It looks like this:

547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1   day
:p;Huddersfield/Lincoln;United
Kingdom;Kirklees;Kirklees;;53.6468475;-1.7727296
547283;06:37:17;21;10;2009;fabiomafra;alguém traz mais lenha pro computador
da facool? BOM DIA.;Belo Horizonte - MG -
BR;Brazil;MG;;;-19.8157306;-43.9542226
547284;06:37:17;21;10;2009;romanotr;Вау, Репортеры без границ опубликовали
список стран со свободой слова, из 173 Грузия на 81 месте опережая Украину.
Успехи,успехи...;Portugal Aveiro;Portugal;Aveiro;;;40.6411848;-8.6536169
547285;06:37:18;21;10;2009;Y_T_;Playing: Beth Orton lt\;Someone's
Daughtergt\;;Kanazawa, Japan;Japan;Ishikawa
Prefecture;;;36.5613254;136.6562051
Error: invalid input
'547286;06:37:18;21;10;2009;Atogey;æ”¯æŒä½ 
ï¼Œå›½å®¶éœ€è¦ä»–ä»¬ï¼Œä½†æ˜¯å›½å®¶çš„æœªæ¥ä¸èƒ½é ä»–ä»¬â€¦RT
@zuola ï¿¿æˆ‘è§‰å¾— @wenyunc

I want to convert it to fields or columns and so I thought I should
convert it to a dataframe. I tried


twitterDF-as.data.frame(twitter)
Error in sort.list(y) : 
  invalid input

'547286;06:37:18;21;10;2009;Atogey;æ”¯æŒä½ 
ï¼Œå›½å®¶éœ€è¦ä»–ä»¬ï¼Œä½†æ˜¯å›½å®¶çš„æœªæ¥ä¸èƒ½é ä»–ä»¬â€¦RT
@zuola ï¿¿æˆ‘è§‰å¾— @wenyunchao
ä¸€ç‚¹éƒ½ä¸ä¹è§‚ã€‚çœŸæ£çš„ä¹è§‚åº”è¯¥æ˜¯ï¼šä½ å…³æˆ‘åˆæ€Žä¹ˆæ 
·ï¼Œåæ£æ”¿æ²»æ–—äº‰ä¸ä¼šä¸¢æŽ‰æ€§å‘½ï¼Œè€åå‡ºæ¥åŽæ›´æ˜¯ä¸€æ¡å¥½æ±‰ã€‚åŒ—é£Žè¿˜æ˜¯èˆä¸å¾—*éœ¸åœ°ä½ã€è‚‰ã€ä¹¦ã€å¥³äººå’Œç½‘ç»œçš„ï¼Œä¸è¿‡ç‰¢é‡Œä¸ä¼šæä¾›è¿™äº›ã€‚å¦â€¦;å±±è¥¿ï¼Œæµ™æ±Ÿ;China;Zhejiang;;;28.695035;119.751054'
in 'utf8towcs'

Can anyone suggest what I can do? 


P.S. Actually, I would love to remove all the non-English tweets but I have
no clue about how to do that.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Decomposition method

2009-11-01 Thread FMH

Dear All,

I'm doing an  imputation on the missing data and is looking for a decomposition 
method in R. Could somone advice me the way to do this?

Thank you
Fir




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] convert list to Dataframe

2009-11-01 Thread David Winsemius



On Nov 1, 2009, at 8:24 AM, onyourmark wrote:



Hello. The fields are separated by a ';'. I think that the data is
rectangular in the sense that there are about 15 fields for each  
row.


There either are 15 fields or there aren't. You can't make a dataframe  
with an approximate number of fields. In the fragment below there  
appear to be 14 fields. Try:


twitfrag -  
strsplit(c(4927861;05:04:14;28;10;2009;HOYTSTHEATRES;GameStop Brings   
15K  Manage
Holiday Rush [Black Friday] http://bit.ly/2d3OJg;Australia;Australia-25.274398;133.775136 
,
4927863;05:04:14;28;10;2009;padden;Rachel  master chef  cook  
anytime!;Sydney, Australia;Australia;NSW;;;-33.867139;151.207114,

4927878;05:04:17;28;10;2009;GSpotMagazine;The penalty  success   bored
attentions  people  formerly snubbed you. -Mary Wilson Little  
#quote;UK;United Kingdom55.378051;-3.435973,
4927885;05:04:20;28;10;2009;super_assassin;@triplejsr flight   
conchords,
pleaaase :) thanks rosie  
xx;Australia;Australia-25.274398;133.775136,
4927893;05:04:21;28;10;2009;SLMFE;Gestern:Achso,ja okey,um 5 nach las  
ich
jemanden komen der dir die Akupunkturnadel(zb 5!im Ohr!)entfernt..Um  
10 n.

kommt immer noch keiner..;Germany;Germany51.165691;10.451526,
4927901;05:04:23;28;10;2009;mikesemple;HHS Secretary pushes health care
reform  rural America: By Christopher Smart The health-care crisis  ..
http://bit.ly/49Iqcu;London;United Kingdom;Greater
London;Westminster;;51.5001524;-0.1262362,
4927913;05:04:26;28;10;2009;coax_k;Facebook Headquarters  Studio O+A:  
San

Francisco based interior design firm Studio O+A  designed  ..
http://bit.ly/hdqWp;Sydney;Australia;NSW;;;-33.867139;151.207114;
), ;)
twitfrag

I think you will see some patterns emerging.


Some
of the fields are empty. In the dput() display below, it seems that  
the rows

are delimited by '  ' .
Any idea from this?


They are strings (in our aRgot, objects of type character.) That is an  
effect of whatever processing you have done with components of the tm  
package, the entirety of which you are failing to share with us.




Here is the end of the output for dput(twitter)


The whole point of using dput is to create a complete representation  
of an object.





4927861;05:04:14;28;10;2009;HOYTSTHEATRES;GameStop Brings  15K   
Manage

Holiday Rush [Black Friday]
http://bit.ly/2d3OJg;Australia;Australia-25.274398;133.775136;,
4927863;05:04:14;28;10;2009;padden;Rachel  master chef  cook
anytime!;Sydney, Australia;Australia;NSW;;;-33.867139;151.207114,
4927878;05:04:17;28;10;2009;GSpotMagazine;The penalty  success
bored

attentions  people  formerly snubbed you. -Mary Wilson Little
#quote;UK;United Kingdom55.378051;-3.435973,
4927885;05:04:20;28;10;2009;super_assassin;@triplejsr flight   
conchords,

pleaaase :) thanks rosie
xx;Australia;Australia-25.274398;133.775136,
4927893;05:04:21;28;10;2009;SLMFE;Gestern:Achso,ja okey,um 5 nach  
las ich
jemanden komen der dir die Akupunkturnadel(zb 5!im Ohr!)entfernt..Um  
10 n.

kommt immer noch keiner..;Germany;Germany51.165691;10.451526,
4927901;05:04:23;28;10;2009;mikesemple;HHS Secretary pushes health  
care

reform  rural America: By Christopher Smart The health-care crisis  ..
http://bit.ly/49Iqcu;London;United Kingdom;Greater
London;Westminster;;51.5001524;-0.1262362,
4927913;05:04:26;28;10;2009;coax_k;Facebook Headquarters  Studio O 
+A: San

Francisco based interior design firm Studio O+A  designed  ..
http://bit.ly/hdqWp;Sydney;Australia;NSW;;;-33.867139;151.207114;
), Author = character(0), DateTimeStamp = structure(list(sec =
56.404713898,
  min = 46L, hour = 4L, mday = 31L, mon = 9L, year = 109L,
  wday = 6L, yday = 303L, isdst = 0L), .Names = c(sec, min,
hour, mday, mon, year, wday, yday, isdst), class =  
c(POSIXt,

POSIXlt), tzone = GMT), Description = character(0), Heading =
character(0), ID = 1, Language = en, LocalMetaData = list(),  
Origin =

character(0), class = c(PlainTextDocument,
TextDocument, character))), CMetaData = structure(list(NodeID = 0,
  MetaData = structure(list(create_date = structure(list(sec =
56.4059998989105,
  min = 46L, hour = 4L, mday = 31L, mon = 9L, year = 109L,
  wday = 6L, yday = 303L, isdst = 0L), .Names = c(sec,
  min, hour, mday, mon, year, wday, yday, isdst
  ), class = c(POSIXt, POSIXlt), tzone = GMT), creator =
structure(, .Names = LOGNAME)), .Names = c(create_date,
  creator)), Children = NULL), .Names = c(NodeID, MetaData,
Children), class = MetaDataNode), DMetaData = structure(list(
  MetaID = 0), .Names = MetaID, row.names = c(NA, -1L), class =
data.frame), class = c(VCorpus,
Corpus, list))




onyourmark wrote:


Hi. I have a huge list called twitter:


dim(twitter)

NULL

str(twitter)

List of 1
$ :Classes 'PlainTextDocument', 'TextDocument', 'character'  atomic
[1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed  
Lessons
For Governance From Campaigner-in-chief: President obama jumps   
campaign

09  tuesday..

Re: [R] convert list to Dataframe

2009-11-01 Thread onyourmark


I did this on the source files which were semi-colon delimted (to delimit the
fields, I am not sure what character denotes the new tweet)

After loading the tm package

 txt - system.file(texts, txt, package = tm)
 (twitter - Corpus(DirSource(txt),
+ readerControl = list(language = lat)))

then

twitter - tm_map(twitter, removeWords, stopwords(english))

That last command took about an hour to complete.



onyourmark wrote:
 
 Hi. I have a huge list called twitter:
 
 dim(twitter)
 NULL
 str(twitter)
 List of 1
  $ :Classes 'PlainTextDocument', 'TextDocument', 'character'  atomic
 [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons
 For Governance From Campaigner-in-chief: President obama jumps  campaign
 09  tuesday.. http://bit.ly/2eHMaN;Florida;USA;FL;;;27.6648274;-81.5157535
 12210;10:47:37;20;10;2009;David_Stringer;William Hague heading  Washington 
 meets  Gen. Jim Jones, Sen. John McCain  others. Will Obama team raise
 worries  EU ties?;London, England;United Kingdom;Greater
 London;Westminster;;51.5001524;-0.1262362
 12355;10:47:53;20;10;2009;Singsabit;RT @Drudge_Report PAPER: Excuses
 wearing thin  Obama, media pals... http://tinyurl.com/yfw6cd9;So.
 California;USA;CA;;;36.778261;-119.4179324
 12407;10:47:59;20;10;2009;obamavideonews;Obama News Obama   Afghanistan
 troop decision timing (AFP) : AFP - Pres.. http://bit.ly/3KPUr8 #obama
 #video;USA;USA37.09024;-95.712891 ...
   .. ..- attr(*, Author)= chr(0) 
   .. ..- attr(*, DateTimeStamp)= POSIXlt[1:9], format: 2009-10-31
 04:46:56
   .. ..- attr(*, Description)= chr(0) 
   .. ..- attr(*, Heading)= chr(0) 
   .. ..- attr(*, ID)= chr 1
   .. ..- attr(*, Language)= chr en
   .. ..- attr(*, LocalMetaData)= list()
   .. ..- attr(*, Origin)= chr(0) 
  - attr(*, CMetaData)=List of 3
   ..$ NodeID  : num 0
   ..$ MetaData:List of 2
   .. ..$ create_date: POSIXlt[1:9], format: 2009-10-31 04:46:56
   .. ..$ creator: Named chr 
   .. .. ..- attr(*, names)= chr LOGNAME
   ..$ Children: NULL
   ..- attr(*, class)= chr MetaDataNode
  - attr(*, DMetaData)='data.frame':   1 obs. of  1 variable:
   ..$ MetaID: num 0
  - attr(*, class)= chr [1:3] VCorpus Corpus list
 
 It contains tweets but in many languages. The columns are separated by
 semi-colons. I am using the tm package and it is a corpus.
 
 It looks like this:
 
 547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1   day
 :p;Huddersfield/Lincoln;United
 Kingdom;Kirklees;Kirklees;;53.6468475;-1.7727296
 547283;06:37:17;21;10;2009;fabiomafra;alguém traz mais lenha pro
 computador da facool? BOM DIA.;Belo Horizonte - MG -
 BR;Brazil;MG;;;-19.8157306;-43.9542226
 547284;06:37:17;21;10;2009;romanotr;Вау, Репортеры без границ
 опубликовали список стран со свободой слова, из 173 Грузия на 81 месте
 опережая Украину. Успехи,успехи...;Portugal
 Aveiro;Portugal;Aveiro;;;40.6411848;-8.6536169
 547285;06:37:18;21;10;2009;Y_T_;Playing: Beth Orton lt\;Someone's
 Daughtergt\;;Kanazawa, Japan;Japan;Ishikawa
 Prefecture;;;36.5613254;136.6562051
 Error: invalid input
 '547286;06:37:18;21;10;2009;Atogey;æ”¯æŒä½ ï¼Œå›½å®¶éœ€è¦ä»–ä»¬ï¼Œä½†æ˜¯å›½å®¶çš„æœªæ¥ä¸èƒ½é ä»–ä»¬â€¦RT
 @zuola ï¿¿æˆ‘è§‰å¾— @wenyunc
 
 I want to convert it to fields or columns and so I thought I should
 convert it to a dataframe. I tried
 
 twitterDF-as.data.frame(twitter)
 Error in sort.list(y) : 
   invalid input
 '547286;06:37:18;21;10;2009;Atogey;æ”¯æŒä½ ï¼Œå›½å®¶éœ€è¦ä»–ä»¬ï¼Œä½†æ˜¯å›½å®¶çš„æœªæ¥ä¸èƒ½é ä»–ä»¬â€¦RT
 @zuola ï¿¿æˆ‘è§‰å¾— @wenyunchao
 ä¸€ç‚¹éƒ½ä¸ä¹è§‚ã€‚çœŸæ£çš„ä¹è§‚åº”è¯¥æ˜¯ï¼šä½ å…³æˆ‘åˆæ€Žä¹ˆæ ·ï¼Œåæ£æ”¿æ²»æ–—äº‰ä¸ä¼šä¸¢æŽ‰æ€§å‘½ï¼Œè€åå‡ºæ¥åŽæ›´æ˜¯ä¸€æ¡å¥½æ±‰ã€‚åŒ—é£Žè¿˜æ˜¯èˆä¸å¾—*éœ¸åœ°ä½ã€è‚‰ã€ä¹¦ã€å¥³äººå’Œç½‘ç»œçš„ï¼Œä¸è¿‡ç‰¢é‡Œä¸ä¼šæä¾›è¿™äº›ã€‚å¦â€¦;å±±è¥¿ï¼Œæµ™æ±Ÿ;China;Zhejiang;;;28.695035;119.751054'
 in 'utf8towcs'
 
 
 Can anyone suggest what I can do? 
 
 P.S. Actually, I would love to remove all the non-English tweets but I
 have no clue about how to do that.
 
 

-- 
View this message in context: 
http://old.nabble.com/convert-list-to-Dataframe-tp26148889p26148898.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] package lme4

2009-11-01 Thread wenjun zheng

Hi R Users,
 When I use package lme4 for mixed model analysis, I can't distinguish
the significant and insignificant variables from all random independent
variables.
 Here is my data and result:
Data:

 
Rice-data.frame(Yield=c(8,7,4,9,7,6,9,8,8,8,7,5,9,9,5,7,7,8,8,8,4,8,6,4,8,8,9),
 Variety=rep(rep(c(A1,A2,A3),each=3),3),
 Stand=rep(c(B1,B2,B3),9),
 Block=rep(1:3,each=9))
Rice.lmer-lmer(Yield ~ (1|Variety) + (1|Stand) + (1|Block) +
(1|Variety:Stand), data = Rice)

Result:

Linear mixed model fit by REML
Formula: Yield ~ (1 | Variety) + (1 | Stand) + (1 | Block) + (1 |
Variety:Stand)
   Data: Rice
   AIC   BIC logLik deviance REMLdev
 96.25 104.0 -42.1285.33   84.25
Random effects:
 GroupsNameVariance Std.Dev.
 Variety:Stand (Intercept) 1.345679 1.16003
 Block (Intercept) 0.00 0.0
 Stand (Intercept) 0.89 0.94281
 Variety   (Intercept) 0.024691 0.15714
 Residual  0.67 0.81650
Number of obs: 27, groups: Variety:Stand, 9; Block, 3; Stand, 3; Variety, 3

Fixed effects:
Estimate Std. Error t value
(Intercept)   7.1852 0.6919   10.38

Can you give me some advice for recognizing the significant variables among
random effects above without other  calculating.

Any suggestions will be appreciated.
Wenjun

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] SEM with Unbalanced Panel

2009-11-01 Thread Henry Wotton

Dear R Experts,

I have a query concerning SEMs in a multilevel context. I have an unbalanced 
panel where children are nested in families, which in turn are nested in 
districts. The problem is the following: the outcome variable of interest is 
measured at the child level but the (explanatory) latent variable is at the 
family level. 

Most the variables are discrete and I am able to fit the model using the sem() 
and hetcor() packages ignoring the hierarchical structure of the data. But, how 
do I incorporate the fact that the outcome variable is at the child level and 
the latent variable at the family level? 

Any help would be greatly appreciated.

Thanks a lot already

Henry

-- 

http://portal.gmx.net/de/go/dsl02

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] counting frequencies across two columns

2009-11-01 Thread Jorge Ivan Velez

Hi Jason,

If I understand correctly, you are looking for something along the lines of

with(X, tapply(author, articleID, function(x) length(unique(x
# 1 2 3
# 1 2 2

with X your data frame.

HTH,
Jorge


On Sun, Nov 1, 2009 at 1:20 AM, Jason Priem  wrote:

 I've got a data frame describing comments on an electronic journal, wherein
 each row is a unique comment, like so:

  commentID  author articleID
 1 1   smith 2
 2 2   jones 3
 3 3 andrews 2
 4 4   jones 1
 5 5 johnson 3
 6 6   smith 2

 I want know the number of unique authors per article.  I can get a table of
 article frequencies with table(articleID), but I can't figure out how to
 count frequencies in a different column.  I'm sure there's an easy way, but
 I guess I'm too new at this to find it.  Thanks for your help!

 Jason Priem
 PhD student, School of Information and Library Science, University of North
 Carolina-Chapel Hill

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] wilcox.test construction in r

2009-11-01 Thread Peter Ehlers


Dear Stefan,

See two comments inserted below.

Stefan Grosse wrote:

On Sun, 1 Nov 2009 00:47:50 -0700 (PDT) jomni jom...@gmail.com wrote:

J So do I write the function as wilcox.test(original, test,
J alternative=l)? or wlcox.test(original, test, alternative = g)?
J or wilcox.test(test, original, alternative=g)?
J or wilcox.test(test, original, alternative=l)?

J How do I interpret the p-value given my criteria?
J Do I reject null when p-value less than 0.05? 
J or greater than 0.95?


The interpretation of the p depends on how you have tested the
hypothesis.

J Not a statistics major here so I'm really confused. 


You don't need to be that but please read the documentation and try the
given examples in the documentation.


Comment 1:
  As you point out, one should at least scan the documentation.
  Here's a quote from ?wilcox.test:

   'the one-sided alternative greater is that x is shifted
   to the right of y'

  That's pretty unambiguous.


If you would have typed example(wilcox.test) you would have seen for
example:

wlcx.t ## Two-sample test.
wlcx.t ## Hollander  Wolfe (1973), 69f.
wlcx.t ## Permeability constants of the human chorioamnion (a placental
wlcx.t ##  membrane) at term (x) and between 12 to 26 weeks gestational
wlcx.t ##  age (y).  The alternative of interest is greater
permeability 
wlcx.t ##  of the human chorioamnion for the term
pregnancy. 
wlcx.t x - c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91,

1.64, 0.73, 1.46)

wlcx.t y - c(1.15, 0.88, 0.90, 0.74, 1.21)

wlcx.t wilcox.test(x, y, alternative = g)# greater

Wilcoxon rank sum test

data:  x and y 
W = 35, p-value = 0.1272
alternative hypothesis: true location shift is greater than 0 



This I think makes it very easy to interprete. Here it is tested as the
text says whether x is greater than y. So if you want to test the
hypothesis that x is smaller than y so you do
wilcox.test(x,y,alternative=less)
then the lower your p is the higher is the probability that the samples
are different. hence p0.05 would match your confidence level. Now the


Comment 2:
 I know that you know better, but with p-values it's always
 best to be careful with the language. ... the probability
 that the _samples_ are different makes little sense. The
 samples _are_ different, period (or why do the test?). The
 p-value says something about the distribution from which
 the samples are obtained.

Cheers,
Peter Ehlers



surprising news:
wilcox.test(y,x,alternative=greater)
would work as well! 


If you are in doubt create an x and an y where you are sure that x is
smaller than y.

One final remark: if you have ties (several identical values in one
sample) you should use wilcox_test of the coin package.

hth
Stefan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix^(-1/2)

2009-11-01 Thread David Winsemius



On Oct 31, 2009, at 9:33 PM, David Winsemius wrote:



On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote:


Dear R-Help Team,

as a R novice I have a (maybe for you very simple question), how do  
I get

the following solved in R:

Let R be a n x n matrix:

\mid R\mid^{-\frac{1}{2}}

solve(A) gives me the inverse of the matrix R, however not the  
^(-1/2) of

the matrix...


GIYF: (and Bill Venables if friendly, too.)

http://www.lmgtfy.com/?q=powers+of+matrix+r-project


I had assumed that the first hit I got:

https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html

...  would be the first hit anybody got, but that's not necessarily  
true now and especially for the future. And further searching within  
the results produced this more recent Maechler posting:


https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html

For the Mac users, there appears to be no binary, but the source  
compiles without error on a 64-bit version of R 2.10.0:


install.packages(expm,repos=http://R-Forge.R-project.org;,  
type=source)


#The suggested code throws an error, so my very minor revision would be:

 library(expm)
?%^%

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to loop thru a matrix or data frame , and append calculations to a new data frame?

2009-11-01 Thread John Kane

Duh, thought of that after I'd left for dinner :(

--- On Sat, 10/31/09, David Winsemius dwinsem...@comcast.net wrote:

 From: David Winsemius dwinsem...@comcast.net
 Subject: Re: [R] how to loop thru a matrix or data frame , and append 
 calculations to  a new data frame?
 To: John Kane jrkrid...@yahoo.ca
 Cc: r-help@r-project.org, Robert Wilkins robst...@gmail.com
 Received: Saturday, October 31, 2009, 2:00 PM
 On Oct 31, 2009, at 12:33 PM, John
 Kane wrote:

  Do you mean to apply the same calculation to each
 element?

  ? apply

  mydata  - data.frame(aa = 1:5, bb= 11:15)
  mp5 - function(x) x*5
  mp5data -  apply(mydata, 2, mp5)
  mp5data

 It would have been much more compact (and in the spirit of
 functional programming) to just do:

 mp5data - mydata*5
 # binary operations applied to dataframes give sensible
 results when the data types allow.
 mp5data
      aa bb
 [1,]  5 55
 [2,] 10 60
 [3,] 15 65
 [4,] 20 70
 [5,] 25 75

 Many times the loops are implicit in the vectorized design
 of R. And that solution would not result in the structure
 requested, for which some further straightening would be
 needed:

  mp5data - as.vector(as.matrix(mydata)*5)
  mp5data
  [1]  5 10 15 20 25 55 60 65 70 75

 -- David

  This is functionally equivelent to a double if loop.

  mydata  - data.frame(aa = 1:5, bb= 11:15)
  newdata - data.frame(matrix(rep(NA,10), nrow=5))

  for (i in 1:length(mydata[1,])) {
   for (j in 1:5) {
     newdata[j,i] - mydata[j,i]*5
  }
     }

  newdata

  --- On Fri, 10/30/09, Robert Wilkins robst...@gmail.com
 wrote:

  From: Robert Wilkins robst...@gmail.com
  Subject: [R] how to loop thru a matrix or data
 frame , and append calculations to  a new data frame?
  To: r-help@r-project.org
  Received: Friday, October 30, 2009, 11:46 AM
  How do you do a double loop through a
  matrix or data frame , and
  within each iteration , do a calculation and
 append it to a
  new,
  second data frame?
  (So, if your original matrix or data frame is 4 x
 5 , then
  20
  calculations are done, and the new data frame,
 which had 0
  rows to
  start with, now has 20 rows)

  __
  R-help@r-project.org
  mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
  reproducible code.

  __
  R-help@r-project.org
 mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
 reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

  __
Yahoo! Canada Toolbar: Search from anywhere on the web, and bookmark your 
favourite sites. Download it now
http://ca.toolbar.yahoo.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix^(-1/2)

2009-11-01 Thread spencerg


 A question, a comment, and an alternative answer to matrix^(-1/2):


QUESTION:


What's the status of the expm package, mentioned in the email you 
cited from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I tried both 
install.packages('expm') and 
install.packages(expm,repos=http://R-Forge.R-project.org;), and got 
package 'expm' is not available in both cases.



COMMENT:


The solution proposed by Venables rests on Sylvester's matrix theorem, 
which essentially says that if a matrix A is diagonalizable with 
eigenvalue decomposition eigA - eigen(A) and f: D → C with D ⊂ C be a 
function for which f(A) is well defined 
(http://en.wikipedia.org/wiki/Sylvester%27s_matrix_theorem), then f(A) = 
with(eigA, vectors %*% diag(f(values)) %*% solve(vectors)). Maechler and 
others have noted that this can be one of the least accurate and most 
computationally expensive ways to compute f(A).



ALTERNATIVE ANSWER:


For A^(-1/2), if A is symmetric and nonnegative definite, then 
solve(chol(A)) would be a very good way to compute it.



Hope this helps,
Spencer


David Winsemius wrote:


On Oct 31, 2009, at 9:33 PM, David Winsemius wrote:



On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote:


Dear R-Help Team,

as a R novice I have a (maybe for you very simple question), how do 
I get

the following solved in R:

Let R be a n x n matrix:

\mid R\mid^{-\frac{1}{2}}

solve(A) gives me the inverse of the matrix R, however not the 
^(-1/2) of

the matrix...


GIYF: (and Bill Venables if friendly, too.)

http://www.lmgtfy.com/?q=powers+of+matrix+r-project


I had assumed that the first hit I got:

https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html

... would be the first hit anybody got, but that's not necessarily 
true now and especially for the future. And further searching within 
the results produced this more recent Maechler posting:


https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html

For the Mac users, there appears to be no binary, but the source 
compiles without error on a 64-bit version of R 2.10.0:


install.packages(expm,repos=http://R-Forge.R-project.org;, 
type=source)


#The suggested code throws an error, so my very minor revision would be:

library(expm)
?%^%




--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix^(-1/2)

2009-11-01 Thread Charles C. Berry


On Sun, 1 Nov 2009, spencerg wrote:


 A question, a comment, and an alternative answer to matrix^(-1/2):

QUESTION:


What's the status of the expm package, mentioned in the email you cited 
from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I tried both 
install.packages('expm') and 
install.packages(expm,repos=http://R-Forge.R-project.org;), and got 
package 'expm' is not available in both cases.





Try

http://r-forge.r-project.org/projects/expm/

HTH,

Chuck



COMMENT:


The solution proposed by Venables rests on Sylvester's matrix theorem, which 
essentially says that if a matrix A is diagonalizable with eigenvalue 
decomposition eigA - eigen(A) and f: D → C with D ⊂ C be a function for 
which f(A) is well defined 
(http://en.wikipedia.org/wiki/Sylvester%27s_matrix_theorem), then f(A) = 
with(eigA, vectors %*% diag(f(values)) %*% solve(vectors)). Maechler and 
others have noted that this can be one of the least accurate and most 
computationally expensive ways to compute f(A).



ALTERNATIVE ANSWER:


For A^(-1/2), if A is symmetric and nonnegative definite, then solve(chol(A)) 
would be a very good way to compute it.



Hope this helps,
Spencer


David Winsemius wrote:


 On Oct 31, 2009, at 9:33 PM, David Winsemius wrote:

 
  On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote:
 
   Dear R-Help Team,
  
   as a R novice I have a (maybe for you very simple question), how do I 
   get

   the following solved in R:
  
   Let R be a n x n matrix:
  
   \mid R\mid^{-\frac{1}{2}}
  
   solve(A) gives me the inverse of the matrix R, however not the ^(-1/2) 
   of

   the matrix...
 
  GIYF: (and Bill Venables if friendly, too.)
 
  http://www.lmgtfy.com/?q=powers+of+matrix+r-project


 I had assumed that the first hit I got:

 https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html

 ... would be the first hit anybody got, but that's not necessarily true
 now and especially for the future. And further searching within the
 results produced this more recent Maechler posting:

 https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html

 For the Mac users, there appears to be no binary, but the source compiles
 without error on a 64-bit version of R 2.10.0:

 install.packages(expm,repos=http://R-Forge.R-project.org;,
 type=source)

 #The suggested code throws an error, so my very minor revision would be:

 library(expm)
 ?%^%




--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] avoiding loop

2009-11-01 Thread Charles C. Berry


On Sat, 31 Oct 2009, parkbomee wrote:



Thank you both.

However, using tapply() instead of a loop does not seem to improve my code much.
I am using this inside of an optimization function,
and it still takes more than it needs...




Well, you haven't given us much to work with.

The optimization choices depend on the particulars of your problem, which 
you've not detailed.


It does not take long to run the tapply() code once, so you need to do it 
many times. Right?


If so, you need to say how the structure varies (rendered as DF in David's 
reply) from iteration to iteration in the optimization.


If it turns out that only 'value' changes and that the number of different 
times is not too large, then precomputing suitable indicator matrics may 
help:


mat1 - model.matrix( ~ 0 + factor(time):as.numeric(choice==1),DF)
mat2 - model.matrix( ~ 0 + factor(time), DF )

Inside the optimization use something like

with(DF,(value%*%mat1)/(value%*%mat2))

If the structure can change or the number of unique times is large, then 
with so simple a calculation you should probably just inline some C code.


http://cran.r-project.org/web/packages/inline/index.html

HTH,

Chuck





CC: bbom...@hotmail.com; r-help@r-project.org
From: dwinsem...@comcast.net
To: d.rizopou...@erasmusmc.nl
Subject: Re: [R] avoiding loop
Date: Sat, 31 Oct 2009 22:26:17 -0400

This is pretty much equivalent:

tapply(DF$value[DF$choice==1], DF$time[DF$choice==1], sum) /
 tapply(DF$value, DF$time, sum)

And both will probably fail if the number of groups with choice==1 is
different than the number overall.

--
David.

On Oct 31, 2009, at 5:14 PM, Dimitris Rizopoulos wrote:


one approach is the following:

# say 'DF' is your data frame, then
with(DF, {
   ind - choice == 1
   n - tapply(value[ind], time[ind], sum)
   d - tapply(value, time, sum)
   n / d
})


I hope it helps.

Best,
Dimitris


parkbomee wrote:

Hi all,
I am trying to figure out a way to improve my code's efficiency by
avoiding the use of loop.
I want to calculate a conditional mean(?) given time.
For example, from the data below, I want to calculate sum((value|
choice==1)/sum(value)) across time.
Is there a way to do it without using a loop?
time  cum_time  choicevalue
1 4 1   3
1 4  0   2
1  4 0   3
1  4 0   3
2 6 1   4
2 6 0   4
2 6 0   2
2 6 0   4
2 6 0   2
2 6 0   2 3 4
1   2 3 4 0   3 3
4 0   5 3 4 0   2
My code looks like
objective[1] = value[1] / sum(value[1:cum_time[1])
for (i in 2:max(time)){
objective[i] = value[cum_time[i-1]+1] /
sum(value[(cum_time[i-1]+1) : cum_time[i])])
}
sum(objective)
Anyone have an idea that I can do this without using a loop??
Thanks.

_
[[elided Hotmail spam]]
[[alternative HTML version deleted]]
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT



_
[[elided Hotmail spam]]
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] problems whit seasonal ARIMA

2009-11-01 Thread Laura Saltyte

Hello,

I have daily wind speed data and need to fit seasonal ARIMA model, problem
is that my period is 365. But when I use arima(...) function, with period
365, Im getting error message: Error in makeARIMA(trarma[[1]],
trarma[[2]], Delta, kappa) :   maximum supported lag is 350. Can someone
help me with this problem?
Thank you


Sincerely yours,
Laura Saltyte

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Package zelig

2009-11-01 Thread Uwe Ligges


Confirmed with recent Zelig and R-2.10.0.

CCing Kosuke Imai, the Zelig maintainer:

Please can you take a look at this one and additionally fix the warnings 
given on your package's overview page at

http://cran.r-project.org/web/checks/check_results_Zelig.html

Please upload a fixed version with increased version number to CRAN.

Best wishes,
Uwe Ligges



sayan dasgupta wrote:

hello all
I am using the R package Zelig for some tobit  regression with robust
standard errors.
I have got R version 2.9.2 (2009-08-24)
and Zelig Version: 3.4-5
when i do demo(robust)

It ends like this way

data(coalition)


# Fit the model with robust standard error
user.prompt()


Press return to continue:


z.out3 - zelig(Surv(duration, ciep12) ~ polar + numst2 +  crisis,

model = weibull, data = coalition,   cluster = invest, robust = TRUE)
*It is giving the following error

*Error in rowsum.default(resid(fit, dfbeta), cluster) :
  'x' must be numeric

Please help where it is going wrong
Thanks and Regards
Sayan Dasgupta

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix^(-1/2)

2009-11-01 Thread spencerg

Hi, Chuck: 



 Thanks very much, but why do I get package 'expm' is not 
available from 
install.packages(expm,repos=http://R-Forge.R-project.org;)? 



 Best Wishes,
 Spencer Graves


Charles C. Berry wrote:

On Sun, 1 Nov 2009, spencerg wrote:


 A question, a comment, and an alternative answer to matrix^(-1/2):

QUESTION:


What's the status of the expm package, mentioned in the email you 
cited from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I tried 
both install.packages('expm') and 
install.packages(expm,repos=http://R-Forge.R-project.org;), and 
got package 'expm' is not available in both cases.





Try

http://r-forge.r-project.org/projects/expm/

HTH,

Chuck



COMMENT:


The solution proposed by Venables rests on Sylvester's matrix 
theorem, which essentially says that if a matrix A is diagonalizable 
with eigenvalue decomposition eigA - eigen(A) and f: D → C with D ⊂ 
C be a function for which f(A) is well defined 
(http://en.wikipedia.org/wiki/Sylvester%27s_matrix_theorem), then 
f(A) = with(eigA, vectors %*% diag(f(values)) %*% solve(vectors)). 
Maechler and others have noted that this can be one of the least 
accurate and most computationally expensive ways to compute f(A).



ALTERNATIVE ANSWER:


For A^(-1/2), if A is symmetric and nonnegative definite, then 
solve(chol(A)) would be a very good way to compute it.



Hope this helps,
Spencer


David Winsemius wrote:


 On Oct 31, 2009, at 9:33 PM, David Winsemius wrote:

   On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote:
Dear R-Help Team,
 as a R novice I have a (maybe for you very simple 
question), how do Iget

   the following solved in R:
 Let R be a n x n matrix:
 \mid R\mid^{-\frac{1}{2}}
 solve(A) gives me the inverse of the matrix R, however not 
the ^(-1/2)of

   the matrix...
   GIYF: (and Bill Venables if friendly, too.)
   http://www.lmgtfy.com/?q=powers+of+matrix+r-project

 I had assumed that the first hit I got:

 https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html

 ... would be the first hit anybody got, but that's not necessarily 
true

 now and especially for the future. And further searching within the
 results produced this more recent Maechler posting:

 https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html

 For the Mac users, there appears to be no binary, but the source 
compiles

 without error on a 64-bit version of R 2.10.0:

 install.packages(expm,repos=http://R-Forge.R-project.org;,
 type=source)

 #The suggested code throws an error, so my very minor revision 
would be:


 library(expm)
 ?%^%


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.




Charles C. Berry(858) 534-2098
Dept of Family/Preventive 
Medicine

E mailto:cbe...@tajo.ucsd.eduUC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 
92093-0901


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fast optimizer

2009-11-01 Thread Ravi Varadhan


Hi,

Here is one approach to solve your likelihood maximization subject to 
constraints.  I have written a function called `constrOptim.nl' that can solve 
constrained optimization problems.  I will demonstrate this with a simulation.

# 1.  Simulate the data (x,y).

a - 4
b - 1
c - 2.5
p - 0.3
n - 100
z - rbinom(n, size=1, prob=p)
x - ifelse(z, rpois(n, a), rpois(n, a+c))
y - ifelse(z, rpois(n, b+c), rpois(n, b))

# 2.  Define the log-likelihood to be maximized

mloglik - function(par, xx, y, ...){
# Note that I have named `xx' instead of `x' to avoid conflict with another 
function 
p - par[1]
a - par[2]
b - par[3]
cc - par[4]
sum(log(p*dpois(xx,a)*dpois(y,b+cc) + (1-p)*dpois(xx,a+cc)*dpois(y,b)))
}

# 3. Write the constraint function to define inequalities

hin - function(par, ...) {
# All constraints are defined such that h[i]  0 for all i.
h - rep(NA, 7)
h[1] - par[1]
h[2] - 1 - par[1] 
h[3] - par[2]
h[4] - par[3]
h[5] - par[4]
h[6] - par[2] - par[4]
h[7] - par[3] - par[4]
h
}

# Now, we are almost ready to maximize the log-likelihood.  We need a feasible 
starting value.  
# We construct a projection function that can convert any arbitrary real vector 
to a feasible starting vector.

# 5.  Finding a feasible starting vector of parameter values

project - function(par, lower, upper) {
# a function to map a random vector to a feasible vector
slack - 1.e-14
if (par[1] = 0) par[1] - slack
if (par[1] = 1) par[1] - 1 - slack
if (par[2] = 0) par[2] - slack
if (par[3] = 0) par[3] - slack
par[4] - min(par[2], par[3], par[4]) - slack
par
} 

p0c - runif(4, c(0,1,1,1), c(1, 5, 5, 2))  # a random candidate starting value
p0 - project(p0c)  # feasible starting value

# 6.  Optimization

source(constrOptim_nl.R)  # we source the `constrOptim.nl' function
ans - constrOptim.nl(par=p0, fn=mloglik, hin=hin, xx=x, y=y, 
control=list(fnscale=-1)) # Note: we are maximizing
ans


This concludes our brief tutorial.   

Hope this helps,
Ravi.



Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: Ravi Varadhan rvarad...@jhmi.edu
Date: Thursday, October 29, 2009 10:23 pm
Subject: Re: [R] Fast optimizer
To: R_help Help rhelp...@gmail.com
Cc: r-help@r-project.org, r-sig-fina...@stat.math.ethz.ch


 This optimization should not take you 1-2 mins.  My guess is that your 
 coding of the likelihood is inefficient, perhaps containing for loops. 
  Would you mind sharing your code with us?
 
 As far as incorporating inequality constraints, there are at least 4 
 approaches:
 
 1.  You could use `constrOptim' to incorporate inequality constraints. 
  
 
 2.  I have written a function `constrOptim.nl' that is better than 
 `constrOptim', and it can be used here.
 
 3.  The projection method in spg() function in the BB package can be 
 used.
 
 4.  You can create a penalized objective function, where the 
 inequalities c  a and c  b can be penalized.  Then you can use 
 optim's L-BFGS-B or spg() or nlminb().
 
 
 Ravi.
 
 
 Ravi Varadhan, Ph.D.
 Assistant Professor,
 Division of Geriatric Medicine and Gerontology
 School of Medicine
 Johns Hopkins University
 
 Ph. (410) 502-2619
 email: rvarad...@jhmi.edu
 
 
 - Original Message -
 From: R_help Help rhelp...@gmail.com
 Date: Thursday, October 29, 2009 9:21 pm
 Subject: Re: [R] Fast optimizer
 To: Ravi Varadhan rvarad...@jhmi.edu
 Cc: r-help@r-project.org, r-sig-fina...@stat.math.ethz.ch
 
 
  Ok. I have the following likelihood function.
  
  L - p*dpois(x,a)*dpois(y,b+c)+(1-p)*dpois(x,a+c)*dpois(y,b)
  
  where I have 100 points of (x,y) and parameters c(a,b,c,p) to
  estimate. Constraints are:
  
  0  p  1
  a,b,c  0
  c  a
  c  b
  
  I construct a loglikelihood function out of this. First ignoring the
  last two constraints, it takes optim with box constraint about 1-2 min
  to estimate this. I have to estimate the MLE on about 200 rolling
  windows. This will take very long. Is there any faster implementation?
  
  Secondly, I cannot incorporate the last two contraints using optim function.
  
  Thank you,
  
  rc
  
  
  
  On Thu, Oct 29, 2009 at 9:02 PM, Ravi Varadhan rvarad...@jhmi.edu 
 wrote:
  
   You have hardly given us any information for us to be able to help 
 
  you.  Give us more information on your problem, and, if possible, a 
 
  minimal, self-contained example of what you are trying to do.
  
   Ravi.
   
  
   Ravi Varadhan, Ph.D.
   Assistant Professor,
   Division of Geriatric Medicine and Gerontology
   School of Medicine
   Johns Hopkins University
  
   Ph. (410) 502-2619
   email: rvarad...@jhmi.edu
  
  
   - Original Message -
   From: R_help Help rhelp...@gmail.com
   Date:

Re: [R] matrix^(-1/2)

2009-11-01 Thread David Winsemius



On Nov 1, 2009, at 1:46 PM, spencerg wrote:


Hi, Chuck:

Thanks very much, but why do I get package 'expm' is not  
available from install.packages(expm,repos=http://R-Forge.R-project.org 
)?


In my case I think it was it is because there is no 2.10 branch to  
either the:


http://r-forge.r-project.org/bin/macosx/leopard/contrib/... or the
http://r-forge.r-project.org/bin/macosx/universal/contrib/...trees.

I tried a variety of stems for the installer but got these messages:
Warning: unable to access index for repository
http://r-forge.r-project.org/bin/macosx/universal/contrib/latest/bin/macosx/leopard/contrib/2.10
Warning: unable to access index for repository
http://r-forge.r-project.org/bin/macosx/universal/contrib/bin/macosx/leopard/contrib/2.10
Warning: unable to access index for repository
http://r-forge.r-project.org/bin/macosx/leopard/contrib/2.10

So I wonder if the package installers' expectations for the r-forge  
repository are matching up with the tree structures.


I should also note that the matpow or %^% functions in expm would  
not address the OP's question since they require that the exponent be  
positive.


--
David.



Best Wishes,
Spencer Graves


Charles C. Berry wrote:

On Sun, 1 Nov 2009, spencerg wrote:


A question, a comment, and an alternative answer to matrix^(-1/2):

QUESTION:


What's the status of the expm package, mentioned in the email  
you cited from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I  
tried both install.packages('expm') and  
install.packages(expm,repos=http://R-Forge.R-project.org;), and  
got package 'expm' is not available in both cases.





Try

   http://r-forge.r-project.org/projects/expm/

HTH,

Chuck



COMMENT:


The solution proposed by Venables rests on Sylvester's matrix  
theorem, which essentially says that if a matrix A is  
diagonalizable with eigenvalue decomposition eigA - eigen(A) and  
f: D → C with D ⊂ C be a function for which f(A) is well defined (http://en.wikipedia.org/wiki/Sylvester%27s_matrix_the 
orem), then f(A) = with(eigA, vectors %*% diag(f(values)) %*%  
solve(vectors)). Maechler and others have noted that this can be  
one of the least accurate and most computationally expensive ways  
to compute f(A).



ALTERNATIVE ANSWER:


For A^(-1/2), if A is symmetric and nonnegative definite, then  
solve(chol(A)) would be a very good way to compute it.



Hope this helps,
Spencer


David Winsemius wrote:


On Oct 31, 2009, at 9:33 PM, David Winsemius wrote:

   On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote:
Dear R-Help Team,
 as a R novice I have a (maybe for you very simple  
question), how do Iget

   the following solved in R:
 Let R be a n x n matrix:
 \mid R\mid^{-\frac{1}{2}}
 solve(A) gives me the inverse of the matrix R, however  
not the ^(-1/2)of

   the matrix...
   GIYF: (and Bill Venables if friendly, too.)
   http://www.lmgtfy.com/?q=powers+of+matrix+r-project

I had assumed that the first hit I got:

https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html

... would be the first hit anybody got, but that's not  
necessarily true

now and especially for the future. And further searching within the
results produced this more recent Maechler posting:

https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html

For the Mac users, there appears to be no binary, but the source  
compiles

without error on a 64-bit version of R 2.10.0:

install.packages(expm,repos=http://R-Forge.R-project.org;,
type=source)

#The suggested code throws an error, so my very minor revision  
would be:


library(expm)
?%^%


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Charles C. Berry(858) 534-2098
   Dept of Family/ 
Preventive Medicine

E mailto:cbe...@tajo.ucsd.eduUC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego  
92093-0901




David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problems whit seasonal ARIMA

2009-11-01 Thread Rolf Turner



On 2/11/2009, at 2:49 AM, Laura Saltyte wrote:


Hello,

I have daily wind speed data and need


``need'' is probably not the operative word!


to fit seasonal ARIMA model, problem
is that my period is 365. But when I use arima(...) function, with  
period

365, I’m getting error message: “Error in makeARIMA(trarma[[1]],
trarma[[2]], Delta, kappa) :   maximum supported lag is 350”. Can  
someone

help me with this problem?


Well, you could dig into the code and rewrite it to revise the maximum
lag.  But that would be pretty silly.

A seasonal model with a lag of 365 is downright ridiculous.

A seasonal model of lag ``s'' assumes that X_{t-s} has an influence
upon or contains information about (or is non-trivially correlated
with) X_t.  Do you *really* believe that wind speed from a year ago
contains information relevant to current wind speed?

There is almost surely a periodic *trend* with period 365 (well, maybe
period 365.25?) in daily wind speed data, but that's a different issue.
If you have data for sufficiently many years you could try estimating
such a trend simply by averaging over all occurrences of a given day to
estimate the trend value for that day.  You could also try fitting a
sinusoid (probably a fairly high order sinusoid )

What you really should do depends on the (practical) question that
you are trying to answer.

cheers,

Rolf Turner
##
Attention: 
This e-mail message is privileged and confidential. If you are not the 
intended recipient please delete the message and notify the sender. 
Any views or opinions presented are solely those of the author.


This e-mail has been scanned and cleared by MailMarshal 
www.marshalsoftware.com

##

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] avoiding loop

2009-11-01 Thread jim holtman

What you need to do is to understand how to use Rprof so that you can
determine where the time is being spent.  It probably indicates that
this is not the source of slowness in your optimization function.  How
much time are we talking about?  You may spent more time trying to
optimize the function than just running the current version even if it
is slow (slow is a relative term and does not hold much meaning
without some context round it).

On Sat, Oct 31, 2009 at 11:36 PM, parkbomee bbom...@hotmail.com wrote:

 Thank you both.

 However, using tapply() instead of a loop does not seem to improve my code 
 much.
 I am using this inside of an optimization function,
 and it still takes more than it needs...



 CC: bbom...@hotmail.com; r-help@r-project.org
 From: dwinsem...@comcast.net
 To: d.rizopou...@erasmusmc.nl
 Subject: Re: [R] avoiding loop
 Date: Sat, 31 Oct 2009 22:26:17 -0400

 This is pretty much equivalent:

 tapply(DF$value[DF$choice==1], DF$time[DF$choice==1], sum) /
          tapply(DF$value, DF$time, sum)

 And both will probably fail if the number of groups with choice==1 is
 different than the number overall.

 --
 David.

 On Oct 31, 2009, at 5:14 PM, Dimitris Rizopoulos wrote:

  one approach is the following:
 
  # say 'DF' is your data frame, then
  with(DF, {
     ind - choice == 1
     n - tapply(value[ind], time[ind], sum)
     d - tapply(value, time, sum)
     n / d
  })
 
 
  I hope it helps.
 
  Best,
  Dimitris
 
 
  parkbomee wrote:
  Hi all,
  I am trying to figure out a way to improve my code's efficiency by
  avoiding the use of loop.
  I want to calculate a conditional mean(?) given time.
  For example, from the data below, I want to calculate sum((value|
  choice==1)/sum(value)) across time.
  Is there a way to do it without using a loop?
  time  cum_time  choice    value
  1         4             1           3
  1         4              0           2
  1          4             0           3
  1          4             0           3
  2         6             1           4
  2         6             0           4
  2         6             0           2
  2         6             0           4
  2         6             0           2
  2         6             0           2 3         4
  1           2 3         4             0           3 3
  4             0           5 3         4             0           2
  My code looks like
  objective[1] = value[1] / sum(value[1:cum_time[1])
  for (i in 2:max(time)){
      objective[i] = value[cum_time[i-1]+1] /
  sum(value[(cum_time[i-1]+1) : cum_time[i])])
  }
  sum(objective)
  Anyone have an idea that I can do this without using a loop??
  Thanks.
 
  _
  [[elided Hotmail spam]]
     [[alternative HTML version deleted]]
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
  --
  Dimitris Rizopoulos
  Assistant Professor
  Department of Biostatistics
  Erasmus University Medical Center
 
  Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
  Tel: +31/(0)10/7043478
  Fax: +31/(0)10/7043014
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT


 _
 [[elided Hotmail spam]]
        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] intigrate function and absolute error

2009-11-01 Thread Laila Alkhalfan

Hi
 Can we get the result of an intigration without the absolute error?
for example
f1-function(x1){(1/gamma(alpha))*x1^(alpha-1)*exp(-x1)*log(x1)}
I1-integrate(f1, 0, (max(cc)-tau1+(theta2/theta1)*tau1)/theta2)
I1
0.08007414 with absolute error  7.2e-05
 I need the answer  0.08007414 withou the other part(with absolute error
7.2e-05)
how can we do that?
thank you and take care
Laila

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] intigrate function and absolute error

2009-11-01 Thread Jorge Ivan Velez

Hi Laila,

Here is a suggestion:

res -  integrate(dnorm, -1.96, 1.96)
res
# 0.9500042 with absolute error  1.0e-11

res[[1]]
# [1] 0.9500042

res$value
# [1] 0.9500042


HTH,
Jorge

On Sun, Nov 1, 2009 at 3:02 PM, Laila Alkhalfan  wrote:

 Hi
  Can we get the result of an intigration without the absolute error?
 for example
 f1-function(x1){(1/gamma(alpha))*x1^(alpha-1)*exp(-x1)*log(x1)}
 I1-integrate(f1, 0, (max(cc)-tau1+(theta2/theta1)*tau1)/theta2)
 I1
 0.08007414 with absolute error  7.2e-05
  I need the answer  0.08007414 withou the other part(with absolute error
 7.2e-05)
 how can we do that?
 thank you and take care
 Laila

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Calculate Volume in a PCA

2009-11-01 Thread Katja Seis

Hi,
my data frame consist of 8 Variables and 120 000 observations. With those datas 
I am running a PCA and after I want to calculate the Volume of the PCA-cloud of 
certain subsets of my data. Does anyone have an idea about a function that can 
do this?

Thanks



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix^(-1/2)

2009-11-01 Thread Charles C. Berry


On Sun, 1 Nov 2009, David Winsemius wrote:



On Nov 1, 2009, at 1:46 PM, spencerg wrote:


Hi, Chuck:

Thanks very much, but why do I get package 'expm' is not available
from install.packages(expm,repos=http://R-Forge.R-project.org;)?


In my case I think it was it is because there is no 2.10 branch to either 
the:


http: //r-forge.r-project.org/bin/macosx/leopard/contrib/... or the
http: //r-forge.r-project.org/bin/macosx/universal/contrib/...trees.

I tried a variety of stems for the installer but got these messages:
Warning: unable to access index for repository
http://r-forge.r-project.org/bin/macosx/universal/contrib/latest/bin/macosx/leopard/contrib/2.10
Warning: unable to access index for repository
http://r-forge.r-project.org/bin/macosx/universal/contrib/bin/macosx/leopard/contrib/2.10
Warning: unable to access index for repository
http://r-forge.r-project.org/bin/macosx/leopard/contrib/2.10

So I wonder if the package installers' expectations for the r-forge 
repository are matching up with the tree structures.


Right. FWIW, the source install works OK on my linux box:


sessionInfo()

R version 2.10.0 (2009-10-26)
x86_64-pc-linux-gnu

[output truncated]




I should also note that the matpow or %^% functions in expm would not 
address the OP's question since they require that the exponent be positive.




Roger that.

If solve(chol(A)) isn't good enough a symmetric inverse square root is 
available from expm as 'solve( sqrtm( A ) )'


Chuck



--
David.



Best Wishes,
Spencer Graves


Charles C. Berry wrote:
 On Sun, 1 Nov 2009, spencerg wrote:
 
  A question, a comment, and an alternative answer to matrix^(-1/2):
  
  QUESTION:
  
  
  What's the status of the expm package, mentioned in the email you 
  cited from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I tried 
  both install.packages('expm') and 
  install.packages(expm,repos=http://R-Forge.R-project.org;), and got 
  package 'expm' is not available in both cases.
  
 
 
 Try
 
http://r-forge.r-project.org/projects/expm/
 
 HTH,
 
 Chuck
 
  
  COMMENT:
  
  
  The solution proposed by Venables rests on Sylvester's matrix theorem, 
  which essentially says that if a matrix A is diagonalizable with 
  eigenvalue decomposition eigA - eigen(A) and f: D → C with D ⊂ C 
  be a function for which f(A) is well defined 
  (http://en.wikipedia.org/wiki/Sylvester%27s_matrix_theorem), then f(A) 
  = with(eigA, vectors %*% diag(f(values)) %*% solve(vectors)). Maechler 
  and others have noted that this can be one of the least accurate and 
  most computationally expensive ways to compute f(A).
  
  
  ALTERNATIVE ANSWER:
  
  
  For A^(-1/2), if A is symmetric and nonnegative definite, then 
  solve(chol(A)) would be a very good way to compute it.
  
  
  Hope this helps,

  Spencer
  
  
  David Winsemius wrote:
   
   On Oct 31, 2009, at 9:33 PM, David Winsemius wrote:
   
   On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote:

Dear R-Help Team,
 as a R novice I have a (maybe for you very simple 
 question), how do Iget

   the following solved in R:
 Let R be a n x n matrix:
 \mid R\mid^{-\frac{1}{2}}
 solve(A) gives me the inverse of the matrix R, however not 
 the ^(-1/2)of

   the matrix...
   GIYF: (and Bill Venables if friendly, too.)
   http://www.lmgtfy.com/?q=powers+of+matrix+r-project
   
   I had assumed that the first hit I got:
   
   https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html
   
   ... would be the first hit anybody got, but that's not necessarily 
   true

   now and especially for the future. And further searching within the
   results produced this more recent Maechler posting:
   
   https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html
   
   For the Mac users, there appears to be no binary, but the source 
   compiles

   without error on a 64-bit version of R 2.10.0:
   
   install.packages(expm,repos=http://R-Forge.R-project.org;,

   type=source)
   
   #The suggested code throws an error, so my very minor revision would 
   be:
   
   library(expm)

   ?%^%
  
  __

  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html

  and provide commented, minimal, self-contained, reproducible code.
  
  
 
 Charles C. Berry(858) 534-2098
Dept of Family/Preventive 
 Medicine

 E mailto:cbe...@tajo.ucsd.eduUC San Diego
 http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 
 92093-0901




David Winsemius, MD
Heritage Laboratories
West Hartford, CT




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego

Re: [R] underflow of fisher.test result

2009-11-01 Thread Peng Yu

On Tue, Oct 20, 2009 at 8:14 AM, Ted Harding
ted.hard...@manchester.ac.uk wrote:
 On 20-Oct-09 13:34:49, Peng Yu wrote:
 fisher.test() gives a very small p-value, which is underflow on my
 machine. However, the log of it should not be underflow. I'm wondering
 if there is a way get log() of a small p-value. An approximation is
 acceptable in this case. Thank you!

 fisher.test(rbind(c(1,10),c(10,1000)))$p.value
 [1] 0

 I have not attempted an exact approach (though may do so later),
 but the P-value in question is about 1e-15000 so (using log to
 base e)

  log(P) approx = -33000

 In such a case, P=0 is a pretty good approximation!

 Which prompts the question: Why the interest in having the value
 of such a very small number?

Is there any existing method that could return a log() of a very small p-value?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Collapse factor levels

2009-11-01 Thread Kevin E. Thorpe


I'm sure this is simple enough, but an R site search on my subject
terms did suggest a solution.  I have a numeric vector with many
values that I wish to create a factor from having only a few levels.
Here is a toy example.

 x - 1:10
 x - 
factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C))

 x
 [1] A A A B B B C C C C
Levels: A A A B B B C C C C
 summary(x)
A A A B B B C C C C
3 0 0 3 0 0 4 0 0 0

So, there are clearly still 10 underlying levels.  The results I would
like to see from printing the value and summary(x) are:

 x
 [1] A A A B B B C C C C
Levels: A B C
 summary(x)
A B C
3 3 4

Hopefully this makes sense.

Thanks,

Kevin

--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Collapse factor levels

2009-11-01 Thread Jorge Ivan Velez

Hi Kevin,

Here are two suggestions:

# Combination of levels() and table()
table(levels(x))
# A B C
# 3 3 4

# Or defining a function
mysummary - function(x) table(levels(x)) # you can easily improve it :-)
mysummary(x)
# A B C
# 3 3 4

HTH,
Jorge

On Sun, Nov 1, 2009 at 3:51 PM, Kevin E. Thorpe  wrote:

 I'm sure this is simple enough, but an R site search on my subject
 terms did suggest a solution.  I have a numeric vector with many
 values that I wish to create a factor from having only a few levels.
 Here is a toy example.

  x - 1:10
  x -
 factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C))
  x
  [1] A A A B B B C C C C
 Levels: A A A B B B C C C C
  summary(x)
 A A A B B B C C C C
 3 0 0 3 0 0 4 0 0 0

 So, there are clearly still 10 underlying levels.  The results I would
 like to see from printing the value and summary(x) are:

  x
  [1] A A A B B B C C C C
 Levels: A B C
  summary(x)
 A B C
 3 3 4

 Hopefully this makes sense.

 Thanks,

 Kevin

 --
 Kevin E. Thorpe
 Biostatistician/Trialist, Knowledge Translation Program
 Assistant Professor, Dalla Lana School of Public Health
 University of Toronto
 email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] avoiding loop

2009-11-01 Thread parkbomee


Thank you all.

What Chuck has suggested might not be applicable since the number of different 
times is around 40,000.
The object of optimization in my function is the varying value, which is 
basically data * parameter, of which parameter is the object of optimization..
 
And from the r profiling with a subset of data,
I got this report..any idea what Anonymous is?


$by.total
total.time total.pct self.time self.pct
Anonymous   571.56 100.0  0.02  0.0
optim 571.56 100.0  0.00  0.0
fn571.54 100.0  0.98  0.2
eval  423.74  74.1  0.00  0.0
with.default  423.74  74.1  0.00  0.0
with  423.74  74.1  0.00  0.0
tapply414.28  72.5 13.84  2.4
lapply255.48  44.7 76.94 13.5
factor127.68  22.3 11.08  1.9
unlist120.54  21.1 80.46 14.1
FUN94.16  16.5 94.16 16.5
.
.
.
.
.


 Date: Sun, 1 Nov 2009 15:35:41 -0400
 Subject: Re: [R] avoiding loop
 From: jholt...@gmail.com
 To: bbom...@hotmail.com
 CC: dwinsem...@comcast.net; d.rizopou...@erasmusmc.nl; r-help@r-project.org
 
 What you need to do is to understand how to use Rprof so that you can
 determine where the time is being spent.  It probably indicates that
 this is not the source of slowness in your optimization function.  How
 much time are we talking about?  You may spent more time trying to
 optimize the function than just running the current version even if it
 is slow (slow is a relative term and does not hold much meaning
 without some context round it).
 
 On Sat, Oct 31, 2009 at 11:36 PM, parkbomee bbom...@hotmail.com wrote:
 
  Thank you both.
 
  However, using tapply() instead of a loop does not seem to improve my code 
  much.
  I am using this inside of an optimization function,
  and it still takes more than it needs...
 
 
 
  CC: bbom...@hotmail.com; r-help@r-project.org
  From: dwinsem...@comcast.net
  To: d.rizopou...@erasmusmc.nl
  Subject: Re: [R] avoiding loop
  Date: Sat, 31 Oct 2009 22:26:17 -0400
 
  This is pretty much equivalent:
 
  tapply(DF$value[DF$choice==1], DF$time[DF$choice==1], sum) /
   tapply(DF$value, DF$time, sum)
 
  And both will probably fail if the number of groups with choice==1 is
  different than the number overall.
 
  --
  David.
 
  On Oct 31, 2009, at 5:14 PM, Dimitris Rizopoulos wrote:
 
   one approach is the following:
  
   # say 'DF' is your data frame, then
   with(DF, {
  ind - choice == 1
  n - tapply(value[ind], time[ind], sum)
  d - tapply(value, time, sum)
  n / d
   })
  
  
   I hope it helps.
  
   Best,
   Dimitris
  
  
   parkbomee wrote:
   Hi all,
   I am trying to figure out a way to improve my code's efficiency by
   avoiding the use of loop.
   I want to calculate a conditional mean(?) given time.
   For example, from the data below, I want to calculate sum((value|
   choice==1)/sum(value)) across time.
   Is there a way to do it without using a loop?
   time  cum_time  choicevalue
   1 4 1   3
   1 4  0   2
   1  4 0   3
   1  4 0   3
   2 6 1   4
   2 6 0   4
   2 6 0   2
   2 6 0   4
   2 6 0   2
   2 6 0   2 3 4
   1   2 3 4 0   3 3
   4 0   5 3 4 0   2
   My code looks like
   objective[1] = value[1] / sum(value[1:cum_time[1])
   for (i in 2:max(time)){
   objective[i] = value[cum_time[i-1]+1] /
   sum(value[(cum_time[i-1]+1) : cum_time[i])])
   }
   sum(objective)
   Anyone have an idea that I can do this without using a loop??
   Thanks.
  
   _
   [[elided Hotmail spam]]
  [[alternative HTML version deleted]]
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide 
   http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.
  
   --
   Dimitris Rizopoulos
   Assistant Professor
   Department of Biostatistics
   Erasmus University Medical Center
  
   Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
   Tel: +31/(0)10/7043478
   Fax: +31/(0)10/7043014
  
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide 
   http://www.R-project.org/posting-guide.html
   and provide

Re: [R] Collapse factor levels

2009-11-01 Thread David Winsemius



On Nov 1, 2009, at 3:51 PM, Kevin E. Thorpe wrote:


I'm sure this is simple enough, but an R site search on my subject
terms did suggest a solution.  I have a numeric vector with many
values that I wish to create a factor from having only a few levels.
Here is a toy example.

 x - 1:10
 x -  
factor 
(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C))


You have thusly created a pathological situation. In 2.10.0 this is  
what you might see:


  x -  
factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C))

Warning message:
In `levels-`(`*tmp*`, value = c(A, A, A, B, B, B, C,  :
  duplicated levels will not be allowed in factors anymore

What you _should_ have done was:

 x2 - factor(c(A,A,A,B,B,B,C,C,C,C))

The usual approach to getting rid of unused factor levels is just to  
apply the function factor() again without additional arguments.


 x - factor(x)  # the x was from your code
Warning message:
In `levels-`(`*tmp*`, value = c(A, A, A, B, B, B, C,  :
  duplicated levels will not be allowed in factors anymore

# but that will be the last time you will see the warning..

 summary(x)
A B C
3 3 4

--
David.

 x
[1] A A A B B B C C C C
Levels: A A A B B B C C C C
 summary(x)
A A A B B B C C C C
3 0 0 3 0 0 4 0 0 0

So, there are clearly still 10 underlying levels.  The results I would
like to see from printing the value and summary(x) are:

 x
[1] A A A B B B C C C C
Levels: A B C
 summary(x)
A B C
3 3 4

Hopefully this makes sense.

Thanks,

Kevin

--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Collapse factor levels

2009-11-01 Thread Peter Dalgaard


Kevin E. Thorpe wrote:

I'm sure this is simple enough, but an R site search on my subject
terms did suggest a solution.  I have a numeric vector with many
values that I wish to create a factor from having only a few levels.
Here is a toy example.

  x - 1:10
  x - 
factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C))

  x
 [1] A A A B B B C C C C
Levels: A A A B B B C C C C
  summary(x)
A A A B B B C C C C
3 0 0 3 0 0 4 0 0 0

So, there are clearly still 10 underlying levels.  The results I would
like to see from printing the value and summary(x) are:

  x
 [1] A A A B B B C C C C
Levels: A B C
  summary(x)
A B C
3 3 4

Hopefully this makes sense.

Thanks,

Kevin



It's an anomaly inherited frokm S-PLUS (or so I have been told). 
Actually, with the current R, you should get a warning:


 x - 1:10
 x - 
factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C))

Warning message:
In `levels-`(`*tmp*`, value = c(A, A, A, B, B, B, C,  :
  duplicated levels will not be allowed in factors anymore

This works (as documented on the help page for levels!):

 x - 1:10
 x - factor(x,levels=1:10)
 levels(x) - c(A,A,A,B,B,B,C,C,C,C)
 table(x)
x
A B C
3 3 4


--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Collapse factor levels

2009-11-01 Thread Kevin E. Thorpe


Peter Dalgaard wrote:

Kevin E. Thorpe wrote:

I'm sure this is simple enough, but an R site search on my subject
terms did suggest a solution.  I have a numeric vector with many
values that I wish to create a factor from having only a few levels.
Here is a toy example.

  x - 1:10
  x - 
factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C))

  x
 [1] A A A B B B C C C C
Levels: A A A B B B C C C C
  summary(x)
A A A B B B C C C C
3 0 0 3 0 0 4 0 0 0

So, there are clearly still 10 underlying levels.  The results I would
like to see from printing the value and summary(x) are:

  x
 [1] A A A B B B C C C C
Levels: A B C
  summary(x)
A B C
3 3 4

Hopefully this makes sense.

Thanks,

Kevin



It's an anomaly inherited frokm S-PLUS (or so I have been told). 
Actually, with the current R, you should get a warning:


  x - 1:10
  x - 
factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C))

Warning message:
In `levels-`(`*tmp*`, value = c(A, A, A, B, B, B, C,  :
  duplicated levels will not be allowed in factors anymore

This works (as documented on the help page for levels!):

  x - 1:10
  x - factor(x,levels=1:10)
  levels(x) - c(A,A,A,B,B,B,C,C,C,C)
  table(x)
x
A B C
3 3 4




Thanks.  That's exactly what I need.  I knew it was simple.
I've even used levels() before, but it just didn't occur to
me this time.  I'm clearly not on current R. :-)
When I have some time, I'll upgrade.

Kevin

--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] intigrate function and absolute error

2009-11-01 Thread Ravi Varadhan

You just need to extract the list component named value:

I1$value

Ravi.


Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: Laila Alkhalfan laila...@gmail.com
Date: Sunday, November 1, 2009 3:03 pm
Subject: [R] intigrate function and absolute error
To: r-help@r-project.org


 Hi
  Can we get the result of an intigration without the absolute error?
 for example
 f1-function(x1){(1/gamma(alpha))*x1^(alpha-1)*exp(-x1)*log(x1)}
 I1-integrate(f1, 0, (max(cc)-tau1+(theta2/theta1)*tau1)/theta2)
 I1
 0.08007414 with absolute error  7.2e-05
  I need the answer  0.08007414 withou the other part(with absolute error
 7.2e-05)
 how can we do that?
 thank you and take care
 Laila
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 
 PLEASE do read the posting guide 
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] polar.plot

2009-11-01 Thread Tony Greig

Jim and John,

Thanks for your replies. I ended up using both suggestions to plot the full
tide series and then overlay the averages for rise and fall in different
colours, which illustrated very well how such parameters can
be misleadings in some cases.

Regards,
Tony



2009/10/31 Jim Lemon j...@bitwrit.com.au

 On 10/31/2009 04:49 AM, Tony Greig wrote:

 Hi,

 Two questions:

 1 - Say I have average speed and directions for tide and I would like to
 plot them on a polar plot, but with different colors so I can indicate the
 two directions. I'm using polar.plot from the plotrix library. How can I
 add
 a second b and dir.b series to a polar.plot?

 library(plotrix)
 a = 3
 dir.a = 85
 b = 4
 dir.b = 250
 polar.plot(a, dir.a, start = 90, clockwise = T, show.grid.labels = T,
 radial.lim=c(0,5), line.col=2, lwd=2)


 2 - Which parameter in polar.plot can I use to set the orientation for the
 grid labels which seem to default at 90 in my example above?



 Hi Tony,
 The first one is easy:

 polar.plot(c(a,b), c(dir.a,dir.b), start = 90,
  clockwise = T, show.grid.labels=FALSE,

  radial.lim=c(0,5), line.col=2, lwd=2)

 I have had one other person as about an add option, and I might include
 that in a future version. The second one is a bit harder. You probably
 noticed that I changed the show.grid.labels argument to FALSE in the above.

 par(xpd=TRUE)
 boxed.labels(rep(0,5),1:5,1:5,border=NA)
 par(xpd=FALSE)

 This will put the labels vertically up from the center. Doing fancier
 things like having the labels at an angle would require calculating the
 positions, which isn't too hard.

 Jim



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix^(-1/2)

2009-11-01 Thread David Winsemius



On Nov 1, 2009, at 3:12 PM, Charles C. Berry wrote:


On Sun, 1 Nov 2009, David Winsemius wrote:



On Nov 1, 2009, at 1:46 PM, spencerg wrote:


Hi, Chuck:

   Thanks very much, but why do I get package 'expm' is not  
available
   from install.packages(expm,repos=http://R-Forge.R- 
project.org)?


In my case I think it was it is because there is no 2.10 branch to  
either the:


snipped details



So I wonder if the package installers' expectations for the r-forge  
repository are matching up with the tree structures.


Right. FWIW, the source install works OK on my linux box:


sessionInfo()

R version 2.10.0 (2009-10-26)
x86_64-pc-linux-gnu


Yes, as I said, I had already succeeded in a source compilation using:
install.packages(expm,repos=http://R-Forge.R-project.org;,  
type=source)


(I am not quite sure why R was able to navigate the tree, but suspect  
my issues may stem from generally using the 64-bit Mac version of R.)






I should also note that the matpow or %^% functions in expm would  
not address the OP's question since they require that the exponent  
be positive.




Roger that.

If solve(chol(A)) isn't good enough a symmetric inverse square root  
is available from expm as 'solve( sqrtm( A ) )'


I (as a noobisher matrix mechanic)  have discovered that the symmetric  
part is essential. Experiments with non-symmetric examples have come  
to a singularly bad end. The OP did offer a symmetric example, so he  
probably was more matrix-savvy than I.  I am not sufficiently  
knowledgeable to design the error checking. I think a useful addition  
to %^% would be properly designed negative fractional matrix powers  
with more informative error messages for the matrix klutzes among us.  
I hope I am correct in thinking that M %^% (-1/integer) has meaning  
for integers other than 2, correct?  And noticing that the current  
version of %^% rounds the exponent, I think that raises the question  
(given that the matrix argument were symmetric), would one round the 1/ 
negative-exponent? And if so, Up or down?


--
David.



Chuck



--
David.



   Best Wishes,
   Spencer Graves
Charles C. Berry wrote:
 On Sun, 1 Nov 2009, spencerg wrote:
   A question, a comment, and an alternative answer to  
matrix^(-1/2):

QUESTION:
  What's the status of the expm package, mentioned in  
the email you   cited from Martin Maechler, dated Apr 5 19:52:09  
CEST 2008? I tried   both install.packages('expm') and
install.packages(expm,repos=http://R-Forge.R-project.org;), and  
got   package 'expm' is not available in both cases.

 Try
 http://r-forge.r-project.org/projects/expm/
  HTH,
  Chuck
 COMMENT:
  The solution proposed by Venables rests on Sylvester's  
matrix theorem,   which essentially says that if a matrix A is  
diagonalizable with   eigenvalue decomposition eigA - eigen(A)  
and f: D → C with D ⊂ C   be a function for which f(A) is  
well defined   (http://en.wikipedia.org/wiki/Sylvester%27s_matrix_theorem 
), then f(A)   = with(eigA, vectors %*% diag(f(values)) %*%  
solve(vectors)). Maechler   and others have noted that this can  
be one of the least accurate and   most computationally  
expensive ways to compute f(A).

  ALTERNATIVE ANSWER:
  For A^(-1/2), if A is symmetric and nonnegative  
definite, then   solve(chol(A)) would be a very good way to  
compute it.

  Hope this helps,
  Spencer
  David Winsemius wrote:
  On Oct 31, 2009, at 9:33 PM, David Winsemius wrote:
  On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote:
Dear R-Help Team,
 as a R novice I have a (maybe for you very simple  
 question), how do Iget

   the following solved in R:
 Let R be a n x n matrix:
 \mid R\mid^{-\frac{1}{2}}
 solve(A) gives me the inverse of the matrix R,  
however not  the ^(-1/2)of

   the matrix...
   GIYF: (and Bill Venables if friendly, too.)
   http://www.lmgtfy.com/?q=powers+of+matrix+r-project
  I had assumed that the first hit I got:
  https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html
  ... would be the first hit anybody got, but that's not  
necessarilytrue
   now and especially for the future. And further searching  
within the

   results produced this more recent Maechler posting:
  https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html
  For the Mac users, there appears to be no binary, but  
the sourcecompiles

   without error on a 64-bit version of R 2.10.0:
  install.packages(expm,repos=http://R-Forge.R-project.org 
,

   type=source)
  #The suggested code throws an error, so my very minor  
revision wouldbe:

  library(expm)
   ?%^%
__
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide   
http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,

[R] Time Series methods

2009-11-01 Thread Tim Bean

Hello, I have a quick question about time series methodology. If I want to
display a boxplot of time series data, sorted by period, I can type:

boxplot(data ~ cycle(data));

where data is of class ts

Is there a similar method for calculating, say, the median value of each
time step within the series? (So for a monthly data set, calculate median
for all Januarys, all Februarys, all Marchs, etc.)

Thanks,
Tim

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Time Series methods

2009-11-01 Thread Kjetil Halvorsen

introduce a factor variable with the months and then use tapply?

Kjetil

On Sun, Nov 1, 2009 at 9:07 PM, Tim Bean timb...@gmail.com wrote:
 Hello, I have a quick question about time series methodology. If I want to
 display a boxplot of time series data, sorted by period, I can type:

 boxplot(data ~ cycle(data));

 where data is of class ts

 Is there a similar method for calculating, say, the median value of each
 time step within the series? (So for a monthly data set, calculate median
 for all Januarys, all Februarys, all Marchs, etc.)

 Thanks,
 Tim

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] convert list to numeric

2009-11-01 Thread dadrivr


I would like to preface this by saying that I am new to R, so I would ask
that you be patient and thorough, so that I'm not completely clueless.  I am
trying to convert a list to numeric so that I can perform computations on it 
(specifically mean-center the variable), but I am running into problems.  I
have imported the data set into task (data frame).  The data frame is made
of factors with variable names in the first row.  I am running a loop to set
a variable equal to a column in the data frame.  Here is an example of my
problem:

for (i in 1:dim(task)[2]){
predictor.loop - c(task[i])
predictor.loop.mc - predictor.loop - mean(predictor.loop, na.rm=T)
}

I get the following error: 
Error in predictor.loop - mean(predictor.loop, na.rm = T) : 
  non-numeric argument to binary operator
In addition: Warning message:
In mean.default(predictor.loop, na.rm = T) :
  argument is not numeric or logical: returning NA

The column is entirely made up of numerical data, except for the header,
which is a string.  My problem is that I receive an error because the
predictor.loop variable is not numerical, so I need to find a way to convert
it.  I tried using:
predictor.loop - c(as.numeric(task[i]))
But I get the following error: Error: (list) object cannot be coerced to
type 'double'

If I call the variable, I can assign it to a numerical list (e.g., predictor
loop - task$variablename), but since I am assigning the variable in a loop,
I have to find another way as the variable name would have to change in each
loop iteration.  Any help would be greatly appreciated.  Thanks!
-- 
View this message in context: 
http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Document related data sets.

2009-11-01 Thread Rolf Turner



The manual on Writing R Extensions says that one should ``only  
document a

***single*** data object per Rd file''.  My recollection is that in the
past this commandment (which I occasionally found to be inconvenient)
was enforced.

I recently saw a post to R-help which appeared to indicate that one
***can*** document multiple (presumably related) data sets in a single
Rd file.  I experimented and found that the construction

\usage{
  melvin
  clyde
  irving
  fred
}

which I recollect as *not* working, now appears to work.  This in
effect permits one to document (conveniently) melvin, clyde, irving,
and fred in one data file.  Along with, of course, having

\alias{melvin}
\alias{clyde}
\alias{irving}
\alias{fred}

under the \name{ } field at the top of the file.)

(a) Has the situation indeed changed, or is my ageing
memory playing tricks on me again?

(b) If indeed the situation has changed (or perhaps even
if it hasn't) shouldn't that line ``only document a ***single***
data object per Rd file'' be deleted from ``Writing R Extensions''?

cheers,

Rolf Turner

##
Attention:\ This e-mail message is privileged and confid...{{dropped:9}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Calculate Volume in a PCA

2009-11-01 Thread Ben Bolker




Katja Seis wrote:
 
 Hi,
 my data frame consist of 8 Variables and 120 000 observations. With those
 datas I am running a PCA and after I want to calculate the Volume of the
 PCA-cloud of certain subsets of my data. Does anyone have an idea about a
 function that can do this?
 
 

Not really quite enough information here , although some googling suggests
that
you are looking for a technique used in MRI studies?
http://www.ncbi.nlm.nih.gov/pubmed?term=%22pca%20volume%22itool=QuerySuggestion

  as a wild guess, I'm going to suggest that the answer **might** be that
you should
take your PCA standard deviations and use the formula for the volume of a
(hyper)-ellipsoid
[http://en.wikipedia.org/wiki/Ellipsoid#Volume].  I don't know the general
formula
(pi*s_1*s_2 for an ellipse, 4/3*pi*s_1*s_2*s_3 for an ellipsoid ...) but in
any case the
answer is in general proportional to the product of all of the standard
deviations ...
for a princomp() fit, prod(x$sdev).  In general you can do this without
calculating
PCA explicitly -- if your data matrix is d, then
prod(sqrt(diag(eigen(vcov(d))$values)))
should do it.

  Of course, I haven't tested or proofread these suggestions carefully, use
at your
own risk ...

  
-- 
View this message in context: 
http://old.nabble.com/Calculate-Volume-in-a-PCA-tp26154363p26155426.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] convert list to numeric

2009-11-01 Thread Benilton Carvalho


it appears that what you really want is to use:

task[[i]]

instead of task[i]

b

On Nov 1, 2009, at 11:04 PM, dadrivr wrote:



I would like to preface this by saying that I am new to R, so I  
would ask
that you be patient and thorough, so that I'm not completely  
clueless.  I am
trying to convert a list to numeric so that I can perform  
computations on it
(specifically mean-center the variable), but I am running into  
problems.  I
have imported the data set into task (data frame).  The data frame  
is made
of factors with variable names in the first row.  I am running a  
loop to set
a variable equal to a column in the data frame.  Here is an example  
of my

problem:

for (i in 1:dim(task)[2]){
predictor.loop - c(task[i])
predictor.loop.mc - predictor.loop - mean(predictor.loop, na.rm=T)
}

I get the following error:
Error in predictor.loop - mean(predictor.loop, na.rm = T) :
 non-numeric argument to binary operator
In addition: Warning message:
In mean.default(predictor.loop, na.rm = T) :
 argument is not numeric or logical: returning NA

The column is entirely made up of numerical data, except for the  
header,

which is a string.  My problem is that I receive an error because the
predictor.loop variable is not numerical, so I need to find a way to  
convert

it.  I tried using:
predictor.loop - c(as.numeric(task[i]))
But I get the following error: Error: (list) object cannot be  
coerced to

type 'double'

If I call the variable, I can assign it to a numerical list (e.g.,  
predictor
loop - task$variablename), but since I am assigning the variable in  
a loop,
I have to find another way as the variable name would have to change  
in each

loop iteration.  Any help would be greatly appreciated.  Thanks!
--
View this message in context: 
http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] convert list to numeric

2009-11-01 Thread jim holtman

Can you at least provide an 'str' of the 'task' object (not sure if it
is a dataframe; you said a 'list') so that we know what it looks like.
 It would also be helpful if you would provide a subset of the data
that we could try out a script on it.  The best way would be to post
the first 10 or so rows (and limit columns it more that 10 or so).  A
convenient way is to :

dput(task[1:10,])

and then cut/paste the result into your email.

On Sun, Nov 1, 2009 at 8:04 PM, dadrivr dadr...@gmail.com wrote:

 I would like to preface this by saying that I am new to R, so I would ask
 that you be patient and thorough, so that I'm not completely clueless.  I am
 trying to convert a list to numeric so that I can perform computations on it
 (specifically mean-center the variable), but I am running into problems.  I
 have imported the data set into task (data frame).  The data frame is made
 of factors with variable names in the first row.  I am running a loop to set
 a variable equal to a column in the data frame.  Here is an example of my
 problem:

 for (i in 1:dim(task)[2]){
 predictor.loop - c(task[i])
 predictor.loop.mc - predictor.loop - mean(predictor.loop, na.rm=T)
 }

 I get the following error:
 Error in predictor.loop - mean(predictor.loop, na.rm = T) :
  non-numeric argument to binary operator
 In addition: Warning message:
 In mean.default(predictor.loop, na.rm = T) :
  argument is not numeric or logical: returning NA

 The column is entirely made up of numerical data, except for the header,
 which is a string.  My problem is that I receive an error because the
 predictor.loop variable is not numerical, so I need to find a way to convert
 it.  I tried using:
 predictor.loop - c(as.numeric(task[i]))
 But I get the following error: Error: (list) object cannot be coerced to
 type 'double'

 If I call the variable, I can assign it to a numerical list (e.g., predictor
 loop - task$variablename), but since I am assigning the variable in a loop,
 I have to find another way as the variable name would have to change in each
 loop iteration.  Any help would be greatly appreciated.  Thanks!
 --
 View this message in context: 
 http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] save an object by dynamicly created name

2009-11-01 Thread jeffc


Hi,

I would like to save a few dynamically created objects to disk. The
following is the basic flow of the code segment

for(i = 1:10) {
   m = i:5
   save(m, file = ...) ## ???
}
To distinguish different objects to be saved, I would like to save m as m1,
m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ...

I tried a couple of methods on translating between object names and strings
(below) but couldn't get it to work. 
https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html
http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html

Any suggestions would be appreciated.

thanks

Hao

-- 
View this message in context: 
http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Bagging with SVM

2009-11-01 Thread 柯洁


Dear sir,
If I want to use bagging with SVM, which package should I choose?Thanks!
Best wishes,Jie   
_


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] save an object by dynamicly created name

2009-11-01 Thread Henrik Bengtsson

path - data;
dir.create(path);

for (i in 1:10) {
  m - i:5;
  filename - sprintf(m%02d.Rbin, i);
  pathname - file.path(path, filename);
  save(m, file=pathname);
}

/H

On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote:

 Hi,

 I would like to save a few dynamically created objects to disk. The
 following is the basic flow of the code segment

 for(i = 1:10) {
   m = i:5
   save(m, file = ...) ## ???
 }
 To distinguish different objects to be saved, I would like to save m as m1,
 m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ...

 I tried a couple of methods on translating between object names and strings
 (below) but couldn't get it to work.
 https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html
 http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html

 Any suggestions would be appreciated.

 thanks

 Hao

 --
 View this message in context: 
 http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Bagging with SVM

2009-11-01 Thread Max Kuhn

There is no package that does this, but you can do it yourself pretty
easily (although I think that it is a waste of time to bag this
particular model). You can use the function bagEarth in the caret
package as a prototype.

Max

On Sun, Nov 1, 2009 at 8:32 PM, 柯洁 kejiefina...@hotmail.com wrote:

 Dear sir,
 If I want to use bagging with SVM, which package should I choose?Thanks!
 Best wishes,Jie
 _


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] save an object by dynamicly created name

2009-11-01 Thread David Winsemius



On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote:


path - data;
dir.create(path);

for (i in 1:10) {
 m - i:5;
 filename - sprintf(m%02d.Rbin, i);
 pathname - file.path(path, filename);
 save(m, file=pathname);
}



That would result in each of the ten files containing an object with  
the same  name == m. (Also on my system R data files have type  
Rdta.) So I thought what was requested might have been a slight mod:


path - ~/;
dir.create(path);

for (i in 1:10) {
 assign( paste(m, i, sep=),  i:5)
 filename - sprintf(m%02d.Rdta, i)
 pathname - file.path(path, filename)
 obj =get(paste(m, i, sep=))
 save(obj, file=pathname)
}

--
David.


/H

On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote:


Hi,

I would like to save a few dynamically created objects to disk. The
following is the basic flow of the code segment

for(i = 1:10) {
  m = i:5
  save(m, file = ...) ## ???
}
To distinguish different objects to be saved, I would like to save  
m as m1,

m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ...

I tried a couple of methods on translating between object names and  
strings

(below) but couldn't get it to work.
https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html
http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html

Any suggestions would be appreciated.

thanks

Hao

--
View this message in context: 
http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to execute a funcition which name is stored in a string?

2009-11-01 Thread Ning Ma

Hi, everybody

Is there any way to execute a function, which name is stored in a string.
such as:
a - ls()
foo(a) ## same as ls() itself.

Or, to execute a R command, which is stored in a string
such as:
a - m1 - matrix(1:9,3,3)
foo(a) ## same as the assignment itself

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] using exists with coef from an arima fit

2009-11-01 Thread Erin Hodgess

Dear R People:

I have the output from an arima model fit in an object xxx.

I want to verify that the ma1 coefficient is there, so I did the following:
 xxx$coef
   ar1ar2ma1  intercept
 1.3841297 -0.4985667 -0.996 -0.1091657
 str(xxx$coef)
 Named num [1:4] 1.384 -0.499 -1 -0.109
 - attr(*, names)= chr [1:4] ar1 ar2 ma1 intercept
 exists('xxx$coef[ma1]')
[1] FALSE


Other than using xxx$coef[3], is there another way to check this, please?

Thanks in advance,
Sincerely,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] save an object by dynamicly created name

2009-11-01 Thread Henrik Bengtsson

On Sun, Nov 1, 2009 at 7:48 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote:

 path - data;
 dir.create(path);

 for (i in 1:10) {
  m - i:5;
  filename - sprintf(m%02d.Rbin, i);
  pathname - file.path(path, filename);
  save(m, file=pathname);
 }


 That would result in each of the ten files containing an object with the
 same  name == m. (Also on my system R data files have type Rdta.) So I
 thought what was requested might have been a slight mod:

 path - ~/;
 dir.create(path);

 for (i in 1:10) {
  assign( paste(m, i, sep=),  i:5)
  filename - sprintf(m%02d.Rdta, i)
  pathname - file.path(path, filename)
  obj =get(paste(m, i, sep=))
  save(obj, file=pathname)
 }

Then a more convenient solution is to use saveObject() and
loadObject() of R.utils.  saveObject() does not save the name of the
object save.  If you want to save multiple objects, the wrap them up
in a list.  loadObject() does not assign variable, but instead return
them. Example:

library(R.utils);
x - list(a=1,b=LETTERS,c=Sys.time());
saveObject(x, file=foo.Rbin);
y - loadObject(foo.Rbin);
stopifnot(identical(x,y));

So, for the original example, I'd recommend:

library(R.utils);
path - data;
mkdirs(path);

for (i in 1:10) {
  m - i:5;
  filename - sprintf(m%02d.Rbin, i);
  saveObject(m, file=filename, path=path);
}

and loading the objects back as:

for (i in 1:10) {
  filename - sprintf(m%02d.Rbin, i);
  m - loadObject(filename, path=path);
  print(m);
}

/Henrik


 --
 David.

 /H

 On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote:

 Hi,

 I would like to save a few dynamically created objects to disk. The
 following is the basic flow of the code segment

 for(i = 1:10) {
  m = i:5
  save(m, file = ...) ## ???
 }
 To distinguish different objects to be saved, I would like to save m as
 m1,
 m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ...

 I tried a couple of methods on translating between object names and
 strings
 (below) but couldn't get it to work.
 https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html
 http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html

 Any suggestions would be appreciated.

 thanks

 Hao

 --
 View this message in context:
 http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] using exists with coef from an arima fit: please ignore

2009-11-01 Thread Erin Hodgess

Got it!

There is an xxx$model$theta that does the trick.

Thanks,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to execute a funcition which name is stored in a string?

2009-11-01 Thread David Winsemius



On Nov 1, 2009, at 11:07 PM, Ning Ma wrote:


Hi, everybody

Is there any way to execute a function, which name is stored in a  
string.

such as:
a - ls()
foo(a) ## same as ls() itself.


Need to leave the () off.

 fstr - sum
 eval(parse(text=fstr))(1:5)
[1] 15



Or, to execute a R command, which is stored in a string
such as:
a - m1 - matrix(1:9,3,3)
foo(a) ## same as the assignment itself


 eval(parse(text=m1 - matrix(1:9,3,3)))
 m1
 [,1] [,2] [,3]
[1,]147
[2,]258
[3,]369

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] unusual result with any

2009-11-01 Thread Erin Hodgess

Dear R People:

I am working with a numeric vector and trying to use the any
function.  However, I'm getting some odd results, as shown below:

 xy
[1] 0.7305081 2.4224211
 str(xy)
 num [1:2] 0.73 2.42
 any(xy)  1
[1] FALSE
Warning message:
In any(xy) : coercing argument of type 'double' to logical


What am I doing wrong please?

Thanks in advance,
Sincerely,
Erin



-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] unusual result with any :please ignore

2009-11-01 Thread Erin Hodgess

Dear R People:

Sorry to be so pesky tonight.  I figured it out.



-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] unusual result with any

2009-11-01 Thread Johannes Hüsing


Erin Hodgess schrieb:
  

xy


[1] 0.7305081 2.4224211
  

str(xy)


 num [1:2] 0.73 2.42
  

any(xy)  1


[1] FALSE
Warning message:
In any(xy) : coercing argument of type 'double' to logical
  


What am I doing wrong please?

  
xy  1 should return TRUE FALSE, and you want to apply any() to that. 
Thus: any(xy  1)

any(xy) returns TRUE, as the nonzero numbers are coerced to TRUE
When TRUE is compared with 1, it is coerced to a number (no warning is 
issued here), namely 1.

1  1 returns FALSE.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] save an object by dynamicly created name

2009-11-01 Thread David Winsemius



On Nov 1, 2009, at 11:28 PM, Henrik Bengtsson wrote:

On Sun, Nov 1, 2009 at 7:48 PM, David Winsemius dwinsem...@comcast.net 
 wrote:


On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote:


path - data;
dir.create(path);

for (i in 1:10) {
 m - i:5;
 filename - sprintf(m%02d.Rbin, i);
 pathname - file.path(path, filename);
 save(m, file=pathname);
}



That would result in each of the ten files containing an object  
with the
same  name == m. (Also on my system R data files have type Rdta.)  
So I

thought what was requested might have been a slight mod:

path - ~/;
dir.create(path);

for (i in 1:10) {
 assign( paste(m, i, sep=),  i:5)
 filename - sprintf(m%02d.Rdta, i)
 pathname - file.path(path, filename)
 obj =get(paste(m, i, sep=))
 save(obj, file=pathname)
}


Then a more convenient solution is to use saveObject() and
loadObject() of R.utils.  saveObject() does not save the name of the
object save.


The OP asked for this outcome :

 I would like to save m as m1, m2, m3 ...,
to file /home/data/m1, /home/data/m2, home/data/m3, ...



 If you want to save multiple objects, the wrap them up
in a list.


I agree that a list would makes sense if it were to be stored in one  
file , although it was not what requested. But wouldn't that require  
assign()-ing a name before list()-wrapping?


I suppose we ought to mention that the use of assign to create a  
variable is a FAQ ... 7.21? Yep, I have now referred to it a  
sufficient number of times to refer to it by number.


http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f

--
David


 loadObject() does not assign variable, but instead return
them. Example:

library(R.utils);
x - list(a=1,b=LETTERS,c=Sys.time());
saveObject(x, file=foo.Rbin);
y - loadObject(foo.Rbin);
stopifnot(identical(x,y));




So, for the original example, I'd recommend:

library(R.utils);
path - data;
mkdirs(path);

for (i in 1:10) {
 m - i:5;
 filename - sprintf(m%02d.Rbin, i);
 saveObject(m, file=filename, path=path);
}

and loading the objects back as:

for (i in 1:10) {
 filename - sprintf(m%02d.Rbin, i);
 m - loadObject(filename, path=path);
 print(m);
}
/Henrik



--
David.


/H

On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote:


Hi,

I would like to save a few dynamically created objects to disk. The
following is the basic flow of the code segment

for(i = 1:10) {
 m = i:5
 save(m, file = ...) ## ???
}
To distinguish different objects to be saved, I would like to  
save m as

m1,
m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ...

I tried a couple of methods on translating between object names and
strings
(below) but couldn't get it to work.
https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html
http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html

Any suggestions would be appreciated.

thanks

Hao

--
View this message in context:
http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] save an object by dynamicly created name

2009-11-01 Thread Henrik Bengtsson

On Sun, Nov 1, 2009 at 9:18 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Nov 1, 2009, at 11:28 PM, Henrik Bengtsson wrote:

 On Sun, Nov 1, 2009 at 7:48 PM, David Winsemius dwinsem...@comcast.net
 wrote:

 On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote:

 path - data;
 dir.create(path);

 for (i in 1:10) {
  m - i:5;
  filename - sprintf(m%02d.Rbin, i);
  pathname - file.path(path, filename);
  save(m, file=pathname);
 }


 That would result in each of the ten files containing an object with the
 same  name == m. (Also on my system R data files have type Rdta.) So I
 thought what was requested might have been a slight mod:

 path - ~/;
 dir.create(path);

 for (i in 1:10) {
  assign( paste(m, i, sep=),  i:5)
  filename - sprintf(m%02d.Rdta, i)
  pathname - file.path(path, filename)
  obj =get(paste(m, i, sep=))
  save(obj, file=pathname)
 }

 Then a more convenient solution is to use saveObject() and
 loadObject() of R.utils.  saveObject() does not save the name of the
 object save.

 The OP asked for this outcome :

  I would like to save m as m1, m2, m3 ...,
 to file /home/data/m1, /home/data/m2, home/data/m3, ...


  If you want to save multiple objects, the wrap them up
 in a list.

 I agree that a list would makes sense if it were to be stored in one file ,
 although it was not what requested.

That comment was not for the OP, but for saveObject()/loadObject() in general.

 But wouldn't that require assign()-ing a name before list()-wrapping?

Nope, the whole point of using saveObject()/loadObject() is to save
the objects/values without their names that you happens to choose in
the current session, and to avoid overwriting existing ones in your
next session. My example could also have been:

library(R.utils);
saveObject(list(a=1,b=LETTERS,c=Sys.time()), file=foo.Rbin);
y - loadObject(foo.Rbin);
z - loadObject(foo.Rbin);
stopifnot(identical(y,z));

If you really want to attach the elements of the saved list, do:

attachLocally(loadObject(foo.Rbin));
 str(a)
 num 1
 str(b)
 chr [1:26] A B C D E F G H I J ...
 str(c)
 POSIXct[1:1], format: 2009-11-01 21:30:41


 I suppose we ought to mention that the use of assign to create a variable is
 a FAQ ... 7.21? Yep, I have now referred to it a sufficient number of times
 to refer to it by number.

 http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f

My personal take on assign() and get() is that if you find yourself
using them (at this level), there is a good chance there exists a
better solution that you should use instead.

My $.02

/H


 --
 David

  loadObject() does not assign variable, but instead return
 them. Example:

 library(R.utils);
 x - list(a=1,b=LETTERS,c=Sys.time());
 saveObject(x, file=foo.Rbin);
 y - loadObject(foo.Rbin);
 stopifnot(identical(x,y));


 So, for the original example, I'd recommend:

 library(R.utils);
 path - data;
 mkdirs(path);

 for (i in 1:10) {
  m - i:5;
  filename - sprintf(m%02d.Rbin, i);
  saveObject(m, file=filename, path=path);
 }

 and loading the objects back as:

 for (i in 1:10) {
  filename - sprintf(m%02d.Rbin, i);
  m - loadObject(filename, path=path);
  print(m);
 }
 /Henrik


 --
 David.

 /H

 On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote:

 Hi,

 I would like to save a few dynamically created objects to disk. The
 following is the basic flow of the code segment

 for(i = 1:10) {
  m = i:5
  save(m, file = ...) ## ???
 }
 To distinguish different objects to be saved, I would like to save m as
 m1,
 m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ...

 I tried a couple of methods on translating between object names and
 strings
 (below) but couldn't get it to work.
 https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html
 http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html

 Any suggestions would be appreciated.

 thanks

 Hao

 --
 View this message in context:

 http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 David Winsemius, MD

[R] Using processed objects as arguments of a function

2009-11-01 Thread Steven Kang

Dear R users,

I wish to utilise processed and saved objects as arguments of a function.

Specifically, I have created objects using *assign*  *paste* functions
with an incremental index i, the names of the objects are:

fund1, fund2, fund3,., fund80,. (where the numerical value
increments according to the index i  class of these objects are dataframes)

I wish to collapse these objects row wisely using *rbind* function.

paste(fund, 1:i, sep = ) results in list of objects as characters...  
get(paste(fund, 1:i, sep = )) outputs fund1...

Are there any methods to use these objects as an argument of rbind to
collapse the dataframes?

Your expertise in resolving this issue would be highly appreciated.





Steven

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] avoiding loop

2009-11-01 Thread Martin Morgan

parkbomee bbom...@hotmail.com writes:

 Thank you all.

 What Chuck has suggested might not be applicable since the number of
 different times is around 40,000.

 The object of optimization in my function is the varying value,
 which is basically data * parameter, of which parameter is the
 object of optimization..
  
 And from the r profiling with a subset of data,
 I got this report..any idea what Anonymous is?


 $by.total
 total.time total.pct self.time self.pct
 Anonymous   571.56 100.0  0.02  0.0
 optim 571.56 100.0  0.00  0.0
 fn571.54 100.0  0.98  0.2

You're giving us 'by.total', so these are saying that all the time was
spent in these functions or the functions they called. Probably all
are in 'optim' and its arguments; since little self.time is spent
here, there isn't much to work with

 eval  423.74  74.1  0.00  0.0
 with.default  423.74  74.1  0.00  0.0
 with  423.74  74.1  0.00  0.0

These are probably in the internals of optim, where the function
you're trying to optimize is being set up for evaluation. Again
there's little self.time, and all these say is that a big piece of the
time is being spent in code called by this code.

 tapply414.28  72.5 13.84  2.4
 lapply255.48  44.7 76.94 13.5
 factor127.68  22.3 11.08  1.9
 unlist120.54  21.1 80.46 14.1
 FUN94.16  16.5 94.16 16.5

these look like they are tapply-related calls (looking at the code for
tapply, it calls lapply, factor, and unlist, and FUN is the function
argument to tapply), perhaps from the function you're optimizing (did
you implement this as suggested below?  it would really help to have a
possibly simplified version of the code you're calling).

There is material to work with here, as apparently a fairly large
amount of self.time is being spent in each of these functions. So
here's a sample data set

  n - 10
  set.seed(123)
  df - data.frame(time=sort(as.integer(ceiling(runif(n)*n/5))),
   value=ceiling(runif(n)*5))

It would have been helpful for you to provide reproducible code like
that above, so that the characteristics of your data were easily
reproducible. Let's time tapply

 replicate(5, {
+ system.time(x0 - tapply0(df$value, df$time, sum), gcFirst=TRUE)[[1]]
+ })
[1] 0.316 0.316 0.308 0.320 0.304

tapply is quite general, but in your case I think you'd be happy with

  tapply1 - function(X, INDEX, FUN)
  unlist(lapply(split(X, INDEX), FUN), use.names=FALSE)

 replicate(5, {
+ system.time(x1 - tapply1(df$value, df$time, sum), gcFirst=TRUE)[[1]]
+ })
[1] 0.156 0.148 0.152 0.144 0.152

so about twice the speed (timing depends quite a bit on what 'time' is,
integer or numeric or character or factor). The vector values of the
two calculations are identical, though tapply presents the data as an
array with column names

 identical(as.vector(x0), x1)
[1] TRUE

tapply allows FUN to be anything, but if the interest is in the sum of
each time interval, and the time intervals can be assumed to be sorted
(sorting is not expensive, so could be done on the fly), then

  tapply2 - function(X, INDEX)
  {
  csum - cumsum(c(0, X))
  idx - diff(INDEX) != 0
  csum[c(FALSE, idx, TRUE)] - csum[c(TRUE, idx, FALSE)]
  }

calculates the cumulative sum and the points in INDEX where the time
intervals change. It then takes the difference over the appropriate
interval.

 replicate(5, {
+ system.time(x2 - tapply2(df$value, df$time), gcFirst=TRUE)[[1]]
+ })
[1] 0.024 0.024 0.024 0.024 0.024
 identical(as.vector(x0), x2)
[1] TRUE

This approach could be subject to rounding error (if csum gets very
large and the intervals remain small). To calculate values where
choice == 1 I think you'd want to

  tapply2(df$value * (df$choice==1), df$time)

rather than sub-setting, so that the result of tapply2 is always a
vector of the same length even when some time intervals never have
choice==1.

Because tapply in these examples seems so fast compared to your
calculation, I wonder whether optim is evaluating your function many
times, and that reformulating the optimization might lead to a very
substantial speed-up?

Martin

 .
 .
 .
 .
 .


 Date: Sun, 1 Nov 2009 15:35:41 -0400
 Subject: Re: [R] avoiding loop
 From: jholt...@gmail.com
 To: bbom...@hotmail.com
 CC: dwinsem...@comcast.net; d.rizopou...@erasmusmc.nl; r-help@r-project.org
 
 What you need to do is to understand how to use Rprof so that you can
 determine where the time is being spent.  It probably indicates that
 this is not the source of slowness in your optimization function.  How
 much time are we talking about?  You may spent more time trying to
 optimize the function than just running

Re: [R] How to execute a funcition which name is stored in a string?

2009-11-01 Thread Charlie Sharpsteen

On Sun, Nov 1, 2009 at 8:07 PM, Ning Ma pnin...@gmail.com wrote:

 Hi, everybody

 Is there any way to execute a function, which name is stored in a string.
 such as:
 a - ls()
 foo(a) ## same as ls() itself.

One way to accomplish this by using get() to search for a function
that matches your string. You can assign the return value of get() to
a variable which may be used like the original function object:
a - 'ls'
foo - get( a, mode = 'function' )
foo()
[1] a          foo



-Charlie

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] RMySQL and Stored procedures

2009-11-01 Thread Orvalho Augusto

Can someone provide me a good way to circumvent the lack of calling
Stored Procedures from RMySQL?

I can not rewrite those stored procedures on R because there a lot
more folks here that only understands SQL.

The stored procedure returns a resultset.

Thanks
Caveman

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to execute a funcition which name is stored in a string?

2009-11-01 Thread Henrik Bengtsson

?do.call

/H

On Sun, Nov 1, 2009 at 10:58 PM, Charlie Sharpsteen
ch...@sharpsteen.net wrote:
 On Sun, Nov 1, 2009 at 8:07 PM, Ning Ma pnin...@gmail.com wrote:

 Hi, everybody

 Is there any way to execute a function, which name is stored in a string.
 such as:
 a - ls()
 foo(a) ## same as ls() itself.

 One way to accomplish this by using get() to search for a function
 that matches your string. You can assign the return value of get() to
 a variable which may be used like the original function object:
 a - 'ls'
 foo - get( a, mode = 'function' )
 foo()
 [1] a          foo



 -Charlie

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

72 matches

Mail list logo