date:20120805

Re: [R] regexpr with accents

2012-08-05 Thread Rui Barradas


Hello,

Works for me:

d1 <- data.frame(V1 = 1:3,
V2 = c("some text = 9", "some tèxt = 9", "some other text = 9"))

regexpr("some text = 9", d1$V2)
[1]  1 -1 -1
attr(,"match.length")
[1] 13 -1 -1
regexpr("some tèxt = 9", d1$V2)
[1] -1  1 -1
attr(,"match.length")
[1] -1 13 -1
d1$V1[regexpr("some text = 9",d1$V2) > 0] <- 9
d1$V1[regexpr("some tèxt = 9",d1$V2) > 0] <- 9
d1
  V1                  V2
1  9       some text = 9
2  9       some tèxt = 9
3  3 some other text = 9

What do you mean by "it did not work"? What were the contents of 'd1'?

sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Portuguese_Portugal.1252 LC_CTYPE=Portuguese_Portugal.1252
[3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Portugal.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods base

loaded via a namespace (and not attached):
[1] fortunes_1.5-0

Hope this helps,

Rui Barradas

On 06-08-2012 06:55, Luca Meyer wrote:

Hello,

I have built some syntax to find out whether a given substring is included in a
larger string. It works like this:

d1$V1[regexpr("some text = 9",d1$V2)>0] <- 9

This works fine as long as "some text" contains only standard ASCII characters.
However, it does not work when accents are included, as in the following:

d1$V1[regexpr("some tèxt = 9",d1$V2)>0] <- 9

I have tried substituting "è" with several wildcards, but that did not work. Can
anyone suggest how to have the syntax parse the string ignoring the accent?

Thank you in advance,

Luca

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
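For readers who do want a match that ignores the accent, here is a minimal
sketch of two base-R options. It is an illustration only, not something posted
in this thread: the data frame mirrors the example above, the accented letter
is written as the escape "\u00e8" to sidestep source-encoding problems, and the
names hit_class, plain and hit_plain are made up for this example.

```r
# Hypothetical data, mirroring the example in this thread
d1 <- data.frame(V1 = 1:3,
                 V2 = c("some text = 9", "some t\u00e8xt = 9",
                        "some other text = 9"),
                 stringsAsFactors = FALSE)

# Option 1: a character class that accepts either form of the letter
hit_class <- grepl("some t[e\u00e8]xt = 9", d1$V2)

# Option 2: replace the accented letter first, then match the plain string
# literally (fixed = TRUE switches off regular-expression interpretation)
plain <- chartr("\u00e8", "e", d1$V2)
hit_plain <- grepl("some text = 9", plain, fixed = TRUE)

# Both options flag rows 1 and 2 only
d1$V1[hit_class] <- 9L
print(d1)
```

Option 2 scales better when several accented letters are involved, since both
arguments of chartr() can list many characters at once.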



Re: [R] Head or Tails game

2012-08-05 Thread Nordlund, Dan (DSHS/RDA)

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of darnold
> Sent: Friday, August 03, 2012 9:18 PM
> To: r-help@r-project.org
> Subject: Re: [R] Head or Tails game
> 
> Wow! Some great responses!
> 
> I am getting some great responses. I've only read David, Michael, and
> Dennis
> thus far, leading me to develop this result before reading further.
> 
> lead <- function(x) {
>   n <- length(x)
>   count <- 0
>   if (x[1] >= 0) count <- count + 1
>   for (i in 2:n) {
>     if (x[i] > 0 || (x[i] == 0 && x[i-1] >= 0 )) {
>       count  <- count + 1
>     }
>   }
>   count
> }
> 
> games <- replicate(10000,sample(c(-1,1),40,replace=TRUE))
> 
> games_sum <- apply(games,2,sum)
> plot(table(games_sum))
> 
> games_lead <- apply(games,2,cumsum)
> games_lead <- apply(games_lead,2,lead)
> plot(table(games_lead))
> 
> Now I am going to read Arun, William, and Jeff's responses and see what
> other ideas are being proposed.
> 
> Thanks everyone.
> 
> D.
> 


Here is another solution that doesn't need to define an additional function 
with an explicit loop. It seems to be considerably faster than the approach 
presented above.

system.time({
  set.seed(123)
  games <- matrix(sample(c(-1, 1), 40*10000, TRUE), ncol = 10000)
  games_sum <- apply(games,2,cumsum)
  games_lead <- colSums((games_sum > 0) | (games_sum==0 & games==-1))
})
   user  system elapsed 
   0.08    0.00    0.08 

plot(table(games_sum[40,]))
plot(table(games_lead))


Compare this with your solution

system.time({
set.seed(123)
games <- replicate(10000,sample(c(-1,1),40,replace=TRUE))

games_sum <- apply(games,2,sum)

games_lead <- apply(games,2,cumsum)
games_lead <- apply(games_lead,2,lead)
})
   user  system elapsed 
   0.95    0.02    0.98 


Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
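The two approaches in this message should count the same thing: a step counts
as "in the lead" when the running total is positive, or when it is zero and the
previous total was positive (which, with +1/-1 steps, means the last flip was
-1). The following sketch checks that equivalence on a small simulation. The
function name lead_loop, the seed and the number of games are choices made for
this illustration.

```r
# Loop version, as in the quoted message: x is the running total of one game
lead_loop <- function(x) {
  n <- length(x)
  count <- if (x[1] >= 0) 1 else 0
  for (i in 2:n) {
    if (x[i] > 0 || (x[i] == 0 && x[i - 1] >= 0)) count <- count + 1
  }
  count
}

set.seed(1)
n_games <- 200                       # kept small so the check runs quickly
games <- matrix(sample(c(-1, 1), 40 * n_games, replace = TRUE),
                ncol = n_games)
running <- apply(games, 2, cumsum)   # one column of running totals per game

# Vectorised version: whole-matrix comparisons, then column sums
lead_vec <- colSums(running > 0 | (running == 0 & games == -1))
lead_ref <- apply(running, 2, lead_loop)

stopifnot(all(lead_vec == lead_ref))
```

The speed difference comes from replacing one R-level loop per game with a few
operations on the whole matrix.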



[R] regexpr with accents

2012-08-05 Thread Luca Meyer

Hello,

I have built some syntax to find out whether a given substring is included in a
larger string. It works like this:

d1$V1[regexpr("some text = 9",d1$V2)>0] <- 9

This works fine as long as "some text" contains only standard ASCII characters.
However, it does not work when accents are included, as in the following:

d1$V1[regexpr("some tèxt = 9",d1$V2)>0] <- 9

I have tried substituting "è" with several wildcards, but that did not work. Can
anyone suggest how to have the syntax parse the string ignoring the accent?

Thank you in advance,

Luca


Re: [R] Memory limit for Windows 64bit build of R

2012-08-05 Thread Jie

Hi,

Before someone gives professional advice, you could try an experiment:
Set the Windows virtual memory to be as large as ~128GB (make sure the
hard drive has enough space; a restart might be required);
increase the memory limit in R;
load a big dataset (or iteratively assign it to an object) and do some
calculation. It will definitely be very slow.
I am not sure. Just trying to help.

Best wishes,
Jie
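As a rough aid for this kind of sizing question, the memory needed for numeric
data can be estimated at 8 bytes per double and compared with what R reports
for a real object. This is only a back-of-the-envelope sketch: the row and
column counts are hypothetical, and an analysis usually needs more than the raw
data size because R often makes copies while it works.

```r
# Estimate: 8 bytes per double-precision value
n_rows <- 1e6   # hypothetical row count
n_cols <- 10    # hypothetical column count
est_bytes <- n_rows * n_cols * 8
est_gb <- est_bytes / 1024^3

# Compare with what R reports for one real column of that length
x <- numeric(n_rows)
actual_bytes <- as.numeric(object.size(x))

print(est_gb)
print(format(object.size(x), units = "MB"))
```

The reported size is slightly above the estimate for one column because every
R vector carries a small header.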

On Sun, Aug 5, 2012 at 6:52 PM,  wrote:

> Dear all
>
> I have a Windows Server 2008 R2 Enterprise machine, with 64bit R installed
> running on 2 x Quad-core Intel Xeon 5500 processor with 24GB DDR3 1066 MHz
> RAM.  I am seeking to analyse very large data sets (perhaps as much as
> 10GB), without the additional coding overhead of a package such as
> bigmemory().
>
> My question is this - if we were to increase the RAM on the machine to
> (say) 128GB, would this become a possibility?  I have read the
> documentation on memory limits and it seems so, but would like some
> additional confirmation before investing in any extra RAM.
>
> Kind regards
>
> Alan
>
> Alan Simpson
> Technical Lead, Retail Model Development
> Retail Models Project
> National Australia Bank
>
> Level 15, 500 Bourke St, Melbourne VIC
> Tel: +61 (0) 3 8697 7135  |  Mob: +61 (0) 412 975 955
> Email: alan.x.simp...@nab.com.au
>
>
> The information contained in this email and its attachments may be
> confidential.
> If you have received this email in error, please notify the sender by
> return email,
> delete this email and destroy any copy.
>
> Any advice contained in this email has been prepared without taking into
> account your objectives, financial situation or needs. Before acting on any
> advice in this email, National Australia Bank Limited ABN 12 004 044 937
> AFSL and Australian Credit Licence 230686 (NAB) recommends that
> you consider whether it is appropriate for your circumstances.
> If this email contains reference to any financial products, NAB recommends
> you consider the Product Disclosure Statement (PDS) or other disclosure
> document available from NAB, before making any decisions regarding any
> products.
>
> If this email contains any promotional content that you do not wish to
> receive,
> please reply to the original sender and write "Don't email promotional
> material" in the subject.
>
>

[R] Memory limit for Windows 64bit build of R

2012-08-05 Thread Alan . X . Simpson

Dear all

I have a Windows Server 2008 R2 Enterprise machine, with 64bit R installed 
running on 2 x Quad-core Intel Xeon 5500 processor with 24GB DDR3 1066 MHz 
RAM.  I am seeking to analyse very large data sets (perhaps as much as 
10GB), without the additional coding overhead of a package such as 
bigmemory(). 

My question is this - if we were to increase the RAM on the machine to 
(say) 128GB, would this become a possibility?  I have read the 
documentation on memory limits and it seems so, but would like some 
additional confirmation before investing in any extra RAM.

Kind regards

Alan

Alan Simpson
Technical Lead, Retail Model Development
Retail Models Project
National Australia Bank

Level 15, 500 Bourke St, Melbourne VIC 
Tel: +61 (0) 3 8697 7135  |  Mob: +61 (0) 412 975 955
Email: alan.x.simp...@nab.com.au



Re: [R] deleting columns from a dataframe where NA is more than 15 percent of the column length

2012-08-05 Thread arun

HI,

Try this:
dat1<-data.frame(x=c(NA,NA,rnorm(6,15),NA),y=c(NA,rnorm(8,15)),z=c(rnorm(7,15),NA,NA))
dat1[which(colMeans(is.na(dat1))<=.15)]    
         y
1       NA
2 13.53085
3 12.89453
4 15.02625
5 14.00387
6 15.34618
7 15.69293
8 15.62377
9 14.76479

#You can also use apply, sapply etc.
dat2<-data.frame(x=c(NA,NA,rnorm(6,15),NA),y=c(NA,rnorm(8,15)),z=c(rnorm(7,15),NA,NA),u=c(rnorm(9,15)))
dat2[apply(dat2,2,function(x) mean(is.na(x))<=.15)]  

#dat2[sapply(dat2,function(x) mean(is.na(x))<=.15)]
#dat2[which(colMeans(is.na(dat2))<=.15)] 

         y        u
1       NA 14.56278
2 16.49940 16.25761
3 14.11368 14.08768
4 14.95139 14.01923
5 14.99517 15.91936
6 14.46359 14.07573
7 15.09702 13.94888
8 15.99967 14.97171
9 15.51924 15.59981

A.K.





- Original Message -
From: Faz Jones 
To: r-help@r-project.org
Cc: 
Sent: Sunday, August 5, 2012 9:04 PM
Subject: [R] deleting columns from a dataframe where NA is more than 15 percent 
of the column length

I have a dataframe of 10 different columns (length of each column is
the same). I want to eliminate any column that has 'NA' greater than
15% of the column length. Do i first need to make a function for
calculating the percentage of NA for each column and then make another
dataframe where i apply the function? Whats the best way to do this.


Re: [R] Package to remove collinear variables

2012-08-05 Thread Roberto Moscetti

Hi,
thank you for your help. I know, I need to learn enough statistics to
understand how to process my data. The reason I am writing to this
forum is to ask people for a way to learn.
I am a postharvest researcher and statistics is not my main field, so I
try to do my best.

Do you know a book (or literature) that can help me?

Thank you very much for your time and suggestions.

Best regards,
Roberto

On 05/08/2012 12:55, Jeff Newmiller wrote:

There is no "magic bullet" (package) for your problem. You must either learn
enough statistics to understand how to analyze your data, or consult with someone who
does.

FWIW collinearity is not in general amenable to automatic removal. However, you
can identify which inputs are collinear with each other, and omit the redundant
ones next iteration of your analysis, using (for example) the approach
suggested by Uwe. Deciding WHICH of the redundant inputs is most appropriate
to keep is the part computers are not so good at... that is where you must be
smarter or more creative than the computer.

Also, it would help you get responses if you included the context (earlier
discussion) in your replies.. most people do not use Nabble here. Reading and
following the requests in the footer of every message will also help.
---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Roberto wrote:

I do not know, because I tried to use the rfe function (Backwards Feature
Selection, caret package) to select wavelengths useful for a prediction
model. However, the rfe function gives me back a lot of warning messages
about collinearity between variables.

So, I do not know if your script can be useful.
I tried to use VIF-Regression to select variables, but the rfe function
gives me the same warning messages again.

What do you think about that?

Thank you very much for your help.

Best,
Roberto

--
View this message in context:
http://r.789695.n4.nabble.com/Package-to-remove-collinear-variables-tp4639200p4639226.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] find date between two other dates

2012-08-05 Thread arun



Hi,

Your function is.between() can also be used.

is.between<-function(x,a,b){
 x<=a & x>=b
 }
ddate <-  c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 
06:18:36", "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")
ddate1<-data.frame(date=ddate)
date2<-c("01/12/1998 00:00:00", "31/12/1998 23:59:59", "01/01/1999 00:00:00", 
"31/01/1999 23:59:59", "01/02/1999 00:00:00", "28/02/1999 23:59:59",
"01/03/1999 00:00:00", "31/03/1999 23:59:59")
date3<-as.POSIXct(strptime(date2, "%d/%m/%Y %H:%M:%S"), "GMT")

ddate1[is.between(ddate1$date,date3[2],date3[1]),"Season"]<-1
 ddate1[is.between(ddate1$date,date3[4],date3[3]),"Season"]<-2
 ddate1[is.between(ddate1$date,date3[6],date3[5]),"Season"]<-3
 ddate1[is.between(ddate1$date,date3[8],date3[7]),"Season"]<-4
 ddate1
                 date Season
1 1998-12-29 20:00:33      1
2 1999-01-02 05:20:44      2
3 1999-01-02 06:18:36      2
4 1999-02-02 07:06:59      3
5 1999-03-02 07:10:56      4
6 1999-03-02 07:57:18      4

A.K.


- Original Message -
From: penguins 
To: r-help@r-project.org
Cc: 
Sent: Sunday, August 5, 2012 4:30 PM
Subject: [R] find date between two other dates

Hi,

I am trying to assign "Season" values to dates depending on when they occur.

For example, the following dates would be assigned the following "Season"
numbers based on the "season" intervals detailed below in the code:

ddate                               Season
29/12/1998 20:00:33       1
02/01/1999 05:20:44       2
02/01/1999 06:18:36       2
02/02/1999 07:06:59       3
02/03/1999 07:10:56       4
02/03/1999 07:57:18       4

My approach so far doesn't work because of the time stamps and is probably
very long-winded. However, to prevent errors I would prefer to keep the date
formats as dd/mm/yyyy as opposed to a numeric format. Any help on the
following code would be gratefully received:

ddate <-  c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999
06:18:36", "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999
07:57:18")
ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")

is.between<-function(x, a, b) {
       (x > a) & (b > x)
}

ddate$s1 <- is.between(ddate, 01/12/1998 00:00:00, 31/12/1998 23:59:59)
ddate$s2 <- is.between(ddate, 01/01/1999 00:00:00, 31/01/1999 23:59:59)
ddate$s3 <- is.between(ddate, 01/02/1999 00:00:00, 28/02/1999 23:59:59)
ddate$s4 <- is.between(ddate, 01/03/1999 00:00:00, 31/03/1999 23:59:59)

Many thanks



--
View this message in context: 
http://r.789695.n4.nabble.com/find-date-between-two-other-dates-tp4639231.html
Sent from the R help mailing list archive at Nabble.com.


[R] more efficient way to parallel

2012-08-05 Thread Jie

Dear All,

Suppose I have a program as below: outside is a loop for simulation (with
randomly generated data); inside there are several sapply() calls (10~100) over
the data and something else, but these sapply() calls have to be sequential.
Each sapply() does not involve very intensive calculation (a few seconds only),
so the outer loop takes minutes to finish one iteration.
I guess the better way is to parallelize not the sapply() calls but the outer
loop, but I have no idea how to modify it. I have some simple code here, with
only two sapply() calls for simplicity. The logic inside the sapply() calls is
not important.
Thank you for your attention and suggestions.

library(parallel)
library(MASS)
result.seq=c()
Maxi <- 100
for (i in 1:Maxi)
{
## initialization, not of interest
Sigmahalf <- matrix(sample(1:1,size = 1,replace =T ),  100)
Sigma <- t(Sigmahalf)%*%Sigmahalf
x <- mvrnorm(n=1000, rep(0, 10), Sigma)
xlist <- list()
for (j in 1:1000)
{
xlist[[j]] <- list(X = matrix( x [j, ],5))
}
## end of initialization

dd1 <- sapply(xlist,function(s) {min(abs((eigen(s$X))$values))})
 ##
sumdd1=sum(dd1)
for (j in 1:1000)
{
xlist[[j]]$dd1 <- dd1[j]/sumdd1
}
  ## Assume dd2 and dd1 can not be combined in one sapply()
dd2 <- sapply(xlist, function(s){min(abs((eigen(s$X))$values))+s$dd1})
result.seq[i] <- sum(dd1*dd2)

}
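One way to parallelize the outer loop, sketched here under stated assumptions,
is to wrap the body of one iteration in a function and hand the iteration
indices to parSapply() from the parallel package. The function one_sim() is a
simplified, hypothetical stand-in for the loop body above (it uses rnorm()
rather than mvrnorm(), and smaller sizes), and the two-worker cluster is an
arbitrary choice.

```r
library(parallel)

# Hypothetical stand-in for ONE iteration of the outer simulation loop.
# The steps inside stay sequential, as in the original code.
one_sim <- function(i) {
  xlist <- lapply(1:50, function(j) list(X = matrix(rnorm(25), 5)))
  dd1 <- sapply(xlist, function(s) min(abs(eigen(s$X)$values)))
  w <- dd1 / sum(dd1)                       # plays the role of xlist[[j]]$dd1
  dd2 <- sapply(seq_along(xlist), function(j) {
    min(abs(eigen(xlist[[j]]$X)$values)) + w[j]
  })
  sum(dd1 * dd2)
}

cl <- makeCluster(2)              # assumption: 2 worker processes
clusterSetRNGStream(cl, 123)      # independent, reproducible random streams
result.par <- parSapply(cl, 1:20, one_sim)
stopCluster(cl)

print(length(result.par))
```

Because each iteration is independent, the sequential equivalent is simply
sapply(1:20, one_sim); only the dispatching call changes. The workers start
with empty sessions, so any extra packages or objects the loop body needs must
be loaded or exported to them explicitly.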


[R] Case study on R code speedup

2012-08-05 Thread John C Nash

Recently I looked into some ways to speed up a calculation in R (the Rayleigh
Quotient is the example). I wanted to look at the byte-code compiler too. As a
way of making notes I embedded my attempts in a knitr (.Rnw) file. The
resulting pdf is linked from the Rwiki at
http://rwiki.sciviews.org/doku.php?id=tips:rqcasestudy.  R users may find the
examples helpful.

John Nash
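To give a flavour of the kind of comparison described, here is a small sketch
of the Rayleigh quotient (x'Ax)/(x'x) written three ways: a double loop, the
same loop passed through the byte-code compiler, and a vectorised form. It is
not taken from the linked case study, whose code may differ; the function
names and the test matrix are invented for this illustration.

```r
library(compiler)

# Plain double loop
rq_loop <- function(x, A) {
  n <- length(x)
  num <- 0
  den <- 0
  for (i in seq_len(n)) {
    den <- den + x[i]^2
    for (j in seq_len(n)) {
      num <- num + x[i] * A[i, j] * x[j]
    }
  }
  num / den
}

# Same function, explicitly byte-compiled
rq_loop_cmp <- cmpfun(rq_loop)

# Vectorised form using matrix products
rq_vec <- function(x, A) {
  as.numeric(crossprod(x, A %*% x) / crossprod(x))
}

set.seed(1)
n <- 50
A <- crossprod(matrix(rnorm(n * n), n))   # symmetric test matrix
x <- rnorm(n)

print(c(loop = rq_loop(x, A), compiled = rq_loop_cmp(x, A),
        vectorised = rq_vec(x, A)))
```

Timing each version with system.time() over repeated calls shows the relative
cost. In current versions of R, functions are byte-compiled automatically
after a few calls, so the explicit cmpfun() step may make little difference
there.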


Re: [R] deleting columns from a dataframe where NA is more than 15 percent of the column length

2012-08-05 Thread Jorge I Velez

Hi Faz,

Here is one way of doing it where "x" is your data frame:

x[, colMeans(is.na(x)) <= .15]

HTH,
Jorge.-
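A small deterministic check of the one-liner above may help. The data frame
here is made up so that the NA shares are known in advance (20%, 10% and 0%).
Adding drop = FALSE is a suggestion of this sketch, not part of the original
reply: it keeps the result a data frame even when only one column survives.

```r
# Hypothetical data: 10 rows, known proportions of NA per column
x <- data.frame(a = c(NA, NA, 3:10),   # 2 of 10 missing -> 0.2
                b = c(NA, 2:10),       # 1 of 10 missing -> 0.1
                c = 1:10)              # none missing    -> 0.0

na_share <- colMeans(is.na(x))         # proportion of NA in each column
kept <- x[, na_share <= 0.15, drop = FALSE]

print(na_share)
print(names(kept))
```

Column a is dropped because 20% of its values are missing; b and c are kept.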


On Sun, Aug 5, 2012 at 9:04 PM, Faz Jones <> wrote:

> I have a dataframe of 10 different columns (length of each column is
> the same). I want to eliminate any column that has 'NA' greater than
> 15% of the column length. Do i first need to make a function for
> calculating the percentage of NA for each column and then make another
> dataframe where i apply the function? Whats the best way to do this.
>

[R] deleting columns from a dataframe where NA is more than 15 percent of the column length

2012-08-05 Thread Faz Jones

I have a dataframe of 10 different columns (length of each column is
the same). I want to eliminate any column that has 'NA' greater than
15% of the column length. Do i first need to make a function for
calculating the percentage of NA for each column and then make another
dataframe where i apply the function? Whats the best way to do this.


[R] R: Help xts object Subset Date by Day of the Week

2012-08-05 Thread Douglas Karabasz

I have an xts object made of daily closing prices that I acquired using
quantmod.

Here is my code:

library(xts)
library(quantmod)
library(lubridate)

# Gets SPY data
getSymbols("SPY")

# Subset Prices to just closing price
SP500 <- Cl(SPY)

# Show day of the week for each date using 2-6 for monday-friday
SP500wd <- wday(SP500)

# Add Price and days of week together
SP500wd <- cbind(SP500, SP500wd)

# subset Monday into one xts object
SPmon <- subset(SP500wd, SP500wd$..2=="2")

 

 

I then used the package lubridate to show the days of the week.  Because an
xts object is required to be numeric, each day is represented as a number, so
that Monday=2, Tuesday=3, Wednesday=4, Thursday=5, Friday=6, Saturday=7.
Since this is a financial index you will only see the numbers 2-6, or
Monday-Friday.

I want to subset the data by using the day column.  I would like some help
to figure out the best way to accomplish a few objectives.  

1.   Subset the data so that I only show Mondays in sequence.  However, I
do want to make sure that it shows the date, price and the ..2 column (which
is the day of week) after subsetting the data.  (I have it done but am not
sure if it is the best way.)

2.   Rearrange the object (hopefully without destroying the xts object)
so that my data lines up like a weekly calendar.  It would look like the
following:

   


Monday                             | Tuesday                     | Wednesday                   | Thursday                    | Friday
Long Date   Price       Day Index  | Long Date  Price  Day Index | Long Date  Price  Day Index | Long Date  Price  Day Index | Long Date  Price  Day Index
1/5/2009    92.85       2          | 1/6/2009   93.47  3         | 1/7/2009   90.67  4         | 1/8/2009   84.4   5         | 1/9/2009   89.09  6
1/12/2009   86.95       2          | 1/13/2009  87.11  3         | 1/14/2009  84.37  4         | 1/15/2009  91.04  5         | 1/16/2009  85.06  6
MLK Monday  MLK Monday  MLK Monday | 1/20/2009  80.57  3         | 1/21/2009  84.05  4         | 1/22/2009  82.75  5         | 1/23/2009  83.11  6
1/26/2009   83.68       2          | 1/27/2009  84.53  3         | 1/28/2009  87.39  4         | 1/29/2009  84.55  5         | 1/30/2009  82.83  6

Thank you,

Douglas
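For the first objective, one possible approach is to ask xts for the weekday
of each index entry and subset on that, which avoids adding a day-of-week
column at all. The sketch below uses made-up prices rather than downloaded SPY
data so that it runs offline. Note that .indexwday() in xts numbers the days
0-6 starting from Sunday, so Monday is 1 there, not 2 as in the lubridate
convention used above.

```r
library(xts)

# Made-up daily series covering 5 Jan 2009 to 30 Jan 2009
dates <- seq(as.Date("2009-01-05"), as.Date("2009-01-30"), by = "day")
prices <- xts(seq_along(dates), order.by = dates)
colnames(prices) <- "Close"

# Weekday of each observation: 0 = Sunday, 1 = Monday, ..., 6 = Saturday
is_monday <- .indexwday(prices) == 1
mondays <- prices[is_monday]

print(mondays)
```

The result is still an xts object, so the dates stay attached to the prices.
With real trading data, holidays such as the MLK Monday in the table above
simply do not appear.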



Re: [R] find date between two other dates

2012-08-05 Thread arun

HI,

Try this:
ddate <-  c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 
06:18:36", "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT") 
ddate1<-data.frame(date=ddate)
date2<-c("01/12/1998 00:00:00", "31/12/1998 23:59:59", "01/01/1999 00:00:00", 
"31/01/1999 23:59:59", "01/02/1999 00:00:00", "28/02/1999 23:59:59",
"01/03/1999 00:00:00", "31/03/1999 23:59:59")
date3<-as.POSIXct(strptime(date2, "%d/%m/%Y %H:%M:%S"), "GMT")
ddate1[ddate1$date<=date3[2]& ddate1$date>=date3[1],"Season"]<-1
ddate1[ddate1$date<=date3[4]& ddate1$date>=date3[3],"Season"]<-2
ddate1[ddate1$date<=date3[6]& ddate1$date>=date3[5],"Season"]<-3
ddate1[ddate1$date<=date3[8]& ddate1$date>=date3[7],"Season"]<-4

 ddate1
                 date Season
1 1998-12-29 20:00:33      1
2 1999-01-02 05:20:44      2
3 1999-01-02 06:18:36      2
4 1999-02-02 07:06:59      3
5 1999-03-02 07:10:56      4
6 1999-03-02 07:57:18      4


A.K.


Re: [R] find date between two other dates

2012-08-05 Thread Rui Barradas


Hello,

You can use a function that returns the number you want, not a logical 
value.
But first, it's a bad idea to have a data.frame and a vector with the 
same name, so, in what follows, I've altered the df name.


ddate <-  c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999
06:18:36", "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999
07:57:18")
ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")
ddat <- data.frame(ddate=ddate)   # Here, different name.

season.month <- function(x){
    x <- as.integer(format(x, format="%m"))
    ifelse(x == 12L, 1L, x + 1L)
}

season.month(ddate)
ddat$season <- season.month(ddate)
str(ddat)
'data.frame':   6 obs. of  2 variables:
 $ ddate : POSIXct, format: "1998-12-29 20:00:33" "1999-01-02 05:20:44" ...
 $ season: int  1 2 2 3 4 4

ddat
                ddate season
1 1998-12-29 20:00:33      1
2 1999-01-02 05:20:44      2
3 1999-01-02 06:18:36      2
4 1999-02-02 07:06:59      3
5 1999-03-02 07:10:56      4
6 1999-03-02 07:57:18      4

Hope this helps,

Rui Barradas
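If the seasons should follow explicit start dates rather than the calendar
month alone (for example when the data span several years), findInterval() on
a sorted vector of season start times is another option. This is a sketch
under the assumption that each season begins exactly where the previous one
ends; the breaks vector is constructed for this illustration from the
intervals given in the question.

```r
fmt <- "%d/%m/%Y %H:%M:%S"
ddate <- as.POSIXct(c("29/12/1998 20:00:33", "02/01/1999 05:20:44",
                      "02/02/1999 07:06:59", "02/03/1999 07:57:18"),
                    format = fmt, tz = "GMT")

# Start of each season, plus the end of the last one
breaks <- as.POSIXct(c("01/12/1998", "01/01/1999", "01/02/1999",
                       "01/03/1999", "01/04/1999"),
                     format = "%d/%m/%Y", tz = "GMT")

# Index of the interval each time stamp falls into: season 1 to 4 here.
# Dates before the first break would give 0, dates after the last give 5.
season <- findInterval(as.numeric(ddate), as.numeric(breaks))

print(data.frame(ddate = ddate, season = season))
```

The dates stay in POSIXct form throughout; the conversion to numbers happens
only inside the findInterval() call.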


[R] Problem with segmented function

2012-08-05 Thread stella

Hi,

I appreciate your help with the segmented function. I am relatively new to
R. I followed the introduction of the 'segmented'-package by Vito Muggeo,
but still it does not work.
Here are the lines I wrote:

data_test<-data.frame(x=c(1:10),y=c(1,1,1,1,1,2,3,4,5,6))
lr_test<-lm(y~x,data_test)
seg_test<-segmented(lr_test,seg.Z~x,psi=1)


/error in segmented.lm(lr_test, seg.Z ~ x, psi = 1) : 
 A wrong number of terms in `seg.Z' or `psi'/

Thank you very much,
Stella



--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-segmented-function-tp4639227.html
Sent from the R help mailing list archive at Nabble.com.


[R] find date between two other dates

2012-08-05 Thread penguins

Hi,

I am trying to assign "Season" values to dates depending on when they occur.

For example, the following dates would be assigned the following "Season"
numbers based on the "season" intervals detailed below in the code:

ddate   Season
29/12/1998 20:00:33   1
02/01/1999 05:20:44   2
02/01/1999 06:18:36   2
02/02/1999 07:06:59   3
02/03/1999 07:10:56   4
02/03/1999 07:57:18   4

My approach so far doesn't work because of the time stamps and is probably
very long-winded. However, to prevent errors I would prefer to keep the date
formats as dd/mm/ as opposed to a numeric format. Any help on the
following code would be gratefully received:
 
ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
           "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")

is.between<-function(x, a, b) {
   (x > a) & (b > x)
}

ddate$s1 <- is.between(ddate, 01/12/1998 00:00:00, 31/12/1998 23:59:59)
ddate$s2 <- is.between(ddate, 01/01/1999 00:00:00, 31/01/1999 23:59:59)
ddate$s3 <- is.between(ddate, 01/02/1999 00:00:00, 28/02/1999 23:59:59)
ddate$s4 <- is.between(ddate, 01/03/1999 00:00:00, 31/03/1999 23:59:59)

Many thanks
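
One way this could be approached, sketched under assumptions: the season boundaries are kept as dd/mm/yyyy strings and parsed once, and findInterval() looks up which interval each time stamp falls in. The names breaks, season and result are introduced here for illustration, and the end of season 4 (01/04/1999) is assumed from the intervals in the post.

```r
ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
           "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
ddate <- as.POSIXct(ddate, format = "%d/%m/%Y %H:%M:%S", tz = "GMT")

# Season start dates, still written as dd/mm/yyyy strings; the last entry
# closes season 4 (assumed here to end with March 1999)
breaks <- as.POSIXct(c("01/12/1998", "01/01/1999", "01/02/1999",
                       "01/03/1999", "01/04/1999"),
                     format = "%d/%m/%Y", tz = "GMT")

# For each date, findInterval() returns the index of the interval it falls in
season <- findInterval(as.numeric(ddate), as.numeric(breaks))
season[season == 0 | season == length(breaks)] <- NA   # outside all seasons

result <- data.frame(ddate = format(ddate, "%d/%m/%Y %H:%M:%S"),
                     Season = season)
print(result)
```

Note that these intervals include the starting midnight of each season, which differs slightly from the strict inequalities in is.between(). In the original attempt the boundaries also need to be quoted strings converted with as.POSIXct(), and ddate needs to be a data frame column before `$` assignment can work.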



--
View this message in context: 
http://r.789695.n4.nabble.com/find-date-between-two-other-dates-tp4639231.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Package to remove collinear variables

2012-08-05 Thread Jeff Newmiller

There is no "magic bullet" (package) for your problem. You must either learn 
enough statistics to understand how to analyze your data, or consult with 
someone who does.

FWIW collinearity is not in general amenable to automatic removal. However, you 
can identify which inputs are collinear with each other, and omit the redundant 
ones next iteration of your analysis, using (for example) the approach 
suggested by Uwe.  Deciding WHICH of the redundant inputs is most appropriate 
to keep is the part computers are not so good at... that is where you must be 
smarter or more creative than the computer.

Also, it would help you get responses if you included the context (earlier 
discussion) in your replies.. most people do not use Nabble here. Reading and 
following the requests in the footer of every message will also help.
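
As an illustration of the "identify which inputs are collinear" step, here is a minimal sketch on made-up data. The 0.99 threshold and the variable names are assumptions; a pairwise correlation check of this kind only finds inputs that are collinear in pairs, not redundancy spread over three or more inputs.

```r
set.seed(1)
x1 <- rnorm(50)
x2 <- 2 * x1 + 3          # exact linear function of x1
x3 <- rnorm(50)           # unrelated input
X  <- cbind(x1, x2, x3)

r <- cor(X)
r[upper.tri(r, diag = TRUE)] <- NA        # look at each pair only once
flagged <- which(abs(r) > 0.99, arr.ind = TRUE)

# Table of the pairs that exceed the threshold
data.frame(var1 = rownames(r)[flagged[, "row"]],
           var2 = colnames(r)[flagged[, "col"]],
           r    = r[flagged])
```

Which member of each flagged pair to drop remains a judgment call, as the message says.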
---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Roberto  wrote:

>I do not know, because I tried to use rfe function (Backwards Feature
>Selection, Caret Package) to select wavelengths useful for a prediction
>model. Otherwise, rfe function give me back a lot of warning messages
>about
>collinearity between variables.
>
>So, I do not know if your script can be useful.
>I tried to use VIF-Regression to select variables, but rfe function
>advise
>me with the same warning messages again.
>
>What do you think about that?
>
>Thank you very much for your help.
>
>Best,
>Roberto
>
>
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/Package-to-remove-collinear-variables-tp4639200p4639226.html
>Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] Find out what "native.enc" corresponds to

2012-08-05 Thread Milan Bouchet-Valat

On Sunday 05 August 2012 at 10:04 +0100, Prof Brian Ripley wrote:
> On 05/08/2012 09:54, Milan Bouchet-Valat wrote:
> > Hi!
> >
> > I'm using R2HTML in my RcmdrPlugin.temis package to output localized
> > strings to a HTML file. Thus, I insert a simple header at the top of the
> > file to specify what encoding is used; if I don't do that, Web browsers
> > assume it is latin1, which is not always true.
> >
> > My problem is, I could not find a way to detect what encoding is used by
> > R2HTML in the most general case. R2HTML simply calls cat() with the file
> > name, which means the text connection is opened using file(encoding =
> > getOption("encoding")). This is fine, except that when
> > getOption("encoding") is set to "native.enc", I'm not able to find out
> > the real encoding that was used for output.
> >
> > Of course, ideally I would tell R2HTML to output everything as UTF-8,
> > and I would add this information to the header. But AFAICT this is not
> > possible in the current state of this package. So I would be very
> > grateful if somebody could provide me with a solution to resolve
> > "native.enc" to the encoding name.
> 
> ?options points you to ?connections, which does explain this.  See 
> Sys.getlocale("LC_CTYPE") to see
> 
> 'the internal encoding of the current locale'
> 
> (or at least, what the OS claims it to be: e.g. some lie about 'C' locales).
Thanks for the pointers, but the issue is/was that LC_CTYPE does not
provide a valid encoding name. But your reply prompted me to read ?iconv
again, and I discovered the existence of localeToCharset(), which seems
to provide me with the encoding name I'm looking for.

> As for a name, iconv() knows this as "" (and some OSes do make it rather 
> hard to find a name if it is not part of the locale name).
I'm afraid I don't understand what you mean. Do you suggest I encode
data to/from the current encoding?
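
For readers following along, a small sketch of how localeToCharset() might be used to resolve "native.enc" to a name. The variable names enc and txt_utf8 are made up for this example, and the result depends on the platform and locale.

```r
enc <- getOption("encoding")
if (identical(enc, "native.enc")) {
  # localeToCharset() may return several candidates, or NA if it cannot tell
  enc <- utils::localeToCharset()[1]
}
print(enc)

# iconv() accepts "" as "the current native encoding", which may be what the
# quoted remark refers to; no encoding name is needed for this conversion
txt_utf8 <- iconv("some text", from = "", to = "UTF-8")
```

If the name is only needed for an HTML header, the value of enc could be pasted into the charset declaration, with a fallback for the NA case.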


Regards


Re: [R] Accessing more than two coefficients in a plot

2012-08-05 Thread FJ M


I needed to create my own forecast from the square root, linear and quadratic 
coefficients and then the abline() plot worked fine. 
 
# Forecast l using non-linear regression coeffs - unweighted
lm2.bforecast<- numeric(n)
for (i in 1:n)
{
lm2.bforecast[i] <- 
lm2.b$coeff["(Intercept)"]+lm2.b$coeff["VV1_2"]*VV1_2[i]+lm2.b$coeff["VV1_22"]*VV1_22[i]+lm2.b$coeff["VV1_212"]*VV1_212[i]
}
lm2.bforecastline<-lm(lm2.bforecast ~ VV1_2, method = "qr", model = TRUE, x = 
FALSE, y = FALSE, qr = TRUE) # unweighted, non-linear regression forecast

plot(VV1_2, Lambda1_2,  ylim=yrange, tck=1,
     main="Verizon V(1) Parameters (V, V^2 & V^0.5) Unweighted",
     xlab="VV1_2", ylab="Lambda1_2 & Beta1_2", pch=19, col="red")
{points(VV1_2, lm2.lforecast, pch=19, col="brown")
abline(lm2.lforecastline, col="brown", lty="longdash", lwd=2)
 ...
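
An alternative to building the forecast by hand, sketched on simulated data: predict() evaluates all fitted terms on a grid, and lines() draws the resulting curve. The variables v and lambda are stand-ins for VV1_2 and Lambda1_2, and the model formula is an assumption based on the "V, V^2 & V^0.5" description.

```r
set.seed(42)
v      <- seq(1, 10, length.out = 40)                 # stand-in for VV1_2
lambda <- 2 + 0.5 * v - 0.03 * v^2 + 1.2 * sqrt(v) + rnorm(40, sd = 0.1)

# Linear, quadratic and square-root terms in one model
fit <- lm(lambda ~ v + I(v^2) + I(sqrt(v)))

# Evaluate the fitted curve on a fine, sorted grid of v values
grid <- data.frame(v = seq(min(v), max(v), length.out = 200))
grid$fit <- predict(fit, newdata = grid)

plot(v, lambda, pch = 19, col = "red")
lines(grid$v, grid$fit, col = "brown", lty = "longdash", lwd = 2)
```

Unlike abline(), which can only draw a straight line from two coefficients, this draws the curve implied by all four.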
 

> Date: Sun, 25 Mar 2012 15:36:20 -0700
> Subject: Re: [R] Accessing more than two coefficients in a plot
> From: gunter.ber...@gene.com
> To: chicagobrownb...@hotmail.com
> CC: r-help@r-project.org
> 
> Well, as a line in the plane is determined by 2 coefficients only, I'd
> guess that trying to find an R function that plots a line defined by 4
> coefficients has about the same chance of success as finding a unicorn
> with 3 horns.
> 
> You do understand that your linear model defines a hyperplane in your
> three covariates, do you not? Or do I misunderstand what you have
> requested?
> 
> Cheers,
> Bert
> 
> On Sun, Mar 25, 2012 at 2:32 PM, FJ M  wrote:
> >
> > I've successfully plotted (in the plot and abline code below) a simple 
> > regression of Lambda1_2 on VV1_2. I then successfully regressed Lambda1_2 
> > on VV1_2, VV1_22 and VV1_212 producing lm2.l. When I go to plot lm2.l using 
> > abline I get the warning:
> >
> > "1: In abline(lm2.l, col = "brown", lty = "dotted", lwd = 2) : only using 
> > the first two of 4 regression coefficients"
> >
> > Is there another function like abline that will produce a line using the 
> > constant and three coefficients from the lm2.l regression?
> >
> >
> > lm.l <- lm(Lambda1_2 ~ VV1_2, method = "qr", model = TRUE, x = FALSE, y = 
> > FALSE, qr = TRUE) # unweighted regression
> >
> > lm2.l <- lm(Lambda1_2 ~ VV1_2 + VV1_22 + VV1_212, method = "qr", model = 
> > TRUE, x = FALSE, y = FALSE, qr = TRUE) # unweighted regression
> >
> > plot(VV1_2, Lambda1_2, ylim=yrange, tck=1, main="V(1) Parameters (V, V^2 & 
> > V^0.5)", xlab="VV1_2", ylab="Lambda & Beta1_2",pch=19,col="red")
> > {abline(lm2.l, col="brown", lty="dotted", lwd=2)
> > abline(wlm2.l, col="gold",lty="longdash", lwd=2)
> > points(VV1_2, Beta1_2, pch=19, col="blue")
> > abline(lm2.b, col="black",lty="dotted", lwd=2)
> > abline(wlm2.b, col="blue", lty="longdash", lwd=2)
> > legend("topright", inset=.05, title="Parameters",
> > labels, lwd=2, lty=c(1, 1, 1, 1, 2), col=colors)
> > }
> 
> 
> 
> -- 
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> 
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
  
[[alternative HTML version deleted]]


Re: [R] Package to remove collinear variables

2012-08-05 Thread Roberto

I do not know, because I tried to use the rfe function (backwards feature
selection, caret package) to select wavelengths useful for a prediction
model. However, rfe gives me a lot of warning messages about
collinearity between variables.

So I do not know whether your script can be useful.
I tried to use VIF regression to select variables, but rfe gave me
the same warning messages again.

What do you think about that?

Thank you very much for your help.

Best,
Roberto



--
View this message in context: 
http://r.789695.n4.nabble.com/Package-to-remove-collinear-variables-tp4639200p4639226.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Extracting desired numbers from complicated lines of web pages

2012-08-05 Thread jim holtman

Try this. Grouping by 'userid' (which might be needed, in which case you
might want to check for non-existent values) is left as an exercise to the
reader. Also, for the last line you did not say whether there are only
those three values or whether there could be more.


input <- readLines(textConnection('
+ [1] "\t\t\t108
Friends"
+
+  [2] "\t\t\t151
Reviews"
+
+  [3] "\t\t\t\t5 Review Updates"
+
+  [4] "\t\t\t\t1
First"
+
+  [5] "\t\t\t\t2 Fans"
+
+  [6] "\t\t\t\t54 Local
Photos"
+
+  [7] http://s3-media2.ak.yelpcdn.com/assets/0/www/img/cf265851428e/ico/reviewVotes.gif";
alt=""> Review votes: 2022 Useful, 1591 Funny, and 1756 Cool
+
+ [[alternative HTML version deleted]]'))
>
> # extract the data by brute force and then break apart into a dataframe
> count <- lapply(input, function(.line){
+ if (grepl('[0-9]+ Friends', .line))
+ return(sub(".*>([0-9]+) (Friends).*", "\\1:\\2", .line))
+ if (grepl("[0-9]+ Reviews", .line))
+ return(sub(".*>([0-9]+) (Reviews).*", "\\1:\\2", .line))
+ if (grepl("[0-9]+ Review Update", .line))
+ return(sub(".*>([0-9]+) (Review Update).*", "\\1:\\2", .line))
+ if (grepl("[0-9]+ First", .line))
+ return(sub(".*>([0-9]+) (First).*", "\\1:\\2", .line))
+ if (grepl("[0-9]+ Fans", .line))
+ return(sub(".*>([0-9]+) (Fans).*", "\\1:\\2", .line))
+ if (grepl("[0-9]+ Local Photos", .line))
+ return(sub(".*>([0-9]+) (Local Photos).*", "\\1:\\2", .line))
+ if (grepl("[0-9]+ Useful", .line))
+ return(c(  # vector with multiple values
+ sub(".* ([0-9]+) (Useful).*", "\\1:\\2", .line)
+   , sub(".* ([0-9]+) (Funny).*", "\\1:\\2", .line)
+   , sub(".* ([0-9]+) (Cool).*", "\\1:\\2", .line)
+   ))
+ return(NULL)
+ })
>
> # create dataframe
> df <- data.frame(do.call(rbind, strsplit(unlist(count), ":")))
> names(df) <- c("Value", "Variable")
> df
  Value      Variable
1   108       Friends
2   151       Reviews
3     5 Review Update
4     1         First
5     2          Fans
6    54  Local Photos
7  2022        Useful
8  1591         Funny
9  1756          Cool
>
>
>
>
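
A more compact variant of the same idea, as a sketch: loop over the known labels and take the number directly in front of each, returning NA when a label never appears (as requested for "First"). The sample lines below are invented stand-ins, since the original markup was stripped by the list archive, and the labels vector is only a subset chosen for illustration.

```r
lines <- c('<a href="#">108 Friends</a>',
           '<a href="#">151 Reviews</a>',
           'alt=""> Review votes: 2022 Useful, 1591 Funny, and 1756 Cool')
labels <- c("Friends", "Reviews", "First", "Useful", "Funny", "Cool")

txt <- paste(lines, collapse = "\n")
value <- vapply(labels, function(lab) {
  # digits followed by a space and the label (Perl-style lookahead)
  pattern <- paste0("[0-9]+(?= ", lab, "\\b)")
  m <- regmatches(txt, regexpr(pattern, txt, perl = TRUE))
  if (length(m) == 0) NA_integer_ else as.integer(m)
}, integer(1))

df <- data.frame(Value = unname(value), Variable = labels,
                 stringsAsFactors = FALSE)
print(df)
```

Only the first occurrence of each label is used, so pages for several users would need to be processed one at a time.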

On Sun, Aug 5, 2012 at 11:16 AM, Shelby McIntyre  wrote:
> I need to extract the indicated (bold & underlined) numbers from lines coming 
> off web pages.
>
> Of course I don't know ahead of time the location or length of the number.  
> What I do know
> is the tag "Friends", and "Reviews", etc. In fact, it would be good to end up 
> with
>
> Value   Variable
> 108   Friends
> 151   Reviews
> 5   Review Updates
>   NA  First <-- assuming here that "First" did not show 
> up on an line
> etc.
>
> Of particular trouble is line [7] which requires extracting 3 numbers 2022 
> (Useful), 1591 (Funny) and 1756 (Cool).
> == Extraction problem lines ===
>
> [1] "\t\t\t href=\"/user_details_friends?userid=--T8djg0nrb_yMMMA3Y0jQ\">108 
> Friends"
>
>  [2] "\t\t\t href=\"/user_details_reviews_self?userid=--T8djg0nrb_yMMMA3Y0jQ\">151 
> Reviews"
>
>  [3] "\t\t\t\t5 Review Updates"
>
>  [4] "\t\t\t\t href=\"/user_details_reviews_self?review_filter=first&userid=--T8djg0nrb_yMMMA3Y0jQ\">1
>  First"
>
>  [5] "\t\t\t\t2 Fans"
>
>  [6] "\t\t\t\t href=\"/user_local_photos?userid=--T8djg0nrb_yMMMA3Y0jQ\">54 Local 
> Photos"
>
>  [7]  src="http://s3-media2.ak.yelpcdn.com/assets/0/www/img/cf265851428e/ico/reviewVotes.gif";
>  alt=""> Review votes: 2022 Useful, 1591 Funny, and 1756 Cool
>
> [[alternative HTML version deleted]]
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] Parallel runs of an external executable with snow in local

2012-08-05 Thread Uwe Ligges




On 03.08.2012 19:21, Xavier Portell/UPC wrote:

Hi everyone,

I'm aiming to run an external executable (say filetorun.EXE) in parallel. The executable reads the data it needs from a 
file, say "input.txt", and in turn generates several output files, say "output.txt". I need to generate 
"input.txt", run the executable and keep both "input.txt" and "output.txt". I'm using Windows 7, R 
version 2.15.1 (2012-06-22) on RStudio, platform i386-pc-mingw32/i386 (32-bit).

My first attempt was R code which, by using
   system("filetorun.EXE", intern = FALSE, ignore.stdout = FALSE,
   ignore.stderr = FALSE, wait = TRUE, input = NULL,
   show.output.on.console = TRUE, minimized = FALSE, invisible = TRUE)
ran the executable and saved the required files to a conveniently named folder. 
After that I changed my previous R script so I could use the function 
lapply(). This script apparently worked fine.

Finally, I tried to parallelize the problem by using snow and parLapply(). The 
resulting script looks like this:

## Not run
#
library(snow)
cl <- makeCluster(3, type = "SOCK")
clusterExport(cl, list('param.esp','copy.files','for12.template','program.executor'))
parLapply(cl, a.list, a.function)
stopCluster(cl)
#
##End not run

Although it runs, the parallelized version is messing up the input parameters to pass to the 
executable (see table below, where parameters P1 and P2 are considered. ".s" comes from 
the serial code and ".p" from the parallelized one):
   s r P1.s P2.s P1.p P2.p
1 1 1  1.0 3.00  2.0 3.00
2 2 1  1.5 3.00  2.0 3.75
3 3 1  2.0 3.00  2.0 3.00
4 4 1  1.0 3.75  1.5 3.00
5 5 1  1.5 3.75  1.5 3.00
6 6 1  2.0 3.75  2.0 3.75

My first thought to avoid the described behaviour was creating a temporary file, say "tmp.id" with id being an identification run number, 
and copying "filetorun.EXE" and "Input.txt" to "tmp.id". However, while doing so, I realised that although running the 
correct "filetorun.EXE" copy (i.e., the one in "tmp.id") R looks for "input.txt" in the work directory.



Not sure about the real setup, but you can actually specify the path, 
not only filenames.
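
To make the suggestion concrete, a minimal sketch under assumptions: each run gets its own directory, the input file is written there by full path, and the executable is called by its full path with that directory as the working directory. The function name run_one, the one-value-per-line layout of input.txt and the use of tempdir() are all made up for illustration; the real input format is not shown in the thread.

```r
run_one <- function(id, params, exe = "filetorun.EXE", base = tempdir()) {
  # One directory per run, so parallel runs never share input.txt
  run_dir <- file.path(base, paste0("run_", id))
  dir.create(run_dir, showWarnings = FALSE, recursive = TRUE)
  writeLines(as.character(params), file.path(run_dir, "input.txt"))

  # Resolve the executable before changing directory
  exe_path <- normalizePath(exe, mustWork = FALSE)
  if (file.exists(exe_path)) {
    old <- setwd(run_dir)              # the executable reads ./input.txt
    on.exit(setwd(old), add = TRUE)
    system2(exe_path)
  }
  run_dir
}

# Serial check; with snow, parLapply(cl, 1:3, ...) could be used instead,
# since each worker is a separate R process with its own working directory
dirs <- vapply(1:3, function(i) run_one(i, c(i, 3.75)), character(1))
```

The executable call is skipped when the file is not found, so the directory handling can be checked on its own first.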


Uwe Ligges





I've been looking thoroughly for a solution but I got nothing.

Thanks for any help in advance,


Xavier Portell Canal

PhD candidate
Department of Agri-food engineering,
Universitat Politècnica de Catalunya


Re: [R] PROGRAMM MATRIX

2012-08-05 Thread arun

HI,

I think it will be better if you posted this on R-help.

A.K.  



- Original Message -
From: "hafida...@hotmail.fr" 
To: smartpink...@yahoo.com
Cc: 
Sent: Sunday, August 5, 2012 9:01 AM
Subject: PROGRAMM MATRIX

Hi,
can you please help me program this formula: 

g[ll']=i[ll']-sum from j=1 to k  c[lj]c[l'j]A[j]^-1 WHERE 
i[ll']= 1/n  sum from i=1 to n   z[il]z[il']

n,k,m  are given.  j=1...k,    l,l'=1...m,  

It is complicated for me; I hope you can help me. 
Thanks a lot.
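
One reading of the formula, as a hedged sketch: Z is taken to be an n x m matrix with entries z[il], C an m x k matrix with entries c[lj], and A a vector of k positive scalars. These shapes are assumptions, since the post does not state them, and the data below is simulated.

```r
set.seed(1)
n <- 5; m <- 3; k <- 2
Z <- matrix(rnorm(n * m), n, m)   # z[i, l]
C <- matrix(rnorm(m * k), m, k)   # c[l, j]
A <- c(2, 4)                      # A[j], assumed to be scalars

# i[l, l'] = (1/n) * sum_i z[i, l] * z[i, l']
I_mat <- crossprod(Z) / n

# g[l, l'] = i[l, l'] - sum_j c[l, j] * c[l', j] / A[j]
# t(C) / A divides row j of t(C) by A[j]
G <- I_mat - C %*% (t(C) / A)
G
```

If A[j] is meant to be a matrix rather than a scalar, the last step would need solve() instead of division.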



[R] Extracting desired numbers from complicated lines of web pages

2012-08-05 Thread Shelby McIntyre

I need to extract the indicated (bold & underlined) numbers from lines coming 
off web pages.

Of course I don't know ahead of time the location or length of the number.  
What I do know
is the tag "Friends", and "Reviews", etc. In fact, it would be good to end up 
with

Value   Variable
108     Friends
151     Reviews
5       Review Updates
NA      First           <-- assuming here that "First" did not show up on any line
etc.

Of particular trouble is line [7] which requires extracting 3 numbers 2022 
(Useful), 1591 (Funny) and 1756 (Cool).
== Extraction problem lines ===

[1] "\t\t\t108 
Friends"   

 [2] "\t\t\t151 
Reviews"  

 [3] "\t\t\t\t5 Review Updates"


 [4] "\t\t\t\t1
 First"

 [5] "\t\t\t\t2 Fans"  


 [6] "\t\t\t\t54 Local 
Photos" 

 [7] http://s3-media2.ak.yelpcdn.com/assets/0/www/img/cf265851428e/ico/reviewVotes.gif";
 alt=""> Review votes: 2022 Useful, 1591 Funny, and 1756 Cool

[[alternative HTML version deleted]]


[R] trouble with looping for effect of sampling interval increase

2012-08-05 Thread Naidraug

I've looked everywhere and tinkered for three days now, so I figure asking
might be good.

Here's a general rundown of what I am trying to get my code to do. I am
giving you the whole rundown because I need a solution that retains certain
ways of doing things, since they give me the information I need.

I want to examine the effect of increasing my sampling interval on my data.
Example: what if instead of sampling every hour I sampled every two, or
every three, etc. ad nauseam? The way I want to do this is to take the data
I have now and add an index to it that contains counters. Those counters
look something like 1,2,1,2,... for the first one and 1,2,3,1,2,3,... for
the next one. I have a lot of them, say a thousand.

Then, for each column in the index, my loops should start in the first
column, run only the ones, store that, then run the twos, and store that in
the same column of output in a different row. Then move to the next column,
run the ones, store in the next column of output, run the twos, store in
the next row of that column, run the threes, and so on until there are no
more.

I want to use this index for a number of reasons. The first is that after
this I will be going back through and using a different method for
sub-sampling but keeping all else the same, so all I have to do there is
change the way I generate the index. The second is that it allows me to run
many subsamples and see their range.

The code I have made generates my index and does the heavy lifting all
correctly, as well as my averages and quartiles, but a look at the head()
of my key output (IntervalBetas) shows that something has gone amiss. You
have to look closely to catch it: the values generated for each row of
output are identical. This should not be the case, as row one of the first
output column should be generated from all values indexed by a one in the
first column, whereas in column two there are different values indexed by
the number one.

I've checked about everything I can think of, done print() on my loop
sequence variables (those little i and j) and wiggled about everything. I
am flummoxed. I think the bit that is messing up is in here:
#Here is the loop for betas from sampling interval increase
 c <- WHOLESIZE[2]-1
 for (i in 1:c)
 {
 x <- length(unique(index[,i]))

 for (j in 1:x) 
 {

 data <- WHOLE [WHOLE[,x]==j,1]

But also here is the whole code in case I am wrong that that is the problem
area: 

#loop for making index


 #clean dataset of empty cells
 dataset <- na.omit (datasetORIGINAL)
 #how messed up was the data?
 holeyDATA <- datasetORIGINAL - dataset

 D <- dim(dataset)

#what is the smallest sample? 
tinysample <- 100 




#how long is the dataset?
 datalength <- length (dataset)


 #MD <- how many divisions
 
MD <- datalength/tinysample

 #clear things up for the index loop
 WHOLE <- NULL
index <- NULL
 #do the index loop

 for (a in 1:MD)
 {
 index <- cbind (index, rep (1:a, length = D[1]))
 }
index <- subset(index, select = -c(1) )

 #merge dataset and index loop
 WHOLE <- cbind (dataset, index)

 WHOLESIZE <- dim (WHOLE)

#Housekeeping before loops
IntervalBetas <- NULL


IntervalBetas <- c(NA,NA)
IntervalBetas <- as.data.frame (IntervalBetas)
IntervalLowerQ <- NULL
IntervalUpperQ <- NULL
IntervalMean <- NULL
IntervalMedian <- NULL

#Here is the loop for betas from sampling interval increase
 c <- WHOLESIZE[2]-1
 for (i in 1:c)
 {
 x <- length(unique(index[,i]))

 for (j in 1:x) 
 {

 data <- WHOLE [WHOLE[,x]==j,1]




 #get power spectral density

 PSDPLOT <- spectrum (data, detrend = TRUE, plot = FALSE)
 frequency <- PSDPLOT$freq
 PSD <- PSDPLOT$spec
 #log transform the power spectral density 
 Logfrequency <- log(frequency)
 LogPSD<- log(PSD)
 #fit my line to the data 
 Line <- lm (LogPSD ~ Logfrequency)
 #store the slope of the line
 Betas <- rbind (Betas, -coef(Line)[2])

#Get values on the curve shape
BSkew <- skew (Betas)
BMean <- mean (Betas)
BMedian <- median (Betas)
Q <- quantile (Betas) 


#store curve shape values
IntervalLowerQ <- rbind (IntervalLowerQ , Q[2]) 
IntervalUpperQ <- rbind (IntervalUpperQ , Q[4]) 
IntervalSkew <- rbind (IntervalSkew , BSkew) 
IntervalMean <- rbind (IntervalMean , BMean)
IntervalMedian <- rbind (IntervalMedian , BMedian)

#Store the Betas
#This is a pain


BetaSave <- Betas 
no.r <- nrow(IntervalBetas)
l.v <- length(BetaSave)
difer <- no.r - l.v
difers <- abs(difer)
if (no.r < l.v){ 
IntervalBetas <- rbind(IntervalBetas,rep(NA,difers))
}
else {
(BetaSave <- rbind(BetaSave,rep(NA,difers)))
}

IntervalBetas <- cbind (IntervalBetas, BetaSave)


 }
 
 }

#That ends the loop within a loop for how sampling interval
#changes beta
head (IntervalBetas)
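
A restructured sketch of the same idea, offered as a guess from reading the code rather than a verified diagnosis: in the loop above Betas is appended to but never reset, so every stored column would also carry the slopes from all earlier passes, which could explain why the rows look identical. Below, each sampling interval gets its own fresh vector of slopes. The names beta_of, series and max_k are made up, and the random-walk series is simulated stand-in data.

```r
# Slope of the log-log power spectrum, sign-flipped as in the post
beta_of <- function(x) {
  s <- spectrum(x, detrend = TRUE, plot = FALSE)
  fit <- lm(log(s$spec) ~ log(s$freq))
  -unname(coef(fit)[2])
}

set.seed(1)
series <- cumsum(rnorm(1200))   # stand-in for the cleaned data column
max_k  <- 4                     # largest sampling interval to try

# For interval k, the counter 1..k plays the role of one index column;
# each offset j gives one subsample and therefore one beta
betas <- lapply(2:max_k, function(k) {
  idx <- rep(1:k, length.out = length(series))
  vapply(1:k, function(j) beta_of(series[idx == j]), numeric(1))
})
names(betas) <- paste0("every_", 2:max_k)

# Summary per interval: quartiles, mean and median of that interval's betas
lapply(betas, summary)
```

Keeping the results in a list avoids padding columns of unequal length with NA, and a different sub-sampling scheme only requires changing how idx is built.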





--
View this message in context: 
http://r.789695.n4.nabble.com/trouble-with-looping-for-effect-of-sampling-interval-increase-tp4639213.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Coloring Counties in a State Map

2012-08-05 Thread arcata

Hi Ray, that was really helpful, thank you!!!



--
View this message in context: 
http://r.789695.n4.nabble.com/Coloring-Counties-in-a-State-Map-tp4638218p4639210.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] ggplot2 boxplot help

2012-08-05 Thread John Kane

Duh, I'm more dyslexic than usual obviously.  

John Kane
Kingston ON Canada


> -Original Message-
> From: ruipbarra...@sapo.pt
> Sent: Sun, 05 Aug 2012 17:07:38 +0100
> To: jrkrid...@inbox.com
> Subject: Re: [R] ggplot2 boxplot help
> 
> Hello,
> 
> Wasn't it supposed to be a boxplot?
> Anyway, the main problem seems to be a df format conversion prior to
> plotting.
> 
> dat2 <- data.frame(sample=rep(NA, 2*nrow(dat1)))
> dat2$sample <- with(dat1, c(as.character(sample_1),
> as.character(sample_2)))
> dat2$value <- with(dat1, c(value_1, value_2))
> dat2
> 
> qplot(sample, value, data=dat2, geom="boxplot")
> 
> Hope this helps,
> 
> Rui Barradas
> 
> On 05-08-2012 16:10, John Kane wrote:
>> Please use dput() to supply sample data.
>> 
>> I think this does something like what you want.
>> ===###
>> library(ggplot2)
>> library(reshape2)
>> 
>>   dat1<-read.table(text="
>> sample_1 sample_2 value_1  value_2
>> N  C   1.9268400 36.77590
>> N  C   0.1817890  5.58835
>> N  C0.2309000 7.54035
>> N C  0.0294559 1.50886
>> N  C 0.4678610 14.75560
>>   N C 10.7258000 92.13150",
>> sep="",header=TRUE)
>> 
>> 
>> bb  <-  melt(dat1)
>> 
>> p  <-  ggplot(bb  , aes(variable, value, fill =as.factor(value )  )) +
>>geom_bar(stat= "identity", position = "dodge")  +
>> scale_fill_discrete(name = "Fancy Title") +
>>scale_x_discrete(breaks=c("value_1", "value_2"),
>> labels=c("Sample 1", "Sample 2"))
>> p
>> ####
>> 
>> John Kane
>> Kingston ON Canada
>> 
>> 
>>> -Original Message-
>>> From: alexpadron1...@gmail.com
>>> Sent: Sat, 4 Aug 2012 14:21:49 -0700 (PDT)
>>> To: r-help@r-project.org
>>> Subject: [R] ggplot2 boxplot help
>>> 
>>> Hello,
>>> 
>>> I have a data set that looks like this:
>>> 
>>> name  G-ID test_id g-id g
>>> 1 00077464 C_068131 C_068131 OC_068131-
>>> 2 00051728 C_044461 C_044461 OC_044461-
>>> 3 00058738 C_050343 C_050343 OC_050343-
>>> 4 00059239 C_050649 C_050649 OC_050649-
>>> 5 1761 C_000909 C_000909 OC_000909-
>>> 6 5119 C_002752 C_002752 OC_002752-
>>>   locssample_1 sample_2 value_1
>>> value_2
>>> 1 37316550-37317847   N  C   1.9268400
>>> 36.77590
>>> 2 27058468-27060176   N  C   0.1817890
>>> 5.58835
>>> 3 4761739-4763268N  C0.2309000
>>> 7.54035
>>> 4  14565311-14567393   N C  0.0294559
>>> 1.50886
>>> 5  38670994-38675694   N  C 0.4678610
>>> 14.75560
>>> 6   48362804-48380794   N C 10.7258000
>>> 92.13150
>>> 
>>> 
>>> 
>>> In this dataset, sample_1 corresponds to value_1 and sample_2
>>> corresponds
>>> to
>>> value_2. How can I graph this in ggplot2's boxplot function? I am not
>>> quite
>>> sure how to tell R that sample_1 and sample_2 columns correspond to
>>> value_1
>>> and value_2 using ggplot2.
>>> 
>>> Can anyone shed some light on this?
>>> 
>>> Thanks.
>>> 
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/ggplot2-boxplot-help-tp4639187.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>> 
>



Re: [R] ggplot2 boxplot help

2012-08-05 Thread Rui Barradas


Hello,

Wasn't it supposed to be a boxplot?
Anyway, the main problem seems to be a df format conversion prior to 
plotting.


dat2 <- data.frame(sample=rep(NA, 2*nrow(dat1)))
dat2$sample <- with(dat1, c(as.character(sample_1), as.character(sample_2)))
dat2$value <- with(dat1, c(value_1, value_2))
dat2

qplot(sample, value, data=dat2, geom="boxplot")
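
For anyone without ggplot2 at hand, the same wide-to-long conversion can be checked with base graphics. This is a self-contained sketch using the six rows posted in the thread; the name long is introduced here.

```r
dat1 <- read.table(text = "
sample_1 sample_2 value_1 value_2
N C  1.9268400 36.77590
N C  0.1817890  5.58835
N C  0.2309000  7.54035
N C  0.0294559  1.50886
N C  0.4678610 14.75560
N C 10.7258000 92.13150", header = TRUE)

# Stack the two sample/value column pairs into one long data frame
long <- data.frame(sample = c(as.character(dat1$sample_1),
                              as.character(dat1$sample_2)),
                   value  = c(dat1$value_1, dat1$value_2),
                   stringsAsFactors = FALSE)

boxplot(value ~ sample, data = long)
```

The long data frame has one row per measurement, which is also the layout the ggplot2 boxplot call expects.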

Hope this helps,

Rui Barradas

On 05-08-2012 16:10, John Kane wrote:

Please use dput() to supply sample data.

I think this does something like what you want.
===###
library(ggplot2)
library(reshape2)

  dat1<-read.table(text="
sample_1 sample_2 value_1  value_2
N  C   1.9268400 36.77590
N  C   0.1817890  5.58835
N  C0.2309000 7.54035
N C  0.0294559 1.50886
N  C 0.4678610 14.75560
  N C 10.7258000 92.13150",
sep="",header=TRUE)


bb  <-  melt(dat1)

p  <-  ggplot(bb  , aes(variable, value, fill =as.factor(value )  )) +
   geom_bar(stat= "identity", position = "dodge")  +
scale_fill_discrete(name = "Fancy Title") +
   scale_x_discrete(breaks=c("value_1", "value_2"), labels=c("Sample 
1", "Sample 2"))
p
####

John Kane
Kingston ON Canada



-Original Message-
From: alexpadron1...@gmail.com
Sent: Sat, 4 Aug 2012 14:21:49 -0700 (PDT)
To: r-help@r-project.org
Subject: [R] ggplot2 boxplot help

Hello,

I have a data set that looks like this:

name  G-ID test_id g-id g
1 00077464 C_068131 C_068131 OC_068131-
2 00051728 C_044461 C_044461 OC_044461-
3 00058738 C_050343 C_050343 OC_050343-
4 00059239 C_050649 C_050649 OC_050649-
5 1761 C_000909 C_000909 OC_000909-
6 5119 C_002752 C_002752 OC_002752-
  locs               sample_1 sample_2    value_1  value_2
1 37316550-37317847         N        C  1.9268400 36.77590
2 27058468-27060176         N        C  0.1817890  5.58835
3 4761739-4763268           N        C  0.2309000  7.54035
4 14565311-14567393         N        C  0.0294559  1.50886
5 38670994-38675694         N        C  0.4678610 14.75560
6 48362804-48380794         N        C 10.7258000 92.13150



In this dataset, sample_1 corresponds to value_1 and sample_2 corresponds
to
value_2. How can I graph this in ggplot2's boxplot function? I am not
quite
sure how to tell R that sample_1 and sample_2 columns correspond to
value_1
and value_2 using ggplot2.

Can anyone shed some light on this?

Thanks.



--
View this message in context:
http://r.789695.n4.nabble.com/ggplot2-boxplot-help-tp4639187.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] ptproc package

2012-08-05 Thread Uwe Ligges




On 05.08.2012 03:11, amirzadeh wrote:

Dear all,
I came across the ptproc package on the following website:
http://www.biostat.jhsph.edu/~rpeng/software/index.html

I downloaded it from the contributor's website and tried to install
it manually, but R won't unzip it. It is not available on CRAN.
I use R 2.15.1 and Windows Vista on my computer. Any help would be
appreciated.


You will have to install it from sources. See the manual "R Installation 
and Administration" on how to do that on a Windows machine and which 
tools may be required.


In this case, even reading install.packages is sufficient, since you can 
try:


install.packages("ptproc",
    repos="http://www.biostat.jhsph.edu/~rpeng/software", type="source")


should do the trick already.

Best,
Uwe Ligges





Thanks.
Amir Zadeh.



--
View this message in context: 
http://r.789695.n4.nabble.com/ptproc-package-tp4639196.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Package to remove collinear variables

2012-08-05 Thread Uwe Ligges




On 05.08.2012 05:27, Roberto wrote:

Hi,
I need to remove collinear variables from my table of near-infrared spectra.

What package can I use?

Something simple, because I am a novice in statistics.



Remove those where

isTRUE(all.equal(cor(x, y), 1))

is TRUE?
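
Applied to a whole table, that test could look something like the sketch
below. It is only an illustration: drop_collinear() and the toy data are made
up here, and it uses abs(cor(...)) so that perfectly negatively correlated
columns are dropped as well, which goes slightly beyond the test above.
Columns that are highly but not perfectly correlated are kept.

```r
# Hypothetical helper (not from this thread): drop every column that is
# perfectly correlated, up to numerical tolerance, with an earlier kept column.
drop_collinear <- function(X) {
  X <- as.matrix(X)
  keep <- rep(TRUE, ncol(X))
  if (ncol(X) > 1) {
    for (j in 2:ncol(X)) {
      # compare column j only with earlier columns that are still kept
      for (i in which(keep[seq_len(j - 1)])) {
        if (isTRUE(all.equal(abs(cor(X[, i], X[, j])), 1))) {
          keep[j] <- FALSE
          break
        }
      }
    }
  }
  X[, keep, drop = FALSE]
}

# Made-up example data: 'b' is an exact multiple of 'a'
spectra <- data.frame(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8), c = c(1, 0, 1, 3))
colnames(drop_collinear(spectra))
```

For real spectra, where neighbouring wavelengths are strongly but not exactly
correlated, a threshold below 1 would probably be needed; choosing it is a
modelling decision and not something this sketch settles.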

Uwe Ligges




Thank you.

Best regards,
Roberto



--
View this message in context: 
http://r.789695.n4.nabble.com/Package-to-remove-collinear-variables-tp4639200.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] ggplot2 boxplot help

2012-08-05 Thread John Kane

Please use dput() to supply sample data.  

I think this does something like what you want.
===###
library(ggplot2)
library(reshape2)

 dat1<-read.table(text="
sample_1 sample_2 value_1  value_2
N  C   1.9268400 36.77590
N  C   0.1817890  5.58835
N  C0.2309000 7.54035
N C  0.0294559 1.50886
N  C 0.4678610 14.75560
 N C 10.7258000 92.13150",
   sep="",header=TRUE)


bb  <-  melt(dat1)

p  <-  ggplot(bb  , aes(variable, value, fill =as.factor(value )  )) + 
  geom_bar(stat= "identity", position = "dodge")  +
   scale_fill_discrete(name = "Fancy Title") +
  scale_x_discrete(breaks=c("value_1", "value_2"), 
labels=c("Sample 1", "Sample 2"))
p
####
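
The code above draws bars. If an actual boxplot is wanted, one possible
sketch is to stack value_1 and value_2 into a single column next to the
matching sample label and hand that to geom_boxplot(). The name 'long' is
invented for this example, the reshaping is done in base R rather than with
melt(), and the plot is only drawn if ggplot2 is installed.

```r
# Same six rows as in the example above
dat1 <- read.table(text = "
sample_1 sample_2 value_1 value_2
N C  1.9268400 36.77590
N C  0.1817890  5.58835
N C  0.2309000  7.54035
N C  0.0294559  1.50886
N C  0.4678610 14.75560
N C 10.7258000 92.13150", header = TRUE)

# Stack the two value columns; each value keeps the label of its own sample
long <- data.frame(
  sample = c(as.character(dat1$sample_1), as.character(dat1$sample_2)),
  value  = c(dat1$value_1, dat1$value_2),
  stringsAsFactors = FALSE
)

# One box per sample label
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = sample, y = value)) + geom_boxplot()
  print(p)
}
```

Because the two samples differ by an order of magnitude or more, adding
scale_y_log10() to the plot may make both boxes easier to read.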

John Kane
Kingston ON Canada


> -Original Message-
> From: alexpadron1...@gmail.com
> Sent: Sat, 4 Aug 2012 14:21:49 -0700 (PDT)
> To: r-help@r-project.org
> Subject: [R] ggplot2 boxplot help
> 
> Hello,
> 
> I have a data set that looks like this:
> 
>name  G-ID test_id g-id g
> 1 00077464 C_068131 C_068131 OC_068131-
> 2 00051728 C_044461 C_044461 OC_044461-
> 3 00058738 C_050343 C_050343 OC_050343-
> 4 00059239 C_050649 C_050649 OC_050649-
> 5 1761 C_000909 C_000909 OC_000909-
> 6 5119 C_002752 C_002752 OC_002752-
>                locs sample_1 sample_2    value_1  value_2
> 1 37316550-37317847        N        C  1.9268400 36.77590
> 2 27058468-27060176        N        C  0.1817890  5.58835
> 3   4761739-4763268        N        C  0.2309000  7.54035
> 4 14565311-14567393        N        C  0.0294559  1.50886
> 5 38670994-38675694        N        C  0.4678610 14.75560
> 6 48362804-48380794        N        C 10.7258000 92.13150
> 
> 
> 
> In this dataset, sample_1 corresponds to value_1 and sample_2 corresponds to
> value_2. How can I graph this in ggplot2's boxplot function? I am not quite
> sure how to tell R that sample_1 and sample_2 columns correspond to value_1
> and value_2 using ggplot2.
> 
> Can anyone shed some light on this?
> 
> Thanks.
> 
> 
> 
> --
> View this message in context:
> http://r.789695.n4.nabble.com/ggplot2-boxplot-help-tp4639187.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] how to put barchart and line chart in the same plot in ggplot2

2012-08-05 Thread John Kane

As far as I understand ggplot2, you cannot do it. ggplot2 is pretty much
designed NOT to allow two different sets of data with different y axes in the
same plot.

Doing this is generally considered very bad practice. I'd suggest looking into
using a 2x1 or 1x2 grid and plotting the two sets side by side or one above
the other.

Have a look at
http://stackoverflow.com/questions/9490482/combined-plot-of-ggplot2-not-in-a-single-plot-using-par-or-layout-functio
and see
http://stackoverflow.com/questions/8615530/place-title-of-multiplot-panel-with-ggplot2
for an example.
Kingston ON Canada


> -Original Message-
> From: xin...@stat.psu.edu
> Sent: Sat, 4 Aug 2012 17:17:58 -0700 (PDT)
> To: r-help@r-project.org
> Subject: [R] how to put barchart and line chart in the same plot in
> ggplot2
> 
> dear userR:
> I am trying to plot two dependent variables in the same plot in ggplot2.
> because these two variables have very different magnitude, I have to use a
> second Y axis. I hope one variable to be line and the other to be barchart.
> The x axis is continuous. Yet since I have to make barchart, I guess I have
> to treat it as discrete or categorical.
> I have been google searching for the whole afternoon but do not have any
> clue.
> Can anyone give me a direction (not have to be a complete answer...)?
> 
> many thanks
> 
> 
> 
> --
> View this message in context:
> http://r.789695.n4.nabble.com/how-to-put-barchart-and-line-chart-in-the-same-plot-in-ggplot2-tp4639194.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] help to programm

2012-08-05 Thread Rui Barradas


Hello,

Sorry, but I don't understand your formula. Maybe it's better if you

1. use * for multiply.
2. break the expression into smaller components. For instance,

(Is it an add and a multiply or two multiplies?)

EXP <-  exp{ (B0 B1 row matrix) (z[l] column matrix) }

Then use EXP.

3. instead of 'row matrix' write 'row_matrix', the same for column matrix.
4. Your final sum is the sum of what ???
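
For point 2, if the two matrices are meant to be multiplied (that is an
assumption, since the formula leaves it open), the component could be written
with %*% as in the sketch below. All numbers are invented, and column_matrix
stands in for a single z[l], taken here to be a column of length 2.

```r
# Invented values for illustration only
B0 <- 0.5
B1 <- 2
row_matrix    <- matrix(c(B0, B1), nrow = 1)   # 1 x 2
column_matrix <- matrix(c(1, 0.3), ncol = 1)   # 2 x 1, hypothetical z[l]

# %*% is the matrix product; the result is 1 x 1, so drop() makes it a scalar
EXP <- exp(drop(row_matrix %*% column_matrix))
EXP
```

Once each piece has a name like this, the sums over l can be built from the
named pieces with sum(), which is much easier to check than one long line.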


Em 05-08-2012 13:31, hafida...@hotmail.fr escreveu:

Hi
can you please help me to programme this formula:
a[j]= E[j]-sum from l=i  to i-1 (exp{(B0 B1row matrix) (z[l]column matrix) } 
x[l])  /  sum from l=i to n

i=1...n  j=1...k l=1...m   ; n,m,k are given.

It is complicated for me; I hope you can help me.
Thank you a lot.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Find out what "native.enc" corresponds to

2012-08-05 Thread Prof Brian Ripley


On 05/08/2012 09:54, Milan Bouchet-Valat wrote:

Hi!

I'm using R2HTML in my RcmdrPlugin.temis package to output localized
strings to a HTML file. Thus, I insert a simple header at the top of the
file to specify what encoding is used; if I don't do that, Web browsers
assume it is latin1, which is not always true.

My problem is, I could not find a way to detect what encoding is used by
R2HTML in the most general case. R2HTML simply calls cat() with the file
name, which means the text connection is opened using file(encoding =
getOption("encoding")). This is fine, except that when
getOption("encoding") is set to "native.enc", I'm not able to find out
the real encoding that was used for output.

Of course, ideally I would tell R2HTML to output everything as UTF-8,
and I would add this information to the header. But AFAICT this is not
possible in the current state of this package. So I would be very
grateful if somebody could provide me with a solution to resolve
"native.enc" to the encoding name.


?options points you to ?connections, which does explain this.  See
Sys.getlocale("LC_CTYPE") to see 'the internal encoding of the current
locale' (or at least, what the OS claims it to be: e.g. some lie about 'C'
locales).

As for a name, iconv() knows this as "" (and some OSes do make it rather 
hard to find a name if it is not part of the locale name).
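
As a rough sketch of how that might be used to guess a name for an HTML
header: native_encoding_name() below is a made-up helper, not part of R or
R2HTML. It relies on l10n_info() and on the convention that many locale names
carry the charset after a dot, which is exactly the part that is not
guaranteed, so the result is a hint at best.

```r
# Hypothetical helper: best-effort name for the native encoding
native_encoding_name <- function() {
  # l10n_info() reports whether the current locale is known to be UTF-8
  if (isTRUE(l10n_info()[["UTF-8"]])) return("UTF-8")

  ctype <- Sys.getlocale("LC_CTYPE")
  if (grepl(".", ctype, fixed = TRUE)) {
    # e.g. "Portuguese_Portugal.1252" -> "1252", "de_DE.ISO8859-1" -> "ISO8859-1"
    # (a trailing modifier such as "@euro" is removed)
    return(sub("@.*$", "", sub("^.*\\.", "", ctype)))
  }
  NA_character_   # e.g. the "C" locale: no name can be read off
}

native_encoding_name()

# iconv() accepts "" for the native encoding, so text can be converted
# without knowing its name
iconv("some text", from = "", to = "UTF-8")
```

On Windows the part after the dot is a code page number such as 1252, which
would still have to be mapped to a charset name a browser understands.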


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Find out what "native.enc" corresponds to

2012-08-05 Thread Milan Bouchet-Valat

Hi!

I'm using R2HTML in my RcmdrPlugin.temis package to output localized
strings to a HTML file. Thus, I insert a simple header at the top of the
file to specify what encoding is used; if I don't do that, Web browsers
assume it is latin1, which is not always true.

My problem is, I could not find a way to detect what encoding is used by
R2HTML in the most general case. R2HTML simply calls cat() with the file
name, which means the text connection is opened using file(encoding =
getOption("encoding")). This is fine, except that when
> getOption("encoding") is set to "native.enc", I'm not able to find out
the real encoding that was used for output.

Of course, ideally I would tell R2HTML to output everything as UTF-8,
and I would add this information to the header. But AFAICT this is not
possible in the current state of this package. So I would be very
grateful if somebody could provide me with a solution to resolve
"native.enc" to the encoding name.

Thanks for your help

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Getting unknown error trying to plot spatial data

2012-08-05 Thread Jim Lemon


On 08/05/2012 05:09 AM, mjkatsaros wrote:

Hi there! I'm following an awesome guide to working with spatial data
(http://www.frankdavenport.com/blog/2012/6/19/notes-from-a-recent-spatial-r-class-i-gave.html)
and am running into an error that I can't figure out how to fix.

Disclaimer: I am very much an R n00b

Here is the r script I am running:
https://dl.dropbox.com/u/28231177/This%20Should%20Work.R

data: https://dl.dropbox.com/u/28231177/my_data.csv

shapefile: https://dl.dropbox.com/u/28231177/sfzipcodes.zip

I am getting two errors:


pds<- fortify(sf_map)

*Using OBJECTID to define regions.*

pds$OBJECTID<- as.integer(pds$OBJECTID)

*Error in `$<-.data.frame`(`*tmp*`, "OBJECTID", value = integer(0)) :
   replacement has 0 rows, data has 16249*



## Make the map

p1<- ggplot(my_data, aes(map_id = zip))
p1<- p1 + geom_map(aes(fill=vol, map_id = zip), map = pds)
p1<- p1 + expand_limits(x = pds$lon, y = pds$lat) + coord_equal()
p1 + xlab("Basic Map with Default Elements")

*Error in unit(x, default.units) : 'x' and 'units' must have length>  0*

Anybody have any idea what is happening here or how to resolve this?


Hi mjkatsaros,
The data file doesn't have a column labelled "OBJECTID". I would try 
renaming the "zip" column to "OBJECTID".
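
A small sketch of the renaming, using a made-up data frame in place of
my_data.csv (only the column name "zip" is taken from the script above). It
also shows where the first error message comes from: a column that does not
exist is NULL, and as.integer(NULL) has length zero.

```r
# Made-up stand-in for my_data.csv; only the column names matter here
my_data <- data.frame(zip = c(94102L, 94103L, 94104L), vol = c(12, 7, 30))

# No OBJECTID column yet, so this has length 0, hence
# "replacement has 0 rows" when it is assigned back
length(as.integer(my_data$OBJECTID))

# Rename "zip" to "OBJECTID"
names(my_data)[names(my_data) == "zip"] <- "OBJECTID"
names(my_data)
```

Whether this alone cures the later geom_map() error depends on the shapefile
and on which object is missing the column, so checking names() on each object
first is probably worthwhile.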


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to put barchart and line chart in the same plot in ggplot2

2012-08-05 Thread Jim Lemon


On 08/05/2012 10:17 AM, xin wei wrote:

dear userR:
I am trying to plot two dependent variables in the same plot in ggplot2.
because these two variables have very different magnitude, I have to use a
second Y axis. I hope one variable to be line and the other to be barchart.
The x axis is continuous. Yet since I have to make barchart, I guess I have
to treat it as discrete or categorical.
I have been google searching for the whole afternoon but do not have any
clue.
Can anyone give me a direction (not have to be a complete answer...)?


Hi xin wei,
If you're desperate, have a look at twoord.plot in the plotrix package.
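
For a rough idea of what such a two-ordinate display involves, here is a
base-graphics sketch that does not use plotrix (so it is not twoord.plot
itself). The data and variable names are invented; the line variable is
rescaled to the range of the bars and the right-hand axis is labelled in its
original units.

```r
# Made-up data: 'bars' is large in magnitude, 'ratio' is small
x     <- 1:10
bars  <- c(120, 150, 90, 200, 170, 130, 160, 210, 180, 140)
ratio <- c(0.20, 0.35, 0.30, 0.50, 0.45, 0.40, 0.55, 0.60, 0.50, 0.65)

op <- par(mar = c(5, 4, 4, 4) + 0.1)   # leave room for a right-hand axis

# barplot() returns the x positions of the bar midpoints
mids <- barplot(bars, names.arg = x, xlab = "x", ylab = "bars (left axis)")

# Rescale the second variable so that its maximum sits at the tallest bar
scaled <- ratio / max(ratio) * max(bars)
lines(mids, scaled, type = "b", col = "red")

# Right-hand axis labelled in the original units of 'ratio'
ticks <- pretty(c(0, max(ratio)))
ticks <- ticks[ticks <= max(ratio)]
axis(4, at = ticks / max(ratio) * max(bars), labels = ticks)
mtext("ratio (right axis)", side = 4, line = 3)

par(op)
```

The rescaling is arbitrary, which is one reason such plots are easy to
misread: a different choice of scale makes the line appear to track the bars
more or less closely.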

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
