[R] data load from excel files

2019-11-12 Thread ani jaya
Dear R-Help,

I have 30 year-based Excel files, and each file contains monthly sheets.
I have a problem here: my data are daily rainfall, but several sheets
contain one extra day (the first date of the next month). My main goal is
to get the maximum value for every month.

First, how do I extract the data into a list of data frames by year and
delete every overlapping date?
Second, how do I sort it by date in ascending order (old to new)?
Third, how do I get the maximum together with its date?

This is what I tried:

...
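library(gtools)    # provides mixedsort()
library(openxlsx)  # provides read.xlsx()
file.list <- list.files(pattern = "\\.xlsx$")  # pattern is a regex, not a glob
file.list <- mixedsort(file.list)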

# https://stackoverflow.com/questions/12945687/read-all-worksheets-in-an-excel-workbook-into-an-r-list-with-data-frames

read_excel_allsheets <- function(filename, tibble = FALSE) {
  sheets <- readxl::excel_sheets(filename)
  x <- lapply(sheets, function(X) read.xlsx(filename, sheet = X, rows = 9:40,
                                            cols = 1:2))
  if(!tibble) x <- lapply(x, as.data.frame)
  names(x) <- sheets
  x
}

pon<-lapply(file.list, function(i) read_excel_allsheets(i))
pon1<-do.call("rbind",pon)
names(pon1) <- paste0("M.", 1:360)
pon1 <-lapply(pon1,function(x){x$RR[x$RR==] <- NA; x})
pon1 <-lapply(pon1,function(x){x$RR[x$RR==""] <- NA; x})
maxi <- lapply(pon1, function(x) max(x$RR, na.rm = TRUE))
maxi<-data.frame(Reduce(rbind, maxi))
names(maxi)<-"maxi"


But the list starts with January of every year, then moves on to February,
and so on. Also, there is no date in "maxi". Here is a sample of what I get
from my simple code.

> pon1[256:258]
$M.256
  Tanggal   RR
1  01-09-2001  5.2
2  02-09-2001  0.3
3  03-09-2001 29.0
4  04-09-2001  0.7
5  05-09-2001  9.6
6  06-09-2001  0.7
7  07-09-2001   NA
8  08-09-2001 13.2
9  09-09-2001   NA
10 10-09-2001   NA
11 11-09-2001  0.0
12 12-09-2001 66.0
13 13-09-2001  0.0
14 14-09-2001 57.6
15 15-09-2001 18.0
16 16-09-2001 29.2
17 17-09-2001 52.2
18 18-09-2001  7.0
19 19-09-2001   NA
20 20-09-2001 74.5
21 21-09-2001 20.3
22 22-09-2001 49.6
23 23-09-2001  0.0
24 24-09-2001  1.3
25 25-09-2001  0.0
26 26-09-2001  1.0
27 27-09-2001  0.1
28 28-09-2001  1.9
29 29-09-2001  9.5
30 30-09-2001  3.3
31 01-10-2001  0.0

$M.257
  Tanggal   RR
1  01-09-2002  0.0
2  02-09-2002  0.0
3  03-09-2002  0.0
4  04-09-2002 12.8
5  05-09-2002  1.0
6  06-09-2002  0.0
7  07-09-2002   NA
8  08-09-2002 22.2
9  09-09-2002   NA
10 10-09-2002   NA
11 11-09-2002  0.0
12 12-09-2002  0.0
13 13-09-2002  0.0
14 14-09-2002   NA
15 15-09-2002  0.0
16 16-09-2002  0.0
17 17-09-2002  0.0
18 18-09-2002 13.3
19 19-09-2002  0.0
20 20-09-2002  0.0
21 21-09-2002  0.0
22 22-09-2002  0.0
23 23-09-2002  0.0
24 24-09-2002  0.0
25 25-09-2002  0.0
26 26-09-2002  0.5
27 27-09-2002  2.1
28 28-09-2002   NA
29 29-09-2002 18.5
30 30-09-2002  0.0
31 01-10-2002   NA

$M.258
  Tanggal   RR
1  01-09-2003  0.0
2  02-09-2003  0.0
3  03-09-2003  0.0
4  04-09-2003  4.0
5  05-09-2003  0.3
6  06-09-2003  0.0
7  07-09-2003   NA
8  08-09-2003  0.0
9  09-09-2003  0.0
10 10-09-2003  0.0
11 11-09-2003   NA
12 12-09-2003  1.0
13 13-09-2003  0.0
14 14-09-2003 60.0
15 15-09-2003  4.5
16 16-09-2003  0.1
17 17-09-2003  2.1
18 18-09-2003   NA
19 19-09-2003  0.0
20 20-09-2003   NA
21 21-09-2003   NA
22 22-09-2003 31.5
23 23-09-2003 42.0
24 24-09-2003 43.3
25 25-09-2003  2.8
26 26-09-2003 21.4
27 27-09-2003  0.8
28 28-09-2003 42.3
29 29-09-2003  5.3
30 30-09-2003 17.3
31 01-10-2003  0.0


Any lead or help would be much appreciated.

Best,

Ani

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
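
One possible approach to the three questions above, sketched in base R
under a few assumptions: the Tanggal column is text in day-month-year form
as in the sample output, each sheet belongs to the month of its first row,
and the toy list pon_demo stands in for the nested list returned by
lapply(file.list, read_excel_allsheets). The names make_sheet, pon_demo,
clean_sheet and monthly_max are hypothetical, and the toy values are for
illustration only.

```r
# Toy stand-in for the nested list (years -> month sheets). Each sheet
# carries one extra row belonging to the next month, as in the post.
make_sheet <- function(dates, rr) {
  data.frame(Tanggal = dates, RR = rr, stringsAsFactors = FALSE)
}
pon_demo <- list(
  list(Sep = make_sheet(c("01-09-2001", "02-09-2001", "01-10-2001"),
                        c(5.2, 29.0, 0.0)),
       Oct = make_sheet(c("01-10-2001", "02-10-2001", "01-11-2001"),
                        c(0.0, 12.5, 3.0))),
  list(Sep = make_sheet(c("01-09-2002", "02-09-2002", "01-10-2002"),
                        c(0.0, 12.8, NA)),
       Oct = make_sheet(c("01-10-2002", "02-10-2002", "01-11-2002"),
                        c(NA, 7.1, 1.0)))
)

# Flatten one level, year by year, so the months of a year stay together.
sheets <- unname(unlist(pon_demo, recursive = FALSE))

# Parse the dates and keep only rows in the same month as the first row,
# which drops the overlapping first day of the next month.
clean_sheet <- function(df) {
  df$Tanggal <- as.Date(df$Tanggal, format = "%d-%m-%Y")
  df$RR <- as.numeric(df$RR)
  same_month <- !is.na(df$Tanggal) &
    format(df$Tanggal, "%Y-%m") == format(df$Tanggal[1], "%Y-%m")
  df[same_month, ]
}

# One long data frame, sorted from the oldest to the newest date.
daily <- do.call(rbind, lapply(sheets, clean_sheet))
daily <- daily[order(daily$Tanggal), ]
rownames(daily) <- NULL

# For each month, return the row that holds the largest rainfall value,
# so the date comes along with the maximum.
monthly_max <- function(daily) {
  daily <- daily[!is.na(daily$RR), ]
  by_month <- split(daily, format(daily$Tanggal, "%Y-%m"))
  out <- do.call(rbind, lapply(by_month, function(d) d[which.max(d$RR), ]))
  rownames(out) <- NULL
  out
}

maxi <- monthly_max(daily)
print(maxi)
```

Keeping everything in one long data frame with a real Date column makes
the sorting and the per-month grouping independent of how the sheets were
ordered when they were read.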


Re: [R] QQ plot

2019-11-12 Thread Bert Gunter
Typo: "... from 5.5 million..."

Bert



Re: [R] QQ plot

2019-11-12 Thread Bert Gunter
IMO, this thread has now gone totally off the rails and totally off topic
-- it is clearly *not* about R programming and totally about statistics.

I believe Ana Marija would do better to get local statistical help or post
on a statistics or genomics list (stats.stackexchange.com is one such)
where she can engage in a fuller discussion of what an *appropriate* qqplot
would tell her. Of course selecting the lowest 3700 p-values from 55.5
million and plotting them against 3700 expected uniform quantiles will not
give a line with 0 intercept and slope 1. The scale correction is easy to
make, but it is not multiplying by 1000!

Bert



Re: [R] QQ plot

2019-11-12 Thread Ana Marija
The reason I selected only those with P<0.003 for the QQ plot is that
the original data set contains 5556249 points, and when I extract only
P<0.001 I get 3713 points. Is there a way to plot the whole
data set, or to choose only representative points?
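
One possible way to show a whole data set of this size without drawing
millions of points is to thin the dense region of the plot and keep every
point in the tail. A minimal sketch in base R, assuming simulated uniform
p-values as a stand-in for dd$P; the tail size of 5000 and the thinning
step of 1000 are arbitrary illustrative choices, and no qqman function is
used here.

```r
set.seed(42)
p <- runif(1e6)                  # stand-in for the full vector dd$P

n        <- length(p)
observed <- -log10(sort(p))      # smallest p-values come first
expected <- -log10(ppoints(n))   # expected uniform quantiles, same order

# Keep every point among the smallest p-values, where the detail matters,
# and only every 1000th point in the dense bulk near the origin.
tail_size <- 5000
keep <- c(seq_len(tail_size), seq(tail_size + 1, n, by = 1000))

plot(expected[keep], observed[keep],
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)",
     main = "Thinned Q-Q plot of all p-values")
abline(0, 1, col = "red")
```

Because the quantiles are computed from all n values before thinning, the
reference line with intercept 0 and slope 1 still applies; only the number
of plotted points changes.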


Re: [R] QQ plot

2019-11-12 Thread Ana Marija
The smallest p-value in my dataset is 9.89e-08. How do I make that
visible on the new QQ plot, where the values are multiplied by 1000?


Re: [R] QQ plot

2019-11-12 Thread Ana Marija
Do I need to change the axis when I multiply by 1000, and what
should I put on my axis?



Re: [R] QQ plot

2019-11-12 Thread Ana Marija
Hi Duncan,

Yes, I chose only P<1e-3 for the QQ plot, and multiplying everything by
1000 works great!
In my understanding this should not influence the interpretation of
the plot; it only changes the scale of the axis.

Thank you so much,
Ana



Re: [R] QQ plot

2019-11-12 Thread Duncan Murdoch

On 12/11/2019 2:56 p.m., Jim Lemon wrote:
> I thought about this and did a little study of GWAS and the use of
> p-values to assess significant associations. As Ana's plot begins at
> values of about 0.001, this seems to imply that almost everything in
> the genome is associated to some degree. One expects that most SNPs
> will not be associated with a particular condition (p~1), so perhaps
> something is going wrong in the calculations that produce the
> p-values.


I may be misunderstanding your last sentence, but if there is no 
association, the p-value would usually have a uniform distribution from 
0 to 1; it wouldn't be near 1.
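
A small simulation can illustrate the point about the null distribution. This is a minimal sketch and not part of the original exchange: the two-sample t-test, the sample sizes, and the name `p_null` are assumptions chosen only for illustration.

```r
set.seed(42)

# Simulate 2000 two-sample t-tests in which both groups come from the
# same distribution, so there is no true effect (hypothetical setup).
p_null <- replicate(2000, t.test(rnorm(20), rnorm(20))$p.value)

# Under no association the p-values should spread roughly evenly over
# (0, 1) instead of piling up near 1.
hist(p_null, breaks = 20, main = "p-values under no association")
mean(p_null)         # should be close to 0.5
mean(p_null < 0.05)  # should be close to 0.05
```

If the histogram is roughly flat, that matches the uniform behaviour described above.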


I'd guess we're not seeing the p values from every test, only those that 
are less than 0.001.  If that's true, and there are no effects, it makes 
sense to multiply all of them by 1000 to get U(0,1) values.  On the 
plot, that would correspond to subtracting 3 from -log10(p), or adding 3 
to the reference line, as Ana requested.


Or just multiply them by 1000 and pass them to qq():

qq(dd$P*1000, main = "Q-Q plot of small GWAS p-values")

As far as I can see, there's no way to tell qqman::qq to move the 
reference line.
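
The shift by 3 can also be drawn by hand with base graphics. The following is only a sketch under stated assumptions: `p_small` is a simulated stand-in for `dd$P` (null p-values of which only those below 0.001 were kept), and the plot is built with `plot()` and `abline()` rather than with qqman.

```r
set.seed(1)

# Hypothetical stand-in for dd$P: null p-values, keeping only those < 0.001.
p_all <- runif(1e6)
p_small <- p_all[p_all < 1e-3]

n <- length(p_small)
observed <- -log10(sort(p_small))   # smallest p first, so largest -log10(p) first
expected <- -log10(ppoints(n))      # U(0,1) quantiles on the -log10 scale

plot(expected, observed,
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)",
     main = "Q-Q plot of small p-values")
abline(0, 1, lty = 2)        # usual reference line, y = x
abline(3, 1, col = "red")    # reference line shifted up by 3 = log10(1000)
```

With truncated null data like this, the points should follow the shifted line rather than y = x, which is the same statement as multiplying the p-values by 1000 first.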


Duncan Murdoch



> Jim
>
> On Wed, Nov 13, 2019 at 12:28 AM Patrick (Malone Quantitative)
> wrote:
> >
> > I agree with Abby. That would defeat the purpose of a QQ plot.
> >
> > On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle wrote:
> >
> > > Hi
> > >
> > > I'm not familiar with the qqman package, or GWAS studies.
> > > However, my guess would be that you're *not* supposed to change the
> > > position of the line.
> > >
> > > On Tue, Nov 12, 2019 at 11:48 AM Ana Marija
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > I was using this library, qqman
> > > > https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
> > > >
> > > > to create QQ plot, attached. How would I change this default abline to
> > > > start from the beginning of my QQ line?
> > > >
> > > > This is my code:
> > > > qq(dd$P, main = "Q-Q plot of GWAS p-values")
> > > >
> > > > Thanks
> > > > Ana

Re: [R] QQ plot

2019-11-12 Thread Jim Lemon
That refers to "normally" distributed data (see Greg Snow's comment
below the one you cite). P-values are not necessarily normally
distributed as you can see, and they must have a non-zero mean.

Jim

On Wed, Nov 13, 2019 at 7:07 AM Ana Marija  wrote:
>
> Hi,
>
> what I know so far that this kind of QQ plot is an indication that
> data has non zero mean:
> https://stats.stackexchange.com/questions/280634/how-to-interpret-qq-plot-not-on-the-line
>
> but is that an indication that something is wrong with the analysis?
>
> Thanks
> Ana
>
> On Tue, Nov 12, 2019 at 2:00 PM Jim Lemon  wrote:
> >
> > I thought about this and did a little study of GWAS and the use of
> > p-values to assess significant associations. As Ana's plot begins at
> > values of about 0.001, this seems to imply that almost everything in
> > the genome is associated to some degree. One expects that most SNPs
> > will not be associated with a particular condition (p~1), so perhaps
> > something is going wrong in the calculations that produce the
> > p-values.
> >
> > Jim
> >
> > On Wed, Nov 13, 2019 at 12:28 AM Patrick (Malone Quantitative)
> >  wrote:
> > >
> > > I agree with Abby. That would defeat the purpose of a QQ plot.
> > >
> > > On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle  wrote:
> > >
> > > > Hi
> > > >
> > > > I'm not familiar with the qqman package, or GWAS studies.
> > > > However, my guess would be that you're *not* supposed to change the
> > > > position of the line.
> > > >
> > > > On Tue, Nov 12, 2019 at 11:48 AM Ana Marija 
> > > > 
> > > > wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I was using this library, qqman
> > > > > https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
> > > > >
> > > > > to create QQ plot, attached. How would I change this default abline to
> > > > > start from the beginning of my QQ line?
> > > > >
> > > > > This is my code:
> > > > > qq(dd$P, main = "Q-Q plot of GWAS p-values")
> > > > >
> > > > > Thanks
> > > > > Ana


Re: [R] QQ plot

2019-11-12 Thread Bert Gunter
As this is O/T I'll keep it offlist.

Inline:


On Tue, Nov 12, 2019 at 12:00 PM Jim Lemon  wrote:

> I thought about this and did a little study of GWAS and the use of
> p-values to assess significant associations. As Ana's plot begins at
> values of about 0.001, this seems to imply that almost everything in
> the genome is associated to some degree. One expects that most SNPs
> will not be associated with a particular condition (p~1), so perhaps
> something is going wrong in the calculations that produce the
> p-values.
>

Exactly! Or possibly with the data handling pipeline prior to getting into
R.

-- Bert



>
> Jim
>
> On Wed, Nov 13, 2019 at 12:28 AM Patrick (Malone Quantitative)
>  wrote:
> >
> > I agree with Abby. That would defeat the purpose of a QQ plot.
> >
> > On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle  wrote:
> >
> > > Hi
> > >
> > > I'm not familiar with the qqman package, or GWAS studies.
> > > However, my guess would be that you're *not* supposed to change the
> > > position of the line.
> > >
> > > On Tue, Nov 12, 2019 at 11:48 AM Ana Marija <
> sokovic.anamar...@gmail.com>
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > I was using this library, qqman
> > > > https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
> > > >
> > > > to create QQ plot, attached. How would I change this default abline
> to
> > > > start from the beginning of my QQ line?
> > > >
> > > > This is my code:
> > > > qq(dd$P, main = "Q-Q plot of GWAS p-values")
> > > >
> > > > Thanks
> > > > Ana


Re: [R] QQ plot

2019-11-12 Thread Ana Marija
Hi,

what I know so far is that this kind of QQ plot is an indication that
the data have a non-zero mean:
https://stats.stackexchange.com/questions/280634/how-to-interpret-qq-plot-not-on-the-line

but is that an indication that something is wrong with the analysis?

Thanks
Ana

On Tue, Nov 12, 2019 at 2:00 PM Jim Lemon  wrote:
>
> I thought about this and did a little study of GWAS and the use of
> p-values to assess significant associations. As Ana's plot begins at
> values of about 0.001, this seems to imply that almost everything in
> the genome is associated to some degree. One expects that most SNPs
> will not be associated with a particular condition (p~1), so perhaps
> something is going wrong in the calculations that produce the
> p-values.
>
> Jim
>
> On Wed, Nov 13, 2019 at 12:28 AM Patrick (Malone Quantitative)
>  wrote:
> >
> > I agree with Abby. That would defeat the purpose of a QQ plot.
> >
> > On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle  wrote:
> >
> > > Hi
> > >
> > > I'm not familiar with the qqman package, or GWAS studies.
> > > However, my guess would be that you're *not* supposed to change the
> > > position of the line.
> > >
> > > On Tue, Nov 12, 2019 at 11:48 AM Ana Marija 
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > I was using this library, qqman
> > > > https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
> > > >
> > > > to create QQ plot, attached. How would I change this default abline to
> > > > start from the beginning of my QQ line?
> > > >
> > > > This is my code:
> > > > qq(dd$P, main = "Q-Q plot of GWAS p-values")
> > > >
> > > > Thanks
> > > > Ana


Re: [R] QQ plot

2019-11-12 Thread Miloš Žarković
Just a small comment. In GWAS studies, p-values are considered to be
significant when p < 10^-6 or smaller.
Regards,

Miloš

On Tue, 12 Nov 2019 at 21:00, Jim Lemon  wrote:

> I thought about this and did a little study of GWAS and the use of
> p-values to assess significant associations. As Ana's plot begins at
> values of about 0.001, this seems to imply that almost everything in
> the genome is associated to some degree. One expects that most SNPs
> will not be associated with a particular condition (p~1), so perhaps
> something is going wrong in the calculations that produce the
> p-values.
>
> Jim
>
> On Wed, Nov 13, 2019 at 12:28 AM Patrick (Malone Quantitative)
>  wrote:
> >
> > I agree with Abby. That would defeat the purpose of a QQ plot.
> >
> > On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle  wrote:
> >
> > > Hi
> > >
> > > I'm not familiar with the qqman package, or GWAS studies.
> > > However, my guess would be that you're *not* supposed to change the
> > > position of the line.
> > >
> > > On Tue, Nov 12, 2019 at 11:48 AM Ana Marija <
> sokovic.anamar...@gmail.com>
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > I was using this library, qqman
> > > > https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
> > > >
> > > > to create QQ plot, attached. How would I change this default abline
> to
> > > > start from the beginning of my QQ line?
> > > >
> > > > This is my code:
> > > > qq(dd$P, main = "Q-Q plot of GWAS p-values")
> > > >
> > > > Thanks
> > > > Ana


Re: [R] QQ plot

2019-11-12 Thread Ana Marija
details about my data if it is helpful:

> median(dd$P,na.rm = FALSE)
[1] 0.000444
> mean(dd$P,na.rm = FALSE)
[1] 0.000461
> min(dd$P,na.rm = FALSE)
[1] 9.89e-08
> max(dd$P,na.rm = FALSE)
[1] 0.001
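
One rough way to read these summaries, assuming (as guessed elsewhere in the thread) that only p-values below 0.001 were kept: with no association the values would be uniform on (0, 0.001), so the mean and median would both sit near 0.0005. The sketch below uses a simulated, hypothetical stand-in `p_small` in place of `dd$P`; the sample size and the choice of a Kolmogorov-Smirnov test are illustrative assumptions.

```r
set.seed(7)

# Hypothetical stand-in for dd$P: p-values uniform on (0, 0.001).
p_small <- runif(5000, min = 0, max = 1e-3)

# Rescale to (0, 1) so the values can be compared with U(0, 1).
p_rescaled <- p_small * 1000
summary(p_rescaled)

# Kolmogorov-Smirnov test against the standard uniform distribution.
ks <- ks.test(p_rescaled, "punif")
ks$statistic
ks$p.value
```

Running the same two steps on `dd$P * 1000` would show whether the reported mean and median, which are somewhat below 0.0005, reflect more than sampling noise.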

On Tue, Nov 12, 2019 at 2:07 PM Ana Marija  wrote:
>
> Hi,
>
> what I know so far that this kind of QQ plot is an indication that
> data has non zero mean:
> https://stats.stackexchange.com/questions/280634/how-to-interpret-qq-plot-not-on-the-line
>
> but is that an indication that something is wrong with the analysis?
>
> Thanks
> Ana
>
> On Tue, Nov 12, 2019 at 2:00 PM Jim Lemon  wrote:
> >
> > I thought about this and did a little study of GWAS and the use of
> > p-values to assess significant associations. As Ana's plot begins at
> > values of about 0.001, this seems to imply that almost everything in
> > the genome is associated to some degree. One expects that most SNPs
> > will not be associated with a particular condition (p~1), so perhaps
> > something is going wrong in the calculations that produce the
> > p-values.
> >
> > Jim
> >
> > On Wed, Nov 13, 2019 at 12:28 AM Patrick (Malone Quantitative)
> >  wrote:
> > >
> > > I agree with Abby. That would defeat the purpose of a QQ plot.
> > >
> > > On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle  wrote:
> > >
> > > > Hi
> > > >
> > > > I'm not familiar with the qqman package, or GWAS studies.
> > > > However, my guess would be that you're *not* supposed to change the
> > > > position of the line.
> > > >
> > > > On Tue, Nov 12, 2019 at 11:48 AM Ana Marija 
> > > > 
> > > > wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I was using this library, qqman
> > > > > https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
> > > > >
> > > > > to create QQ plot, attached. How would I change this default abline to
> > > > > start from the beginning of my QQ line?
> > > > >
> > > > > This is my code:
> > > > > qq(dd$P, main = "Q-Q plot of GWAS p-values")
> > > > >
> > > > > Thanks
> > > > > Ana


Re: [R] using xpath with xml2

2019-11-12 Thread Ben Tupper
Forehead smack!  Of course!

Thank you, Bill!

> On Nov 12, 2019, at 2:50 PM, William Dunlap  wrote:
> 
> > xml_ns(daymet)
> d1<-> http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0 
> 
> xlink <-> http://www.w3.org/1999/xlink 
> > daymet %>% xml2::xml_find_all(xpath = "d1:dataset")
> {xml_nodeset (1)}
> [1] <dataset ...>
> 
> On Tue, Nov 12, 2019 at 11:35 AM Ben Tupper wrote:
> Hi,
> 
> I have mined XML extensively with R before now, but my xpath chops seem to be 
> regressing recently. I know that I can roll up my sleeves and search through 
> the child nodes of the root, but I can't noodle out why using the xpath 
> description returns an empty nodeset.
> 
> Any suggestions and nudges most welcome.
> 
> ### START
> 
> library(xml2)
> library(httr)
> library(magrittr)
> 
> daymet_uri <- 
> "https://thredds.daac.ornl.gov/thredds/catalog/ornldaac/1328/catalog.xml"
> 
> # run the following to show the node in a browser
> # httr::BROWSE(daymet_uri)
> 
> daymet <- httr::GET(daymet_uri) %>%
>   httr::content(type = "text/xml", encoding = "UTF-8")
> 
> # list the children "service" and "dataset"
> daymet %>% xml2::xml_children()
> #{xml_nodeset (2)}
> #[1] <service ...>\n  <service name="odap" serviceTyp ...
> #[2] <dataset ...>
> #
> # according to this tutorial we should find 'dataset'
> # https://www.w3schools.com/xml/xpath_syntax.asp 
> 
> daymet %>% xml2::xml_find_all(xpath = "//dataset")
> # {xml_nodeset (0)}
> 
> # I have also tried every other xpath combination I think of e.g.
> #   ".//dataset", "./dataset", "/dataset" and "dataset"
> # They each yield an empty nodeset
> 
> ### END
> 
> > sessionInfo()
> 
> R version 3.5.1 (2018-07-02)
> Platform: x86_64-redhat-linux-gnu (64-bit)
> Running under: CentOS Linux 7 (Core)
> 
> Matrix products: default
> BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
> 
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods
> [7] base
> 
> other attached packages:
> [1] magrittr_1.5 httr_1.4.1   xml2_1.2.2
> 
> loaded via a namespace (and not attached):
> [1] compiler_3.5.1 R6_2.4.0       tools_3.5.1    curl_4.2
> [5] yaml_2.2.0     Rcpp_1.0.3
> 
> 
> Thanks,
> Ben
> 
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org 
> 
> Ecological Forecasting: https://eco.bigelow.org/ 
> 

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

Ecological Forecasting: https://eco.bigelow.org/








Re: [R] QQ plot

2019-11-12 Thread Jim Lemon
I thought about this and did a little study of GWAS and the use of
p-values to assess significant associations. As Ana's plot begins at
values of about 0.001, this seems to imply that almost everything in
the genome is associated to some degree. One expects that most SNPs
will not be associated with a particular condition (p~1), so perhaps
something is going wrong in the calculations that produce the
p-values.

Jim

On Wed, Nov 13, 2019 at 12:28 AM Patrick (Malone Quantitative)
 wrote:
>
> I agree with Abby. That would defeat the purpose of a QQ plot.
>
> On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle  wrote:
>
> > Hi
> >
> > I'm not familiar with the qqman package, or GWAS studies.
> > However, my guess would be that you're *not* supposed to change the
> > position of the line.
> >
> > On Tue, Nov 12, 2019 at 11:48 AM Ana Marija 
> > wrote:
> > >
> > > Hi,
> > >
> > > I was using this library, qqman
> > > https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
> > >
> > > to create QQ plot, attached. How would I change this default abline to
> > > start from the beginning of my QQ line?
> > >
> > > This is my code:
> > > qq(dd$P, main = "Q-Q plot of GWAS p-values")
> > >
> > > Thanks
> > > Ana


Re: [R] using xpath with xml2

2019-11-12 Thread William Dunlap via R-help
> xml_ns(daymet)
d1<-> http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0
xlink <-> http://www.w3.org/1999/xlink
> daymet %>% xml2::xml_find_all(xpath = "d1:dataset")
{xml_nodeset (1)}
[1] <dataset ...>

On Tue, Nov 12, 2019 at 11:35 AM Ben Tupper wrote:

> daymet_uri <- "https://thredds.daac.ornl.gov/thredds/catalog/ornldaac/1328/catalog.xml"
>
> # run the following to show the node in a browser
> # httr::BROWSE(daymet_uri)
>
> daymet <- httr::GET(daymet_uri) %>%
>   httr::content(type = "text/xml", encoding = "UTF-8")
>
> # list the children "service" and "dataset"
> daymet %>% xml2::xml_children()
> #{xml_nodeset (2)}
> #[1] <service ...>\n  <service name="odap" serviceTyp ...
> #[2] <dataset ...>
> #
> # according to this tutorial we should find 'dataset'
> # https://www.w3schools.com/xml/xpath_syntax.asp
> daymet %>% xml2::xml_find_all(xpath = "//dataset")
> # {xml_nodeset (0)}
>
> # I have also tried every other xpath combination I think of e.g.
> #   ".//dataset", "./dataset", "/dataset" and "dataset"
> # They each yield an empty nodeset
>
> ### END
>
> > sessionInfo()
>
> R version 3.5.1 (2018-07-02)
> Platform: x86_64-redhat-linux-gnu (64-bit)
> Running under: CentOS Linux 7 (Core)
>
> Matrix products: default
> BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods
> [7] base
>
> other attached packages:
> [1] magrittr_1.5 httr_1.4.1   xml2_1.2.2
>
> loaded via a namespace (and not attached):
> [1] compiler_3.5.1 R6_2.4.0       tools_3.5.1    curl_4.2
> [5] yaml_2.2.0     Rcpp_1.0.3
>
>
> Thanks,
> Ben
>
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org
>
> Ecological Forecasting: https://eco.bigelow.org/
>


[R] using xpath with xml2

2019-11-12 Thread Ben Tupper
Hi,

I have mined XML extensively with R before now, but my xpath chops seem to be 
regressing recently. I know that I can roll up my sleeves and search through 
the child nodes of the root, but I can't noodle out why using the xpath 
description returns an empty nodeset.

Any suggestions and nudges most welcome.

### START

library(xml2)
library(httr)
library(magrittr)

daymet_uri <- 
"https://thredds.daac.ornl.gov/thredds/catalog/ornldaac/1328/catalog.xml"

# run the following to show the node in a browser
# httr::BROWSE(daymet_uri)

daymet <- httr::GET(daymet_uri) %>%
  httr::content(type = "text/xml", encoding = "UTF-8")

# list the children "service" and "dataset"
daymet %>% xml2::xml_children()
#{xml_nodeset (2)}
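#[1] <service ...>\n  <service name="odap" serviceTyp ...
#[2] <dataset ...>
#
# according to this tutorial we should find 'dataset'
# https://www.w3schools.com/xml/xpath_syntax.asp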
daymet %>% xml2::xml_find_all(xpath = "//dataset")
# {xml_nodeset (0)}

# I have also tried every other xpath combination I think of e.g.
#   ".//dataset", "./dataset", "/dataset" and "dataset"
# They each yield an empty nodeset

### END
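
The empty nodesets are consistent with the catalog declaring a default XML namespace, which is what the `xml_ns()` output in the reply to this message shows: an unprefixed name in XPath 1.0 only matches elements in no namespace. The sketch below reproduces the behaviour offline. The inline document is a hypothetical miniature of the catalog, not the real file, and `xml_ns_strip()` is shown as one possible alternative to using the `d1:` prefix.

```r
library(xml2)

# Hypothetical miniature catalog with a default namespace.
catalog <- read_xml(paste0(
  '<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">',
  '<service name="odap"/>',
  '<dataset name="example"/>',
  '</catalog>'))

# Unprefixed name: does not match the namespaced <dataset> element.
no_prefix <- xml_find_all(catalog, "//dataset")

# xml2 assigns the prefix d1 to the default namespace; see xml_ns(catalog).
with_prefix <- xml_find_all(catalog, "//d1:dataset")

# Alternative: strip the default namespace (this modifies catalog in place),
# after which the unprefixed XPath matches.
xml_ns_strip(catalog)
after_strip <- xml_find_all(catalog, "//dataset")

length(no_prefix)
length(with_prefix)
length(after_strip)
```

Keeping the prefix leaves the document untouched, whereas stripping the namespace is convenient when many unprefixed queries follow.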

> sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

other attached packages:
[1] magrittr_1.5 httr_1.4.1   xml2_1.2.2

loaded via a namespace (and not attached):
[1] compiler_3.5.1 R6_2.4.0       tools_3.5.1    curl_4.2
[5] yaml_2.2.0     Rcpp_1.0.3


Thanks,
Ben

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

Ecological Forecasting: https://eco.bigelow.org/



Re: [R] QQ plot

2019-11-12 Thread Patrick (Malone Quantitative)
I agree with Abby. That would defeat the purpose of a QQ plot.

On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle  wrote:

> Hi
>
> I'm not familiar with the qqman package, or GWAS studies.
> However, my guess would be that you're *not* supposed to change the
> position of the line.
>
> On Tue, Nov 12, 2019 at 11:48 AM Ana Marija 
> wrote:
> >
> > Hi,
> >
> > I was using this library, qqman
> > https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
> >
> > to create QQ plot, attached. How would I change this default abline to
> > start from the beginning of my QQ line?
> >
> > This is my code:
> > qq(dd$P, main = "Q-Q plot of GWAS p-values")
> >
> > Thanks
> > Ana