Re: [R] quantile from quantile table calculation without original data

2021-03-12 Thread Abby Spurdle
Hi Petr,

In principle, I like David's approach the best.
However, I note that there's a bug in the squared step: the ^2 is applied
to the sum of the residuals rather than to each residual.
Furthermore, the variance of the sample quantiles should increase as
they move away from the modal region.

I've built on David's approach, but changed it to a two-stage
optimization algorithm.
The parameter estimates from the first stage are used to compute density values.
Then the second stage is weighted, using the scaled density values.

I tried to create an iteratively reweighted algorithm.
However, it didn't converge.
(But that doesn't necessarily mean it can't be done).

The following code returns the value: 1.648416e-05

qfit.lnorm <- function (p, q, lower.tail=TRUE, ...,
    par0 = c (-0.5, 0.5) )
{   n <- length (p)
    qsample <- q

    #stage one: unweighted sum of squared quantile residuals
    objf <- function (par)
    {   qmodel <- qlnorm (p, par [1], par [2], lower.tail)
        sum ( (qmodel - qsample)^2) / n
    }

    #stage two: weighted sum of squared quantile residuals
    objf.w <- function (wpar, w)
    {   qmodel <- qlnorm (p, wpar [1], wpar [2], lower.tail)
        sum (w * (qmodel - qsample)^2)
    }

    wpar0 <- optim (par0, objf)$par

    #weights: density values computed from the stage-one estimates
    #(note, dlnorm has no lower.tail argument, so it is not passed on)
    w <- dlnorm (p, wpar0 [1], wpar0 [2])
    optim (wpar0, objf.w, w=w)
}

par <- qfit.lnorm (temp$percent, temp$size, FALSE)$par
plnorm (0.1, par [1], par [2])
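
For reference, an iteratively reweighted variant (a rough sketch only,
not necessarily the exact loop referred to above, and it need not
converge) could replace the last three statements of qfit.lnorm with
something like:

wpar <- optim (par0, objf)$par
for (i in 1:25)
{   w <- dlnorm (p, wpar [1], wpar [2])
    fit <- optim (wpar, objf.w, w=w)
    wpar <- fit$par
}
fit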


On Tue, Mar 9, 2021 at 2:52 AM PIKAL Petr  wrote:
>
> Hallo David, Abby and Bert
>
> Thank you for your solutions. In the meantime I found package 
> rriskDistributions, which was able to calculate values for lognormal 
> distribution from quantiles.
>
> Abby
> > 1-psolution
> [1] 9.980823e-06
>
> David
> > plnorm(0.1, -.7020649, .4678656)
> [1] 0.0003120744
>
> rriskDistributions
> > plnorm(0.1, -.6937355, .3881209)
> [1] 1.697379e-05
>
> Bert suggested to ask for original data before quantile calculation what is 
> probably the best but also the most problematic solution. Actually, maybe 
> original data are unavailable as it is the result from particle size 
> measurement, where the software always twist the original data and spits only 
> descriptive results.
>
> All your results are quite consistent with the available values as they are 
> close to 1, so for me, each approach works.
>
> Thank you again.
>
> Best regards.
> Petr
>
> > -Original Message-
> > From: David Winsemius 
> > Sent: Sunday, March 7, 2021 1:33 AM
> > To: Abby Spurdle ; PIKAL Petr
> > 
> > Cc: r-help@r-project.org
> > Subject: Re: [R] quantile from quantile table calculation without original 
> > data
> >
> >
> > On 3/6/21 1:02 AM, Abby Spurdle wrote:
> > > I came up with a solution.
> > > But not necessarily the best solution.
> > >
> > > I used a spline to approximate the quantile function.
> > > Then use that to generate a large sample.
> > > (I don't see any need for the sample to be random, as such).
> > > Then compute the sample mean and sd, on a log scale.
> > > Finally, plug everything into the plnorm function:
> > >
> > > p <- seq (0.01, 0.99,, 1e6)
> > > Fht <- splinefun (temp$percent, temp$size)
> > > x <- log (Fht (p) )
> > > psolution <- plnorm (0.1, mean (x), sd (x), FALSE)
> > > psolution
> > >
> > > The value of the solution is very close to one.
> > > Which is not a surprise.
> > >
> > > Here's a plot of everything:
> > >
> > > u <- seq (0.01, 1.65,, 200)
> > > v <- plnorm (u, mean (x), sd (x), FALSE)
> > > plot (u, v, type="l", ylim = c (0, 1) )
> > > points (temp$size, temp$percent, pch=16)
> > > points (0.1, psolution, pch=16, col="blue")
> >
> > Here's another approach, which uses minimization of the squared error to
> > get the parameters for a lognormal distribution.
> >
> > temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069, 0.3781,
> > 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 
> > 0.95,
> > 0.99)), .Names = c("size", "percent"
> > ), row.names = c(NA, -9L), class = "data.frame")
> >
> > obj <- function(x) {sum( qlnorm(1-temp$percent, x[[1]], x[[2]])-temp$size
> > )^2}
> >
> > # Note the inversion of the poorly named and flipped "percent" column,
> >
> > optim( list(a=-0.65, b=0.42), obj)
> >
> > #
> >
> > $par
> >   a  b
> > -0.7020649  0.4678656
> >
> > $value
> > [1] 3.110316e-12
> >
> > $counts
> > function gradient
> >51   NA
> >
> > $convergence
> > [1] 0
> >
> > $message
> > NULL
> >

Re: [R] quantile from quantile table calculation without original data

2021-03-08 Thread PIKAL Petr
Hallo David, Abby and Bert

Thank you for your solutions. In the meantime I found the package 
rriskDistributions, which was able to calculate lognormal distribution 
parameters from the quantiles.

Abby
> 1-psolution
[1] 9.980823e-06

David
> plnorm(0.1, -.7020649, .4678656)
[1] 0.0003120744

rriskDistributions
> plnorm(0.1, -.6937355, .3881209)
[1] 1.697379e-05
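
(For completeness: the rriskDistributions fit above would have come from
something like the call below; the exact interface is an assumption and is
best checked in ?get.lnorm.par. temp is the data frame from David's quoted
message further down.)

library(rriskDistributions)
# assumed call: fit lognormal parameters to increasing (probability, quantile) pairs
fit <- get.lnorm.par(p = rev(1 - temp$percent), q = rev(temp$size))
fit                          # should be close to meanlog -0.694, sdlog 0.388
plnorm(0.1, fit[1], fit[2])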

Bert suggested asking for the original data from before the quantile 
calculation, which is probably the best but also the most problematic 
solution. Actually, the original data may be unavailable, as the table is 
the result of a particle size measurement, where the software always 
twists the original data and spits out only descriptive results.

All your results are quite consistent with the available values (the fitted 
probability of a size above 0.1 is close to 1 in each case), so for me, each 
approach works.

Thank you again.

Best regards.
Petr

> -Original Message-
> From: David Winsemius 
> Sent: Sunday, March 7, 2021 1:33 AM
> To: Abby Spurdle ; PIKAL Petr
> 
> Cc: r-help@r-project.org
> Subject: Re: [R] quantile from quantile table calculation without original 
> data
> 
> 
> On 3/6/21 1:02 AM, Abby Spurdle wrote:
> > I came up with a solution.
> > But not necessarily the best solution.
> >
> > I used a spline to approximate the quantile function.
> > Then use that to generate a large sample.
> > (I don't see any need for the sample to be random, as such).
> > Then compute the sample mean and sd, on a log scale.
> > Finally, plug everything into the plnorm function:
> >
> > p <- seq (0.01, 0.99,, 1e6)
> > Fht <- splinefun (temp$percent, temp$size)
> > x <- log (Fht (p) )
> > psolution <- plnorm (0.1, mean (x), sd (x), FALSE)
> > psolution
> >
> > The value of the solution is very close to one.
> > Which is not a surprise.
> >
> > Here's a plot of everything:
> >
> > u <- seq (0.01, 1.65,, 200)
> > v <- plnorm (u, mean (x), sd (x), FALSE)
> > plot (u, v, type="l", ylim = c (0, 1) )
> > points (temp$size, temp$percent, pch=16)
> > points (0.1, psolution, pch=16, col="blue")
> 
> Here's another approach, which uses minimization of the squared error to
> get the parameters for a lognormal distribution.
> 
> temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069, 0.3781,
> 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 
> 0.95,
> 0.99)), .Names = c("size", "percent"
> ), row.names = c(NA, -9L), class = "data.frame")
> 
> obj <- function(x) {sum( qlnorm(1-temp$percent, x[[1]], x[[2]])-temp$size
> )^2}
> 
> # Note the inversion of the poorly named and flipped "percent" column,
> 
> optim( list(a=-0.65, b=0.42), obj)
> 
> #
> 
> $par
>   a  b
> -0.7020649  0.4678656
> 
> $value
> [1] 3.110316e-12
> 
> $counts
> function gradient
>51   NA
> 
> $convergence
> [1] 0
> 
> $message
> NULL
> 
> 
> I'm not sure how principled this might be. There's no consideration in this
> approach for expected sampling error at the right tail where the magnitudes
> of the observed values will create much larger contributions to the sum of
> squares.
> 
> --
> 
> David.
> 
> >
> >
> > On Sat, Mar 6, 2021 at 8:09 PM Abby Spurdle 
> wrote:
> >> I'm sorry.
> >> I misread your example, this morning.
> >> (I didn't read the code after the line that calls plot).
> >>
> >> After looking at this problem again, interpolation doesn't apply, and
> >> extrapolation would be a last resort.
> >> If you can assume your data comes from a particular type of
> >> distribution, such as a lognormal distribution, then a better
> >> approach would be to find the most likely parameters.
> >>
> >> i.e.
> >> This falls within the broader scope of maximum likelihood.
> >> (Except that you're dealing with a table of quantile-probability
> >> pairs, rather than raw observational data).
> >>
> >> I suspect that there's a relatively easy way of finding the parameters.
> >>
> >> I'll think about it...
> >> But someone else may come back with an answer first...
> >>
> >>
> >> On Sat, Mar 6, 2021 at 8:17 AM Abby Spurdle 
> wrote:
> >>> I note three problems with your data:
> >>> (1) The name "percent" is misleading, perhaps you want "probability"?
> >>> (2) There are straight (or near-straight) regions, each of which, is
> >>> equally (or near-equally) spaced, which is not what I would expect
> >>> in p

Re: [R] quantile from quantile table calculation without original data

2021-03-08 Thread Jeff Newmiller
I am aware of that... I have my own functions for this purpose that use 
splinefun. But if you are trying to also do other aspects of probability 
distribution calculations, it looked like using fBasics would be easier than 
re-inventing the wheel. I could be wrong, though, since I haven't used fBasics 
myself.

On March 8, 2021 12:41:40 AM PST, Martin Maechler  
wrote:
>> Jeff Newmiller 
>> on Fri, 05 Mar 2021 10:09:41 -0800 writes:
>
>> Your example could probably be resolved with approx. If
>> you want a more robust solution, it looks like the fBasics
>> package can do spline interpolation. 
>
>base R's spline() / splinefun() do spline interpolation !!
>
>Martin

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] quantile from quantile table calculation without original data

2021-03-08 Thread Martin Maechler
> Jeff Newmiller 
> on Fri, 05 Mar 2021 10:09:41 -0800 writes:

> Your example could probably be resolved with approx. If
> you want a more robust solution, it looks like the fBasics
> package can do spline interpolation. 

base R's spline() / splinefun() do spline interpolation !!

Martin



Re: [R] quantile from quantile table calculation without original data

2021-03-06 Thread David Winsemius



On 3/6/21 1:02 AM, Abby Spurdle wrote:

I came up with a solution.
But not necessarily the best solution.

I used a spline to approximate the quantile function.
Then use that to generate a large sample.
(I don't see any need for the sample to be random, as such).
Then compute the sample mean and sd, on a log scale.
Finally, plug everything into the plnorm function:

p <- seq (0.01, 0.99,, 1e6)
Fht <- splinefun (temp$percent, temp$size)
x <- log (Fht (p) )
psolution <- plnorm (0.1, mean (x), sd (x), FALSE)
psolution

The value of the solution is very close to one.
Which is not a surprise.

Here's a plot of everything:

u <- seq (0.01, 1.65,, 200)
v <- plnorm (u, mean (x), sd (x), FALSE)
plot (u, v, type="l", ylim = c (0, 1) )
points (temp$size, temp$percent, pch=16)
points (0.1, psolution, pch=16, col="blue")


Here's another approach, which uses minimization of the squared error to 
get the parameters for a lognormal distribution.


temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
), row.names = c(NA, -9L), class = "data.frame")

obj <- function(x) {sum( qlnorm(1-temp$percent, x[[1]], 
x[[2]])-temp$size )^2}


# Note the inversion of the poorly named and flipped "percent" column,

optim( list(a=-0.65, b=0.42), obj)

#

$par
 a  b
-0.7020649  0.4678656

$value
[1] 3.110316e-12

$counts
function gradient
  51   NA

$convergence
[1] 0

$message
NULL


I'm not sure how principled this might be. There's no consideration in 
this approach for expected sampling error at the right tail where the 
magnitudes of the observed values will create much larger contributions 
to the sum of squares.
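
One possible refinement (a sketch only, not something I have validated) 
would be to minimize the squared error on the log scale, so the fit works 
with relative rather than absolute errors and the large right-tail sizes 
no longer dominate the objective:

obj.log <- function(x) {sum( (log(qlnorm(1-temp$percent, x[[1]], x[[2]])) -
log(temp$size))^2 )}

optim( c(a=-0.65, b=0.42), obj.log)$par

Since log(qlnorm(p, m, s)) = m + s*qnorm(p), this is equivalent to an 
ordinary least-squares fit of log(size) against qnorm(1 - percent).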


--

David.




On Sat, Mar 6, 2021 at 8:09 PM Abby Spurdle  wrote:

I'm sorry.
I misread your example, this morning.
(I didn't read the code after the line that calls plot).

After looking at this problem again, interpolation doesn't apply, and
extrapolation would be a last resort.
If you can assume your data comes from a particular type of
distribution, such as a lognormal distribution, then a better approach
would be to find the most likely parameters.

i.e.
This falls within the broader scope of maximum likelihood.
(Except that you're dealing with a table of quantile-probability
pairs, rather than raw observational data).

I suspect that there's a relatively easy way of finding the parameters.

I'll think about it...
But someone else may come back with an answer first...


On Sat, Mar 6, 2021 at 8:17 AM Abby Spurdle  wrote:

I note three problems with your data:
(1) The name "percent" is misleading, perhaps you want "probability"?
(2) There are straight (or near-straight) regions, each of which, is
equally (or near-equally) spaced, which is not what I would expect in
problems involving "quantiles".
(3) Your plot (approximating the distribution function) is
back-the-front (as per what is customary).


On Fri, Mar 5, 2021 at 10:14 PM PIKAL Petr  wrote:

Dear all

I have table of quantiles, probably from lognormal distribution

  dput(temp)
temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
), row.names = c(NA, -9L), class = "data.frame")

and I need to calculate quantile for size 0.1

plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
ss <- approxfun(temp$size, temp$percent)
points((0:100)/50, ss((0:100)/50))
abline(v=.1)

If I had original data it would be quite easy with ecdf/quantile function but 
without it I am lost what function I could use for such task.

Please, give me some hint where to look.


Best regards

Petr

Re: [R] quantile from quantile table calculation without original data

2021-03-06 Thread Abby Spurdle
I came up with a solution.
But not necessarily the best solution.

I used a spline to approximate the quantile function.
Then use that to generate a large sample.
(I don't see any need for the sample to be random, as such).
Then compute the sample mean and sd, on a log scale.
Finally, plug everything into the plnorm function:

p <- seq (0.01, 0.99,, 1e6)
Fht <- splinefun (temp$percent, temp$size)
x <- log (Fht (p) )
psolution <- plnorm (0.1, mean (x), sd (x), FALSE)
psolution

The value of the solution is very close to one.
Which is not a surprise.

Here's a plot of everything:

u <- seq (0.01, 1.65,, 200)
v <- plnorm (u, mean (x), sd (x), FALSE)
plot (u, v, type="l", ylim = c (0, 1) )
points (temp$size, temp$percent, pch=16)
points (0.1, psolution, pch=16, col="blue")


On Sat, Mar 6, 2021 at 8:09 PM Abby Spurdle  wrote:
>
> I'm sorry.
> I misread your example, this morning.
> (I didn't read the code after the line that calls plot).
>
> After looking at this problem again, interpolation doesn't apply, and
> extrapolation would be a last resort.
> If you can assume your data comes from a particular type of
> distribution, such as a lognormal distribution, then a better approach
> would be to find the most likely parameters.
>
> i.e.
> This falls within the broader scope of maximum likelihood.
> (Except that you're dealing with a table of quantile-probability
> pairs, rather than raw observational data).
>
> I suspect that there's a relatively easy way of finding the parameters.
>
> I'll think about it...
> But someone else may come back with an answer first...
>
>
> On Sat, Mar 6, 2021 at 8:17 AM Abby Spurdle  wrote:
> >
> > I note three problems with your data:
> > (1) The name "percent" is misleading, perhaps you want "probability"?
> > (2) There are straight (or near-straight) regions, each of which, is
> > equally (or near-equally) spaced, which is not what I would expect in
> > problems involving "quantiles".
> > (3) Your plot (approximating the distribution function) is
> > back-the-front (as per what is customary).
> >
> >
> > On Fri, Mar 5, 2021 at 10:14 PM PIKAL Petr  wrote:
> > >
> > > Dear all
> > >
> > > I have table of quantiles, probably from lognormal distribution
> > >
> > >  dput(temp)
> > > temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
> > > 0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
> > > 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
> > > ), row.names = c(NA, -9L), class = "data.frame")
> > >
> > > and I need to calculate quantile for size 0.1
> > >
> > > plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
> > > ss <- approxfun(temp$size, temp$percent)
> > > points((0:100)/50, ss((0:100)/50))
> > > abline(v=.1)
> > >
> > > If I had original data it would be quite easy with ecdf/quantile function 
> > > but without it I am lost what function I could use for such task.
> > >
> > > Please, give me some hint where to look.
> > >
> > >
> > > Best regards
> > >
> > > Petr


Re: [R] quantile from quantile table calculation without original data

2021-03-05 Thread Abby Spurdle
I'm sorry.
I misread your example, this morning.
(I didn't read the code after the line that calls plot).

After looking at this problem again, interpolation doesn't apply, and
extrapolation would be a last resort.
If you can assume your data comes from a particular type of
distribution, such as a lognormal distribution, then a better approach
would be to find the most likely parameters.

i.e.
This falls within the broader scope of maximum likelihood.
(Except that you're dealing with a table of quantile-probability
pairs, rather than raw observational data).

I suspect that there's a relatively easy way of finding the parameters.

I'll think about it...
But someone else may come back with an answer first...


On Sat, Mar 6, 2021 at 8:17 AM Abby Spurdle  wrote:
>
> I note three problems with your data:
> (1) The name "percent" is misleading, perhaps you want "probability"?
> (2) There are straight (or near-straight) regions, each of which, is
> equally (or near-equally) spaced, which is not what I would expect in
> problems involving "quantiles".
> (3) Your plot (approximating the distribution function) is
> back-the-front (as per what is customary).
>
>
> On Fri, Mar 5, 2021 at 10:14 PM PIKAL Petr  wrote:
> >
> > Dear all
> >
> > I have table of quantiles, probably from lognormal distribution
> >
> >  dput(temp)
> > temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
> > 0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
> > 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
> > ), row.names = c(NA, -9L), class = "data.frame")
> >
> > and I need to calculate quantile for size 0.1
> >
> > plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
> > ss <- approxfun(temp$size, temp$percent)
> > points((0:100)/50, ss((0:100)/50))
> > abline(v=.1)
> >
> > If I had original data it would be quite easy with ecdf/quantile function 
> > but without it I am lost what function I could use for such task.
> >
> > Please, give me some hint where to look.
> >
> >
> > Best regards
> >
> > Petr


Re: [R] quantile from quantile table calculation without original data

2021-03-05 Thread Abby Spurdle
I note three problems with your data:
(1) The name "percent" is misleading, perhaps you want "probability"?
(2) There are straight (or near-straight) regions, each of which, is
equally (or near-equally) spaced, which is not what I would expect in
problems involving "quantiles".
(3) Your plot (approximating the distribution function) is
back-to-front (relative to what is customary).


On Fri, Mar 5, 2021 at 10:14 PM PIKAL Petr  wrote:
>
> Dear all
>
> I have table of quantiles, probably from lognormal distribution
>
>  dput(temp)
> temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
> 0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
> 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
> ), row.names = c(NA, -9L), class = "data.frame")
>
> and I need to calculate quantile for size 0.1
>
> plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
> ss <- approxfun(temp$size, temp$percent)
> points((0:100)/50, ss((0:100)/50))
> abline(v=.1)
>
> If I had original data it would be quite easy with ecdf/quantile function but 
> without it I am lost what function I could use for such task.
>
> Please, give me some hint where to look.
>
>
> Best regards
>
> Petr


Re: [R] quantile from quantile table calculation without original data

2021-03-05 Thread David Winsemius



On 3/5/21 1:14 AM, PIKAL Petr wrote:

Dear all

I have table of quantiles, probably from lognormal distribution

  dput(temp)
temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
), row.names = c(NA, -9L), class = "data.frame")

and I need to calculate quantile for size 0.1

plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
ss <- approxfun(temp$size, temp$percent)
points((0:100)/50, ss((0:100)/50))
abline(v=.1)

If I had original data it would be quite easy with ecdf/quantile function but 
without it I am lost what function I could use for such task.


The quantiles are in reverse order, so trying to match the data to 
quantiles from candidate parameters requires subtracting the probabilities from unity:


> temp$size
[1] 1.6000 0.9466 0.8062 0.6477 0.5069 0.3781 0.3047 0.2681 0.1907
> qlnorm(1-temp$percent, -.5)
[1] 6.21116124 3.14198142 2.18485959 1.19063854 0.60653066 0.30897659 
0.16837670 0.11708517 0.05922877

> qlnorm(1-temp$percent, -.9)
[1] 4.16346589 2.10613313 1.46455518 0.79810888 0.40656966 0.20711321 
0.11286628 0.07848454 0.03970223

> qlnorm(1-temp$percent, -2)
[1] 1.38589740 0.70107082 0.48750807 0.26566737 0.13533528 0.06894200 
0.03756992 0.02612523 0.01321572

> qlnorm(1-temp$percent, -1.6)
[1] 2.06751597 1.04587476 0.72727658 0.39632914 0.20189652 0.10284937 
0.05604773 0.03897427 0.01971554

> qlnorm(1-temp$percent, -1.6, .5)
[1] 0.64608380 0.45951983 0.38319004 0.28287360 0.20189652 0.14410042 
0.10637595 0.08870608 0.06309120

> qlnorm(1-temp$percent, -1, .5)
[1] 1.1772414 0.8372997 0.6982178 0.5154293 0.3678794 0.2625681 
0.1938296 0.1616330 0.1149597

> qlnorm(1-temp$percent, -1, .4)
[1] 0.9328967 0.7103066 0.6142340 0.4818106 0.3678794 0.2808889 
0.2203318 0.1905308 0.1450700

> qlnorm(1-temp$percent, -0.5, .4)
[1] 1.5380866 1.1710976 1.0127006 0.7943715 0.6065307 0.4631076 
0.3632657 0.3141322 0.2391799

> qlnorm(1-temp$percent, -0.55, .4)
[1] 1.4630732 1.1139825 0.9633106 0.7556295 0.5769498 0.4405216 
0.3455491 0.2988118 0.2275150

> qlnorm(1-temp$percent, -0.55, .35)
[1] 1.3024170 1.0260318 0.9035201 0.7305712 0.5769498 0.4556313 
0.3684158 0.3244257 0.2555795

> qlnorm(1-temp$percent, -0.55, .45)
[1] 1.6435467 1.2094723 1.0270578 0.7815473 0.5769498 0.4259129 
0.3241016 0.2752201 0.2025322

> qlnorm(1-temp$percent, -0.53, .45)
[1] 1.6767486 1.2339052 1.0478057 0.7973356 0.5886050 0.4345169 
0.3306489 0.2807799 0.2066236

> qlnorm(1-temp$percent, -0.57, .45)
[1] 1.6110023 1.1855231 1.0067207 0.7660716 0.5655254 0.4174793 
0.3176840 0.2697704 0.1985218


Seems like it might be an acceptable fit, modulo the underlying 
data-gathering situation, which really should be considered.


You can fiddle with that result. My statistical hat (not of PhD-level 
certification) says that the middle quantiles in this sequence probably 
have the lowest sampling error for a lognormal, but I'm rather unsure 
about that. A counter-argument might be that, since there is a hard lower 
bound of 0 for the 0-th quantile, you should be more worried about 
matching the 0.1907 value to the 0.01 order statistic, since 99% of the 
data is known to be above it. It seems that matching the 0.50 quantile to 
0.5069 for the logmean parameter, and matching the 0.01 quantile 0.1907 
for the spread (sdlog) parameter, might be preferred to worrying too much 
about the 1.6 value, which is in the right tail (and far away from your 
region of extrapolation).
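
That matching can be done in closed form (a sketch, taking 0.5069 as the 
median and 0.1907 as the 0.01 quantile of the lognormal):

meanlog <- log(0.5069)                            # match the median
sdlog   <- (log(0.1907) - meanlog) / qnorm(0.01)  # match the 0.01 quantile
c(meanlog = meanlog, sdlog = sdlog)               # roughly -0.68 and 0.42

which lands close to the values found by trial and error below.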



Further trial and error:


> qlnorm(1-temp$percent, -0.58, .47)
[1] 1.6709353 1.2129813 1.0225804 0.7687497 0.5598984 0.4077870 
0.3065638 0.2584427 0.1876112

> qlnorm(1-temp$percent, -0.65, .47)
[1] 1.5579697 1.1309763 0.9534476 0.7167775 0.5220458 0.3802181 
0.2858382 0.2409704 0.1749275

> qlnorm(1-temp$percent, -0.65, .5)
[1] 1.6705851 1.1881849 0.9908182 0.7314290 0.5220458 0.3726018 
0.2750573 0.2293682 0.1631355

> qlnorm(1-temp$percent, -0.65, .4)
[1] 1.3238434 1.0079731 0.8716395 0.6837218 0.5220458 0.3986004 
0.3126657 0.2703761 0.2058641

> qlnorm(1-temp$percent, -0.68, .4)
[1] 1.2847179 0.9781830 0.8458786 0.6635148 0.5066170 0.3868200 
0.3034251 0.2623852 0.1997799

> qlnorm(1-temp$percent, -0.65, .39)
[1] 1.2934016 0.9915290 0.8605402 0.6791257 0.5220458 0.4012980 
0.3166985 0.2748601 0.2107093

>
> qlnorm(1-temp$percent, -0.65, .42)
[1] 1.3868932 1.0416839 0.8942693 0.6930076 0.5220458 0.3932595 
0.3047536 0.2616262 0.1965053

> qlnorm(1-temp$percent, -0.68, .42)
[1] 1.3459043 1.0108975 0.8678396 0.6725261 0.5066170 0.3816369 
0.2957468 0.2538940 0.1906976



(I did make an effort at searching for quantile matching as a method for 
distribution fitting, but came up empty.)



--

David.



Please, give me some hint where to look.


Best regards

Petr


Re: [R] quantile from quantile table calculation without original data

2021-03-05 Thread Jeff Newmiller
Your example could probably be resolved with approx. If you want a more robust 
solution, it looks like the fBasics package can do spline interpolation. You 
may want to spline on the log  of your size variable and use exp on the output 
if you want to avoid negative results.
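
A minimal base-R sketch of that suggestion (stats::splinefun in place of 
fBasics, with the temp data from the post quoted below). This reconstructs 
the size-versus-probability curve; getting a probability at size 0.1 still 
needs extrapolation or a distributional assumption, as the rest of the 
thread discusses:

temp <- data.frame(
  size    = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069, 0.3781, 0.3047, 0.2681, 0.1907),
  percent = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99))

# spline the log of size against the tabulated (exceedance) probability;
# exp() on the output keeps interpolated sizes positive
qf <- splinefun(temp$percent, log(temp$size))
exp(qf(0.5))               # passes through the tabulated median, 0.5069
exp(qf(c(0.2, 0.6, 0.8)))  # interpolated sizes at other probabilities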

On March 5, 2021 1:14:22 AM PST, PIKAL Petr  wrote:
>Dear all
>
>I have table of quantiles, probably from lognormal distribution
>
> dput(temp)
>temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
>0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
>0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
>), row.names = c(NA, -9L), class = "data.frame")
>
>and I need to calculate quantile for size 0.1
>
>plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
>ss <- approxfun(temp$size, temp$percent)
>points((0:100)/50, ss((0:100)/50))
>abline(v=.1)
>
>If I had original data it would be quite easy with ecdf/quantile
>function but without it I am lost what function I could use for such
>task.
>
>Please, give me some hint where to look.
>
>
>Best regards
>
>Petr

-- 
Sent from my phone. Please excuse my brevity.



[R] quantile from quantile table calculation without original data

2021-03-05 Thread PIKAL Petr
Dear all

I have a table of quantiles, probably from a lognormal distribution

 dput(temp)
temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
), row.names = c(NA, -9L), class = "data.frame")

and I need to calculate quantile for size 0.1

plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
ss <- approxfun(temp$size, temp$percent)
points((0:100)/50, ss((0:100)/50))
abline(v=.1)

If I had the original data it would be quite easy with the ecdf/quantile 
functions, but without it I am at a loss as to what function I could use for 
such a task.

Please, give me some hint where to look.


Best regards

Petr