Re: [R] Quadratic function with interaction terms for the PLS fitting model?

2017-07-13 Thread Bert Gunter
David et al.:

It's a problem with poly() (or rather with how it is being misused).

> mx <- as.matrix(gasoline[1:50,"NIR"])
> str(mx)
 AsIs [1:50, 1:401] -0.0502 -0.0442 -0.0469 -0.0467 -0.0509 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:50] "1" "2" "3" "4" ...
  ..$ : chr [1:401] "900 nm" "902 nm" "904 nm" "906 nm" ...

> poly(mx[,1:5],2)  ## only 5 columns
Error in poly(dots[[i]], degree, raw = raw, simple = raw) :
  'degree' must be less than number of unique points
> out <- poly(mx[,1:5], degree =2)
> dim(out)
[1] 50 20
## So this is same issue as before. But:

> out <- poly(mx[,1:30],degree = 2)

## 30 columns means 30 linear + 30 squared + 435 cross-product = 495 terms
## of degree <= 2, but there are at most 50 orthogonal vectors in the
## 50-dimensional space; ergo, poly() chokes rather gracelessly, which is
## what you saw, with the following output:

rsession(2093,0x7fffe0c113c0) malloc: ***
mach_vm_map(size=823564528381952) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
rsession(2093,0x7fffe0c113c0) malloc: ***
mach_vm_map(size=823564528381952) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Error: cannot allocate vector of size 767004.2 Gb
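
The column counts can be checked directly. A degree-2 basis in p variables has choose(p + 2, 2) - 1 columns, which gives the 20 seen above for p = 5. The sketch below also compares the failed allocation size in the log with 4 * 3^30 bytes. Reading that as a grid of every exponent combination being built before filtering is an inference from the numbers, not something confirmed in this thread. The helper name n_terms is made up for illustration.

```r
# Number of polynomial terms of total degree 1..d in p variables
# (hypothetical helper, not part of any package).
n_terms <- function(p, d = 2) choose(p + d, d) - 1

n_terms(5)    # 20, matching dim(out) above
n_terms(30)   # 495, far more than the 50 rows available
n_terms(401)  # 81002 for the full NIR matrix

# Size of a grid holding every combination of exponents 0:2 for 30 columns,
# at 4 bytes per integer (an assumption about how the basis is enumerated).
grid_bytes <- 4 * 3^30
grid_bytes / 823564528381952  # very close to 1
```

Either way, the number of candidate columns outruns the 50 observations long before all 401 wavelengths are included.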

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jul 13, 2017 at 4:36 PM, David Winsemius  wrote:
>
>> On Jul 13, 2017, at 10:43 AM, Bert Gunter  wrote:
>>
>> poly(NIR, degree = 2) will work if NIR is a matrix, not a data.frame.
>> The degree argument apparently  *must* be explicitly named if NIR is
>> not a numeric vector. AFAICS, this is unclear or unstated in ?poly.
>
> I still get the same error with:
>
> library(pls)
> data(gasoline)
> gasTrain <- gasoline[1:50,]
> gas1 <- plsr(octane ~ poly(as.matrix(NIR), 2), ncomp = 10, data = gasTrain, 
> validation = "LOO")
>
>
> Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
>   invalid 'times' value
>
>> gas1 <- plsr(octane ~ poly(as.matrix(gasTrain$NIR), degree=2), ncomp = 10, 
>> data = gasTrain, validation = "CV")
> Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
>   invalid 'times' value
>
>> str(as.matrix(gasTrain$NIR))
>  AsIs [1:50, 1:401] -0.0502 -0.0442 -0.0469 -0.0467 -0.0509 ...
>  - attr(*, "dimnames")=List of 2
>   ..$ : chr [1:50] "1" "2" "3" "4" ...
>   ..$ : chr [1:401] "900 nm" "902 nm" "904 nm" "906 nm" ...
>
> So tried to strip the RHS down to a "simple" matrix
>
>> gas1 <- plsr(octane ~ poly(matrix(gasTrain$NIR, nrow=nrow(gasTrain$NIR) ), 
>> degree=2), ncomp = 10, data = gasTrain, validation = "CV")
> Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
>   invalid 'times' value
>
> I guess it reflects my lack of understanding of poly (which parallels my lack 
> of understanding of PLS.)
> --
> David.
>>
>>
>> -- Bert
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Thu, Jul 13, 2017 at 10:15 AM, David Winsemius
>>  wrote:
>>>
>>>> On Jul 12, 2017, at 6:58 PM, Ng, Kelvin Sai-cheong
>>>> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I am using the pls package of R to perform partial least square on a set of
>>>> multivariate data.  Instead of fitting a linear model, I want to fit my
>>>> data with a quadratic function with interaction terms.  But I am not sure
>>>> how.  I will use an example to illustrate my problem:
>>>>
>>>> Following the example in the PLS manual:
>>>> ## Read data
>>>> data(gasoline)
>>>> gasTrain <- gasoline[1:50,]
>>>> ## Perform PLS
>>>> gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO")
>>>>
>>>> where octane ~ NIR is the model that this example is fitting with.
>>>>
>>>> NIR is a collective of variables, i.e. NIR spectra consists of 401 diffuse
>>>> reflectance measurements from 900 to 1700 nm.
>>>>
>>>> Instead of fitting with predict.octane[i] = a[0] * NIR[0,i] + a[1] *
>>>> NIR[1,i] + ...
>>>> I want to fit the data with:
>>>> predict.octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... +
>>>> b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ...
>>>>
>>>> i.e. quadratic with interaction terms.
>>>>
>>>> But I don't know how to formulate this.
>>>
>>> I did not see any terms in the model that I would have called interaction 
>>> terms. I'm seeing a desire for a polynomial function in NIR. For that 
>>> purpose, one might see if you get satisfactory results with:
>>>
>>> gas1 <- plsr(octane ~NIR + I(NIR^2), ncomp = 10, data = gasTrain, 
>>> validation = "LOO")
>>> gas1
>>>
>>> I first tried using poly(NIR, 2) on the RHS and it threw an error, which 
>>> raises concerns in my mind that this may not be a proper model. 

Re: [R] Quadratic function with interaction terms for the PLS fitting model?

2017-07-13 Thread David Winsemius

> On Jul 13, 2017, at 10:43 AM, Bert Gunter  wrote:
> 
> poly(NIR, degree = 2) will work if NIR is a matrix, not a data.frame.
> The degree argument apparently  *must* be explicitly named if NIR is
> not a numeric vector. AFAICS, this is unclear or unstated in ?poly.

I still get the same error with:

library(pls)
data(gasoline)
gasTrain <- gasoline[1:50,]
gas1 <- plsr(octane ~ poly(as.matrix(NIR), 2), ncomp = 10, data = gasTrain, 
validation = "LOO")


Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : 
  invalid 'times' value

> gas1 <- plsr(octane ~ poly(as.matrix(gasTrain$NIR), degree=2), ncomp = 10,
+              data = gasTrain, validation = "CV")
Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : 
  invalid 'times' value

> str(as.matrix(gasTrain$NIR))
 AsIs [1:50, 1:401] -0.0502 -0.0442 -0.0469 -0.0467 -0.0509 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:50] "1" "2" "3" "4" ...
  ..$ : chr [1:401] "900 nm" "902 nm" "904 nm" "906 nm" ...

So I tried to strip the RHS down to a "simple" matrix:

> gas1 <- plsr(octane ~ poly(matrix(gasTrain$NIR, nrow=nrow(gasTrain$NIR)),
+              degree=2), ncomp = 10, data = gasTrain, validation = "CV")
Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : 
  invalid 'times' value

I guess it reflects my lack of understanding of poly (which parallels my lack 
of understanding of PLS.)
-- 
David.
> 
> 
> -- Bert
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> 
> On Thu, Jul 13, 2017 at 10:15 AM, David Winsemius
>  wrote:
>> 
>>> On Jul 12, 2017, at 6:58 PM, Ng, Kelvin Sai-cheong  
>>> wrote:
>>> 
>>> Dear all,
>>> 
>>> I am using the pls package of R to perform partial least square on a set of
>>> multivariate data.  Instead of fitting a linear model, I want to fit my
>>> data with a quadratic function with interaction terms.  But I am not sure
>>> how.  I will use an example to illustrate my problem:
>>> 
>>> Following the example in the PLS manual:
>>> ## Read data
>>> data(gasoline)
>>> gasTrain <- gasoline[1:50,]
>>> ## Perform PLS
>>> gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO")
>>> 
>>> where octane ~ NIR is the model that this example is fitting with.
>>> 
>>> NIR is a collective of variables, i.e. NIR spectra consists of 401 diffuse
>>> reflectance measurements from 900 to 1700 nm.
>>> 
>>> Instead of fitting with predict.octane[i] = a[0] * NIR[0,i] + a[1] *
>>> NIR[1,i] + ...
>>> I want to fit the data with:
>>> predict.octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... +
>>> b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ...
>>> 
>>> i.e. quadratic with interaction terms.
>>> 
>>> But I don't know how to formulate this.
>> 
>> I did not see any terms in the model that I would have called interaction 
>> terms. I'm seeing a desire for a polynomial function in NIR. For that 
>> purpose, one might see if you get satisfactory results with:
>> 
>> gas1 <- plsr(octane ~NIR + I(NIR^2), ncomp = 10, data = gasTrain, validation 
>> = "LOO")
>> gas1
>> 
>> I first tried using poly(NIR, 2) on the RHS and it threw an error, which 
>> raises concerns in my mind that this may not be a proper model. I have no 
>> experience with the use of plsr or its underlying theory, so the fact that 
>> this is not throwing an error is no guarantee of validity. Using this 
>> construction in ordinary least squares regression has dangers with 
>> inferential statistics because of the correlation of the linear and squared 
>> terms as well as likely violation of homoscedasticity.
>> 
>> --
>> David.
>> 
>> 
>>> 
>>> May I have some help please?
>>> 
>>> Thanks,
>>> 
>>> Kelvin
>>> 
>>>  [[alternative HTML version deleted]]
>>> 
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> David Winsemius
>> Alameda, CA, USA
>> 

David Winsemius
Alameda, CA, USA



Re: [R] Quadratic function with interaction terms for the PLS fitting model?

2017-07-13 Thread Bert Gunter
> It would seem reasonable that the help for poly() could make it explicitly 
> clear that if 'x' is not a vector, but is a matrix, that 'degree' must be 
> explicitly named.
>
> Regards,
>
> Marc
>

Exactly. As written, there is no reason to believe that the stated
exception to the ... argument matching rule does not also apply when x is
a matrix.

-- Bert



Re: [R] Quadratic function with interaction terms for the PLS fitting model?

2017-07-13 Thread Marc Schwartz
Hi Bert,

OK, to your initial point, the key nuance is that if 'x' is a vector you can
leave the 'degree' argument unnamed, but if 'x' is a matrix you cannot.
That behavior does not seem to change whether poly() is called standalone
or, as suggested in ?poly, within a formula.

Tracing through the code with debug(), I see the error triggered for 'mx'
when the following code is run within poly(), where 'x' inside the function
is 'mx'. Note that my 'mx' was generated using new calls to rnorm():

if (is.matrix(x)) {
    m <- unclass(as.data.frame(cbind(x, ...)))
    return(do.call(polym, c(m, degree = degree, raw = raw,
                            list(coefs = coefs))))
}

'm' ends up being:

Browse[2]> m
$x1
 [1] 0.11551124 0.36245863 0.44844573 0.89193967 0.91431981 0.16244275
 [7] 0.28070518 0.34013156 0.26561721 0.52915461 0.88164507 0.42485427
[13] 0.48844831 0.60092526 0.01493797 0.41814162 0.31549893 0.19483697
[19] 0.16003496 0.52635862

$x2
 [1] 0.89119433 0.02665353 0.03954367 0.37604374 0.05604632 0.86123698
 [7] 0.11106261 0.15707524 0.32433273 0.62476982 0.70646979 0.78843108
[13] 0.63674970 0.17091172 0.65220425 0.64087676 0.56903083 0.21398002
[19] 0.02820857 0.47113431

$V3
 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

attr(,"row.names")
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20


On the third 'loop' over the list elements in 'm' via do.call(), m$V3 is passed 
to polym() as its 'x' argument:

Browse[3]> x
 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
Browse[3]> degree
[1] 2
Browse[3]> length(unique(x))
[1] 1


The following check is triggered and, of course, fails with the error message:

if (degree >= length(unique(x)))
    stop("'degree' must be less than number of unique points")

Thus, in effect, the following is being called:

> polym(rep(2, 20), degree = 2)
Error in poly(dots[[1L]], degree, raw = raw, simple = raw && nd > 1) : 
  'degree' must be less than number of unique points
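
The mechanism can be reproduced outside the debugger. This is a sketch that mirrors the cbind(x, ...) step quoted above, with the unnamed 2 standing in for '...'. Whether poly(mx, 2) itself fails may depend on the R version in use, so that call is wrapped in tryCatch() rather than assumed to error.

```r
set.seed(1)
mx <- cbind(x1 = runif(20), x2 = runif(20))

# Mimic what happens inside poly() when the unnamed 2 is still part of '...':
# it is bound on as an extra, constant column.
m <- unclass(as.data.frame(cbind(mx, 2)))
names(m)               # "x1" "x2" "V3"
length(unique(m$V3))   # 1, so a degree-2 fit on this column is impossible

# Naming 'degree' keeps the 2 out of '...'.
ok <- poly(mx, degree = 2)
dim(ok)                # 20 rows, 5 columns

# The outcome of the unnamed form is left open; capture it either way.
res <- tryCatch(poly(mx, 2), error = function(e) conditionMessage(e))
```

Naming the argument is the form that does not rely on how the unnamed value is handled.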


It would seem reasonable that the help for poly() could make it explicitly 
clear that if 'x' is not a vector, but is a matrix, that 'degree' must be 
explicitly named.

Regards,

Marc


> On Jul 13, 2017, at 1:34 PM, Bert Gunter  wrote:
> 
> Marc:
> 
> 1. I am aware of the need to explicitly name arguments after ... --
> see the R Language definition where this can be inferred from the
> argument matching rules.
> 
> 2. I am aware of the stated exception for poly(). However:
> 
>> x1 <- runif(20)
>> x2 <- runif(20)
>> mx <- cbind(x1,x2)
>> poly(mx,2)
> Error in poly(dots[[i]], degree, raw = raw, simple = raw) :
>  'degree' must be less than number of unique points
> 
>> poly(mx, degree = 2)
> 1.0   2.0 0.1  1.1 0.2
> [1,] -0.2984843  0.0402593349 -0.07095761  0.021179734 -0.22909595
> [2,]  0.2512177  0.2172530896  0.29620999  0.074413206  0.14508422
> [3,]  0.2775652  0.3085750335 -0.13955410 -0.038735366 -0.13729529
> [4,] -0.4090782  0.4032189266 -0.14737858  0.060289370 -0.12358925
> [5,] -0.1631886 -0.2221937915 -0.26690975  0.043556631  0.16814432
> [6,]  0.1770952  0.0009863446  0.25380650  0.044947925  0.02737265
> [7,] -0.2108146 -0.1525957018  0.34023304 -0.071726094  0.28787441
> [8,]  0.2693983  0.2794576400  0.04697126  0.012653979 -0.26792015
> [9,]  0.2014353  0.0653896008 -0.37013148 -0.074557536  0.54445808
> [10,] -0.1002967 -0.2761638672 -0.29389518  0.029476714  0.25539539
> [11,]  0.1132090 -0.1372916959  0.21619808  0.024475573 -0.06074932
> [12,] -0.1116108 -0.2696398425 -0.14592886  0.016287234 -0.12617869
> [13,]  0.1792535  0.0064357827 -0.04948750 -0.008870809 -0.24736773
> [14,] -0.1167216 -0.2662346206 -0.20209364  0.023588696 -0.00923419
> [15,] -0.4258838  0.4700591049  0.08836730 -0.037634205 -0.24586894
> [16,]  0.1047271 -0.1523001267 -0.21491954 -0.022507896  0.02225837
> [17,] -0.1985753 -0.1728455549  0.32036901 -0.063617358  0.22084868
> [18,]  0.1844006  0.0196368680  0.32321195  0.059600465  0.23017961
> [19,]  0.1009775 -0.1586846110 -0.08282554 -0.008363512 -0.21685556
> [20,]  0.1753745 -0.0033219134  0.09871464  0.017312033 -0.23746062
> attr(,"degree")
> [1] 1 2 1 2 2
> attr(,"coefs")
> attr(,"coefs")[[1]]
> attr(,"coefs")[[1]]$alpha
> [1] 0.5477073 0.4154115
> 
> attr(,"coefs")[[1]]$norm2
> [1]  1. 20.  1.55009761  0.08065872
> 
> Cheers,
> Bert
> 
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> 
> On Thu, Jul 13, 2017 at 11:17 AM, Marc Schwartz  wrote:
>> Bert,
>> 
>> The 'degree' argument follows the "..." argument in the function declaration:
>> 
>>  poly(x, ..., degree = 1, coefs = NULL, raw = FALSE, simple = FALSE)
>> 
>> Generally, any arguments after the "..." must be explicitly named, but as 
>> per the Details section of ?poly:
>> 
>> "Although formally 

Re: [R] Quadratic function with interaction terms for the PLS fitting model?

2017-07-13 Thread Bert Gunter
Marc:

1. I am aware of the need to explicitly name arguments after ... --
see the R Language definition where this can be inferred from the
argument matching rules.

2. I am aware of the stated exception for poly(). However:

> x1 <- runif(20)
> x2 <- runif(20)
> mx <- cbind(x1,x2)
> poly(mx,2)
Error in poly(dots[[i]], degree, raw = raw, simple = raw) :
  'degree' must be less than number of unique points

> poly(mx, degree = 2)
             1.0           2.0         0.1          1.1         0.2
 [1,] -0.2984843  0.0402593349 -0.07095761  0.021179734 -0.22909595
 [2,]  0.2512177  0.2172530896  0.29620999  0.074413206  0.14508422
 [3,]  0.2775652  0.3085750335 -0.13955410 -0.038735366 -0.13729529
 [4,] -0.4090782  0.4032189266 -0.14737858  0.060289370 -0.12358925
 [5,] -0.1631886 -0.2221937915 -0.26690975  0.043556631  0.16814432
 [6,]  0.1770952  0.0009863446  0.25380650  0.044947925  0.02737265
 [7,] -0.2108146 -0.1525957018  0.34023304 -0.071726094  0.28787441
 [8,]  0.2693983  0.2794576400  0.04697126  0.012653979 -0.26792015
 [9,]  0.2014353  0.0653896008 -0.37013148 -0.074557536  0.54445808
[10,] -0.1002967 -0.2761638672 -0.29389518  0.029476714  0.25539539
[11,]  0.1132090 -0.1372916959  0.21619808  0.024475573 -0.06074932
[12,] -0.1116108 -0.2696398425 -0.14592886  0.016287234 -0.12617869
[13,]  0.1792535  0.0064357827 -0.04948750 -0.008870809 -0.24736773
[14,] -0.1167216 -0.2662346206 -0.20209364  0.023588696 -0.00923419
[15,] -0.4258838  0.4700591049  0.08836730 -0.037634205 -0.24586894
[16,]  0.1047271 -0.1523001267 -0.21491954 -0.022507896  0.02225837
[17,] -0.1985753 -0.1728455549  0.32036901 -0.063617358  0.22084868
[18,]  0.1844006  0.0196368680  0.32321195  0.059600465  0.23017961
[19,]  0.1009775 -0.1586846110 -0.08282554 -0.008363512 -0.21685556
[20,]  0.1753745 -0.0033219134  0.09871464  0.017312033 -0.23746062
attr(,"degree")
[1] 1 2 1 2 2
attr(,"coefs")
attr(,"coefs")[[1]]
attr(,"coefs")[[1]]$alpha
[1] 0.5477073 0.4154115

attr(,"coefs")[[1]]$norm2
[1]  1.00000000 20.00000000  1.55009761  0.08065872
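
The column labels in that output can be read as exponent pairs, so "1.1" is the x1 * x2 cross term that the original question calls an interaction. A small sketch to check this, using raw = TRUE so the columns are plain powers and products rather than orthogonalized ones (the seed and data here are arbitrary, not the values printed above):

```r
set.seed(42)
x1 <- runif(20)
x2 <- runif(20)
mx <- cbind(x1, x2)

# raw = TRUE returns plain powers and products instead of orthogonal columns.
p <- unclass(poly(mx, degree = 2, raw = TRUE))
colnames(p)

# Column "a.b" holds x1^a * x2^b, so "1.1" is the cross product.
all.equal(as.numeric(p[, "1.1"]), x1 * x2)
all.equal(as.numeric(p[, "2.0"]), x1^2)
```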

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jul 13, 2017 at 11:17 AM, Marc Schwartz  wrote:
> Bert,
>
> The 'degree' argument follows the "..." argument in the function declaration:
>
>   poly(x, ..., degree = 1, coefs = NULL, raw = FALSE, simple = FALSE)
>
> Generally, any arguments after the "..." must be explicitly named, but as per 
> the Details section of ?poly:
>
> "Although formally degree should be named (as it follows ...), an unnamed 
> second argument of length 1 will be interpreted as the degree, such that 
> poly(x, 3) can be used in formulas."
>
> The issue of having to explicitly name arguments that follow the three dots 
> has come up over the years, but I cannot recall where that is documented in 
> the manuals.
>
> Regards,
>
> Marc
>
>
>
>> On Jul 13, 2017, at 12:43 PM, Bert Gunter  wrote:
>>
>> poly(NIR, degree = 2) will work if NIR is a matrix, not a data.frame.
>> The degree argument apparently  *must* be explicitly named if NIR is
>> not a numeric vector. AFAICS, this is unclear or unstated in ?poly.
>>
>>
>> -- Bert
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Thu, Jul 13, 2017 at 10:15 AM, David Winsemius
>>  wrote:
>>>
>>>> On Jul 12, 2017, at 6:58 PM, Ng, Kelvin Sai-cheong
>>>> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I am using the pls package of R to perform partial least square on a set of
>>>> multivariate data.  Instead of fitting a linear model, I want to fit my
>>>> data with a quadratic function with interaction terms.  But I am not sure
>>>> how.  I will use an example to illustrate my problem:
>>>>
>>>> Following the example in the PLS manual:
>>>> ## Read data
>>>> data(gasoline)
>>>> gasTrain <- gasoline[1:50,]
>>>> ## Perform PLS
>>>> gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO")
>>>>
>>>> where octane ~ NIR is the model that this example is fitting with.
>>>>
>>>> NIR is a collective of variables, i.e. NIR spectra consists of 401 diffuse
>>>> reflectance measurements from 900 to 1700 nm.
>>>>
>>>> Instead of fitting with predict.octane[i] = a[0] * NIR[0,i] + a[1] *
>>>> NIR[1,i] + ...
>>>> I want to fit the data with:
>>>> predict.octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... +
>>>> b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ...
>>>>
>>>> i.e. quadratic with interaction terms.
>>>>
>>>> But I don't know how to formulate this.
>>>
>>> I did not see any terms in the model that I would have called interaction 
>>> terms. I'm seeing a desire for a polynomial function in NIR. For that 
>>> purpose, 

Re: [R] Quadratic function with interaction terms for the PLS fitting model?

2017-07-13 Thread Marc Schwartz
Bert,

The 'degree' argument follows the "..." argument in the function declaration:

  poly(x, ..., degree = 1, coefs = NULL, raw = FALSE, simple = FALSE)

Generally, any arguments after the "..." must be explicitly named, but as per 
the Details section of ?poly:

"Although formally degree should be named (as it follows ...), an unnamed 
second argument of length 1 will be interpreted as the degree, such that 
poly(x, 3) can be used in formulas."

The issue of having to explicitly name arguments that follow the three dots has 
come up over the years, but I cannot recall where that is documented in the 
manuals.
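
The general rule is easy to see with a toy function that has the same argument layout as poly(). The function f below is made up for illustration. It only reports what ended up in '...' and what 'degree' was set to.

```r
# Toy function with the same argument layout as poly(); 'f' is hypothetical.
f <- function(x, ..., degree = 1) {
  list(n_dots = length(list(...)), degree = degree)
}

f(1:5, 3)           # the 3 is swallowed by '...'; degree stays 1
f(1:5, degree = 3)  # degree is 3, '...' is empty
f(1:5, deg = 3)     # partial matching is not applied after '...'; degree stays 1
```

The argument-matching section of the R Language Definition is where this behavior is described: formals that follow '...' are matched only by their full name.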

Regards,

Marc



> On Jul 13, 2017, at 12:43 PM, Bert Gunter  wrote:
> 
> poly(NIR, degree = 2) will work if NIR is a matrix, not a data.frame.
> The degree argument apparently  *must* be explicitly named if NIR is
> not a numeric vector. AFAICS, this is unclear or unstated in ?poly.
> 
> 
> -- Bert
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> 
> On Thu, Jul 13, 2017 at 10:15 AM, David Winsemius
>  wrote:
>> 
>>> On Jul 12, 2017, at 6:58 PM, Ng, Kelvin Sai-cheong  
>>> wrote:
>>> 
>>> Dear all,
>>> 
>>> I am using the pls package of R to perform partial least square on a set of
>>> multivariate data.  Instead of fitting a linear model, I want to fit my
>>> data with a quadratic function with interaction terms.  But I am not sure
>>> how.  I will use an example to illustrate my problem:
>>> 
>>> Following the example in the PLS manual:
>>> ## Read data
>>> data(gasoline)
>>> gasTrain <- gasoline[1:50,]
>>> ## Perform PLS
>>> gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO")
>>> 
>>> where octane ~ NIR is the model that this example is fitting with.
>>> 
>>> NIR is a collective of variables, i.e. NIR spectra consists of 401 diffuse
>>> reflectance measurements from 900 to 1700 nm.
>>> 
>>> Instead of fitting with predict.octane[i] = a[0] * NIR[0,i] + a[1] *
>>> NIR[1,i] + ...
>>> I want to fit the data with:
>>> predict.octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... +
>>> b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ...
>>> 
>>> i.e. quadratic with interaction terms.
>>> 
>>> But I don't know how to formulate this.
>> 
>> I did not see any terms in the model that I would have called interaction 
>> terms. I'm seeing a desire for a polynomial function in NIR. For that 
>> purpose, one might see if you get satisfactory results with:
>> 
>> gas1 <- plsr(octane ~NIR + I(NIR^2), ncomp = 10, data = gasTrain, validation 
>> = "LOO")
>> gas1
>> 
>> I first tried using poly(NIR, 2) on the RHS and it threw an error, which 
>> raises concerns in my mind that this may not be a proper model. I have no 
>> experience with the use of plsr or its underlying theory, so the fact that 
>> this is not throwing an error is no guarantee of validity. Using this 
>> construction in ordinary least squares regression has dangers with 
>> inferential statistics because of the correlation of the linear and squared 
>> terms as well as likely violation of homoscedasticity.
>> 
>> --
>> David.
>> 
>> 
>>> 
>>> May I have some help please?
>>> 
>>> Thanks,
>>> 
>>> Kelvin
>>> 
>>>  [[alternative HTML version deleted]]
>>> 
>> 
>> David Winsemius
>> Alameda, CA, USA
>> 
> 



Re: [R] Quadratic function with interaction terms for the PLS fitting model?

2017-07-13 Thread Bert Gunter
poly(NIR, degree = 2) will work if NIR is a matrix, not a data.frame.
The degree argument apparently  *must* be explicitly named if NIR is
not a numeric vector. AFAICS, this is unclear or unstated in ?poly.


-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jul 13, 2017 at 10:15 AM, David Winsemius
 wrote:
>
>> On Jul 12, 2017, at 6:58 PM, Ng, Kelvin Sai-cheong  
>> wrote:
>>
>> Dear all,
>>
>> I am using the pls package of R to perform partial least square on a set of
>> multivariate data.  Instead of fitting a linear model, I want to fit my
>> data with a quadratic function with interaction terms.  But I am not sure
>> how.  I will use an example to illustrate my problem:
>>
>> Following the example in the PLS manual:
>> ## Read data
>> data(gasoline)
>> gasTrain <- gasoline[1:50,]
>> ## Perform PLS
>> gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO")
>>
>> where octane ~ NIR is the model that this example is fitting with.
>>
>> NIR is a collective of variables, i.e. NIR spectra consists of 401 diffuse
>> reflectance measurements from 900 to 1700 nm.
>>
>> Instead of fitting with predict.octane[i] = a[0] * NIR[0,i] + a[1] *
>> NIR[1,i] + ...
>> I want to fit the data with:
>> predict.octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... +
>> b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ...
>>
>> i.e. quadratic with interaction terms.
>>
>> But I don't know how to formulate this.
>
> I did not see any terms in the model that I would have called interaction 
> terms. I'm seeing a desire for a polynomial function in NIR. For that 
> purpose, one might see if you get satisfactory results with:
>
> gas1 <- plsr(octane ~NIR + I(NIR^2), ncomp = 10, data = gasTrain, validation 
> = "LOO")
> gas1
>
> I first tried using poly(NIR, 2) on the RHS and it threw an error, which 
> raises concerns in my mind that this may not be a proper model. I have no 
> experience with the use of plsr or its underlying theory, so the fact that 
> this is not throwing an error is no guarantee of validity. Using this 
> construction in ordinary least squares regression has dangers with 
> inferential statistics because of the correlation of the linear and squared 
> terms as well as likely violation of homoscedasticity.
>
> --
> David.
>
>
>>
>> May I have some help please?
>>
>> Thanks,
>>
>> Kelvin
>>
>>   [[alternative HTML version deleted]]
>>
>
> David Winsemius
> Alameda, CA, USA
>



Re: [R] Quadratic function with interaction terms for the PLS fitting model?

2017-07-13 Thread David Winsemius

> On Jul 12, 2017, at 6:58 PM, Ng, Kelvin Sai-cheong  
> wrote:
> 
> Dear all,
> 
> I am using the pls package of R to perform partial least square on a set of
> multivariate data.  Instead of fitting a linear model, I want to fit my
> data with a quadratic function with interaction terms.  But I am not sure
> how.  I will use an example to illustrate my problem:
> 
> Following the example in the PLS manual:
> ## Read data
> data(gasoline)
> gasTrain <- gasoline[1:50,]
> ## Perform PLS
> gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO")
> 
> where octane ~ NIR is the model that this example is fitting with.
> 
> NIR is a collective of variables, i.e. NIR spectra consists of 401 diffuse
> reflectance measurements from 900 to 1700 nm.
> 
> Instead of fitting with predict.octane[i] = a[0] * NIR[0,i] + a[1] *
> NIR[1,i] + ...
> I want to fit the data with:
> predict.octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... +
> b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ...
> 
> i.e. quadratic with interaction terms.
> 
> But I don't know how to formulate this.

I did not see any terms in the model that I would have called interaction 
terms. I'm seeing a desire for a polynomial function in NIR. For that purpose, 
one might see if you get satisfactory results with:

gas1 <- plsr(octane ~ NIR + I(NIR^2), ncomp = 10, data = gasTrain,
             validation = "LOO")
gas1

I first tried using poly(NIR, 2) on the RHS and it threw an error, which raises 
concerns in my mind that this may not be a proper model. I have no experience 
with the use of plsr or its underlying theory, so the fact that this is not 
throwing an error is no guarantee of validity. Using this construction in 
ordinary least squares regression has dangers with inferential statistics 
because of the correlation of the linear and squared terms as well as likely 
violation of homoscedasticity.
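
To make the difference between "squares only" and "squares plus cross products" concrete, here is a sketch with a small random matrix standing in for the 401-column NIR matrix. All object names (X, y, X_sq, X_cross, X_full) are made up, and the plsr() call at the end runs only if the pls package is installed. Whether PLS on such an expanded matrix is statistically sensible is a separate question.

```r
set.seed(7)
X <- matrix(rnorm(50 * 6), nrow = 50)   # stand-in for the NIR matrix
y <- rnorm(50)                          # stand-in for octane

# Squares only: what NIR + I(NIR^2) supplies. 2 * p columns, no cross products.
X_sq <- cbind(X, X^2)
ncol(X_sq)

# Full quadratic: add every pairwise product as well.
pairs <- combn(ncol(X), 2)
X_cross <- X[, pairs[1, ]] * X[, pairs[2, ]]
X_full <- cbind(X, X^2, X_cross)
ncol(X_full)                            # p + p + choose(p, 2)

# Optional fit, only if the pls package is available.
if (requireNamespace("pls", quietly = TRUE)) {
  d <- data.frame(y = y)
  d$X_full <- X_full
  fit <- pls::plsr(y ~ X_full, ncomp = 5, data = d, validation = "LOO")
}
```

With p = 401 the full version would have 401 + 401 + 80200 = 81002 columns, so building it explicitly for the whole spectrum is a lot to ask of 50 samples.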

-- 
David.


> 
> May I have some help please?
> 
> Thanks,
> 
> Kelvin
> 
>   [[alternative HTML version deleted]]
> 

David Winsemius
Alameda, CA, USA



[R] Quadratic function with interaction terms for the PLS fitting model?

2017-07-12 Thread Ng, Kelvin Sai-cheong
Dear all,

I am using the pls package of R to perform partial least squares on a set of
multivariate data.  Instead of fitting a linear model, I want to fit my
data with a quadratic function with interaction terms.  But I am not sure
how.  I will use an example to illustrate my problem:

Following the example in the PLS manual:
## Read data
data(gasoline)
gasTrain <- gasoline[1:50,]
## Perform PLS
gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO")

where octane ~ NIR is the model that this example fits.

NIR is a collection of variables, i.e. the NIR spectra consist of 401 diffuse
reflectance measurements from 900 to 1700 nm.

Instead of fitting with predict.octane[i] = a[0] * NIR[0,i] + a[1] *
NIR[1,i] + ...
I want to fit the data with:
predict.octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... +
b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ...

i.e. quadratic with interaction terms.

But I don't know how to formulate this.

May I have some help please?

Thanks,

Kelvin

[[alternative HTML version deleted]]
