Yeah, or it might be easier to do it separately, like a function seriesData = createSeries(data, rank=2) which returns a DataFrame that contains all of those series terms. Then seriesData would simply be used as the data argument in glm().
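
For illustration, a minimal sketch of such a helper, under some loudly stated assumptions: createSeries is a hypothetical function (it is not part of GLM.jl), "rank" is read here as "all products of up to rank columns, linear terms included", and a NamedTuple of columns stands in for the DataFrame so the sketch runs without any packages. The generated column names (X1_X2 and so on) are also just an illustrative choice.

```julia
# Hypothetical helper sketched from the description above; not part of GLM.jl.
# `data` is a NamedTuple of equal-length numeric vectors, standing in for a DataFrame.
function createSeries(data::NamedTuple; rank::Int=2)
    cols = collect(keys(data))
    terms = Pair{Symbol,Vector{Float64}}[]
    # Extend a running product with columns at index >= start, so that each
    # combination (e.g. X1.*X2) is generated exactly once.
    function extend!(label, vals, start, degree)
        for j in start:length(cols)
            newlabel = isempty(label) ? string(cols[j]) : string(label, "_", cols[j])
            newvals = vals .* data[cols[j]]
            push!(terms, Symbol(newlabel) => newvals)
            degree < rank && extend!(newlabel, newvals, j, degree + 1)
        end
    end
    n = length(first(data))
    extend!("", ones(n), 1, 1)
    return (; terms...)
end

# Usage with two made-up regressors
data = (X1 = [1.0, 2.0, 3.0], X2 = [0.0, 1.0, 4.0])
seriesData = createSeries(data, rank=2)
println(keys(seriesData))   # (:X1, :X1_X1, :X1_X2, :X2, :X2_X2)
```

The intercept is left out on purpose, on the assumption that the formula passed to glm() adds it. The result would still need to be converted to a DataFrame, with the dependent variable added, before being used as the data argument.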
Bradley

On Sunday, August 31, 2014 3:05:12 PM UTC-5, John Myles White wrote:
>
> I see. This is a pretty radical change to how GLMs would be specified. I
> think the only realistic way you could make any progress on such a radical
> proposal is to undertake this change as a project on your own and then give
> people a demo of a system you’ve built that’s noticeably better than what
> they’re used to having in R.
>
> -- John
>
> On Aug 31, 2014, at 1:02 PM, Bradley Setzler <bradley...@gmail.com> wrote:
>
> Sorry, I meant for those to be in the ... term.
>
> Let me write them explicitly for the case of 3 independent variables, X1
> X2 X3, seriesRank=2 would be,
>
> (intercept)
> X1.^2
> X2.^2
> X3.^2
> X1.*X2
> X1.*X3
> X2.*X3
> X1.*X2.*X3
>
> Bradley
>
> On Sunday, August 31, 2014 2:55:22 PM UTC-5, John Myles White wrote:
>>
>> Bradley, you’re forgetting about interaction terms.
>>
>> -- John
>>
>> On Aug 31, 2014, at 12:53 PM, Bradley Setzler <bradley...@gmail.com> wrote:
>>
>> No problem.
>>
>> Honestly, I'm not sure formula is a useful way to think about regression,
>> the formula is uniquely determined from:
>> (depVar, indepVars, data, family, link)
>>
>> so that the + symbols are redundant given family and link,
>> glm(Y ~ X1 + X2 + X3 + X4 + X5 +...., family, link)
>>
>> and it would be nice to have an explicit intercept argument like,
>> glm(Y,X,data,family,link,intercept=true)
>>
>> Adding to the wish list, I would like to see something like a series
>> option for non-parametric regression,
>> glm(Y,X,data,family,link,seriesRank=2)
>> where seriesRank=2 means all of the terms X1.^2, X1.*X2, X1.*X3,...,X5.^2
>> are included as regressors.
>>
>> Bradley
>>
>> On Sunday, August 31, 2014 2:32:30 PM UTC-5, John Myles White wrote:
>>>
>>> Merged. Thanks, Bradley.
>>>
>>> -- John
>>>
>>> On Aug 31, 2014, at 12:29 PM, Bradley Setzler <bradley...@gmail.com> wrote:
>>>
>>> Thank you for suggesting this, John.
>>>
>>> https://github.com/JuliaStats/GLM.jl/pull/90
>>>
>>> Bradley
>>>
>>> On Sunday, August 31, 2014 1:33:04 PM UTC-5, John Myles White wrote:
>>>>
>>>> Bradley, it’s especially easy to edit documentation because you can
>>>> make a Pull Request right from the website.
>>>>
>>>> -- John
>>>>
>>>> On Aug 31, 2014, at 11:30 AM, Bradley Setzler <bradley...@gmail.com> wrote:
>>>>
>>>> Thank you Adam, this works.
>>>>
>>>> Let me suggest that this information be included in the GLM
>>>> documentation:
>>>>
>>>> To fit a GLM model, use the function,
>>>> glm(formula, data, family, link),
>>>> where,
>>>> - formula uses column symbols from the DataFrame data, e.g., if
>>>>   names(data)=[:Y,:X], then a valid formula is Y~X;
>>>> - data is a DataFrame which may contain NA values, the rows with NA
>>>>   values will be ignored (apparently);
>>>> - family may be chosen from Binomial(), Gamma(), Normal(), or
>>>>   Poisson(), and the parentheses are required; and,
>>>> - link may be chosen from the list in the GLM documentation, such as
>>>>   LogitLink(), and again the parentheses are required. For some families, a
>>>>   default link is available so the link argument may be left blank.
>>>>
>>>> Bradley
>>>>
>>>> On Sunday, August 31, 2014 12:56:19 PM UTC-5, Adam Kapor wrote:
>>>>>
>>>>> This works for me:
>>>>>
>>>>> ```
>>>>> julia> fit(GeneralizedLinearModel,Y~X,data,Binomial(),ProbitLink())
>>>>> DataFrameRegressionModel{GeneralizedLinearModel,Float64}:
>>>>> Coefficients:
>>>>>                  Estimate  Std.Error       z value  Pr(>|z|)
>>>>> (Intercept)      0.430727    1.98019      0.217518    0.8278
>>>>> X             2.37745e-17    0.91665   2.59362e-17    1.0000
>>>>>
>>>>> julia> fit(GeneralizedLinearModel,Y~X,data,Binomial(),LogitLink())
>>>>> DataFrameRegressionModel{GeneralizedLinearModel,Float64}:
>>>>> Coefficients:
>>>>>                  Estimate  Std.Error       z value  Pr(>|z|)
>>>>> (Intercept)      0.693147    3.24037       0.21391    0.8306
>>>>> X            -7.44332e-17        1.5  -4.96221e-17    1.0000
>>>>> ```
>>>>>
>>>>> On Sunday, August 31, 2014 1:27:15 PM UTC-4, Bradley Setzler wrote:
>>>>>>
>>>>>> Has anyone successfully performed probit or logit regression in
>>>>>> Julia? The GLM documentation <https://github.com/JuliaStats/GLM.jl>
>>>>>> does not provide a generalizable example of how to use glm(). It gives a
>>>>>> Poisson example without any suggestion of how to switch from Poisson to
>>>>>> some other type.
>>>>>>
>>>>>> *Using the Poisson example from GLM documentation works:*
>>>>>>
>>>>>> julia> X = [1;2;3.]
>>>>>> julia> Y = [1;0;1.]
>>>>>> julia> data = DataFrame(X=X,Y=Y)
>>>>>> julia> fit(GeneralizedLinearModel, Y ~ X,data, Poisson())
>>>>>> DataFrameRegressionModel{GeneralizedLinearModel,Float64}:
>>>>>> Coefficients:
>>>>>>                  Estimate  Std.Error       z value  Pr(>|z|)
>>>>>> (Intercept)     -0.405465    1.87034     -0.216787    0.8284
>>>>>> X            -3.91448e-17     0.8658  -4.52123e-17    1.0000
>>>>>>
>>>>>> *But does not generalize:*
>>>>>>
>>>>>> julia> fit(GeneralizedLinearModel, Y ~ X ,data, Logit())
>>>>>> ERROR: Logit not defined
>>>>>>
>>>>>> julia> fit(GeneralizedLinearModel, Y ~ X, data, link=:ProbitLink)
>>>>>> ERROR: `fit` has no method matching
>>>>>> fit(::Type{GeneralizedLinearModel}, ::Array{Float64,2},
>>>>>> ::Array{Float64,1})
>>>>>>
>>>>>> julia> fit(GeneralizedLinearModel, Y ~ X, data,
>>>>>> family="binomial",link="probit")
>>>>>> ERROR: `fit` has no method matching
>>>>>> fit(::Type{GeneralizedLinearModel}, ::Array{Float64,2},
>>>>>> ::Array{Float64,1})
>>>>>>
>>>>>> ....and a dozen other similar attempts fail.
>>>>>>
>>>>>> Thanks,
>>>>>> Bradley