Yeah, or it might be easier to do it separately, with a function like 
seriesData = createSeries(data, rank=2)
which returns a DataFrame that contains all of those series terms. Then 
seriesData would simply be used as the data argument in glm().
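
A rough sketch of what such a function could look like is below. It is only an illustration under several assumptions: createSeries is hypothetical and not part of GLM; rank is taken to mean the maximum total degree of a term, so rank=2 gives the linear terms, the squares, and the pairwise products; the input is a plain NamedTuple of numeric columns holding only the regressors (not a DataFrame, and not Y); the X1_X2 column naming is made up; and a recent Julia (1.x) is assumed.

```julia
# Hypothetical helper, not part of GLM: builds every monomial of total
# degree 1..rank from the columns of `data` (a NamedTuple of vectors).
function createSeries(data::NamedTuple; rank::Int=2)
    vars = collect(keys(data))
    labels = Symbol[]
    cols = Vector{Float64}[]
    # Extend a partial monomial using only variables at index >= start,
    # so each product (e.g. X1*X2) is generated exactly once.
    function extend(start, degree, label, col)
        for i in start:length(vars)
            newlabel = isempty(label) ? string(vars[i]) : label * "_" * string(vars[i])
            newcol = col .* data[vars[i]]
            push!(labels, Symbol(newlabel))
            push!(cols, newcol)
            degree < rank && extend(i, degree + 1, newlabel, newcol)
        end
    end
    n = length(first(data))          # number of observations
    extend(1, 1, "", ones(n))        # start from the constant column
    return NamedTuple{Tuple(labels)}(Tuple(cols))
end

data = (X1 = [1.0, 2.0, 3.0], X2 = [0.0, 1.0, 2.0])
seriesData = createSeries(data, rank=2)
keys(seriesData)  # (:X1, :X1_X1, :X1_X2, :X2, :X2_X2)
```

With DataFrames loaded, DataFrame(seriesData) should convert the result to a DataFrame, to which the dependent variable can be added before passing it to glm(). The intercept is left to glm() rather than generated here.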

Bradley

On Sunday, August 31, 2014 3:05:12 PM UTC-5, John Myles White wrote:
>
> I see. This is a pretty radical change to how GLMs would be specified. I 
> think the only realistic way you could make any progress on such a radical 
> proposal is to undertake this change as a project on your own and then give 
> people a demo of a system you’ve built that’s noticeably better than what 
> they’re used to having in R.
>
>  — John
>
> On Aug 31, 2014, at 1:02 PM, Bradley Setzler <bradley...@gmail.com> wrote:
>
> Sorry, I meant for those to be in the ... term.
>
> Let me write them out explicitly: for the case of 3 independent variables, 
> X1, X2, X3, the terms for seriesRank=2 would be,
>
> (intercept)
> X1.^2
> X2.^2
> X3.^2
> X1.*X2
> X1.*X3
> X2.*X3
> X1.*X2.*X3
>
> Bradley
>
> On Sunday, August 31, 2014 2:55:22 PM UTC-5, John Myles White wrote:
>>
>> Bradley, you’re forgetting about interaction terms.
>>
>>  — John
>>
>> On Aug 31, 2014, at 12:53 PM, Bradley Setzler <bradley...@gmail.com> 
>> wrote:
>>
>> No problem.
>>
>> Honestly, I'm not sure a formula is a useful way to think about regression: 
>> the formula is uniquely determined by
>> (depVar, indepVars, data, family, link)
>>
>> so that the + symbols are redundant given family and link,
>> glm(Y ~ X1 + X2 + X3 + X4 + X5 +...., family, link)
>>
>> and it would be nice to have an explicit intercept argument like,
>> glm(Y,X,data,family,link,intercept=true)
>>
>> Adding to the wish list, I would like to see something like a series 
>> option for non-parametric regression,
>> glm(Y,X,data,family,link,seriesRank=2)
>> where seriesRank=2 means all of the terms X1.^2, X1.*X2, X1.*X3,...,X5.^2 
>> are included as regressors.
>>
>> Bradley
>>
>>
>>
>>
>> On Sunday, August 31, 2014 2:32:30 PM UTC-5, John Myles White wrote:
>>>
>>> Merged. Thanks, Bradley.
>>>
>>>  — John
>>>
>>> On Aug 31, 2014, at 12:29 PM, Bradley Setzler <bradley...@gmail.com> 
>>> wrote:
>>>
>>> Thank you for suggesting this, John.
>>>
>>> https://github.com/JuliaStats/GLM.jl/pull/90
>>>
>>> Bradley
>>>
>>>
>>> On Sunday, August 31, 2014 1:33:04 PM UTC-5, John Myles White wrote:
>>>>
>>>> Bradley, it’s especially easy to edit documentation because you can 
>>>> make a Pull Request right from the website.
>>>>
>>>>  — John
>>>>
>>>> On Aug 31, 2014, at 11:30 AM, Bradley Setzler <bradley...@gmail.com> 
>>>> wrote:
>>>>
>>>> Thank you Adam, this works.
>>>>
>>>> Let me suggest that this information be included in the GLM 
>>>> documentation:
>>>>
>>>> To fit a generalized linear model, use the function
>>>> glm(formula, data, family, link),
>>>> where
>>>> - formula uses column symbols from the DataFrame data, e.g., if 
>>>> names(data)=[:Y,:X], then a valid formula is Y~X;
>>>> - data is a DataFrame which may contain NA values (rows with NA 
>>>> values are apparently ignored);
>>>> - family may be chosen from Binomial(), Gamma(), Normal(), or 
>>>> Poisson(), and the parentheses are required; and
>>>> - link may be chosen from the list in the GLM documentation, such as 
>>>> LogitLink(), and again the parentheses are required. For some families, a 
>>>> default link is available, so the link argument may be omitted.
>>>>
>>>> Bradley
>>>>
>>>>
>>>> On Sunday, August 31, 2014 12:56:19 PM UTC-5, Adam Kapor wrote:
>>>>>
>>>>> This works for me:
>>>>>
>>>>> ```
>>>>> julia> fit(GeneralizedLinearModel,Y~X,data,Binomial(),ProbitLink())
>>>>> DataFrameRegressionModel{GeneralizedLinearModel,Float64}:
>>>>> Coefficients:
>>>>>                 Estimate Std.Error     z value Pr(>|z|)
>>>>> (Intercept)     0.430727   1.98019    0.217518   0.8278
>>>>> X            2.37745e-17   0.91665 2.59362e-17   1.0000
>>>>>
>>>>> julia> fit(GeneralizedLinearModel,Y~X,data,Binomial(),LogitLink())
>>>>> DataFrameRegressionModel{GeneralizedLinearModel,Float64}:
>>>>> Coefficients:
>>>>>                  Estimate Std.Error      z value Pr(>|z|)
>>>>> (Intercept)      0.693147   3.24037      0.21391   0.8306
>>>>> X            -7.44332e-17       1.5 -4.96221e-17   1.0000
>>>>> ```
>>>>>
>>>>> On Sunday, August 31, 2014 1:27:15 PM UTC-4, Bradley Setzler wrote:
>>>>>>
>>>>>> Has anyone successfully performed probit or logit regression in 
>>>>>> Julia? The GLM documentation <https://github.com/JuliaStats/GLM.jl> 
>>>>>> does not provide a generalizable example of how to use glm(). It gives a 
>>>>>> Poisson example without any suggestion of how to switch from Poisson to 
>>>>>> some other type.
>>>>>>
>>>>>> *Using the Poisson example from GLM documentation works:*
>>>>>>
>>>>>> julia> using DataFrames, GLM
>>>>>> julia> X = [1;2;3.]
>>>>>> julia> Y = [1;0;1.]
>>>>>> julia> data = DataFrame(X=X,Y=Y)
>>>>>> julia> fit(GeneralizedLinearModel, Y ~ X, data, Poisson())
>>>>>> DataFrameRegressionModel{GeneralizedLinearModel,Float64}:
>>>>>> Coefficients:
>>>>>>                 Estimate Std.Error      z value Pr(>|z|)
>>>>>> (Intercept)    -0.405465   1.87034    -0.216787   0.8284
>>>>>> X           -3.91448e-17    0.8658 -4.52123e-17   1.0000
>>>>>>
>>>>>> *But does not generalize:*
>>>>>>
>>>>>> julia> fit(GeneralizedLinearModel, Y ~ X ,data, Logit()) 
>>>>>> ERROR: Logit not defined
>>>>>>
>>>>>> julia> fit(GeneralizedLinearModel, Y ~ X, data, link=:ProbitLink) 
>>>>>> ERROR: `fit` has no method matching 
>>>>>> fit(::Type{GeneralizedLinearModel}, ::Array{Float64,2}, 
>>>>>> ::Array{Float64,1})
>>>>>>
>>>>>> julia> fit(GeneralizedLinearModel, Y ~ X, data, 
>>>>>> family="binomial",link="probit") 
>>>>>> ERROR: `fit` has no method matching 
>>>>>> fit(::Type{GeneralizedLinearModel}, ::Array{Float64,2}, 
>>>>>> ::Array{Float64,1})
>>>>>>
>>>>>> ...and a dozen other similar attempts fail.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Bradley
>>>>>>
>>>>>>
>>>>
>>>
>>
>
