>
>  if you implement some algorithm one should use the notation from the
> referenced paper


This can be easier to implement (essentially just copying from the paper), but
it makes for a mess and a maintenance nightmare.  I don't want to have to
read a paper just to understand what someone's code is doing.  Not to
mention there are lots of "unique" findings and algorithms in papers that
have actually already been discovered and implemented, just with different
terminology in a different field.  My research has taken me down lots of
rabbit holes, and I'm always amazed at how very different
fields and applications all share the same underlying math.  We should do
everything we can to unify the algorithms in the most Julian way.  It's not
always easy, but it should at least be the goal.

This matters most with terminology and Greek letters.  I don't want one
algorithm to represent a learning rate with eta and another to use alpha.
Each may match its source paper, but it makes for mass confusion when
you're not reading the code with the paper open.  (The obvious solution,
of course, is to never use Greek letters.)
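Here's a minimal sketch of the kind of thing I mean.  All of these type and
function names are hypothetical (not from any existing package); the point is
just that one shared accessor can hide whichever Greek letter each paper used:

```julia
# Hypothetical sketch: a single shared name for the learning rate,
# regardless of whether the source paper called it eta or alpha.
abstract type Optimizer end

struct SGD <: Optimizer
    learnrate::Float64   # the paper might call this eta
end

struct AdaGrad <: Optimizer
    learnrate::Float64   # the paper might call this alpha
end

# One accessor works for every optimizer, whatever the paper's notation was.
learnrate(opt::Optimizer) = opt.learnrate

opts = Optimizer[SGD(0.1), AdaGrad(0.01)]
println(map(learnrate, opts))
```

Readers who only know one of the papers still understand both types, because
the field-specific notation never leaks into the API.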

On Wed, Nov 11, 2015 at 10:34 AM, Christof Stocker <
stocker.chris...@gmail.com> wrote:

> I agree. I personally think the ML efforts should follow the StatsBase and
> Optim conventions where it makes sense.
>
> The notational differences are inconvenient, but they are manageable. I
> think readability should be the goal there. For example, if you implement
> some algorithm, you should use the notation from the referenced paper. A
> package tailored toward use in a statistical context, such as GLMs, should
> probably follow the conventions used in statistics (e.g. beta for the
> coefficients). A package for SVMs should follow the conventions for SVMs
> (e.g. w for the coefficients), and so forth. It's nice to streamline things,
> but let's not get carried away with this kind of micromanagement.
>
>
> On 2015-11-11 16:01, Tom Breloff wrote:
>
> One of the tricky things to figure out is how to separate statistics from
> machine learning, as they overlap heavily (completely?) but with different
> terminology and goals.  I think it's really important that JuliaStats and
> JuliaML/JuliaLearn play nicely together, and this probably means that any
> ML interface uses StatsBase verbs whenever possible.  There has been a
> little tension (from my perspective) and a slight turf war when it comes to
> statistical processes and terminology... is it possible to avoid?
>
> On Wed, Nov 11, 2015 at 9:49 AM, Stefan Karpinski <ste...@karpinski.org>
> wrote:
>
>> This is definitely already in progress, but we've a ways to go before
>> it's as easy as scikit-learn. I suspect that having an organization will be
>> more effective at coordinating the various efforts than people might expect.
>>
>> On Wed, Nov 11, 2015 at 9:46 AM, Tom Breloff <t...@breloff.com> wrote:
>>
>>> Randy, see LearnBase.jl, MachineLearning.jl, Learn.jl (just a readme for
>>> now), Orchestra.jl, and many others.  Many people have the same goal, and
>>> wrapping TensorFlow isn't going to change the need for a high level
>>> interface.  I do agree that a good high level interface is higher on the
>>> priority list, though.
>>>
>>> On Wed, Nov 11, 2015 at 9:29 AM, Randy Zwitch <
>>> randy.zwi...@fuqua.duke.edu> wrote:
>>>
>>>> Sure. I'm not against anyone doing anything; it just seems like Julia
>>>> suffers from an "expert/edge case" problem right now. For me, it'd be
>>>> awesome if there were a scikit-learn (Python) or caret (R) style
>>>> mega-interface that ties together the packages that have already been
>>>> written. From my cursory reading, TensorFlow seems more like a low-level
>>>> toolkit for expressing/solving equations, whereas I see Julia lacking an
>>>> easy way to evaluate 3-5 different algorithms on the same dataset
>>>> quickly.
>>>>
>>>> A tweet I just saw sums it up pretty succinctly: "TensorFlow already
>>>> has more stars than scikit-learn, and probably more stars than people
>>>> actually doing deep learning"
>>>>
>>>>
>>>>
>>>> On Tuesday, November 10, 2015 at 11:28:32 PM UTC-5, Alireza Nejati
>>>> wrote:
>>>>>
>>>>> Randy: To answer your question, I'd reckon that the two major gaps in
>>>>> julia that TensorFlow could fill are:
>>>>>
>>>>> 1. Lack of automatic differentiation on arbitrary graph structures.
>>>>> 2. Lack of ability to map computations across cpus and clusters.
>>>>>
>>>>> Funny enough, I was thinking about (1) for the past few weeks and I
>>>>> think I have an idea about how to accomplish it using existing JuliaDiff
>>>>> libraries. About (2), I have no idea, and that's probably going to be the
>>>>> most important aspect of TensorFlow moving forward (and also probably the
>>>>> hardest to implement). So for the time being, I think it's definitely
>>>>> worthwhile to just have an interface to TensorFlow. There are a few ways
>>>>> this could be done. Some ways that I can think of:
>>>>>
>>>>> 1. Just tell people to use PyCall directly. Not an elegant solution.
>>>>> 2. A more julia-integrated interface *a la* SymPy.
>>>>> 3. Using TensorFlow as the 'backend' of a novel julia-based machine
>>>>> learning library. In this scenario, everything would be in julia, and
>>>>> TensorFlow would only be used to map computations to hardware.
>>>>>
>>>>> I think 3 is the most attractive option, but also probably the hardest
>>>>> to do.
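
To make option 3 concrete, here is a toy sketch of what a pure-julia front
end could look like: an expression graph built from julia types, which a
backend (TensorFlow or anything else) would later map to hardware.  All
names here are hypothetical, not from any existing package; the "backend"
below just interprets the graph in plain julia:

```julia
# Hypothetical sketch of a julia-side expression graph (option 3).
abstract type Node end

struct Input <: Node
    name::Symbol
end

struct Add <: Node
    a::Node
    b::Node
end

struct Mul <: Node
    a::Node
    b::Node
end

# Toy backend: interpret the graph directly in julia. A TensorFlow backend
# would instead translate each node into the corresponding TF op.
evaluate(n::Input, env) = env[n.name]
evaluate(n::Add, env) = evaluate(n.a, env) + evaluate(n.b, env)
evaluate(n::Mul, env) = evaluate(n.a, env) * evaluate(n.b, env)

x, y = Input(:x), Input(:y)
graph = Add(Mul(x, x), y)                   # represents x^2 + y
println(evaluate(graph, Dict(:x => 3.0, :y => 1.0)))
```

The user-facing layer stays entirely in julia; swapping the interpreter for
a TensorFlow translation pass is what would turn this into option 3.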
>>>>>
>>>>
>>>
>>
>
>
