Re: [mlpack] Inquiry into SGD

孟憲妤 Sun, 12 Mar 2017 21:34:47 -0700

Hello,

Thanks for your reply.
In page 4 of the paper : Even though the processors
have no knowledge as to whether any of the other processors have modified
x, we define x j to be
the state of the decision variable x after j updates have been performed 1
. Since two processors can
write to x at the same time, we need to be a bit careful with this
definition, but we simply break ties
at random.


I don't fully understand what does it mean by "break ties at random". Does
it indicate that we should store all x_k(j) for evaluating G_j(x_j) ? How
do we know that the state we read is the newest state(no other process is
writing) ?
In https://github.com/mlpack/mlpack/wiki/SummerOfCodeIdeas#para
llel-stochastic-optimization-methods mentioned that we should consider how
to partition data in memory for good cache locality. Does it suggest that
we should consider the size of L1 L2 cache and the block size and design
the size of array not to be prone to cache miss?

Thanks for your patience.

Meng


2017-03-08 15:35 GMT+08:00 Stephen Tu <tu.steph...@gmail.com>:

> Hi,
>
> The practical way to run parallel SGD is to use a constant stepsize per
> epoch, and then set the step size \eta <- \eta * \rho for some \rho \in
> (0,1) (typical value is like 0.9). I would not recommend changing the step
> size at every time. See the discussion in Section 5 (Robust 1/k rates).
>
> I don't understand your second question since there shouldn't be locks?
>
> On Tue, Mar 7, 2017 at 5:57 AM, 孟憲妤 <spur...@gmail.com> wrote:
>
>> Hello,
>>
>> I'm HsienYu Meng from Taiwan. I'm currently in my first year of graduate
>> studies at Tsinghua Univ.  As you mentioned, the SGD project should
>> implement HOGWILD! algorithm and consider to set stepsize at each
>> iteration. However, the paper indicates that they assume the stepsize to be
>> a constant value so that the formula 7 make sense. I wonder if we change
>> the stepsize at each time, will the properties of HOGWILD! still holds?
>> Another question is, if the dataset is large and we partition it in memory,
>> which kind of losk-release should we apply? Sequential consistency, Release
>> consistency or lazy release consistency?
>>
>> Appreciate for your patience .Thanks!
>>
>> --
>> *孟憲妤*
>>
>> *MengHsienyu**Department of Computer Science ,Tsinghua university *
>> *Beijing ,China*
>>
>> *100084*
>>
>> *Mobile:(+86)18201162149 <+86%20182%200116%202149>*
>> *        (+886)0952424693 <+886%20952%20424%20693>*
>>
>>
>> _______________________________________________
>> mlpack mailing list
>> mlpack@lists.mlpack.org
>> http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
>>
>
>


-- 
*孟憲妤*

*MengXianYu**Department of Electronic Engineering ,Tsinghua university *
*Beijing ,China*

*100084*

*Mobile:(+86)18201162149*
*        (+886)0952424693*

_______________________________________________
mlpack mailing list
mlpack@lists.mlpack.org
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack

Re: [mlpack] Inquiry into SGD

Reply via email to