PS: it doesn't make sense to make the weights and the gradient sparse
unless there is a strong L1 penalty.
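
Roughly, the L1 proximal (soft-thresholding) step is what produces exact
zeros in the weights; with little or no L1 regularization the weights stay
dense, so a sparse representation only adds overhead. A minimal sketch of
that step (my own illustration, not the MLlib updater; function and
parameter names are made up):

    // Shrink each coordinate toward zero; anything within the threshold
    // becomes exactly 0.0, which is where weight sparsity comes from.
    def softThreshold(w: Array[Double], regParam: Double, stepSize: Double): Array[Double] = {
      val shrink = regParam * stepSize
      w.map { wi =>
        if (wi > shrink) wi - shrink
        else if (wi < -shrink) wi + shrink
        else 0.0
      }
    }

Without that truncation the weights stay dense, and an aggregated gradient
fills in anyway once examples with different non-zero patterns are summed.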

Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Wed, Apr 23, 2014 at 10:17 PM, DB Tsai <dbt...@dbtsai.com> wrote:
> In MLlib, the weights and the gradient are dense. Only the feature vectors are sparse.
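
For concreteness, the pattern this implies looks roughly like the sketch
below (my own illustration, not the actual MLlib/Breeze code; the names are
made up): the dot product walks only the non-zero entries of the sparse
feature vector against a dense weight array.

    // Sketch: dot product of a sparse feature vector (parallel index/value
    // arrays) with a dense weight array. Only the nnz entries of x are
    // touched, so sparse features pair naturally with dense weights.
    def sparseDot(indices: Array[Int], values: Array[Double], w: Array[Double]): Double = {
      var sum = 0.0
      var k = 0
      while (k < indices.length) {
        sum += values(k) * w(indices(k))
        k += 1
      }
      sum
    }

With 123 features and roughly 14 non-zeros per row (89% zeros), this loop
touches about a tenth of the coordinates a dense dot product would.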
>
> Sincerely,
>
> DB Tsai
> -------------------------------------------------------
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Wed, Apr 23, 2014 at 10:16 PM, David Hall <d...@cs.berkeley.edu> wrote:
>> Was the weight vector sparse? The gradients? Or just the feature vectors?
>>
>>
>> On Wed, Apr 23, 2014 at 10:08 PM, DB Tsai <dbt...@dbtsai.com> wrote:
>>>
>>> The figure showing the Log-Likelihood vs Time can be found here.
>>>
>>>
>>> https://github.com/dbtsai/spark-lbfgs-benchmark/raw/fd703303fb1c16ef5714901739154728550becf4/result/a9a11M.pdf
>>>
>>> Let me know if you cannot open it.
>>>
>>> Sincerely,
>>>
>>> DB Tsai
>>> -------------------------------------------------------
>>> My Blog: https://www.dbtsai.com
>>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>>
>>>
>>> On Wed, Apr 23, 2014 at 9:34 PM, Shivaram Venkataraman <
>>> shiva...@eecs.berkeley.edu> wrote:
>>>
>>> > I don't think the attachment came through on the list. Could you upload
>>> > the results somewhere and link to them?
>>> >
>>> >
>>> > On Wed, Apr 23, 2014 at 9:32 PM, DB Tsai <dbt...@dbtsai.com> wrote:
>>> >
>>> >> 123 features per row, and on average, 89% of them are zeros.
>>> >> On Apr 23, 2014 9:31 PM, "Evan Sparks" <evan.spa...@gmail.com> wrote:
>>> >>
>>> >> > What is the number of non-zeros per row (and the number of features)
>>> >> > in the sparse case? We've hit some issues with Breeze sparse support
>>> >> > in the past, but for sufficiently sparse data it's still pretty good.
>>> >> >
>>> >> > > On Apr 23, 2014, at 9:21 PM, DB Tsai <dbt...@stanford.edu> wrote:
>>> >> > >
>>> >> > > Hi all,
>>> >> > >
>>> >> > > I'm benchmarking Logistic Regression in MLlib using the newly added
>>> >> > > optimizers, L-BFGS and GD. I'm using the same dataset and the same
>>> >> > > methodology as in this paper: http://www.csie.ntu.edu.tw/~cjlin/papers/l1.pdf
>>> >> > >
>>> >> > > I want to know how Spark scales as workers are added, and how the
>>> >> > > optimizers and the input format (sparse or dense) impact performance.
>>> >> > >
>>> >> > > The benchmark code can be found here,
>>> >> > https://github.com/dbtsai/spark-lbfgs-benchmark
>>> >> > >
>>> >> > > The first dataset I benchmarked is a9a, which is only 2.2MB. I
>>> >> > > duplicated the dataset to 762MB so that it has 11M rows. This dataset
>>> >> > > has 123 features, and 11% of the entries are non-zero.
>>> >> > >
>>> >> > > In this benchmark, the entire dataset is cached in memory.
>>> >> > >
>>> >> > > As expected, L-BFGS converges faster than GD, and past a certain
>>> >> > > point, no matter how hard we push GD, it converges more and more slowly.
>>> >> > >
>>> >> > > However, it's surprising that the sparse format runs slower than the
>>> >> > > dense format. The sparse format does take a significantly smaller
>>> >> > > amount of memory when caching the RDD, but it is 40% slower than
>>> >> > > dense. I expected sparse to be faster: when we compute x . w^T, since
>>> >> > > x is sparse, the dot product should touch only the non-zero entries.
>>> >> > > I wonder if there is anything I'm doing wrong.
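
(For reference, the per-example update being described is roughly the
following; this is a hand-written sketch of logistic gradient accumulation
with labels in {0, 1}, not the actual MLlib code, and all names are made up.)

    // Sketch: per-example logistic-loss gradient accumulation. The margin
    // x . w uses only the non-zero entries of x, and the scaled x is added
    // into a dense cumulative gradient, which is why the gradient stays
    // dense even when every feature vector is sparse.
    def accumulateGradient(indices: Array[Int], values: Array[Double],
                           w: Array[Double], label: Double,
                           cumGradient: Array[Double]): Unit = {
      var margin = 0.0
      var k = 0
      while (k < indices.length) { margin += values(k) * w(indices(k)); k += 1 }
      val multiplier = 1.0 / (1.0 + math.exp(-margin)) - label
      k = 0
      while (k < indices.length) {
        cumGradient(indices(k)) += multiplier * values(k)
        k += 1
      }
    }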
>>> >> > >
>>> >> > > The attachment is the benchmark result.
>>> >> > >
>>> >> > > Thanks.
>>> >> > >
>>> >> > > Sincerely,
>>> >> > >
>>> >> > > DB Tsai
>>> >> > > -------------------------------------------------------
>>> >> > > My Blog: https://www.dbtsai.com
>>> >> > > LinkedIn: https://www.linkedin.com/in/dbtsai
>>> >> >
>>> >>
>>> >
>>> >
>>
>>
