On Wed, Nov 27, 2013 at 10:17 AM, Dmitriy Lyubimov <[email protected]> wrote:

>
>
>
> On Wed, Nov 27, 2013 at 9:09 AM, Oleksandr Olgashko <
> [email protected]> wrote:
>
>> Could you please formalize the requirements for ICA? I mean, what actually
>> should be done.
>> "Parallelization strategy" is a rather general concept.
>>
>
> No, not really. It is not so general that you couldn't do it on
> your own.
>
> You can think of it as a fairly free-style TDD for how to do things on MR
> or Pregel, written so that the majority of reviewers here can understand it.
>

I guess I need to be a bit more specific: Hadoop MR or the Spark/Bagel APIs.
We don't really pull in any other frameworks at the moment.


> Not an ideal example, but I hope it helps: look at the attachment for
> https://issues.apache.org/jira/browse/MAHOUT-1365
>
> -d
>
>
>>
>> 2013/11/26 Dmitriy Lyubimov <[email protected]>
>>
>> > On Tue, Nov 26, 2013 at 1:11 PM, Олександр Ольгашко <
>> > [email protected]> wrote:
>> >
>> > > I may need an unknown amount of time to get familiar with the Mahout
>> > > project structure.
>> > > I'd like to do some research on ICA's parallelization strategy; it is
>> > > quite interesting.
>> > > I'm not sure if I can help with MAHOUT-1346, as I have never worked with
>> > > such things before.
>> > >
>> > > Should I use the mailing lists or something else for questions and other
>> > > communication?
>> > >
>> > Yes. There's probably no better place as far as Mahout is concerned.
>> >
>> > >
>> > >
>> > > 2013/11/26 Dmitriy Lyubimov <[email protected]>
>> > >
>> > > > Dimension reduction is addressed with PCA, which is an option of the
>> > > > SSVD method.
>> > > > However, if you can research/offer a parallelization strategy for ICA,
>> > > > I'd be all ears.
>> > > >
>> > > > There's also an ongoing push to create a DSL environment for Mahout
>> > > > distributed matrices on Spark, which I personally think is one of the
>> > > > most promising recent developments. It is also a treasure chest (or a
>> > > > can of worms, depending on how you view it) for new people to chime in.
>> > > > The DSL environment issue is MAHOUT-1346, with pretty much everything
>> > > > else dependent on it.
>> > > >
>> > > > -d
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Tue, Nov 26, 2013 at 9:19 AM, Олександр Ольгашко <
>> > > > [email protected]> wrote:
>> > > >
>> > > > > Hello,
>> > > > >
>> > > > > I am a student interested in data analysis, and I have chosen this
>> > > > > topic for my diploma work. As mentioned here
>> > > > > https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms,
>> > > > > there are some open algorithms, for example, in the Dimension
>> > > > > reduction section.
>> > > > >
>> > > > > So, how can I start developing them? I have some theoretical
>> > > > > background, but I think there may be some unknown problems. Maybe
>> > > > > somebody is working on these algorithms. Can you give me some tips
>> > > > > for getting started?
>> > > > >
>> > > > > I searched JIRA for Independent Component Analysis but found nothing.
>> > > > >
>> > > > > Thanks in advance.
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
