On Wed, Nov 27, 2013 at 9:09 AM, Oleksandr Olgashko <[email protected]> wrote:

> Could you please formalize reqs for ICA? I mean, what actually should be
> done.
> Parallelization strategy is a bit general concept.
>

No, it is not really. Not so general that you couldn't do it on your own.
You can think of it as a fairly free-style TDD for how to do things on MR or
Pregel, so that the majority of reviewers here could understand it. Not an
ideal example, but I hope it helps -- look at the attachment for
https://issues.apache.org/jira/browse/MAHOUT-1365

-d

>
> 2013/11/26 Dmitriy Lyubimov <[email protected]>
>
> > On Tue, Nov 26, 2013 at 1:11 PM, Олександр Ольгашко
> > <[email protected]> wrote:
> >
> > > I may need an unknown period of time to get familiar with the Mahout
> > > project structure.
> > > I'd like to do some research on ICA's parallelization strategy, it is
> > > quite interesting.
> > > Not sure if i can help somehow with MAHOUT-1346, never worked with such
> > > things before.
> > >
> > > Should i use mail lists or smth else for arising questions and other
> > > communication?
> > >
> > yes. there's probably no better place as far as Mahout is concerned.
> >
> > >
> > > 2013/11/26 Dmitriy Lyubimov <[email protected]>
> > >
> > > > Dimension reduction is addressed with PCA, which is an option of the
> > > > SSVD method.
> > > > However, if you can research/offer a parallelization strategy for ICA,
> > > > i'd be all ears.
> > > >
> > > > there's also an ongoing push to create a DSL environment for mahout
> > > > distributed matrices to Spark, which i personally think is one of the
> > > > most promising recent developments. It is also a treasure chest (or a
> > > > can of worms, depending on how you view it) for new people to chime in.
> > > > The DSL environment issue is MAHOUT-1346, with everything else pretty
> > > > much dependent on it.
> > > >
> > > > -d
> > > >
> > > > On Tue, Nov 26, 2013 at 9:19 AM, Олександр Ольгашко
> > > > <[email protected]> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I am a student interested in data analysis, and i have chosen this
> > > > > theme for my diploma work. As mentioned here
> > > > > https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms, there
> > > > > are some open algorithms, for example, in the Dimension reduction
> > > > > section.
> > > > >
> > > > > So, how can i start developing them? I have some theoretical
> > > > > background, but i think there may be some unknown problems. Mb
> > > > > somebody is working on these algorithms. Can you give some tips for
> > > > > a start?
> > > > >
> > > > > I searched in JIRA for Independent Component Analysis, found nothing.
> > > > >
> > > > > Thanks in advance.
> > > > >
