On Wed, Apr 6, 2011 at 12:38 PM, Ted Dunning <[email protected]> wrote:

>
>
>
>> 3. Is there a place where I can find guidelines for code formatting specific
>> to Mahout? Things like indentation, class names, comments etc.
>>
>
> Lucene standard which is Sun standard with 2 space indentation.
>
> This page might help especially near the bottom:
> https://cwiki.apache.org/MAHOUT/how-to-contribute.html
>
> Note to others, this wiki page has lots more information than the
> how-to-contribute page that is linked from our main site.
>
>
Thanks Ted, the Eclipse code style sheet proved very useful. Also, your
feedback regarding the order of programming the driver, mapper, reducer,
combiner gave me new insight. My goal now is to create an entire mock chain
of driver-mapper-combiner-reducer for the new HMM functionality and write
simple unit tests for them before the project formally begins. This should
give me a good skeleton to work against during the summer.
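
For illustration, a minimal sketch of what such a mock chain could look like
in plain Java, so that it runs and can be unit tested without Hadoop. This is
an assumption-laden placeholder: the class name MockHmmChain, its methods, and
the simple transition counting are hypothetical stand-ins, not Baum-Welch
itself and not existing Mahout or Hadoop classes.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a driver-mapper-combiner-reducer chain; no Hadoop classes are used. */
public class MockHmmChain {

  /** Mapper stand-in: emits one ("i->j", 1.0) pair per adjacent symbol pair in a sequence. */
  static List<Map.Entry<String, Double>> map(int[] sequence) {
    List<Map.Entry<String, Double>> out = new ArrayList<>();
    for (int t = 0; t + 1 < sequence.length; t++) {
      String key = sequence[t] + "->" + sequence[t + 1];
      out.add(new AbstractMap.SimpleEntry<>(key, 1.0));
    }
    return out;
  }

  /** Combiner and reducer stand-in: sums the values emitted for each key. */
  static Map<String, Double> sum(List<Map.Entry<String, Double>> pairs) {
    Map<String, Double> totals = new TreeMap<>();
    for (Map.Entry<String, Double> pair : pairs) {
      totals.merge(pair.getKey(), pair.getValue(), Double::sum);
    }
    return totals;
  }

  /** Driver stand-in: map each sequence, combine per sequence, then reduce over everything. */
  static Map<String, Double> run(List<int[]> sequences) {
    List<Map.Entry<String, Double>> combined = new ArrayList<>();
    for (int[] sequence : sequences) {
      combined.addAll(sum(map(sequence)).entrySet());
    }
    return sum(combined);
  }

  public static void main(String[] args) {
    List<int[]> sequences = Arrays.asList(new int[] {0, 1, 1}, new int[] {0, 1});
    System.out.println(run(sequences));
  }
}
```

Because the combiner and reducer stand-ins share one summing method, a unit
test can check that running with or without the combine step gives the same
totals, which is the property a real combiner has to preserve.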

Here is my next set of questions:

1. Automatic and incremental build: Being new to Maven, I'm a little
confused. While updating the code, the Maven console in Eclipse reports
either the auto build or the incremental build:

4/10/11 12:11:02 PM EDT: Maven Builder: AUTO_BUILD
4/10/11 12:11:02 PM EDT: Maven Builder: AUTO_BUILD
4/10/11 12:12:26 PM EDT: Maven Builder: AUTO_BUILD
4/10/11 12:13:02 PM EDT: Maven Builder: INCREMENTAL_BUILD

These messages are triggered every time I make changes to the code. I looked
into the pom.xml and it lists 1.6 as the version of Sun's javac to be used
for compilation. As far as I know, javac is not an incremental compiler,
whereas Eclipse's compiler is. How, then, can the Maven builder report an
incremental build? Also, what is the difference between AUTO_BUILD and
INCREMENTAL_BUILD?

2. Package location for MapReduce HMM training: I noticed that the MapReduce
implementations of the different classifiers are located under separate
mapreduce packages (o.a.m.classifier.bayes.mapreduce.bayes,
o.a.m.classifier.bayes.mapreduce.cbayes), whereas the MapReduce classes for
k-means clustering sit directly under o.a.m.clustering, with no separate
mapreduce package. What is the convention here? Keeping in mind future
extensibility and the overall architecture of Mahout's code, where should I
place the new HMM Baum-Welch MapReduce code:
o.a.m.classifier.sequencelearning.hmm or
o.a.m.classifier.sequencelearning.hmm.mapreduce or
o.a.m.classifier.sequencelearning.hmm.mapreduce.baumwelch or somewhere
else?


Dhruv
