Hi everyone,

Jorn and I have had a little discussion about a topic I brought up with him
that I'd like to get everyone's thoughts on. I'm including our conversation
below, but the gist of it is this:

 - I've been switching to development in Scala. At this point, I personally
see little point in coding in Java given that Scala is available (and very
very nice) and it plays very well with existing Java -- I'm very happy with
this for several projects I'm working on, including
TextGrounder <http://code.google.com/p/textgrounder/> and
Junto <http://code.google.com/p/junto/>. So, I'd like to see Scala making
its way into OpenNLP.
 - We need to reorganize the maxent code into the new package opennlp.ml
 - I'd like to create the new package, retaining the Java code as is, make a
first release, and then allow Scala code to mix in with the Java from that
point on
 - A number of issues come up with this, including using another build tool
like SBT instead of Maven and ensuring we are Apache compliant and so on.

So, this is really just a feeler to see what you all think and see if you
have any enthusiasm, reservations or suggestions. Thanks!

Jason


Forwarded conversation
Subject: opennlp.ml + Scala?
------------------------

From: *Jason Baldridge* <[email protected]>
Date: Mon, Mar 21, 2011 at 1:28 PM
To: Jörn Kottmann <[email protected]>


Hi Jorn,

I've changed over to doing nearly all my coding in Scala, generally
transitioning Java codebases to Scala by writing everything new in Scala and
using the existing Java classes as they are. I would like to do this as part
of the new opennlp.ml, as I'm not inclined to write any new Java code unless
absolutely necessary, and I would very much like to create that new and
improved package. What do you think of this?

Jason

-- 
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com

----------
From: *Jörn Kottmann* <[email protected]>
Date: Mon, Mar 21, 2011 at 2:24 PM
To: Jason Baldridge <[email protected]>


Hmm, yeah, if we were rewriting it I think it is something we could
consider, but in our case we just need to do some reshaping of the existing
code and a little refactoring here and there. That is one reason I believe
we should be conservative and not use it in this case.

Another issue I see is that it would send a message to the Mahout people
that we do not want to collaborate, when in fact I believe collaborating is
something we should do to get MapReduce training support one day.
People on the team might also not be familiar with Scala, which could
further limit the manpower available for the refactoring. Just my 2 cents.

I believe we should also do the maxent refactoring slowly and first do
everything inside the current structures, and then when everything is in
place make the last changes, which break backward compatibility.

Anyway, we should start a discussion about the future of OpenNLP: which
features do we want to implement for the next few versions? Which new
components would be nice to have?
I believe there are quite a few people who are willing to pick up tasks but
are simply not aware of the possibility.

Jörn

----------
From: *Jason Baldridge* <[email protected]>
Date: Mon, Mar 21, 2011 at 3:29 PM
To: Jörn Kottmann <[email protected]>





Hmm... what if we did the first refactoring into opennlp.ml with pure Java
but the new package structure, then make a first release and then start
bringing in Scala?


Good points. However, I'm finding that Scala plays *very* nicely with Java
(including allowing Java to use Scala classes), so that could be mostly
transparent to users of the package, maintaining the API pretty much as it
is. So, I *think* we could continue to play nicely with Mahout folks.

Also, after coding for a while in Scala, I can't help but feel that Java the
language is dead, while the JVM lives gloriously on. :) I think there is a
lot of momentum to Scala in general, and my feeling is that it is very
friendly for Java programmers. (Though I had experience in functional
programming before, so a lot of concepts came easily to me that could be
more unusual for others.)


What do you mean by "current structures"? Do you mean to keep the classes as
they are now, but just switch the package organization first?


Yes, perhaps we should do that once the release is all done? (Thanks for all
your hard work on that, btw!)

Also, perhaps we should bring up the Scala question on the mailing list? I
wanted to ask you first to see if you had strong objections, but since
you don't it might be good to sound out the community.

Jason


----------
From: *Jörn Kottmann* <[email protected]>
Date: Mon, Mar 21, 2011 at 3:38 PM
To: Jason Baldridge <[email protected]>


I actually think just doing it for maxent/ml doesn't really make sense; if
we want to switch the programming language, it should be for the entire code
base. Then we are talking about migrating something like 400 classes from
Java to Scala; does that really make sense? Just doing a little Scala
doesn't sound reasonable to me.

Sure, move it to the mailing list.

Jörn

----------
From: *Jason Baldridge* <[email protected]>
Date: Mon, Mar 21, 2011 at 5:44 PM
To: Jörn Kottmann <[email protected]>


But, the great thing about Scala is that you can mix Scala and Java and not
have to do one or the other -- so I don't think we'd need to do a full
migration.  Anyway, I'll bring it up on the list!

----------
From: *Jörn Kottmann* <[email protected]>
Date: Mon, Mar 21, 2011 at 5:54 PM
To: Jason Baldridge <[email protected]>


Yeah, but then most of the code will still remain pure Java mixed with a
little Scala, and you have to deal with the extra complexity of having a
little Scala, e.g. more complex build tooling, the need for extra IDE
support, more complicated compatibility issues, etc.

Jörn

----------
From: *Jason Baldridge* <[email protected]>
Date: Mon, Mar 21, 2011 at 7:39 PM
To: Jörn Kottmann <[email protected]>


The build is *really* easy with SBT (which can incorporate Maven and Ivy
dependency declarations). The idea would be to transition to Scala so that
it would eventually be mostly Scala, if not all Scala. A standard jar is
still distributed.

----------
From: *Jörn Kottmann* <[email protected]>
Date: Tue, Mar 22, 2011 at 4:33 AM
To: Jason Baldridge <[email protected]>


We are using Maven right now, and it does a lot more than just putting
together a jar file, e.g.:
- Making a release, with code signing, tagging in our SCM, producing RAT
reports, etc.
- Deploying artifacts to the Apache repository
- Building our documentation
- Testing
- Optionally it can run code quality tools like FindBugs or test coverage
tools

Jörn

----------
From: *Jason Baldridge* <[email protected]>
Date: Tue, Mar 22, 2011 at 9:11 AM
To: Jörn Kottmann <[email protected]>





These might need some looking into, but are probably doable.


These are builtin targets for SBT.

-j

----------
From: *Jörn Kottmann* <[email protected]>
Date: Tue, Mar 22, 2011 at 9:20 AM
To: Jason Baldridge <[email protected]>


Our entire build system was just rewritten to meet Apache rules and
standards; if we do that again now it will set the project back by a month
or so.

Jörn

----------
From: *Jason Baldridge* <[email protected]>
Date: Tue, Mar 22, 2011 at 9:33 AM
To: Jörn Kottmann <[email protected]>


Fair enough. I will still bring it up as it now actually pains me to code in
Java. ;)

Oh, here is how to deploy artifacts:

http://henkelmann.eu/2010/11/14/sbt_hudson_with_test_integration

I think the others would be straightforward. Possibly one of the bigger
sticking points would be IDE integration -- I use Emacs and it all works
very well for me, but I don't know how it is for Eclipse and NetBeans folks.

----------
From: *Jörn Kottmann* <[email protected]>
Date: Tue, Mar 22, 2011 at 9:40 AM
To: Jason Baldridge <[email protected]>


I didn't say it's not possible to rewrite our build with SBT, but I strongly
believe that it is an effort which will take quite some time, e.g. a month,
just to get a build which is as good as the Maven build we just finished.
Everyone would also have to install the Scala plugins into their IDEs to get
proper support, which is of course also possible.

Yeah, bring it up on the mailing list.

Jörn

----------
From: *Jason Baldridge* <[email protected]>
Date: Tue, Mar 22, 2011 at 9:46 AM
To: Jörn Kottmann <[email protected]>


Sounds good. And I find that it is often straightforward to take Maven
specifications and either use them directly from SBT or translate them into
the SBT definitions.  Perhaps we could start this with opennlp.ml and then
see how it goes before doing it in the main OpenNLP code.



-- 
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
