Thanks everyone for your thoughts. I think the first step is to refactor the
package while sticking with Java, and then we'll see about moving to a
Scala/Java mix after that (but only for the opennlp machine learning package,
currently opennlp-maxent).

I was actually sort of appalled looking through the code yesterday and
seeing so many global variables used all over the place, making it hard to
know exactly what every method had access to. I think this was sort of an
artifact of how I used Trove functions a loooong time ago to enable quick
iteration over the data structures (which required some objects to be
global). That is obviously gone now, but the global variables didn't go
away... hope I'll find time to improve things over the next 5-6 months.

Jason

On Mon, Apr 11, 2011 at 7:27 AM, Tommaso Teofili
<[email protected]> wrote:

> Hi Jason,
> I personally have some Scala experience from working with Clerezza [1],
> which uses both Java and Scala, but what I think is that, while Scala is
> perfectly OK with existing Java standards and allows functional/dynamic
> programming, it raises the barrier for new users/devs a little bit.
> So I am not so sure that a Scala implementation should totally replace an
> existing one; maybe a graceful introduction would be more welcome.
> My 2 cents,
> Tommaso
>
>
> [1] : http://incubator.apache.org/clerezza
>
> 2011/4/10 Jason Baldridge <[email protected]>
>
>> It's been a while since I posted this request for input... Does anyone
>> have any thoughts on it? Is anyone else interested in Scala being part of
>> OpenNLP?
>>
>> Jason
>>
>> On Tue, Mar 22, 2011 at 10:16 AM, Jason Baldridge
>> <[email protected]> wrote:
>>
>> > Hi everyone,
>> >
>> > Jorn and I have had a little discussion about a topic I brought up with
>> > him that I'd like to get everyone's thoughts on. I'm including our
>> > conversation below, but the gist of it is this:
>> >
>> >  - I've been switching to development in Scala. At this point, I personally
>> > see little point in coding in Java given that Scala is available (and very
>> > very nice) and it plays very well with existing Java -- I'm very happy with
>> > this for several projects I'm working on, including TextGrounder
>> > <http://code.google.com/p/textgrounder/> and Junto
>> > <http://code.google.com/p/junto/>. So, I'd like to see Scala making
>> > its way into OpenNLP.
>> >  - We need to reorganize the maxent code into the new package opennlp.ml
>> >  - I'd like to create the new package, retaining the Java code as is, make
>> > a first release, and then allow Scala code to mix in with the Java from
>> > that point on
>> >  - A number of issues come up with this, including using another build
>> > tool like SBT instead of Maven and ensuring we are Apache compliant and
>> > so on.
>> >
>> > So, this is really just a feeler to see what you all think and see if you
>> > have any enthusiasm, reservations or suggestions. Thanks!
>> >
>> > Jason
>> >
>> >
>> > Forwarded conversation
>> > Subject: opennlp.ml + Scala?
>> > ------------------------
>> >
>> > From: *Jason Baldridge* <[email protected]>
>> > Date: Mon, Mar 21, 2011 at 1:28 PM
>> > To: Jörn Kottmann <[email protected]>
>> >
>> >
>> > Hi Jorn,
>> >
>> > I've changed over to doing nearly all my coding in Scala, generally
>> > transitioning Java codebases to Scala by writing everything new in Scala
>> > and using the existing Java classes as they are. I would like to do this
>> > as part of the new opennlp.ml, as I'm not inclined to write any new Java
>> > code unless absolutely necessary, and I would very much like to create
>> > that new and improved package. What do you think of this?
>> >
>> > Jason
>> >
>> > --
>> > Jason Baldridge
>> > Assistant Professor, Department of Linguistics
>> > The University of Texas at Austin
>> > http://www.jasonbaldridge.com
>> >
>> > ----------
>> > From: *Jörn Kottmann* <[email protected]>
>> > Date: Mon, Mar 21, 2011 at 2:24 PM
>> > To: Jason Baldridge <[email protected]>
>> >
>> >
>> > Hmm, yeah, if we were rewriting it I think it is something we could
>> > consider, but in our case we just need to do some reshaping of the
>> > existing code and a little refactoring here and there. That is one reason
>> > I believe we should be conservative and not use it in this case.
>> >
>> > Another issue I see is that it would send a message to the Mahout people
>> > that we do not want to collaborate, when in fact I believe collaborating
>> > is something we should do to get MapReduce training support one day.
>> > The people on the team might not be familiar with Scala, which could
>> > further limit the manpower available for the refactoring. Just my 2 cents.
>> >
>> > I believe we should also do the maxent refactoring slowly and first do
>> > everything inside the current structures, and then, when everything is in
>> > place, make the last changes that break backward compatibility.
>> >
>> > Anyway, we should start a discussion about the future of OpenNLP: which
>> > features do we want to implement for the next few versions? Which new
>> > components would be nice to have?
>> > I believe there are quite a few people who are willing to pick up tasks
>> > but are simply not aware of the possibility.
>> >
>> > Jörn
>> >
>> > ----------
>> > From: *Jason Baldridge* <[email protected]>
>> > Date: Mon, Mar 21, 2011 at 3:29 PM
>> > To: Jörn Kottmann <[email protected]>
>> >
>> >
>> >
>> >
>> >
>> > Hmm... what if we did the first refactoring into opennlp.ml with pure Java
>> > but the new package structure, then make a first release and then start
>> > bringing in Scala?
>> >
>> >
>> > Good points. However, I'm finding that Scala plays *very* nicely with Java
>> > (including allowing Java to use Scala classes), so that could be mostly
>> > transparent to users of the package, maintaining the API pretty much as
>> > it is. So, I *think* we could continue to play nicely with Mahout folks.
>> >
>> > Also, after coding for a while in Scala, I can't help but feel that Java
>> > the language is dead, while the JVM lives gloriously on. :) I think there
>> > is a lot of momentum to Scala in general, and my feeling is that it is
>> > very friendly for Java programmers. (Though I had experience in functional
>> > programming before, so a lot of concepts came easily to me that could be
>> > more unusual for others.)
>> >
>> >
>> > What do you mean by "current structures"? Do you mean to keep the classes
>> > as they are now, but just switch the package organization first?
>> >
>> >
>> > Yes, perhaps we should do that once the release is all done? (Thanks for
>> > all your hard work on that, btw!)
>> >
>> > Also, perhaps we should bring up the Scala question on the mailing list?
>> > I wanted to ask you first to see if you had strong objections, but since
>> > you don't it might be good to sound out the community.
>> >
>> > Jason
>> >
>> >
>> > ----------
>> > From: *Jörn Kottmann* <[email protected]>
>> > Date: Mon, Mar 21, 2011 at 3:38 PM
>> > To: Jason Baldridge <[email protected]>
>> >
>> >
>> > I actually think just doing it for maxent/ml doesn't really make sense; if
>> > we want to switch the programming language, it's for the entire code base.
>> > Then we are talking about the migration of something like 400 classes from
>> > Java to Scala; does that really make sense? Just doing a little Scala
>> > doesn't sound reasonable to me.
>> >
>> > Sure move it to the mailing list.
>> >
>> > Jörn
>> >
>> > ----------
>> > From: *Jason Baldridge* <[email protected]>
>> > Date: Mon, Mar 21, 2011 at 5:44 PM
>> > To: Jörn Kottmann <[email protected]>
>> >
>> >
>> > But, the great thing about Scala is that you can mix Scala and Java and
>> > not have to do one or the other -- so I don't think we'd need to do a full
>> > migration.  Anyway, I'll bring it up on the list!
>> >
>> > ----------
>> > From: *Jörn Kottmann* <[email protected]>
>> > Date: Mon, Mar 21, 2011 at 5:54 PM
>> > To: Jason Baldridge <[email protected]>
>> >
>> >
>> > Yeah, but then most of the code will still be pure Java mixed with a
>> > little Scala, and you have to deal with the extra complexity of having a
>> > little Scala, e.g. more complex build tooling, extra IDE support, more
>> > complicated compatibility issues, etc.
>> >
>> > Jörn
>> >
>> > ----------
>> > From: *Jason Baldridge* <[email protected]>
>> > Date: Mon, Mar 21, 2011 at 7:39 PM
>> > To: Jörn Kottmann <[email protected]>
>> >
>> >
>> > The build is *really* easy with SBT (which can incorporate Maven and Ivy
>> > dependency declarations). The idea would be to transition to Scala so
>> > that it would eventually be mostly Scala, if not all Scala. A standard jar
>> > is still distributed.
>> >
>> > ----------
>> > From: *Jörn Kottmann* <[email protected]>
>> > Date: Tue, Mar 22, 2011 at 4:33 AM
>> > To: Jason Baldridge <[email protected]>
>> >
>> >
>> > We are using Maven right now, and it does a lot more than just putting
>> > together a jar file, e.g.:
>> > - Making a release, with code signing, tagging in our SCM, producing RAT
>> > reports, etc.
>> > - Deploying artifacts to the Apache repository
>> > - Building our documentation
>> > - Testing
>> > - Optionally it can run code quality tools like FindBugs or test
>> > coverage tools
>> >
>> > Jörn
>> >
>> > ----------
>> > From: *Jason Baldridge* <[email protected]>
>> > Date: Tue, Mar 22, 2011 at 9:11 AM
>> > To: Jörn Kottmann <[email protected]>
>> >
>> >
>> >
>> >
>> >
>> > These might need some looking into, but are probably doable.
>> >
>> >
>> > These are builtin targets for SBT.
>> >
>> > -j
>> >
>> > ----------
>> > From: *Jörn Kottmann* <[email protected]>
>> > Date: Tue, Mar 22, 2011 at 9:20 AM
>> > To: Jason Baldridge <[email protected]>
>> >
>> >
>> > Our entire build system was just rewritten to meet Apache rules and
>> > standards; if we do that again now it will set the project back by a
>> > month or so.
>> >
>> > Jörn
>> >
>> > ----------
>> > From: *Jason Baldridge* <[email protected]>
>> > Date: Tue, Mar 22, 2011 at 9:33 AM
>> > To: Jörn Kottmann <[email protected]>
>> >
>> >
>> > Fair enough. I will still bring it up as it now actually pains me to code
>> > in Java. ;)
>> >
>> > Oh, here is how to deploy artifacts:
>> >
>> > http://henkelmann.eu/2010/11/14/sbt_hudson_with_test_integration
>> >
>> > I think the others would be straightforward. Possibly one of the bigger
>> > sticking points would be IDE integration -- I use Emacs and it all works
>> > very well for me, but I don't know how it is for Eclipse and NetBeans
>> > folks.
>> >
>> > ----------
>> > From: *Jörn Kottmann* <[email protected]>
>> > Date: Tue, Mar 22, 2011 at 9:40 AM
>> > To: Jason Baldridge <[email protected]>
>> >
>> >
>> > I didn't say it's not possible to rewrite our build with SBT, but I
>> > strongly believe it is an effort that will take quite some time, e.g. a
>> > month, just to get a build as good as the Maven build we just finished.
>> > Everyone would also have to install the Scala plugins into their IDEs to
>> > get proper support, which is of course also possible.
>> >
>> > Yeah bring it up on the mailing list.
>> >
>> > Jörn
>> >
>> > ----------
>> > From: *Jason Baldridge* <[email protected]>
>> > Date: Tue, Mar 22, 2011 at 9:46 AM
>> > To: Jörn Kottmann <[email protected]>
>> >
>> >
>> > Sounds good. And I find that it is often straightforward to take Maven
>> > specifications and either use them directly from SBT or translate them
>> > into the SBT definitions.  Perhaps we could start this with opennlp.ml and
>> > then see how it goes before doing it in the main OpenNLP code.
>> >
>> >
>> >
>> > --
>> > Jason Baldridge
>> > Assistant Professor, Department of Linguistics
>> > The University of Texas at Austin
>> > http://www.jasonbaldridge.com
>> >
>>
>>
>>
>> --
>> Jason Baldridge
>> Assistant Professor, Department of Linguistics
>> The University of Texas at Austin
>> http://www.jasonbaldridge.com
>>
>
>


-- 
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
