It's been a while since I posted this request for input... Does anyone have any thoughts on it? Is anyone else interested in Scala being part of OpenNLP?
Jason

On Tue, Mar 22, 2011 at 10:16 AM, Jason Baldridge <[email protected]> wrote:

> Hi everyone,
>
> Jorn and I have had a little discussion about a topic I brought up with him
> that I'd like to get everyone's thoughts on. I'm including our conversation
> below, but the gist of it is this:
>
> - I've been switching to development in Scala. At this point, I personally
>   see little point in coding in Java given that Scala is available (and very
>   very nice) and it plays very well with existing Java -- I'm very happy with
>   this for several projects I'm working on, including
>   TextGrounder <http://code.google.com/p/textgrounder/> and
>   Junto <http://code.google.com/p/junto/>. So, I'd like to see Scala making
>   its way into OpenNLP.
> - We need to reorganize the maxent code into the new package opennlp.ml
> - I'd like to create the new package, retaining the Java code as is, make
>   a first release, and then allow Scala code to mix in with the Java from that
>   point on
> - A number of issues come up with this, including using another build tool
>   like SBT instead of Maven and ensuring we are Apache compliant and so on.
>
> So, this is really just a feeler to see what you all think and see if you
> have any enthusiasm, reservations or suggestions. Thanks!
>
> Jason
>
>
> Forwarded conversation
> Subject: opennlp.ml + Scala?
> ------------------------
>
> From: *Jason Baldridge* <[email protected]>
> Date: Mon, Mar 21, 2011 at 1:28 PM
> To: Jörn Kottmann <[email protected]>
>
>
> Hi Jorn,
>
> I've changed over to doing nearly all my coding in Scala, generally
> transitioning Java codebases to Scala by writing everything new in Scala and
> using the existing Java classes as they are. I would like to do this as part
> of the new opennlp.ml, as I'm not inclined to write any new Java code
> unless absolutely necessary, and I would very much like to create that new
> and improved package. What do you think of this?
>
> Jason
>
> --
> Jason Baldridge
> Assistant Professor, Department of Linguistics
> The University of Texas at Austin
> http://www.jasonbaldridge.com
>
> ----------
> From: *Jörn Kottmann* <[email protected]>
> Date: Mon, Mar 21, 2011 at 2:24 PM
> To: Jason Baldridge <[email protected]>
>
>
> Hmm, yeah, if we were to rewrite it I think it is something we could
> consider, but in our case we just need to do some reshaping of the
> existing code and a little refactoring here and there. That is one reason
> I believe we should be conservative and not use it in this case.
>
> Another issue I see is that it will send a message to the Mahout people
> that we do not want to collaborate, when in fact I believe collaborating
> is something we should do to get map reduce training support one day.
> The people on the team might not be familiar with Scala, which could
> further limit the manpower available for the refactoring. Just my 2 cents.
>
> I believe we should also do the maxent refactoring slowly and first do
> everything inside the current structures, and then when everything is in
> place do the last changes which break backward compatibility.
>
> Anyway, we should start a discussion about the future of OpenNLP: which
> features do we want to implement for the next few versions? Which new
> components would be nice to have? I believe there are quite a few people
> who are willing to pick up tasks but are simply not aware of the
> possibility.
>
> Jörn
>
> ----------
> From: *Jason Baldridge* <[email protected]>
> Date: Mon, Mar 21, 2011 at 3:29 PM
> To: Jörn Kottmann <[email protected]>
>
>
> Hmm... what if we did the first refactoring into opennlp.ml with pure Java
> but the new package structure, then make a first release and then start
> bringing in Scala?
>
>
> Good points.
> However, I'm finding that Scala plays *very* nicely with Java
> (including allowing Java to use Scala classes), so that could be mostly
> transparent to users of the package, maintaining the API pretty much as it
> is. So, I *think* we could continue to play nicely with Mahout folks.
>
> Also, after coding for a while in Scala, I can't help but feel that Java
> the language is dead, while the JVM lives gloriously on. :) I think there is
> a lot of momentum to Scala in general, and my feeling is that it is very
> friendly for Java programmers. (Though I had experience in functional
> programming before, so a lot of concepts came easily to me that could be
> more unusual for others.)
>
>
> What do you mean by "current structures"? Do you mean to keep the classes
> as they are now, but just switch the package organization first?
>
>
> Yes, perhaps we should do that once the release is all done? (Thanks for
> all your hard work on that, btw!)
>
> Also, perhaps we should bring up the Scala question on the mailing list? I
> wanted to ask you first to see if you had strong objections, but since
> you don't, it might be good to sound out the community.
>
> Jason
>
>
> ----------
> From: *Jörn Kottmann* <[email protected]>
> Date: Mon, Mar 21, 2011 at 3:38 PM
> To: Jason Baldridge <[email protected]>
>
>
> I actually think just doing it for maxent/ml doesn't really make sense; if
> we want to switch the programming language, it's for the entire code base.
> Then we are talking about the migration of like 400 classes from Java
> to Scala; does that really make sense? Just doing a little Scala doesn't
> sound reasonable to me.
>
> Sure, move it to the mailing list.
>
> Jörn
>
> ----------
> From: *Jason Baldridge* <[email protected]>
> Date: Mon, Mar 21, 2011 at 5:44 PM
> To: Jörn Kottmann <[email protected]>
>
>
> But, the great thing about Scala is that you can mix Scala and Java and not
> have to do one or the other -- so I don't think we'd need to do a full
> migration.
> Anyway, I'll bring it up on the list!
>
> ----------
> From: *Jörn Kottmann* <[email protected]>
> Date: Mon, Mar 21, 2011 at 5:54 PM
> To: Jason Baldridge <[email protected]>
>
>
> Yeah, but then most of the code will still remain pure Java mixed
> with a little Scala, and you have to deal with the extra complexity of
> having a little Scala, e.g. more complex build tooling, you need
> extra IDE support, more complicated compatibility issues, etc.
>
> Jörn
>
> ----------
> From: *Jason Baldridge* <[email protected]>
> Date: Mon, Mar 21, 2011 at 7:39 PM
> To: Jörn Kottmann <[email protected]>
>
>
> The build is *really* easy with SBT (which can incorporate Maven and Ivy
> dependency declarations). The idea would be to transition to Scala so that
> it would eventually be mostly Scala, if not all Scala. A standard jar is
> still distributed.
>
> ----------
> From: *Jörn Kottmann* <[email protected]>
> Date: Tue, Mar 22, 2011 at 4:33 AM
> To: Jason Baldridge <[email protected]>
>
>
> We are using Maven right now, and it does a lot more than just putting
> together a jar file, e.g.:
> - Making a release, with code signing, tagging in our SCM, producing RAT
>   reports, etc.
> - Deploying artifacts to the Apache repository
> - Building our documentation
> - Testing
> - Optionally it can run code quality tools like FindBugs or test
>   coverage tools
>
> Jörn
>
> ----------
> From: *Jason Baldridge* <[email protected]>
> Date: Tue, Mar 22, 2011 at 9:11 AM
> To: Jörn Kottmann <[email protected]>
>
>
> These might need some looking into, but are probably doable.
>
>
> These are built-in targets for SBT.
>
> -j
>
> ----------
> From: *Jörn Kottmann* <[email protected]>
> Date: Tue, Mar 22, 2011 at 9:20 AM
> To: Jason Baldridge <[email protected]>
>
>
> Our entire build system was just rewritten to meet Apache rules and
> standards; if we do that again now it will set the project back by a
> month or so.
>
> Jörn
>
> ----------
> From: *Jason Baldridge* <[email protected]>
> Date: Tue, Mar 22, 2011 at 9:33 AM
> To: Jörn Kottmann <[email protected]>
>
>
> Fair enough. I will still bring it up as it now actually pains me to code
> in Java. ;)
>
> Oh, here is how to deploy artifacts:
>
> http://henkelmann.eu/2010/11/14/sbt_hudson_with_test_integration
>
> I think the others would be straightforward. Possibly one of the bigger
> sticking points would be IDE integration -- I use Emacs and it all works
> very well for me, but I don't know how it is for Eclipse and NetBeans folks.
>
> ----------
> From: *Jörn Kottmann* <[email protected]>
> Date: Tue, Mar 22, 2011 at 9:40 AM
> To: Jason Baldridge <[email protected]>
>
>
> I didn't say it's not possible to rewrite our build with SBT, but I strongly
> believe that is an effort which will take quite some time, e.g. a month,
> just to get a build which is as good as the Maven build we just finished.
> Everyone would have to install the Scala plugins into their IDEs to get
> proper support, which is of course also possible.
>
> Yeah, bring it up on the mailing list.
>
> Jörn
>
> ----------
> From: *Jason Baldridge* <[email protected]>
> Date: Tue, Mar 22, 2011 at 9:46 AM
> To: Jörn Kottmann <[email protected]>
>
>
> Sounds good. And I find that it is often straightforward to take Maven
> specifications and either use them directly from SBT or translate them into
> the SBT definitions. Perhaps we could start this with opennlp.ml and then
> see how it goes before doing it in the main OpenNLP code.
>
>
>
> --
> Jason Baldridge
> Assistant Professor, Department of Linguistics
> The University of Texas at Austin
> http://www.jasonbaldridge.com
>

--
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
