2012/1/4 Grant Ingersoll <[email protected]>

> The big downside I see to Scala here is one of how many current committers
> know it and how many potential contributors know it. If you are the only
> committer who knows it, that leaves you to do all the bug fixes, etc. until
> you can attract others.
>
I am not a committer, but if you need it I think I can help with Scala: I
have some experience with it from Clerezza, and with OpenNLP (I have sent
some patches).

>
> A bit off topic, but it seems to me that the ML stuff could be abstracted
> a bit such that different implementations are pluggable.

I think that would be nice too.

> This way, you could go for Scala if you want, but others could plug in
> their own classifiers, etc. Is that part of this plan?
>
> On Jan 3, 2012, at 12:00 AM, Jason Baldridge wrote:
>
> > That is an interesting post that spurred a lot of discussion months back.
> > David Pollak has a good follow-up to that article that goes into some more
> > detail about that post:
> >
> > http://goodstuff.im/scala-use-is-less-good-than-java-use-for-at-l
> >
> > The focus is really on the culture of programmers and different types of
> > programmers and which ones Java, PHP, or Scala might be best suited for. He
> > ends it with this comment:
> >
> > Oh... and all you wicked smart people who are pushing the boundaries (or
> > think you will) with data size, event frequency and real-time stuff, you'll
> > find Scala to be a dream come true and there will be nothing like it that
> > you've ever used (okay, except maybe Haskell). So, come, build your cool
> > thing on Scala and succeed.
> >
> > That's exactly where opennlp.ml should be. And, it is perfectly possible to
> > have such a library be written in Scala but provide a Java API to it, like
> > Akka <http://akka.io/> does. (Incidentally, Akka is a good reason to use
> > Scala, though one can use it with Java too, just more painful that way.) In
> > fact, opennlp.ml would have to be written that way since the first "user"
> > of it would be the OpenNLP toolkit.
> >
> > Also, there is a good discussion involving David Pollak and Dick Wall here:
> >
> > http://www.infoq.com/articles/barriers-to-scala-adoption
> >
> > Regarding the title of the original blog post ("Yes, Virginia, Scala is
> > hard"), Dick Wall notes that:
> >
> > Yes, but I would choose to expand the title to "Software Development is
> > hard" or perhaps "Vigorous software development is hard". When you set out
> > to complete a project or write a system, you have a problem to solve.
> > Chances are that if it is something good and new, it's going to be pretty
> > hard. The complexity of the delivered item will be dictated to some degree
> > by the problem to be solved, and that complexity bar will be about the same
> > height no matter how you tackle it.
> >
> > Choosing a language with more power is the first way you can get a boost on
> > reaching that bar. Choice of libraries is the next, and the remainder you
> > fill in yourself. In Java, the power is (by modern standards) fairly low,
> > leaving a larger gap to reach the bar. Most people fill in with libraries,
> > e.g. JPA, Wicket, Spring or perhaps full blown Java EE. These bring their
> > own significant complexity to the project (not to mention their own
> > learning curve). Then the work begins on the final part, the custom work
> > necessary to reach the bar.
> >
> > If you are writing something like a web application, the chances are that
> > the libraries available (of which there are many in Java) will get you
> > almost all the way there, albeit with a significant investment in learning
> > the libraries the first time you do it. If the task is something a little
> > less commonplace, perhaps a scientific or mathematical project, or just
> > some totally new idea or approach, you have even more to do. At this point
> > you want the most power, flexibility and expressiveness you can get, and
> > that comes back to the language you choose.
> >
> > I have found the value that coding in Scala brings has far outweighed the
> > effort to learn it and its complexities (which I greatly enjoyed learning
> > about, and continue to enjoy learning about). I now pretty much have to
> > stop myself from gagging when forced to write code in Java.
> >
> > David notes in that discussion:
> >
> > If you're doing some form of event processing (trading floor, sports
> > betting, near-real-time data analysis, social networking), Scala is a huge
> > win over Java. If you've got complex, distributed systems, Scala and
> > immutability is a huge win. In these scenarios, the costs of using Scala
> > (learning curve, poor tooling, etc.) are small in comparison to the
> > benefits of Scala (immutability, composability, good event processing,
> > excellent libraries/frameworks that provide a starting point for these
> > kinds of systems.)
> >
> > Again, the nature of many machine learning algorithms makes this a good
> > fit. Add to that the existence of systems like Spark and relatively new
> > front-ends for Hadoop such as Scrunch and Scoobi, which make developing
> > MapReduce algorithms w/ Scala much nicer and far far preferable to the pain
> > of coding them up in Java.
> >
> > Note that these discussions are from many months ago. The Scala ecosystem
> > has continued to evolve, including continual improvements to IDE support
> > for Scala development with Eclipse (and probably for IntelliJ as well).
> >
> > I would also note that Java lacks a truly wonderful feature of languages
> > like Scala, Python, Clojure and others: a REPL that allows you to try out
> > code snippets interactively. This is a great way of testing example code
> > before actually putting it into your system, knowing that it will work.
> > It's also a great teaching tool for people new to the language.
> >
> > FWIW, here's Aliaksandr's example in Scala, which can be tried out in the
> > Scala REPL.
> >
> > (List("I", "l", "v", " ", "P") zip List(" ", "o", "e", "F", "!")) map {
> >   case(x,y) => x+y } mkString
> >
> > Once you get functional programming, it is truly painful to do without it!
> >
> > Jason
> >
> > On Mon, Jan 2, 2012 at 2:51 PM, Aliaksandr Autayeu
> > <[email protected]> wrote:
> >
> >> An interesting post about Scala:
> >> http://goodstuff.im/yes-virginia-scala-is-hard
> >>
> >> Jason, by the way, from what I saw following your link, I might love Scala
> >> as much as...
> >> StringJoin @@ (MapThread[#1 <> #2 &, {{"I", "l", "v", " ", "P"}, {" ", "o", "e", "F", "!"}}])
> >> (that's in Mathematica language) but I would consider the article above,
> >> keeping in mind the audience of the project.
> >>
> >> Aliaksandr
> >>
> >>
> >> On Thu, Dec 29, 2011 at 6:46 PM, Aliaksandr Autayeu <
> >> [email protected]> wrote:
> >>
> >>> IMO, Java's advantage is that of being a de facto standard, with mature
> >>> tools, infrastructure and other things. I'm sure there are many cool toys
> >>> out there (git was a cool toy once, and SVN is still a de facto standard in
> >>> many places), but ease of use of OpenNLP (standard Java without excessive
> >>> dependencies + maven) is an enormous advantage of OpenNLP in most of my use
> >>> cases.
> >>>
> >>> Aliaksandr
> >>>
> >>> P.S. The link is a nice collection of material!
> >>>
> >>> On Thu, Dec 29, 2011 at 6:27 PM, Jason Baldridge <
> >>> [email protected]> wrote:
> >>>
> >>>> I'd really like to use Scala in the opennlp.ml rewrite, for reasons I've
> >>>> already stated on the list. My thinking on this is to do the first
> >>>> reorganization for opennlp.ml in pure Java, make a release, and then
> >>>> start mixing in Scala. I've been happily mixing Scala and Java on a
> >>>> number of projects without much fuss. However, I do so in the context of
> >>>> using SBT (simple build tool), rather than maven (SBT can read Maven
> >>>> declarations, FWIW). It is quite straightforward to use, and I'm now using
> >>>> Eclipse with the Scala IDE for Eclipse to build Java/Scala projects - so it
> >>>> should be straightforward for others to get up and running with it.
> >>>>
> >>>> I'd be interested in hearing whether anyone has any particular concerns or
> >>>> objections about this plan. Also interested in hearing whether anyone is
> >>>> particularly keen on the use of Scala.
> >>>>
> >>>> BTW, if you haven't seen much of Scala before, I have some very gentle
> >>>> introductions (aimed at first-time programmers) for getting started with it
> >>>> on my blog. You can find links to the posts, plus to lots of other
> >>>> resources here:
> >>>>
> >>>> http://icl-f11.utcompling.com/links
> >>>>
> >>>> --
> >>>> Jason Baldridge
> >>>> Associate Professor, Department of Linguistics
> >>>> The University of Texas at Austin
> >>>> http://www.jasonbaldridge.com
> >>>> http://twitter.com/jasonbaldridge
> >>>>
> >
> > --
> > Jason Baldridge
> > Associate Professor, Department of Linguistics
> > The University of Texas at Austin
> > http://www.jasonbaldridge.com
> > http://twitter.com/jasonbaldridge
>
> --------------------------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
