Chris, this is really nice work.
On Wed, Apr 22, 2020 at 1:46 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:

> Hi Andrew,
>
> thanks for your kind words ... they are sort of the fuel that makes me run ;-)
>
> So some general observations and suggestions:
> - You seem to use test-jars quite a bit: These are generally considered an antipattern, as you possibly import problems from another module and you will have no way of detecting them. If you need shared test code, it's better practice to create a dedicated test-utils module and include that wherever it's needed.
> - Don't use variables for project dependencies: It makes things slightly more difficult to read, the release plugin takes care of updating versions for you, and some third-party plugins might have issues with it.
> - I usually provide versions for all project dependencies and have all other dependencies managed in a dependencyManagement section of the root module. This avoids problems with version conflicts when constructing something using multiple parts of your projects (especially your lib directory thing).
> - Accessing resources outside of the current module's scope is generally considered an antipattern ... regarding your lib thing, I would suggest an assembly that builds a directory (but I do understand that this version perhaps speeds up the development workflow ... we could move the clean plugin configuration and the antrun plugin config into a profile dedicated for development).
> - I usually order the plugin configurations (as much as possible) the way they are usually executed in the build ... so: clean, process resources, compile, test, package, ... this makes it easier to understand the build in general.
>
> Today I'll go through the poms again, managing all versions and cleaning up the order of things. Then, if all still works, I would bump the dependency versions up as much as possible.
>
> Will report back as soon as I'm through or I've got something to report ...
> then I'll also go into details with your feedback (I haven't ignored it ;-) )
>
> Chris
>
>
> On 22.04.20, 06:08, "Andrew Palumbo" <ap....@outlook.com> wrote:
>
> Fixing previous message..
>
> Quote from Chris Dutz:
>
> Hi folks,
>
> so I was now able to build (including all tests) with Java 8 and 9 ... currently trying 10 ...
>
> Are there any objections that some maven dependencies get updated to more recent versions? I mean ... the hbase-client you're using is more than 5 years old ...
>
> My answer:
>
> I personally have no problem with the updating of any dependencies, they may break some things and cause more work, but that is the kind of thing that we've been trying to get done in this build work, get everything up to speed.
>
> I'd say take Andrew, Trevor and Pat's word over mine though, I am a bit less active presently.
>
> Thanks.
>
> Andy
>
> ________________________________
> From: Andrew Palumbo <ap....@outlook.com>
> Sent: Tuesday, April 21, 2020 10:17 PM
> To: dev@mahout.apache.org <dev@mahout.apache.org>
> Subject: Re: Hi ... need some help?
>
> Hi folks,
>
> so I was now able to build (including all tests) with Java 8 and 9 ... currently trying 10 ...
>
> Are there any objections that some maven dependencies get updated to more recent versions? I mean ... the hbase-client you're using is more than 5 years old ...
> Not by me, I believe that is being used by the MR module, which is Deprecated.
>
> I personally have no problem with the updating of any dependencies, they may break some things and cause more work, but that is the kind of thing that we've been trying to get done in this build work, get everything up to speed.
>
> I'd say take Andrew, Trevor and Pat's word over mine though, I am a bit less active presently.
>
> Thanks.
>
> Andy
> ________________________________
> From: Andrew Palumbo <ap....@outlook.com>
> Sent: Tuesday, April 21, 2020 10:13 PM
> To: dev@mahout.apache.org <dev@mahout.apache.org>
> Subject: Re: Hi ... need some help?
>
> Chris, thank you so much for what you are doing. This is Apache at its best. I've been down and out with a serious illness, injury and other issues, which have seriously limited my machine time. I was pretty close to getting a good build, but it was hacky, and the method that you use to name the modules for both Scala versions looks great.
>
> We've always relied on Stevo to fix the builds for us, but as he said, he is unable to contribute right now. The main issues (solved by hacks) currently are:
>
> 1. Dependencies and transitive dependencies are not being picked up and copied to the `./lib` directory, where `/bin/mahout` and parts of the MahoutSparkContext look for them, to add to the class path. So running either from the CLI or as a library, dependencies are not picked up.
>    * We used to use the mahout-experimental-xx.jar as a fat jar for this, though it was bloated with now deprecated MR stuff, and is no longer packed.
> 2. `./bin/mahout` (and `compute-classpath.sh`) need to be revamped to ensure that they are picking up the correct classes.
>
> W.r.t. Java 8/7 issues, we did mandate Java 8+, and this required a few minor code changes to play nicely with Scala 2.11. Mainly one class needed a JVM "Static" field, so I refactored that field out of the Class and into a companion object. I wonder if this is what is giving you issues with Java 7.
>
> I'd thought that Java 8 was mandated now, but may be thinking of maven 3.3.x.
>
> Regardless, thank you very much for this. This board is really doing well so far, and deserves accolades.
>
>
> > <dependency>
> >   <groupId>org.apache.mahout</groupId>
> >   <artifactId>mahout-spark</artifactId>
> >   <version>14.1-SNAPSHOT</version>
> >   <classifier>2.11</classifier>
> > </dependency>
> This would be perfect IMO.
>
> I can send you the commits that I am talking about.
>
> As well, I saw that Trevor gave you a link to a filter. I have one here with a bit more limited scope, which is open issues fixversion == 14.1.
>
> To answer one question: yes, this was recently building, and releasing, with all of the tests passing (for a few modules that we were focusing on). After that I made some changes that broke it again.
>
> The board with limited scope:
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=348&view=detail&selectedIssue=MAHOUT-2093
>
> Thanks again for helping out. We are really bad with poms, not so much from the ground up, as fixing some that are 10 years old, as Stevo mentioned, very quickly while working on several other things.
>
> Thank you again for this. It is a great help, and once we get a good build, we can get back to doing work on the library itself.
>
> I have some documents that I can provide if it will help explain the structure of the project, which is still kind of in flux. E.g. I'd like to get the ViennaCL-OMP branch out of experimental, but there is much to clean up first. As well, I am on medical leave, and don't have much time on the computer these days, so I have to budget my time.
>
> I'll send you some (closed) PRs with notes and changes, if it helps. lmk.
>
> Thanks again, this is huge.
>
> Andy
> ________________________________
> From: Christofer Dutz <christofer.d...@c-ware.de>
> Sent: Thursday, April 16, 2020 9:50 AM
> To: dev@mahout.apache.org <dev@mahout.apache.org>
> Subject: Re: Hi ... need some help?
>
> Hi Trevor,
>
> ok ... first of all ... the Mahout PMC is defining a "community maintained" library which is not maintained by the mahout PMC?!?!
> I thought at Apache everything is about Community over code. So is a company driving the non-community stuff?
>
> But back to your build issues:
> I had a look and I too encountered these comments and remarks and sometimes patterns I recognized and could imagine why they were created. Yes, quite a bit of the build could be cleaned up and simplified a lot.
>
> So how about I create a fork and try to do a cleanup of the build. Usually I also leave comments about what I do, as I hope I'll not be the only one maintaining a build, and documenting things helps people feel more confident.
>
> However, in some cases I will have questions ... so would someone be available on Slack for quick questions?
>
> Usually switching to another build system does solve some problems ... mostly the reason to switch is that it solves the main problem that you are having with the old one. However, you usually notice too late that you get yourself a lot of new problems. I remember doing some contract work for an insurance company and they were totally down Maven-road but then had to build something with SBT ... in the end I compiled the thing on my laptop, copied it to a USB stick and told the people what was on the stick and that I'll be having a coffee and will be back in 30 minutes. When I came back the stick wasn't at the same place and the build problem was "solved" ;-)
>
> So I think it's quite good to stick to maven ... that is very mature, you can do almost everything you want with it and it integrates perfectly into the Apache infrastructure.
>
> But that's just my opinion.
>
> So if you want me to help, I'll be happy to be of assistance.
>
>
> Chris
>
>
> On 16.04.20, 15:28, "Trevor Grant" <trevor.d.gr...@gmail.com> wrote:
>
> Hey Christopher,
>
> I would agree with what Stevo outlined but add some more context and a couple related JIRA issues.
>
> For 0.14.0 we did a big refactor and finally moved the MapReduce based Mahout all into what we called "community/", that is, community maintained, which is to say, we're not maintaining it anymore (sunset began I think in 2015).
>
> But all of our POMs were so huge and fat because they'd been layered up over the years by people coming and going and dropping in code. I wouldn't call these drive-bys, it's just been over 10 years and people come and go. Such is the life of Apache Projects. So we had a situation where a lot of the old Map Reduce stuff and the POMs were considered "old-magic": no one really knew how it was all tied together, but we didn't want to mess with it for fear of breaking something in the "new" Mahout (aka Samsara), which is the Scala/Spark based library that it is now* (to others in the community: I know it runs on other engines, but for simplicity, I'm just calling it "runs-on-spark").
>
> For 0.14.0 we decided to trim out as much of that as possible. We did some major liposuction on POMs, reorganized things, etc. This was done by commenting out a section, then seeing if it would still build. So the current release _does_ build. And aside from some CLI driver issues which are outlined in [1], the project runs fairly smoothly. (An SBT build would probably solve [1]; I believe Pat Ferrel has made his own SBT script to compile Mahout, which solved that problem for them.)
>
> The issue we ran into with the releases (and the reason I think you're here) is that we also somewhere along the line commented out something that was important to the release process. Hence why 0.14.0 released source only.
>
> Since 2008, there has been a lot of great work on generating plugins for doing Apache releases. Instead of the awkward hacks that made up the old poms (literally comments that said, "this is a hack, there's supposedly something better coming from ..."
> dated like 2012), we would like to do it the "right way" and incorporate the appropriate plugins.
>
> Refactoring to SBT was _one_ proposed solution. We're also OK continuing to use Maven, and I agree with what you said about the cross compiling. We actually have a script that just changes the scala version. We tried using the classifiers but there were issues in SBT, but the way you're proposing sounds a lot more pro than the route we were trying for.
>
> That said, we'd be OK just releasing one scala/spark version at a time. But getting the convenience binaries to release/publish would be a major first step.
>
> Also, we really appreciate the help,
>
> tg
>
>
> [1] https://issues.apache.org/jira/projects/MAHOUT/issues/MAHOUT-2093?filter=allopenissues
>
>
> On Thu, Apr 16, 2020 at 4:50 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:
>
> > Hi Stevo,
> >
> > so let me summarize what I understood:
> >
> > - There are some modules in mahout that are built with Scala, some with java and some with both (at least that's what I see when checking out the project)
> > - The current build uses Scala 2.11 to build the Scala code.
> > - The resulting libraries are only compatible with Scala 2.11
> >
> > Now you want to also publish versions compatible with Scala 2.12?
> >
> > If that's the case I think Maven could easily add multiple executions where each compile compiles to different output directories:
> > - Java --> target/classes
> > - Scala 2.11 --> target/classes-2.11
> > - Scala 2.12 --> target/classes-2.12
> >
> > Then the packaging would also need a second execution ... each of the executions bundling the classes and the corresponding scala output. Ideally I would probably use maven classifiers to distinguish the artifacts.
> >
> > <dependency>
> >   <groupId>org.apache.mahout</groupId>
> >   <artifactId>mahout-spark</artifactId>
> >   <version>14.1-SNAPSHOT</version>
> >   <classifier>2.11</classifier>
> > </dependency>
> >
> > Then it should all work in a normal maven build. In the distributions you could also filter the versions according to their classifiers.
> >
> > So if this is the case, I could help you with this.
> >
> > Chris
> >
> >
> > On 16.04.20, 09:39, "Stevo Slavić" <ssla...@gmail.com> wrote:
> >
> > Disclaimer: I'm not an active Mahout maintainer for quite a while, have some historical perspective, take it with a grain of salt, could be I'm missing the whole point you were approached for by a wide margin of error.
> >
> > At a point Mahout, some of its modules, have turned into a scala library, and there was need to cross publish those modules, across different scala versions. Back then Maven scala plugin didn't support cross publishing, it doesn't fit well with Maven's build lifecycle concept (multiple compile phases - one for each scala version, and what not would be needed). Switching to sbt could have solved the problem. Switch was deemed to be too big a task, even though ages have been spent on trying to apply Maven (profiles) + bash scripts and what not to solve the problem. Trying to apply the same approach over and over again and expecting different results is not smart, no expert can help there. Mahout maintainers and contributors should consider an alternative approach, one of them being switching to sbt - it's scala native, supports scala cross publishing, supports publishing Maven compatible release metadata and binaries.
> >
> > Kind regards,
> > Stevo Slavic.
> >
> > On Thu, Apr 16, 2020 at 9:15 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:
> >
> > > Hi folks,
> > >
> > > my name is Chris and I’m involved in quite a lot of Apache projects. Justin approached me this morning, asking me if I could perhaps help you. He told me you were having trouble with doing Maven releases.
> > >
> > > As Maven releases are my specialty, could you please summarize the issues you are having?
> > >
> > > Chris
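
For reference, the multi-execution idea from Chris's 16 April message (one Scala compile per version into its own output directory, then one classified jar per version) might look roughly like the POM fragment below. It is only a sketch, not the configuration that ended up in the Mahout build: the execution ids and the Scala patch versions are made-up placeholders, plugin versions are left out, and it assumes the `scalaVersion` and `outputDir` parameters of scala-maven-plugin and the `classifier` and `classesDirectory` parameters of maven-jar-plugin. It does not address per-version dependency resolution (scala-library, Spark artifacts), and it packages only the Scala output; bundling the Java classes from target/classes into each jar would need an extra step.

```xml
<!-- Sketch only: ids and Scala patch versions are illustrative assumptions. -->
<build>
  <plugins>
    <plugin>
      <groupId>net.alchim31.maven</groupId>
      <artifactId>scala-maven-plugin</artifactId>
      <executions>
        <!-- Scala 2.11 compiled into target/classes-2.11 -->
        <execution>
          <id>compile-scala-2.11</id>
          <goals>
            <goal>compile</goal>
          </goals>
          <configuration>
            <scalaVersion>2.11.12</scalaVersion>
            <outputDir>${project.build.directory}/classes-2.11</outputDir>
          </configuration>
        </execution>
        <!-- Scala 2.12 compiled into target/classes-2.12 -->
        <execution>
          <id>compile-scala-2.12</id>
          <goals>
            <goal>compile</goal>
          </goals>
          <configuration>
            <scalaVersion>2.12.10</scalaVersion>
            <outputDir>${project.build.directory}/classes-2.12</outputDir>
          </configuration>
        </execution>
      </executions>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <executions>
        <!-- One additional jar per Scala version, told apart by classifier -->
        <execution>
          <id>jar-scala-2.11</id>
          <phase>package</phase>
          <goals>
            <goal>jar</goal>
          </goals>
          <configuration>
            <classifier>2.11</classifier>
            <classesDirectory>${project.build.directory}/classes-2.11</classesDirectory>
          </configuration>
        </execution>
        <execution>
          <id>jar-scala-2.12</id>
          <phase>package</phase>
          <goals>
            <goal>jar</goal>
          </goals>
          <configuration>
            <classifier>2.12</classifier>
            <classesDirectory>${project.build.directory}/classes-2.12</classesDirectory>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

A consumer would then select a Scala version through the `<classifier>` element of the dependency, as in the `mahout-spark` snippet quoted in the thread.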