Hi Andrew, guess I'll start with the fork and contact you folks on slack.
Chris

On 16.04.20, 19:43, "Andrew Musselman" <a...@apache.org> wrote:

Chris, thank you for your help.

Yeah, if you fork what's in master you can see what state it's in; we are in the #mahout channel in the-asf Slack, and this is also a fine way to keep track of discussion. We could file a JIRA ticket as well, however you prefer to work.

Best
Andrew

On Thu, Apr 16, 2020 at 06:59 Christofer Dutz <christofer.d...@c-ware.de> wrote:

> Hi Trevor,
>
> ok ... first of all ... the Mahout PMC is defining a "community maintained" library which is not maintained by the Mahout PMC?!?!
> I thought at Apache everything is about Community over code. So is a company driving the non-community stuff?
>
> But back to your build issues:
> I had a look and I too encountered these comments and remarks, and sometimes patterns I recognized and could imagine why they were created.
> Yes, quite a bit of the build could be cleaned up and simplified a lot.
>
> So how about I create a fork and try to do a cleanup of the build.
> Usually I also leave comments about what I do, as I hope I'll not be the only one maintaining a build, and documenting things helps people feel more confident.
>
> However in some cases I will have questions ... so would someone be available on Slack for quick questions?
>
> Usually switching to another build system does solve some problems ... mostly the reason to switch is that it solved the main problem that you are having with the old.
> However you usually notice too late that you get yourself a lot of new problems. I remember doing some contract work for an insurance company and they were totally down Maven-road but then had to build something with SBT ... in the end I compiled the thing on my laptop, copied it to a USB stick and told the people what was on the stick and that I'll be having a coffee and will be back in 30 minutes.
> When I came back the stick wasn't at the same place and the build problem was "solved" ;-)
>
> So I think it's quite good to stick to Maven ... that is very mature, you can do almost everything you want with it and it integrates perfectly into the Apache infrastructure.
>
> But that's just my opinion.
>
> So if you want me to help, I'll be happy to be of assistance.
>
> Chris
>
> On 16.04.20, 15:28, "Trevor Grant" <trevor.d.gr...@gmail.com> wrote:
>
> Hey Christopher,
>
> I would agree with what Stevo outlined but add some more context and a couple related JIRA issues.
>
> For 0.14.0 We did a big refactor and finally moved the MapReduce based Mahout all into what we called "community/" that is community maintained, which is to say, we're not maintaining it anymore (sunset began I think in 2015).
>
> But all of our POMs were so huge and fat because they'd been layered up over the years by people coming and going and dropping in code. I wouldn't call these drive-bys, it's just been over 10 years and people come and go. Such is the life of Apache Projects. So we had a situation where a lot of the old Map Reduce stuff and the POMs were considered "old-magic": no one really knew how it was all tied together, but we didn't want to mess with it for fear of breaking something in the "new" Mahout (aka Samsara), which is the Scala/Spark based library that it is now* (to others in the community: I know it runs on other engines, but for simplicity, I'm just calling it "runs-on-spark").
>
> For 0.14.0 We decided to trim out as much of that which was possible. We did some major liposuction on POMs, re organized things, etc. This was done by commenting out a section, then seeing if it would still build. So the current release _does_ build. And aside from some CLI driver issues which are outlined in [1], the project runs fairly smooth.
> (An SBT would probably solve [1], I believe Pat Ferrel has made his own SBT script to compile Mahout, which solved that problem for them).
>
> The issue we ran into with the releases (and the reason I think you're here), is that we also somewhere along the line commented out something that was important to the release process. Hence why 0.14.0 released source only.
>
> Since 2008, there has been a lot of great work on generating plugins for doing Apache releases. Instead of the awkward hacks that made up the old poms (literally comments that said, "this is a hack, there's supposedly something better coming from ..." dated like 2012), we would like to do it the "right way" and incorporate the appropriate plugins.
>
> Refactoring to SBT was _one_ proposed solution. We're also OK continuing to use Maven, and I agree with what you said about the cross compiling. We actually have a script that just changes the scala version. We tried using the classifiers but there were issues in SBT, but the way you're proposing sounds a lot more pro than the route we were trying for.
>
> That said- we'd be OK just releasing one scala/spark version at a time. But getting the convenience binaries to release/publish would be a major first step.
>
> Also, we really appreciate the help,
>
> tg
>
> [1] https://issues.apache.org/jira/projects/MAHOUT/issues/MAHOUT-2093?filter=allopenissues
>
> On Thu, Apr 16, 2020 at 4:50 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:
>
> > Hi Stevo,
> >
> > so let me summarize what I understood:
> >
> > - There are some modules in mahout that are built with Scala, some with java and some with both (At least that's what I see when checking out the project)
> > - The current build uses Scala 2.11 to build the Scala code.
> > - The resulting libraries are only compatible with Scala 2.11
> >
> > Now you want to also publish versions compatible with Scala 2.12?
> >
> > If that's the case I think Maven could easily add multiple executions where each compile compiles to different output directories:
> > - Java --> target/classes
> > - Scala 2.11 --> target/classes-2.11
> > - Scala 2.12 --> target/classes-2.12
> >
> > Then the packaging would also need a second execution ... each of the executions bundling the classes and the corresponding scala output.
> > Ideally I would probably use maven classifiers to distinguish the artifacts.
> >
> > <dependency>
> >   <groupId>org.apache.mahout</groupId>
> >   <artifactId>mahout-spark</artifactId>
> >   <version>14.1-SNAPSHOT</version>
> >   <classifier>2.11</classifier>
> > </dependency>
> >
> > Then it should all work in a normal maven build. In the distributions you could also filter the versions according to their classifiers.
> >
> > So if this is the case, I could help you with this.
> >
> > Chris
> >
> > On 16.04.20, 09:39, "Stevo Slavić" <ssla...@gmail.com> wrote:
> >
> > Disclaimer: I'm not active Mahout maintainer for quite a while, have some historical perspective, take it with a grain of salt, could be I'm missing the whole point you were approached for by a wide margin of error.
> >
> > At a point Mahout, some of its modules, have turned into a scala library, and there was a need to cross publish those modules, across different scala versions. Back then the Maven scala plugin didn't support cross publishing, it doesn't fit well with Maven's build lifecycle concept (multiple compile phases - one for each scala version, and what not would be needed). Switching to sbt could have solved the problem. The switch was deemed to be too big a task, even though ages have been spent on trying to apply Maven (profiles) + bash scripts and what not to solve the problem. Trying to apply the same approach over and over again and expecting different results is not smart, no expert can help there.
> > Mahout maintainers and contributors should consider an alternative approach, one of them being switching to sbt - it's scala native, supports scala cross publishing, supports publishing Maven compatible release metadata and binaries.
> >
> > Kind regards,
> > Stevo Slavic.
> >
> > On Thu, Apr 16, 2020 at 9:15 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:
> >
> > > Hi folks,
> > >
> > > my name is Chris and I’m involved in quite a lot of Apache projects.
> > > Justin approached me this morning, asking me if I could perhaps help you.
> > > He told me you were having trouble with doing Maven releases.
> > >
> > > As Maven releases are my specialty, could you please summarize the issues you are having?
> > >
> > > Chris