We still have legacy code that runs a Stochastic SVD directly against the local Hadoop instance from a Java desktop application. But if the plan is to eliminate it, we’ve been inclined for a while to migrate everything to Spark.
Sorry, I’m old school and use MR, plus I’m new to Spark :) Is there an easy way to migrate your Spark example into the Java source code so that we do not disrupt the overall flow?

Have a great evening!
Mihai

> On 21 Mar 2016, at 19:31, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
>
> my 1 cents (since it is less than 2) is MAHOUT_LOCAL is part of MR legacy
> packaging. as long as MR is still here (and I would say it needs to be
> still here, unless it falls in complete disrepair and totally out of sync
> with even dated mapreduce apis), MAHOUT_LOCAL needs to stay. As soon as MR
> goes, it goes too.
>
> maybe we just simply need a separate mahout script for non-legacy things,
> or factor out legacy related shell things into another script (something
> like mahout-mr.sh instead of mahout.sh)
>
> On Mon, Mar 21, 2016 at 8:45 AM, Suneel Marthi <smar...@apache.org> wrote:
>
>> Some background on this issue:
>>
>> 1. Now that we support Spark and H2O as back ends since 0.10.0 and Flink
>> coming soon in 0.12.0, its been bloating the size of our release artifacts
>> when pushing releases to Apache mirrors. Hence we were looking at pruning
>> some of the components that have not been used or have been long marked
>> deprecated and are not being worked on.
>>
>> 2. Since Mahout 0.7 release in June 2012, the project has diverged from
>> the MiA book even for legacy MapReduce. Not sure if that's indeed helping
>> onboard new users.
>>
>> 3. Seems like the consensus so far based on the user responses is to
>> retain the MAHOUT_LOCAL the option, thanks all for your responses.
>>
>>
>> On Mon, Mar 21, 2016 at 11:38 AM, scott cote <scottcc...@gmail.com> wrote:
>>
>>> one more comment - I understand that it only works for the legacy code.
>>> Kill it when the legacy code is no longer deprecated, but gone ….
>>>
>>> Otherwise - you will shut out people who buy the older mahout books (such
>>> as MIA) which are still good reads, even though the tech is dated.
>>>
>>> SCott
>>>
>>>> On Mar 21, 2016, at 2:24 AM, David Starina <david.star...@gmail.com> wrote:
>>>>
>>>> Anyhow, I'm +1 for removing MAHOUT_LOCAL, but I believe the deprecated
>>>> MapReduce-based code still makes sense if it is running well on Ignite.
>>>>
>>>> On Mon, Mar 21, 2016 at 8:20 AM, David Starina <david.star...@gmail.com> wrote:
>>>>
>>>>> Has anyone tried to run the deprecated MapReduce code on Ignite? Is the
>>>>> performance improvement good enough to reconsider leaving those algorithms
>>>>> in Mahout?
>>>>>
>>>>> On Mon, Mar 21, 2016 at 12:45 AM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:
>>>>>
>>>>>> Yes I agree; will leave the question open a couple days.
>>>>>>
>>>>>> On Sunday, March 20, 2016, Pat Ferrel <p...@occamsmachete.com> wrote:
>>>>>>
>>>>>>> Maybe a better user question is: How many people are still using the
>>>>>>> deprecated Hadoop code?
>>>>>>>
>>>>>>> If the number is small +1 for removal.
>>>>>>>
>>>>>>> On Mar 20, 2016, at 11:04 AM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:
>>>>>>>
>>>>>>> To clarify, the MAHOUT_LOCAL option only works for legacy Hadoop
>>>>>>> MapReduce-based jobs which officially became deprecated in 0.10.0.
>>>>>>>
>>>>>>> On Sun, Mar 20, 2016 at 10:25 AM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Yes as I understand it.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sunday, March 20, 2016, Pat Ferrel <p...@occamsmachete.com> wrote:
>>>>>>>>
>>>>>>>>> Are we just talking about Hadoop Mapreduce? I thought is was ignored when
>>>>>>>>> using Spark.
>>>>>>>>>
>>>>>>>>> On Mar 20, 2016, at 8:20 AM, alok tanna <tannaa...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> -1 MAHOUT_LOCAL is very useful for quick POC .
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Alok Tanna
>>>>>>>>> Sent from my iPhone
>>>>>>>>>
>>>>>>>>>> On Mar 20, 2016, at 5:01 AM, Mihai Dascalu <mihai.dasc...@cs.pub.ro> wrote:
>>>>>>>>>>
>>>>>>>>>> -1 I still use it for fast deployment and it’s really helpful for small
>>>>>>>>>> local processing
>>>>>>>>>>
>>>>>>>>>> Have a great weekend!
>>>>>>>>>> Mihai
>>>>>>>>>>
>>>>>>>>>>> On 20 Mar 2016, at 06:13, Suneel Marthi <suneel.mar...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> +1 to remove this
>>>>>>>>>>>
>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>
>>>>>>>>>>>> On Mar 20, 2016, at 12:01 AM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> We're discussing removing the MAHOUT_LOCAL option in order to trim artifact
>>>>>>>>>>>> sizes.
>>>>>>>>>>>>
>>>>>>>>>>>> If you think keeping the option to use MAHOUT_LOCAL for testing with the
>>>>>>>>>>>> single-node mode of Hadoop is important please let us know. It can be handy
>>>>>>>>>>>> for trying things out but it would be nice to ditch the effort required to
>>>>>>>>>>>> maintain it.
>>>>>>>>>>>>
>>>>>>>>>>>> See https://issues.apache.org/jira/browse/MAHOUT-1705 for more context.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!