Re: Removing MAHOUT_LOCAL option
The stochastic SVD in DSSVD.scala is identical to the MR version, except that MR uses a more numerically stable reordered Givens QR, while DSSVD.scala uses a less numerically stable Cholesky QR. Aside from that, the DrmLike input parameter is fully compatible with the HDFS sequence file input of the MR version.

In Samsara the code would be (I am writing from memory and hopefully spell everything right):

    val drmX = drmDfsRead(path=...)
    val (drmU, drmV, s) = dssvd(drmX, k=..., q=..., ...) // whatever parameters you normally use here

This should do it.

Of course, you'd run into a significant infrastructure migration if you do not currently have H2O or Spark available and spinning somewhere already.

-d

On Mon, Mar 21, 2016 at 12:57 PM, Mihai Dascalu wrote:

> We still have legacy code that uses the local Hadoop instance directly
> in a Java desktop application for a Stochastic SVD. But if the desire
> is to eliminate it, we have been inclined for a while to migrate
> everything to Spark.
>
> Sorry, I’m old school and use MR, plus I’m new to Spark :) Is there an
> easy way to migrate your Spark example into the Java source code so that
> we do not disrupt the overall flow?
>
> Have a great evening!
> Mihai
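A side note on the QR remark in the reply above, as a minimal sketch and not Mahout's actual code: Cholesky QR factors a tall matrix A by forming the Gram matrix A^T A, taking its Cholesky factor L (so A^T A = L L^T), and then solving for Q = A (L^T)^-1 with R = L^T. Forming A^T A squares the condition number of A, which is the usual reason the method is considered less stable than a Givens-based QR. The names below (CholeskyQrSketch, gram, cholesky, forwardSolve, choleskyQr, orthogonalityError) are made up for illustration, and plain Scala arrays stand in for distributed matrices.

```scala
object CholeskyQrSketch {
  // Dense row-major matrix; a hypothetical stand-in for a distributed row matrix.
  type Matrix = Array[Array[Double]]

  // Gram matrix G = A^T A (n x n) for an m x n input A.
  def gram(a: Matrix): Matrix = {
    val n = a.head.length
    Array.tabulate(n, n)((i, j) => a.map(row => row(i) * row(j)).sum)
  }

  // Lower-triangular Cholesky factor L with G = L L^T.
  // Assumes G is symmetric positive definite (A has full column rank).
  def cholesky(g: Matrix): Matrix = {
    val n = g.length
    val l = Array.ofDim[Double](n, n)
    for (i <- 0 until n; j <- 0 to i) {
      val s = (0 until j).map(k => l(i)(k) * l(j)(k)).sum
      l(i)(j) =
        if (i == j) math.sqrt(g(i)(i) - s)
        else (g(i)(j) - s) / l(j)(j)
    }
    l
  }

  // Solve L x = b for lower-triangular L by forward substitution.
  def forwardSolve(l: Matrix, b: Array[Double]): Array[Double] = {
    val x = new Array[Double](b.length)
    for (i <- b.indices) {
      val s = (0 until i).map(k => l(i)(k) * x(k)).sum
      x(i) = (b(i) - s) / l(i)(i)
    }
    x
  }

  // Cholesky QR: returns (Q, R) with A = Q R and R = L^T upper triangular.
  // Each row q of Q satisfies q R = a, i.e. L q^T = a^T for the matching row a of A.
  def choleskyQr(a: Matrix): (Matrix, Matrix) = {
    val l = cholesky(gram(a))
    val q = a.map(row => forwardSolve(l, row))
    (q, l.transpose)
  }

  // Largest absolute entry of Q^T Q - I; grows as A becomes ill-conditioned.
  def orthogonalityError(q: Matrix): Double = {
    val g = gram(q)
    val errors = for (i <- g.indices; j <- g.indices)
      yield math.abs(g(i)(j) - (if (i == j) 1.0 else 0.0))
    errors.max
  }

  def main(args: Array[String]): Unit = {
    val a: Matrix = Array(Array(1.0, 2.0), Array(3.0, 4.0), Array(5.0, 6.0))
    val (q, _) = choleskyQr(a)
    println(s"max |Q^T Q - I| = ${orthogonalityError(q)}")
  }
}
```

Running the same check on a matrix with nearly dependent columns is a quick way to see the loss of orthogonality that the stability remark refers to.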
Re: Removing MAHOUT_LOCAL option
We still have legacy code that uses the local Hadoop instance directly in a Java desktop application for a Stochastic SVD. But if the desire is to eliminate it, we have been inclined for a while to migrate everything to Spark.

Sorry, I’m old school and use MR, plus I’m new to Spark :) Is there an easy way to migrate your Spark example into the Java source code so that we do not disrupt the overall flow?

Have a great evening!
Mihai

> On 21 Mar 2016, at 19:31, Dmitriy Lyubimov wrote:
>
> My 1 cent (since it is less than 2) is that MAHOUT_LOCAL is part of the
> MR legacy packaging. As long as MR is still here (and I would say it
> needs to stay, unless it falls into complete disrepair and totally out
> of sync with even dated MapReduce APIs), MAHOUT_LOCAL needs to stay. As
> soon as MR goes, it goes too.
>
> Maybe we simply need a separate mahout script for non-legacy things,
> or to factor out the legacy-related shell things into another script
> (something like mahout-mr.sh instead of mahout.sh).
Re: Removing MAHOUT_LOCAL option
I haven't, but if you'd like to try it out and report back I'd love to hear about it.

The MR jobs are staying in for now; there is no active move to remove them.

On Mon, Mar 21, 2016 at 12:20 AM, David Starina wrote:

> Has anyone tried to run the deprecated MapReduce code on Ignite? Is the
> performance improvement good enough to reconsider leaving those
> algorithms in Mahout?
Re: Removing MAHOUT_LOCAL option
My 1 cent (since it is less than 2) is that MAHOUT_LOCAL is part of the MR legacy packaging. As long as MR is still here (and I would say it needs to stay, unless it falls into complete disrepair and totally out of sync with even dated MapReduce APIs), MAHOUT_LOCAL needs to stay. As soon as MR goes, it goes too.

Maybe we simply need a separate mahout script for non-legacy things, or to factor out the legacy-related shell things into another script (something like mahout-mr.sh instead of mahout.sh).

On Mon, Mar 21, 2016 at 8:45 AM, Suneel Marthi wrote:

> Some background on this issue:
>
> 1. Now that we have supported Spark and H2O as back ends since 0.10.0,
> with Flink coming soon in 0.12.0, it's been bloating the size of our
> release artifacts when pushing releases to Apache mirrors. Hence we
> were looking at pruning some of the components that have not been used
> or have long been marked deprecated and are not being worked on.
>
> 2. Since the Mahout 0.7 release in June 2012, the project has diverged
> from the MiA book even for legacy MapReduce. Not sure if that's indeed
> helping onboard new users.
>
> 3. It seems the consensus so far, based on the user responses, is to
> retain the MAHOUT_LOCAL option. Thanks all for your responses.
Re: Removing MAHOUT_LOCAL option
Some background on this issue:

1. Now that we have supported Spark and H2O as back ends since 0.10.0, with Flink coming soon in 0.12.0, it's been bloating the size of our release artifacts when pushing releases to Apache mirrors. Hence we were looking at pruning some of the components that have not been used or have long been marked deprecated and are not being worked on.

2. Since the Mahout 0.7 release in June 2012, the project has diverged from the MiA book even for legacy MapReduce. Not sure if that's indeed helping onboard new users.

3. It seems the consensus so far, based on the user responses, is to retain the MAHOUT_LOCAL option. Thanks all for your responses.

On Mon, Mar 21, 2016 at 11:38 AM, scott cote wrote:

> One more comment - I understand that it only works for the legacy code.
> Kill it when the legacy code is no longer deprecated, but gone ….
>
> Otherwise - you will shut out people who buy the older Mahout books
> (such as MIA), which are still good reads, even though the tech is dated.
>
> SCott
Re: Removing MAHOUT_LOCAL option
One more comment - I understand that it only works for the legacy code. Kill it when the legacy code is no longer deprecated, but gone ….

Otherwise - you will shut out people who buy the older Mahout books (such as MIA), which are still good reads, even though the tech is dated.

SCott

> On Mar 21, 2016, at 2:24 AM, David Starina wrote:
>
> Anyhow, I'm +1 for removing MAHOUT_LOCAL, but I believe the deprecated
> MapReduce-based code still makes sense if it is running well on Ignite.
Re: Removing MAHOUT_LOCAL option
I know that I’m not a contributor, but the local option allowed me to get into the use of Mahout without a lot of upfront cost. Please don’t lose sight of acquiring new users of Mahout - the local install is a key part of that process.

SCott

> On Mar 19, 2016, at 11:01 PM, Andrew Musselman wrote:
>
> We're discussing removing the MAHOUT_LOCAL option in order to trim
> artifact sizes.
>
> If you think keeping the option to use MAHOUT_LOCAL for testing with the
> single-node mode of Hadoop is important please let us know. It can be
> handy for trying things out but it would be nice to ditch the effort
> required to maintain it.
>
> See https://issues.apache.org/jira/browse/MAHOUT-1705 for more context.
>
> Thanks!
Re: Removing MAHOUT_LOCAL option
Anyhow, I'm +1 for removing MAHOUT_LOCAL, but I believe the deprecated MapReduce-based code still makes sense if it is running well on Ignite.

On Mon, Mar 21, 2016 at 8:20 AM, David Starina wrote:

> Has anyone tried to run the deprecated MapReduce code on Ignite? Is the
> performance improvement good enough to reconsider leaving those
> algorithms in Mahout?
Re: Removing MAHOUT_LOCAL option
Has anyone tried to run the deprecated MapReduce code on Ignite? Is the performance improvement good enough to reconsider leaving those algorithms in Mahout?

On Mon, Mar 21, 2016 at 12:45 AM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:

> Yes I agree; will leave the question open a couple days.
Re: Removing MAHOUT_LOCAL option
Yes I agree; will leave the question open a couple days.

On Sunday, March 20, 2016, Pat Ferrel wrote:

> Maybe a better user question is: How many people are still using the
> deprecated Hadoop code?
>
> If the number is small +1 for removal.
Re: Removing MAHOUT_LOCAL option
Maybe a better user question is: How many people are still using the deprecated Hadoop code?

If the number is small +1 for removal.

On Mar 20, 2016, at 11:04 AM, Andrew Musselman wrote:

> To clarify, the MAHOUT_LOCAL option only works for legacy Hadoop
> MapReduce-based jobs which officially became deprecated in 0.10.0.
Re: Removing MAHOUT_LOCAL option
To clarify, the MAHOUT_LOCAL option only works for legacy Hadoop MapReduce-based jobs which officially became deprecated in 0.10.0.

On Sun, Mar 20, 2016 at 10:25 AM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:

> Yes as I understand it.
Re: Removing MAHOUT_LOCAL option
Yes as I understand it.

On Sunday, March 20, 2016, Pat Ferrel wrote:

> Are we just talking about Hadoop MapReduce? I thought it was ignored
> when using Spark.
Re: Removing MAHOUT_LOCAL option
Are we just talking about Hadoop MapReduce? I thought it was ignored when using Spark.

On Mar 20, 2016, at 8:20 AM, alok tanna wrote:

> -1 MAHOUT_LOCAL is very useful for quick POC.
>
> Thanks,
> Alok Tanna
Re: Removing MAHOUT_LOCAL option
-1 MAHOUT_LOCAL is very useful for quick POC.

Thanks,
Alok Tanna

> On Mar 20, 2016, at 5:01 AM, Mihai Dascalu wrote:
>
> -1 I still use it for fast deployment and it’s really helpful for small
> local processing
>
> Have a great weekend!
> Mihai
Re: Removing MAHOUT_LOCAL option
-1 I still use it for fast deployment and it’s really helpful for small local processing

Have a great weekend!
Mihai

> On 20 Mar 2016, at 06:13, Suneel Marthi wrote:
>
> +1 to remove this
Re: Removing MAHOUT_LOCAL option
+1 to remove this

> On Mar 20, 2016, at 12:01 AM, Andrew Musselman wrote:
>
> We're discussing removing the MAHOUT_LOCAL option in order to trim
> artifact sizes.
>
> If you think keeping the option to use MAHOUT_LOCAL for testing with the
> single-node mode of Hadoop is important please let us know. It can be
> handy for trying things out but it would be nice to ditch the effort
> required to maintain it.
>
> See https://issues.apache.org/jira/browse/MAHOUT-1705 for more context.
>
> Thanks!