[
https://issues.apache.org/jira/browse/MAHOUT-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Palumbo updated MAHOUT-1838:
---
Attachment: drmSamplePlot2d.png
> Provide and plotting capabilities for Mahout matrices and DR
[
https://issues.apache.org/jira/browse/MAHOUT-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Palumbo updated MAHOUT-1838:
---
Assignee: (was: Andrew Palumbo)
> Provide and plotting capabilities for Mahout matrices an
Andrew Palumbo created MAHOUT-1838:
--
Summary: Provide and plotting capabilities for Mahout matrices and
DRMs
Key: MAHOUT-1838
URL: https://issues.apache.org/jira/browse/MAHOUT-1838
Project: Mahout
Not sure if they have fkmeans. It wouldn't surprise me if they do. They do have
quite a few algorithms though.
Btw. Have you checked Weka for fkmeans? I think they may have an
implementation built in.
Problem with smile is they have slf4j classes but no artifacts in their build
files.
Al
Hi!
SMILE has every machine learning algorithm except a fuzzy clustering
algorithm. :(
Am I correct?
Regards
Prakash
On Fri, Apr 29, 2016 at 12:40 AM, Andrew Palumbo wrote:
> Has anyone had any luck getting this project to build? Or integrating
> its artifacts with anything else?
>
>
> http
Thanks, this helps, I hope to have a proposal to dev outlining some use cases
in the next few weeks.
> From: ap@outlook.com
> To: dev@mahout.apache.org
> Subject: Re: Mahout contributions
> Date: Fri, 29 Apr 2016 00:03:41 +
>
> One last thing, Saikat, in answer to your question below. T
One last thing, Saikat, in answer to your question below. To clarify, for
proposed smaller scale mahout contributions (not on the roadmap or in currently
open Jiras):
a good workflow would be as follows:
1. Investigate your idea independently
2. Float the proposal to dev@,
3. Allow some time
the grid should be upgraded it may take some time to publish.
From: Andrew Palumbo
Sent: Thursday, April 28, 2016 5:24 PM
To: mahout
Subject: Deprecated MR algorithms
To avoid any confusion, unless there are any objections, I'll mark all of the
MapReduce
Has anyone had any luck getting this project to build? Or integrating its
artifacts with anything else?
https://github.com/haifengl/smile
There might be a concept of a "contrib" subproject with a totally separate
code tree; some ASF projects do that. That way it is easy to keep it around
if it turns out to be useful, and easy to strip off if it becomes
unsupported (sorry for the pragmatic cynicism).
On Thu, Apr 28, 2016 at 2:48 PM, Khurrum
Andrew/Khurrum,
To be clear this project involves building some algorithms that are not yet
implemented in spark based on the wiki (namely the clustering algorithms) and
then integrating them into elasticsearch and kibana through a rest API. Mahout
will remain as is, I will look at Prediction.i
I agree with Andrew. Mahout should remain indigenous.
Prakash - you may want to create your own project on github using the mahout
library.
> On Apr 28, 2016, at 5:43 PM, Andrew Palumbo wrote:
>
> I don't think that this sort of integration work would be a good fit
> directly to
Dear Dmitriy,
I really appreciate you writing at such length to clarify my confusion.
Thank you so much :)
Regards
Prakash Poudyal
On Thu, Apr 28, 2016 at 10:13 PM, Dmitriy Lyubimov
wrote:
> Prakash,
>
> (1) to be clear, the ASF trademark and branding policy is not to endorse
>
I don't think that this sort of integration work would be a good fit
directly to the Mahout project. Mahout is more about math, algorithms and an
environment to develop algorithms. We stay away from direct platform
integration. In the past we did have some elasticsearch/mahout integration
To avoid any confusion, unless there are any objections, I'll mark all of the
MapReduce algorithms on the algorithms page as deprecated.
Prakash,
(1) to be clear, the ASF trademark and branding policy is not to endorse
views of the 3rd party publications and to ask 3rd party writers to do a
disclosure that their views are not endorsed by ASF project. To that end,
ASF project can't really tell you that some publication is
"(in)appro
@Prakash - Although I’m a Mahout noob - if you can represent your problem as a
network with 2d input then yes, Mahout can be used (so I’ve heard).
IMO - every machine based computation problem can be represented as a graph -
although this may not always be optimal.
Taking this notion of fuzzy clus
Dear Suneel, Dmitriy and Ted,
This is just a gentle reminder to answer the confusion that I mentioned in my
previous email. It would be great if you could respond to me soon, so that
I can go ahead.
Thank you so much.
Prakash
On Thu, Apr 28, 2016 at 8:02 PM, Prakash Poudyal
wrote:
> Hi!
>
> Thank
[
https://issues.apache.org/jira/browse/MAHOUT-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262827#comment-15262827
]
Andrew Palumbo commented on MAHOUT-1837:
The first fix is trivial (that's the act
(sorry for repetition, the list rejects my previous replies due to quoted
message size)
"Auto" just reclusters the input per given _configured cluster capacity_
(there's some safe guard there though i think that doesn't blow up # of
splits if the initial number of splits is ridiculously small thou
Hmm, can’t get images through the Apache mail servers.
The image is here:
https://drive.google.com/file/d/0B4cAk1SMC1ChWFZiRG9DSEpkdzg/view?usp=sharing
On Apr 28, 2016, at 11:55 AM, Pat Ferrel wrote:
Actually on your advice Dmitriy I think these changes went in about 11. Before
11 par was n
Hi!
Thank you for your emails !!
Actually, I need to use fuzzy clustering to cluster the sentences in my
research. This is my goal.
I started using Mahout's Fuzzy K-means clustering last week! I
found several blog links and many other helpful documents that I was
going through, as b
[
https://issues.apache.org/jira/browse/MAHOUT-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262703#comment-15262703
]
Suneel Marthi commented on MAHOUT-1837:
---
Wouldn't it be easier to separate the 2 ou
That's correct, deprecated as of Feb 2014 and will be completely purged in
one of the upcoming releases (0.13.0)
On Thu, Apr 28, 2016 at 2:10 PM, Dmitriy Lyubimov wrote:
> Prakash,
>
> if you are using any Mahout Mapreduce algorithm for research, please make
> sure to make this disclosure:
>
> a
Prakash,
if you are using any Mahout Mapreduce algorithm for research, please make
sure to make this disclosure:
all Mahout MapReduce algorithms are officially not supported and deprecated
since February, 2014 (IIRC). I can dig up a specific issue regarding this.
There also has been an announceme
Yes, the entire MapReduce code (which includes the fuzzy clustering that you
are looking at) is not supported anymore as of Mahout 0.10.0 (I suggest reading
the release notes on mahout.apache.org).
On Thu, Apr 28, 2016 at 2:05 PM, Prakash Poudyal
wrote:
> Hi! Ted,
>
> You mean Mahout is no more suppor
On Thu, Apr 28, 2016 at 1:54 PM, Prakash Poudyal
wrote:
> Dear Suneel,
>
> Thank you so much for your reply; I was waiting for a long time.
>
> Actually, I need to use fuzzy clustering to cluster the sentences in my
> research. I found the fuzzy k clustering algorithm in Apache Mahout, thus, I
> am try
Hi! Ted,
You mean Mahout is no longer supporting "fuzzy K clustering for the
sentences"? Can you clarify in more detail? :(
Prakash
On Thu, Apr 28, 2016 at 6:58 PM, Ted Dunning wrote:
> On Thu, Apr 28, 2016 at 10:54 AM, Prakash Poudyal <
> prakashpoud...@gmail.com>
> wrote:
>
> > Actually, I ne
On Thu, Apr 28, 2016 at 10:54 AM, Prakash Poudyal
wrote:
> Actually, I need to use fuzzy clustering to cluster the sentences in my
> research. I found the fuzzy k clustering algorithm in Apache Mahout, thus, I
> am trying to use it for my purpose.
>
That's great.
But that code is no longer supporte
Dear Suneel,
Thank you so much for your reply; I was waiting for a long time.
Actually, I need to use fuzzy clustering to cluster the sentences in my
research. I found the fuzzy k clustering algorithm in Apache Mahout, thus, I
am trying to use it for my purpose.
Regarding your reply, of "first thing"
I want to start with social data as an example, for example data returned from
the FB graph API as well as user Twitter data. I will send some samples later if
you're interested.
Sent from my iPhone
> On Apr 28, 2016, at 10:41 AM, Khurrum Nasim wrote:
>
>
> What type of JSON payload size are we talki
What type of JSON payload size are we talking about here?
> On Apr 28, 2016, at 1:32 PM, Saikat Kanjilal wrote:
>
> Because EL gives you the visualization and non Lucene type query constructs
> as well and also that it already has a rest API that I plan on tying into
> mahout. I plan on wra
Because EL gives you the visualization and non Lucene type query constructs as
well and also that it already has a rest API that I plan on tying into mahout.
I plan on wrapping some of the clustering algorithms that I implement using
Mahout and Spark as a service which can then make calls into
First thing, most of this code is legacy MapReduce and is not supported
anymore. Hence you are not seeing answers.
Back to your question: -c specifies the folder for the initial centroids that
are randomly generated. IIRC, the centroids are generated when you execute the
Clustering Driver.
On Wed, Apr 2
@Saikat - why use EL instead of Lucene directly?
> On Apr 28, 2016, at 12:08 PM, Saikat Kanjilal wrote:
>
> This is great information thank you, based on this recommendation I won't
> create a JIRA but start work on my project and when the code approaches the
> percentages you are describing
Yes.
Parallelism in Spark makes all the difference.
Since scatter-type exchange in Spark increases I/O as the # of splits
increases, strong scaling is not achievable. If you just keep increasing
parallelism, there's a point where individual cpu load decreases but
cumulative IO cancels out a
This is great information, thank you. Based on this recommendation I won't
create a JIRA but will start work on my project, and when the code approaches the
percentages you are describing I will create the appropriate JIRAs and put
together a proposal to send to the list. Sound OK? Based on your late
Hi,
Ok, so interestingly enough when I repartition my input data across indicators
on the User IDs, I get significant speedup. This is probably because shuffle
goes down since RDDs with the same user ids are more likely located on the same
nodes. What’s even more interesting is the behaviour as