is why I'm
considering this Streaming approach now.
Do you think it is worth giving a shot? I'm really
striving for a scalable solution.
Best regards,
Marko
On Tue 02 Jun 2015 12:03:40 AM CEST, Ted Dunning wrote:
The streaming k-means works by building a sketch of th
igger problems than K-means because it's not
scalable, but can be useful in some cases (e.g. it allows more
sophisticated distance measures).
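The sketch-building pass described above can be illustrated very roughly in plain Java. This is a hypothetical one-dimensional toy (the class names and the fixed threshold are mine, and the periodic sketch-collapse step of the real algorithm is omitted):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-pass sketch: each incoming point either merges into the
// nearest sketch centroid or, if it is far away, becomes a new centroid.
public final class StreamingSketch {
    static final class Centroid {
        double mean; long weight;
        Centroid(double v) { mean = v; weight = 1; }
    }

    private final List<Centroid> sketch = new ArrayList<>();
    private final double threshold;

    StreamingSketch(double threshold) { this.threshold = threshold; }

    void add(double point) {
        Centroid nearest = null;
        double best = Double.POSITIVE_INFINITY;
        for (Centroid c : sketch) {
            double d = Math.abs(c.mean - point);
            if (d < best) { best = d; nearest = c; }
        }
        if (nearest == null || best > threshold) {
            sketch.add(new Centroid(point));   // far away: start a new centroid
        } else {
            nearest.weight++;                  // close: fold into a running mean
            nearest.mean += (point - nearest.mean) / nearest.weight;
        }
    }

    int size() { return sketch.size(); }
}
```

The real streaming k-means also grows the threshold and collapses the sketch when it exceeds a size budget, which is what keeps memory bounded.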
What is your opinion about implementation of this?
Best regards,
Marko
Hello everyone,
I was digging through the K-means implementation on Hadoop and I'm a bit
confused about one thing, so I wanted to check.
To calculate the distance from a point to all centroids, the centroids need to
be accessible from every mapper.
So it seemed logical to me to put the centroids (sequenceF
th the implementation,
if it doesn't sound that crazy.
I wish you all the best,
Marko
Quoting Ted Dunning :
On Thu, Jan 15, 2015 at 3:50 AM, Marko Dinic
wrote:
Thank you for your answer. Maybe I gave a wrong picture of my data when
giving a sinusoid as an example; my time series are
calculations,
how much time could I expect for such an algorithm in the case of 10,000
signals with 300 points, for example? How can I even estimate that?
Thanks for your effort, if you have time to answer.
Regards,
Marko
On Thu 15 Jan 2015 05:25:55 AM CET, Anand Avati wrote:
Perhaps you could think o
scalable solution for my problem. I tried to
fit it into what's already implemented in Mahout (for clustering), but
it's not so obvious to me.
I'm open to suggestions, I'm still new to all of this.
Thanks,
Marko
On Sat 10 Jan 2015 07:32:33 AM CET, Ted Dunning wrote:
Why is i
about the scalability?
I would highly appreciate your answer, thanks.
On Thu 08 Jan 2015 08:19:18 PM CET, Ted Dunning wrote:
On Thu, Jan 8, 2015 at 7:00 AM, Marko Dinic
wrote:
1) Is there an implementation of DTW (Dynamic Time Warping) in Mahout that
could be used as a distance measure for cluster
Hello everyone.
I have a couple of questions.
1) Is there an implementation of DTW (Dynamic Time Warping) in Mahout
that could be used as a distance measure for clustering?
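For reference, the classic DTW recurrence itself is compact. Here is a hypothetical stand-alone sketch (class name is mine; it is not tied to Mahout's DistanceMeasure interface) computing the O(n*m) dynamic-programming alignment cost:

```java
import java.util.Arrays;

// Hypothetical minimal DTW sketch: d[i][j] is the cheapest cost of aligning
// the first i points of a with the first j points of b.
public final class DtwSketch {
    public static double distance(double[] a, double[] b) {
        int n = a.length, m = b.length;
        double[][] d = new double[n + 1][m + 1];
        for (double[] row : d) Arrays.fill(row, Double.POSITIVE_INFINITY);
        d[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double cost = Math.abs(a[i - 1] - b[j - 1]);
                // extend the cheapest of the three allowed predecessor alignments
                d[i][j] = cost + Math.min(d[i - 1][j - 1],
                                          Math.min(d[i - 1][j], d[i][j - 1]));
            }
        }
        return d[n][m];
    }
}
```

Note that DTW is not a metric (it violates the triangle inequality), which matters for clustering algorithms that assume one.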
2) Why isn't there an implementation of K-mediods in Mahout? I'm
guessing that it could not be implemented efficiently
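One reason k-medoids is hard to scale is the medoid-update step: every candidate medoid must be compared against every other member of its cluster, which is O(n^2) per cluster and needs all pairwise distances at hand. A hypothetical sketch of just that step (names are mine):

```java
import java.util.List;
import java.util.function.ToDoubleBiFunction;

// Hypothetical medoid-update sketch: the medoid is the cluster member that
// minimizes the summed distance to all other members.
public final class MedoidSketch {
    public static <T> T medoid(List<T> cluster, ToDoubleBiFunction<T, T> dist) {
        T best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (T candidate : cluster) {
            double cost = 0.0;
            for (T other : cluster) {
                cost += dist.applyAsDouble(candidate, other);
            }
            if (cost < bestCost) { bestCost = cost; best = candidate; }
        }
        return best;
    }
}
```

By contrast, k-means only needs a running mean per cluster, which maps cleanly onto a combiner/reducer.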
Hello,
Sorry for bumping like this, but I have a very similar question: can I
use Mahout 0.9 with Hadoop 0.20.2?
Thanks
On Mon 15 Dec 2014 10:09:56 AM CET, jyotiranjan panda wrote:
Hi,
mahout-0.9 is compatible with hadoop-1.2.1
Regards
Jyoti Ranjan Panda
On Mon, Dec 15, 2014 at 2:33 PM, Le
since Hadoop is installed on the cluster?
I have never done a deployment to a cluster, so I'm really confused. Any
help would be great, as would any reference like the previous one.
Regards,
Marko
On Tue, 28 Oct 2014 17:12:59 CET, Chandramani Tiwary wrote:
Hi Marko,
Nothing special needs to b
xpect failures in case of it?
Regards,
Marko
On Tue, 28 Oct 2014 16:48:03 CET, Chandramani Tiwary wrote:
Hi Marko,
You can configure Mahout 0.9 over Hadoop 0.20.2 but the Hadoop dependencies
might lead to failures quite a few times. One example, if I remember
correctly, is that Hadoop 0
Hello,
I have a Hadoop cluster on which Hadoop 0.20.2 is installed. Is there a
way to use Mahout 0.9 on that cluster?
I understand that Mahout 0.9 is based on Hadoop 1.2.1, but I have this
constraint, so I cannot install another version of Hadoop on it.
Thanks,
Marko
o how many points?
Is it possible to share your dataset to troubleshoot?
On Thu, Oct 9, 2014 at 9:18 AM, Marko Dinić
wrote:
Suneel,
Thank you for your answer, this was rather strange to me.
The number of points is 942. I have multiple runs, in each run I have a
loop in which number of cluste
Here is the dataset.
On Thu, 09 Oct 2014 16:53:25 CEST, Marko Dinić wrote:
Yes it is small, but it is just a sample, so the dataset will probably
be much bigger. So you think that this was the problem? Will this
problem be avoided in the case of a larger dataset?
I think that there were
share your dataset to troubleshoot?
On Thu, Oct 9, 2014 at 9:18 AM, Marko Dinić
wrote:
Suneel,
Thank you for your answer, this was rather strange to me.
The number of points is 942. I have multiple runs, and in each run I have a
loop in which the number of clusters is increased in each iteration
take a look at this.
On Thu, Oct 9, 2014 at 5:39 AM, Marko Dinić
wrote:
Hello everyone,
I'm using Mahout Streaming K Means multiple times in a loop, every time
with the same input data, and the output path is always different. Concretely,
I'm increasing the number of clusters in each iteration.
Hello everyone,
I'm using Mahout Streaming K Means multiple times in a loop, every time
with the same input data, and the output path is always different. Concretely,
I'm increasing the number of clusters in each iteration. Currently it is run
on a single machine.
A couple of times (maybe 3 out of 20 runs) I
that is performed after the streaming step. The question that
arises is: when should the Ball K Means step be done, since the data arrives
all the time...
Should I even consider this, or should I go for lambda architecture?
Any help would be great.
Thanks,
Marko
ster-reuters.sh
that you have provided: what is it used for?
Thanks,
Marko
On Mon, 29 Sep 2014 20:00:33 CEST, Suneel Marthi wrote:
This was replied to earlier with the details you are looking for; repeating
here again:
See
http://stackoverflow.com/questions/17272296/how-to-use-mah
Hello everyone,
I have previously asked a question about Streaming K Means examples, and
got an answer that there are not many available.
Can anyone give me an example of how to call Streaming K Means clustering
on a dataset, and how to get the results?
What are the results, are they the s
Hello everyone,
I'm very sorry to bump in like this. I have been added to the mailing list
(I think), but it seems that I'm somehow unable to ask a question; that
is, I asked a question a few times and got no answer. I hope this way
will work.
I'm new to Mahout and I've been struggling with Stre
Hello,
I know that Mahout is used for batch processing, but I am interested in
whether, and how, I can use its KMeans for clustering individual points.
Let's say that we have the following situation:
* Global clustering, that performs batch processing on all data and
gives centroids as result
* One p
Configuration configuration = new Configuration();
configuration.set("--estimatedNumMapClusters", "18");
configuration.set("-k", "6");
configuration.set("--distanceMeasure",
"org.apache.mahout.common.distance.
item
> > description content) for example for product recommendation, how can I
> > customize the similarity function ? As far as I understand, the current
> > mahout similarity function is based on user rating only. Any one had
> > experience writing a custom item based similarity
> http://ehcache.org/
> > >
> > > For iterative MapReduce applications running on a NoSQL data store, it
> > > should provide a good performance boost by providing an in-memory
> object
> > > cache (I think). Any comments?
> >
>
--
--
Marko Ćirić
ciric.ma...@gmail.com
g Cassandra
> > and/or the non-distributed recommenders.
> >
> > Sean
> >
>
--
--
Marko Ćirić
ciric.ma...@gmail.com
You could also introduce clustering and build clusters from pages that have
a lot of similar words. If your pages data doesn't change too often, you
could select most similar pages from within a cluster and recommend it to a
user..
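For the "a lot of similar words" part, one hypothetical way to score page pairs is cosine similarity over raw term counts (names are mine; no Mahout API and no TF-IDF weighting):

```java
import java.util.Map;

// Hypothetical sketch: cosine similarity between two term-frequency maps.
public final class PageSimilaritySketch {
    public static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += (double) e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());       // shared terms contribute to the dot product
            if (other != null) dot += (double) e.getValue() * other;
        }
        for (int v : b.values()) normB += (double) v * v;
        return (normA == 0 || normB == 0)
                ? 0.0
                : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```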
On Aug 8, 2011 6:08 PM, "Marko Ciric" wrote:
>
You might want to use TanimotoCoefficientSimilarity if your data set isn't
large.
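For reference, the Tanimoto (Jaccard) coefficient over the sets of users associated with two items reduces to |A ∩ B| / |A ∪ B|. A hypothetical stand-alone sketch (class name is mine, not Mahout's TanimotoCoefficientSimilarity):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical Tanimoto sketch: intersection size over union size of the
// user-ID sets that interacted with each item.
public final class TanimotoSketch {
    public static double similarity(Set<Long> a, Set<Long> b) {
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int union = a.size() + b.size() - intersection.size();
        return union == 0 ? 0.0 : (double) intersection.size() / union;
    }
}
```

The "isn't large" caveat matters because with sparse overlap the coefficient gets noisy; that is part of why log-likelihood is suggested next.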
On Jul 27, 2011 10:51 AM, "Sean Owen" wrote:
> Sounds good. In that case, the surprise-n-coincidence counterpart you are
> probably looking for is LogLikelihoodSimilarity, which implements
> ItemSimilarity. Use it wi
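The log-likelihood ratio test underneath can be sketched in the entropy formulation. This follows the general shape of Mahout's LogLikelihood utility, but exact class details may differ by version, so treat it as an illustration: k11 is the count of users who liked both items, k12/k21 liked only one, k22 liked neither.

```java
// Hypothetical LLR sketch in the entropy formulation:
// LLR = 2 * (H(row sums) + H(column sums) - H(cells)).
public final class LlrSketch {
    static double xLogX(long x) { return x == 0 ? 0.0 : x * Math.log(x); }

    static double entropy(long... counts) {
        long sum = 0;
        double result = 0.0;
        for (long c : counts) { result += xLogX(c); sum += c; }
        return xLogX(sum) - result;
    }

    public static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // guard against tiny negative values from floating-point rounding
        if (rowEntropy + columnEntropy < matrixEntropy) return 0.0;
        return 2.0 * (rowEntropy + columnEntropy - matrixEntropy);
    }
}
```

Independent counts score near zero; strong co-occurrence scores high, without needing ratings at all.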
Correction: I didn't mean to re-implement the existing functionality, but
there should be an easy way to connect AUC with the Taste evaluators.
On 28 July 2011 12:57, Marko Ciric wrote:
> I think it wouldn't be a big problem to reimplement it, though it would
> have to have a sort o
l, we do have numerous ways to compute AUC. I don't think that they are
> integrated into the recommendation evaluation framework yet. Would you
> like
> to take on the application of suitable glue?
>
>
> On Mon, Jul 25, 2011 at 1:00 PM, Marko Ciric
> wrote:
>
> >
Hi guys,
I'm wondering if any resources or tutorials are available (and where) about
calculating AUC when working with boolean-preference data models?
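With boolean preferences there are no predicted ratings to threshold, so one common reading of AUC is rank-based: the probability that a held-out "liked" item is scored above a random non-liked item. A hypothetical sketch (names are mine; this is not a Taste evaluator):

```java
// Hypothetical rank-based AUC sketch over recommender scores: count the
// positive/negative pairs where the positive wins, with ties worth half.
public final class AucSketch {
    public static double auc(double[] positiveScores, double[] negativeScores) {
        long wins = 0, ties = 0;
        for (double p : positiveScores) {
            for (double n : negativeScores) {
                if (p > n) wins++;
                else if (p == n) ties++;
            }
        }
        long pairs = (long) positiveScores.length * negativeScores.length;
        return pairs == 0 ? 0.0 : (wins + 0.5 * ties) / pairs;
    }
}
```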
--
--
Marko Ćirić
ciric.ma...@gmail.com
On Mon, Jul 25, 2011 at 3:16 AM, Marko Ciric
> wrote:
>
> > The better way to do it is to implement an evaluator which accepts the
> > collection of items that are relevant.
> >
>
--
--
Marko Ćirić
ciric.ma...@gmail.com
difficulty is including
> it in a clean way. Up for a patch?
>
>
>
> >
> > Finally, I believe the documentation page has some mistakes in the last
> code
> > excerpt:
> >
> > evaluator.evaluate(builder, myModel, null, 3,
> > RecommenderIRStatusEvaluator.CHOOSE_THRESHOLD,
> > 1.0);
> >
> > should be
> > evaluator.evaluate(builder, null, myModel, null, 3,
> > GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);
> >
> >
> > OK will look at that.
>
--
--
Marko Ćirić
ciric.ma...@gmail.com
Also, the evaluation could be done per user, by manually running it
multiple times, once for each user. Or simply by defining a matrix with
relevant items for each user.
On Jul 21, 2011 4:18 PM, "Marko Ciric" wrote:
> Yes, there should exist an evaluation that allows you to pass whic
tings.
> It has to pick random items as "relevant", for starters. It's another
> reason
> your idea is good, to let the user specify those relevant items.
>
> On Thu, Jul 21, 2011 at 1:49 PM, Marko Ciric
> wrote:
>
> > Hi guys,
> >
> > I wonder if
items, the precision and
recall would have the same value. Is this OK or is it a bug, given that
precision = intersection / num_recommended_items (where
num_recommended_items is almost always "at")
recall = intersection / num_relevant_items (also "at" as the previously
mention
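The two formulas in question can be sketched directly (hypothetical helper, not Mahout's evaluator): when the number of relevant items per user happens to equal "at", the two denominators coincide, and precision@at equals recall@at by construction.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical precision@at / recall@at sketch:
// precision = |top-at ∩ relevant| / at, recall = |top-at ∩ relevant| / |relevant|.
public final class IrStatsSketch {
    public static double precisionAt(List<Long> recommended, Set<Long> relevant, int at) {
        return (double) hits(recommended, relevant, at) / at;
    }

    public static double recallAt(List<Long> recommended, Set<Long> relevant, int at) {
        return relevant.isEmpty() ? 0.0
                : (double) hits(recommended, relevant, at) / relevant.size();
    }

    private static long hits(List<Long> recommended, Set<Long> relevant, int at) {
        Set<Long> top = new HashSet<>(recommended.subList(0, Math.min(at, recommended.size())));
        top.retainAll(relevant);
        return top.size();
    }
}
```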
M, Vitali Mogilevsky
> > >> > wrote:
> > >> >
> > >> >> Hey,
> > >> >> I got the same problem of slowness while using the MySQL data model.
> > >> >> Some research and a look into MySQL's query log revealed that
> > >> >> user-user recommendation just floods the database with thousands and
> > >> >> thousands of requests, and that's on a small database.
> > >> >> For now I'm dumping the database into a file and using the file data
> > >> >> model, which works much faster.
> > >> >>
> > >> >>
> > >> >
> > >>
> > >
> >
>
--
--
Marko Ćirić
ciric.ma...@gmail.com
rences?
>
> Thanks!
>
>
> On 04.07.2011 12:39, Marko Ciric wrote:
> >
> > Hi Em,
> >
> > If I understood correctly what you're asking, you could implement a new
> > CandidateItemsStrategy class. If you look at that interface, there's a
> > method ge
Hi Em,
If I understood correctly what you're asking, you could implement a new
CandidateItemsStrategy class. If you look at that interface, there's a
method getCandidateItems(long userID, DataModel dataModel) that has all
the parameters you need in order to filter out items that belong to the
unwanted
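Stripped of the Mahout types, the filtering idea looks like this hypothetical sketch (the category maps and all names are made up for illustration):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical candidate-filtering sketch: given each item's category and a
// per-user set of unwanted categories, keep only the items the strategy
// should hand to the recommender.
public final class CandidateFilterSketch {
    public static Set<Long> candidates(long userId,
                                       Map<Long, String> itemCategory,
                                       Map<Long, Set<String>> unwantedByUser) {
        Set<String> unwanted = unwantedByUser.getOrDefault(userId, Set.of());
        Set<Long> result = new HashSet<>();
        for (Map.Entry<Long, String> e : itemCategory.entrySet()) {
            if (!unwanted.contains(e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```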
quality or satisfaction indicator and a
> per-user current model indicator, then you might be able to use these
> as a feature for an interesting "if it ain't broke, don't fix it"
> stacking model.
>
> On Thu, Jun 9, 2011 at 3:51 PM, Marko Ciric wrote:
> &
framework.
But recently there has been talk about switching all of this to use fastutil
(?)
On Thu, Jun 23, 2011 at 2:25 PM, Marko Ciric wrote:
How similar are Mahout collections (like FastMap) to Colt (cern.colt)?
--
Marko Ćirić
ciric.ma...@gmail.com
How similar are Mahout collections (like FastMap) to Colt (cern.colt)?
--
Marko Ćirić
ciric.ma...@gmail.com
> >>>
> >>>> I have used the SGD classifiers for content based recommendation. It
> >>> works
> >>>> out reasonably but the interaction variables can get kind of
> expensive.
> >>>>
> >>>> Doing it again, I t
the experience with comparing performance/accuracy of those?
Thanks
--
Marko Ćirić
ciric.ma...@gmail.com
eatures is required
first if I'm correct. What features to use when the recommended items (that
need to be classified) are a result of different recommenders that use
different similarity calculation (only a "brand" recommender is using an
item feature here and CF and top-40 recommenders
e existing recommender
evaluators to evaluate my content-based recommender. Any hints?
--
--
Marko Ćirić
ciric.ma...@gmail.com