You'll have to work out the ES query JSON yourself. Index the categories as 
arrays of un-analysed strings.

ES docs indexed:
  cluster_1: ["category 1", "category 2"]
  cluster_2: ["category 5", "category 10", …]

User history:
  user_purchase_history: ["category 1", "category 2"]

So the query would be ["category 1", "category 2"], and it would return the 
clusters with cluster_1 ranked highest.
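
As a rough sketch of what that query JSON might look like, here is one way to 
build it in Python. The field name `categories` and the helper name are 
hypothetical, and the sketch assumes the field is mapped as an un-analysed 
(keyword) field so each category string is matched whole. It uses a bool query 
with one `term` clause per purchased category under `should`, so docs matching 
more clauses should score higher. This is only one possible query shape, not 
necessarily the one a template would generate.

```python
import json


def build_cluster_query(purchased_categories, field="categories"):
    """Build a bool/should query with one term clause per purchased category.

    `field` is a hypothetical field name holding each cluster's category list.
    """
    should = [{"term": {field: category}} for category in purchased_categories]
    return {"query": {"bool": {"should": should}}}


user_purchase_history = ["category 1", "category 2"]
query_body = build_cluster_query(user_purchase_history)

# This JSON would be sent as the body of a request to the index's _search endpoint.
print(json.dumps(query_body, indent=2))
```

The hits come back ordered by score, so the first hit's id would be the 
best-matching cluster.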

As you can see, the terms in the user's history can be used as a query to 
return the cluster ID that is most similar. This is called K-Nearest Neighbors 
(KNN) and is done using cosine similarity. ES and Solr (both based on Lucene) 
are great KNN engines for sparse data.
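
To make the similarity idea concrete, here is a small stand-alone sketch that 
ranks the clusters by cosine similarity over binary category vectors (a 
category is either present or absent). It is only an illustration of the 
concept. Lucene's actual scoring also weights terms, so real ES scores will 
differ from these numbers. The function name is made up for the example, and 
only the categories listed above are used (the rest of cluster_2 is elided in 
the example data).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two category lists treated as binary vectors."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    # For 0/1 vectors the dot product is the overlap size and each norm is sqrt(size).
    return len(a & b) / math.sqrt(len(a) * len(b))


clusters = {
    "cluster_1": ["category 1", "category 2"],
    "cluster_2": ["category 5", "category 10"],  # further categories elided in the source
}
user_purchase_history = ["category 1", "category 2"]

# Rank cluster ids from most to least similar to the user's history.
ranked = sorted(
    clusters,
    key=lambda cid: cosine_similarity(user_purchase_history, clusters[cid]),
    reverse=True,
)
print(ranked)
```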


On Jul 7, 2017, at 4:30 AM, Luciano Vandi <[email protected]> wrote:

Thanks Pat, you're right. This is what I'm trying to do.

It's not clear to me how to query ElasticSearch with user’s history of bought 
item categories. Can you make an example? 

2017-07-06 23:13 GMT+02:00 Pat Ferrel <[email protected]>:
Actually, it sounds like you already have clusters that are made up of 
categories, and you want to know which cluster definition is most similar to 
what the user has bought. If so, you don’t need clustering but similarity. This 
is pretty easy to do: put each cluster into Elasticsearch as a doc with a list 
of categories (so 6 or so docs), then use the user’s history of bought-item 
categories as the query. You’ll get all clusters ranked from most similar (to 
the user’s history) to least.
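
A minimal sketch of preparing those cluster docs, under several assumptions: 
the index name `clusters`, the field name `categories`, and the helper name 
are hypothetical. The mapping is written in the typeless form used by recent 
Elasticsearch versions, and older versions would also need a document type. 
The code only builds the request bodies (a keyword mapping so the strings are 
not analysed, and a newline-delimited payload in the Bulk API format). It does 
not talk to a cluster.

```python
import json

# Hypothetical cluster definitions: cluster id -> list of category names.
clusters = {
    "cluster_1": ["category 1", "category 2"],
    "cluster_2": ["category 5", "category 10"],
}

# Mapping that keeps each category string un-analysed (exact match only).
mapping = {"mappings": {"properties": {"categories": {"type": "keyword"}}}}


def build_bulk_payload(clusters, index="clusters", field="categories"):
    """Return newline-delimited JSON: an action line, then the doc, per cluster."""
    lines = []
    for cluster_id, categories in clusters.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": cluster_id}}))
        lines.append(json.dumps({field: categories}))
    # Bulk bodies must end with a trailing newline.
    return "\n".join(lines) + "\n"


bulk_payload = build_bulk_payload(clusters)
print(json.dumps(mapping))
print(bulk_payload)
```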

You would have to store user history on your own.

This could be put into a simple template, but if you already have user history, 
it may be overkill.



On Jul 6, 2017, at 1:39 PM, Pat Ferrel <[email protected]> wrote:

There are 2 clustering templates, but it looks like they both need to be moved 
from Prediction.io to Apache PIO, which should be easy. See the template 
gallery here: 
http://predictionio.incubator.apache.org/gallery/template-gallery/


On Jul 6, 2017, at 12:35 PM, Luciano Vandi <[email protected]> wrote:

Hi there, I'm new to the mailing list. Thanks to the guys at Apache.org, 
ActionML, and to anyone from the community!

I have a question regarding a project I'm working on. From a database of 
customers/orders I would like to export buy/view events in order to assign each 
customer to one or more of 6 predefined clusters. Each cluster reflects the 
macro-category associated with the bought/viewed item.

Then I would like to query a service to get all customers within a cluster, or 
all clusters to which a customer belongs.

Is there any PIO template I should start to explore, or do I need to ask the 
ActionML team for consulting?

Have a nice day!


Luciano
--





-- 

  Soluzioni PaaS e SaaS per il Commercio Elettronico
  Email: [email protected]
  Mobile: (+39) 340 90 21 354
