Re: Need a Suggessations

Vaghawan Ojha Thu, 23 Mar 2017 20:04:54 -0700

Hi Pat,

Thank you so much for giving me such a clear idea; it really did help me
a lot. This is the very first time I'm working with big data, and I hope it
won't be that bad.


I will set it up as you recommended, and I will come back to ask whenever
there is something I need to know, which will be very often.

Thank You
Vaghawan

On Fri, Mar 24, 2017 at 3:23 AM, Pat Ferrel <[email protected]> wrote:

> Think of the recommender as a single app. It scales to whatever your
> data size is via the services it is built on. We often see that using a
> recommender is people’s first experience with really big data. Other tools
> and services you use outside of it are fine because they do not deal with
> such large data. Recommenders force you to process every interaction that
> all your users have made over perhaps a year, and to do it often. There are
> few other apps that require this. Welcome to Big Data.
>
> MySQL is fine to run your app, as you no doubt know. The “model” built in a
> recommender is generally not human-readable, but in the case of the UR you
> can understand it with some experience. It lives in Elasticsearch, while the
> user interactions live in HBase. The user events can be looked at, but I'm
> not sure why you’d want to; they are condensed snippets of server logs.
>
> In any case it may help to think of the model in Elasticsearch as a
> product catalog. It defines which items can be recommended and has an
> entry for each item, with machine-learning-calculated attributes attached
> that indicate the type of user that prefers each item. But the model also
> contains item properties/attributes that you may want to include for
> business rules.
>
> The Recommender is easily accessed from your app through the input and
> query APIs. You can change attributes of items by sending special input
> events. Queries are defined to match the kinds of things recommenders with
> business rules do. The model can be seen through the Elasticsearch APIs, but
> direct manipulation of it is discouraged since its meaning or format may
> change with any update.
>
> Plan to use the PIO query API; it will respond in real time, with latency
> on the order of 25ms, and handles multiple simultaneous connections/queries.
> There would be no reason to pull data out of the UR and put it in a
> database; you would lose the ability to react to the user’s real-time
> behavior, which is used to make recommendations. Stick to the input/query
> APIs and feed data into the UR in real time and you’ll get the most benefit.
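
As a rough illustration of the input and query APIs described above, the
sketch below builds (but does not send) the two HTTP requests. It assumes a
default local install, with the Event Server on port 7070 and a deployed
engine on port 8000. ACCESS_KEY, the ids, and the helper names are
placeholders for illustration, not values from this thread.

```python
import json
import urllib.request

EVENT_SERVER = "http://localhost:7070"   # assumed default Event Server address
ENGINE_SERVER = "http://localhost:8000"  # assumed default deployed-engine address
ACCESS_KEY = "YOUR_ACCESS_KEY"           # placeholder; use your own app's key


def json_post(url, payload):
    """Build a JSON POST request; pass it to urllib.request.urlopen to send."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def purchase_event_request(user_id, product_id, purchase_date):
    """Input API: one purchase row becomes one 'purchase' event."""
    event = {
        "event": "purchase",
        "entityType": "user",
        "entityId": str(user_id),
        "targetEntityType": "item",
        "targetEntityId": str(product_id),
        "eventTime": purchase_date,  # ISO 8601 timestamp string
    }
    return json_post(f"{EVENT_SERVER}/events.json?accessKey={ACCESS_KEY}", event)


def recommendation_request(user_id, num=4):
    """Query API: ask for `num` recommendations for one user."""
    return json_post(
        f"{ENGINE_SERVER}/queries.json",
        {"user": str(user_id), "num": num},
    )
```

Sending either request is then a matter of calling
urllib.request.urlopen(...) on it once the servers are running.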
>
>
> On Mar 23, 2017, at 12:25 PM, Vaghawan Ojha <[email protected]> wrote:
>
> Hi Pat,
>
> Thank you very much. Yes, I will be following the ActionML instructions since
> I'm going to use the UR. I think I should move to HBase rather than
> spending time setting up MySQL. Part of my need is that once we train on
> the dataset, the result should be easily available to the applications,
> which run on MySQL.
>
> I'm fairly new to the concept itself. So basically I would always have a
> large JSON file coming from the application, which uses MySQL (this
> shouldn't be a problem). Then I would use PIO and the UR to do the hard work
> and get the result back either through an API, which I think already works
> in PIO, or saved somewhere in a database like MySQL.
>
> Thanks
>
> On Fri, Mar 24, 2017 at 1:03 AM, Pat Ferrel <[email protected]> wrote:
>
>> The UR uses Elasticsearch for part of the Recommender algorithm, therefore
>> it must be configured as a storage backend. It is possible to use Postgres
>> or MySQL for the other stores but we have very little experience with this.
>> HBase is indefinitely scalable so we always use that. Single-machine
>> deployments are rare with reasonably sized data, so Elasticsearch + HBase
>> running separately or in clusters will always meet the data needs. The RDBs
>> will not and anyway, like I said, you have to use Elasticsearch.
>>
>> Therefore, for the UR follow the instructions on the ActionML site since
>> they are specific to the UR. For other templates you may use other
>> configurations of PIO, but if you use the UR config you can use every
>> other template too.
>>
>>
>>
>> On Mar 23, 2017, at 9:07 AM, Vaghawan Ojha <[email protected]> wrote:
>>
>> Hi, Thank you!
>>
>> I've run into further confusion here. I installed PredictionIO
>> version 0.10.0 from
>> http://predictionio.incubator.apache.org/install/install-sourcecode/ and
>> have been fighting to configure MySQL as the storage on my local Linux
>> machine.
>>
>> But I see there is different installation documentation on the ActionML
>> website, and I'm not sure which one I should follow. Currently there is no
>> "pio-env.sh" file inside the conf folder; however, there is a
>> pio-env.sh.template file. I commented out the pgsql section and uncommented
>> the mysql section with the username and password, but whenever I run sudo
>> PredictionIO-0.10.0-incubating/bin/pio eventserver there is an
>> error that says authentication failed with pgsql, even though I don't want
>> to use pgsql.
>>
>> # Storage Repositories
>>
>> # Default is to use PostgreSQL
>> PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
>> PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL
>>
>> PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
>> PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL
>>
>> PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
>> PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL
>>
>> # Storage Data Sources
>>
>> # PostgreSQL Default Settings
>> # Please change "pio" to your database name in PIO_STORAGE_SOURCES_PGSQL_URL
>> # Please change PIO_STORAGE_SOURCES_PGSQL_USERNAME and
>> # PIO_STORAGE_SOURCES_PGSQL_PASSWORD accordingly
>> #PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
>> #PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
>> #PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
>> #PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio
>>
>> # MySQL Example
>>  PIO_STORAGE_SOURCES_MYSQL_TYPE=jdbc
>>  PIO_STORAGE_SOURCES_MYSQL_URL=jdbc:mysql://localhost/pio
>>  PIO_STORAGE_SOURCES_MYSQL_USERNAME=root
>>  PIO_STORAGE_SOURCES_MYSQL_PASSWORD=root
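
One possible contributor to the PGSQL error, not confirmed anywhere in this
thread, is that the three repositories in the settings quoted above still
name PGSQL as their source while only the MYSQL source is uncommented. The
sketch below checks a config text for that kind of mismatch. The function
names are hypothetical, and it only inspects whatever text it is given; it
does not know which file PIO actually loads.

```python
import re

# Matches uncommented shell-style assignments such as KEY=value.
ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?([A-Z0-9_]+)=(.*)$")


def active_settings(text):
    """Return the uncommented KEY=VALUE pairs found in a config text."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # commented-out settings are not active
        match = ASSIGNMENT.match(line)
        if match:
            settings[match.group(1)] = match.group(2).strip()
    return settings


def undefined_sources(text):
    """List (repository key, source name) pairs whose source is not defined.

    A source counts as defined when PIO_STORAGE_SOURCES_<NAME>_TYPE is active.
    """
    settings = active_settings(text)
    problems = []
    for key, source in sorted(settings.items()):
        is_repo_source = key.startswith("PIO_STORAGE_REPOSITORIES_") and key.endswith("_SOURCE")
        if is_repo_source and f"PIO_STORAGE_SOURCES_{source}_TYPE" not in settings:
            problems.append((key, source))
    return problems
```

Run against the settings quoted above, undefined_sources would flag each
repository that still points at PGSQL; changing those three SOURCE values to
MYSQL would make the text self-consistent.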
>>
>>
>> This is what the pio-env.sh.template looks like. And again, when I visited
>> the ActionML site, it says that I have to have Elasticsearch, but the
>> prediction.io site doesn't tell us the same. Which one should I follow,
>> and where would I find the current working version of the installation
>> guide? I actually want to use prediction.io in production shortly after I
>> have implemented it locally.
>>
>> Please help me. Thank you very much for your help; I appreciate it so
>> much.
>> Vaghawan
>>
>>
>> On Thu, Mar 23, 2017 at 9:27 PM, Pat Ferrel <[email protected]>
>> wrote:
>>
>>> Since PIO has moved to Apache, the namespace of PIO code changed and so
>>> all templates need to be updated. None of the ones in
>>> https://github.com/PredictionIO/
>>> <https://github.com/PredictionIO/template-scala-parallel-universal-recommendation>
>>> will work with Apache PIO. For the upgraded UR see:
>>> https://github.com/actionml/universal-recommender
>>> Docs for the UR are here: http://actionml.com/docs/ur
>>>
>>> Also look on the Template gallery page here for a description of
>>> template status. Some have not been moved to the new namespace and
>>> converted to run with PIO but this is pretty easy to do yourself.
>>> http://predictionio.incubator.apache.org/gallery/template-gallery/
>>>
>>> user_id, product_id and purchase_date are all you need to use any
>>> recommender. If you plan to gather other events in the future, use the UR.
>>> As for item-based or user-based recommendations, the UR will give either,
>>> depending on the query, from the same data and model, as some others will
>>> too. The UR allows you to mix both types in a single query, which may be
>>> useful with small amounts of individual user data.
>>>
>>> Also, the accepted wisdom about this is to put item-based recs on item
>>> detail pages, and user-based recs elsewhere, when you don’t have an item to
>>> base recs on, or in another placement on any page.
>>>
>>> You can have many different placements of recs in any page by changing
>>> the queries. This is how Netflix gets rows and rows of specialized recs for
>>> different things all based on the same data. The UR queries are quite
>>> flexible.
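
To make the idea of one model serving several placements concrete, here is a
small sketch that builds a different query body per placement. The field
names user, item and num follow my reading of the UR query documentation and
should be checked against it; the helper name and the ids are made up for
illustration.

```python
def placement_query(user_id=None, item_id=None, num=4):
    """Build a query body: user-based, item-based, or a mix of both."""
    query = {"num": num}
    if user_id is not None:
        query["user"] = str(user_id)  # personalize to this user's history
    if item_id is not None:
        query["item"] = str(item_id)  # find items similar to this one
    return query


# Hypothetical placements, all answered from the same data and model.
home_page = placement_query(user_id="u1")
item_detail_page = placement_query(item_id="i42")
mixed_row = placement_query(user_id="u1", item_id="i42", num=8)
```

Each dictionary would be sent as the JSON body of a separate query, so a page
with three placements simply issues three queries.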
>>>
>>>
>>> On Mar 23, 2017, at 7:08 AM, Vaghawan Ojha <[email protected]>
>>> wrote:
>>>
>>> Hi,
>>>
>>> I've been trying to deploy a recommendation system using
>>> https://github.com/PredictionIO/template-scala-parallel-universal-recommendation.
>>>
>>> I have users' purchase history, something like this:
>>> user_id, product_id and purchase_date, so I will be using user_id and
>>> product_id to determine the recommendation. I'm not sure if I would be able
>>> to customize the default event parameter.
>>>
>>> Do you have any suggestions as to which template would be more suitable
>>> for my problem? I don't have data like ratings or view state; I only have
>>> data about users and the products they purchased. I need something like
>>> item-based similarity as well as user-based item similarity.
>>>
>>> Any help would be great
>>>
>>> Thank you
>>> Vaghawan
>>>
>>>
>>
>>
>
>
