Re: Misconfigured date information, either your engine.json date settings or your query's dateRange is incorrect.

2017-01-09 Thread Harsh Mathur
Also, what I am not able to understand is why, in ES, the mapping for the
date field (for both releaseDate and release_date) is a not_analyzed string
instead of a working date type.
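
The name mismatch discussed in this thread can be caught before training with a small check. The following is only a sketch, assuming that the "dateName" in engine.json, the property key in the $set event, and the "name" in the query's dateRange are all expected to be the same string; the helper name check_date_setup is hypothetical and not part of PredictionIO or the UR.

```python
from datetime import datetime

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # matches e.g. 2016-04-15T03:34:05Z


def check_date_setup(engine_params, set_event, query):
    """Return a list of problems; an empty list means the three names agree."""
    problems = []
    date_name = engine_params.get("dateName")
    properties = set_event.get("properties", {})

    if date_name not in properties:
        problems.append(
            f"$set event has no property {date_name!r}; found {sorted(properties)}"
        )
    else:
        try:
            # Only checks the timestamp shape used in this thread.
            datetime.strptime(properties[date_name], ISO_FORMAT)
        except ValueError:
            problems.append(f"{date_name!r} is not a UTC timestamp like 2016-04-15T03:34:05Z")

    range_name = query.get("dateRange", {}).get("name")
    if range_name != date_name:
        problems.append(
            f"query dateRange name {range_name!r} differs from dateName {date_name!r}"
        )
    return problems


engine_params = {"dateName": "releaseDate"}
bad_event = {
    "event": "$set",
    "entityType": "item",
    "entityId": "some-item-id",
    "properties": {"release_date": "2016-04-15T03:34:05Z"},
}
good_event = {
    "event": "$set",
    "entityType": "item",
    "entityId": "some-item-id",
    "properties": {"releaseDate": "2016-04-15T03:34:05Z"},
}
# The before/after values below are made-up illustration dates.
query = {
    "dateRange": {
        "name": "releaseDate",
        "after": "2016-01-01T00:00:00Z",
        "before": "2017-01-01T00:00:00Z",
    }
}

print(check_date_setup(engine_params, bad_event, query))
print(check_date_setup(engine_params, good_event, query))
```

The first call reports the release_date / releaseDate mismatch; the second returns an empty list.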



Regards
Harsh Mathur
harshmathur.1...@gmail.com

*“Perseverance is the hard work you do after you get tired of doing the
hard work you already did."*

On Tue, Jan 10, 2017 at 12:58 AM, Harsh Mathur <harshmathur.1...@gmail.com>
wrote:

> Hi Pat,
>
> Thank you :)
>
> I mean to take the first approach, and that's why my engine.json looks like this:
>
>
> *"dateName":"releaseDate"   // our datefield*
>
> *Below you are setting up a date attached to items and have called it
> “releaseDate" in one place but set it using $set events to be
> “release_date”, different strings. This is probably your error, you didn’t
> send the error.*
>
> By this, do you mean I should send events like this, with releaseDate as
> named in engine.json:
>
> {
>   "event" : "$set",
>   "entityType" : "item",
>   "entityId" : "some-item-id",
>   "properties" : {
>     "releaseDate": "2016-04-15T03:34:05Z"
>   },
>   "eventTime" : "2016-04-15T03:34:05Z"
> }
>
> or like this, with release_date?
>
> {
>   "event" : "$set",
>   "entityType" : "item",
>   "entityId" : "some-item-id",
>   "properties" : {
>     "release_date": "2016-04-15T03:34:05Z"
>   },
>   "eventTime" : "2016-04-15T03:34:05Z"
> }
>
> By the way, the error message in the subject stopped appearing in the log
> after I started querying with:
>
> {
>   "dateRange": {
>     "name": "releaseDate",
>     "before": "", // date string in ISO format
>     "after": ""   // date string in ISO format
>   }
> }
>
> Regards
> Harsh Mathur
> harshmathur.1...@gmail.com
>
> *“Perseverance is the hard work you do after you get tired of doing the
> hard work you already did."*
>
> On Mon, Jan 9, 2017 at 11:36 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
>
>> Sorry typos:
>>
>> no, that is talking about using a date range in the query.
>>
>> Answer this first. Do you want:
>> 1) a single fixed date attached to items, with the range in the query, or
>> 2) expired/available dates attached to items, with the current date
>> required to fall between them?
>>
>> You can’t do both, so pick one or the other.
>>
>> Below you are setting up a date attached to items and have called it
>> “releaseDate" in one place but set it using $set events to be
>> “release_date”, different strings. This is probably your error, you didn’t
>> send the error.
>>
>>
>>
>> On Jan 9, 2017, at 8:56 AM, Pat Ferrel <p...@occamsmachete.com> wrote:
>>
>> no, that is talking about using a date range in the query.
>>
>> Answer this first: Do you want a fixed date attached to itmes with the
>> rang in the query or an expired/available attached to items and the corrent
>> date that must fall between them?
>>
>> Pick on or the other.
>>
>> Below you are setting up a date attached to items and have called it
>> “releaseDate" in one place but set it using $set events to be
>> “release_date”, different strings.
>>
>>
>>
>>
>> On Jan 9, 2017, at 3:15 AM, Harsh Mathur <harshmathur.1...@gmail.com>
>> wrote:
>>
>> Found it.
>>
>> Thanks a lot :)
>>
>> https://groups.google.com/forum/#!msg/actionml-user/z1eJdXniKl0/0VgKjuLgBwAJ
>>
>> Regards
>> Harsh Mathur
>> harshmathur.1...@gmail.com
>>
>> *“Perseverance is the hard work you do after you get tired of doing the
>> hard work you already did."*
>>

Re: Misconfigured date information, either your engine.json date settings or your query's dateRange is incorrect.

2017-01-09 Thread Harsh Mathur
Found it.

Thanks a lot :)

https://groups.google.com/forum/#!msg/actionml-user/z1eJdXniKl0/0VgKjuLgBwAJ

Regards
Harsh Mathur
harshmathur.1...@gmail.com

*“Perseverance is the hard work you do after you get tired of doing the
hard work you already did."*



Misconfigured date information, either your engine.json date settings or your query's dateRange is incorrect.

2017-01-09 Thread Harsh Mathur
Hi,
I have not been able to correctly set up dates for items.

I am using the following approach:

Specifying a date range in the query and comparing it to a date attached to
items (http://actionml.com/docs/ur_advanced_tuning).

engine.json:

"algorithms": [
  {
    "comment": "simplest setup where all values are default, popularity based backfill, must add eventsNames",
    "name": "ur",
    "params": {
      "appName": "MyApp1",
      "indexName": "urindex",
      "typeName": "items",
      "comment": "must have data for the first event or the model will not build, other events are optional",
      "eventNames": ["purchase", "preview", "view"],
      "dateName": "releaseDate"   // our datefield
    }
  }
]


I am sending event as

{
  "event" : "$set",
  "entityType" : "item",
  "entityId" : "some-item-id",
  "properties" : {
    "releaseDate": "2016-04-15T03:34:05Z"
  },
  "eventTime" : "2016-04-15T03:34:05Z"
}


When I check ES, the releaseDate field is mapped as string instead of
DateTime:

ES Mapping:
"releaseDate" : {
  "type" : "string",
  "index" : "not_analyzed"
},
"release_date" : {
  "type" : "string",
  "index" : "not_analyzed"
},



*Misconfigured date information, either your engine.json date settings or
your query's dateRange is incorrect.*

This error keeps getting printed in the logs.

Any help appreciated. Thanks a lot in advance.

Regards
Harsh Mathur
harshmathur.1...@gmail.com

*“Perseverance is the hard work you do after you get tired of doing the
hard work you already did."*


PIO Stops working suddenly after few days

2017-01-04 Thread Harsh Mathur
Hi,
I am facing two issues that appear after PIO has been running for a while
and go away after a pio restart and redeploy.

1. When querying PIO for recommendations, it starts throwing an ES "No Node
available" exception, but ES health is green, and pio status also finds no
fault in connecting to ES.

2. An "Invalid Access Key" error keeps appearing when pushing events to PIO,
even though the access key is correct.

Both errors are resolved by restarting pio and redeploying the service.

Any Suggestions?

Regards
Harsh Mathur
harshmathur.1...@gmail.com

*“Perseverance is the hard work you do after you get tired of doing the
hard work you already did."*


Re: Tuning of Recommendation Engine

2016-12-01 Thread Harsh Mathur
Hi Pat,
I really appreciate the product, but our team was discussing how little
control we have here.
Say some recommendations are delivered to the user and we track
conversions, so we know whether they are working. If we see that
conversions are low, as a developer I have very little to experiment with.
I don't mean any disrespect. I have gone through the code and made an
effort to understand it too. The UR is still better than the explicit or
implicit templates because it can filter on properties; the only thing
lacking, in my opinion, is weighting.

I read your ppt
Recommendations = PtP +PtV+...
We were wondering if it could be
Recommendations = a * PtP + b * PtV+ ...

Where a and b are constants for tuning. In my understanding PtP is a matrix,
so scalar multiplication should be possible. Please correct me if I am
wrong.

Also, I was reading about the log-likelihood method, but I couldn't find a
proper explanation. I would be happy if anyone here could explain it in more
detail. Thanks in advance.

Here is what I understood.
For every item-item pair per expression (PtP, PtV), to calculate a score,
it finds 4 counts:
1. the number of users who posted both events for the pair,
2. the number of users who posted the event for one item but not the other,
and vice versa (two counts),
3. the number of users who posted for neither.

Then a formula is applied taking the 4 counts as input and a score is
returned.

For each item and event pair, you store the top 20 items by score in
Elasticsearch. I didn't understand why the second and third parameters are
needed. Also, can anyone explain the correctness of the method, that is,
why it works rather than how it works?
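
For readers following the four counts above, here is a minimal sketch of the log-likelihood ratio (Dunning's G-squared test) on a 2x2 table, written in the entropy form commonly used for co-occurrence scoring. The thread does not show the UR's actual code, so treat this as an illustration of the general technique rather than the UR's implementation; the names x_log_x, entropy and llr are chosen here for illustration.

```python
import math


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a list of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 contingency table.

    k11: users who did both A and B
    k12: users who did A but not B
    k21: users who did B but not A
    k22: users who did neither
    """
    row_entropy = entropy(k11 + k12, k21 + k22)
    col_entropy = entropy(k11 + k21, k12 + k22)
    matrix_entropy = entropy(k11, k12, k21, k22)
    # Rounding can push the result slightly below zero, so clamp it.
    return max(0.0, 2.0 * (row_entropy + col_entropy - matrix_entropy))


# A and B occur together exactly as often as independence predicts.
print(llr(10, 10, 10, 10))
# A and B co-occur far more often than independence predicts.
print(llr(100, 10, 10, 1000))
```

The second and third counts matter because the score compares the observed co-occurrence with what would be expected if the two events were independent. Without the "one but not the other" and "neither" counts, a pair of simply popular items would look just as related as a pair that is genuinely associated.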

Regards
Harsh Mathur
On Dec 1, 2016 11:01 PM, "Pat Ferrel" <p...@occamsmachete.com> wrote:

> Exactly so. The weighting of events is done by the algorithm. To add
> biases would very likely be wrong and result in worse results. It is
> therefore not supported in the current code. There may be a place for this
> type of bias, but it would have to be done in conjunction with the
> cross-validation tests we have in our MAP test suite, and it is not yet
> supported. Best to leave them with the default weighting in the CCO
> algorithm, which is based on the strength of correlation with the
> conversion event, which I guess is purchase in your case.
>
>
> On Nov 28, 2016, at 2:19 PM, Magnus Kragelund <m...@ida.dk> wrote:
>
> Hi,
> It's my understanding that you cannot apply a bias to an event, such as
> "view" or "purchase", at query time. How the engine uses your different
> events to calculate a score is defined in part by you and in part during
> training.
> In the engine.json config file you set an array of event names. The first
> event in the array is considered the primary event, and it is the event
> that the engine tries to predict. The other events you specify are
> secondary events, which the engine is allowed to take into consideration
> when finding correlations to the primary event in your data set. If no
> correlation is found for a given event, that event's data is not taken
> into account when predicting results.
>
> Your array might look like this, when predicting purchases: ["purchase",
>  "initiated_payment", "view", "preview"]
>
> If you use the special $set event to add metadata to your items, you can
> apply a bias or filter on those metadata properties at query time.
>
> /magnus
>
> --
> *From:* Harsh Mathur <harshmathur.1...@gmail.com>
> *Sent:* Monday, November 28, 2016 3:46:46 PM
> *To:* user@predictionio.incubator.apache.org
> *Subject:* Tuning of Recommendation Engine
>
> Hi,
> I have successfully deployed the UR template.
>
> I now want to tune it a little. At present I am sending 4 events:
> purchase, view, initiated_payment and preview. Our products also have
> categories, which I set as item properties.
>
> Now, as I query say:
> {
> "item": "{item_id}",
> "fields": [
> {
> "name": "view",
> "bias": 0.5
> },
> {
> "name": "preview",
> "bias": 5
> },
> {
> "name": "purchase",
> "bias": 20
> }
> ]
> }
>
> and query
> {
> "item": "{item_id}"
> }
>
>
> For both queries I get the same number of recommendations; only the scores
> vary. The boosting isn't changing any recommendations, just the scores.
> Is there any way in the UR to give more preference to some events? It
> would give us more room to experiment and make the recommendations more
> relevant to us.
>
> Regards
> Harsh Mathur
> harshmathur.1...@gmail.com
>
> *“Perseverance is the hard work you do after you get tired of doing the
> hard work you already did."*
>
>


Re: Universal Recommender Creating Index in ElasticSearch but not able to persist the model

2016-11-24 Thread Harsh Mathur
Hi,
Thanks for the help. You are right: the REST API was not working, due to the
problem described below.

The problem and solution, in case someone else runs into the same thing:

We are using a hosted ES service. When writing data to the index, the
connector first discovers all the ES nodes and then tries to write to them.
Discovery returns hosts and ports different from the ones I provided in the
config.

Hence, in the UR template (build.sbt), I needed to update:

"org.elasticsearch" % "elasticsearch-spark_2.10" % "2.1.2"

to

"org.elasticsearch" % "elasticsearch-spark_2.10" % "2.2.0"

Also, in sparkConf in engine.json, I needed to provide:

"es.nodes.wan.only": true

Of course, the alternative would be to run the ES cluster and the UR in the
same network, but that is not possible for us because ES is outsourced and
the UR is in AWS.

Relevant Page: https://www.elastic.co/guide/en/elasticsearch/hadoop/2.2/cloud.html
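
To make the sparkConf change concrete, the sketch below adds the two settings mentioned in this thread ("es.nodes.wan.only" and "es.index.auto.create") to an engine.json-style dictionary. The host, port and credentials are placeholders, the port number 9200 is an assumption, and the helper name with_wan_only is hypothetical.

```python
import json


def with_wan_only(engine_config):
    """Return a copy of an engine.json dict with the WAN-only ES settings added."""
    updated = json.loads(json.dumps(engine_config))  # simple deep copy
    spark_conf = updated.setdefault("sparkConf", {})
    # Stop the connector from using discovered node addresses.
    spark_conf["es.nodes.wan.only"] = True
    # Suggested earlier in the thread for index creation over REST.
    spark_conf["es.index.auto.create"] = "true"
    return updated


engine_config = {
    "sparkConf": {
        "es.nodes": "hostname",               # placeholder
        "es.port": 9200,                      # placeholder, assumed HTTP port
        "es.net.http.auth.user": "username",  # placeholder
        "es.net.http.auth.pass": "password",  # placeholder
        "es.net.ssl": True,
    }
}

print(json.dumps(with_wan_only(engine_config)["sparkConf"], indent=2))
```

The original dictionary is left untouched, so the same function can be applied to a file loaded with json.load and the result written back out.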


Regards
Harsh Mathur
harshmathur.1...@gmail.com

*“Perseverance is the hard work you do after you get tired of doing the
hard work you already did."*

On Fri, Nov 25, 2016 at 5:05 AM, Pat Ferrel <p...@occamsmachete.com> wrote:

> I think the index is created using the transport client so requires no
> auth and happens before the writing starts, which uses the REST API. So the
> REST API seems to not be working at all.
>
> Check that the Spark Executor can connect to REST with your config. Also
> include "es.index.auto.create": "true"
>
>
>
> On Nov 23, 2016, at 6:56 AM, Harsh Mathur <harshmathur.1...@gmail.com>
> wrote:
>
> Hi,
> The problem is in training phase of the UR template, the model
> (correlators), is built, but unable to persist in elastic search. The
> connectivity is there because indices are getting created, though no data
> is being pushed. Help appreciated.
>
> My SparkConf is as follows:
> "es.nodes": "hostname",
> "es.port": port,
> "es.net.http.auth.user": "username",
> "es.net.http.auth.pass": "password",
> "es.net.ssl": true
>
>
> However, I can see a new index getting created but no data being pushed
> (timeout exception)
>
> https://basic_auth@host:port/_cat/indices?v
> green  open   urindex_1479911005241   4   0   0   0   460b      460b
> green  open   pio_meta                4   0   191     210.8kb   210.8kb
> green  open   urindex_1479909625467   4   0   0   0   460b      460b
> green  open   urindex_1479908293213   4   0   0   0   460b      460b
> green  open   urindex_1479911811014   4   0   0   0   460b      460b
> green  open   urindex_1479910315092   4   0   0   0   460b      460b
>
>
> As you can see after every retry, I can see a new index.
>
> The Exception I get is:
>
> [INFO] [URAlgorithm] Correlators created now putting into URModel
> [INFO] [URModel] Converting cooccurrence matrices into correlators
> [INFO] [URModel] Group all properties RDD
> [INFO] [URModel] ES fields[10]: List(available, id, defaultRank,
> categories, expires, popRank, purchase, countries, view, date)
> [Stage 103:> (0 + 2) / 5]
> [INFO] [HttpMethodDirector] I/O exception (java.net.ConnectException)
> caught when processing request: Connection timed out (Connection timed out)
> [INFO] [HttpMethodDirector] Retrying request
> [INFO] [HttpMethodDirector] I/O exception (java.net.ConnectException)
> caught when processing request: Connection timed out (Connection timed out)
> [INFO] [HttpMethodDirector] Retrying request
>
>
>
>
> Regards
> Harsh Mathur
> harshmathur.1...@gmail.com
>
> *“Perseverance is the hard work you do after you get tired of doing the
> hard work you already did."*
>
>


Re: Scala 2.11 with PredictionIO

2016-11-22 Thread Harsh Mathur
Hi,
Thanks for the prompt reply.

I was asking because we have other use cases for Spark, and development on
them is under way using Spark 2.0.1. The idea was to share the Spark
infrastructure between all projects, including PIO.

Any suggestions? Should I either build Spark 2.0.1 from source with Scala
2.10.4, or use Spark 1.6 everywhere?

Regards
Harsh Mathur
On Nov 22, 2016 9:10 PM, "Pat Ferrel" <p...@occamsmachete.com> wrote:

> I would not try Scala 2.11 or Spark 2.x. These will require both
> PredictionIO and templates to be ported, and these are on the roadmap but
> not even started.
>
>
> On Nov 22, 2016, at 6:47 AM, Harsh Mathur <harshmathur.1...@gmail.com>
> wrote:
>
> Hi,
> I wanted to use spark 2.x series and it is built with scala 2.11.
> But PIO is built with 2.10.4. I stumbled upon these two links.
>
> https://issues.apache.org/jira/browse/PIO-30
> https://github.com/apache/incubator-predictionio/pull/295/
>
> What should I do in meantime?
> Regards
> Harsh Mathur
> harshmathur.1...@gmail.com
>
> *“Perseverance is the hard work you do after you get tired of doing the
> hard work you already did."*
>
>


Re: Regarding Remote ES Cluster with Pio

2016-11-19 Thread Harsh Mathur
Hi,
Thanks man :)

After some trial and error, I changed https to http in the URL and used the
native Java client port; it worked without any auth.
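
A quick way to see which ES port actually answers from the PIO machine is a plain TCP connection test. This is only a sketch: the host name is a placeholder, 9200 and 9300 are the ports suggested in the quoted reply below, and a successful TCP connect says nothing about auth or TLS.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


if __name__ == "__main__":
    es_host = "localhost"  # placeholder; replace with the managed ES host
    for port in (9200, 9300):
        state = "open" if port_open(es_host, port) else "unreachable"
        print(f"{es_host}:{port} is {state}")
```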

Thanks again:)

Regards
Harsh Mathur
harshmathur.1...@gmail.com

*“Perseverance is the hard work you do after you get tired of doing the
hard work you already did."*

On Sat, Nov 19, 2016 at 2:18 PM, Hasan Can Saral <hasancansa...@gmail.com>
wrote:

> Hi!
>
> There might be an issue with basic auth. I have not tried to configure pio
> with an ES server with basic auth. And from the error you get, I understand
> that pio does not seem to be happy with (or even find) the hosts you
> provided. Also what port is your ES cluster listening to? Can you try 9300
> and 9200 explicitly?
>
>


Regarding Remote ES Cluster with Pio

2016-11-17 Thread Harsh Mathur
Hi PredictionIO developers,
First of all Thank you for a great open source product.

I am Harsh. I am deploying the system in production and have an ES instance
as a managed service. I am not able to make pio use my managed ES instance
instead of a locally installed ES. Thanks a lot in advance for any help.

My ES endpoint has the form https://user:password@host, with two ports
available:
1. x: for http
2. y: for native java node clients

I tried editing pio-env.sh as follows:

# Elasticsearch Example
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=https://user:password@host
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=native_java_port
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-1.5.2


But pio is not able to find any nodes:

[INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...

[WARN] [netty] [Aftershock] exception caught on transport layer [[id:
0x63808344]], closing connection

[ERROR] [Console$] Unable to connect to all storage backends successfully.
The following shows the error message from the storage backend.

[ERROR] [Console$] None of the configured nodes are available: []
(org.elasticsearch.client.transport.NoNodeAvailableException)

[ERROR] [Console$] Dumping configuration of initialized storage backend
sources. Please make sure they are correct.
Regards
Harsh Mathur
harshmathur.1...@gmail.com

*“Perseverance is the hard work you do after you get tired of doing the
hard work you already did."*