[jira] [Commented] (PIO-101) Document usage of Plug-in of event server and engine server
[ https://issues.apache.org/jira/browse/PIO-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194159#comment-16194159 ] ASF GitHub Bot commented on PIO-101: Github user dszeto commented on the issue: https://github.com/apache/incubator-predictionio/pull/440 This is awesome. Thanks @takezoe ! > Document usage of Plug-in of event server and engine server > --- > > Key: PIO-101 > URL: https://issues.apache.org/jira/browse/PIO-101 > Project: PredictionIO > Issue Type: Task > Components: Documentation >Reporter: Kenneth Chan >Assignee: Naoki Takezoe > > see > http://mail-archives.apache.org/mod_mbox/incubator-predictionio-dev/201706.mbox/%3CCAF_HxLtEonOVALSQgrCRGXctAbL7eypxwG0ErHpaBJJym15j5Q%40mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-129) CLI document does not expand side menu
[ https://issues.apache.org/jira/browse/PIO-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194149#comment-16194149 ] ASF GitHub Bot commented on PIO-129: Github user dszeto commented on the issue: https://github.com/apache/incubator-predictionio/pull/438 LGTM. Merging. Thanks @takezoe ! > CLI document does not expand side menu > -- > > Key: PIO-129 > URL: https://issues.apache.org/jira/browse/PIO-129 > Project: PredictionIO > Issue Type: Improvement > Components: Documentation >Reporter: Naoki Takezoe >Assignee: Naoki Takezoe >Priority: Minor > > There are links to CLI document in the deploy section and collecting data > section of the side menu, but if these links are clicked, the side menu is > closed because these links have a hash like {{/cli/#engine-commands}}. I > think that such unclear navigation would confuse readers. > https://predictionio.incubator.apache.org/cli/#engine-commands > I propose to remove these links from the deploy section and the collecting > data section, and put a link to CLI document in the resource section without > hash. In addition, put links to the CLI reference in documents of the event > server and the engine server. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194133#comment-16194133 ] ASF GitHub Bot commented on PIO-125: Github user asfgit closed the pull request at: https://github.com/apache/incubator-predictionio/pull/436 > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194132#comment-16194132 ] ASF GitHub Bot commented on PIO-125: Github user dszeto commented on the issue: https://github.com/apache/incubator-predictionio/pull/436 LGTM. Merging. Thanks @marevol ! > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-101) Document usage of Plug-in of event server and engine server
[ https://issues.apache.org/jira/browse/PIO-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194108#comment-16194108 ] ASF GitHub Bot commented on PIO-101: GitHub user takezoe opened a pull request: https://github.com/apache/incubator-predictionio/pull/440 [PIO-101] Document usage of Plug-in of event server and engine server You can merge this pull request into a Git repository by running: $ git pull https://github.com/takezoe/incubator-predictionio plugin-document Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-predictionio/pull/440.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #440 commit a6a140254d08f8b95e956bace4e7323ecd690c73 Author: Naoki Takezoe Date: 2017-10-05T12:59:10Z Document usage of Plug-in of event server and engine server > Document usage of Plug-in of event server and engine server > --- > > Key: PIO-101 > URL: https://issues.apache.org/jira/browse/PIO-101 > Project: PredictionIO > Issue Type: Task > Components: Documentation >Reporter: Kenneth Chan >Assignee: Naoki Takezoe > > see > http://mail-archives.apache.org/mod_mbox/incubator-predictionio-dev/201706.mbox/%3CCAF_HxLtEonOVALSQgrCRGXctAbL7eypxwG0ErHpaBJJym15j5Q%40mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-131) Fix Apache licensing issues for doc site
[ https://issues.apache.org/jira/browse/PIO-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16193966#comment-16193966 ]

ASF GitHub Bot commented on PIO-131:

Github user dszeto commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/439#discussion_r143086982

    --- Diff: docs/manual/source/install/install-sourcecode.html.md.erb ---
    @@ -28,15 +28,20 @@
     Download Apache PredictionIO (incubating) <%= data.versions.pio %> from an Apache [mirror](https://www.apache.org/dyn/closer.cgi/incubator/predictionio/<%= data.versions.pio %>/apache-predictionio-<%= data.versions.pio %>.tar.gz).
    +Verify this release using [signatures and checksums](https://www.apache.org/
    +dist/incubator/predictionio/0.12.0-incubating/) and [project release KEYS](
    --- End diff --

    How about using `data.versions.pio` instead?

> Fix Apache licensing issues for doc site
>
> Key: PIO-131
> URL: https://issues.apache.org/jira/browse/PIO-131
> Project: PredictionIO
> Issue Type: Task
> Reporter: Chan
> Assignee: Chan
>
> Fix issues blocking graduation
> (https://www.mail-archive.com/general@incubator.apache.org/msg61352.html)
> 1. Add links to http://apache.org as in
> https://www.apache.org/foundation/marks/pmcs#navigation
> 2. Add instructions for checking signature of download as in
> http://httpd.apache.org/download.cgi#verify

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (PIO-131) Fix Apache licensing issues for doc site
[ https://issues.apache.org/jira/browse/PIO-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16193967#comment-16193967 ]

ASF GitHub Bot commented on PIO-131:

Github user dszeto commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/439#discussion_r143087035

    --- Diff: docs/manual/source/install/install-sourcecode.html.md.erb ---
    @@ -28,15 +28,20 @@
     Download Apache PredictionIO (incubating) <%= data.versions.pio %> from an Apache [mirror](https://www.apache.org/dyn/closer.cgi/incubator/predictionio/<%= data.versions.pio %>/apache-predictionio-<%= data.versions.pio %>.tar.gz).
    +Verify this release using [signatures and checksums](https://www.apache.org/
    +dist/incubator/predictionio/0.12.0-incubating/) and [project release KEYS](
    +https://www.apache.org/dist/incubator/predictionio/KEYS). Refer to
    +[instructions](http://httpd.apache.org/download.cgi#verify) on how to verify.
    --- End diff --

    I think it would be better to copy those instructions over.

> Fix Apache licensing issues for doc site
>
> Key: PIO-131
> URL: https://issues.apache.org/jira/browse/PIO-131
> Project: PredictionIO
> Issue Type: Task
> Reporter: Chan
> Assignee: Chan
>
> Fix issues blocking graduation
> (https://www.mail-archive.com/general@incubator.apache.org/msg61352.html)
> 1. Add links to http://apache.org as in
> https://www.apache.org/foundation/marks/pmcs#navigation
> 2. Add instructions for checking signature of download as in
> http://httpd.apache.org/download.cgi#verify

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
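
For readers following the review above, the checksum half of "signatures and checksums" can be illustrated with a minimal Java sketch. It is an illustration only, not the instructions proposed in the pull request: the class name `ChecksumSketch`, its method names, and the assumption that the published checksum is a bare hex string are all hypothetical. It also covers integrity only; verifying the OpenPGP signature against the project KEYS file needs a tool such as GnuPG and is not shown.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper, not part of PredictionIO.
final class ChecksumSketch {

    // Returns the lowercase hex digest of data for a JDK digest name such as "SHA-256" or "SHA-512".
    static String hexDigest(String algorithm, byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm).digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported digest algorithm: " + algorithm, e);
        }
    }

    // Assumption: the published value is a bare hex string, possibly with whitespace or upper case.
    static boolean matches(String published, String computed) {
        String normalized = published.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
        return normalized.equals(computed);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.out.println("usage: ChecksumSketch <algorithm> <file> <published-checksum>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[1]));
        String computed = hexDigest(args[0], data);
        System.out.println(matches(args[2], computed) ? "checksum OK" : "checksum MISMATCH");
    }
}
```

A mismatch means the download should be discarded; a match says nothing about who produced the file, which is what the signature check is for.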
[jira] [Commented] (PIO-131) Fix Apache licensing issues for doc site
[ https://issues.apache.org/jira/browse/PIO-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16193936#comment-16193936 ] ASF GitHub Bot commented on PIO-131: GitHub user chanlee514 opened a pull request: https://github.com/apache/incubator-predictionio/pull/439 [PIO-131] Fix Apache licensing issues for doc site You can merge this pull request into a Git repository by running: $ git pull https://github.com/chanlee514/incubator-predictionio hotfix-license Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-predictionio/pull/439.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #439 commit 75c31f38fa6cdc0506976a68ef0921ba676bdfc7 Author: Chan Lee Date: 2017-10-05T23:51:11Z Fix issues to meet Apache project requirements > Fix Apache licensing issues for doc site > > > Key: PIO-131 > URL: https://issues.apache.org/jira/browse/PIO-131 > Project: PredictionIO > Issue Type: Task >Reporter: Chan >Assignee: Chan > > Fix issues blocking graduation > (https://www.mail-archive.com/general@incubator.apache.org/msg61352.html) > 1. Add links to http://apache.org as in > https://www.apache.org/foundation/marks/pmcs#navigation > 2. Add instructions for checking signature of download as in > http://httpd.apache.org/download.cgi#verify -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-129) CLI document does not expand side menu
[ https://issues.apache.org/jira/browse/PIO-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192628#comment-16192628 ]

ASF GitHub Bot commented on PIO-129:

GitHub user takezoe opened a pull request:

    https://github.com/apache/incubator-predictionio/pull/438

    [PIO-129] Move CLI document

    There are links to the CLI document in the deploy section and the collecting data section of the side menu, but clicking them closes the side menu because the links contain a hash such as `/cli/#engine-commands`. I think such unclear navigation would confuse readers.

    https://predictionio.incubator.apache.org/cli/#engine-commands

    I propose to remove these links from the deploy section and the collecting data section, and to put a link to the CLI document, without a hash, in the resource section:

    ![cli-doc1](https://user-images.githubusercontent.com/1094760/31218456-5e1e312e-a9f5-11e7-8d8f-e4d1e06e3d08.png)

    In addition, put links to the CLI reference in the documents for the event server and the engine server. The following screenshot shows a link to the event server CLI in the document about deployment:

    ![cli-doc2](https://user-images.githubusercontent.com/1094760/31218708-248da2c2-a9f6-11e7-8a8a-6bef8ee1e6e5.png)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/takezoe/incubator-predictionio move-cli-document

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/438.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #438

commit f2446820a9b7dfbecd4e80a5cf4aaefb84518018
Author: Naoki Takezoe
Date: 2017-10-05T08:25:34Z

    Move CLI document

> CLI document does not expand side menu
> --
>
> Key: PIO-129
> URL: https://issues.apache.org/jira/browse/PIO-129
> Project: PredictionIO
> Issue Type: Improvement
> Components: Documentation
> Reporter: Naoki Takezoe
> Assignee: Naoki Takezoe
> Priority: Minor
>
> There are links to the CLI document in the deploy section and the collecting
> data section of the side menu, but clicking them closes the side menu because
> the links contain a hash such as {{/cli/#engine-commands}}. I think such
> unclear navigation would confuse readers.
> https://predictionio.incubator.apache.org/cli/#engine-commands
> I propose to remove these links from the deploy section and the collecting
> data section, and to put a link to the CLI document, without a hash, in the
> resource section. In addition, put links to the CLI reference in the documents
> for the event server and the engine server.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192360#comment-16192360 ] ASF GitHub Bot commented on PIO-125: Github user dszeto commented on the issue: https://github.com/apache/incubator-predictionio/pull/436 @marevol Apache installed `docker-compose` on their Jenkins finally. We can move these tests there for faster build time. > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192317#comment-16192317 ] ASF GitHub Bot commented on PIO-125: Github user marevol commented on the issue: https://github.com/apache/incubator-predictionio/pull/436 Re-run Travis. I think it's better to reduce test matrix... > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192318#comment-16192318 ] ASF GitHub Bot commented on PIO-125: GitHub user marevol reopened a pull request: https://github.com/apache/incubator-predictionio/pull/436 [PIO-125] Add Spark 2.2 support You can merge this pull request into a Git repository by running: $ git pull https://github.com/marevol/incubator-predictionio spark22 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-predictionio/pull/436.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #436 commit 7ed1e9cb5282def385f49c0d01b8d4561d663bd4 Author: Shinsuke Sugaya Date: 2017-09-20T10:47:28Z add spark 2.2 commit a657cb50ca64998df3fe9ec01468421bd7dff67f Author: Shinsuke Sugaya Date: 2017-10-04T13:49:35Z add mapreduce.output.fileoutputformat.outputdir > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192315#comment-16192315 ] ASF GitHub Bot commented on PIO-125: Github user marevol commented on the issue: https://github.com/apache/incubator-predictionio/pull/436 Spark 2.2 has a Guava dependency problem and is affected by [SPARK-21549](https://issues.apache.org/jira/browse/SPARK-21549). I added a workaround for SPARK-21549. > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192316#comment-16192316 ] ASF GitHub Bot commented on PIO-125: Github user marevol closed the pull request at: https://github.com/apache/incubator-predictionio/pull/436 > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192103#comment-16192103 ] ASF GitHub Bot commented on PIO-125: Github user marevol commented on the issue: https://github.com/apache/incubator-predictionio/pull/436 Re-run Travis. > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192104#comment-16192104 ] ASF GitHub Bot commented on PIO-125: GitHub user marevol reopened a pull request: https://github.com/apache/incubator-predictionio/pull/436 [PIO-125] Add Spark 2.2 support You can merge this pull request into a Git repository by running: $ git pull https://github.com/marevol/incubator-predictionio spark22 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-predictionio/pull/436.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #436 commit 7ed1e9cb5282def385f49c0d01b8d4561d663bd4 Author: Shinsuke Sugaya Date: 2017-09-20T10:47:28Z add spark 2.2 commit a657cb50ca64998df3fe9ec01468421bd7dff67f Author: Shinsuke Sugaya Date: 2017-10-04T13:49:35Z add mapreduce.output.fileoutputformat.outputdir > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192102#comment-16192102 ] ASF GitHub Bot commented on PIO-125: Github user marevol closed the pull request at: https://github.com/apache/incubator-predictionio/pull/436 > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190870#comment-16190870 ] ASF GitHub Bot commented on PIO-125: Github user marevol commented on the issue: https://github.com/apache/incubator-predictionio/pull/436 Oops, my travis result missed some results... I'll fix this PR. This problem might come from HBase. > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190794#comment-16190794 ]

ASF GitHub Bot commented on PIO-125:

Github user shimamoto commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/436

    The new Guava dependency should not be needed, because Spark already depends on Guava. Without it, however, a compile error occurs in the data project. When I investigated the cause, I found that Spark 2.1.1 and Spark 2.2.0 end up with different Guava dependencies:

    - Spark 2.1.1 : com.google.guava:guava:14.0.1
    - Spark 2.2.0 : com.google.guava:guava:11.0.2

    Spark 2.2 is actually expected to use guava:14.0.1 as well. The root cause is probably this:
    https://github.com/sbt/sbt/issues/2861

    The Apache Curator version referenced by Spark changed in 2.2.0, and this is thought to have some effect:
    https://github.com/apache/spark/blob/v2.1.1/pom.xml#L130
    https://github.com/apache/spark/blob/v2.2.0/pom.xml#L126
    https://github.com/apache/curator/blob/2.4.0/pom.xml#L307
    https://github.com/apache/curator/blob/apache-curator-2.6.0/pom.xml#L424

> Spark 2.2 support
> -
>
> Key: PIO-125
> URL: https://issues.apache.org/jira/browse/PIO-125
> Project: PredictionIO
> Issue Type: Improvement
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> Add Spark 2.2 to scalaSparkDepsVersion.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
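
When a build resolves an unexpected Guava version, as described above, one way to see what actually ended up on the classpath is to ask the JVM where a class was loaded from. The following Java sketch is a generic diagnostic under stated assumptions, not something taken from the pull request: `DependencyOriginSketch` and `describeOrigin` are hypothetical names, and `com.google.common.base.Optional` is used only as an example of a Guava class.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of PredictionIO or Spark.
final class DependencyOriginSketch {

    // Describes where the named class was loaded from, without running its static initializers.
    static String describeOrigin(String className) {
        try {
            Class<?> cls = Class.forName(
                className, false, DependencyOriginSketch.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // Classes provided by the JDK itself usually carry no code source.
                return "no code source (likely a JDK class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Assumption: any Guava class will do; pass another class name as the first argument.
        String name = args.length > 0 ? args[0] : "com.google.common.base.Optional";
        System.out.println(name + " -> " + describeOrigin(name));
    }
}
```

For a class loaded from a jar, the printed location is the jar path, and the file name typically includes the artifact version, which makes a 11.0.2 versus 14.0.1 mix-up visible at a glance.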
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190630#comment-16190630 ] ASF GitHub Bot commented on PIO-125: GitHub user marevol reopened a pull request: https://github.com/apache/incubator-predictionio/pull/436 [PIO-125] Add Spark 2.2 support You can merge this pull request into a Git repository by running: $ git pull https://github.com/marevol/incubator-predictionio spark22 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-predictionio/pull/436.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #436 commit 7ed1e9cb5282def385f49c0d01b8d4561d663bd4 Author: Shinsuke Sugaya Date: 2017-09-20T10:47:28Z add spark 2.2 > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190628#comment-16190628 ] ASF GitHub Bot commented on PIO-125: Github user marevol closed the pull request at: https://github.com/apache/incubator-predictionio/pull/436 > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190629#comment-16190629 ] ASF GitHub Bot commented on PIO-125: Github user marevol commented on the issue: https://github.com/apache/incubator-predictionio/pull/436 Re-run Travis. > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190627#comment-16190627 ] ASF GitHub Bot commented on PIO-125: Github user marevol commented on the issue: https://github.com/apache/incubator-predictionio/pull/436 As for the test failures on Travis, I think it is a Travis problem: [my Travis build](https://travis-ci.org/jpioug/incubator-predictionio/builds/282978334) passed. Spark 2.2 seems to break the Guava dependency... @shimamoto looked into this problem. > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190424#comment-16190424 ] ASF GitHub Bot commented on PIO-125: Github user dszeto commented on the issue: https://github.com/apache/incubator-predictionio/pull/436 Hey @marevol , any idea with the failing test? I'm wondering if the new Guava dependency is not working fine with older Hadoop/Spark. > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-125) Spark 2.2 support
[ https://issues.apache.org/jira/browse/PIO-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174198#comment-16174198 ] ASF GitHub Bot commented on PIO-125: GitHub user marevol opened a pull request: https://github.com/apache/incubator-predictionio/pull/436 [PIO-125] Add Spark 2.2 support You can merge this pull request into a Git repository by running: $ git pull https://github.com/marevol/incubator-predictionio spark22 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-predictionio/pull/436.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #436 commit 7ed1e9cb5282def385f49c0d01b8d4561d663bd4 Author: Shinsuke Sugaya Date: 2017-09-20T10:47:28Z add spark 2.2 > Spark 2.2 support > - > > Key: PIO-125 > URL: https://issues.apache.org/jira/browse/PIO-125 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Reporter: Shinsuke Sugaya >Assignee: Shinsuke Sugaya > > Add Spark 2.2 to scalaSparkDepsVersion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-59) "pio app new" sometimes takes long time
[ https://issues.apache.org/jira/browse/PIO-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164059#comment-16164059 ]

ASF GitHub Bot commented on PIO-59:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-predictionio/pull/367

> "pio app new" sometimes takes long time
> ---
>
> Key: PIO-59
> URL: https://issues.apache.org/jira/browse/PIO-59
> Project: PredictionIO
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.11.0-incubating
> Reporter: Shinsuke Sugaya
> Priority: Minor
> Fix For: 0.12.0-incubating
>
> Some users reported this problem in user ML, and I also encountered it.
> Checking stack traces, I think the cause is an entropy shortage for
> /dev/random.
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x7fc94803f800 nid=0x49a9 runnable [0x7fc94fa1b000]
>    java.lang.Thread.State: RUNNABLE
>     at java.io.FileInputStream.readBytes(Native Method)
>     at java.io.FileInputStream.read(FileInputStream.java:255)
>     at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:424)
>     at sun.security.provider.NativePRNG$RandomIO.ensureBufferValid(NativePRNG.java:525)
>     at sun.security.provider.NativePRNG$RandomIO.implNextBytes(NativePRNG.java:544)
>     - locked <0x0003d34e8a48> (a java.lang.Object)
>     at sun.security.provider.NativePRNG$RandomIO.access$400(NativePRNG.java:331)
>     at sun.security.provider.NativePRNG$Blocking.engineNextBytes(NativePRNG.java:268)
>     at java.security.SecureRandom.nextBytes(SecureRandom.java:468)
>     at org.apache.predictionio.data.storage.AccessKeys$class.generateKey(AccessKeys.scala:71)
>     at org.apache.predictionio.data.storage.elasticsearch.ESAccessKeys.generateKey(ESAccessKeys.scala:40)
>     at org.apache.predictionio.data.storage.elasticsearch.ESAccessKeys.insert(ESAccessKeys.scala:60)
>     at org.apache.predictionio.tools.commands.App$$anonfun$create$4$$anonfun$apply$5.apply(App.scala:71)
>     at org.apache.predictionio.tools.commands.App$$anonfun$create$4$$anonfun$apply$5.apply(App.scala:62)
>     at scala.Option.map(Option.scala:145)
>     at org.apache.predictionio.tools.commands.App$$anonfun$create$4.apply(App.scala:62)
>     at org.apache.predictionio.tools.commands.App$$anonfun$create$4.apply(App.scala:55)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.predictionio.tools.commands.App$.create(App.scala:55)
>     at org.apache.predictionio.tools.console.Pio$App$.create(Pio.scala:172)
>     at org.apache.predictionio.tools.console.Console$$anonfun$main$1.apply(Console.scala:683)
>     at org.apache.predictionio.tools.console.Console$$anonfun$main$1.apply(Console.scala:626)
>     at scala.Option.map(Option.scala:145)
>     at org.apache.predictionio.tools.console.Console$.main(Console.scala:626)
>     at org.apache.predictionio.tools.console.Console.main(Console.scala)
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
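
The trace above shows `SecureRandom.nextBytes` served by the blocking `NativePRNG$Blocking` provider, which reads from a source that can stall when entropy is low. A minimal Java sketch of key generation that avoids explicitly requesting a blocking generator follows. It is not the change made in pull request #367, whose contents are not shown in this thread: the class `AccessKeySketch`, the 48-byte length, and the URL-safe Base64 encoding are all assumptions chosen for illustration.

```java
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical sketch, not the PredictionIO AccessKeys implementation.
final class AccessKeySketch {

    // The default SecureRandom is used here rather than SecureRandom.getInstanceStrong(),
    // which may select a blocking generator on some platforms.
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a random key of numBytes bytes, encoded as URL-safe Base64 without padding.
    static String generateKey(int numBytes) {
        if (numBytes <= 0) {
            throw new IllegalArgumentException("numBytes must be positive");
        }
        byte[] bytes = new byte[numBytes];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Assumption: 48 random bytes, which encode to 64 characters.
        System.out.println(generateKey(48));
    }
}
```

Whether the default generator blocks depends on the JDK and its `java.security` configuration, so this should be read as a direction to investigate rather than a guaranteed fix.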
[jira] [Commented] (PIO-120) Process hangs if Elasticsearch is not available during train
[ https://issues.apache.org/jira/browse/PIO-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162269#comment-16162269 ]

ASF GitHub Bot commented on PIO-120:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-predictionio/pull/432

> Process hangs if Elasticsearch is not available during train
>
> Key: PIO-120
> URL: https://issues.apache.org/jira/browse/PIO-120
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.12.0-incubating
> Reporter: Mars Hall
> Assignee: Mars Hall
>
> I noticed that, when Elasticsearch is configured as meta storage, `pio train`
> will hang with the following error unless Elasticsearch is on-line/available:
> {code}
> Exception in thread "main" java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>     at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171)
>     at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145)
>     at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348)
>     at org.apache.predictionio.shaded.org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
>     at org.apache.predictionio.shaded.org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
>     at java.lang.Thread.run(Thread.java:745)
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
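
The report above shows the connection being refused, but the message does not say why the process then hangs instead of exiting. One defensive pattern, sketched below in Java, is to probe the storage endpoint with a bounded timeout before starting long-running work and to exit with an error if it is unreachable. This is an illustration only and not the fix in pull request #432: `StorageProbeSketch` is a hypothetical class, and the default host `localhost` and port `9200` are assumptions.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical pre-flight check, not part of PredictionIO.
final class StorageProbeSketch {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, and unknown host all count as unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: host and port default to a local Elasticsearch-style endpoint.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9200;
        if (!isReachable(host, port, 2000)) {
            System.err.println("Storage endpoint " + host + ":" + port + " is not reachable; aborting.");
            System.exit(1);
        }
        System.out.println("Storage endpoint " + host + ":" + port + " is reachable.");
    }
}
```

A TCP probe only shows that something is listening on the port; it does not confirm that the service behind it is healthy.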
[jira] [Commented] (PIO-59) "pio app new" sometimes takes long time
[ https://issues.apache.org/jira/browse/PIO-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161714#comment-16161714 ]

ASF GitHub Bot commented on PIO-59:

Github user dszeto commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/367

    I am convinced that this PR does not decrease the security of generating random app access keys. I will merge if there is no objection.

> "pio app new" sometimes takes long time
>
> Key: PIO-59
> URL: https://issues.apache.org/jira/browse/PIO-59
> Project: PredictionIO
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.11.0-incubating
> Reporter: Shinsuke Sugaya
> Priority: Minor
> Fix For: 0.12.0-incubating
>
> Some users reported this problem on the user mailing list, and I also encountered it.
> Checking the stack traces, I think the cause is an entropy shortage for /dev/random.
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x7fc94803f800 nid=0x49a9 runnable [0x7fc94fa1b000]
>    java.lang.Thread.State: RUNNABLE
>   at java.io.FileInputStream.readBytes(Native Method)
>   at java.io.FileInputStream.read(FileInputStream.java:255)
>   at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:424)
>   at sun.security.provider.NativePRNG$RandomIO.ensureBufferValid(NativePRNG.java:525)
>   at sun.security.provider.NativePRNG$RandomIO.implNextBytes(NativePRNG.java:544)
>   - locked <0x0003d34e8a48> (a java.lang.Object)
>   at sun.security.provider.NativePRNG$RandomIO.access$400(NativePRNG.java:331)
>   at sun.security.provider.NativePRNG$Blocking.engineNextBytes(NativePRNG.java:268)
>   at java.security.SecureRandom.nextBytes(SecureRandom.java:468)
>   at org.apache.predictionio.data.storage.AccessKeys$class.generateKey(AccessKeys.scala:71)
>   at org.apache.predictionio.data.storage.elasticsearch.ESAccessKeys.generateKey(ESAccessKeys.scala:40)
>   at org.apache.predictionio.data.storage.elasticsearch.ESAccessKeys.insert(ESAccessKeys.scala:60)
>   at org.apache.predictionio.tools.commands.App$$anonfun$create$4$$anonfun$apply$5.apply(App.scala:71)
>   at org.apache.predictionio.tools.commands.App$$anonfun$create$4$$anonfun$apply$5.apply(App.scala:62)
>   at scala.Option.map(Option.scala:145)
>   at org.apache.predictionio.tools.commands.App$$anonfun$create$4.apply(App.scala:62)
>   at org.apache.predictionio.tools.commands.App$$anonfun$create$4.apply(App.scala:55)
>   at scala.Option.getOrElse(Option.scala:120)
>   at org.apache.predictionio.tools.commands.App$.create(App.scala:55)
>   at org.apache.predictionio.tools.console.Pio$App$.create(Pio.scala:172)
>   at org.apache.predictionio.tools.console.Console$$anonfun$main$1.apply(Console.scala:683)
>   at org.apache.predictionio.tools.console.Console$$anonfun$main$1.apply(Console.scala:626)
>   at scala.Option.map(Option.scala:145)
>   at org.apache.predictionio.tools.console.Console$.main(Console.scala:626)
>   at org.apache.predictionio.tools.console.Console.main(Console.scala)
> {code}
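The trace shows key generation stalled inside `NativePRNG$Blocking`, that is, while reading from the blocking random device. To illustrate the general idea of drawing key material from a non-blocking cryptographic source instead, here is a small Python sketch. It is an analogy, not the Scala change in pull request 367; the function name `generate_access_key` and the 64-character URL-safe key format are assumptions made for the example.

```python
import secrets


def generate_access_key(num_bytes: int = 48) -> str:
    """Return a URL-safe random key; 48 random bytes encode to 64 characters."""
    # secrets is backed by the operating system's CSPRNG (os.urandom),
    # so this call does not read from /dev/random directly.
    return secrets.token_urlsafe(num_bytes)


key = generate_access_key()
print(len(key))
```

The point of the sketch is only that a cryptographically strong key does not require the blocking device; which JVM-side source PredictionIO ended up using is not stated in this thread.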
[jira] [Commented] (PIO-119) Bump up Elasticsearch to 5.5.2
[ https://issues.apache.org/jira/browse/PIO-119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160654#comment-16160654 ]

ASF GitHub Bot commented on PIO-119:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-predictionio/pull/430

> Bump up Elasticsearch to 5.5.2
>
> Key: PIO-119
> URL: https://issues.apache.org/jira/browse/PIO-119
> Project: PredictionIO
> Issue Type: Improvement
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> I encountered [this problem|https://discuss.elastic.co/t/org-elasticsearch-hadoop-rest-eshadoopinvalidrequest-returned-400-bad-request/95803].
> To support elasticsearch 5.5, we need to update it.
[jira] [Commented] (PIO-116) PySpark Support
[ https://issues.apache.org/jira/browse/PIO-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160600#comment-16160600 ]

ASF GitHub Bot commented on PIO-116:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-predictionio/pull/427

> PySpark Support
>
> Key: PIO-116
> URL: https://issues.apache.org/jira/browse/PIO-116
> Project: PredictionIO
> Issue Type: New Feature
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> This provides PySpark support with minimal PIO changes.
> 1. Support pyspark on pio-shell
> 2. Add python files to use pyspark
> 3. Add --main-py-file option to "pio train" to submit .py file to spark
> Note that this only provides fixes for Spark 2.x
> (because these fixes expect to use SparkML).
> Sample project is:
> https://github.com/jpioug/predictionio-template-iris
> (For prediction API, Scala code is used.)
[jira] [Commented] (PIO-117) Cannot delete event data on ESLEvents
[ https://issues.apache.org/jira/browse/PIO-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160170#comment-16160170 ]

ASF GitHub Bot commented on PIO-117:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-predictionio-sdk-python/pull/22

> Cannot delete event data on ESLEvents
>
> Key: PIO-117
> URL: https://issues.apache.org/jira/browse/PIO-117
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> With Elasticsearch event storage, delete requests do not work.
> {noformat}
> $ curl -XDELETE "localhost:7070/events/AV5QAIH0VGhejKgUn-2J.json?accessKey=..."
> {
>   "message": "Did not find value which can be converted into java.lang.String"
> }
> {noformat}
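For reference, the failing call from the report can also be expressed with Python's standard library. This sketch only builds the request and does not send it; the helper name `build_delete_request`, the `http://` scheme, and the placeholder access key are assumptions, while the host, port, path layout, and event ID are taken from the `curl` command in the report.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_delete_request(base_url: str, event_id: str, access_key: str) -> Request:
    """Build (but do not send) a DELETE request for a single event."""
    path = f"/events/{quote(event_id, safe='')}.json"
    query = urlencode({"accessKey": access_key})
    return Request(f"{base_url}{path}?{query}", method="DELETE")


# "YOUR_ACCESS_KEY" is a placeholder; the report elides the real key.
req = build_delete_request("http://localhost:7070", "AV5QAIH0VGhejKgUn-2J", "YOUR_ACCESS_KEY")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send the request to a running event server.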
[jira] [Commented] (PIO-116) PySpark Support
[ https://issues.apache.org/jira/browse/PIO-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160164#comment-16160164 ]

ASF GitHub Bot commented on PIO-116:

Github user marevol commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/427

    Thanks! I'll merge this PR tomorrow.

> PySpark Support
>
> Key: PIO-116
> URL: https://issues.apache.org/jira/browse/PIO-116
> Project: PredictionIO
> Issue Type: New Feature
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> This provides PySpark support with minimal PIO changes.
> 1. Support pyspark on pio-shell
> 2. Add python files to use pyspark
> 3. Add --main-py-file option to "pio train" to submit .py file to spark
> Note that this only provides fixes for Spark 2.x
> (because these fixes expect to use SparkML).
> Sample project is:
> https://github.com/jpioug/predictionio-template-iris
> (For prediction API, Scala code is used.)
[jira] [Commented] (PIO-116) PySpark Support
[ https://issues.apache.org/jira/browse/PIO-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160128#comment-16160128 ]

ASF GitHub Bot commented on PIO-116:

Github user dszeto commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/427

    Verified working with Python 2. Thanks!

> PySpark Support
>
> Key: PIO-116
> URL: https://issues.apache.org/jira/browse/PIO-116
> Project: PredictionIO
> Issue Type: New Feature
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> This provides PySpark support with minimal PIO changes.
> 1. Support pyspark on pio-shell
> 2. Add python files to use pyspark
> 3. Add --main-py-file option to "pio train" to submit .py file to spark
> Note that this only provides fixes for Spark 2.x
> (because these fixes expect to use SparkML).
> Sample project is:
> https://github.com/jpioug/predictionio-template-iris
> (For prediction API, Scala code is used.)
[jira] [Commented] (PIO-116) PySpark Support
[ https://issues.apache.org/jira/browse/PIO-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158713#comment-16158713 ]

ASF GitHub Bot commented on PIO-116:

Github user marevol commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/427

    Added __init__.py. It will work on Python 2.7.

> PySpark Support
>
> Key: PIO-116
> URL: https://issues.apache.org/jira/browse/PIO-116
> Project: PredictionIO
> Issue Type: New Feature
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> This provides PySpark support with minimal PIO changes.
> 1. Support pyspark on pio-shell
> 2. Add python files to use pyspark
> 3. Add --main-py-file option to "pio train" to submit .py file to spark
> Note that this only provides fixes for Spark 2.x
> (because these fixes expect to use SparkML).
> Sample project is:
> https://github.com/jpioug/predictionio-template-iris
> (For prediction API, Scala code is used.)
[jira] [Commented] (PIO-117) Cannot delete event data on ESLEvents
[ https://issues.apache.org/jira/browse/PIO-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157945#comment-16157945 ]

ASF GitHub Bot commented on PIO-117:

Github user marevol closed the pull request at:

    https://github.com/apache/incubator-predictionio-sdk-python/pull/22

> Cannot delete event data on ESLEvents
>
> Key: PIO-117
> URL: https://issues.apache.org/jira/browse/PIO-117
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> With Elasticsearch event storage, delete requests do not work.
> {noformat}
> $ curl -XDELETE "localhost:7070/events/AV5QAIH0VGhejKgUn-2J.json?accessKey=..."
> {
>   "message": "Did not find value which can be converted into java.lang.String"
> }
> {noformat}
[jira] [Commented] (PIO-117) Cannot delete event data on ESLEvents
[ https://issues.apache.org/jira/browse/PIO-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157946#comment-16157946 ]

ASF GitHub Bot commented on PIO-117:

GitHub user marevol reopened a pull request:

    https://github.com/apache/incubator-predictionio-sdk-python/pull/22

    Add travis test and Refactoring

    I'll merge this PR after [PIO-117](https://issues.apache.org/jira/browse/PIO-117).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/marevol/incubator-predictionio-sdk-python travis

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio-sdk-python/pull/22.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22

commit cddbd06a8bfa1c904d5b7652b4f1ee72703a0de6
Author: Shinsuke Sugaya
Date: 2017-09-04T09:00:57Z

    add travis and refactoring

commit edeaa98c932876a4937aee6fd5e8bc4f5fa39551
Author: Shinsuke Sugaya
Date: 2017-09-08T00:09:38Z

    use apache github repository

> Cannot delete event data on ESLEvents
>
> Key: PIO-117
> URL: https://issues.apache.org/jira/browse/PIO-117
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> With Elasticsearch event storage, delete requests do not work.
> {noformat}
> $ curl -XDELETE "localhost:7070/events/AV5QAIH0VGhejKgUn-2J.json?accessKey=..."
> {
>   "message": "Did not find value which can be converted into java.lang.String"
> }
> {noformat}
[jira] [Commented] (PIO-117) Cannot delete event data on ESLEvents
[ https://issues.apache.org/jira/browse/PIO-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157889#comment-16157889 ]

ASF GitHub Bot commented on PIO-117:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-predictionio/pull/428

> Cannot delete event data on ESLEvents
>
> Key: PIO-117
> URL: https://issues.apache.org/jira/browse/PIO-117
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> With Elasticsearch event storage, delete requests do not work.
> {noformat}
> $ curl -XDELETE "localhost:7070/events/AV5QAIH0VGhejKgUn-2J.json?accessKey=..."
> {
>   "message": "Did not find value which can be converted into java.lang.String"
> }
> {noformat}
[jira] [Commented] (PIO-118) ClassCastException from NullWritable to Text in ESEventsUtil
[ https://issues.apache.org/jira/browse/PIO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157890#comment-16157890 ]

ASF GitHub Bot commented on PIO-118:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-predictionio/pull/429

> ClassCastException from NullWritable to Text in ESEventsUtil
>
> Key: PIO-118
> URL: https://issues.apache.org/jira/browse/PIO-118
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> {noformat}
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.Text
>   at org.apache.predictionio.data.storage.elasticsearch.ESEventsUtil$.getOptStringCol$1(ESEventsUtil.scala:58)
>   at org.apache.predictionio.data.storage.elasticsearch.ESEventsUtil$.resultToEvent(ESEventsUtil.scala:68)
>   at org.apache.predictionio.data.storage.elasticsearch.ESPEvents$$anonfun$5.apply(ESPEvents.scala:89)
>   at org.apache.predictionio.data.storage.elasticsearch.ESPEvents$$anonfun$5.apply(ESPEvents.scala:87)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>   at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:99)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   ... 1 more
> {noformat}
[jira] [Commented] (PIO-120) Process hangs if Elasticsearch is not available during train
[ https://issues.apache.org/jira/browse/PIO-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157788#comment-16157788 ]

ASF GitHub Bot commented on PIO-120:

Github user mars commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/432

    Would be great to have this included in 0.12.0 release.

> Process hangs if Elasticsearch is not available during train
>
> Key: PIO-120
> URL: https://issues.apache.org/jira/browse/PIO-120
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.12.0-incubating
> Reporter: Mars Hall
> Assignee: Mars Hall
>
> I noticed that, when Elasticsearch is configured as meta storage, `pio train`
> will hang with the following error unless Elasticsearch is on-line/available:
> {code}
> Exception in thread "main" java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>   at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171)
>   at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145)
>   at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348)
>   at org.apache.predictionio.shaded.org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
>   at org.apache.predictionio.shaded.org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
[jira] [Commented] (PIO-120) Process hangs if Elasticsearch is not available during train
[ https://issues.apache.org/jira/browse/PIO-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157787#comment-16157787 ]

ASF GitHub Bot commented on PIO-120:

GitHub user mars opened a pull request:

    https://github.com/apache/incubator-predictionio/pull/432

    [PIO-120] Process hangs if Elasticsearch is not available during train

    Fixes [PIO-120](https://issues.apache.org/jira/browse/PIO-120)

    This changeset ensures that the process exits gracefully after ES connection error.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mars/incubator-predictionio fix-es-hang-on-train

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/432.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #432

commit f1c7337e246c9bd2bed5cc080efcf3dc81e4b055
Author: Mars Hall
Date: 2017-09-07T21:38:46Z

    Graceful exit after ES connection error during train.

> Process hangs if Elasticsearch is not available during train
>
> Key: PIO-120
> URL: https://issues.apache.org/jira/browse/PIO-120
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.12.0-incubating
> Reporter: Mars Hall
> Assignee: Mars Hall
>
> I noticed that, when Elasticsearch is configured as meta storage, `pio train`
> will hang with the following error unless Elasticsearch is on-line/available:
> {code}
> Exception in thread "main" java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>   at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171)
>   at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145)
>   at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348)
>   at org.apache.predictionio.shaded.org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
>   at org.apache.predictionio.shaded.org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
[jira] [Commented] (PIO-117) Cannot delete event data on ESLEvents
[ https://issues.apache.org/jira/browse/PIO-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157626#comment-16157626 ]

ASF GitHub Bot commented on PIO-117:

Github user mars commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/428

    👍 looks good

> Cannot delete event data on ESLEvents
>
> Key: PIO-117
> URL: https://issues.apache.org/jira/browse/PIO-117
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> With Elasticsearch event storage, delete requests do not work.
> {noformat}
> $ curl -XDELETE "localhost:7070/events/AV5QAIH0VGhejKgUn-2J.json?accessKey=..."
> {
>   "message": "Did not find value which can be converted into java.lang.String"
> }
> {noformat}
[jira] [Commented] (PIO-119) Bump up Elasticsearch to 5.5.2
[ https://issues.apache.org/jira/browse/PIO-119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157624#comment-16157624 ]

ASF GitHub Bot commented on PIO-119:

Github user mars commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/430

    Just tested build, train, batchpredict, & deploy locally with ES 5.5.2.

    👍 looks good!

> Bump up Elasticsearch to 5.5.2
>
> Key: PIO-119
> URL: https://issues.apache.org/jira/browse/PIO-119
> Project: PredictionIO
> Issue Type: Improvement
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> I encountered [this problem|https://discuss.elastic.co/t/org-elasticsearch-hadoop-rest-eshadoopinvalidrequest-returned-400-bad-request/95803].
> To support elasticsearch 5.5, we need to update it.
[jira] [Commented] (PIO-119) Bump up Elasticsearch to 5.5.2
[ https://issues.apache.org/jira/browse/PIO-119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156539#comment-16156539 ]

ASF GitHub Bot commented on PIO-119:

Github user dszeto commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/430

    @mars may want to test this.

> Bump up Elasticsearch to 5.5.2
>
> Key: PIO-119
> URL: https://issues.apache.org/jira/browse/PIO-119
> Project: PredictionIO
> Issue Type: Improvement
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> I encountered [this problem|https://discuss.elastic.co/t/org-elasticsearch-hadoop-rest-eshadoopinvalidrequest-returned-400-bad-request/95803].
> To support elasticsearch 5.5, we need to update it.
[jira] [Commented] (PIO-116) PySpark Support
[ https://issues.apache.org/jira/browse/PIO-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156537#comment-16156537 ]

ASF GitHub Bot commented on PIO-116:

Github user dszeto commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/427

    @marevol Looks like Python 3 is also a hard requirement? This does not work when I run `pyspark` with Python 2.

> PySpark Support
>
> Key: PIO-116
> URL: https://issues.apache.org/jira/browse/PIO-116
> Project: PredictionIO
> Issue Type: New Feature
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> This provides PySpark support with minimal PIO changes.
> 1. Support pyspark on pio-shell
> 2. Add python files to use pyspark
> 3. Add --main-py-file option to "pio train" to submit .py file to spark
> Note that this only provides fixes for Spark 2.x
> (because these fixes expect to use SparkML).
> Sample project is:
> https://github.com/jpioug/predictionio-template-iris
> (For prediction API, Scala code is used.)
[jira] [Commented] (PIO-119) Bump up Elasticsearch to 5.5.2
[ https://issues.apache.org/jira/browse/PIO-119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154851#comment-16154851 ]

ASF GitHub Bot commented on PIO-119:

GitHub user marevol opened a pull request:

    https://github.com/apache/incubator-predictionio/pull/430

    [PIO-119] Bump up Elasticsearch to 5.5.2

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/marevol/incubator-predictionio es552

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/430.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #430

commit 8e725b7c6f1206cd0858e1eca3618673a8966bf5
Author: Shinsuke Sugaya
Date: 2017-09-06T05:59:08Z

    bump up elasticsearch to 5.5.2

> Bump up Elasticsearch to 5.5.2
>
> Key: PIO-119
> URL: https://issues.apache.org/jira/browse/PIO-119
> Project: PredictionIO
> Issue Type: Improvement
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> I encountered [this problem|https://discuss.elastic.co/t/org-elasticsearch-hadoop-rest-eshadoopinvalidrequest-returned-400-bad-request/95803].
> To support elasticsearch 5.5, we need to update it.
[jira] [Commented] (PIO-118) ClassCastException from NullWritable to Text in ESEventsUtil
[ https://issues.apache.org/jira/browse/PIO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154839#comment-16154839 ]

ASF GitHub Bot commented on PIO-118:

GitHub user marevol opened a pull request:

    https://github.com/apache/incubator-predictionio/pull/429

    [PIO-118] ClassCastException from NullWritable to Text in ESEventsUtil

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/marevol/incubator-predictionio cce_nullwritable

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/429.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #429

commit 97f363f1d08ce0662781d3bd3454c586186a4868
Author: Shinsuke Sugaya
Date: 2017-09-06T05:36:16Z

    fix cast exception from NullWritable

> ClassCastException from NullWritable to Text in ESEventsUtil
>
> Key: PIO-118
> URL: https://issues.apache.org/jira/browse/PIO-118
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> {noformat}
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.Text
>   at org.apache.predictionio.data.storage.elasticsearch.ESEventsUtil$.getOptStringCol$1(ESEventsUtil.scala:58)
>   at org.apache.predictionio.data.storage.elasticsearch.ESEventsUtil$.resultToEvent(ESEventsUtil.scala:68)
>   at org.apache.predictionio.data.storage.elasticsearch.ESPEvents$$anonfun$5.apply(ESPEvents.scala:89)
>   at org.apache.predictionio.data.storage.elasticsearch.ESPEvents$$anonfun$5.apply(ESPEvents.scala:87)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>   at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:99)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   ... 1 more
> {noformat}
[jira] [Commented] (PIO-117) Cannot delete event data on ESLEvents
[ https://issues.apache.org/jira/browse/PIO-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154669#comment-16154669 ]

ASF GitHub Bot commented on PIO-117:

Github user takezoe commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/428

    Nice catch! LGTM!

> Cannot delete event data on ESLEvents
>
> Key: PIO-117
> URL: https://issues.apache.org/jira/browse/PIO-117
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> With Elasticsearch event storage, delete requests do not work.
> {noformat}
> $ curl -XDELETE "localhost:7070/events/AV5QAIH0VGhejKgUn-2J.json?accessKey=..."
> {
>   "message": "Did not find value which can be converted into java.lang.String"
> }
> {noformat}
[jira] [Commented] (PIO-117) Cannot delete event data on ESLEvents
[ https://issues.apache.org/jira/browse/PIO-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154588#comment-16154588 ]

ASF GitHub Bot commented on PIO-117:

GitHub user marevol opened a pull request:

    https://github.com/apache/incubator-predictionio-sdk-python/pull/22

    [WIP] Add travis test and Refactoring

    I'll merge this PR after [PIO-117](https://issues.apache.org/jira/browse/PIO-117).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/marevol/incubator-predictionio-sdk-python travis

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio-sdk-python/pull/22.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22

commit cddbd06a8bfa1c904d5b7652b4f1ee72703a0de6
Author: Shinsuke Sugaya
Date: 2017-09-04T09:00:57Z

    add travis and refactoring

> Cannot delete event data on ESLEvents
>
> Key: PIO-117
> URL: https://issues.apache.org/jira/browse/PIO-117
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Reporter: Shinsuke Sugaya
> Assignee: Shinsuke Sugaya
>
> With Elasticsearch event storage, delete requests do not work.
> {noformat}
> $ curl -XDELETE "localhost:7070/events/AV5QAIH0VGhejKgUn-2J.json?accessKey=..."
> {
>   "message": "Did not find value which can be converted into java.lang.String"
> }
> {noformat}
[jira] [Commented] (PIO-72) In `pio-shell` jdbc.StorageClient cannot be loaded
[ https://issues.apache.org/jira/browse/PIO-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154018#comment-16154018 ]

ASF GitHub Bot commented on PIO-72:

Github user mars commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/401

    Yes @BrianOn99, I do believe [class loading for pio-shell is fixed](https://github.com/apache/incubator-predictionio/blob/develop/bin/pio-shell#L59) for the next release. If you run `make-distribution.sh` on the main `develop` branch, you'll get these fixes now.

> In `pio-shell` jdbc.StorageClient cannot be loaded
>
> Key: PIO-72
> URL: https://issues.apache.org/jira/browse/PIO-72
> Project: PredictionIO
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.11.0-incubating
> Environment: local developer machines
> Reporter: Mars Hall
> Assignee: Chan
> Fix For: 0.12.0-incubating
> Attachments: image.png
>
> Class loading/classpath is currently broken in {{pio-shell}}. Attached
> screenshot is the public docs that explain the intended functionality.
> Instead, users see errors when attempting to use storage classes: > {code:title=pio-shell.error|borderStyle=solid} > java.lang.ClassNotFoundException: jdbc.StorageClient > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:264) > at org.apache.predictionio.data.storage.Storage$.getClient(Storage.scala:228) > at > org.apache.predictionio.data.storage.Storage$.org$apache$predictionio$data$storage$Storage$$updateS2CM(Storage.scala:254) > at > org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:215) > at > org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:215) > at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:189) > at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:91) > at > org.apache.predictionio.data.storage.Storage$.sourcesToClientMeta(Storage.scala:215) > at > org.apache.predictionio.data.storage.Storage$.getDataObject(Storage.scala:284) > at > org.apache.predictionio.data.storage.Storage$.getDataObjectFromRepo(Storage.scala:269) > at > org.apache.predictionio.data.storage.Storage$.getMetaDataApps(Storage.scala:387) > at > org.apache.predictionio.data.store.Common$.appsDb$lzycompute(Common.scala:27) > at org.apache.predictionio.data.store.Common$.appsDb(Common.scala:27) > at org.apache.predictionio.data.store.Common$.appNameToId(Common.scala:32) > at > org.apache.predictionio.data.store.PEventStore$.aggregateProperties(PEventStore.scala:108) > at $line20.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.(:31) > at $line20.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.(:36) > at $line20.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.(:38) > at $line20.$read$$iwC$$iwC$$iwC$$iwC$$iwC.(:40) > at $line20.$read$$iwC$$iwC$$iwC$$iwC.(:42) > at 
$line20.$read$$iwC$$iwC$$iwC.(:44) > at $line20.$read$$iwC$$iwC.(:46) > at $line20.$read$$iwC.(:48) > at $line20.$read.(:50) > at $line20.$read$.(:54) > at $line20.$read$.() > at $line20.$eval$.(:7) > at $line20.$eval$.() > at $line20.$eval.$print() > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065) > at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346) > at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840) > at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871) > at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819) > at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857) > at > org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902) > at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814) > at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657) > at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665) > at > org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670) > at > org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997) > at > org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945) > at > org.apache.spark.repl.SparkILoop$$anonfun$
[jira] [Commented] (PIO-117) Cannot delete event data on ESLEvents
[ https://issues.apache.org/jira/browse/PIO-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153100#comment-16153100 ]

ASF GitHub Bot commented on PIO-117:

GitHub user marevol opened a pull request:

    https://github.com/apache/incubator-predictionio/pull/428

    [PIO-117] Cannot delete event data on ESLEvents

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/marevol/incubator-predictionio es_delete_response

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/428.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #428

commit faf747d82b140add8bf49bbfe7d53d31b079ed89
Author: Shinsuke Sugaya
Date:   2017-09-05T04:26:18Z

    check deleted property in response

> Cannot delete event data on ESLEvents
> -------------------------------------
>
>                 Key: PIO-117
>                 URL: https://issues.apache.org/jira/browse/PIO-117
>             Project: PredictionIO
>          Issue Type: Bug
>          Components: Core
>            Reporter: Shinsuke Sugaya
>            Assignee: Shinsuke Sugaya
>
> For elasticsearch event storage, delete request does not work.
> {noformat}
> $ curl -XDELETE "localhost:7070/events/AV5QAIH0VGhejKgUn-2J.json?accessKey=..."
> {
>   "message": "Did not find value which can be converted into java.lang.String"
> }
> {noformat}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-72) In `pio-shell` jdbc.StorageClient cannot be loaded
[ https://issues.apache.org/jira/browse/PIO-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153079#comment-16153079 ]

ASF GitHub Bot commented on PIO-72:
-----------------------------------

Github user BrianOn99 commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/401

    Good news to hear that 0.12 will be released soon!

> In `pio-shell` jdbc.StorageClient cannot be loaded
> --------------------------------------------------
>
>                 Key: PIO-72
>                 URL: https://issues.apache.org/jira/browse/PIO-72
>             Project: PredictionIO
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.11.0-incubating
>         Environment: local developer machines
>            Reporter: Mars Hall
>            Assignee: Chan
>             Fix For: 0.12.0-incubating
>
>         Attachments: image.png
>
>
> Class loading/classpath is currently broken in {{pio-shell}}. Attached
> screenshot is the public docs that explain the intended functionality.
[jira] [Commented] (PIO-72) In `pio-shell` jdbc.StorageClient cannot be loaded
[ https://issues.apache.org/jira/browse/PIO-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153076#comment-16153076 ]

ASF GitHub Bot commented on PIO-72:
-----------------------------------

Github user BrianOn99 commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/401

    @mars Thanks for the help. Yes, the JDBC driver is not loaded correctly, even though I have downloaded the jar. I need to add `export CLASSPATH` to `pio-shell`, like this:

    ```
    if [[ "$1" == "--with-spark" ]]
    then
      echo "Starting the PIO shell with the Apache Spark Shell."
      # Get paths of assembly jars to pass to spark-shell
      . ${PIO_HOME}/bin/compute-classpath.sh
      shift
      export CLASSPATH
      ${SPARK_HOME}/bin/spark-shell --jars ${ASSEMBLY_JAR},/PredictionIO-0.11.0-incubating/lib/spark/pio-data-jdbc-assembly-0.11.0-incubating.jar $@
    else
    ```

    and then everything works. I note that `export CLASSPATH` exists in `bin/pio-class`, so in my environment `pio status`, `pio train`, etc. work, but `pio-shell` does not call `bin/pio-class`, so `CLASSPATH` is not exported. Is this fixed upstream, or is my setup wrong?

> In `pio-shell` jdbc.StorageClient cannot be loaded
> --------------------------------------------------
>
>                 Key: PIO-72
>                 URL: https://issues.apache.org/jira/browse/PIO-72
>             Project: PredictionIO
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.11.0-incubating
>         Environment: local developer machines
>            Reporter: Mars Hall
>            Assignee: Chan
>             Fix For: 0.12.0-incubating
>
>         Attachments: image.png
>
>
> Class loading/classpath is currently broken in {{pio-shell}}. Attached
> screenshot is the public docs that explain the intended functionality.
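On the `export CLASSPATH` observation in the PIO-72 comment above: a shell variable that is set but not exported is not passed on to child processes, which would be consistent with `spark-shell` not seeing the JDBC jar when it is launched from `pio-shell`. The sketch below illustrates that rule from Python rather than bash, purely as an analogy. `child_sees` is a hypothetical helper and the jar path is a made-up placeholder; neither comes from PredictionIO.

```python
import os
import subprocess
import sys


def child_sees(name, env):
    """Return the value of `name` as seen by a child process started with `env`."""
    code = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], ''))"
    result = subprocess.run(
        [sys.executable, "-c", code, name],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Placeholder value for illustration; not a real PredictionIO path.
classpath = "/hypothetical/lib/postgresql.jar"

# Analogue of a variable that was set but never exported:
# it is absent from the environment handed to the child.
not_exported = {k: v for k, v in os.environ.items() if k != "CLASSPATH"}

# Analogue of `export CLASSPATH`: the child's environment includes it.
exported = dict(not_exported, CLASSPATH=classpath)

print(repr(child_sees("CLASSPATH", not_exported)))
print(repr(child_sees("CLASSPATH", exported)))
```

The first call reports an empty value and the second reports the placeholder path, mirroring the difference between `pio-shell` without and with the `export CLASSPATH` line.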
[jira] [Commented] (PIO-72) In `pio-shell` jdbc.StorageClient cannot be loaded
[ https://issues.apache.org/jira/browse/PIO-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152784#comment-16152784 ]

ASF GitHub Bot commented on PIO-72:
-----------------------------------

Github user mars commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/401

    Hi @BrianOn99,

    Adding that `--jars` option to the `pio-shell` command is the right solution, and then the "No suitable driver found" error can be solved by adding the Postgres driver to your PredictionIO install:

    1. Download the [Postgres JDBC driver](https://jdbc.postgresql.org/download.html) (probably the newest one for Java 8).
    2. Put it in the PredictionIO distribution's `lib/` directory (this directory is a sibling of the `bin/` directory where the `pio` command is located; any jars in that directory are automatically added to the classpath for `pio` commands).

    We're working on releasing 0.12!

> In `pio-shell` jdbc.StorageClient cannot be loaded
> --------------------------------------------------
>
>                 Key: PIO-72
>                 URL: https://issues.apache.org/jira/browse/PIO-72
>             Project: PredictionIO
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.11.0-incubating
>         Environment: local developer machines
>            Reporter: Mars Hall
>            Assignee: Chan
>             Fix For: 0.12.0-incubating
>
>         Attachments: image.png
>
>
> Class loading/classpath is currently broken in {{pio-shell}}. Attached
> screenshot is the public docs that explain the intended functionality.
[jira] [Commented] (PIO-72) In `pio-shell` jdbc.StorageClient cannot be loaded
[ https://issues.apache.org/jira/browse/PIO-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152162#comment-16152162 ]

ASF GitHub Bot commented on PIO-72:
-----------------------------------

Github user BrianOn99 commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/401

    I am using PredictionIO 0.11 with Postgres and hit `java.lang.ClassNotFoundException: jdbc.StorageClient` when using `pio-shell` while following the tutorial at https://predictionio.incubator.apache.org/datacollection/eventmodel/

    Since 0.12 is not released yet, I tried to apply this patch to 0.11 by appending `pio-data-jdbc-assembly-0.11.0-incubating.jar` to the `--jars` flag in the `pio-shell` file, but then another error pops up:

    ```
    java.sql.SQLException: No suitable driver found for jdbc:postgresql://localhost/pio
      at java.sql.DriverManager.getConnection(DriverManager.java:689)
      at java.sql.DriverManager.getConnection(DriverManager.java:247)
    ```

    Is it necessary to patch `Common.scala` and recompile? Or is there another way to do the same thing as `PEventStore.aggregateProperties`, as shown in the tutorial, without `pio-shell`?

    I really hope some guidance can be added to the tutorial; otherwise other newcomers like me will be confused.

> In `pio-shell` jdbc.StorageClient cannot be loaded
> --------------------------------------------------
>
>                 Key: PIO-72
>                 URL: https://issues.apache.org/jira/browse/PIO-72
>             Project: PredictionIO
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.11.0-incubating
>         Environment: local developer machines
>            Reporter: Mars Hall
>            Assignee: Chan
>             Fix For: 0.12.0-incubating
>
>         Attachments: image.png
>
>
> Class loading/classpath is currently broken in {{pio-shell}}. Attached
> screenshot is the public docs that explain the intended functionality.
[jira] [Commented] (PIO-116) PySpark Support
[ https://issues.apache.org/jira/browse/PIO-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152071#comment-16152071 ]

ASF GitHub Bot commented on PIO-116:

Github user marevol commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/427

    Thank you for checking it. Replaced with if-else.

> PySpark Support
> ---------------
>
>                 Key: PIO-116
>                 URL: https://issues.apache.org/jira/browse/PIO-116
>             Project: PredictionIO
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Shinsuke Sugaya
>            Assignee: Shinsuke Sugaya
>
> This provides PySpark support with minimum PIO changes.
> 1. Support pyspark on pio-shell
> 2. Add python files to use pyspark
> 3. Add --main-py-file option to "pio train" to submit .py file to spark
> Note that this provides fixes only for Spark 2.x
> (because these fixes expect to use SparkML).
> Sample project is:
> https://github.com/jpioug/predictionio-template-iris
> (For prediction API, Scala code is used.)

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-106) Elasticsearch 5.x StorageClient should reuse RestClient
[ https://issues.apache.org/jira/browse/PIO-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145871#comment-16145871 ]

ASF GitHub Bot commented on PIO-106:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-predictionio/pull/421

> Elasticsearch 5.x StorageClient should reuse RestClient
> -------------------------------------------------------
>
>                 Key: PIO-106
>                 URL: https://issues.apache.org/jira/browse/PIO-106
>             Project: PredictionIO
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.11.0-incubating
>            Reporter: Mars Hall
>            Assignee: Mars Hall
>
> When using the proposed [PIO-105 Batch
> Predictions|https://issues.apache.org/jira/browse/PIO-105] feature with an
> engine that queries Elasticsearch in {{Algorithm#predict}}, Elasticsearch's
> REST interface appears to become overloaded, ending with the Spark job being
> killed from errors like:
> {noformat}
> [ERROR] [ESChannels] Failed to access to /pio_meta/channels/_search
> [ERROR] [Utils] Aborting task
> [ERROR] [ESApps] Failed to access to /pio_meta/apps/_search
> [ERROR] [Executor] Exception in task 747.0 in stage 1.0 (TID 749)
> [ERROR] [Executor] Exception in task 735.0 in stage 1.0 (TID 737)
> [ERROR] [Common$] Invalid app name ur
> [ERROR] [Utils] Aborting task
> [ERROR] [URAlgorithm] Error when read recent events: java.lang.IllegalArgumentException: Invalid app name ur
> [ERROR] [Executor] Exception in task 749.0 in stage 1.0 (TID 751)
> [ERROR] [Utils] Aborting task
> [ERROR] [Executor] Exception in task 748.0 in stage 1.0 (TID 750)
> [WARN] [TaskSetManager] Lost task 749.0 in stage 1.0 (TID 751, localhost, executor driver): java.net.BindException: Can't assign requested address
>     at sun.nio.ch.Net.connect0(Native Method)
>     at sun.nio.ch.Net.connect(Net.java:454)
>     at sun.nio.ch.Net.connect(Net.java:446)
>     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
>     at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processSessionRequests(DefaultConnectingIOReactor.java:273)
>     at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:139)
>     at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348)
>     at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
>     at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> After these errors happen & the job is killed, Elasticsearch immediately
> recovers. It responds to queries normally. I researched what could cause this
> and found an [old issue in the main Elasticsearch
> repo|https://github.com/elastic/elasticsearch/issues/3647]. With the hints
> given therein about *using keep-alive in the ES client* to avoid these
> performance issues, I investigated how PredictionIO's [Elasticsearch
> StorageClient|https://github.com/apache/incubator-predictionio/tree/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch]
> manages its connections.
> I found that unlike the other StorageClients (Elasticsearch1, HBase, JDBC),
> Elasticsearch creates a new underlying connection, an Elasticsearch
> RestClient, for
> [every|https://github.com/apache/incubator-predictionio/blob/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESApps.scala#L80]
> [single|https://github.com/apache/incubator-predictionio/blob/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESApps.scala#L157]
> [query|https://github.com/apache/incubator-predictionio/blob/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESChannels.scala#L78]
> &
> [interaction|https://github.com/apache/incubator-predictionio/blob/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESEngineInstances.scala#L205]
> with its API. As a result, *there is no way Elasticsearch TCP connections
> can be reused via HTTP keep-alive*.
> High-performance workloads with Elasticsearch 5.x will suffer from these
> issues unless we refactor Elasticsearch StorageClient to share the underlying
> RestClient instead of [building a new one every time the client is
> used|https://github.com/apache/incubator-predictionio/blob/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/StorageClient.scala#L31].
> There are certainly different approaches we could take to sharing a
> RestClient so that its keep-alive behavior may work as designed:
> *
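
The contrast described in the PIO-106 report, one client per call versus one shared client, can be sketched roughly as follows. This is a toy model in Python and not PredictionIO or Elasticsearch code: `FakeRestClient`, `PerCallStorage`, `SharedStorage`, and `connections_used` are hypothetical names, and "opening a connection" is only a counter increment standing in for a TCP handshake.

```python
class FakeRestClient:
    """Hypothetical stand-in for an HTTP client; counts simulated connections."""

    connections_opened = 0

    def __init__(self):
        # Constructing a client stands in for opening a new TCP connection.
        FakeRestClient.connections_opened += 1

    def search(self, path):
        return {"path": path, "hits": []}

    def close(self):
        pass


class PerCallStorage:
    """Builds a new client for every query, so no connection can be kept alive."""

    def query(self, path):
        client = FakeRestClient()
        try:
            return client.search(path)
        finally:
            client.close()


class SharedStorage:
    """Builds one client up front and reuses it for every query."""

    def __init__(self):
        self._client = FakeRestClient()

    def query(self, path):
        return self._client.search(path)


def connections_used(storage_factory, n_queries):
    """Create a storage, run n_queries, and report how many clients were built."""
    before = FakeRestClient.connections_opened
    storage = storage_factory()
    for _ in range(n_queries):
        storage.query("/pio_meta/apps/_search")
    return FakeRestClient.connections_opened - before


print(connections_used(PerCallStorage, 1000))
print(connections_used(SharedStorage, 1000))
```

In this model the per-call variant builds a client for each of the 1000 queries, while the shared variant builds one. A real fix would also have to decide who owns and closes the shared client, which this sketch leaves out.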
[jira] [Commented] (PIO-115) Cache name-to-ID lookups for Storage app & channel
[ https://issues.apache.org/jira/browse/PIO-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145855#comment-16145855 ]

ASF GitHub Bot commented on PIO-115:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-predictionio/pull/424

> Cache name-to-ID lookups for Storage app & channel
> --------------------------------------------------
>
>                 Key: PIO-115
>                 URL: https://issues.apache.org/jira/browse/PIO-115
>             Project: PredictionIO
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.11.0-incubating
>            Reporter: Mars Hall
>            Assignee: Mars Hall
>
> When stress testing the Universal Recommender with high-concurrency HTTP/REST
> queries, we observed that Elasticsearch traffic consisted mostly of
> requests resolving the Storage app's name & channel, over and over and over
> again! In this case, [each per-query call to
> `LEventStore.findByEntity`|https://github.com/heroku/predictionio-engine-ur/blob/master/src/main/scala/URAlgorithm.scala#L694]
> re-resolves the app name to an ID.
> Implement memoization for the function that performs these name-to-ID
> lookups, so that only one set of lookups is performed per process for each
> app+channel combination.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
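
The memoization requested in PIO-115 can be illustrated with a minimal sketch, written here in Python rather than the project's Scala. Everything in it is an assumption made for illustration: the `_APPS` and `_CHANNELS` tables, the `resolve_uncached` and `resolve` functions, and the IDs are invented, and the real change may be structured quite differently.

```python
from functools import lru_cache

lookup_calls = 0  # counts how often the slow backend lookup actually runs

# Hypothetical metadata; real IDs would come from the storage backend.
_APPS = {"ur": 1}
_CHANNELS = {(1, "mobile"): 7}


def resolve_uncached(app_name, channel_name=None):
    """Simulate the slow name-to-ID lookup against the metadata store."""
    global lookup_calls
    lookup_calls += 1
    if app_name not in _APPS:
        raise ValueError(f"Invalid app name {app_name}")
    app_id = _APPS[app_name]
    if channel_name is None:
        return app_id, None
    key = (app_id, channel_name)
    if key not in _CHANNELS:
        raise ValueError(f"Invalid channel name {channel_name}")
    return app_id, _CHANNELS[key]


@lru_cache(maxsize=None)
def resolve(app_name, channel_name=None):
    """Memoized wrapper: one backend lookup per distinct argument tuple."""
    return resolve_uncached(app_name, channel_name)


# Simulate many per-query calls that all resolve the same names.
for _ in range(10_000):
    resolve("ur")
    resolve("ur", "mobile")

print(lookup_calls)
```

Only the first call for each app+channel combination reaches the simulated backend. One trade-off of this approach is that cached entries never expire, so an app or channel that is renamed or deleted while the process is running would keep resolving to its old ID until the cache is cleared.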
[jira] [Commented] (PIO-116) PySpark Support
[ https://issues.apache.org/jira/browse/PIO-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144973#comment-16144973 ]

ASF GitHub Bot commented on PIO-116:

GitHub user marevol opened a pull request:

    https://github.com/apache/incubator-predictionio/pull/427

    [PIO-116] PySpark Support

    This PR provides PySpark support with minimum PIO changes.

    1. Support pyspark on pio-shell
    2. Add python files to use pyspark
    3. Add --main-py-file option to "pio train" to submit .py file to spark

    Note that this provides fixes only for Spark 2.x
    (because these fixes expect to use SparkML).

    Sample project is:
    https://github.com/jpioug/predictionio-template-iris
    (For prediction API, Scala code is used.)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/marevol/incubator-predictionio pyspark

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/427.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #427

commit ee28fcf139c6ac8184d990cbdc4d43b00ff483fd
Author: Shinsuke Sugaya
Date:   2017-08-22T09:47:05Z

    add pyspark sub-command

commit 97f0343691ff1ca98f1ce65fc8ad3e25df6cd15b
Author: Shinsuke Sugaya
Date:   2017-08-27T14:16:18Z

    replace with values.toString

commit 2970397a6024f17872011979edcae1712f8a4362
Author: Shinsuke Sugaya
Date:   2017-08-28T10:04:24Z

    add --main-py-file option to train

> PySpark Support
> ---------------
>
>                 Key: PIO-116
>                 URL: https://issues.apache.org/jira/browse/PIO-116
>             Project: PredictionIO
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Shinsuke Sugaya
>            Assignee: Shinsuke Sugaya
>
> This provides PySpark support with minimum PIO changes.
> 1. Support pyspark on pio-shell
> 2. Add python files to use pyspark
> 3. Add --main-py-file option to "pio train" to submit .py file to spark
> Note that this provides fixes only for Spark 2.x
> (because these fixes expect to use SparkML).
> Sample project is:
> https://github.com/jpioug/predictionio-template-iris
> (For prediction API, Scala code is used.)

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143610#comment-16143610 ]

ASF GitHub Bot commented on PIO-110:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-predictionio/pull/425

> Refactor common code shared by CreateServer and BatchPredict
>
> Key: PIO-110
> URL: https://issues.apache.org/jira/browse/PIO-110
> Project: PredictionIO
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.12.0-incubating
> Reporter: Donald Szeto
> Assignee: Naoki Takezoe
> Labels: newbie
>
> {{BatchPredict}} was created in PIO-105 and has a substantial amount of shared code with {{CreateServer}}. It would be beneficial to refactor both of them to share as much common code as possible.
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143298#comment-16143298 ]

ASF GitHub Bot commented on PIO-110:

Github user shimamoto commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/425#discussion_r135429437

    --- Diff: core/src/main/scala/org/apache/predictionio/controller/PAlgorithm.scala ---
    @@ -115,15 +115,12 @@ abstract class PAlgorithm[PD, M, Q, P]
         algoParams: Params,
         bm: Any): Any = {
         val m = bm.asInstanceOf[M]
    -    if (m.isInstanceOf[PersistentModel[_]]) {
    -      if (m.asInstanceOf[PersistentModel[Params]].save(
    -        modelId, algoParams, sc)) {
    -        PersistentModelManifest(className = m.getClass.getName)
    -      } else {
    -        ()
    -      }
    -    } else {
    -      ()
    +    m match {
    +      case m: PersistentModel[Params] @unchecked =>
    +        if(m.save(modelId, algoParams, sc)){
    +          PersistentModelManifest(className = m.getClass.getName)
    +        } else ()
    +      case _ => ()
    --- End diff --

    I got it.
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142607#comment-16142607 ]

ASF GitHub Bot commented on PIO-110:

Github user takezoe commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/425

    @mars This pull request applies some of the changes I suggested. After checking the whole codebase, though, I decided that some of them should not be applied everywhere. For now, this pull request contains all of my refactoring, but I may create another pull request in the future.
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142594#comment-16142594 ]

ASF GitHub Bot commented on PIO-110:

Github user takezoe commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/425#discussion_r135380673

    --- Diff: core/src/main/scala/org/apache/predictionio/controller/PAlgorithm.scala ---
    @@ -115,15 +115,12 @@ abstract class PAlgorithm[PD, M, Q, P]
         algoParams: Params,
         bm: Any): Any = {
         val m = bm.asInstanceOf[M]
    -    if (m.isInstanceOf[PersistentModel[_]]) {
    -      if (m.asInstanceOf[PersistentModel[Params]].save(
    -        modelId, algoParams, sc)) {
    -        PersistentModelManifest(className = m.getClass.getName)
    -      } else {
    -        ()
    -      }
    -    } else {
    -      ()
    +    m match {
    +      case m: PersistentModel[Params] @unchecked =>
    +        if(m.save(modelId, algoParams, sc)){
    +          PersistentModelManifest(className = m.getClass.getName)
    +        } else ()
    +      case _ => ()
    --- End diff --

    > This code will be simpler to read by joining the if guard with the case statement.

    Yes, but I want to keep the code in the same form as the other algorithms.

    > Which is correct?

    I don't know. In any case, I think that I shouldn't modify the current behavior in this pull request, because this pull request is for refactoring.
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142123#comment-16142123 ]

ASF GitHub Bot commented on PIO-110:

Github user mars commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/425

    Great Scala-style improvements here, @takezoe. Great to see this gardening of the codebase 🤓

    I'm wondering, in [PIO-110](https://issues.apache.org/jira/browse/PIO-110) the objective is to refactor the common code between `CreateServer` and `BatchPredict`, yet I do not see that kind of change here. Are you working on extracting & reusing the common code as the next step for this PR?
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141270#comment-16141270 ]

ASF GitHub Bot commented on PIO-110:

Github user shimamoto commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/425#discussion_r135175043

    --- Diff: core/src/main/scala/org/apache/predictionio/controller/PAlgorithm.scala ---
    @@ -115,15 +115,12 @@ abstract class PAlgorithm[PD, M, Q, P]
         algoParams: Params,
         bm: Any): Any = {
         val m = bm.asInstanceOf[M]
    -    if (m.isInstanceOf[PersistentModel[_]]) {
    -      if (m.asInstanceOf[PersistentModel[Params]].save(
    -        modelId, algoParams, sc)) {
    -        PersistentModelManifest(className = m.getClass.getName)
    -      } else {
    -        ()
    -      }
    -    } else {
    -      ()
    +    m match {
    +      case m: PersistentModel[Params] @unchecked =>
    +        if(m.save(modelId, algoParams, sc)){
    +          PersistentModelManifest(className = m.getClass.getName)
    +        } else ()
    +      case _ => ()
    --- End diff --

    This code will be simpler to read if the `if` is joined to the case statement as a guard.

    ```scala
    case m: PersistentModel[Params] @unchecked if m.save(modelId, algoParams, sc) => ...
    case _ => ...
    ```

    But it looks like this behavior differs from the other algorithms (LAlgorithm.scala and P2LAlgorithm.scala). Which is correct?
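
The difference between the two forms discussed in this review comment can be illustrated with a minimal sketch. `Persistent`, `SavedManifest`, and `GuardDemo` are hypothetical stand-ins invented for illustration, not the PredictionIO types. With a guard, a failed `save()` falls through to the default case, so the two forms only agree when the default case returns the same value as the `else` branch (as it does in the `PAlgorithm` hunk above, where both are `()`).

```scala
// Hypothetical stand-ins for illustration; these are not the PredictionIO types.
trait Persistent { def save(): Boolean }
final case class SavedManifest(className: String)

object GuardDemo {
  // Form used in the pull request: test save() inside the case body.
  // A failed save returns () regardless of the fallback value.
  def nested(m: Any, fallback: Any): Any = m match {
    case p: Persistent =>
      if (p.save()) SavedManifest(p.getClass.getName) else ()
    case _ => fallback
  }

  // Form suggested in review: move save() into a pattern guard.
  // A failed save now falls through to the default case.
  def guarded(m: Any, fallback: Any): Any = m match {
    case p: Persistent if p.save() => SavedManifest(p.getClass.getName)
    case _ => fallback
  }
}
```

When the fallback is `()`, both functions return the same result for every input; with any other fallback, a model whose `save()` returns `false` yields `()` from `nested` and the fallback from `guarded`.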
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141265#comment-16141265 ]

ASF GitHub Bot commented on PIO-110:

Github user shimamoto commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/425#discussion_r135180795

    --- Diff: data/src/main/scala/org/apache/predictionio/data/api/Webhooks.scala ---
    @@ -62,22 +53,23 @@ private[predictionio] object Webhooks {
         }
         eventFuture.flatMap { eventOpt =>
    -      if (eventOpt.isEmpty) {
    -        Future successful {
    -          val message = s"webhooks connection for ${web} is not supported."
    -          (StatusCodes.NotFound, Map("message" -> message))
    -        }
    -      } else {
    -        val event = eventOpt.get
    -        val data = eventClient.futureInsert(event, appId, channelId).map { id =>
    -          val result = (StatusCodes.Created, Map("eventId" -> s"${id}"))
    -
    -          if (stats) {
    -            statsActorRef ! Bookkeeping(appId, result._1, event)
    +      eventOpt match {
    +        case None =>
    +          Future successful {
    +            val message = s"webhooks connection for ${web} is not supported."
    +            (StatusCodes.NotFound, Map("message" -> message))
    --- End diff --

    It's better to pattern match directly in the function argument.

    ```scala
    eventFuture.flatMap {
      case None => ...
      case Some(event) => ...
    }
    ```
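
A self-contained sketch of the suggestion, assuming a simplified handler: `FlatMapCaseDemo`, `respond`, the `Option[String]` event, and the integer status codes are hypothetical placeholders rather than the `Webhooks` code. A block of `case` clauses passed to `flatMap` is a pattern-matching anonymous function, so the intermediate `eventOpt => eventOpt match { ... }` is unnecessary, and `case Some(event)` already binds `event`.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FlatMapCaseDemo {
  // Hypothetical handler: Option[String] stands in for the parsed event,
  // and (Int, String) for the (status code, message) response.
  def respond(eventFuture: Future[Option[String]]): Future[(Int, String)] =
    eventFuture.flatMap {
      // The case clauses are the function argument; no named parameter is needed.
      case None        => Future.successful((404, "not supported"))
      case Some(event) => Future.successful((201, s"created $event"))
    }
}
```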
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141266#comment-16141266 ]

ASF GitHub Bot commented on PIO-110:

Github user shimamoto commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/425#discussion_r135180991

    --- Diff: data/src/main/scala/org/apache/predictionio/data/api/Webhooks.scala ---
    @@ -62,22 +53,23 @@ private[predictionio] object Webhooks {
         }
         eventFuture.flatMap { eventOpt =>
    -      if (eventOpt.isEmpty) {
    -        Future successful {
    -          val message = s"webhooks connection for ${web} is not supported."
    -          (StatusCodes.NotFound, Map("message" -> message))
    -        }
    -      } else {
    -        val event = eventOpt.get
    -        val data = eventClient.futureInsert(event, appId, channelId).map { id =>
    -          val result = (StatusCodes.Created, Map("eventId" -> s"${id}"))
    -
    -          if (stats) {
    -            statsActorRef ! Bookkeeping(appId, result._1, event)
    +      eventOpt match {
    +        case None =>
    +          Future successful {
    +            val message = s"webhooks connection for ${web} is not supported."
    +            (StatusCodes.NotFound, Map("message" -> message))
               }
    -          result
    -        }
    -        data
    +        case Some(event) =>
    +          val event = eventOpt.get
    --- End diff --

    This line is not needed; `case Some(event)` already binds `event`.
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141271#comment-16141271 ]

ASF GitHub Bot commented on PIO-110:

Github user shimamoto commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/425#discussion_r135187599

    --- Diff: storage/hbase/src/main/scala/org/apache/predictionio/data/storage/hbase/HBEventsUtil.scala ---
    @@ -376,32 +375,30 @@ object HBEventsUtil {
         }
         targetEntityType.foreach { tetOpt =>
    -      if (tetOpt.isEmpty) {
    -        val filter = createSkipRowIfColumnExistFilter("targetEntityType")
    -        filters.addFilter(filter)
    -      } else {
    -        tetOpt.foreach { tet =>
    +      tetOpt match {
    +        case None =>
    +          val filter = createSkipRowIfColumnExistFilter("targetEntityType")
    +          filters.addFilter(filter)
    +        case Some(tet) =>
    --- End diff --

    Same as above.
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141267#comment-16141267 ]

ASF GitHub Bot commented on PIO-110:

Github user shimamoto commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/425#discussion_r135191103

    --- Diff: data/src/main/scala/org/apache/predictionio/data/api/Webhooks.scala ---
    @@ -115,22 +107,22 @@ private[predictionio] object Webhooks {
         }
         eventFuture.flatMap { eventOpt =>
    -      if (eventOpt.isEmpty) {
    -        Future {
    -          val message = s"webhooks connection for ${web} is not supported."
    -          (StatusCodes.NotFound, Map("message" -> message))
    -        }
    -      } else {
    -        val event = eventOpt.get
    -        val data = eventClient.futureInsert(event, appId, channelId).map { id =>
    -          val result = (StatusCodes.Created, Map("eventId" -> s"${id}"))
    -
    -          if (stats) {
    -            statsActorRef ! Bookkeeping(appId, result._1, event)
    +      eventOpt match {
    +        case None =>
    +          Future {
    +            val message = s"webhooks connection for ${web} is not supported."
    +            (StatusCodes.NotFound, Map("message" -> message))
    +          }
    --- End diff --

    This was originally `Future {...}`, but `Future successful` looks like a good fit here.
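
For context on this comment, a minimal sketch of the two constructors in the Scala standard library. `Future.successful` wraps an already-computed value in a completed future and needs no `ExecutionContext`, while `Future { ... }` (`Future.apply`) schedules its body to run asynchronously on an implicit `ExecutionContext`. The object name, the message text, and the integer status code are placeholders for illustration.

```scala
import scala.concurrent.{ExecutionContext, Future}

object FutureSuccessfulDemo {
  // Placeholder response value; cheap to build, so there is nothing to run asynchronously.
  val message = "webhooks connection is not supported."

  // Already completed when created; no ExecutionContext is required.
  val immediate: Future[(Int, String)] = Future.successful((404, message))

  // Future.apply evaluates its body asynchronously on the supplied context.
  def scheduled(implicit ec: ExecutionContext): Future[(Int, String)] =
    Future { (404, message) }
}
```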
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141268#comment-16141268 ]

ASF GitHub Bot commented on PIO-110:

Github user shimamoto commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/425#discussion_r135181421

    --- Diff: data/src/main/scala/org/apache/predictionio/data/api/Webhooks.scala ---
    @@ -115,22 +107,22 @@ private[predictionio] object Webhooks {
         }
         eventFuture.flatMap { eventOpt =>
    -      if (eventOpt.isEmpty) {
    -        Future {
    -          val message = s"webhooks connection for ${web} is not supported."
    -          (StatusCodes.NotFound, Map("message" -> message))
    -        }
    -      } else {
    -        val event = eventOpt.get
    -        val data = eventClient.futureInsert(event, appId, channelId).map { id =>
    -          val result = (StatusCodes.Created, Map("eventId" -> s"${id}"))
    -
    -          if (stats) {
    -            statsActorRef ! Bookkeeping(appId, result._1, event)
    +      eventOpt match {
    +        case None =>
    +          Future {
    +            val message = s"webhooks connection for ${web} is not supported."
    +            (StatusCodes.NotFound, Map("message" -> message))
    +          }
    +        case Some(event) =>
    +          val data = eventClient.futureInsert(event, appId, channelId).map { id =>
    +            val result = (StatusCodes.Created, Map("eventId" -> s"${id}"))
    +
    +            if (stats) {
    +              statsActorRef ! Bookkeeping(appId, result._1, event)
    +            }
    +            result
    --- End diff --

    Same as above.
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141269#comment-16141269 ]

ASF GitHub Bot commented on PIO-110:

Github user shimamoto commented on a diff in the pull request:

    https://github.com/apache/incubator-predictionio/pull/425#discussion_r135187633

    --- Diff: storage/hbase/src/main/scala/org/apache/predictionio/data/storage/hbase/HBEventsUtil.scala ---
    @@ -376,32 +375,30 @@ object HBEventsUtil {
         }
         targetEntityType.foreach { tetOpt =>
    -      if (tetOpt.isEmpty) {
    -        val filter = createSkipRowIfColumnExistFilter("targetEntityType")
    -        filters.addFilter(filter)
    -      } else {
    -        tetOpt.foreach { tet =>
    +      tetOpt match {
    +        case None =>
    +          val filter = createSkipRowIfColumnExistFilter("targetEntityType")
    +          filters.addFilter(filter)
    +        case Some(tet) =>
               val filter = createBinaryFilter(
                 "targetEntityType", Bytes.toBytes(tet))
               // the entire row will be skipped if the column is not found.
               filter.setFilterIfMissing(true)
               filters.addFilter(filter)
    -        }
           }
         }
         targetEntityId.foreach { teidOpt =>
    -      if (teidOpt.isEmpty) {
    -        val filter = createSkipRowIfColumnExistFilter("targetEntityId")
    -        filters.addFilter(filter)
    -      } else {
    -        teidOpt.foreach { teid =>
    +      teidOpt match {
    +        case None =>
    +          val filter = createSkipRowIfColumnExistFilter("targetEntityId")
    +          filters.addFilter(filter)
    +        case Some(teid) =>
    --- End diff --

    Same as above.
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141076#comment-16141076 ]

ASF GitHub Bot commented on PIO-110:

Github user shimamoto commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/425

    @takezoe Will do.
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141012#comment-16141012 ]

ASF GitHub Bot commented on PIO-110:

Github user takezoe commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/425

    All tests have passed! @shimamoto Could you review this pull request?
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139930#comment-16139930 ]

ASF GitHub Bot commented on PIO-110:

Github user takezoe commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/425

    Closed and reopened to re-run Travis test.
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139926#comment-16139926 ]

ASF GitHub Bot commented on PIO-110:

Github user takezoe closed the pull request at:

    https://github.com/apache/incubator-predictionio/pull/425
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139927#comment-16139927 ]

ASF GitHub Bot commented on PIO-110:

GitHub user takezoe reopened a pull request:

    https://github.com/apache/incubator-predictionio/pull/425

    [PIO-110] Refactoring

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/takezoe/incubator-predictionio refactor-common-code

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/425.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #425

commit 24e1ec7a626857d320d69d8d09b68daead818831
Author: Naoki Takezoe
Date: 2017-08-23T05:42:52Z

    [PIO-110] Refactoring
[jira] [Commented] (PIO-110) Refactor common code shared by CreateServer and BatchPredict
[ https://issues.apache.org/jira/browse/PIO-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137936#comment-16137936 ]

ASF GitHub Bot commented on PIO-110:

GitHub user takezoe opened a pull request:

    https://github.com/apache/incubator-predictionio/pull/425

    [PIO-110]Refactoring

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/takezoe/incubator-predictionio refactor-common-code

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/425.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #425

commit 8a2aa88148b96625764b5c619614064136344e66
Author: Naoki Takezoe
Date: 2017-08-23T05:42:52Z

    [PIO-110]Refactoring
[jira] [Commented] (PIO-115) Cache name-to-ID lookups for Storage app & channel
[ https://issues.apache.org/jira/browse/PIO-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137419#comment-16137419 ]

ASF GitHub Bot commented on PIO-115:

Github user mars commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/424

    Thanks for your feedback @dszeto. I've addressed the code style & JIRA issue.

> Cache name-to-ID lookups for Storage app & channel
> --
>
> Key: PIO-115
> URL: https://issues.apache.org/jira/browse/PIO-115
> Project: PredictionIO
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.11.0-incubating
> Reporter: Mars Hall
> Assignee: Mars Hall
>
> When stress testing the Universal Recommender with high-concurrency HTTP/REST queries, we observed that Elasticsearch traffic was mostly composed of requests resolving the Storage app's name & channel, over and over and over again! In this case, [each per-query call to `LEventStore.findByEntity`|https://github.com/heroku/predictionio-engine-ur/blob/master/src/main/scala/URAlgorithm.scala#L694] re-resolves the app name to an ID.
> This changeset implements memoization for the function that performs these name-to-ID lookups, so that only one set of lookups is performed per process for each app+channel combination. As a result, we've seen overall throughput increase 📈 and error rate drop dramatically 📉.
> This common optimization affects all storage backends, not just Elasticsearch.
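
The memoization described in this issue can be sketched roughly as follows. This is an assumption-laden illustration, not the code from the pull request: `LookupCache`, `LookupCacheDemo`, `resolveAppId`, and the fake name-to-ID rule are all hypothetical. The idea is only that the first lookup for a key goes to the storage backend and every later lookup in the same process is served from memory.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.concurrent.TrieMap

// Hypothetical helper: wraps a lookup function with a thread-safe in-memory cache.
final class LookupCache[K, V](lookup: K => V) {
  private val cache = TrieMap.empty[K, V]

  // Under contention the lookup may run more than once for the same key,
  // but every caller ends up with the cached value.
  def apply(key: K): V = cache.getOrElseUpdate(key, lookup(key))
}

object LookupCacheDemo {
  // Counts how often the pretend backend is actually queried.
  val backendCalls = new AtomicInteger(0)

  // Stand-in for a storage query that resolves an app name to its ID.
  val resolveAppId: String => Int = { name =>
    backendCalls.incrementAndGet()
    name.length // fake ID, for illustration only
  }

  val cachedAppId = new LookupCache[String, Int](resolveAppId)
}
```

A cache like this never invalidates its entries, so it only suits mappings that do not change during the lifetime of the process, which is the situation the issue describes.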
[jira] [Commented] (PIO-106) Elasticsearch 5.x StorageClient should reuse RestClient
[ https://issues.apache.org/jira/browse/PIO-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126322#comment-16126322 ] ASF GitHub Bot commented on PIO-106: Github user mars closed the pull request at: https://github.com/apache/incubator-predictionio/pull/420 > Elasticsearch 5.x StorageClient should reuse RestClient > --- > > Key: PIO-106 > URL: https://issues.apache.org/jira/browse/PIO-106 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Affects Versions: 0.11.0-incubating >Reporter: Mars Hall >Assignee: Mars Hall > > When using the proposed [PIO-105 Batch > Predictions|https://issues.apache.org/jira/browse/PIO-105] feature with an > engine that queries Elasticsearch in {{Algorithm#predict}}, Elasticsearch's > REST interface appears to become overloaded, ending with the Spark job being > killed from errors like: > {noformat} > [ERROR] [ESChannels] Failed to access to /pio_meta/channels/_search > [ERROR] [Utils] Aborting task > [ERROR] [ESApps] Failed to access to /pio_meta/apps/_search > [ERROR] [Executor] Exception in task 747.0 in stage 1.0 (TID 749) > [ERROR] [Executor] Exception in task 735.0 in stage 1.0 (TID 737) > [ERROR] [Common$] Invalid app name ur > [ERROR] [Utils] Aborting task > [ERROR] [URAlgorithm] Error when read recent events: > java.lang.IllegalArgumentException: Invalid app name ur > [ERROR] [Executor] Exception in task 749.0 in stage 1.0 (TID 751) > [ERROR] [Utils] Aborting task > [ERROR] [Executor] Exception in task 748.0 in stage 1.0 (TID 750) > [WARN] [TaskSetManager] Lost task 749.0 in stage 1.0 (TID 751, localhost, > executor driver): java.net.BindException: Can't assign requested address > at sun.nio.ch.Net.connect0(Native Method) > at sun.nio.ch.Net.connect(Net.java:454) > at sun.nio.ch.Net.connect(Net.java:446) > at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648) > at > 
org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processSessionRequests(DefaultConnectingIOReactor.java:273) > at > org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:139) > at > org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) > at > org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) > at > org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) > at java.lang.Thread.run(Thread.java:745) > {noformat} > After these errors happen & the job is killed, Elasticsearch immediately > recovers. It responds to queries normally. I researched what could cause this > and found an [old issue in the main Elasticsearch > repo|https://github.com/elastic/elasticsearch/issues/3647]. With the hints > given therein about *using keep-alive in the ES client* to avoid these > performance issues, I investigated how PredictionIO's [Elasticsearch > StorageClient|https://github.com/apache/incubator-predictionio/tree/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch] > manages its connections. 
> I found that unlike the other StorageClients (Elasticsearch1, HBase, JDBC), > Elasticsearch creates a new underlying connection, an Elasticsearch > RestClient, for > [every|https://github.com/apache/incubator-predictionio/blob/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESApps.scala#L80] > > [single|https://github.com/apache/incubator-predictionio/blob/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESApps.scala#L157] > > [query|https://github.com/apache/incubator-predictionio/blob/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESChannels.scala#L78] > & > [interaction|https://github.com/apache/incubator-predictionio/blob/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESEngineInstances.scala#L205] > with its API. As a result, *there is no way Elasticsearch TCP connections > can be reused via HTTP keep-alive*. > High-performance workloads with Elasticsearch 5.x will suffer from these > issues unless we refactor Elasticsearch StorageClient to share the underlying > RestClient instead of [building a new one every time the client is > used|https://github.com/apache/incubator-predictionio/blob/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/StorageClient.scala#L31]. > There are certainly different approaches we could take to sharing a > RestClient so that its keep-alive behavior may work as designed: > * m
[jira] [Commented] (PIO-106) Elasticsearch 5.x StorageClient should reuse RestClient
[ https://issues.apache.org/jira/browse/PIO-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126321#comment-16126321 ] ASF GitHub Bot commented on PIO-106: Github user mars commented on the issue: https://github.com/apache/incubator-predictionio/pull/420 Closing in favor of https://github.com/apache/incubator-predictionio/pull/421 > Elasticsearch 5.x StorageClient should reuse RestClient > --- > > Key: PIO-106 > URL: https://issues.apache.org/jira/browse/PIO-106 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Affects Versions: 0.11.0-incubating >Reporter: Mars Hall >Assignee: Mars Hall
[jira] [Commented] (PIO-106) Elasticsearch 5.x StorageClient should reuse RestClient
[ https://issues.apache.org/jira/browse/PIO-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126318#comment-16126318 ]

ASF GitHub Bot commented on PIO-106:

GitHub user mars opened a pull request:

https://github.com/apache/incubator-predictionio/pull/421

Elasticsearch singleton client with authentication

Fixes both [PIO-106](https://issues.apache.org/jira/browse/PIO-106) & [PIO-114](https://issues.apache.org/jira/browse/PIO-114), replacing https://github.com/apache/incubator-predictionio/pull/372. These are combined because they each heavily revise the same class.

## Authentication

Add optional username-password configuration for the new Elasticsearch 5 client; in `pio-env.sh` config:

```bash
# Optional basic HTTP auth
PIO_STORAGE_SOURCES_ELASTICSEARCH_USERNAME=my-name
PIO_STORAGE_SOURCES_ELASTICSEARCH_PASSWORD=my-secret
```

These credentials are sent in each Elasticsearch request as an HTTP Basic Authorization header.

Enables use of public-cloud, hosted Elasticsearch clusters, such as [Bonsai on Heroku](https://elements.heroku.com/addons/bonsai).

## Singleton client

This PR moves to a singleton Elasticsearch RestClient which has built-in HTTP keep-alive and TCP connection pooling.

Running on this branch, we've seen a 2x speed-up in predictions from the Universal Recommender with ES5, and the feared "cannot assign requested address" 😱 Elasticsearch connection errors have completely disappeared. Running `pio batchpredict` for 160K queries results in only 7 total TCP connections to Elasticsearch. Previously that would escalate to ~25,000 connections before denying further connections.

**This fundamentally changes the interface for the new [Elasticsearch 5.x REST client](https://github.com/apache/incubator-predictionio/tree/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch)** introduced with PredictionIO 0.11.0-incubating. With this changeset, the `client` is a single instance of [`org.elasticsearch.client.RestClient`](https://github.com/elastic/elasticsearch/blob/master/client/rest/src/main/java/org/elasticsearch/client/RestClient.java).

🚨 **As a result of this change, any engine templates that directly use the Elasticsearch 5 StorageClient would require an update for compatibility.** The change is this:

### Original

```scala
val client: StorageClient = … // code to instantiate client
val restClient: RestClient = client.open()
try {
  restClient.performRequest(…)
} finally {
  restClient.close()
}
```

### With this PR

```scala
val client: RestClient = … // code to instantiate client
client.performRequest(…)
```

*No more balancing `open` & `close` as this is handled by using a new `CleanupFunctions` hook added to the framework in this PR.*

[Universal Recommender](https://github.com/actionml/universal-recommender) is the only template that I know of which directly uses the ES StorageClient outside of PredictionIO core. See example [UR changes for compatibility with this PR](https://github.com/heroku/predictionio-engine-ur/compare/esclient-singleton).

### Elasticsearch StorageClient changes

* reimplemented as singleton
* installs a cleanup function

See [StorageClient](https://github.com/apache/incubator-predictionio/compare/develop...mars:esclient-singleton?expand=1#diff-2926f4cfd93ccb02320e2a9503ccd223)

### Core changes

A new [`CleanupFunctions`](https://github.com/apache/incubator-predictionio/compare/develop...mars:esclient-singleton?expand=1#diff-2a958821ac58f019fbce38540c775f19) hook has been added which enables developers of storage modules to register anonymous functions with `CleanupFunctions.add { … }` to be executed after Spark-related commands/workflows.

The hook is called in a `finally { CleanupFunctions.run() }` from within:

* `pio import`
* `pio export`
* `pio train`
* `pio batchpredict`

Apologies for the huge indentation shifts from the requisite try-finally blocks:

```scala
try {
  // Freshly indented code.
} finally {
  CleanupFunctions.run()
}
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mars/incubator-predictionio esclient-singleton-with-auth

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/421.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #421

commit f30f27bcc09a397efb42a7923938beceaeac37bf
Author: Mars Hall
Date: 2017-08-08T23:29:15Z

    Migrate to singleto
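
As a rough illustration of the "HTTP Basic Authorization header" mentioned in the Authentication section of the pull request above, the sketch below shows how such a header value is formed from a username and password. It is not taken from the PR: the object name `BasicAuthSketch`, the helper `fromEnv`, and the idea of passing the environment in as a `Map` are assumptions made for this example, and the way PredictionIO actually reads the two `pio-env.sh` settings and attaches the header to the RestClient is not shown in this message.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

// Hypothetical helper, not part of PredictionIO.
object BasicAuthSketch {
  // Setting names as they appear in the pio-env.sh snippet of the PR.
  val UsernameKey = "PIO_STORAGE_SOURCES_ELASTICSEARCH_USERNAME"
  val PasswordKey = "PIO_STORAGE_SOURCES_ELASTICSEARCH_PASSWORD"

  /** Value of an HTTP `Authorization` header for Basic auth:
    * the word "Basic" followed by base64("username:password").
    */
  def basicAuthHeaderValue(username: String, password: String): String = {
    val credentials = s"$username:$password"
    val encoded = Base64.getEncoder.encodeToString(
      credentials.getBytes(StandardCharsets.UTF_8))
    s"Basic $encoded"
  }

  /** Returns a header value only when both optional settings are present
    * (assumption: the settings are looked up in a plain key/value map).
    */
  def fromEnv(env: Map[String, String]): Option[String] =
    for {
      user <- env.get(UsernameKey)
      pass <- env.get(PasswordKey)
    } yield basicAuthHeaderValue(user, pass)
}
```

Because the credentials are optional, `fromEnv` returns `None` when either setting is missing, which in this sketch would correspond to sending requests without an Authorization header.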
[jira] [Commented] (PIO-106) Elasticsearch 5.x StorageClient should reuse RestClient
[ https://issues.apache.org/jira/browse/PIO-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122849#comment-16122849 ] ASF GitHub Bot commented on PIO-106: Github user mars commented on the issue: https://github.com/apache/incubator-predictionio/pull/420 Seems to solve this [long ago reported Elasticsearch connection issue](https://github.com/elastic/elasticsearch/issues/3647) > Elasticsearch 5.x StorageClient should reuse RestClient > --- > > Key: PIO-106 > URL: https://issues.apache.org/jira/browse/PIO-106 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Affects Versions: 0.11.0-incubating >Reporter: Mars Hall >Assignee: Mars Hall
[jira] [Commented] (PIO-106) Elasticsearch 5.x StorageClient should reuse RestClient
[ https://issues.apache.org/jira/browse/PIO-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122618#comment-16122618 ] ASF GitHub Bot commented on PIO-106: Github user mars commented on a diff in the pull request: https://github.com/apache/incubator-predictionio/pull/420#discussion_r132601642 --- Diff: storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESEvaluationInstances.scala --- @@ -110,28 +104,24 @@ class ESEvaluationInstances(client: ESClient, config: StorageClientConfig, index error(s"Failed to access to /$index/$estype/$id", e) None } finally { - restClient.close() + client.close() --- End diff -- This `close` should be removed. > Elasticsearch 5.x StorageClient should reuse RestClient > --- > > Key: PIO-106 > URL: https://issues.apache.org/jira/browse/PIO-106 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Affects Versions: 0.11.0-incubating >Reporter: Mars Hall >Assignee: Mars Hall
[jira] [Commented] (PIO-106) Elasticsearch 5.x StorageClient should reuse RestClient
[ https://issues.apache.org/jira/browse/PIO-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122574#comment-16122574 ]

ASF GitHub Bot commented on PIO-106:

GitHub user mars opened a pull request:

https://github.com/apache/incubator-predictionio/pull/420

[PIO-106] Elasticsearch 5.x StorageClient should reuse RestClient

Implements [PIO-106](https://issues.apache.org/jira/browse/PIO-106)

This PR moves to a singleton Elasticsearch RestClient which has built-in HTTP keep-alive and TCP connection pooling.

Running on this branch, we've seen a 2x speed-up in predictions from the Universal Recommender with ES5, and the feared "cannot bind" 😱 Elasticsearch connection errors have completely disappeared. Running `pio batchpredict` for 170K queries results in only 7 total TCP connections to Elasticsearch. Previously that would escalate to ~25,000 connections before denying further connections.

**This fundamentally changes the interface for the new [Elasticsearch 5.x REST client](https://github.com/apache/incubator-predictionio/tree/develop/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch)** introduced with PredictionIO 0.11.0-incubating. With this changeset, the `client` is a single instance of [`org.elasticsearch.client.RestClient`](https://github.com/elastic/elasticsearch/blob/master/client/rest/src/main/java/org/elasticsearch/client/RestClient.java).

🚨 **As a result of this change, any engine templates that directly use the Elasticsearch 5 StorageClient would require an update for compatibility.** The change is this:

### Original

```scala
val client: StorageClient = … // code to instantiate client
val restClient: RestClient = client.open()
try {
  restClient.performRequest(…)
} finally {
  restClient.close()
}
```

### With this PR

```scala
val client: RestClient = … // code to instantiate client
client.performRequest(…)
```

*No more balancing `open` & `close` as this is handled by using a new `CleanupFunctions` hook added to the framework in this PR.*

[Universal Recommender](https://github.com/actionml/universal-recommender) is the only template that I know of which directly uses the ES StorageClient outside of PredictionIO core. See the [UR changes for compatibility with this PR](https://github.com/heroku/predictionio-engine-ur/compare/esclient-singleton).

### Elasticsearch StorageClient changes

* reimplemented as singleton
* installs a cleanup function

See [StorageClient](https://github.com/apache/incubator-predictionio/compare/develop...mars:esclient-singleton?expand=1#diff-2926f4cfd93ccb02320e2a9503ccd223)

### Core changes

A new [`CleanupFunctions`](https://github.com/apache/incubator-predictionio/compare/develop...mars:esclient-singleton?expand=1#diff-2a958821ac58f019fbce38540c775f19) hook has been added which enables developers of storage modules to register anonymous functions with `CleanupFunctions.add { … }` to be executed after Spark-related commands/workflows.

The hook is called in a `finally { CleanupFunctions.run() }` from within:

* `pio import`
* `pio export`
* `pio train`
* `pio batchpredict`

Apologies for the huge indentation shifts from the requisite try-finally blocks:

```scala
try {
  // Freshly indented code.
} finally {
  CleanupFunctions.run()
}
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mars/incubator-predictionio esclient-singleton

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/420.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #420

commit f30f27bcc09a397efb42a7923938beceaeac37bf
Author: Mars Hall
Date: 2017-08-08T23:29:15Z

    Migrate to singleton Elasticsearch client to use underlying connection pooling (PoolingNHttpClientConnectionManager)

commit d99927089a41cb85f525cb74bdf394eed4686bf2
Author: Mars Hall
Date: 2017-08-10T03:00:58Z

    Log stacktrace for Storage initialization errors.

commit dc4c31cbcddbb3b281d52b8099e210adc546d1ed
Author: Mars Hall
Date: 2017-08-10T22:55:38Z

    Remove shade rule that breaks Elasticsearch 5 client

commit 7634a7ab720239d5f8efda85f67b26bdaff797f8
Author: Mars Hall
Date: 2017-08-10T22:59:01Z

    Collect & run cleanup functions to allow spark-submit processes to end gracefully.

commit 5953451f40e554eafa887328122c794edbbd8f1d
Author: Mars Hall
Date: 2017-08-11T00:06:24Z

    Rename CleanupFunctions to match object name

> Elasticsearch 5.x StorageClient should reuse RestClient >
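
The pull request above describes two cooperating ideas: a client that is created once and shared, and a hook where storage modules register functions with `CleanupFunctions.add { … }` that a workflow later runs in `finally { CleanupFunctions.run() }`. The following is a minimal, self-contained sketch of that pattern only. It does not reproduce the PR's code: `CleanupHooks`, `FakeClient`, `SharedClient` and `Workflow` are hypothetical names invented for this example, `FakeClient` merely stands in for a real `RestClient`, and clearing the registry after running it is a choice made here, not something the message states.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the hook described in the PR.
object CleanupHooks {
  private val hooks = ArrayBuffer.empty[() => Unit]

  /** Registers a block to run later; the by-name parameter defers evaluation. */
  def add(hook: => Unit): Unit = synchronized {
    hooks += (() => hook)
  }

  /** Runs every registered block once, in registration order, then clears the registry. */
  def run(): Unit = synchronized {
    val pending = hooks.toList
    hooks.clear()
    pending.foreach(h => h())
  }
}

// Hypothetical closeable resource standing in for a shared REST client.
final class FakeClient {
  @volatile var closed = false

  def performRequest(path: String): String = {
    require(!closed, "client already closed")
    s"GET $path"
  }

  def close(): Unit = { closed = true }
}

object SharedClient {
  // `lazy val` is initialised at most once, so every caller gets the same
  // instance, and its cleanup function is registered exactly once.
  lazy val instance: FakeClient = {
    val client = new FakeClient
    CleanupHooks.add { client.close() }
    client
  }
}

object Workflow {
  /** Mirrors the try/finally shape quoted in the PR description. */
  def runJob(): String =
    try {
      SharedClient.instance.performRequest("/pio_meta/apps/_search")
    } finally {
      CleanupHooks.run()
    }
}
```

In this sketch callers never pair `open` with `close` themselves; the shared instance is closed once when the workflow finishes, so it is only suitable for cleanup at the end of a process.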
[jira] [Commented] (PIO-66) Document JIRA processes and add to public documentation
[ https://issues.apache.org/jira/browse/PIO-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114703#comment-16114703 ] ASF GitHub Bot commented on PIO-66: --- Github user asfgit closed the pull request at: https://github.com/apache/incubator-predictionio/pull/417 > Document JIRA processes and add to public documentation > --- > > Key: PIO-66 > URL: https://issues.apache.org/jira/browse/PIO-66 > Project: PredictionIO > Issue Type: Task >Reporter: Sara Asher >Assignee: Takako Shimamoto > Fix For: 0.12.0-incubating > > > https://docs.google.com/document/d/1nQpENncXZq72KeI3WMe_X8Xz8HKkYO2QC12GD3ZKP9g/edit#heading=h.4og7ud94e5g1 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-63) Fix incubator branding issues
[ https://issues.apache.org/jira/browse/PIO-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114042#comment-16114042 ] ASF GitHub Bot commented on PIO-63: --- Github user takezoe commented on the issue: https://github.com/apache/incubator-predictionio/pull/405 @dszeto I confirmed that my fix has been applied. Thanks for your help! > Fix incubator branding issues > - > > Key: PIO-63 > URL: https://issues.apache.org/jira/browse/PIO-63 > Project: PredictionIO > Issue Type: Bug >Affects Versions: 0.11.0-incubating >Reporter: Donald Szeto >Assignee: Naoki Takezoe > Fix For: 0.12.0-incubating > > > {quote} > John D. Ament > Please review the branding guide here: > http://incubator.apache.org/guides/branding.html > Specifically, we expect all podlings to show a logo (the actual logo has > changed) for the Incubator, and include a disclaimer (the same release > disclaimer) on the website. I can find neither on your website. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-111) Document pio batchpredict
[ https://issues.apache.org/jira/browse/PIO-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113287#comment-16113287 ] ASF GitHub Bot commented on PIO-111: Github user asfgit closed the pull request at: https://github.com/apache/incubator-predictionio/pull/418 > Document pio batchpredict > - > > Key: PIO-111 > URL: https://issues.apache.org/jira/browse/PIO-111 > Project: PredictionIO > Issue Type: Task > Components: Documentation >Affects Versions: 0.12.0-incubating >Reporter: Donald Szeto >Assignee: Mars Hall > Labels: newbie > > {{pio batchpredict}} is a new feature created in PIO-105. It needs to be > documented. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-66) Document JIRA processes and add to public documentation
[ https://issues.apache.org/jira/browse/PIO-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113261#comment-16113261 ] ASF GitHub Bot commented on PIO-66: --- Github user dszeto commented on the issue: https://github.com/apache/incubator-predictionio/pull/417 @shimamoto @takezoe Yes that sounds good. I'll merge this. Thanks! > Document JIRA processes and add to public documentation > --- > > Key: PIO-66 > URL: https://issues.apache.org/jira/browse/PIO-66 > Project: PredictionIO > Issue Type: Task >Reporter: Sara Asher >Assignee: Sara Asher > > https://docs.google.com/document/d/1nQpENncXZq72KeI3WMe_X8Xz8HKkYO2QC12GD3ZKP9g/edit#heading=h.4og7ud94e5g1 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-63) Fix incubator branding issues
[ https://issues.apache.org/jira/browse/PIO-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113148#comment-16113148 ] ASF GitHub Bot commented on PIO-63: --- Github user dszeto commented on the issue: https://github.com/apache/incubator-predictionio/pull/405 @takezoe It's done by our Jenkins project on ASF: https://builds.apache.org/job/PredictionIO-build-site/69/console Looks like the site built correctly but the publish step failed. I am investigating. > Fix incubator branding issues > - > > Key: PIO-63 > URL: https://issues.apache.org/jira/browse/PIO-63 > Project: PredictionIO > Issue Type: Bug >Affects Versions: 0.11.0-incubating >Reporter: Donald Szeto >Assignee: Donald Szeto > > {quote} > John D. Ament > Please review the branding guide here: > http://incubator.apache.org/guides/branding.html > Specifically, we expect all podlings to show a logo (the actual logo has > changed) for the Incubator, and include a disclaimer (the same release > disclaimer) on the website. I can find neither on your website. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-63) Fix incubator branding issues
[ https://issues.apache.org/jira/browse/PIO-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112441#comment-16112441 ] ASF GitHub Bot commented on PIO-63: --- Github user takezoe commented on the issue: https://github.com/apache/incubator-predictionio/pull/405 Merged to the `livedoc` branch, but the web site hasn't been updated yet. I'm wondering how the docs are deployed to the web site... > Fix incubator branding issues > - > > Key: PIO-63 > URL: https://issues.apache.org/jira/browse/PIO-63 > Project: PredictionIO > Issue Type: Bug >Affects Versions: 0.11.0-incubating >Reporter: Donald Szeto >Assignee: Donald Szeto > > {quote} > John D. Ament > Please review the branding guide here: > http://incubator.apache.org/guides/branding.html > Specifically, we expect all podlings to show a logo (the actual logo has > changed) for the Incubator, and include a disclaimer (the same release > disclaimer) on the website. I can find neither on your website. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-63) Fix incubator branding issues
[ https://issues.apache.org/jira/browse/PIO-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112244#comment-16112244 ] ASF GitHub Bot commented on PIO-63: --- Github user asfgit closed the pull request at: https://github.com/apache/incubator-predictionio/pull/405 > Fix incubator branding issues > - > > Key: PIO-63 > URL: https://issues.apache.org/jira/browse/PIO-63 > Project: PredictionIO > Issue Type: Bug >Affects Versions: 0.11.0-incubating >Reporter: Donald Szeto >Assignee: Donald Szeto > > {quote} > John D. Ament > Please review the branding guide here: > http://incubator.apache.org/guides/branding.html > Specifically, we expect all podlings to show a logo (the actual logo has > changed) for the Incubator, and include a disclaimer (the same release > disclaimer) on the website. I can find neither on your website. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-66) Document JIRA processes and add to public documentation
[ https://issues.apache.org/jira/browse/PIO-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112162#comment-16112162 ] ASF GitHub Bot commented on PIO-66: --- Github user shimamoto commented on the issue: https://github.com/apache/incubator-predictionio/pull/417 Currently there is no section for committer documentation. It would be better to clean up not only the Resources section but also the others. It might be a good idea to do that in a separate task. > Document JIRA processes and add to public documentation > --- > > Key: PIO-66 > URL: https://issues.apache.org/jira/browse/PIO-66 > Project: PredictionIO > Issue Type: Task >Reporter: Sara Asher >Assignee: Sara Asher > > https://docs.google.com/document/d/1nQpENncXZq72KeI3WMe_X8Xz8HKkYO2QC12GD3ZKP9g/edit#heading=h.4og7ud94e5g1 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-66) Document JIRA processes and add to public documentation
[ https://issues.apache.org/jira/browse/PIO-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112012#comment-16112012 ] ASF GitHub Bot commented on PIO-66: --- Github user takezoe commented on the issue: https://github.com/apache/incubator-predictionio/pull/417 I think adding a new section like "For Developers" is good for Release Cadence and other documentation for committers. Further, we should clean up the whole index in the left sidebar. The content is good, but the index is not very user-friendly. I think we can classify the entries more intelligibly. Anyway, we should do that in another ticket. How about merging this addition as is and creating a new ticket about restructuring the documentation? > Document JIRA processes and add to public documentation > --- > > Key: PIO-66 > URL: https://issues.apache.org/jira/browse/PIO-66 > Project: PredictionIO > Issue Type: Task >Reporter: Sara Asher >Assignee: Sara Asher > > https://docs.google.com/document/d/1nQpENncXZq72KeI3WMe_X8Xz8HKkYO2QC12GD3ZKP9g/edit#heading=h.4og7ud94e5g1 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-111) Document pio batchpredict
[ https://issues.apache.org/jira/browse/PIO-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111972#comment-16111972 ] ASF GitHub Bot commented on PIO-111: GitHub user mars opened a pull request: https://github.com/apache/incubator-predictionio/pull/418 batchpredict docs JIRA [PIO-111](https://issues.apache.org/jira/browse/PIO-111) You can merge this pull request into a Git repository by running: $ git pull https://github.com/mars/incubator-predictionio batchpredict-docs Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-predictionio/pull/418.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #418 commit 382d238f73fb04728b5fba9fc0484084ffc0945d Author: Mars Hall Date: 2017-08-02T22:21:39Z Update therubyracer gem to most recent patch-level for macOS 10.12 compatibility. commit eb79654f2c95abaf747f163bc43f86e8ed9328a0 Author: Mars Hall Date: 2017-08-03T00:29:39Z Documentation for `pio batchpredict` > Document pio batchpredict > - > > Key: PIO-111 > URL: https://issues.apache.org/jira/browse/PIO-111 > Project: PredictionIO > Issue Type: Task > Components: Documentation >Affects Versions: 0.12.0-incubating >Reporter: Donald Szeto >Assignee: Mars Hall > Labels: newbie > > {{pio batchpredict}} is a new feature created in PIO-105. It needs to be > documented. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-66) Document JIRA processes and add to public documentation
[ https://issues.apache.org/jira/browse/PIO-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109998#comment-16109998 ] ASF GitHub Bot commented on PIO-66: --- Github user dszeto commented on the issue: https://github.com/apache/incubator-predictionio/pull/417 Hey, what do you think about putting Release Cadence under the Getting Involved section? We should probably clean up the Resources section because it sounds pretty generic. > Document JIRA processes and add to public documentation > --- > > Key: PIO-66 > URL: https://issues.apache.org/jira/browse/PIO-66 > Project: PredictionIO > Issue Type: Task >Reporter: Sara Asher >Assignee: Sara Asher > > https://docs.google.com/document/d/1nQpENncXZq72KeI3WMe_X8Xz8HKkYO2QC12GD3ZKP9g/edit#heading=h.4og7ud94e5g1 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-105) Batch Predictions
[ https://issues.apache.org/jira/browse/PIO-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109864#comment-16109864 ] ASF GitHub Bot commented on PIO-105: Github user asfgit closed the pull request at: https://github.com/apache/incubator-predictionio/pull/412 > Batch Predictions > - > > Key: PIO-105 > URL: https://issues.apache.org/jira/browse/PIO-105 > Project: PredictionIO > Issue Type: New Feature > Components: Core >Reporter: Mars Hall >Assignee: Mars Hall > > Implement a new {{pio batchpredict}} command to enable massive, fast, batch > predictions from a trained model. Read a multi-object JSON file as the input > format, with one query object per line. Similarly, write results to a > multi-object JSON file, with one prediction result + its original query per > line. > Currently getting bulk predictions from PredictionIO is possible with either: > * a {{pio eval}} script, which will always train a fresh, unvalidated model > before getting predictions > * a custom script that hits the {{queries.json}} HTTP API, which is a serious > bottleneck when requesting hundreds of thousands or millions of predictions > Neither of these existing bulk-prediction hacks is adequate, for the reasons > mentioned. > It's time for this use case to be a first-class command :D > Pull request https://github.com/apache/incubator-predictionio/pull/412 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
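The file format described in PIO-105 (one JSON object per line, for both queries and results) can be illustrated with a small sketch. This is only an illustration of the line-delimited layout, not the `pio batchpredict` implementation: the query fields (`user`, `num`), the result keys (`query`, `prediction`), and the helper names are assumptions made for the example, and the "output" here is fabricated in memory rather than produced by PredictionIO.

```python
import json


def dump_json_lines(objects):
    """Serialize each object as one JSON document per line."""
    return "".join(json.dumps(obj) + "\n" for obj in objects)


def parse_json_lines(text):
    """Parse line-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical queries; the real fields depend on the engine template in use.
queries = [{"user": "u1", "num": 3}, {"user": "u2", "num": 3}]
input_text = dump_json_lines(queries)

# Stand-in for a results file: each line pairs a prediction with its original
# query. The key names "query" and "prediction" are assumptions.
output_text = dump_json_lines(
    {"query": q, "prediction": {"itemScores": []}} for q in queries
)
results = parse_json_lines(output_text)

for result in results:
    print(result["query"], "->", result["prediction"])
```

Keeping each result next to its original query means a results file can be processed line by line without holding the input file in memory to match them up.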
[jira] [Commented] (PIO-105) Batch Predictions
[ https://issues.apache.org/jira/browse/PIO-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109723#comment-16109723 ] ASF GitHub Bot commented on PIO-105: Github user dszeto commented on the issue: https://github.com/apache/incubator-predictionio/pull/412 Created https://issues.apache.org/jira/browse/PIO-110 and https://issues.apache.org/jira/browse/PIO-111 as follow-ups. Thanks @mars for the feature and @takezoe for the feedback! > Batch Predictions > - > > Key: PIO-105 > URL: https://issues.apache.org/jira/browse/PIO-105 > Project: PredictionIO > Issue Type: New Feature > Components: Core >Reporter: Mars Hall >Assignee: Mars Hall > > Implement a new {{pio batchpredict}} command to enable massive, fast, batch > predictions from a trained model. Read a multi-object JSON file as the input > format, with one query object per line. Similarly, write results to a > multi-object JSON file, with one prediction result + its original query per > line. > Currently getting bulk predictions from PredictionIO is possible with either: > * a {{pio eval}} script, which will always train a fresh, unvalidated model > before getting predictions > * a custom script that hits the {{queries.json}} HTTP API, which is a serious > bottleneck when requesting hundreds of thousands or millions of predictions > Neither of these existing bulk-prediction hacks is adequate, for the reasons > mentioned. > It's time for this use case to be a first-class command :D > Pull request https://github.com/apache/incubator-predictionio/pull/412 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-66) Document JIRA processes and add to public documentation
[ https://issues.apache.org/jira/browse/PIO-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16108241#comment-16108241 ] ASF GitHub Bot commented on PIO-66: --- GitHub user shimamoto opened a pull request: https://github.com/apache/incubator-predictionio/pull/417 [PIO-66] JIRA and release process for PredictionIO https://issues.apache.org/jira/browse/PIO-66?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel You can merge this pull request into a Git repository by running: $ git pull https://github.com/shimamoto/incubator-predictionio PIO-66_jira-processes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-predictionio/pull/417.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #417 commit 4c2d65731a9bf11be28421c17be2fbd9cb707a33 Author: shimamoto Date: 2017-07-31T09:53:54Z [PIO-66] Document JIRA processes and add to public documentation. > Document JIRA processes and add to public documentation > --- > > Key: PIO-66 > URL: https://issues.apache.org/jira/browse/PIO-66 > Project: PredictionIO > Issue Type: Task >Reporter: Sara Asher >Assignee: Sara Asher > > https://docs.google.com/document/d/1nQpENncXZq72KeI3WMe_X8Xz8HKkYO2QC12GD3ZKP9g/edit#heading=h.4og7ud94e5g1 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-104) Make proper implementation of plugins
[ https://issues.apache.org/jira/browse/PIO-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105616#comment-16105616 ] ASF GitHub Bot commented on PIO-104: Github user asfgit closed the pull request at: https://github.com/apache/incubator-predictionio/pull/407 > Make proper implementation of plugins > - > > Key: PIO-104 > URL: https://issues.apache.org/jira/browse/PIO-104 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Affects Versions: 0.11.0-incubating >Reporter: Naoki Takezoe >Assignee: Naoki Takezoe > Fix For: 0.12.0-incubating > > > The current plugin system has some issues to be fixed: > - start() method of plugin is not called, this method seems to be unnecessary > - outputSniffer exists as interface, but it's not implemented in engine server > We should fix them before documenting plugin usage in PIO-101 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-104) Make proper implementation of plugins
[ https://issues.apache.org/jira/browse/PIO-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105614#comment-16105614 ] ASF GitHub Bot commented on PIO-104: Github user dszeto commented on the issue: https://github.com/apache/incubator-predictionio/pull/407 LGTM. Thanks @takezoe ! > Make proper implementation of plugins > - > > Key: PIO-104 > URL: https://issues.apache.org/jira/browse/PIO-104 > Project: PredictionIO > Issue Type: Improvement > Components: Core >Affects Versions: 0.11.0-incubating >Reporter: Naoki Takezoe >Assignee: Naoki Takezoe > > The current plugin system has some issues to be fixed: > - start() method of plugin is not called, this method seems to be unnecessary > - outputSniffer exists as interface, but it's not implemented in engine server > We should fix them before documenting plugin usage in PIO-101 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (PIO-102) ESEngineInstances `getAll` results out of order (Elasticsearch 5.x)
[ https://issues.apache.org/jira/browse/PIO-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105501#comment-16105501 ] ASF GitHub Bot commented on PIO-102: Github user asfgit closed the pull request at: https://github.com/apache/incubator-predictionio/pull/406 > ESEngineInstances `getAll` results out of order (Elasticsearch 5.x) > --- > > Key: PIO-102 > URL: https://issues.apache.org/jira/browse/PIO-102 > Project: PredictionIO > Issue Type: Bug > Components: Core >Affects Versions: 0.11.0-incubating >Reporter: Mars Hall >Assignee: Mars Hall > > Using the new Elasticsearch 5.x REST storage client as the meta storage > source (`PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH` set up in > conf/pio-env.sh), I found that once an engine has been trained a certain > number of times, the most recent engine instance is no longer retrieved. > So, I tracked down where those Elasticsearch queries originate. > In the original Elasticsearch 1.x storage client, [the "scroll" pagination > responses are collected by > *appending*|https://github.com/apache/incubator-predictionio/blob/release/0.11.0/storage/elasticsearch1/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESUtils.scala#L44] > them to one another. > In the new Elasticsearch 5.x client, [the "scroll" responses are collected by > *prepending*|https://github.com/apache/incubator-predictionio/blob/release/0.11.0/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESUtils.scala#L152] > them to one another. 
> This out-of-order concatenation breaks [ESEngineInstances > `getLatestCompleted`|https://github.com/apache/incubator-predictionio/blob/release/0.11.0/storage/elasticsearch/src/main/scala/org/apache/predictionio/data/storage/elasticsearch/ESEngineInstances.scala#L192] > by erroneously replacing the head of the results with an older engine > instance, when there are enough engine instances to overflow a single page of > Elasticsearch hits. > I've observed this buggy behavior after ten trainings, when enough engine > instances are stored to trigger Elasticsearch's scroll feature. > Pull request: https://github.com/apache/incubator-predictionio/pull/406 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
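The ordering problem reported in PIO-102 can be reproduced with a small simulation. This is a hedged sketch in Python, not the Scala code from `ESUtils`: the page size of 10, the instance names, and the function names are assumptions chosen to mirror the description (results sorted newest first, then split across more than one scroll page).

```python
def paginate(items, page_size):
    """Split a sorted result list into scroll-style pages."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


def collect_appending(pages):
    """Accumulate pages in arrival order (the behavior described for the 1.x client)."""
    results = []
    for page in pages:
        results = results + page
    return results


def collect_prepending(pages):
    """Put each new page in front of the accumulator (the behavior described for the 5.x client)."""
    results = []
    for page in pages:
        results = page + results
    return results


# Hypothetical engine instances, sorted newest first as a
# "latest completed" lookup would expect.
instances = [f"instance-{n:02d}" for n in range(12, 0, -1)]
pages = paginate(instances, page_size=10)  # assumed page size

appended = collect_appending(pages)
prepended = collect_prepending(pages)

print("appending head: ", appended[0])
print("prepending head:", prepended[0])
```

With a single page both strategies give the same list, which matches the report that the problem only appears once there are enough engine instances to overflow one page of hits. After that, the prepending variant puts the last (oldest) page at the front, so taking the head of the results returns an older instance.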