> -- Forwarded message --
> From: Pat Ferrel <p...@occamsmachete.com>
> Date: Tue, Dec 5, 2023 at 11:58 AM
> Subject: Unsubscribe
> To: <issues@mahout.apache.org>
>
>
> Unsubscribe
Unsubscribe
[
https://issues.apache.org/jira/browse/MAHOUT-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217820#comment-17217820
]
Pat Ferrel commented on MAHOUT-2023:
I don't install Mahout as a shell process. This only occurs
Still haven’t had a chance to test since it will take some experimentation
to figure out jars needed etc. My test is to replace 0.13 with 0.14.1
Still I see no reason to delay the release for my slow testing
+1
From: Andrew Musselman
Reply: dev@mahout.apache.org
Date: September 28, 2020 at
To try to keep this on-subject I’ll say that I’ve been working on what I once
saw as a next-gen PIO. It is ASL 2, and has 2 engines that ran in PIO — most
notably the Universal Recommender. We offered to make the Harness project part
of PIO a couple years back but didn’t get much interest. It
Big fun. Thanks for putting this together.
I’ll abuse my few Twitter followers with the announcement.
From: Trevor Grant
Reply: user@mahout.apache.org
Date: August 12, 2020 at 5:59:45 AM
To: Mahout Dev List , user@mahout.apache.org
Subject: [ANNOUNCE] Mahout Con 2020 (A sub-track of
I have used Spark for several years and realize from recent chatter on this
list that I don’t really understand how it uses memory.
Specifically: are spark.executor.memory and spark.driver.memory taken from
the JVM heap? When does Spark take memory from the JVM heap, and when from
off-heap?
IntelliJ Scala works well when debugging master=local. Has anyone used it for
remote/cluster debugging? I’ve heard it is possible...
From: Luiz Camargo
Reply: Luiz Camargo
Date: April 7, 2020 at 10:26:35 AM
To: Dennis Suhari
Cc: yeikel valdes , zahidr1...@gmail.com
, user@spark.apache.org
PredictionIO is scalable BY SCALING ITS SUB-SERVICES. Running on a single
machine sounds like no scaling has been executed or even planned.
How do you scale ANY system?
1) vertical scaling: make the instance larger with more cores, more disk,
and most importantly more memory. Increase whatever
+1 from another user fwiw. We also have livy containers and helm charts. The
real problem is deploying a Spark Cluster in k8s. We know of no working images
for this. The Spark team seems focused on deploying Jobs with k8s, which is
fine but is not enough. We need to deploy Spark itself. We
Seems like some action should be taken before 2 years, even if it is to
close the PR because it is not appropriate. Isn’t this the responsibility
of the chair to guard against committer changes where the contributor is
still willing? Or if a mentor is guiding the PR they should help it get
todays
question.
From: Matt Cheah
Reply: Matt Cheah
Date: July 1, 2019 at 5:14:05 PM
To: Pat Ferrel ,
user@spark.apache.org
Subject: Re: k8s orchestrating Spark service
> We’d like to deploy Spark Workers/Executors and Master (whatever master
is easiest to talk about since we really do
Oops, should have said: "I may have missed something but I don’t recall PIO
being released by Apache as an ASF maintained container/image release
artifact."
From: Pat Ferrel
Reply: user@predictionio.apache.org
Date: July 3, 2019 at 11:16:43 AM
To: Wei Chen ,
d...@predictionio.
BTW the container you use is supported by the container author, if at all.
I may have missed something but I don’t recall PIO being released by Apache
as an ASF maintained release artifact.
I wish ASF projects would publish Docker Images made for real system
integration, but IIRC PIO does not.
run our Driver and Executors considering that the Driver is part of the
Server process?
Maybe we are talking past each other with some mistaken assumptions (on my
part perhaps).
From: Pat Ferrel
Reply: Pat Ferrel
Date: July 1, 2019 at 4:57:20 PM
To: user@spark.apache.org , Matt
Cheah
anyone have something they like?
From: Matt Cheah
Reply: Matt Cheah
Date: July 1, 2019 at 4:45:55 PM
To: Pat Ferrel ,
user@spark.apache.org
Subject: Re: k8s orchestrating Spark service
Sorry, I don’t quite follow – why use the Spark standalone cluster as an
in-between layer when one can just
of services including Spark. The rest work,
we are asking if anyone has seen a good starting point for adding Spark as
a k8s managed service.
From: Matt Cheah
Reply: Matt Cheah
Date: July 1, 2019 at 3:26:20 PM
To: Pat Ferrel ,
user@spark.apache.org
Subject: Re: k8s orchestrating Spark service
We're trying to setup a system that includes Spark. The rest of the
services have good Docker containers and Helm charts to start from.
Spark on the other hand is proving difficult. We forked a container and
have tried to create our own chart but are having several problems with
this.
So back to
It is always dangerous to run a NEWER version of code on an OLDER cluster.
The danger increases with the semver change and this one is not just a
build #. In other words, 2.4 is considered to be a fairly major change from
2.3. Not much else can be said.
From: Nicolas Paris
Reply:
In order to create an application that executes code on Spark we have a
long lived process. It periodically runs jobs programmatically on a Spark
cluster, meaning it does not use spark-submit. The Jobs it executes have
varying requirements for memory so we want to have the Spark Driver run in
the
Streams have no end until watermarked or closed. Joins need bounded
datasets, et voila. Something tells me you should consider the streaming
nature of your data and whether your joins need to use increments/snippets
of infinite streams or to re-join the entire contents of the streams
accumulated
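A toy illustration of that point (pure Python, names hypothetical, nothing to do with Spark's actual implementation): joining two unbounded streams only works if you bound what you buffer, for example with a time window that acts as a crude watermark.

```python
from collections import deque

def windowed_stream_join(events, window):
    """events: iterable of (ts, side, key, value) in timestamp order,
    where side is 'L' or 'R'. Emits joined pairs whose timestamps fall
    within `window` of each other. Buffers stay bounded because old
    entries are evicted -- the point of watermarking an infinite stream."""
    buf = {"L": deque(), "R": deque()}
    out = []
    for ts, side, key, value in events:
        other = "R" if side == "L" else "L"
        # evict entries older than the window (a crude watermark)
        while buf[other] and ts - buf[other][0][0] > window:
            buf[other].popleft()
        for ots, okey, oval in buf[other]:
            if okey == key:
                pair = (key, value, oval) if side == "L" else (key, oval, value)
                out.append(pair)
        buf[side].append((ts, key, value))
    return out
```

Without the eviction step the buffers grow without bound, which is exactly why an unwatermarked join over infinite streams cannot complete.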
@Riccardo
Spark does not do the DL learning part of the pipeline (afaik) so it is
limited to data ingestion and transforms (ETL). It therefore is optional
and other ETL options might be better for you.
Most of the technologies @Gourav mentions have their own scaling based on
their own compute
Does Livy work with a Standalone Spark Master?
Most people running on a Windows machine use a VM running Linux. You will
run into constant issues if you go down another road with something like
cygwin, so avoid the headache.
From: Steve Pruitt
Reply: user@predictionio.apache.org
Date: April 15, 2019 at 10:59:09 AM
To:
To slightly over simplify, all it takes to be a TLP for Apache is:
1) clear community support
2) a couple Apache members to sponsor (Incubator members help)
3) demonstrated processes that follow the Apache way
4) the will of committers and PMC to move to TLP
What is missing in Livy?
I am
Thanks, are you referring to
https://github.com/spark-jobserver/spark-jobserver or the undocumented REST
job server included in Spark?
From: Jason Nerothin
Reply: Jason Nerothin
Date: March 28, 2019 at 2:53:05 PM
To: Pat Ferrel
Cc: Felix Cheung
, Marcelo
Vanzin , user
Subject: Re
;-)
Great idea. Can you suggest a project?
Apache PredictionIO uses spark-submit (very ugly) and Apache Mahout only
launches trivially in test apps since most uses are as a lib.
From: Felix Cheung
Reply: Felix Cheung
Date: March 28, 2019 at 9:42:31 AM
To: Pat Ferrel , Marcelo
Vanzin
Cc
e mode you might be able to
use this:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/Client.scala
Lastly, you can always check where Spark processes run by executing ps on
the machine, i.e. `ps aux | grep java`.
Best,
Jianneng
*From:* Pat Ferrel
*Dat
Reply: Marcelo Vanzin
Date: March 26, 2019 at 1:59:36 PM
To: Pat Ferrel
Cc: user
Subject: Re: spark.submit.deployMode: cluster
If you're not using spark-submit, then that option does nothing.
If by "context creation API" you mean "new SparkContext()" or an
equ
I have a server that starts a Spark job using the context creation API. It
DOES NOT use spark-submit.
I set spark.submit.deployMode = “cluster”
In the GUI I see 2 workers with 2 executors. The link for running
application “name” goes back to my server, the machine that launched the
job.
This is
:07 AM
To: Pat Ferrel
Cc: Akhil Das , user
Subject: Re: Where does the Driver run?
Hi Pat,
Indeed, I don't think that it's possible to use cluster mode w/o
spark-submit. All the docs I see appear to always describe needing to use
spark-submit for cluster mode -- it's not even compatible
only guessing at that).
Further; if we don’t use spark-submit we can’t use deployMode = cluster ???
From: Akhil Das
Reply: Akhil Das
Date: March 24, 2019 at 7:45:07 PM
To: Pat Ferrel
Cc: user
Subject: Re: Where does the Driver run?
There's also a driver ui (usually available on port 4040
60g
BTW I would expect this to create one Executor, one Driver, and the Master
on 2 Workers.
From: Andrew Melo
Reply: Andrew Melo
Date: March 24, 2019 at 12:46:35 PM
To: Pat Ferrel
Cc: Akhil Das , user
Subject: Re: Where does the Driver run?
Hi Pat,
On Sun, Mar 24, 2019 at 1:03 PM Pat Ferrel wrote:
> Thanks, I have seen this many times in my research. Paraphrasi
: Akhil Das
Date: March 23, 2019 at 9:26:50 PM
To: Pat Ferrel
Cc: user
Subject: Re: Where does the Driver run?
If you are starting your "my-app" on your local machine, that's where the
driver is running.
Hope this helps.
<https://spark.apache.org/docs/l
I have researched this for a significant amount of time and find answers
that seem to be for a slightly different question than mine.
The Spark 2.3.3 cluster is running fine. I see the GUI on “
http://master-address:8080”, there are 2 idle workers, as configured.
I have a Scala application that
Executor.java:163)
at
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at
io.netty.util.concurren
From: Pat Ferrel
Reply: Pat Ferrel
Date: February 12, 2019 at 5:40:41 PM
To: user@spark.apache.org
Subject: Spark with Kubernetes connecting to pod id, not address
We have a k8s deployment of several services including Apache Spark. All
services seem to be operational. Our application
+1
From: Apache Mahout
Reply: dev@mahout.apache.org
Date: January 3, 2019 at 11:53:02 AM
To: dev
Subject: Re: [NOTICE] Mandatory migration of git repositories to
gitbox.apache.org
On Thu, 3 Jan 2019 13:51:40 -0600, dev wrote:
Cool, just making sure we needed it.
On Thu, Jan 3,
There is a tag v0.7.3 and yes it is in master:
https://github.com/actionml/universal-recommender/tree/v0.7.3
From: Marco Goldin
Reply: user@predictionio.apache.org
Date: November 20, 2018 at 6:56:39 AM
To: user@predictionio.apache.org ,
gyar...@griddynamics.com
Subject: Re: universal
[
https://issues.apache.org/jira/browse/PIO-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621051#comment-16621051
]
Pat Ferrel commented on PIO-31:
---
I assume we are talking about the Event Server and the query server both
Assuming you are using the UR…
I don’t know how many times this has been caused by a misspelling of
eventNames in engine.json but assume you have checked that.
The fail-safe way to check is to `pio export` your data and check it
against your engine.json.
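For illustration, that check could be scripted along these lines (the engine.json field layout here is assumed from the UR template, and a `pio export` dump is assumed to be one JSON event per line — verify both against your own setup):

```python
import json

def missing_event_names(engine_json_text, exported_lines):
    """Return eventNames declared in engine.json that never appear in
    the exported event data -- e.g. because of a misspelling."""
    engine = json.loads(engine_json_text)
    # assumed layout: algorithms[0].params.eventNames, as in the UR template
    declared = set(engine["algorithms"][0]["params"].get("eventNames", []))
    seen = {json.loads(line)["event"] for line in exported_lines if line.strip()}
    return declared - seen
```

Any name this returns is declared in engine.json but absent from the data, which is the misspelling symptom described above.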
BTW `pio status` does not even try to
The random ranking is assigned after every `pio train` so if you have not
trained in-between, they will be the same. Random is not really meant to do
what you are using it for, it is meant to surface items with no data—no
primary events. This will allow some to get real events and be recommended
Is it necessary these commits are going to the incubator list? Are
notifications setup wrong?
From: git-site-r...@apache.org
Reply: dev@predictionio.apache.org
Date: August 24, 2018 at 10:33:34 AM
To: comm...@predictionio.incubator.apache.org
Subject: [7/7] predictionio-site git commit:
Oh and no it does not need a new context for every query, only for the
deploy.
From: Pat Ferrel
Date: August 7, 2018 at 10:00:49 AM
To: Ulavapalle Meghamala
Cc: user@predictionio.apache.org
, actionml-user
Subject: Re: PredictionIO spark deployment in Production
The answers to your
into Elasticsearch for serving independently
scalable queries.
I always advise you keep Spark out of serving for the reasons mentioned
above.
From: Ulavapalle Meghamala
Date: August 7, 2018 at 9:27:46 AM
To: Pat Ferrel
Cc: user@predictionio.apache.org
, actionml-user
Subject: Re
PIO is designed to use Spark in train and deploy. But the Universal
Recommender removes the need for Spark to make predictions. This IMO is a
key to use Spark well—remove it from serving results. PIO creates a Spark
context to launch the `pio deploy` driver but Spark is never used and the
context
+1
From: takako shimamoto
Reply: user@predictionio.apache.org
Date: August 2, 2018 at 2:55:49 AM
To: d...@predictionio.apache.org
, user@predictionio.apache.org
Subject: Straw poll: deprecating Scala 2.10 and Spark 1.x support
Hi all,
We're considering deprecating Scala 2.10 and Spark
What template?
From: Sami Serbey
Reply: user@predictionio.apache.org
Date: August 2, 2018 at 9:08:05 AM
To: user@predictionio.apache.org
Subject: 2 pio servers with 1 event server
Greetings,
I am trying to run 2 pio servers on different ports where each server has
its own app. When I
Please read the docs. There is no need to $set users since they are
attached to usage events and can be detected automatically. In fact
"$set"ting them is ignored. There are no properties of users that are not
calculated based on named “indicators”, which can be profile type things.
For this
-id, "searched-for”,
search-term) This as a secondary event has proven to be quite useful in at
least one dataset I’ve seen.
From: Pat Ferrel
Reply: Pat Ferrel
Date: July 2, 2018 at 12:18:16 PM
To: user@predictionio.apache.org
, Sami Serbey
Cc: actionml-user
Subject: Re: Digging in
The only requirement is that someone performed the primary event on A and
the secondary event is correlated to that primary event.
the UR can recommend to a user who has only performed the secondary event
on B as long as that is in the model. Makes no difference what subset of
events the user has
[
https://issues.apache.org/jira/browse/MAHOUT-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pat Ferrel updated MAHOUT-2048:
---
Sprint: 0.14.0 Release
> There are duplicate content pages which need redirects inst
Pat Ferrel created MAHOUT-2048:
--
Summary: There are duplicate content pages which need redirects
instead
Key: MAHOUT-2048
URL: https://issues.apache.org/jira/browse/MAHOUT-2048
Project: Mahout
This should work with any node down. Elasticsearch should elect a new
master.
What version of PIO are you using? PIO and the UR changed the client from
the transport client to the REST client in 0.12.0, which is why you are
using port 9200.
Do all PIO functions work correctly like:
- pio app
user@predictionio.apache.org
Date: June 20, 2018 at 10:25:53 AM
To: user@predictionio.apache.org , Pat Ferrel
Cc: user@predictionio.apache.org
Subject: Re: UR trending ranking as separate process
Hi George,
I didn't get your question but I think I am missing something. So you're using
the Univ
No the trending algorithm is meant to look at something like trends over 2
days. This is because it looks at 2 buckets of conversion frequencies and
if you cut them smaller than a day you will have so much bias due to daily
variations that the trends will be invalid. In other words the ups and
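A minimal sketch of that two-bucket idea (function name and bucket size are hypothetical, not the UR's actual code): count conversions in the most recent day-sized bucket versus the bucket before it; a ratio above 1 means the item is rising, and day-sized buckets smooth out the time-of-day bias mentioned above.

```python
def trend_score(event_timestamps, now, bucket_seconds=86400):
    """Crude two-bucket trend: conversions in the latest bucket divided
    by conversions in the bucket before it. Full-day buckets avoid
    daily-variation bias; smaller buckets would make the ratio noisy."""
    recent = sum(1 for t in event_timestamps if now - t <= bucket_seconds)
    prior = sum(1 for t in event_timestamps
                if bucket_seconds < now - t <= 2 * bucket_seconds)
    return recent / max(prior, 1)   # guard against empty prior bucket
```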
PIO_STORAGE_SOURCES_HBASE_HOME=/usr/local/hbase
Thanks,
Anuj Kumar
On Tue, Jun 19, 2018 at 9:16 PM Pat Ferrel wrote:
> Can you show me where on the AML site it says to store models in HDFS, it
> should not say that? I think that may be from the PIO site so you should
> ignore it.
>
> Can
> based backfill, must add eventsNames",
> "name": "ur",
> "params": {
> "appName": "np",
> "indexName": "np",
> "typeName": "items",
> "blacklistEvents": [],
te gets it wrong.
From: KRISH MEHTA
Reply: KRISH MEHTA
Date: June 13, 2018 at 2:19:17 PM
To: Pat Ferrel
Subject: Re: Few Queries Regarding the Recommendation Template
I Understand but if I just want the likes, dislikes and views then I can
combine the algorithms right? Given in
We do not use these for recommenders. The precision rate is low when the
lift in your KPI like sales is relatively high. This is not like
classification.
We use MAP@k with increasing values of k. This should yield a diminishing
mean average precision chart with increasing k. This tells you 2
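For reference, MAP@k can be computed like this (a generic sketch, not the ur-analysis-tools implementation):

```python
def average_precision_at_k(recommended, relevant, k):
    """AP@k for one user: average of precision at each rank where a
    relevant item appears, normalized by min(|relevant|, k)."""
    hits, score = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    return score / min(len(relevant), k) if relevant else 0.0

def map_at_k(all_recommended, all_relevant, k):
    """MAP@k over all users; plotting this for increasing k should give
    the diminishing curve described above."""
    pairs = list(zip(all_recommended, all_relevant))
    return sum(average_precision_at_k(r, rel, k) for r, rel in pairs) / len(pairs)
```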
Actually if you are using the Universal Recommender you only need to deploy
once as long as the engine.json does not change. The hot swap happens as
@Digambar says and there is literally no downtime. If you are using any of the
other recommenders you do have to re-deploy after every train but
No but we have 2 ways to handle this situation automatically and you can
tell if recommendations are not from personal user history.
1. when there is not enough user history to recommend, we fill in the
lower ranking recommendations with popular, trending, or hot items. Not
completely
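That fill-in behavior might be sketched like this (names hypothetical, not the UR's actual code): when personalized results run short, remaining slots are filled from a popularity ranking, and each result carries a flag so the caller can tell which recommendations came from real user history.

```python
def recommend_with_backfill(personalized, popular, n):
    """Fill up to n slots: personalized results first, then popular items
    not already recommended. The tag on each result lets the caller
    distinguish history-based recommendations from backfill."""
    out = [(item, "personal") for item in personalized[:n]]
    seen = {item for item, _ in out}
    for item in popular:
        if len(out) >= n:
            break
        if item not in seen:
            out.append((item, "backfill"))
            seen.add(item)
    return out
```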
Yarn has to be started explicitly. Usually it is part of Hadoop and is
started with Hadoop. Spark only contains the client for Yarn (afaik).
From: Miller, Clifford
Reply: user@predictionio.apache.org
Date: May 29, 2018 at 6:45:43 PM
To: user@predictionio.apache.org
Subject: Re: PIO
Sorry, what I meant was the actual spark-submit command that PIO was using.
It should be in the log.
What Spark version was that? I recall classpath issues with certain
versions of Spark.
On Thu, May 24, 2018 at 4:52 PM, Pat Ferrel wrote:
> Thanks Donald,
>
> We have:
>
>
No, this is as expected. When you run pseudo-distributed everything
internally is configured as if the services were on separate machines. See
clustered instructions here: http://actionml.com/docs/small_ha_cluster This
is to setup 3 machines running different parts and is not really the best
rd.mil...@phoenix-opsgroup.com>
Reply: Miller, Clifford <clifford.mil...@phoenix-opsgroup.com>
Date: May 25, 2018 at 10:16:01 AM
To: Pat Ferrel <p...@occamsmachete.com>
Cc: user@predictionio.apache.org <user@predict
No, you need to have HBase installed, or at least the config installed on
the PIO machine. The pio-env.sh defined servers will be configured cluster
operations and will be started separately from PIO. PIO then will not start
hbase and try to sommunicate only, not start it. But PIO still needs
I’m having a java.lang.NoClassDefFoundError in a different context and
different class. Have you tried this without Yarn? Sorry I can’t find the
rest of this thread.
From: Miller, Clifford
Reply:
doop2 in the storage driver assembly.
Looking at Git history it has not changed in a while.
Do you have the exact classpath that has gone into your Spark cluster?
On Wed, May 23, 2018 at 1:30 PM, Pat Ferrel <p...@actionml.com> wrote:
> A source build did not fix the problem, has anyone r
ster=local but not with remote Spark master
I’ve passed in the hbase-client in the --jars part of spark-submit, still
fails, what am I missing?
From: Pat Ferrel <p...@actionml.com>
Reply: Pat Ferrel <p...@actionml.com>
Date: May 23, 2018 at 8:57:32 A
Same CLI works using local Spark master, but fails using remote master for
a cluster due to a missing class def for protobuf used in hbase. We are
using the binary dist 0.12.1. Is this known? Is there a work around?
We are now trying a source build in hope the class will be put in the
assembly
e case where yarn is trying to find the pio.log file on the hdfs cluster.
You can try "--master yarn --deploy-mode client". You need to pass this
configuration with pio train
e.g., pio train -- --master yarn --deploy-mode client
Thanks and Regards
Ambuj Sharma
Sunrise may late, B
arbitrary Spark params exactly as
you would to spark-submit on the pio command line. The double dash
separates PIO and Spark params.
From: Pat Ferrel <p...@occamsmachete.com>
Reply: user@predictionio.apache.org
What is the command line for `pio train …` Specifically are you using
yarn-cluster mode? This causes the driver code, which is a PIO process, to be
executed on an executor. Special setup is required for this.
From: Wojciech Kowalski
Reply: user@predictionio.apache.org
BTW The Universal Recommender has its own community support group here:
https://groups.google.com/forum/#!forum/actionml-user
From: Pat Ferrel <p...@occamsmachete.com>
Reply: user@predictionio.apache.org
and “ItemBias” on the query
> do not have any effect on the result.
>
> 5.Is it feasible to build/train/deploy only once, and query for
> all 3 use cases?
>
>
> 6. How to make queries towards the different Apps because there is
> no any obvious way in the query para
Exactly, ranking is the only task of a recommender. Precision is not
automatically good at that but something like MAP@k is.
From: Marco Goldin <markomar...@gmail.com>
Date: May 10, 2018 at 10:09:22 PM
To: Pat Ferrel <p...@occamsmachete.com>
3 AM
To: Pat Ferrel <p...@occamsmachete.com>
Cc: user@predictionio.apache.org <user@predictionio.apache.org>
Subject: Re: UR evaluation
thank you very much, i didn't see this tool, i'll definitely try it. Clearly
better to have such a specific instrument.
2018-05-10 18:36
You can if you want but we have external tools for the UR that are much
more flexible. The UR has tuning that can’t really be covered by the built
in API. https://github.com/actionml/ur-analysis-tools They do MAP@k as well
as creating a bunch of other metrics and comparing different types of input
Why do you want to throw away user behavior in making recommendations? The
lift you get in purchases will be less.
There is a use case for this when you are making recommendations basically
inside a session where the user is browsing/viewing things on a hunt for
something. In this case you would
Hi all,
Mahout has hit a bit of a bump in releasing a Scala 2.11 version. I was
able to build 0.13.0 for Scala 2.11 and have published it on github as a
Maven compatible repo. I’m also using it from SBT.
If anyone wants access let me know.
PIO is based on the architecture of Spark, which uses HDFS. HBase also uses
HDFS. Scaling these are quite well documented on the web. Scaling PIO is
the same as scaling all its services. It is unlikely you’ll need it but
you can also have more than one PIO server behind a load balancer.
Don’t
The need for Spark at query time depends on the engine. Which are you
using? The Universal Recommender, which I maintain, does not require Spark
for queries but uses PIO. We simply don’t use the Spark context so it is
ignored. To make PIO work you need to have the Spark code accessible but
that
This may seem unhelpful now but for others it might be useful to mention some
minimum PIO in production best practices:
1) PIO should IMO never be run in production on a single node. When all
services share the same memory, cpu, and disk, it is very difficult to find the
root cause to a
There are instructions for using IntelliJ but, though I wrote the last
version, I apologize that I can’t make them work anymore. If you get them to work you
would be doing the community a great service by telling us how or editing
the instructions.
http://predictionio.apache.org/resources/intellij/
: user@predictionio.apache.org <user@predictionio.apache.org>
Date: March 29, 2018 at 6:19:58 AM
To: Pat Ferrel <p...@occamsmachete.com>
Cc: user@predictionio.apache.org <user@predictionio.apache.org>
Subject: Re: Unclear problem with using S3 as a storage data source
Sorry
: Dave Novelli <d...@ultravioletanalytics.com>
Date: March 28, 2018 at 12:13:12 PM
To: Pat Ferrel <p...@occamsmachete.com>
Cc: user@predictionio.apache.org
Pio build requires that ES hosts are known to Spark, which writes the model
to ES. You can pass these in on the `pio train` command line:
pio train … -- --conf spark.es.nodes="node1,node2,node3"
notice no spaces in the quoted list of hosts, also notice the double dash,
which separates pio
BTW I think you may have to push setting on the cli by adding “spark” to
the beginning of the key name:
pio train -- --conf spark.es.nodes="localhost" --driver-memory 8g
--executor-memory 8g
From: Pat Ferrel <p...@occamsmachete.com>
es.nodes is supposed to be a string with hostnames separated by commas.
Depending on how your containers are set to communicate with the outside
world (Docker networking or port mapping) you may also need to set the
port, which is 9200 by default.
If your container is using port mapping and maps
From: Trevor Grant <trevor.d.gr...@gmail.com>
> Sent: Friday, March 2, 2018 5:15:35 PM
> To: Mahout Dev List
> Subject: Re: Spark 2.x/scala 2.11.x release
>
> The only "mess" is in the cli spark drivers, namely scopt.
>
> Get rid of the drivers/fix the scopt issue- we have no mess.
>
>
>
> On Mar 2, 2018 4:09 PM, "Pat Ferrel" <p...@occamsmachete.com> wrote:
>
> > BTW the mess master is in is why git flow was invented and why I asked
> - Cherrypick any commits that we'd like to release (E.g.: SparseSpeedup)
> onto `develop` (along with a PR and a ticket).
>
>
> - Merge `develop` to `master`, run through Smoke tests, tag master @
> `mahout-0.13.1` (automatically), and release.
>
> This will also get us to more of a git-flow workflow, as we've discussed
> moving towards.
>
>
> Thoughts @all?
>
>
> --andy