Hello Pat,

I am using two URs because one of them simply uses popularity for its recommendations while the other is the normal version. Both engines use the same data (app), about 10 GB. I have two machines running PIO: one only runs Elasticsearch as a slave, and the other runs everything (HBase, ES, Spark), so I run `pio-start-all` on that second machine. Both machines have 8 cores and 16 GB of memory.
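For context, my workflow on the second machine looks roughly like this (the engine directory names below are just placeholders, not my real paths):

    # machine 2 runs HBase, Elasticsearch, and Spark, so all PIO services start here
    pio-start-all
    pio status

    # each UR has its own engine directory with its own engine.json
    # (directory names are placeholders)
    cd ~/ur-normal && pio build && pio train
    cd ~/ur-popularity && pio build && pio train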
You can find the two engine.json files attached. Thank you for helping!

Best regards,
Amy

On Tue, Mar 14, 2017 at 1:30 AM, Pat Ferrel <[email protected]> wrote:

> If you are running pio-start-all you must be running everything on a
> single machine. This is called vertical scaling and is very prone to
> running out of resources, either compute cores or memory. If it has been
> running for some time you may have finally hit the limit of what you can
> do on the machine.
>
> What are the machine specs, cores, memory? What is the size of your data?
> Have you exported it with `pio export`?
>
> Also, do you have a different indexName in the 2 engine.json files? And
> why have 2 URs?
>
>
> On Mar 12, 2017, at 8:58 PM, Lin Amy <[email protected]> wrote:
>
> Hello everyone,
>
> I have two Universal Recommender engines using the same events. This
> morning I found the server busy at 100% CPU, so I shut it down and tried
> to bring everything back up.
> However, after `pio-start-all` succeeded, I ran `pio train` on the two
> engines; one succeeded and the other failed with the following error
> message:
>
> Exception in thread "main" org.apache.spark.SparkException: Job aborted
> due to stage failure: Task 1 in stage 47.0 failed 1 times, most recent
> failure: Lost task 1.0 in stage 47.0 (TID 156, localhost):
> org.apache.spark.util.TaskCompletionListenerException: Found unrecoverable
> error [10.1.3.100:9200] returned Bad Request(400)
> - [MapperParsingException[failed to parse [t]]; nested:
> ElasticsearchIllegalArgumentException[unknown property [obj]]; ]; Bailing
> out..
>     at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:112)
>     at org.apache.spark.scheduler.Task.run(Task.scala:102)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
> Any advice on how to solve this weird situation?
> Thank you!
>
> Best regards,
> Amy
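For reference, the checks suggested above map to commands roughly like this (the app id and index name are placeholders; the ES host is taken from the error message):

    # list all indices on the ES node to see whether both engines write to the same one
    curl -s 'http://10.1.3.100:9200/_cat/indices?v'

    # inspect the mapping of a suspect index; a shared index with conflicting field
    # types is one possible cause of a MapperParsingException during `pio train`
    curl -s 'http://10.1.3.100:9200/popularity_index/_mapping?pretty'

    # back up the event data before changing anything (appid is a placeholder)
    pio export --appid 1 --output /tmp/ur-events-backup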
(Attachment: normal_version_engine.json, application/json)
(Attachment: popularity_engine.json, application/json)
