I noticed the appName differs between the DataSource (“shop _live”) and the
Algorithm (“shop_live”). The appNames must match.

Also, the eventNames differ between the two, which should be OK, but it still
raises a question: why input events that are not used? Given the meaning of
the events, I’d use them all for recommendations, though you may eventually
want to create separate shopping-cart and wishlist models, since these will
yield “complementary purchases” and “things you may be missing” in the wishlist.
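For reference, the “must match” requirement can be checked mechanically. A minimal sketch, assuming a UR-style engine.json layout (the file contents and path below are illustrative, not your actual configuration):

```shell
# Sketch (assumed UR-style engine.json): the same appName string must appear
# in both the datasource params and the algorithm params.
cat > /tmp/engine.json <<'EOF'
{
  "datasource":  { "params": { "appName": "shop_live" } },
  "algorithms": [ { "name": "ur", "params": { "appName": "shop_live" } } ]
}
EOF
# Quick consistency check: this should print exactly one distinct value,
# with no stray space such as "shop _live".
grep -o '"appName": *"[^"]*"' /tmp/engine.json | sort -u
```

If the grep prints more than one distinct value (or a value containing a space), the two params disagree.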


From: Wojciech Kowalski <wojci...@tomandco.co.uk>
Reply: user@predictionio.apache.org
Date: May 23, 2018 at 5:17:06 AM
To: Ambuj Sharma <am...@getamplify.com>, user@predictionio.apache.org
Subject: RE: Problem with training in yarn cluster

Hello again,



After moving HBase from Docker to the Dataproc cluster (probably DNS/hostname
resolution issues), the HBase error is gone, but training still stops:



[INFO] [RecommendationEngine$]



               _   _             __  __ _
     /\       | | (_)           |  \/  | |
    /  \   ___| |_ _  ___  _ __ | \  / | |
   / /\ \ / __| __| |/ _ \| '_ \| |\/| | |
  / ____ \ (__| |_| | (_) | | | | |  | | |____
 /_/    \_\___|\__|_|\___/|_| |_|_|  |_|______|







[INFO] [Engine] Extracting datasource params...

[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.

[INFO] [Engine] Datasource params: (,DataSourceParams(shop _live,List(purchase, basket-add, wishlist-add, view),None,None))

[INFO] [Engine] Extracting preparator params...

[INFO] [Engine] Preparator params: (,Empty)

[INFO] [Engine] Extracting serving params...

[INFO] [Engine] Serving params: (,Empty)

[INFO] [log] Logging initialized @10046ms

[INFO] [Server] jetty-9.2.z-SNAPSHOT

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@7a6f5572{/jobs,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@2679cc20{/jobs/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@489e0d2e{/jobs/job,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@720aa19c{/jobs/job/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@724eae6a{/stages,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@1a3e64cf{/stages/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@2271fddb{/stages/stage,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@550be48{/stages/stage/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@2ea7d76{/stages/pool,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@6b9b69f8{/stages/pool/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@46a9ce75{/storage,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@468b9a16{/storage/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@175b4e7c{/storage/rdd,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@27bf31c6{/storage/rdd/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@2f6d8922{/environment,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@35acfdf3{/environment/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@78496d94{/executors,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@26a6525a{/executors/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@65c1fb35{/executors/threadDump,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@3750c11b{/executors/threadDump/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@4462fa8{/static,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@10e699f8{/,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@7a14c082{/api,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@4bfd8ec2{/jobs/job/kill,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@7ef3c37a{/stages/stage/kill,null,AVAILABLE,@Spark}

[INFO] [ServerConnector] Started Spark@6a00b5d1{HTTP/1.1}{0.0.0.0:49349}

[INFO] [Server] Started @10430ms

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@379fcbd1{/metrics/json,null,AVAILABLE,@Spark}

[WARN] [YarnSchedulerBackend$YarnSchedulerEndpoint] Attempted to
request executors before the AM has registered!

[INFO] [DataSource]

╔════════════════════════════════════════════════════════════╗

║ Init DataSource                                            ║

║ ══════════════════════════════════════════════════════════ ║

║ App name                      shop _live             ║

║ Event window                  None                         ║

║ Event names                   List(purchase, basket-add, wishlist-add, view) ║

║ Min events per user           None                         ║

╚════════════════════════════════════════════════════════════╝



[INFO] [URAlgorithm]

╔════════════════════════════════════════════════════════════╗

║ Init URAlgorithm                                           ║

║ ══════════════════════════════════════════════════════════ ║

║ App name                      shop_live             ║

║ ES index name                 oburindex                    ║

║ ES type name                  items                        ║

║ RecsModel                     all                          ║

║ Event names                   List(purchase, view)         ║

║ ══════════════════════════════════════════════════════════ ║

║ Random seed                   -1931119310                  ║

║ MaxCorrelatorsPerEventType    50                           ║

║ MaxEventsPerEventType         500                          ║

║ BlacklistEvents               List(purchase)               ║

║ ══════════════════════════════════════════════════════════ ║

║ User bias                     1.0                          ║

║ Item bias                     1.0                          ║

║ Max query events              100                          ║

║ Limit                         20                           ║

║ ══════════════════════════════════════════════════════════ ║

║ Rankings:                                                  ║

║ popular                       Some(popRank)                ║

╚════════════════════════════════════════════════════════════╝



[INFO] [Engine$] EngineWorkflow.train

[INFO] [Engine$] DataSource: com.actionml.DataSource@4953588a

[INFO] [Engine$] Preparator: com.actionml.Preparator@715d8f93

[INFO] [Engine$] AlgorithmList: List(com.actionml.URAlgorithm@50c15628)

[INFO] [Engine$] Data sanity check is on.

[WARN] [ApplicationMaster] Reporter thread fails 1 time(s) in a row.

[WARN] [ApplicationMaster] Reporter thread fails 2 time(s) in a row.

[WARN] [ApplicationMaster] Reporter thread fails 3 time(s) in a row.

[WARN] [ApplicationMaster] Reporter thread fails 4 time(s) in a row.

[INFO] [ServerConnector] Stopped Spark@6a00b5d1{HTTP/1.1}{0.0.0.0:0}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@7ef3c37a{/stages/stage/kill,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@4bfd8ec2{/jobs/job/kill,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@7a14c082{/api,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@10e699f8{/,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@4462fa8{/static,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@3750c11b{/executors/threadDump/json,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@65c1fb35{/executors/threadDump,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@26a6525a{/executors/json,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@78496d94{/executors,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@35acfdf3{/environment/json,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@2f6d8922{/environment,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@27bf31c6{/storage/rdd/json,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@175b4e7c{/storage/rdd,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@468b9a16{/storage/json,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@46a9ce75{/storage,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@6b9b69f8{/stages/pool/json,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@2ea7d76{/stages/pool,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@550be48{/stages/stage/json,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@2271fddb{/stages/stage,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@1a3e64cf{/stages/json,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@724eae6a{/stages,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@720aa19c{/jobs/job/json,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@489e0d2e{/jobs/job,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@2679cc20{/jobs/json,null,UNAVAILABLE,@Spark}

[INFO] [ContextHandler] Stopped
o.s.j.s.ServletContextHandler@7a6f5572{/jobs,null,UNAVAILABLE,@Spark}

[ERROR] [LiveListenerBus] SparkListenerBus has already stopped!
Dropping event 
SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@e1518c9)

[ERROR] [LiveListenerBus] SparkListenerBus has already stopped!
Dropping event 
SparkListenerJobEnd(0,1527077245287,JobFailed(org.apache.spark.SparkException:
Job 0 cancelled because SparkContext was shut down))





Also, in stderr (I think) there is this:

[Stage 0:>                                                          (0 + 0) / 5]



Yarn app info:

User: pio <http://pio-cluster-m:8088/cluster/scheduler?openQueues=default>

Name: org.apache.predictionio.workflow.CreateWorkflow

Application Type: SPARK

Application Tags:

Application Priority: 0 (Higher Integer value indicates higher priority)

YarnApplicationState: FINISHED

Queue: default <http://pio-cluster-m:8088/cluster/scheduler?openQueues=default>

FinalStatus Reported by AM: FAILED

Started: Wed May 23 12:06:44 +0000 2018

Elapsed: 40sec

Tracking URL: History <http://pio-cluster-m:8088/proxy/application_1526996273517_0030/>

Log Aggregation Status: DISABLED

Diagnostics: Exception was thrown 5 time(s) from Reporter thread.

Unmanaged Application: false

Application Node Label expression: <Not set>

AM container Node Label expression: <DEFAULT_PARTITION>







Thanks,

Wojciech



*From: *Wojciech Kowalski <wojci...@tomandco.co.uk>
*Sent: *23 May 2018 11:26
*To: *Ambuj Sharma <am...@getamplify.com>; user@predictionio.apache.org
*Subject: *RE: Problem with training in yarn cluster



Hi,



Ok so full command now is:

pio train --scratch-uri hdfs://pio-cluster-m/pio -- --executor-memory 4g
--driver-memory 4g --deploy-mode cluster --master yarn



The errors stopped after removing --executor-cores 2 --driver-cores 2.

I found this error: Uncaught exception:
org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid
resource request, requested virtual cores < 0, or requested virtual cores >
max configured, requestedVirtualCores=4, maxVirtualCores=2
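For what it’s worth, the exception quoted above pins the limit down: requestedVirtualCores=4 against maxVirtualCores=2. A hedged sketch of the two usual ways out (flag values are illustrative; only yarn.scheduler.maximum-allocation-vcores is a standard YARN property):

```shell
# Option 1: keep the core flags but stay within YARN's per-container cap
# (here maxVirtualCores=2):
pio train --scratch-uri hdfs://pio-cluster-m/pio -- \
  --master yarn --deploy-mode cluster \
  --executor-memory 4g --driver-memory 4g \
  --executor-cores 2 --driver-cores 2

# Option 2: raise the cluster-side cap in yarn-site.xml:
#   yarn.scheduler.maximum-allocation-vcores = 4
```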



But now I have a problem with HBase :/



I have the HBase host set:

declare -x PIO_STORAGE_SOURCES_HBASE_HOSTS="pio-gc"
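A hedged sketch of the relevant pio-env.sh entries (variable names follow standard PredictionIO setups; only the HOSTS value is confirmed by this thread):

```shell
# In conf/pio-env.sh (sketch):
export PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
export PIO_STORAGE_SOURCES_HBASE_HOSTS=pio-gc
```

Note that the stack trace that follows still tries to resolve "hbase-master" rather than pio-gc, which suggests the YARN containers may be picking up a different HBase configuration (e.g. an hbase-site.xml) than this variable; that mismatch may be worth checking.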



[INFO] [Engine$] EngineWorkflow.train

[INFO] [Engine$] DataSource: com.actionml.DataSource@2fdb4e2e

[INFO] [Engine$] Preparator: com.actionml.Preparator@d257dd4

[INFO] [Engine$] AlgorithmList: List(com.actionml.URAlgorithm@400bbb7)

[INFO] [Engine$] Data sanity check is on.

[ERROR] [StorageClient] HBase master is not running (ZooKeeper
ensemble: pio-cluster-m). Please make sure that HBase is running
properly, and that the configuration is pointing at the correct
ZooKeeper ensemble.

[ERROR] [Storage$] Error initializing storage client for source HBASE.

org.apache.hadoop.hbase.MasterNotRunningException:
com.google.protobuf.ServiceException: java.net.UnknownHostException:
unknown host: hbase-master

        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStub(HConnectionManager.java:1645)

        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(HConnectionManager.java:1671)

        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getKeepAliveMasterService(HConnectionManager.java:1878)

        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.isMasterRunning(HConnectionManager.java:894)

        at 
org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(HBaseAdmin.java:2366)

        at 
org.apache.predictionio.data.storage.hbase.StorageClient.<init>(StorageClient.scala:53)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)

        at 
org.apache.predictionio.data.storage.Storage$.getClient(Storage.scala:252)

        at 
org.apache.predictionio.data.storage.Storage$.org$apache$predictionio$data$storage$Storage$$updateS2CM(Storage.scala:283)

        at 
org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:244)

        at 
org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:244)

        at 
scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:194)

        at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)

        at 
org.apache.predictionio.data.storage.Storage$.sourcesToClientMeta(Storage.scala:244)

        at 
org.apache.predictionio.data.storage.Storage$.getDataObject(Storage.scala:315)

        at 
org.apache.predictionio.data.storage.Storage$.getPDataObject(Storage.scala:364)

        at 
org.apache.predictionio.data.storage.Storage$.getPDataObject(Storage.scala:307)

        at 
org.apache.predictionio.data.storage.Storage$.getPEvents(Storage.scala:454)

        at 
org.apache.predictionio.data.store.PEventStore$.eventsDb$lzycompute(PEventStore.scala:37)

        at 
org.apache.predictionio.data.store.PEventStore$.eventsDb(PEventStore.scala:37)

        at 
org.apache.predictionio.data.store.PEventStore$.find(PEventStore.scala:73)

        at com.actionml.DataSource.readTraining(DataSource.scala:76)

        at com.actionml.DataSource.readTraining(DataSource.scala:48)

        at 
org.apache.predictionio.controller.PDataSource.readTrainingBase(PDataSource.scala:40)

        at org.apache.predictionio.controller.Engine$.train(Engine.scala:642)

        at org.apache.predictionio.controller.Engine.train(Engine.scala:176)

        at 
org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)

        at 
org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)

        at 
org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

        at java.lang.reflect.Method.invoke(Method.java:498)

        at 
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)

Caused by: com.google.protobuf.ServiceException:
java.net.UnknownHostException: unknown host: hbase-master

        at 
org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1678)

        at 
org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)

        at 
org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:42561)

        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(HConnectionManager.java:1682)

        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(HConnectionManager.java:1591)

        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStub(HConnectionManager.java:1617)

        ... 36 more

Caused by: java.net.UnknownHostException: unknown host: hbase-master

        at 
org.apache.hadoop.hbase.ipc.RpcClient$Connection.<init>(RpcClient.java:385)

        at 
org.apache.hadoop.hbase.ipc.RpcClient.createConnection(RpcClient.java:351)

        at 
org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1530)

        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)

        at 
org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)

        ... 41 more







*From: *Ambuj Sharma <am...@getamplify.com>
*Sent: *23 May 2018 08:59
*To: *user@predictionio.apache.org
*Cc: *Wojciech Kowalski <wojci...@tomandco.co.uk>
*Subject: *Re: Problem with training in yarn cluster



Hi Wojciech,

I also faced many problems while setting up YARN with PredictionIO. This may
be a case where YARN is trying to find the pio.log file on the HDFS cluster.
You can try "--master yarn --deploy-mode client"; you need to pass this
configuration with pio train:

e.g., pio train -- --master yarn --deploy-mode client








Thanks and Regards

Ambuj Sharma

Sunrise may late, But Morning is sure.....

Team ML

Betaout



On Wed, May 23, 2018 at 4:53 AM, Pat Ferrel <p...@occamsmachete.com> wrote:

Actually you might search the archives for “yarn” because I don’t recall
how the setup works off hand.



Archives here:
https://lists.apache.org/list.html?user@predictionio.apache.org



Also check the Spark Yarn requirements and remember that `pio train … --
various Spark params` allows you to pass arbitrary Spark params exactly as
you would to spark-submit on the pio command line. The double dash
separates PIO and Spark params.
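To make the double-dash convention concrete (flag values reused from earlier in this thread, not a prescription):

```shell
# Flags before `--` go to pio itself; flags after `--` are handed to
# spark-submit unchanged.
pio train --scratch-uri hdfs://pio-cluster-m/pio -- \
  --master yarn --deploy-mode cluster \
  --executor-memory 4g --driver-memory 4g
```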




From: Pat Ferrel <p...@occamsmachete.com>
Reply: user@predictionio.apache.org
Date: May 22, 2018 at 4:07:38 PM
To: user@predictionio.apache.org, Wojciech Kowalski <wojci...@tomandco.co.uk>


Subject:  RE: Problem with training in yarn cluster



What is the command line for `pio train …`? Specifically, are you using
yarn-cluster mode? This causes the driver code, which is a PIO process, to
be executed on an executor. Special setup is required for this.




From: Wojciech Kowalski <wojci...@tomandco.co.uk>
Reply: user@predictionio.apache.org
Date: May 22, 2018 at 2:28:43 PM
To: user@predictionio.apache.org
Subject:  RE: Problem with training in yarn cluster



Hello,



Actually, I have another error in the logs that is preventing training as
well:



[INFO] [RecommendationEngine$]



               _   _             __  __ _
     /\       | | (_)           |  \/  | |
    /  \   ___| |_ _  ___  _ __ | \  / | |
   / /\ \ / __| __| |/ _ \| '_ \| |\/| | |
  / ____ \ (__| |_| | (_) | | | | |  | | |____
 /_/    \_\___|\__|_|\___/|_| |_|_|  |_|______|







[INFO] [Engine] Extracting datasource params...

[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.

[INFO] [Engine] Datasource params:
(,DataSourceParams(shop_live,List(purchase, basket-add, wishlist-add,
view),None,None))

[INFO] [Engine] Extracting preparator params...

[INFO] [Engine] Preparator params: (,Empty)

[INFO] [Engine] Extracting serving params...

[INFO] [Engine] Serving params: (,Empty)

[INFO] [log] Logging initialized @6774ms

[INFO] [Server] jetty-9.2.z-SNAPSHOT

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@1798eb08{/jobs,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@47c4c3cd{/jobs/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@3e080dea{/jobs/job,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@c75847b{/jobs/job/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@5ce5ee56{/stages,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@3dde94ac{/stages/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@4347b9a0{/stages/stage,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@63b1bbef{/stages/stage/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@10556e91{/stages/pool,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@5967f3c3{/stages/pool/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@2793dbf6{/storage,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@49936228{/storage/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@7289bc6d{/storage/rdd,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@1496b014{/storage/rdd/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@2de3951b{/environment,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@7f3330ad{/environment/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@40e681f2{/executors,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@61519fea{/executors/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@502b9596{/executors/threadDump,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@367b7166{/executors/threadDump/json,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@42669f4a{/static,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@2f25f623{/,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@23ae4174{/api,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@4e33e426{/jobs/job/kill,null,AVAILABLE,@Spark}

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@38d9ae65{/stages/stage/kill,null,AVAILABLE,@Spark}

[INFO] [ServerConnector] Started Spark@17239b3{HTTP/1.1}{0.0.0.0:47948}

[INFO] [Server] Started @7040ms

[INFO] [ContextHandler] Started
o.s.j.s.ServletContextHandler@16cffbe4{/metrics/json,null,AVAILABLE,@Spark}

[WARN] [YarnSchedulerBackend$YarnSchedulerEndpoint] Attempted to
request executors before the AM has registered!

[ERROR] [ApplicationMaster] Uncaught exception:



Thanks,

Wojciech



*From: *Wojciech Kowalski <wojci...@tomandco.co.uk>
*Sent: *22 May 2018 23:20
*To: *user@predictionio.apache.org
*Subject: *Problem with training in yarn cluster



Hello, I am trying to set up a distributed cluster with all services
separated, but I have a problem while running training:



log4j:ERROR setFile(null,true) call failed.

java.io.FileNotFoundException: /pio/pio.log (No such file or directory)

        at java.io.FileOutputStream.open0(Native Method)

        at java.io.FileOutputStream.open(FileOutputStream.java:270)

        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)

        at java.io.FileOutputStream.<init>(FileOutputStream.java:133)

        at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)

        at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)

        at 
org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)

        at 
org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)

        at 
org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)

        at 
org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)

        at 
org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)

        at 
org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)

        at 
org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)

        at 
org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)

        at 
org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)

        at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)

        at 
org.apache.spark.internal.Logging$class.initializeLogging(Logging.scala:117)

        at 
org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:102)

        at 
org.apache.spark.deploy.yarn.ApplicationMaster$.initializeLogIfNecessary(ApplicationMaster.scala:738)

        at org.apache.spark.internal.Logging$class.log(Logging.scala:46)

        at 
org.apache.spark.deploy.yarn.ApplicationMaster$.log(ApplicationMaster.scala:738)

        at 
org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:753)

        at 
org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)





Setup:

- HBase
- Hadoop
- HDFS
- Spark cluster with YARN



Training runs in cluster mode.

I assume the Spark worker is trying to save its log to /pio/pio.log on the
worker machine instead of the pio host. How can I set the pio log
destination to an HDFS path?
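One approach, as used elsewhere in this thread, is to stage pio's scratch files on HDFS rather than a local path; the log4j override mentioned below is an additional assumption, not something verified here:

```shell
# Stage pio's scratch files on HDFS instead of a local path
# (this is the approach taken elsewhere in this thread):
pio train --scratch-uri hdfs://pio-cluster-m/pio -- --master yarn --deploy-mode cluster

# Alternative (assumption, not verified here): ship a log4j.properties that
# logs to the console, via spark-submit's --files plus
# -Dlog4j.configuration=file:log4j.properties in the driver/executor Java
# options, so nothing tries to open a local /pio/pio.log.
```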



Or any other advice?



Thanks,

Wojciech
