[ 
https://issues.apache.org/jira/browse/MAHOUT-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Yaraev updated MAHOUT-1767:
----------------------------------
    Description: 
When one follows the instructions in the [README.md for the H2O 
module|https://github.com/apache/mahout/blob/master/h2o/README.md] and tries to 
run the tests in distributed mode, the tests run only in local mode. The 
instructions have three steps:

# {code}
host-1:~/mahout$ ./bin/mahout h2o-node
...
.. INFO: Cloud of size 1 formed [/W.X.Y.Z:54321]
{code}
# {code}
host-2:~/mahout$ ./bin/mahout h2o-node
...
.. INFO: Cloud of size 2 formed [/A.B.C.D:54322]
{code}
# {code}
host-N:~/mahout/h2o$ mvn test
...
.. INFO: Cloud of size 3 formed [/E.F.G.H:54323]
...
All tests passed.
...
host-N:~/mahout/h2o$
{code}

The first two steps start worker nodes. The last one runs the tests. According 
to the instructions, launching the tests starts one more worker, which should 
join the same cloud that the other worker nodes form. But it does not join them 
because it has a different cloud name (or _masterURL_ in terms of the code). If 
you look into the code, you will find the following:
{code:title=DistributedH2OSuite.scala}
...
mahoutCtx = mahoutH2OContext("mah2out" + System.currentTimeMillis())
...
{code}

We tried removing the generated suffix from the cloud name. After that, 
distributed mode started to work.
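
The mismatch described above can be illustrated with a minimal, self-contained 
Scala sketch. It does not use Mahout or H2O. The names {{workerCloudName}} and 
{{sameCloud}} are hypothetical, and the sketch assumes that the standalone worker 
nodes use the bare name "mah2out" and that nodes end up in the same cloud only 
when their cloud names are equal.

```scala
object CloudNameSketch {
  // Assumption: the standalone worker nodes use the bare prefix as their cloud name.
  val workerCloudName: String = "mah2out"

  // Mirrors the expression quoted from DistributedH2OSuite.scala:
  // the prefix followed by a millisecond timestamp.
  def suffixedName(nowMillis: Long): String = "mah2out" + nowMillis

  // Hypothetical stand-in for the join rule: two nodes share a cloud
  // only when their cloud names are identical.
  def sameCloud(a: String, b: String): Boolean = a == b

  def main(args: Array[String]): Unit = {
    val testName = suffixedName(System.currentTimeMillis())
    // The timestamp suffix makes the name differ from the workers' name on every run.
    println(s"suffixed name matches workers: ${sameCloud(testName, workerCloudName)}")
    // A fixed name with no suffix matches.
    println(s"fixed name matches workers: ${sameCloud("mah2out", workerCloudName)}")
  }
}
```

Under these assumptions the suffixed name never matches, which is consistent with 
the tests forming their own single-node cloud instead of joining the workers.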

  was:
When one follows the instructions in the [README.md for the H2O 
module|https://github.com/apache/mahout/blob/master/h2o/README.md] and tries to 
run the tests in distributed mode, the tests run only in local mode. There are 
three steps:

# {code}
host-1:~/mahout$ ./bin/mahout h2o-node
...
.. INFO: Cloud of size 1 formed [/W.X.Y.Z:54321]
{code}
# {code}
host-2:~/mahout$ ./bin/mahout h2o-node
...
.. INFO: Cloud of size 2 formed [/A.B.C.D:54322]
{code}
# {code}
host-N:~/mahout/h2o$ mvn test
...
.. INFO: Cloud of size 3 formed [/E.F.G.H:54323]
...
All tests passed.
...
host-N:~/mahout/h2o$
{code}

The first two steps start worker nodes. The last one runs the tests. According 
to the instructions, launching the tests starts one more worker, which should 
join the same cloud that the other worker nodes form. But it does not join them 
because it has a different cloud name (or _masterURL_ in terms of the code). If 
you look into the code, you will find the following:
{code:title=DistributedH2OSuite.scala}
...
mahoutCtx = mahoutH2OContext("mah2out" + System.currentTimeMillis())
...
{code}

We tried removing the generated suffix from the cloud name. After that, 
distributed mode started to work.


> Unable to run tests on H2O engine in distributed mode
> -----------------------------------------------------
>
>                 Key: MAHOUT-1767
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1767
>             Project: Mahout
>          Issue Type: Bug
>          Components: Documentation
>    Affects Versions: 0.11.0
>            Reporter: Dmitry Yaraev
>
> When one follows the instructions in the [README.md for the H2O 
> module|https://github.com/apache/mahout/blob/master/h2o/README.md] and tries 
> to run the tests in distributed mode, the tests run only in local mode. The 
> instructions have three steps:
> # {code}
> host-1:~/mahout$ ./bin/mahout h2o-node
> ...
> .. INFO: Cloud of size 1 formed [/W.X.Y.Z:54321]
> {code}
> # {code}
> host-2:~/mahout$ ./bin/mahout h2o-node
> ...
> .. INFO: Cloud of size 2 formed [/A.B.C.D:54322]
> {code}
> # {code}
> host-N:~/mahout/h2o$ mvn test
> ...
> .. INFO: Cloud of size 3 formed [/E.F.G.H:54323]
> ...
> All tests passed.
> ...
> host-N:~/mahout/h2o$
> {code}
> The first two steps start worker nodes. The last one runs the tests. 
> According to the instructions, launching the tests starts one more worker, 
> which should join the same cloud that the other worker nodes form. But it does 
> not join them because it has a different cloud name (or _masterURL_ in terms 
> of the code). If you look into the code, you will find the following:
> {code:title=DistributedH2OSuite.scala}
> ...
> mahoutCtx = mahoutH2OContext("mah2out" + System.currentTimeMillis())
> ...
> {code}
> We tried removing the generated suffix from the cloud name. After that, 
> distributed mode started to work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
