** Also affects: sahara
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1400477

Title:
  In Juno Cannot Create Spark Cluster From Horizon

Status in OpenStack Dashboard (Horizon):
  New
Status in OpenStack Data Processing (Sahara, ex. Savanna):
  New

Bug description:
  I am trying to instantiate a Spark 1.0.0 cluster using the “Data
  Processing” section of Horizon and am running into the following
  problems:

  1-      Security Group: I have a problem with security groups both
  when defining "Node Group Templates" and when instantiating a cluster.
  In the first case, if I use an existing group, say "default", the GUI
  shows the error "Error Security group '2' not found". The Sahara log
  file indicates the same thing:

  2014-12-03 23:41:24.144 30234 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): cloudctrl1.maas17
  2014-12-03 23:41:24.146 30234 DEBUG urllib3.connectionpool [-] Setting read timeout to None _make_request /usr/lib/python2.7/dist-packages/urllib3/connectionpool.py:375
  2014-12-03 23:41:24.163 30234 DEBUG urllib3.connectionpool [-] "GET /v2/926c31c887f441f6a4e4b8031b8cc528/os-security-groups HTTP/1.1" 200 682 _make_request /usr/lib/python2.7/dist-packages/urllib3/connectionpool.py:415
  2014-12-03 23:41:24.165 30234 DEBUG sahara.utils.api [-] Validation Error occurred: error_code=400, error_message=Security group '2' not found, error_name=INVALID_REFERENCE bad_request /usr/local/lib/python2.7/dist-packages/sahara/utils/api.py:245
  2014-12-03 23:41:24.165 30234 INFO sahara.cli.sahara_all [-] 10.0.0.86 - - [03/Dec/2014 23:41:24] "POST /v1.1/926c31c887f441f6a4e4b8031b8cc528/node-group-templates HTTP/1.1" 400 221 0.063121
  2014-12-03 23:41:24.257 30234 DEBUG keystonemiddleware.auth_token [-] Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650
  2014-12-03 23:41:24.258 30234 DEBUG keystonemiddleware.auth_token [-] Removing headers from request environment: X-Identity-Status,X-Domain-Id,X-Domain-Name,X-Project-Id,X-Project-Name,X-Project-Domain-Id,X-Project-Domain-Name,X-User-Id,X-User-Name,X-User-Domain-Id,X-User-Domain-Name,X-Roles,X-Service-Catalog,X-User,X-Tenant-Id,X-Tenant-Name,X-Tenant,X-Role _remove_auth_headers /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:707
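
  The validation failure is easy to miss among the DEBUG lines. As a
  minimal sketch (not part of Sahara; the function name and regular
  expression are illustrative assumptions based only on the log format
  shown above), the error fields can be pulled out of a pasted log
  excerpt like this, even when the mail client has wrapped the lines:

```python
import re

# Pattern for the "Validation Error occurred" message as it appears in the
# log excerpt above. This is an assumption derived from that one sample.
VALIDATION_RE = re.compile(
    r"Validation Error occurred: "
    r"error_code=(?P<code>\d+), "
    r"error_message=(?P<message>.*?), "
    r"error_name=(?P<name>\w+)"
)


def find_validation_errors(log_text):
    """Return a list of dicts, one per validation error found in log_text."""
    # Collapse all whitespace so that email-wrapped log lines match as well.
    flat = " ".join(log_text.split())
    return [
        {
            "code": int(m.group("code")),
            "message": m.group("message"),
            "name": m.group("name"),
        }
        for m in VALIDATION_RE.finditer(flat)
    ]


# The wrapped line from the report, used as sample input.
sample = """\
2014-12-03 23:41:24.165 30234 DEBUG sahara.utils.api [-] Validation Error
occurred: error_code=400, error_message=Security group '2' not found,
error_name=INVALID_REFERENCE bad_request
/usr/local/lib/python2.7/dist-packages/sahara/utils/api.py:245
"""

errors = find_validation_errors(sample)
print(errors)
```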

  I can temporarily avoid the problem by selecting the "Auto Security
  Group" option. This allows a node group to be created; however, I do
  not see any new security group under Compute -> Access & Security. In
  any case, cluster instantiation then fails as well:

  2014-12-03 23:55:06.285 30234 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): cloudctrl1.maas17
  2014-12-03 23:55:06.286 30234 DEBUG urllib3.connectionpool [-] Setting read timeout to None _make_request /usr/lib/python2.7/dist-packages/urllib3/connectionpool.py:375
  2014-12-03 23:55:06.409 30234 DEBUG urllib3.connectionpool [-] "POST /v2/926c31c887f441f6a4e4b8031b8cc528/servers HTTP/1.1" 400 116 _make_request /usr/lib/python2.7/dist-packages/urllib3/connectionpool.py:415
  2014-12-03 23:55:06.451 30234 ERROR sahara.service.ops [-] Error during operating cluster 'Sprk265' (reason: Security group 6 not found for project 926c31c887f441f6a4e4b8031b8cc528. (HTTP 400))
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops Traceback (most recent call last):
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/local/lib/python2.7/dist-packages/sahara/service/ops.py", line 113, in wrapper
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     f(cluster_id, *args, **kwds)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/local/lib/python2.7/dist-packages/sahara/service/ops.py", line 198, in _provision_cluster
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     INFRA.create_cluster(cluster)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/local/lib/python2.7/dist-packages/sahara/service/direct_engine.py", line 51, in create_cluster
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     self._create_instances(cluster)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/local/lib/python2.7/dist-packages/sahara/service/direct_engine.py", line 168, in _create_instances
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     self._run_instance(cluster, node_group, idx, aa_group=aa_group)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/local/lib/python2.7/dist-packages/sahara/service/direct_engine.py", line 305, in _run_instance
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     **nova_kwargs)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/lib/python2.7/dist-packages/novaclient/v1_1/servers.py", line 883, in create
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     **boot_kwargs)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/lib/python2.7/dist-packages/novaclient/v1_1/servers.py", line 546, in _boot
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     return_raw=return_raw, **kwargs)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/lib/python2.7/dist-packages/novaclient/base.py", line 100, in _create
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     _resp, body = self.api.client.post(url, body=body)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 490, in post
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     return self._cs_request(url, 'POST', **kwargs)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 465, in _cs_request
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     resp, body = self._time_request(url, method, **kwargs)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 439, in _time_request
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     resp, body = self.request(url, method, **kwargs)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops   File "/usr/lib/python2.7/dist-packages/novaclient/client.py", line 433, in request
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops     raise exceptions.from_response(resp, body, url, method)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops BadRequest: Security group 6 not found for project 926c31c887f441f6a4e4b8031b8cc528. (HTTP 400)
  2014-12-03 23:55:06.451 30234 TRACE sahara.service.ops
  2014-12-03 23:55:06.611 30234 INFO sahara.service.direct_engine [-] Cluster 'Sprk265' creation rollback (reason: Security group 6 not found for project 926c31c887f441f6a4e4b8031b8cc528. (HTTP 400))
  2014-12-03 23:55:06.616 30234 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): cloudctrl1.maas17
  2014-12-03 23:55:06.622 30234 DEBUG urllib3.connectionpool [-] Setting read timeout to None _make_request /usr/lib/python2.7/dist-packages/urllib3/connectionpool.py:375
  2014-12-03 23:55:07.018 30234 DEBUG keystonemiddleware.auth_token [-] Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650
  2014-12-03 23:55:07.019 30234 DEBUG keystonemiddleware.auth_token [-] Removing headers from request environment: X-Identity-Status,X-Domain-Id,X-Domain-Name,X-Project-Id,X-Project-Name,X-Project-Domain-Id,X-Project-Domain-Name,X-User-Id,X-User-Name,X-User-Domain-Id,X-User-Domain-Name,X-Roles,X-Service-Catalog,X-User,X-Tenant-Id,X-Tenant-Name,X-Tenant,X-Role _remove_auth_headers /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:707

  Here is my conf file:

  [DEFAULT]
  use_floating_ips=True
  use_neutron=True
  [keystone_authtoken]
  auth_uri = http://keystone1.maas17:5000/v2.0/
  identity_uri=http://keystone1.maas17:35357/
  admin_user=sahara
  admin_password=sahara
  admin_tenant_name=sahara
  periodic_enable=true
  plugins=vanilla,hdp,idh,spark,cdh
  [database]
  connection=mysql://sahara:sahara@mysql1.maas17/sahara
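
  In an INI-style file, an option belongs to the most recent section
  header above it, so as written `periodic_enable` and `plugins` are
  read as part of `[keystone_authtoken]`. The sketch below only
  illustrates that grouping. It uses Python's standard `configparser`
  as a stand-in (Sahara reads its configuration through oslo.config,
  whose parser may differ in details), and the variable names are
  illustrative. Whether Sahara expects these two options under
  `[DEFAULT]` instead should be checked against the sample
  configuration shipped with the installed release.

```python
import configparser

# The conf file from the report, with the email indentation removed.
CONF_TEXT = """\
[DEFAULT]
use_floating_ips=True
use_neutron=True
[keystone_authtoken]
auth_uri = http://keystone1.maas17:5000/v2.0/
identity_uri=http://keystone1.maas17:35357/
admin_user=sahara
admin_password=sahara
admin_tenant_name=sahara
periodic_enable=true
plugins=vanilla,hdp,idh,spark,cdh
[database]
connection=mysql://sahara:sahara@mysql1.maas17/sahara
"""

# configparser normally merges [DEFAULT] into every section. Renaming the
# default section makes [DEFAULT] behave like an ordinary section, so the
# listing shows exactly where each option was written.
parser = configparser.ConfigParser(
    default_section="__unused__", interpolation=None
)
parser.read_string(CONF_TEXT)

options_by_section = {name: parser.options(name) for name in parser.sections()}
for name, options in options_by_section.items():
    print(name, options)
```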

  Sahara is also registered with Keystone:

  # keystone service-list
  +----------------------------------+----------+-----------------+----------------------------+
  |                id                |   name   |       type      |        description         |
  +----------------------------------+----------+-----------------+----------------------------+
  | 75b2c466c35a44d5bbe7167c1ed38e20 |  cinder  |      volume     |   Cinder Volume Service    |
  | d5b014c2d96e4d619f9d9b8e646f0f5b |   ec2    |       ec2       |  EC2 Compatibility Layer   |
  | 8180061e79a24627be43485910a9e16a |  glance  |      image      |    Glance Image Service    |
  | 4dad7b5145c842a4a8fbbab1f158629c | keystone |     identity    | Keystone Identity Service  |
  | d833594550e14d49be4df4394886c849 |   nova   |     compute     |    Nova Compute Service    |
  | fe42b5328002433989ffd6b18414aacc | quantum  |     network     | Quantum Networking Service |
  | fac3f901841c4528b3902ae1b7265b4e |    s3    |        s3       | S3 Compatible object-store |
  | 12104e23db1441a28cb42cf8c9437139 |  sahara  | data_processing |  Data processing service   |
  +----------------------------------+----------+-----------------+----------------------------+
  See https://ask.openstack.org/en/question/55161/juno-sahara-spark-100-security-group-error/ for workarounds.

  2-      Spark Login: I am not able to log in to the recommended Spark
  image, i.e., http://sahara-files.mirantis.com/saha... . Launching this
  image, either by itself or through Sahara/Data Processing, results in
  "invalid user" errors for ubuntu:

  Generation complete.
   * Stopping Handle applying cloud-config [ OK ]
   * Starting Hadoop namenode:
  starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-ubuntu.out
  open-vm-tools: not starting as this is not a VMware VM
  landscape-client is not configured, please run landscape-config.
   * Restoring resolver state...       [ OK ]
  chown: invalid user: 'ubuntu:ubuntu'
  chown: invalid user: 'ubuntu:ubuntu'
  rm: cannot remove '/tmp/in_target.d/post-install.d/20-spark': No such file or directory
   * Stopping System V runlevel compatibility [ OK ]
   * Starting execute cloud user/final scripts [ OK ]
  Cloud-init v. 0.7.5 running 'modules:final' at Thu, 04 Dec 2014 04:48:56 +0000. Up 28.44 seconds.
  2014-12-04 04:48:56,208 - util.py[WARNING]: Running ssh-authkey-fingerprints (<module 'cloudinit.config.cc_ssh_authkey_fingerprints' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_ssh_authkey_fingerprints.pyc'>) failed
  ec2:
  ec2: #############################################################
  ec2: -----BEGIN SSH HOST KEY FINGERPRINTS-----
  ec2: -----END SSH HOST KEY FINGERPRINTS-----
  ec2: #############################################################
  -----BEGIN SSH HOST KEY KEYS-----
  -----END SSH HOST KEY KEYS-----
  Cloud-init v. 0.7.5 finished at Thu, 04 Dec 2014 04:48:56 +0000. Datasource DataSourceNone.  Up 28.55 seconds
  2014-12-04 04:48:56,237 - cc_final_message.py[WARNING]: Used fallback datasource

  Ubuntu 14.04.1 LTS ubuntu ttyS0

  ubuntu login:

  Sahara log file shows repeated failed login attempts:

  2014-12-04 05:01:40.099 30327 DEBUG sahara.service.engine [-] Can't login to node sprk265-worker-002 (10.0.200.59), reason error: [Errno 110] Connection timed out _wait_until_accessible /usr/local/lib/python2.7/dist-packages/sahara/service/engine.py:110
  2014-12-04 05:01:40.116 30327 DEBUG sahara.utils.ssh_remote [-] [sprk265-worker-001] _execute_command took 127.5 seconds to complete _log_command /usr/local/lib/python2.7/dist-packages/sahara/utils/ssh_remote.py:459
  2014-12-04 05:01:40.117 30327 DEBUG sahara.service.engine [-] Can't login to node sprk265-worker-001 (10.0.200.56), reason error: [Errno 110] Connection timed out _wait_until_accessible /usr/local/lib/python2.7/dist-packages/sahara/service/engine.py:110
  2014-12-04 05:01:44.644 30327 DEBUG sahara.utils.ssh_remote [-] [sprk265-worker-003] Executing "ls .ssh/authorized_keys" _log_command /usr/local/lib/python2.7/dist-packages/sahara/utils/ssh_remote.py:459
  2014-12-04 05:01:45.015 30327 DEBUG sahara.utils.ssh_remote [-] [sprk265-controller-001] Executing "ls .ssh/authorized_keys" _log_command /usr/local/lib/python2.7/dist-packages/sahara/utils/ssh_remote.py:459
  2014-12-04 05:01:45.141 30327 DEBUG sahara.openstack.common.periodic_task [-] Running periodic task SaharaPeriodicTasks.update_job_statuses run_periodic_tasks /usr/local/lib/python2.7/dist-packages/sahara/openstack/common/periodic_task.py:193

  Note that I am able to launch other Ubuntu images using the same key
  pair.
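
  The Sahara log above reports "[Errno 110] Connection timed out",
  which is a TCP-level failure rather than a rejected login, so it may
  be worth separating "port 22 is unreachable" from "the ubuntu user
  is missing". A minimal sketch of such a check follows; the helper
  name `tcp_port_open` is hypothetical, it uses only the Python
  standard library, and it should be run from the host where Sahara
  runs:

```python
import socket


def tcp_port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A False result covers both timeouts and refused connections, and means
    the problem is on the network path (for example security group rules or
    floating IP routing) before any SSH authentication takes place.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

  For example, calling `tcp_port_open("10.0.200.59")` and
  `tcp_port_open("10.0.200.56")` probes the two worker addresses from
  the log. If both return False, the login failures would more likely
  be related to the security group problem in item 1 than to the image
  itself, although that is an inference and not something the logs
  confirm.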

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1400477/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
