Making the agent retry custom actions

2016-02-19 Thread Greg Hill
This is for Ambari 2.1.1 so apologies if this has since been fixed.  We saw a 
failure today in one of our custom actions caused by a temporary network hiccup:

Caught an exception while executing custom service command: : Can not download file from url 
https://ambari.local:443/resources//custom_actions/.hash : ; Can not download file from url 
https://ambari.local:443/resources//custom_actions/.hash : 

Is there some way to tell the agent not to fail here and just keep retrying 
until it can download the file from the server?  If it takes too long, we'll 
handle timing out the build and cleaning up ourselves.

The 'tolerate_download_failures' setting doesn't trigger a retry, it just 
relies on the local cache to proceed, and the file isn't in the local cache 
yet, so it fails with a file missing exception if we enable it.
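
What we would like is roughly the following behavior on the agent side (a 
minimal Python sketch; download_file is a hypothetical helper that raises on 
failure, not the agent's actual code):

import time

def download_with_retry(download_file, url, dest, timeout_secs=3600, interval=10):
    """Keep retrying a download until it succeeds or the deadline passes."""
    deadline = time.time() + timeout_secs
    while True:
        try:
            # hypothetical helper; raises on any download failure
            return download_file(url, dest)
        except Exception:
            if time.time() >= deadline:
                raise  # give up and let the caller clean up
            time.sleep(interval)  # transient network hiccups usually clear quickly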

Greg


Re: About the Ambari python client

2016-02-19 Thread Greg Hill
It hasn't been worked on in over a year.  I wrote my own and released it to 
PyPI:

https://github.com/jimbobhickville/python-ambariclient

I was working on contributing it back to Ambari to replace this older client, 
but I could never get the Ambari tests to finish on my machine and something in 
my changes was causing an unrelated test to fail on Jenkins (but there was 
nothing in the logs indicating what failed).

You're welcome to use it and contribute back if you find any issues.  It hasn't 
been updated for new features in 2.2 yet.
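
A rough usage sketch, from memory of the project's README at the time (the 
exact class and attribute names may have drifted, so treat it as illustrative 
only):

from ambariclient.client import Ambari

client = Ambari('ambari.example.com', port=8080,
                username='admin', password='admin')

# collections are lazy; iterating them triggers the underlying REST calls
for cluster in client.clusters:
    print(cluster.cluster_name)
    for host in cluster.hosts:
        print('  %s' % host.host_name)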

Greg

From: Lazare Deathbringer <lazare.amb...@gmail.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Thursday, February 18, 2016 at 2:25 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: About the Ambari python client

Hello!

I am contacting you because I would like to use the Ambari Python client.

I read the following documentation:
https://cwiki.apache.org/confluence/display/AMBARI/Ambari+python+Client

It says:
WARNING: The client library is still in the works and not production ready.

The page is dated June 2014, so I wanted to know whether this client library is 
now production ready.

If I have a Python program, what is better: to interact with Ambari through the 
Python client, or through the REST API?

Best regards.

Lazare


Re: Installing Ambari on RHEL 7.2/CentOS 7

2016-02-17 Thread Greg Hill
CentOS/RHEL 7.2 breaks ambari-server's init script because of some upstream 
changes in systemd:

https://issues.apache.org/jira/browse/AMBARI-14526


Greg

From: Dmitry Sen <d...@hortonworks.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, February 17, 2016 at 12:03 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: Installing Ambari on RHEL 7.2/CentOS 7


Hi,


RHEL7 and CentOS7 are supported by Ambari. Check 
https://cwiki.apache.org/confluence/display/AMBARI/Install+Ambari+2.2.1+from+Public+Repositories


BR,

Dmytro Sen


From: Edmon Begoli <ebeg...@gmail.com>
Sent: Wednesday, February 17, 2016 4:22 PM
To: user@ambari.apache.org
Subject: Installing Ambari on RHEL 7.2/CentOS 7

Hi all,

We are trying to install Hadoop with Ambari, and we are running into a number 
of errors.  (RHEL/CentOS 7 are not officially supported.)

We have corrected most of them, but I would like to ask the early adopters here 
whether you have successfully installed Hadoop and the basic components (Hive, 
etc.) on RHEL so far, and whether you have a summary of tips to share on how to 
make this process successful.




Re: custom action times out even though it never even started

2016-02-04 Thread Greg Hill
Thanks.  I added some code to wait until the agents returned to HEALTHY state 
and it seems to be a lot more reliable now.
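
For anyone else hitting this, the wait is essentially the following (a sketch 
using python-requests against the Ambari REST API; the host, credentials, and 
cluster name are placeholders):

import time
import requests

def wait_for_healthy_hosts(base_url, cluster, auth, timeout_secs=600):
    """Poll Ambari until every host in the cluster reports HEALTHY."""
    url = '%s/api/v1/clusters/%s/hosts?fields=Hosts/host_status' % (base_url, cluster)
    deadline = time.time() + timeout_secs
    while time.time() < deadline:
        hosts = requests.get(url, auth=auth, verify=False).json()['items']
        statuses = [h['Hosts']['host_status'] for h in hosts]
        if all(s == 'HEALTHY' for s in statuses):
            return
        time.sleep(10)
    raise RuntimeError('hosts never became HEALTHY: %s' % statuses)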

Greg

From: Sumit Mohanty <smoha...@hortonworks.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, February 3, 2016 at 11:11 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: custom action times out even though it never even started


Generally, if a host is in heartbeat-lost or unknown state, commands time out 
immediately.  Adding the health check will help for sure.


From: Greg Hill <greg.h...@rackspace.com>
Sent: Wednesday, February 03, 2016 10:55 AM
To: user@ambari.apache.org
Subject: custom action times out even though it never even started

In our Ambari setup, we inject some custom actions.  Generally this has worked 
well, but lately I've been testing a specific one and the behavior I'm seeing 
confuses me.  We have one custom action that will download a script from a URL 
and run it.  However, despite my setting the timeout on the script to an hour, 
it randomly "times out" on specific servers in a matter of seconds.  It times 
out before it even attempts to run the script on that host.  Is there a timeout 
I need to tweak here, based on whether ambari-agent starts running the action 
within a certain amount of time?  It's random, but it usually affects at least 
one host in the cluster.

I should note that this is done after I restart the Ambari server, so it's 
possible that the agent hasn't fully re-established communications.  Should I 
check the host status before posting my Request to run the script to make sure 
it has gotten back to HEALTHY?

Ambari 2.1.1 if that matters (we're going to update to 2.2.1 when it's out).

Any help appreciated here.

Greg



custom action times out even though it never even started

2016-02-03 Thread Greg Hill
In our Ambari setup, we inject some custom actions.  Generally this has worked 
well, but lately I've been testing a specific one and the behavior I'm seeing 
confuses me.  We have one custom action that will download a script from a URL 
and run it.  However, despite my setting the timeout on the script to an hour, 
it randomly "times out" on specific servers in a matter of seconds.  It times 
out before it even attempts to run the script on that host.  Is there a timeout 
I need to tweak here, based on whether ambari-agent starts running the action 
within a certain amount of time?  It's random, but it usually affects at least 
one host in the cluster.

I should note that this is done after I restart the Ambari server, so it's 
possible that the agent hasn't fully re-established communications.  Should I 
check the host status before posting my Request to run the script to make sure 
it has gotten back to HEALTHY?

Ambari 2.1.1 if that matters (we're going to update to 2.2.1 when it's out).

Any help appreciated here.

Greg



Re: openjdk update breaks ambari-agent 2-way ssl

2016-01-22 Thread Greg Hill
Oh, ambari uses its own config file, I didn't notice that.  I'll test this out 
and report back next week probably.

Greg

From: Robert Levas <rle...@hortonworks.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Friday, January 22, 2016 at 12:09 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: openjdk update breaks ambari-agent 2-way ssl

Hi Greg.

Can you check the details of the agent-side certificate?

openssl x509 -in /var/lib/ambari-agent/keys/HOSTNAME.crt -text -noout

I assume the signature algorithm is md5WithRSAEncryption:

Signature Algorithm: md5WithRSAEncryption

Ambari is generating this cert using a custom cnf file.

So to fix your issue, you need to edit /var/lib/ambari-server/keys/ca.config 
and change

default_md = md5

to

default_md = sha1

Then on each of your hosts, remove the cert files and restart the agent:

rm /var/lib/ambari-agent/keys/HOSTNAME.*
ambari-agent restart

I think that this should be permanently changed in Ambari since md5 is no 
longer trusted.  Then again sha1 isn't either, so maybe the default needs to be 
sha256.
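
Applying the change end to end looks roughly like this (a sketch assuming the 
default paths above and jumping straight to sha256; back up the files first, 
and adjust the file name if your agent registers under a different hostname):

# on the Ambari server: change the digest used to sign agent certificates
sed -i 's/^default_md.*/default_md = sha256/' /var/lib/ambari-server/keys/ca.config

# on each agent host: drop the old MD5-signed cert and let the agent request a new one
rm /var/lib/ambari-agent/keys/$(hostname -f).*
ambari-agent restart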

I hope this helps,

Rob





From: Greg Hill <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Friday, January 22, 2016 at 10:01 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: openjdk update breaks ambari-agent 2-way ssl

We discovered a bug last night when our centos mirror updated openjdk and 
caused cluster builds to start failing.  This is in Ambari 2.1.1 but I didn't 
see anything in github to indicate that this code has since changed.  We 
tracked it down to the removal of the md5 algorithm from the list of supported 
algorithms in openjdk:

https://rhn.redhat.com/errata/RHSA-2016-0049.html

The ambari-server log (in DEBUG mode):

sun.security.validator.ValidatorException: PKIX path validation failed: 
java.security.cert.CertPathValidatorException: Algorithm constraints check 
failed: MD5withRSA
        at sun.security.validator.PKIXValidator.doValidate(PKIXValidator.java:352)
        at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:249)
        at sun.security.validator.Validator.validate(Validator.java:260)
        at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:279)
        at sun.security.ssl.X509TrustManagerImpl.checkClientTrusted(X509TrustManagerImpl.java:130)
        at sun.security.ssl.ServerHandshaker.clientCertificate(ServerHandshaker.java:1896)
        ... 13 more
Caused by: java.security.cert.CertPathValidatorException: Algorithm constraints check failed: MD5withRSA
        at sun.security.provider.certpath.PKIXMasterCertPathValidator.validate(PKIXMasterCertPathValidator.java:135)
        at sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:219)
        at sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:140)
        at sun.security.provider.certpath.PKIXCertPathValidator.engineValidate(PKIXCertPathValidator.java:79)
        at java.security.cert.CertPathValidator.validate(CertPathValidator.java:292)
        at sun.security.validator.PKIXValidator.doValidate(PKIXValidator.java:347)

I looked at the agent code to see how it generates the cert, and it doesn't 
appear to be using md5:

https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/security.py#L35

The openssl default *is* md5 but CentOS resets the default to sha256 in 
/etc/pki/tls/openssl.cnf:

[ req ]
default_bits = 2048
default_md = sha256
default_keyfile = privkey.pem
distinguished_name  = req_distinguished_name
attributes = req_attributes
x509_extensions  = v3_ca # The extentions to add to the self signed cert

I'm not sure where to look next.  I think this is an Ambari bug, but I'm not 
exactly sure how to fix it or if we can fix it via configuration somehow.

Anyone know this stuff well and care to chime in?  Or pull someone else in who 
does?

Greg


openjdk update breaks ambari-agent 2-way ssl

2016-01-22 Thread Greg Hill
We discovered a bug last night when our centos mirror updated openjdk and 
caused cluster builds to start failing.  This is in Ambari 2.1.1 but I didn't 
see anything in github to indicate that this code has since changed.  We 
tracked it down to the removal of the md5 algorithm from the list of supported 
algorithms in openjdk:

https://rhn.redhat.com/errata/RHSA-2016-0049.html

The ambari-server log (in DEBUG mode):

sun.security.validator.ValidatorException: PKIX path validation failed: 
java.security.cert.CertPathValidatorException: Algorithm constraints check 
failed: MD5withRSA
        at sun.security.validator.PKIXValidator.doValidate(PKIXValidator.java:352)
        at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:249)
        at sun.security.validator.Validator.validate(Validator.java:260)
        at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:279)
        at sun.security.ssl.X509TrustManagerImpl.checkClientTrusted(X509TrustManagerImpl.java:130)
        at sun.security.ssl.ServerHandshaker.clientCertificate(ServerHandshaker.java:1896)
        ... 13 more
Caused by: java.security.cert.CertPathValidatorException: Algorithm constraints check failed: MD5withRSA
        at sun.security.provider.certpath.PKIXMasterCertPathValidator.validate(PKIXMasterCertPathValidator.java:135)
        at sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:219)
        at sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:140)
        at sun.security.provider.certpath.PKIXCertPathValidator.engineValidate(PKIXCertPathValidator.java:79)
        at java.security.cert.CertPathValidator.validate(CertPathValidator.java:292)
        at sun.security.validator.PKIXValidator.doValidate(PKIXValidator.java:347)

I looked at the agent code to see how it generates the cert, and it doesn't 
appear to be using md5:

https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/security.py#L35

The openssl default *is* md5 but CentOS resets the default to sha256 in 
/etc/pki/tls/openssl.cnf:

[ req ]
default_bits = 2048
default_md = sha256
default_keyfile = privkey.pem
distinguished_name  = req_distinguished_name
attributes = req_attributes
x509_extensions  = v3_ca # The extentions to add to the self signed cert

I'm not sure where to look next.  I think this is an Ambari bug, but I'm not 
exactly sure how to fix it or if we can fix it via configuration somehow.
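
To confirm the agent-side certificate is the problem, checking its signature 
algorithm is quick (a sketch assuming the default agent key directory; the file 
is named after the agent's hostname):

openssl x509 -in /var/lib/ambari-agent/keys/$(hostname -f).crt -noout -text | grep 'Signature Algorithm'
# md5WithRSAEncryption here means the updated OpenJDK will reject the handshake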

Anyone know this stuff well and care to chime in?  Or pull someone else in who 
does?

Greg


Re: Multiple CentOS versions in same stack?

2016-01-14 Thread Greg Hill
Honestly, I don't know that anyone has ever tried, but I have a feeling it
might not work out well.  The 'repo' is specified at the stack level, so
you'd have to make a new cluster after modifying the repo url on the stack
in order for the new nodes to even know to use a different repo from the
old nodes for installing packages.  Also, the os_family and os_type are in
the ambari-server configs and aren't overridable per node, unless there's
some hidden feature I'm not aware of.

Your best option is probably to spin up a new cluster with the new OS and
migrate the data.

Greg

On 1/13/16, 6:20 PM, "Andrew Robertson"  wrote:

>Has anyone ever tried to run an Ambari cluster with hosts at different
>centos versions (or some nodes with one OS like centos and other nodes
>with something else?)
>
>Any reason this wouldn't be advised?
>
>I'm considering upgrading from centos 6 -> centos 7.  Given the
>current centos 6 -> 7 upgrade path is "reinstall", this may take some
>time to accomplish and I'd end up with a mix of both machine types in
>the cluster during this time.
>
>I don't see any reasons this would not work - but I also don't see
>anything that explicitly states this is a tested/advised config
>either.
>
>Thanks!



Re: Systemd update breaks ambari-server and ambari-agent

2015-12-30 Thread Greg Hill
https://issues.apache.org/jira/browse/AMBARI-14526

I'd attempt a patch, but I don't know the packaging stuff that well and I've 
got some urgent projects ATM.  Should be relatively straightforward to someone 
who does, though, and if nobody else gets to it, I might take a stab in a 
couple weeks.

Greg

From: Sumit Mohanty <smoha...@hortonworks.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, December 30, 2015 at 1:29 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: Systemd update breaks ambari-server and ambari-agent


​Thanks Greg. Can you open a JIRA and add these to the description.


-Sumit


From: Greg Hill <greg.h...@rackspace.com>
Sent: Wednesday, December 30, 2015 11:21 AM
To: user@ambari.apache.org
Subject: Re: Systemd update breaks ambari-server and ambari-agent

This seems to work so far, in case someone else runs into the same problem:

/usr/lib/systemd/system/ambari-server.service
--

[Unit]
Description=ambari-server service
After=xe-linux-distribution.service

[Service]
Type=forking
ExecStart=/usr/sbin/ambari-server start
ExecStop=/usr/sbin/ambari-server stop

[Install]
WantedBy=multi-user.target
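
A matching sketch for the agent, if an explicit unit is wanted there as well 
(assumes the stock /usr/sbin/ambari-agent wrapper; the xe-linux-distribution 
dependency above is specific to our Xen guests):

[Unit]
Description=ambari-agent service
After=network.target

[Service]
Type=forking
ExecStart=/usr/sbin/ambari-agent start
ExecStop=/usr/sbin/ambari-agent stop

[Install]
WantedBy=multi-user.target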


From: Greg <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, December 30, 2015 at 10:10 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: Systemd update breaks ambari-server and ambari-agent

I was mistaken on one detail, ambari-agent does appear to still work with 
systemd, just not ambari-server.

Greg

From: Greg <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, December 30, 2015 at 9:31 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Systemd update breaks ambari-server and ambari-agent

A recent CentOS update (7.2) is causing ambari-server to not work with systemd.

systemctl restart ambari-server
Unit ambari-server.service failed to load: No such file or directory.

This is because ambari-server does not install a service definition file in:
/usr/lib/systemd/system/ambari-server.service

I can't find anything in the Ambari git repo referencing systemd, so maybe this 
hasn't been addressed yet?

I think what maybe happened was that RHEL/CentOS provided a shim to use the old 
sysvinit init script with systemd and they have now either removed or broken 
that.

Does anyone have a working systemd service definition for ambari-agent and 
ambari-server by chance?  Otherwise, I'll figure out how to write one I guess 
(first time for everything).

Greg


Re: Systemd update breaks ambari-server and ambari-agent

2015-12-30 Thread Greg Hill
This seems to work so far, in case someone else runs into the same problem:

/usr/lib/systemd/system/ambari-server.service
--

[Unit]
Description=ambari-server service
After=xe-linux-distribution.service

[Service]
Type=forking
ExecStart=/usr/sbin/ambari-server start
ExecStop=/usr/sbin/ambari-server stop

[Install]
WantedBy=multi-user.target


From: Greg <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, December 30, 2015 at 10:10 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: Systemd update breaks ambari-server and ambari-agent

I was mistaken on one detail, ambari-agent does appear to still work with 
systemd, just not ambari-server.

Greg

From: Greg <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, December 30, 2015 at 9:31 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Systemd update breaks ambari-server and ambari-agent

A recent CentOS update (7.2) is causing ambari-server to not work with systemd.

systemctl restart ambari-server
Unit ambari-server.service failed to load: No such file or directory.

This is because ambari-server does not install a service definition file in:
/usr/lib/systemd/system/ambari-server.service

I can't find anything in the Ambari git repo referencing systemd, so maybe this 
hasn't been addressed yet?

I think what maybe happened was that RHEL/CentOS provided a shim to use the old 
sysvinit init script with systemd and they have now either removed or broken 
that.

Does anyone have a working systemd service definition for ambari-agent and 
ambari-server by chance?  Otherwise, I'll figure out how to write one I guess 
(first time for everything).

Greg


Re: Systemd update breaks ambari-server and ambari-agent

2015-12-30 Thread Greg Hill
I was mistaken on one detail, ambari-agent does appear to still work with 
systemd, just not ambari-server.

Greg

From: Greg <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, December 30, 2015 at 9:31 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Systemd update breaks ambari-server and ambari-agent

A recent CentOS update (7.2) is causing ambari-server to not work with systemd.

systemctl restart ambari-server
Unit ambari-server.service failed to load: No such file or directory.

This is because ambari-server does not install a service definition file in:
/usr/lib/systemd/system/ambari-server.service

I can't find anything in the Ambari git repo referencing systemd, so maybe this 
hasn't been addressed yet?

I think what maybe happened was that RHEL/CentOS provided a shim to use the old 
sysvinit init script with systemd and they have now either removed or broken 
that.

Does anyone have a working systemd service definition for ambari-agent and 
ambari-server by chance?  Otherwise, I'll figure out how to write one I guess 
(first time for everything).

Greg


Systemd update breaks ambari-server and ambari-agent

2015-12-30 Thread Greg Hill
A recent CentOS update (7.2) is causing ambari-server to not work with systemd.

systemctl restart ambari-server
Unit ambari-server.service failed to load: No such file or directory.

This is because ambari-server does not install a service definition file in:
/usr/lib/systemd/system/ambari-server.service

I can't find anything in the Ambari git repo referencing systemd, so maybe this 
hasn't been addressed yet?

I think what maybe happened was that RHEL/CentOS provided a shim to use the old 
sysvinit init script with systemd and they have now either removed or broken 
that.

Does anyone have a working systemd service definition for ambari-agent and 
ambari-server by chance?  Otherwise, I'll figure out how to write one I guess 
(first time for everything).

Greg


Re: NullPointerException when posting a Request

2015-12-07 Thread Greg Hill
So, looking at our code a bit more, we submit multiple requests in parallel, 
and only one of them failed with the 500 error.  Perhaps a locking (or lack 
thereof) issue?  I can adjust our workflow to prevent the parallel submission 
as a workaround, but it seems like Ambari should be able to handle it.
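
The workaround amounts to serializing the submissions and retrying on a 500 (a 
Python sketch with python-requests; the URL, credentials, and payloads are 
placeholders):

import time
import requests

def post_request(base_url, cluster, payload, auth, retries=5):
    """POST a Request resource, retrying transient 500s instead of failing the build."""
    url = '%s/api/v1/clusters/%s/requests' % (base_url, cluster)
    headers = {'X-Requested-By': 'ambari'}
    for attempt in range(retries):
        resp = requests.post(url, json=payload, auth=auth, headers=headers, verify=False)
        if resp.status_code != 500:
            resp.raise_for_status()
            return resp.json()
        time.sleep(2 ** attempt)  # back off, then resubmit the failed request
    resp.raise_for_status()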

Greg

From: Greg <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Monday, December 7, 2015 at 11:28 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: COMMERCIAL:NullPointerException when posting a Request

Our product provisions clusters in an automated way using Ambari.  About 1 in 
50 clusters gets this error, so we're not sure exactly how to reproduce it.  It 
might be a race condition of some sort.  One of the first things we do is run a 
set of custom actions on all the nodes in the cluster.  We do that by POST'ing 
a request to the cluster.  Randomly that request will throw a 500 error with 
this NullPointerException:

https://gist.githubusercontent.com/jimbobhickville/62176c2053827a90efab/raw/34fec22ff0fda2056090377ca432b18f58073d9a/gistfile1.txt

This is on Ambari 2.1.1

The code path looks pretty normal, but I'm not much of a Java dev so there 
could be something that isn't obvious to me.  It's just looking up the current 
cluster in the database by name.  Doing a GET /clusters/ works fine 
at the same point in the process, afaict, so I don't see how it's getting a 
NullPointerException when doing it internally.

Is this a known issue?  Is there a workaround or way to tell when it's safe to 
issue the request?

Greg


NullPointerException when posting a Request

2015-12-07 Thread Greg Hill
Our product provisions clusters in an automated way using Ambari.  About 1 in 
50 clusters gets this error, so we're not sure exactly how to reproduce it.  It 
might be a race condition of some sort.  One of the first things we do is run a 
set of custom actions on all the nodes in the cluster.  We do that by POST'ing 
a request to the cluster.  Randomly that request will throw a 500 error with 
this NullPointerException:

https://gist.githubusercontent.com/jimbobhickville/62176c2053827a90efab/raw/34fec22ff0fda2056090377ca432b18f58073d9a/gistfile1.txt
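
For anyone unfamiliar with custom-action Requests, the POST looks roughly like 
this (a hypothetical sketch; the action name, context, parameters, and host 
list are placeholders, not our real values):

POST /api/v1/clusters/mycluster/requests
X-Requested-By: ambari

{
  "RequestInfo": {
    "context": "Run bootstrap script",
    "action": "run_bootstrap_script",
    "parameters": {"script_url": "https://example.com/bootstrap.sh"}
  },
  "Requests/resource_filters": [
    {"hosts": "slave-1.local,slave-2.local"}
  ]
}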

This is on Ambari 2.1.1

The code path looks pretty normal, but I'm not much of a Java dev so there 
could be something that isn't obvious to me.  It's just looking up the current 
cluster in the database by name.  Doing a GET /clusters/ works fine 
at the same point in the process, afaict, so I don't see how it's getting a 
NullPointerException when doing it internally.

Is this a known issue?  Is there a workaround or way to tell when it's safe to 
issue the request?

Greg


Re: MYSQL_SERVER install failing 100% of the time now

2015-11-30 Thread Greg Hill
Honestly, I didn't investigate it that much after I discovered it was an issue 
with the HWX mirrors.  What I can say is, the same version of Ambari installing 
the same version of HDP worked consistently before last week and broke 
consistently starting last week.  Nothing on my end changed, so something in 
the mirrors changed.  After switching mirrors to our internal mirrors which 
haven't been synced in a while, it worked again.

We don't provide a preconfigured database for Hive.  We let Ambari 
automatically create one, so I think it *does* attempt to install MYSQL_SERVER 
automatically for us.

IMO, there's another bug here besides the package update breaking things.  yum 
install commands shouldn't fail if the package is already installed.  Ambari 
should either a) check if something is installed before attempting to install 
it or b) ignore 'already installed' errors.
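
In shell terms, what we'd expect the package step to do is something like this 
(a sketch, not what Ambari actually runs):

# only attempt the install when the package is genuinely absent
rpm -q mysql-community-release >/dev/null 2>&1 || yum install -y mysql-community-release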

Greg

From: Alejandro Fernandez <afernan...@hortonworks.com>
Date: Wednesday, November 25, 2015 at 12:29 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>, Greg <greg.h...@rackspace.com>
Subject: Re: MYSQL_SERVER install failing 100% of the time now

Actually, Hive MySQL should exclude mysql-community-release when not installing 
a new MySQL Server.
Its params_linux.py contains,


# There are other packages that contain /usr/share/java/mysql-connector-java.jar (like libmysql-java),
# trying to install mysql-connector-java upon them can cause packages to conflict.
if hive_use_existing_db:
  hive_exclude_packages = ['mysql-connector-java', 'mysql', 'mysql-server',
                           'mysql-community-release', 'mysql-community-server']
else:
  if 'role' in config and config['role'] != "MYSQL_SERVER":
    hive_exclude_packages = ['mysql', 'mysql-server', 'mysql-community-release',
                             'mysql-community-server']
  if os.path.exists(mysql_jdbc_driver_jar):
    hive_exclude_packages.append('mysql-connector-java')


In metainfo.xml, redhat7 installs mysql-community-release

  redhat7
  

  mysql-community-release
  true


  mysql-community-server
  true

  



Thanks,
Alejandro

From: Alejandro Fernandez <afernan...@hortonworks.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, November 25, 2015 at 10:17 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>, "greg.h...@rackspace.com" <greg.h...@rackspace.com>
Subject: Re: MYSQL_SERVER install failing 100% of the time now

Hi Greg, what do you get after running,
yum info mysql*

It should contain the repo that provided it.

Thanks,
Alejandro

From: Greg Hill <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, November 25, 2015 at 5:26 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: MYSQL_SERVER install failing 100% of the time now

FYI, sometime in the last few days, MYSQL_SERVER install started failing 100% 
of the time with Ambari 2.1.0 and HDP 2.3.0.0 on CentOS 7.  I'm guessing that a 
previously installed package now installs mysql-community-release as a 
dependency, whereas before it was only installed at this point.  Was there a 
Hortonworks package update recently?  Switching to internal mirrors that were 
synced a while back fixes the issues, so it's definitely something specific to 
Hortonworks public mirrors.

Why this command fails if it's already installed is beyond me, but here's the 
error:

resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum -d 0 -e 0 -y install mysql-community-release' returned 1. Error: Nothing to do
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_server.py", line 64, in <module>
    MysqlServer().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 218, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_server.py", line 33, in install
    self.install_packages(env, exclude_packages=params.hive_exclude_packages)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 376, in ins

MYSQL_SERVER install failing 100% of the time now

2015-11-25 Thread Greg Hill
FYI, sometime in the last few days, MYSQL_SERVER install started failing 100% 
of the time with Ambari 2.1.0 and HDP 2.3.0.0 on CentOS 7.  I'm guessing that a 
previously installed package now installs mysql-community-release as a 
dependency, whereas before it was only installed at this point.  Was there a 
Hortonworks package update recently?  Switching to internal mirrors that were 
synced a while back fixes the issues, so it's definitely something specific to 
Hortonworks public mirrors.

Why this command fails if it's already installed is beyond me, but here's the 
error:

resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum -d 0 -e 0 -y install mysql-community-release' returned 1. Error: Nothing to do
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_server.py", line 64, in <module>
    MysqlServer().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 218, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_server.py", line 33, in install
    self.install_packages(env, exclude_packages=params.hive_exclude_packages)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 376, in install_packages
    Package(name)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 45, in action_install
    self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 49, in install_package
    shell.checked_call(cmd, sudo=True, logoutput=self.get_logoutput())
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum -d 0 -e 0 -y install mysql-community-release' returned 1. Error: Nothing to do



Re: Tez View not loading in Ambari 2.1.2

2015-10-13 Thread Greg Hill
The Tez view requires Kerberos.  I'm not really sure why it does, but it seems 
to have something to do with the proxyuser settings in the YARN timeline 
server, IIRC.  I don't know why it doesn't allow proxying system users like 
every other component does.  I could be wrong on that, since the docs just say 
it requires Kerberos with no explanation, and that was the only relevant 
setting I could find that explained it.

If we can remove that requirement, that would be great for us over here as 
well, since we don't have kerberos going yet and would like to enable the Tez 
view as well.

Greg

From: Shaik M <munna.had...@gmail.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Tuesday, October 13, 2015 at 12:11 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: COMMERCIAL:Re: Tez View not loading in Ambari 2.1.2

Team, can we get any help on this?

On 9 October 2015 at 08:49, Shaik M <munna.had...@gmail.com> wrote:
There is no JavaScript error.

We have not enabled Kerberos for Ambari.  It is running without Kerberos, which 
is enabled only for the Hadoop components.

On 9 October 2015 at 08:00, Jeffrey Sposetti <j...@hortonworks.com> wrote:
A couple things to check:

  1.  Any JavaScript errors?
  2.  Did you configure Ambari itself for Kerberos? 
http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.0/bk_ambari_views_guide/content/ch_configuring_views_for_kerberos.html

From: Shaik M <munna.had...@gmail.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Thursday, October 8, 2015 at 7:56 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: Tez View not loading in Ambari 2.1.2

Hi Srimanth,

PFA the Tez view page.

We are running a secured cluster; is that causing the issue?

Thanks

On 9 October 2015 at 07:47, Shaik M <munna.had...@gmail.com> wrote:
Hi Srimanth,

I have waited a long time (20 min), but it's not loading.

I am able to access all components in Ambari except the Tez view.

Thanks,
Shaik

On 9 October 2015 at 07:42, Srimanth Gunturi <sgunt...@hortonworks.com> wrote:

Hello,

The Tez view takes a minute to load data into the view; until then it shows as 
a blank page.

Have you waited long enough?


Are you able to access other parts of the Ambari UI like the Dashboard etc.?

Regards,

Srimanth





From: Shaik M <munna.had...@gmail.com>
Sent: Thursday, October 08, 2015 4:31 PM
To: user@ambari.apache.org
Subject: Tez View not loading in Ambari 2.1.2

Hi,

I have deployed Ambari 2.1.2 with HDP 2.3.

I used the existing Tez View in Ambari and created a Tez view instance.

I am trying to open the view from the Views tab, but the Tez view is not 
loading in the Ambari UI.  It shows a blank page.

Please help me resolve this.






ambari-server not responsive for several minutes after started

2015-10-07 Thread Greg Hill
Something is goofy with the ambari-server startup process since 2.0. The init 
script exits and status says ambari-server is up, but it doesn't respond to API 
requests for about 2 minutes afterwards.  Looking at the logs, it looks like 
it's searching all the JAR files for every view and loading them.

Can we make it so that the views code is lazy-loaded so the API can be 
responsive before all the views are loaded?
Are there other startup time improvements?
Can we make the init script not return until it's actually running?
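
In the meantime, the workaround on our side is to poll the API before issuing 
any requests (a sketch; the port and credentials are whatever your install 
uses):

# wait for the REST API to answer before doing anything else
until curl -sf -u admin:admin http://localhost:8080/api/v1/clusters > /dev/null; do
  sleep 5
done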

Greg


Re: COMMERCIAL:Hue on Ambari

2015-09-23 Thread Greg Hill
The version of HUE in Ambari is really old and deprecated.  I don't think it's 
officially supported any more. It doesn't appear that it even supports 
overriding hue-env, afaict, so you wouldn't be able to use Ambari to override 
that configuration.  There are some attempts at making an updated HUE service 
on github if you google around; I haven't tried any of them.

Wish I had better news for you.

Greg

From: Jeetendra G <jeetendr...@housing.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Wednesday, September 23, 2015 at 7:28 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: COMMERCIAL:Hue on Ambari

Hi all, I have followed the steps from 
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap-hue.html
but when I start Hue it gives me the error below.  My server already has Python 
2.7 installed and the path is set correctly, so why is Hue specifically looking 
for Python 2.6?

Sep 23 17:05:01 hadoop07.housing.com 
runuser[30301]: pam_unix(runuser:session): session opened for user hue by 
(uid=0)
Sep 23 17:05:01 hadoop07.housing.com hue[30299]: 
/usr/bin/env: python2.6: No such file or directory
Sep 23 17:05:01 hadoop07.housing.com 
runuser[30311]: pam_unix(runuser:session): session opened for user hue by 
(uid=0)
Sep 23 17:05:01 hadoop07.housing.com hue[30299]: 
Starting hue: /usr/bin/env: python2.6: No such file or directory
Sep 23 17:05:01 hadoop07.housing.com hue[30299]: 
[FAILED]
Sep 23 17:05:01 hadoop07.housing.com systemd[1]: 
hue.service: control process exited, code=exited status=127
Sep 23 17:05:01 hadoop07.housing.com systemd[1]: 
Failed to start SYSV: Hue web server.
Sep 23 17:05:01 hadoop07.housing.com systemd[1]: 
Unit hue.service entered failed state.


Re: COMMERCIAL:Re: Tez view questions

2015-09-01 Thread Greg Hill
It is.  After more research, I think this is due to the lack of kerberos on the 
cluster.  Apparently the Tez view requires kerberos to be installed and running 
to proxy users to the Job History page.  I'm not sure why this limitation 
exists and it can't just proxy system users like everything else.  The Ambari 
docs for how to set up Kerberos don't work at all on CentOS 7 (the very first 
command fails with a vague "there is missing configuration" error).

We're just moving forward with the remaining views and not the Tez one for now. 
 We'll figure out Kerberos later.

Greg

From: Srimanth Gunturi <sgunt...@hortonworks.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Tuesday, September 1, 2015 at 4:16 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: COMMERCIAL:Re: Tez view questions


Hi Greg,

Can you please verify that "hive.execution.engine" in hive-site.xml is set to 
'tez' ?

Regards,

Srimanth




From: Greg Hill <greg.h...@rackspace.com>
Sent: Monday, August 31, 2015 7:33 AM
To: user@ambari.apache.org
Subject: Tez view questions

So with the 2.1.1 release, I can now enable the Tez view on my clusters, but it 
doesn't seem to show anything.  Similarly, the Tez tab on the Hive view doesn't 
show anything.  It should show any Pig or Hive jobs that were run previously, 
right?  I verified that the job history page has the jobs.  What might I be 
missing?

The release docs mention needing Kerberos for the Tez view, but we aren't using 
Kerberos for any of the services on the cluster yet.  Is there something with 
the view itself that requires Kerberos?

Any pointers are appreciated.  This is the last roadblock to us offering the 
views to our customers.

Greg


Tez view questions

2015-08-31 Thread Greg Hill
So with the 2.1.1 release, I can now enable the Tez view on my clusters, but it 
doesn't seem to show anything.  Similarly, the Tez tab on the Hive view doesn't 
show anything.  It should show any Pig or Hive jobs that were run previously, 
right?  I verified that the job history page has the jobs.  What might I be 
missing?

The release docs mention needing Kerberos for the Tez view, but we aren't using 
Kerberos for any of the services on the cluster yet.  Is there something with 
the view itself that requires Kerberos?

Any pointers are appreciated.  This is the last roadblock to us offering the 
views to our customers.

Greg


When is 2.1.1 planned for release?

2015-08-28 Thread Greg Hill
I need the HTTP_ONLY fix for the Tez view.  What's the ETA on release for 2.1.1?

Greg


Re: COMMERCIAL:Setting the Ambari API hostname in views

2015-08-28 Thread Greg Hill
FWIW, I figured this out.  It is not configurable, it just looks up the local 
system hostname.  It looks like I can tell the agent to use a custom script to 
figure out what the agent hostname should be, so I can configure the system 
with the fqdn as the hostname and inject a script for the agent to get the 
local hostname and that should hopefully work.  I'm going to open a JIRA to 
suggest that both the agent and server hostnames should be configurable.
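
For reference, the hook looks roughly like this (a sketch; I believe the 
setting is called hostname_script in ambari-agent.ini, but check the agent docs 
before relying on it):

# /etc/ambari-agent/conf/ambari-agent.ini (fragment)
[agent]
hostname_script=/var/lib/ambari-agent/hostname.sh

# /var/lib/ambari-agent/hostname.sh: a shell script that prints the internal
# name the agent should register as
echo "$(hostname -s).local"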

Greg

From: Greg <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Tuesday, August 25, 2015 at 2:30 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: COMMERCIAL:Setting the Ambari API hostname in views

I'm trying to enable views, but Ambari is automatically configuring them with 
the local domain hostname that we use for internal cluster traffic, rather than 
the public fqdn that we use for the Ambari API.  Because of this we get an SSL 
hostname mismatch and the views don't work.  I tried setting ambari.server.url 
to the fqdn, but the view still tries to validate the cert for the local 
hostname.  Is there some way to tell Ambari what hostname to use for all API 
traffic from the views?

Trying to dig through the code to figure out where that value is populated, but 
no luck so far.

Greg


Setting the Ambari API hostname in views

2015-08-25 Thread Greg Hill
I'm trying to enable views, but Ambari is automatically configuring them with 
the local domain hostname that we use for internal cluster traffic, rather than 
the public fqdn that we use for the Ambari API.  Because of this we get an SSL 
hostname mismatch and the views don't work.  I tried setting ambari.server.url 
to the fqdn, but the view still tries to validate the cert for the local 
hostname.  Is there some way to tell Ambari what hostname to use for all API 
traffic from the views?

Trying to dig through the code to figure out where that value is populated, but 
no luck so far.

Greg


Url validation bug in views?

2015-08-20 Thread Greg Hill
I'm working on enabling views, and there's an issue in that the ambari server 
is known locally on the cluster as 'ambari.local', but has a fqdn for external 
access.  The fqdn looks something like:

ambari-$uuid.domain.com

For some reason, when I try to configure the CAPACITY-SCHEDULER view, if I let 
Ambari generate the API url, it uses ambari.local as the hostname and that 
hostname doesn't match the SSL certificate, so the view fails to connect due to 
SSL errors.  If I put in the real URL, Ambari complains that it's invalid and 
rejects the View Instance creation.  It's a valid domain name, so I don't know 
why Ambari is rejecting it.  Here's an example:

https://ambari-788df910ee9f23cad74712226f55b56f.cbdptest.com:443/api/v1/clusters/mycluster

I don't know if it dislikes the hyphen, or if it thinks the domain is too long, 
or what.  I know I've run into domain validators in the past that misread the 
spec and thought the entire domain name had to be shorter than 64 characters, 
when in fact only each label of it does.  And hyphens are allowed as well.  The 
error just says that the URL needs to have a protocol, domain name, port, and 
cluster name.  If I change just the domain name to something else like 
'ambari.com', it accepts it, but the view is broken because it's the wrong URL.
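
For what it's worth, the relevant limits are easy to check (a Python sketch of 
the rule I'd expect a validator to apply; this is not Ambari's actual 
validation code):

def hostname_is_valid(name):
    """RFC 1035-style check: whole name <= 253 chars, each label 1-63 chars,
    and labels may contain hyphens but cannot start or end with one."""
    if len(name) > 253:
        return False
    labels = name.rstrip('.').split('.')
    return all(
        1 <= len(label) <= 63 and not label.startswith('-') and not label.endswith('-')
        for label in labels
    )

# the hostname Ambari rejected passes these rules
print(hostname_is_valid('ambari-788df910ee9f23cad74712226f55b56f.cbdptest.com'))  # True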

Is there any way I can tell Ambari to skip validation until it can be corrected?

Greg


Re: COMMERCIAL:Re: GANGLIA broken in Ambari 2.1?

2015-07-27 Thread Greg Hill
I'm aware of Metrics, but Ganglia is in fact still available and from what I 
read, supported.  We didn't want to convert over immediately because of the 
dependence on Hbase/Zookeeper requiring additional resources on the cluster, 
and the fact that it's a brand new thing vs something that's been around and 
battle-hardened for years.

Greg

From: Jayesh Thakrar <j_thak...@yahoo.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>, Jayesh Thakrar <j_thak...@yahoo.com>
Date: Monday, July 27, 2015 at 4:16 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: COMMERCIAL:Re: GANGLIA broken in Ambari 2.1?

Hi Greg,

There's no Ganglia now in Ambari 2.1.

It has been replaced by an internal/local HBase and a metrics collector 
subsystem.

You will find some details on the wiki.

Thanks,
Jayesh


From: Greg Hill <greg.h...@rackspace.com>
To: "user@ambari.apache.org" <user@ambari.apache.org>
Sent: Monday, July 27, 2015 3:18 PM
Subject: GANGLIA broken in Ambari 2.1?

In the Centos7 HDP2.3 stack, it attempts to run '/etc/init.d/httpd' which 
doesn't exist, rather than using the 'service' shortcut that does still work, 
even though it forwards to 'systemctl'.  I injected a script into 
/etc/init.d/httpd to work around this, but the stack should probably be fixed.

In the Centos6 HDP2.2 stack, after everything finishes installing Ambari says 
that Ganglia Server is down, even though both httpd and gmetad are up and 
responsive.  You can restart it fine, and it says it's up for a minute or so, 
then says it's down.  Processes are still running, no errors in the logs.  Is 
this a known issue?  Is there a workaround?

Also, since it now installs the 'ganglia-gmond' package instead of the 
versioned 'ganglia-gmond-3.5.0', it gets a conflict with the Centos6 EPEL repo, 
which has version 3.7 available.  I disabled the epel repo for now.

I'll try to gather more details and open a JIRA, but just wondered if someone 
else had run into this and/or solved it.

Thanks,
Greg






Re: COMMERCIAL:GANGLIA broken in Ambari 2.1?

2015-07-27 Thread Greg Hill
Apparently the gmetad service was running and the hdp-gmetad wrapper service 
didn't recognize that and tried to start it again on the same port.  Had to 
chkconfig gmetad off and service gmetad stop on the server before handing it 
off to Ambari.  I'll submit a JIRA tomorrow.

Greg

From: Greg <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Monday, July 27, 2015 at 3:18 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: COMMERCIAL:GANGLIA broken in Ambari 2.1?

In the Centos7 HDP2.3 stack, it attempts to run '/etc/init.d/httpd' which 
doesn't exist, rather than using the 'service' shortcut that does still work, 
even though it forwards to 'systemctl'.  I injected a script into 
/etc/init.d/httpd to work around this, but the stack should probably be fixed.

In the Centos6 HDP2.2 stack, after everything finishes installing Ambari says 
that Ganglia Server is down, even though both httpd and gmetad are up and 
responsive.  You can restart it fine, and it says it's up for a minute or so, 
then says it's down.  Processes are still running, no errors in the logs.  Is 
this a known issue?  Is there a workaround?

Also, since it now installs the 'ganglia-gmond' package instead of the 
versioned 'ganglia-gmond-3.5.0', it gets a conflict with the Centos6 EPEL repo, 
which has version 3.7 available.  I disabled the epel repo for now.

I'll try to gather more details and open a JIRA, but just wondered if someone 
else had run into this and/or solved it.

Thanks,
Greg




GANGLIA broken in Ambari 2.1?

2015-07-27 Thread Greg Hill
In the Centos7 HDP2.3 stack, it attempts to run '/etc/init.d/httpd' which 
doesn't exist, rather than using the 'service' shortcut that does still work, 
even though it forwards to 'systemctl'.  I injected a script into 
/etc/init.d/httpd to work around this, but the stack should probably be fixed.
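
Such a shim can be as small as the following (a sketch, not necessarily the 
exact script used; it just forwards init-style invocations to systemctl):

#!/bin/sh
# /etc/init.d/httpd -- compatibility shim for scripts that still call the old init path
exec /usr/bin/systemctl "$1" httpd.service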

In the Centos6 HDP2.2 stack, after everything finishes installing Ambari says 
that Ganglia Server is down, even though both httpd and gmetad are up and 
responsive.  You can restart it fine, and it says it's up for a minute or so, 
then says it's down.  Processes are still running, no errors in the logs.  Is 
this a known issue?  Is there a workaround?

Also, since it now installs the 'ganglia-gmond' package instead of the 
versioned 'ganglia-gmond-3.5.0', it gets a conflict with the Centos6 EPEL repo, 
which has version 3.7 available.  I disabled the epel repo for now.

I'll try to gather more details and open a JIRA, but just wondered if someone 
else had run into this and/or solved it.

Thanks,
Greg




Re: Adding Local Repos to Ambari via REST APIs

2015-07-21 Thread Greg Hill
Just do GET instead of PUT to get the current values.
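
For example (a sketch using the HDP path from the wiki; adjust the host, 
credentials, and IDs for your own stack):

# read the current repository definition
curl -u admin:admin \
  http://ambari.example.com:8080/api/v1/stacks/HDP/versions/2.2/operating_systems/redhat6/repositories/HDP-2.2

# point the same repository at a local mirror
curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"Repositories": {"base_url": "http://mirror.example.com/hdp/HDP-2.2", "verify_base_url": true}}' \
  http://ambari.example.com:8080/api/v1/stacks/HDP/versions/2.2/operating_systems/redhat6/repositories/HDP-2.2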

Greg

From: Pratik Gadiya <pratik_gad...@persistent.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Tuesday, July 21, 2015 at 5:22 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: COMMERCIAL:Adding Local Repos to Ambari via REST APIs

Hi Team,

I learned that BigInsights is integrated with Ambari in the BI 4.0 release.
So I manually deployed the cluster via the Ambari UI, and all the services were 
up and running.  For this deployment I used my local repository.

Now I want to convert these manual steps into automation, and I was wondering 
how I can add my local stack repositories via the Ambari REST APIs.

I investigated this and found some pointers at 
https://cwiki.apache.org/confluence/display/AMBARI/Blueprints#Blueprints-Step4:SetupStackRepositories(Optional)
However, I do not understand what the values should be, or how to obtain the 
placeholder names (:stack, :stackVersion, :osType, :repoId) in the call below:

PUT 
/api/v1/stacks/:stack/versions/:stackVersion/operating_systems/:osType/repositories/:repoId

For HDP it's something like:
/api/v1/stacks/HDP/versions/2.2/operating_systems/redhat6/repositories/HDP-2.2

Is there any way I can get these values via the REST API on the deployed BI 4.0 
cluster?

Help Appreciated !!


With Regards,
Pratik Gadiya




oozie + hive broken in Ambari 2.0?

2015-06-10 Thread Greg Hill
For the life of me, I can't figure out what's going on with this.  We use the 
example hive job as a test to verify that oozie + hive is working in new 
versions of HDP that we test.  With HDP 2.2.4 and Ambari 2.0, it keeps failing 
with this nebulous error:

2015-06-09 19:41:04,705  WARN HiveActionExecutor:546 - 
SERVER[secondary-1.local] USER[greg7137] GROUP[-] TOKEN[] APP[hive-wf] 
JOB[010-150609171709914-oozie-oozi-W] 
ACTION[010-150609171709914-oozie-oozi-W@hive-node] Launcher ERROR, reason: 
Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [1]

However, the actual hive job completed successfully.  Nothing else is logged 
anywhere, not by oozie, not by hive.  I have a feeling it's a classpath issue 
with the sharelib that is pushed into hdfs by Ambari, but I have to use 
'oozie.use.system.libpath=true' for it to even find the Hive JARs at all.  But 
I think it's also pulling in a conflicting JAR in one of the other folders.
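
For reference, the relevant job.properties lines for the example workflow look 
roughly like this (a sketch; the host names are placeholders and the rest of 
the file is omitted):

nameNode=hdfs://master-1.local:8020
jobTracker=master-1.local:8050
oozie.wf.application.path=${nameNode}/user/${user.name}/examples/apps/hive
# without this the launcher cannot find the Hive JARs in the sharelib at all
oozie.use.system.libpath=true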

Does anyone have Oozie + Hive working properly with Ambari 2.0/HDP 2.2.4?
Did you have to do anything special?

I can provide more logs if you like, but there's nothing else useful anywhere.  
Nothing to indicate what the actual error is.  I cranked every log in the 
system up to DEBUG and still produced nothing else useful.  No stack trace.  
Nothing.  The YARN job completes fine, but oozie thinks it failed.  Switching 
between Tez and MapReduce has no effect, same error.

Thanks in advance for any suggestions of things to check for.

Greg


Re: Ambari 2.0 DECOMMISSION

2015-05-15 Thread Greg Hill
I guess that makes sense if you plan to re-add a host with the same name, but 
we don't ever do that in our product.  Is there a way to disable the behavior?  
I'd like to be able to remove datanodes without alerts being triggered, as this 
is a common use-case for us (we do automated hdp provisioning on the cloud).  
I'm adding in a step to restart the namenode to see if that solves the alert 
problem, but ideally we'd like to not have to restart the namenode if we can 
avoid it.

Thanks in advance for any suggestions.

Greg

From: Sumit Mohanty <smoha...@hortonworks.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Thursday, May 14, 2015 at 4:17 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>, Sean Roberts <srobe...@hortonworks.com>
Subject: COMMERCIAL:Re: Ambari 2.0 DECOMMISSION


​Occasions where I do not see the node go to decommission is when the 
replication factor (dfs.replication) is equal to or greater than the number of 
data nodes that are active.


Hosts get removed from exclude file when the host gets deleted. This was added 
at some point so that when the host is added back the DN can join normally. 
Host component start/stop should not trigger this.


From: Greg Hill <greg.h...@rackspace.com>
Sent: Thursday, May 14, 2015 11:46 AM
To: user@ambari.apache.org; Sean Roberts
Subject: Re: Ambari 2.0 DECOMMISSION

Some further testing results:

1. Turning on maintenance mode beforehand didn't seem to affect it.
2. The datanodes do go to decommissioning briefly before they go back to live, 
so it is at least trying to decommission them.  Shouldn't they go to 
'decommissioned' after it finishes though?
3. Some operation I'm doing (either stop host components or deleting host 
components) is causing Ambari to automatically do a request like this for each 
node that's been decommissioned:
Remove host slave-6.local from exclude file
When that's done is when they get marked "dead" by the Namenode.

This worked fine in Ambari 1.7, so I'm guessing the "remove host from exclude 
file" thing is what's breaking it as that's new.  Is there some way to disable 
that?  Can someone explain the rationale behind it?  I'd like to be able to 
remove nodes without having to restart the Namenode.
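
As a manual stopgap, the exclusion can be re-applied by hand so the NameNode 
keeps treating the hosts as decommissioned (a sketch using the paths from the 
task output below; run on the NameNode host as the hdfs user):

# put the removed hosts back into the exclude file and refresh, no NameNode restart needed
echo "slave-2.local" >> /etc/hadoop/conf/dfs.exclude
echo "slave-4.local" >> /etc/hadoop/conf/dfs.exclude
hadoop --config /etc/hadoop/conf dfsadmin -refreshNodes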

Greg

From: Greg <greg.h...@rackspace.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Thursday, May 14, 2015 at 10:59 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>, Sean Roberts <srobe...@hortonworks.com>
Subject: COMMERCIAL:Ambari 2.0 DECOMMISSION

Did anything change with DECOMMISSION in the 2.0 release?  The process appears 
to decommission fine (the request completes and says it updated the dfs.exclude 
file), but the datanodes aren't decommissioned and HDFS now says they're dead 
and I need to restart the Namenode.  For YARN, the nodemanagers appear to have 
decommissioned ok and are in decommissioned status, but it says I need to 
restart the resource manager (this didn't used to be the case in 1.7.0).

The only difference is that I don't set maintenance mode on the datanodes until 
after the decommission completes, because that wasn't working for me at one 
point (turns out hitting the API slightly differently would have made it work). 
 Is that the cause maybe?  Is restarting the master services now required after 
a decommission?


Task output:

DataNode Decommission: slave-2.local,slave-4.local

stderr:
None
 stdout:
2015-05-14 14:45:48,439 - u"File['/etc/hadoop/conf/dfs.exclude']" {'owner': 
'hdfs', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
2015-05-14 14:45:48,670 - Writing u"File['/etc/hadoop/conf/dfs.exclude']" 
because contents don't match
2015-05-14 14:45:48,864 - u"Execute['']" {'user': 'hdfs'}
2015-05-14 14:45:48,968 - u"ExecuteHadoop['dfsadmin -refreshNodes']" 
{'bin_dir': '/usr/hdp/current/hadoop-client/bin', 'conf_dir': 
'/etc/hadoop/conf', 'kinit_override': True, 'user': 'hdfs'}
2015-05-14 14:45:49,011 - u"Execute['hadoop --config /etc/hadoop/conf dfsadmin 
-refreshNodes']" {'logoutput': None, 'try_sleep': 0, 'environment': {}, 
'tries': 1, 'user': 'hdfs', 'path': ['/usr/hdp/current/hadoop-client/bin'

Re: Ambari 2.0 DECOMMISSION

2015-05-14 Thread Greg Hill
Some further testing results:

1. Turning on maintenance mode beforehand didn't seem to affect it.
2. The datanodes do go to decommissioning briefly before they go back to live, 
so it is at least trying to decommission them.  Shouldn't they go to 
'decommissioned' after it finishes though?
3. Some operation I'm doing (either stop host components or deleting host 
components) is causing Ambari to automatically do a request like this for each 
node that's been decommissioned:
Remove host slave-6.local from exclude file
When that's done is when they get marked "dead" by the Namenode.

This worked fine in Ambari 1.7, so I'm guessing the "remove host from exclude 
file" thing is what's breaking it as that's new.  Is there some way to disable 
that?  Can someone explain the rationale behind it?  I'd like to be able to 
remove nodes without having to restart the Namenode.

Greg

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Thursday, May 14, 2015 at 10:59 AM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>, Sean Roberts 
mailto:srobe...@hortonworks.com>>
Subject: COMMERCIAL:Ambari 2.0 DECOMMISSION

Did anything change with DECOMMISSION in the 2.0 release?  The process appears 
to decommission fine (the request completes and says it updated the dfs.exclude 
file), but the datanodes aren't decommissioned and HDFS now says they're dead 
and I need to restart the Namenode.  For YARN, the nodemanagers appear to have 
decommissioned ok and are in decommissioned status, but it says I need to 
restart the resource manager (this didn't used to be the case in 1.7.0).

The only difference is that I don't set maintenance mode on the datanodes until 
after the decommission completes, because that wasn't working for me at one 
point (turns out hitting the API slightly differently would have made it work). 
 Is that the cause maybe?  Is restarting the master services now required after 
a decommission?


Task output:

DataNode Decommission: slave-2.local,slave-4.local

stderr:
None
 stdout:
2015-05-14 14:45:48,439 - u"File['/etc/hadoop/conf/dfs.exclude']" {'owner': 
'hdfs', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
2015-05-14 14:45:48,670 - Writing u"File['/etc/hadoop/conf/dfs.exclude']" 
because contents don't match
2015-05-14 14:45:48,864 - u"Execute['']" {'user': 'hdfs'}
2015-05-14 14:45:48,968 - u"ExecuteHadoop['dfsadmin -refreshNodes']" 
{'bin_dir': '/usr/hdp/current/hadoop-client/bin', 'conf_dir': 
'/etc/hadoop/conf', 'kinit_override': True, 'user': 'hdfs'}
2015-05-14 14:45:49,011 - u"Execute['hadoop --config /etc/hadoop/conf dfsadmin 
-refreshNodes']" {'logoutput': None, 'try_sleep': 0, 'environment': {}, 
'tries': 1, 'user': 'hdfs', 'path': ['/usr/hdp/current/hadoop-client/bin']}

DataNodes Status3 live / 2 dead / 0 decommissioning

NodeManager Decommission: slave-2.local,slave-4.local

stderr:
None
 stdout:
2015-05-14 14:47:16,491 - u"File['/etc/hadoop/conf/yarn.exclude']" {'owner': 
'yarn', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
2015-05-14 14:47:16,866 - Writing u"File['/etc/hadoop/conf/yarn.exclude']" 
because contents don't match
2015-05-14 14:47:17,057 - u"Execute[' yarn --config /etc/hadoop/conf rmadmin 
-refreshNodes']" {'environment': {'PATH': 
'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/bin:/usr/bin:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin:/usr/hdp/current/hadoop-yarn-resourcemanager/bin'},
 'user': 'yarn'}

NodeManagers Status 3 active / 0 lost / 0 unhealthy / 0 rebooted / 2 
decommissioned



Ambari 2.0 DECOMMISSION

2015-05-14 Thread Greg Hill
Did anything change with DECOMMISSION in the 2.0 release?  The process appears 
to decommission fine (the request completes and says it updated the dfs.exclude 
file), but the datanodes aren't decommissioned and HDFS now says they're dead 
and I need to restart the Namenode.  For YARN, the nodemanagers appear to have 
decommissioned ok and are in decommissioned status, but it says I need to 
restart the resource manager (this didn't used to be the case in 1.7.0).

The only difference is that I don't set maintenance mode on the datanodes until 
after the decommission completes, because that wasn't working for me at one 
point (turns out hitting the API slightly differently would have made it work). 
 Is that the cause maybe?  Is restarting the master services now required after 
a decommission?


Task output:

DataNode Decommission: slave-2.local,slave-4.local

stderr:
None
 stdout:
2015-05-14 14:45:48,439 - u"File['/etc/hadoop/conf/dfs.exclude']" {'owner': 
'hdfs', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
2015-05-14 14:45:48,670 - Writing u"File['/etc/hadoop/conf/dfs.exclude']" 
because contents don't match
2015-05-14 14:45:48,864 - u"Execute['']" {'user': 'hdfs'}
2015-05-14 14:45:48,968 - u"ExecuteHadoop['dfsadmin -refreshNodes']" 
{'bin_dir': '/usr/hdp/current/hadoop-client/bin', 'conf_dir': 
'/etc/hadoop/conf', 'kinit_override': True, 'user': 'hdfs'}
2015-05-14 14:45:49,011 - u"Execute['hadoop --config /etc/hadoop/conf dfsadmin 
-refreshNodes']" {'logoutput': None, 'try_sleep': 0, 'environment': {}, 
'tries': 1, 'user': 'hdfs', 'path': ['/usr/hdp/current/hadoop-client/bin']}

DataNodes Status 3 live / 2 dead / 0 decommissioning

NodeManager Decommission: slave-2.local,slave-4.local

stderr:
None
 stdout:
2015-05-14 14:47:16,491 - u"File['/etc/hadoop/conf/yarn.exclude']" {'owner': 
'yarn', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
2015-05-14 14:47:16,866 - Writing u"File['/etc/hadoop/conf/yarn.exclude']" 
because contents don't match
2015-05-14 14:47:17,057 - u"Execute[' yarn --config /etc/hadoop/conf rmadmin 
-refreshNodes']" {'environment': {'PATH': 
'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/bin:/usr/bin:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin:/usr/hdp/current/hadoop-yarn-resourcemanager/bin'},
 'user': 'yarn'}

NodeManagers Status 3 active / 0 lost / 0 unhealthy / 0 rebooted / 2 
decommissioned



Ambari 2.0 timeline server on hdfs times out

2015-05-13 Thread Greg Hill
If I pass this configuration in to YARN on HDP2.2 using Ambari 2.0:

"yarn.timeline-service.leveldb-timeline-store.path": 
'hdfs:///apps/yarn/timeline'

It spins forever waiting to set up the HDFS paths for me.  I like that it tries 
to auto-create the folder for me, but maybe there's an order-of-operations 
problem here?  It's been going for 20 minutes now, just sitting there.

Also, if the desired behavior is to auto-create the hdfs folders, this other 
setting needs to be updated to work the same way:

"yarn.resourcemanager.fs.state-store.uri": 'hdfs:///apps/yarn/rmstore'

Setting that just makes the resource manager crash until I can go in and create 
the folder in hdfs.

Here's the task output that's  hanging.  Advice is appreciated, but I have a 
feeling I'm just going to be opening bug reports here.


2015-05-13 19:39:09,834 - u"XmlConfig['capacity-scheduler.xml']" {'group': 
'hadoop', 'conf_dir': '/etc/hadoop/conf', 'mode': 0644, 
'configuration_attributes': {}, 'owner': 'yarn', 'configurations': ...}
2015-05-13 19:39:09,850 - Generating config: 
/etc/hadoop/conf/capacity-scheduler.xml
2015-05-13 19:39:09,850 - u"File['/etc/hadoop/conf/capacity-scheduler.xml']" 
{'owner': 'yarn', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 
0644, 'encoding': 'UTF-8'}
2015-05-13 19:39:10,075 - Writing 
u"File['/etc/hadoop/conf/capacity-scheduler.xml']" because contents don't match
2015-05-13 19:39:10,255 - Changing owner for 
/etc/hadoop/conf/capacity-scheduler.xml from 0 to yarn
2015-05-13 19:39:10,305 - Changing group for 
/etc/hadoop/conf/capacity-scheduler.xml from 0 to hadoop
2015-05-13 19:39:10,355 - u"Directory['hdfs:///apps/yarn/timeline']" {'owner': 
'yarn', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2015-05-13 19:39:10,411 - Creating directory 
u"Directory['hdfs:///apps/yarn/timeline']"
2015-05-13 19:39:10,670 - Changing owner for hdfs:///apps/yarn/timeline from 0 
to yarn
2015-05-13 19:39:10,720 - Changing group for hdfs:///apps/yarn/timeline from 0 
to hadoop


Re: zookeeper required for Ambari 2.0?

2015-05-13 Thread Greg Hill
Looks like yarn.resourcemanager.store.class defaulted to 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore

Ambari should probably not set that unless there is a zookeeper_server in the 
cluster.

Greg
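
For anyone hitting the same crash, a possible stop-gap (untested here, and assuming you don't actually need ResourceManager recovery) is to point the state store away from ZooKeeper in yarn-site until Ambari stops defaulting to ZKRMStateStore:

yarn.resourcemanager.recovery.enabled = false
yarn.resourcemanager.store.class = org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore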

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Wednesday, May 13, 2015 at 10:09 AM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:zookeeper required for Ambari 2.0?

The YARN resource manager keeps crashing in Ambari 2.0 + HDP 2.2.4.2 clusters 
for me.  The error log indicates that it can't connect to zookeeper, which 
makes sense since I didn't provision zookeeper as I don't use it.  I found the 
relevant settings in the Ambari UI:

yarn.resourcemanager.zk-address = localhost:2181
yarn.resourcemanager.ha.enabled = false

Since  HA is disabled, why is it trying to use Zookeeper at all?  Attempts to 
remove the zk-address setting, which was defaulted, are met with an error "This 
field is required".  Is there some way to stop the ResourceManager from 
attempting to use Zookeeper?  Should I open a JIRA ticket about this?  Is this 
an Ambari issue or a YARN issue?

Greg

Stack trace:

2015-05-13 14:49:32,144 FATAL resourcemanager.ResourceManager 
(ResourceManager.java:main(1229)) - Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /rmstore
at 
org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:581)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1014)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1051)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1047)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1047)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1091)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1226)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
KeeperErrorCode = ConnectionLoss for /rmstore
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$1.run(ZKRMStateStore.java:300)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$1.run(ZKRMStateStore.java:296)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1076)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1095)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createRootDir(ZKRMStateStore.java:296)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:279)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:478)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
... 12 more


zookeeper required for Ambari 2.0?

2015-05-13 Thread Greg Hill
The YARN resource manager keeps crashing in Ambari 2.0 + HDP 2.2.4.2 clusters 
for me.  The error log indicates that it can't connect to zookeeper, which 
makes sense since I didn't provision zookeeper as I don't use it.  I found the 
relevant settings in the Ambari UI:

yarn.resourcemanager.zk-address = localhost:2181
yarn.resourcemanager.ha.enabled = false

Since  HA is disabled, why is it trying to use Zookeeper at all?  Attempts to 
remove the zk-address setting, which was defaulted, are met with an error "This 
field is required".  Is there some way to stop the ResourceManager from 
attempting to use Zookeeper?  Should I open a JIRA ticket about this?  Is this 
an Ambari issue or a YARN issue?

Greg

Stack trace:

2015-05-13 14:49:32,144 FATAL resourcemanager.ResourceManager 
(ResourceManager.java:main(1229)) - Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /rmstore
at 
org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:581)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1014)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1051)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1047)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1047)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1091)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1226)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
KeeperErrorCode = ConnectionLoss for /rmstore
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$1.run(ZKRMStateStore.java:300)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$1.run(ZKRMStateStore.java:296)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1076)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1095)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createRootDir(ZKRMStateStore.java:296)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:279)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:478)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
... 12 more


Ambari 2.0 HIVE_CLIENT disappeared

2015-05-12 Thread Greg Hill
I'm not sure what I'm doing to cause this.  When I use the Quick Start guide, I 
can install everything fine.  But when I tried to then use Ambari 2.0 in our 
product, HIVE_CLIENT is no longer a valid component according to Ambari.  When 
I try to add it to a blueprint, I get a 400 Error: "The component 'HIVE_CLIENT' 
in host group 'gateway' is not valid for the specified stack".  Querying the 
Ambari server in the cluster, sure enough, HIVE_CLIENT, HCAT, and MYSQL_SERVER 
are not listed as components of the HIVE service.  Any ideas what I might be 
doing to cause this?

The only differences in environment I can think of are:

1. In the broken version, I point ambari-server setup at a specific jdk, 
whereas in the quick start one I just let it use the default.
2. Broken one is CentOS 6.5, working one is CentOS 6.4 (doubtful this is the 
problem since Ambari 1.7 worked fine on 6.5).

Basically, I'm using the exact same setup steps as I was using for Ambari 1.7.0 
without issue.  What else might I check?


API response for reference:

GET /api/v1/stacks/HDP/versions/2.2/services/HIVE
{
  "href" : "/api/v1/stacks/HDP/versions/2.2/services/HIVE",
  "StackServices" : {
"comments" : null,
"custom_commands" : [ ],
"display_name" : null,
"required_services" : [
  "PIG"
],
"service_check_supported" : false,
"service_name" : "HIVE",
"service_version" : "0.14.0.2.2",
"stack_name" : "HDP",
"stack_version" : "2.2",
"user_name" : null,
"config_types" : {
  "hive-env" : {
"supports" : {
  "adding_forbidden" : "false",
  "do_not_extend" : "false",
  "final" : "false"
}
  },
  "hive-site" : {
"supports" : {
  "adding_forbidden" : "false",
  "do_not_extend" : "false",
  "final" : "true"
}
  },
  "hiveserver2-site" : {
"supports" : {
  "adding_forbidden" : "false",
  "do_not_extend" : "false",
  "final" : "true"
}
  },
  "ranger-hive-plugin-properties" : {
"supports" : {
  "adding_forbidden" : "false",
  "do_not_extend" : "false",
  "final" : "true"
}
  },
  "webhcat-site" : {
"supports" : {
  "adding_forbidden" : "false",
  "do_not_extend" : "false",
  "final" : "true"
}
  }
}
  },
  "configurations" : [
    ... elided ...
  ],
  "components" : [
{
  "href" : 
"/api/v1/stacks/HDP/versions/2.2/services/HIVE/components/HIVE_METASTORE",
  "StackServiceComponents" : {
"component_name" : "HIVE_METASTORE",
"service_name" : "HIVE",
"stack_name" : "HDP",
"stack_version" : "2.2"
  }
},
{
  "href" : 
"/api/v1/stacks/HDP/versions/2.2/services/HIVE/components/HIVE_SERVER",
  "StackServiceComponents" : {
"component_name" : "HIVE_SERVER",
"service_name" : "HIVE",
"stack_name" : "HDP",
"stack_version" : "2.2"
  }
}
  ],
  "artifacts" : [ ]
}


Re: Changing Ambari Portal Password

2015-05-11 Thread Greg Hill
It can be done in the Ambari Portal itself or via the Ambari API.  Go to the 
Admin section and there should be an option to edit users there.

On the API, a PUT call to the user record should let you update it:

PUT /api/v1/users/admin

{
"Users": {
"user_name": $desired_username,
"old_password": $old_password,
"password": $new_password
 }
}
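
If you'd rather script the raw API call than use a client library, here's a minimal sketch with python-requests (the host, port, and password values are placeholders, not anything from this thread; Ambari also expects an X-Requested-By header on write calls):

import json
import requests

# Placeholder endpoint and credentials -- substitute your own.
resp = requests.put(
    'http://ambari.local:8080/api/v1/users/admin',
    auth=('admin', 'current_password'),
    headers={'X-Requested-By': 'ambari'},
    data=json.dumps({'Users': {
        'user_name': 'admin',
        'old_password': 'current_password',
        'password': 'new_password',
    }}),
)
resp.raise_for_status()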

This is easily done with the Python ambari client I wrote, if you're using 
Python 2.7 (still need to add support for other versions).  It's probably also 
easy using the Groovy client if you're on the JVM, but I don't know the syntax 
for it.

https://github.com/jimbobhickville/python-ambariclient

client.users(user_name).update(old_password=old_password, 
password=new_password, user_name=user_name)

Greg

From: Pratik Gadiya 
mailto:pratik_gad...@persistent.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Monday, May 11, 2015 at 9:28 AM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Changing Ambari Portal Password

Hi All,

Can someone let me know how can I change the ambari portal password ?
Is there any configs file from which I can change that, as I want to automate 
this process ?

Please let me know if any additional information is required.

Thanks & Regards,
Pratik Gadiya



centos 6 + ambari 2.0 + hdp 2.2 == timeout

2015-05-04 Thread Greg Hill
I can't get an install to work on ambari 2.0 + hdp 2.2 using the public yum 
repos.  It just times out after 1800s on a yum install every time.  Are 
hortonworks repos being super slow or is something else screwy happening?

My repos were defaulted by Ambari to:

http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.2.4.2
http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6

I was able to log in to the node and install the package manually just fine.  
Maybe the repos were just down earlier today?  I seriously hope a default 
install doesn't take more than 30 minutes now :D

Greg
stderr:   /var/lib/ambari-agent/data/errors-29.txt

Python script has been killed due to timeout after waiting 1800 secs

stdout:   /var/lib/ambari-agent/data/output-29.txt

2015-05-04 20:32:54,008 - 
u"Directory['/var/lib/ambari-agent/data/tmp/AMBARI-artifacts/']" {'recursive': 
True}
2015-05-04 20:32:54,072 - Creating directory 
u"Directory['/var/lib/ambari-agent/data/tmp/AMBARI-artifacts/']"
2015-05-04 20:32:54,258 - 
u"File['/var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip']"
 {'content': 
DownloadSource('http://c6401.ambari.apache.org:8080/resources//UnlimitedJCEPolicyJDK7.zip')}
2015-05-04 20:32:54,346 - Downloading the file from 
http://c6401.ambari.apache.org:8080/resources//UnlimitedJCEPolicyJDK7.zip
2015-05-04 20:32:54,407 - Writing 
u"File['/var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip']"
 because it doesn't exist
2015-05-04 20:32:54,548 - u"Group['hadoop']" {'ignore_failures': False}
2015-05-04 20:32:54,550 - Adding group u"Group['hadoop']"
2015-05-04 20:32:54,594 - u"Group['nobody']" {'ignore_failures': False}
2015-05-04 20:32:54,595 - Modifying group nobody
2015-05-04 20:32:54,638 - u"Group['users']" {'ignore_failures': False}
2015-05-04 20:32:54,639 - Modifying group users
2015-05-04 20:32:54,683 - u"User['nobody']" {'gid': 'hadoop', 
'ignore_failures': False, 'groups': [u'nobody']}
2015-05-04 20:32:54,683 - Modifying user nobody
2015-05-04 20:32:54,727 - u"User['hive']" {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-05-04 20:32:54,728 - Adding user u"User['hive']"
2015-05-04 20:32:54,811 - u"User['mapred']" {'gid': 'hadoop', 
'ignore_failures': False, 'groups': [u'hadoop']}
2015-05-04 20:32:54,812 - Adding user u"User['mapred']"
2015-05-04 20:32:54,855 - u"User['ambari-qa']" {'gid': 'hadoop', 
'ignore_failures': False, 'groups': [u'users']}
2015-05-04 20:32:54,856 - Adding user u"User['ambari-qa']"
2015-05-04 20:32:54,899 - u"User['zookeeper']" {'gid': 'hadoop', 
'ignore_failures': False, 'groups': [u'hadoop']}
2015-05-04 20:32:54,899 - Adding user u"User['zookeeper']"
2015-05-04 20:32:54,942 - u"User['tez']" {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'users']}
2015-05-04 20:32:54,943 - Adding user u"User['tez']"
2015-05-04 20:32:54,987 - u"User['hdfs']" {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-05-04 20:32:54,988 - Adding user u"User['hdfs']"
2015-05-04 20:32:55,031 - u"User['yarn']" {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-05-04 20:32:55,031 - Adding user u"User['yarn']"
2015-05-04 20:32:55,075 - u"User['hcat']" {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-05-04 20:32:55,076 - Adding user u"User['hcat']"
2015-05-04 20:32:55,120 - 
u"File['/var/lib/ambari-agent/data/tmp/changeUid.sh']" {'content': 
StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2015-05-04 20:32:55,293 - Writing 
u"File['/var/lib/ambari-agent/data/tmp/changeUid.sh']" because it doesn't exist
2015-05-04 20:32:55,436 - Changing permission for 
/var/lib/ambari-agent/data/tmp/changeUid.sh from 644 to 555
2015-05-04 20:32:55,479 - 
u"Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa 
/tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa']"
 {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2015-05-04 20:32:55,603 - u"Group['hdfs']" {'ignore_failures': False}
2015-05-04 20:32:55,603 - Adding group u"Group['hdfs']"
2015-05-04 20:32:55,648 - u"User['hdfs']" {'ignore_failures': False, 'groups': 
[u'hadoop', u'hdfs']}
2015-05-04 20:32:55,648 - Modifying user hdfs
2015-05-04 20:32:55,692 - u"Directory['/etc/hadoop']" {'mode': 0755}
2015-05-04 20:32:55,736 - Creating directory u"Directory['/etc/hadoop']"
2015-05-04 20:32:55,962 - u"Directory['/etc/hadoop/conf.empty']" {'owner': 
'root', 'group': 'hadoop', 'recursive': True}
2015-05-04 20:32:56,006 - Creating directory 
u"Directory['/etc/hadoop/conf.empty']"
2015-05-04 20:32:56,191 - Changing group for /etc/hadoop/conf.empty from 0 to 
hadoop
2015-05-04 20:32:56,235 - u"Link['/etc/hadoop/conf']" {'not_if': 'ls 
/etc/hadoop/conf', 'to': '/etc/hadoop/conf.empty'}
2015-05-04 20:32:56,368 - Creating symbolic u"Link['/etc/hadoop/conf']"
2015-05-04 20:32:56,422 - u"File['/etc/hadoop/conf/hadoop-env.sh']" {'content': 
InlineTemplate(...)

Re: adjust the agent heartbeat?

2015-04-17 Thread Greg Hill
https://issues.apache.org/jira/browse/AMBARI-10571

I also opened:
https://issues.apache.org/jira/browse/AMBARI-10570

For the bug I ran into last week.

Greg

From: Sumit Mohanty mailto:smoha...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Friday, April 17, 2015 at 11:07 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: adjust the agent heartbeat?


​Not without code change. This is probably a good feature to add. Can you 
create a task?


From: Greg Hill mailto:greg.h...@rackspace.com>>
Sent: Friday, April 17, 2015 8:32 AM
To: user@ambari.apache.org<mailto:user@ambari.apache.org>
Subject: adjust the agent heartbeat?

https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/NetUtil.py#L34

Is there any way to tweak that heartbeat interval setting?  If I'm reading the 
code right, it checks in with the server every 10s.  I'd like to be able to 
tweak that and see if I can speed up build times maybe by making it check in 
more frequently.  I don't see any way to override that setting, but maybe 
there's something in the .ini file?

Greg


adjust the agent heartbeat?

2015-04-17 Thread Greg Hill
https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/NetUtil.py#L34

Is there any way to tweak that heartbeat interval setting?  If I'm reading the 
code right, it checks in with the server every 10s.  I'd like to be able to 
tweak that and see if I can speed up build times maybe by making it check in 
more frequently.  I don't see any way to override that setting, but maybe 
there's something in the .ini file?

Greg


potential bug

2015-04-08 Thread Greg Hill
I'm diagnosing an issue, and I think I found a bug with the ambari-agent code:

https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/Controller.py#L390

If 'cluster_name' has spaces in it, this request fails because it fails to 
URL-encode value.  This causes all of the agents to go to HEARTBEAT_LOST state 
and everything fails, but the error it spits out in the agent log is hugely 
misleading:

ERROR 2015-04-08 18:30:20,312 Controller.py:140 - Unable to connect to: 
https://ambari.local:8441/agent/v1/register/ambari.local
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 128, 
in registerWithServer
self.addToStatusQueue(ret['statusCommands'])
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 172, 
in addToStatusQueue
self.updateComponents(commands[0]['clusterName'])
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 360, 
in updateComponents
response = self.sendRequest(self.componentsUrl + cluster_name, None)
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 353, 
in sendRequest
+ '; Response: ' + str(response))
IOError: Response parsing failed! Request data: None; Response:

It connected fine, and parsed the response fine, but then died during 
processing of the response.  Probably shouldn't be trapping every Exception 
here:

https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/Controller.py#L170

I assume that this is a bug and we want to allow cluster names to be whatever 
the customer would like.

I'll open a JIRA unless someone can disconfirm that this is a bug.

Greg
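
For illustration, this is the kind of one-line fix I have in mind -- not a patch from the Ambari tree, just a sketch of escaping the cluster name with the stdlib before building the URL (the base URL below is a placeholder):

import urllib

def build_components_url(base_url, cluster_name):
    # 'my cluster' becomes 'my%20cluster', so spaces no longer break the request
    return base_url + urllib.quote(cluster_name, safe='')

print(build_components_url('https://ambari.local:8441/agent/v1/components/', 'my cluster'))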


Re: COMMERCIAL:Re: COMMERCIAL:Re: Did something get broken for webhcat today?

2015-03-18 Thread Greg Hill
Thanks, that seems to do it.

Greg

From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 12:22 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: COMMERCIAL:Re: Did something get broken for webhcat 
today?

See if the API call here helps…might be what you are looking for…

https://cwiki.apache.org/confluence/display/AMBARI/Blueprints#Blueprints-Step4:SetupStackRepositories%28Optional%29
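
For anyone else scripting it, a rough sketch of that call with python-requests (the repo id "HDP-2.2", the os_type "redhat6", and the host and credentials below are assumptions -- check them against the wiki page and your own install):

import json
import requests

# Pin the HDP repo Base URL back to the 2.2.0.0 GA bits.
url = ('http://ambari.local:8080/api/v1/stacks/HDP/versions/2.2'
       '/operating_systems/redhat6/repositories/HDP-2.2')
body = {'Repositories': {
    'base_url': 'http://public-repo-1.hortonworks.com/HDP/centos6/2.x/GA/2.2.0.0',
    'verify_base_url': True,
}}
resp = requests.put(url, auth=('admin', 'admin'),
                    headers={'X-Requested-By': 'ambari'},
                    data=json.dumps(body))
resp.raise_for_status()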



From: Greg Hill mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 1:11 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: COMMERCIAL:Re: Did something get broken for webhcat today?

Ok, I'll see if I can figure out the API equivalent.  We are automating 
everything since we provide hdp clusters as a service.

Greg

From: Yusaku Sako mailto:yus...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 12:06 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: Did something get broken for webhcat today?

Greg,

Ambari does automatically retrieve the repo info for the latest maintenance 
version of the stack.
For example, if you select "HDP 2.2", it will pull the latest HDP 2.2.x version.
It seems like HDP 2.2.3 was released last night, so when you are installing a 
new cluster it is trying to install with 2.2.3.
Since you already have HDP 2.2.0 bits pre-installed on your image, you need to 
explicitly set the repo URL to 2.2.0 bits in the Select Stack page, as Jeff 
mentioned.

This is only true for new clusters being installed.
For adding hosts to existing clusters, it will continue to use the repo URL 
that you originally used to install the cluster with.

Yusaku

From: Greg Hill mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Thursday, March 19, 2015 1:56 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: Did something get broken for webhcat today?

We did install that repo when we built the images we're using:

wget -O /etc/yum.repos.d/hdp.repo 
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/GA/2.2.0.0/hdp.repo

We preinstall a lot of packages on the images to reduce install time, including 
ambari.  So our version of Ambari didn't change, and we didn't inject those new 
repos.  Does ambari self-update or phone home to get the latest repos?  I can't 
figure out how the new repo got injected.

Greg


From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 11:48 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: Did something get broken for webhcat today?


In Ambari Web > Admin > Stack (or during install, on Select Stack, expand 
Advanced Repository Options): can you update your HDP repo Base URL to use the 
HDP 2.2 GA repository (instead of what it's pulling, which is 2.2.3.0)?


http://public-repo-1.hortonworks.com/HDP/centos6/2.x/GA/2.2.0.0



From: Greg Hill mailto:greg.h...@rackspace.com>>
Sent: Wednesday, March 18, 2015 12:41 PM
To: user@ambari.apache.org<mailto:user@ambari.apache.org>
Subject: Re: Did something get broken for webhcat today?

We didn't change anything.  Ambari 1.7.0, HDP 2.2.  Repos are:

[root@gateway-1 ~]# cat /etc/yum.repos.d/HDP.repo
[HDP-2.2]
name=HDP
baseurl=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.2.3.0
path=/
enabled=1
gpgcheck=0
[root@gateway-1 ~]# cat /etc/yum.repos.d/HDP-UTILS.repo
[HDP-UTILS-1.1.0.20]
name=HDP-UTILS
baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6
path=/
enabled=1
gpgcheck=0
[root@gateway-1 ~]# cat /etc/yum.repos.d/ambari.repo
[ambari-1.x]
name=Ambari 1.x
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/1.x/GA
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[Updates-ambari-1.7.0]
name=ambari-1.7.0 - Updates
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0
gpgcheck=1
gpgkey=http://public-r

Re: COMMERCIAL:Re: Did something get broken for webhcat today?

2015-03-18 Thread Greg Hill
Ok, I'll see if I can figure out the API equivalent.  We are automating 
everything since we provide hdp clusters as a service.

Greg

From: Yusaku Sako mailto:yus...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 12:06 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: Did something get broken for webhcat today?

Greg,

Ambari does automatically retrieve the repo info for the latest maintenance 
version of the stack.
For example, if you select "HDP 2.2", it will pull the latest HDP 2.2.x version.
It seems like HDP 2.2.3 was released last night, so when you are installing a 
new cluster it is trying to install with 2.2.3.
Since you already have HDP 2.2.0 bits pre-installed on your image, you need to 
explicitly set the repo URL to 2.2.0 bits in the Select Stack page, as Jeff 
mentioned.

This is only true for new clusters being installed.
For adding hosts to existing clusters, it will continue to use the repo URL 
that you originally used to install the cluster with.

Yusaku

From: Greg Hill mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Thursday, March 19, 2015 1:56 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: Did something get broken for webhcat today?

We did install that repo when we built the images we're using:

wget -O /etc/yum.repos.d/hdp.repo 
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/GA/2.2.0.0/hdp.repo

We preinstall a lot of packages on the images to reduce install time, including 
ambari.  So our version of Ambari didn't change, and we didn't inject those new 
repos.  Does ambari self-update or phone home to get the latest repos?  I can't 
figure out how the new repo got injected.

Greg


From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 11:48 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: Did something get broken for webhcat today?


In Ambari Web > Admin > Stack (or during install, on Select Stack, expand 
Advanced Repository Options): can you update your HDP repo Base URL to use the 
HDP 2.2 GA repository (instead of what it's pulling, which is 2.2.3.0)?


http://public-repo-1.hortonworks.com/HDP/centos6/2.x/GA/2.2.0.0



From: Greg Hill mailto:greg.h...@rackspace.com>>
Sent: Wednesday, March 18, 2015 12:41 PM
To: user@ambari.apache.org<mailto:user@ambari.apache.org>
Subject: Re: Did something get broken for webhcat today?

We didn't change anything.  Ambari 1.7.0, HDP 2.2.  Repos are:

[root@gateway-1 ~]# cat /etc/yum.repos.d/HDP.repo
[HDP-2.2]
name=HDP
baseurl=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.2.3.0
path=/
enabled=1
gpgcheck=0
[root@gateway-1 ~]# cat /etc/yum.repos.d/HDP-UTILS.repo
[HDP-UTILS-1.1.0.20]
name=HDP-UTILS
baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6
path=/
enabled=1
gpgcheck=0
[root@gateway-1 ~]# cat /etc/yum.repos.d/ambari.repo
[ambari-1.x]
name=Ambari 1.x
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/1.x/GA
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[Updates-ambari-1.7.0]
name=ambari-1.7.0 - Updates
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1



From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 11:26 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: Did something get broken for webhcat today?

Are you using ambari trunk or ambari 2.0.0 branch builds?

Also please confirm: your HDP repos have not changed (I.e. Are you using local 
repos for the HDP stack packages)?

From: Greg Hill mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 12:22 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Did something get broken for webhcat today?

Re: Did something get broken for webhcat today?

2015-03-18 Thread Greg Hill
We did install that repo when we built the images we're using:

wget -O /etc/yum.repos.d/hdp.repo 
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/GA/2.2.0.0/hdp.repo

We preinstall a lot of packages on the images to reduce install time, including 
ambari.  So our version of Ambari didn't change, and we didn't inject those new 
repos.  Does ambari self-update or phone home to get the latest repos?  I can't 
figure out how the new repo got injected.

Greg


From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 11:48 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: Did something get broken for webhcat today?


In Ambari Web > Admin > Stack (or during install, on Select Stack, expand 
Advanced Repository Options): can you update your HDP repo Base URL to use the 
HDP 2.2 GA repository (instead of what it's pulling, which is 2.2.3.0)?


http://public-repo-1.hortonworks.com/HDP/centos6/2.x/GA/2.2.0.0



From: Greg Hill mailto:greg.h...@rackspace.com>>
Sent: Wednesday, March 18, 2015 12:41 PM
To: user@ambari.apache.org<mailto:user@ambari.apache.org>
Subject: Re: Did something get broken for webhcat today?

We didn't change anything.  Ambari 1.7.0, HDP 2.2.  Repos are:

[root@gateway-1 ~]# cat /etc/yum.repos.d/HDP.repo
[HDP-2.2]
name=HDP
baseurl=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.2.3.0
path=/
enabled=1
gpgcheck=0
[root@gateway-1 ~]# cat /etc/yum.repos.d/HDP-UTILS.repo
[HDP-UTILS-1.1.0.20]
name=HDP-UTILS
baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6
path=/
enabled=1
gpgcheck=0
[root@gateway-1 ~]# cat /etc/yum.repos.d/ambari.repo
[ambari-1.x]
name=Ambari 1.x
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/1.x/GA
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[Updates-ambari-1.7.0]
name=ambari-1.7.0 - Updates
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1



From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 11:26 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: Did something get broken for webhcat today?

Are you using ambari trunk or ambari 2.0.0 branch builds?

Also please confirm: your HDP repos have not changed (I.e. Are you using local 
repos for the HDP stack packages)?

From: Greg Hill mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 12:22 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Did something get broken for webhcat today?

Starting this morning, we started seeing this on every single install.  I think 
someone at Hortonworks pushed out a broken RPM or something.  Any ideas?  This 
is rather urgent as we are no longer able to provision HDP 2.2 clusters at all 
because of it.


2015-03-18 15:58:05,982 - Group['hadoop'] {'ignore_failures': False}
2015-03-18 15:58:05,984 - Modifying group hadoop
2015-03-18 15:58:06,080 - Group['nobody'] {'ignore_failures': False}
2015-03-18 15:58:06,081 - Modifying group nobody
2015-03-18 15:58:06,219 - Group['users'] {'ignore_failures': False}
2015-03-18 15:58:06,220 - Modifying group users
2015-03-18 15:58:06,370 - Group['nagios'] {'ignore_failures': False}
2015-03-18 15:58:06,371 - Modifying group nagios
2015-03-18 15:58:06,474 - User['nobody'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'nobody']}
2015-03-18 15:58:06,475 - Modifying user nobody
2015-03-18 15:58:06,558 - User['hive'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:06,559 - Modifying user hive
2015-03-18 15:58:06,634 - User['mapred'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:06,635 - Modifying user mapred
2015-03-18 15:58:06,722 - User['nagios'] {'gid': 'nagios', 'ignore_failures': 
False, 'groups': [u

Re: Did something get broken for webhcat today?

2015-03-18 Thread Greg Hill
Interestingly:

[root@gateway-1 ~]# yum list installed hadoop*
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Installed Packages
hadoop-lzo.x86_64  0.6.0-1  
@HDP-UTILS-1.1.0.20
hadoop-lzo-native.x86_64   0.6.0-1  
@HDP-UTILS-1.1.0.20
hadoop_2_2_0_0_2041.x86_64 2.6.0.2.2.0.0-2041.el6   
@HDP-2.2.0.0
hadoop_2_2_0_0_2041-client.x86_64  2.6.0.2.2.0.0-2041.el6   
@HDP-2.2.0.0
hadoop_2_2_0_0_2041-hdfs.x86_642.6.0.2.2.0.0-2041.el6   
@HDP-2.2.0.0
hadoop_2_2_0_0_2041-mapreduce.x86_64   2.6.0.2.2.0.0-2041.el6   
@HDP-2.2.0.0
hadoop_2_2_0_0_2041-yarn.x86_642.6.0.2.2.0.0-2041.el6   
@HDP-2.2.0.0
hadoop_2_2_3_0_2611.x86_64 2.6.0.2.2.3.0-2611.el6   
@HDP-2.2
hadoop_2_2_3_0_2611-client.x86_64  2.6.0.2.2.3.0-2611.el6   
@HDP-2.2
hadoop_2_2_3_0_2611-hdfs.x86_642.6.0.2.2.3.0-2611.el6   
@HDP-2.2
hadoop_2_2_3_0_2611-libhdfs.x86_64 2.6.0.2.2.3.0-2611.el6   
@HDP-2.2
hadoop_2_2_3_0_2611-mapreduce.x86_64   2.6.0.2.2.3.0-2611.el6   
@HDP-2.2
hadoop_2_2_3_0_2611-yarn.x86_642.6.0.2.2.3.0-2611.el6   
@HDP-2.2
hadooplzo_2_2_3_0_2611.x86_64  0.6.0.2.2.3.0-2611.el6   
@HDP-2.2
hadooplzo_2_2_3_0_2611-native.x86_64   0.6.0.2.2.3.0-2611.el6   
@HDP-2.2

Looks like I have multiple versions installed.  Because Hortonworks stopped 
following sane packaging practices and put the version numbers in the package 
names, it doesn't recognize them as the same packages, so it just installed new 
versions alongside the old rather than updating.

I also don't understand how the repo got moved from the 2.2.0 one to the 2.2.3 
one without me doing so manually.  Does Ambari update the repos automatically 
without any input from the user?

Greg

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 11:41 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: Did something get broken for webhcat today?

We didn't change anything.  Ambari 1.7.0, HDP 2.2.  Repos are:

[root@gateway-1 ~]# cat /etc/yum.repos.d/HDP.repo
[HDP-2.2]
name=HDP
baseurl=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.2.3.0
path=/
enabled=1
gpgcheck=0
[root@gateway-1 ~]# cat /etc/yum.repos.d/HDP-UTILS.repo
[HDP-UTILS-1.1.0.20]
name=HDP-UTILS
baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6
path=/
enabled=1
gpgcheck=0
[root@gateway-1 ~]# cat /etc/yum.repos.d/ambari.repo
[ambari-1.x]
name=Ambari 1.x
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/1.x/GA
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[Updates-ambari-1.7.0]
name=ambari-1.7.0 - Updates
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1



From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 11:26 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: Did something get broken for webhcat today?

Are you using ambari trunk or ambari 2.0.0 branch builds?

Also please confirm: your HDP repos have not changed (I.e. Are you using local 
repos for the HDP stack packages)?

From: Greg Hill mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 12:22 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Did something get broken for webhcat today?

Starting this morning, we started seeing this on every single install.  I think 
someone at Hortonworks pushed out a broken RPM or something.  Any ideas?  This 
is rather urgent as we are no longer able to provision HDP 2.2 clusters at all 
because of it.


2015-03-18 15:58:05,982 - Group['hadoop'] {'ignore_failures': False}
2015-03-18 15:58:05,984 - Modifying group hadoop
2015-03-18 15:58:06,080 - Group['nobody'] {'ignore_failures': False}
2015-03-18 15:58:06,081 - Modifying group nobody
2015-03-18 15:58:06,219 - Group['users'] {'ignore_failures'

Re: Did something get broken for webhcat today?

2015-03-18 Thread Greg Hill
We didn't change anything.  Ambari 1.7.0, HDP 2.2.  Repos are:

[root@gateway-1 ~]# cat /etc/yum.repos.d/HDP.repo
[HDP-2.2]
name=HDP
baseurl=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.2.3.0
path=/
enabled=1
gpgcheck=0
[root@gateway-1 ~]# cat /etc/yum.repos.d/HDP-UTILS.repo
[HDP-UTILS-1.1.0.20]
name=HDP-UTILS
baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6
path=/
enabled=1
gpgcheck=0
[root@gateway-1 ~]# cat /etc/yum.repos.d/ambari.repo
[ambari-1.x]
name=Ambari 1.x
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/1.x/GA
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[Updates-ambari-1.7.0]
name=ambari-1.7.0 - Updates
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1



From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 11:26 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:Re: Did something get broken for webhcat today?

Are you using ambari trunk or ambari 2.0.0 branch builds?

Also please confirm: your HDP repos have not changed (I.e. Are you using local 
repos for the HDP stack packages)?

From: Greg Hill mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, March 18, 2015 at 12:22 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Did something get broken for webhcat today?

Starting this morning, we started seeing this on every single install.  I think 
someone at Hortonworks pushed out a broken RPM or something.  Any ideas?  This 
is rather urgent as we are no longer able to provision HDP 2.2 clusters at all 
because of it.


2015-03-18 15:58:05,982 - Group['hadoop'] {'ignore_failures': False}
2015-03-18 15:58:05,984 - Modifying group hadoop
2015-03-18 15:58:06,080 - Group['nobody'] {'ignore_failures': False}
2015-03-18 15:58:06,081 - Modifying group nobody
2015-03-18 15:58:06,219 - Group['users'] {'ignore_failures': False}
2015-03-18 15:58:06,220 - Modifying group users
2015-03-18 15:58:06,370 - Group['nagios'] {'ignore_failures': False}
2015-03-18 15:58:06,371 - Modifying group nagios
2015-03-18 15:58:06,474 - User['nobody'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'nobody']}
2015-03-18 15:58:06,475 - Modifying user nobody
2015-03-18 15:58:06,558 - User['hive'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:06,559 - Modifying user hive
2015-03-18 15:58:06,634 - User['mapred'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:06,635 - Modifying user mapred
2015-03-18 15:58:06,722 - User['nagios'] {'gid': 'nagios', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:06,723 - Modifying user nagios
2015-03-18 15:58:06,841 - User['ambari-qa'] {'gid': 'hadoop', 
'ignore_failures': False, 'groups': [u'users']}
2015-03-18 15:58:06,842 - Modifying user ambari-qa
2015-03-18 15:58:06,963 - User['zookeeper'] {'gid': 'hadoop', 
'ignore_failures': False, 'groups': [u'hadoop']}
2015-03-18 15:58:06,964 - Modifying user zookeeper
2015-03-18 15:58:07,093 - User['tez'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'users']}
2015-03-18 15:58:07,094 - Modifying user tez
2015-03-18 15:58:07,217 - User['hdfs'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:07,218 - Modifying user hdfs
2015-03-18 15:58:07,354 - User['yarn'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:07,355 - Modifying user yarn
2015-03-18 15:58:07,485 - User['hcat'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:07,486 - Modifying user hcat
2015-03-18 15:58:07,629 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] 

Did something get broken for webhcat today?

2015-03-18 Thread Greg Hill
Starting this morning, we started seeing this on every single install.  I think 
someone at Hortonworks pushed out a broken RPM or something.  Any ideas?  This 
is rather urgent as we are no longer able to provision HDP 2.2 clusters at all 
because of it.


2015-03-18 15:58:05,982 - Group['hadoop'] {'ignore_failures': False}
2015-03-18 15:58:05,984 - Modifying group hadoop
2015-03-18 15:58:06,080 - Group['nobody'] {'ignore_failures': False}
2015-03-18 15:58:06,081 - Modifying group nobody
2015-03-18 15:58:06,219 - Group['users'] {'ignore_failures': False}
2015-03-18 15:58:06,220 - Modifying group users
2015-03-18 15:58:06,370 - Group['nagios'] {'ignore_failures': False}
2015-03-18 15:58:06,371 - Modifying group nagios
2015-03-18 15:58:06,474 - User['nobody'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'nobody']}
2015-03-18 15:58:06,475 - Modifying user nobody
2015-03-18 15:58:06,558 - User['hive'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:06,559 - Modifying user hive
2015-03-18 15:58:06,634 - User['mapred'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:06,635 - Modifying user mapred
2015-03-18 15:58:06,722 - User['nagios'] {'gid': 'nagios', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:06,723 - Modifying user nagios
2015-03-18 15:58:06,841 - User['ambari-qa'] {'gid': 'hadoop', 
'ignore_failures': False, 'groups': [u'users']}
2015-03-18 15:58:06,842 - Modifying user ambari-qa
2015-03-18 15:58:06,963 - User['zookeeper'] {'gid': 'hadoop', 
'ignore_failures': False, 'groups': [u'hadoop']}
2015-03-18 15:58:06,964 - Modifying user zookeeper
2015-03-18 15:58:07,093 - User['tez'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'users']}
2015-03-18 15:58:07,094 - Modifying user tez
2015-03-18 15:58:07,217 - User['hdfs'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:07,218 - Modifying user hdfs
2015-03-18 15:58:07,354 - User['yarn'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:07,355 - Modifying user yarn
2015-03-18 15:58:07,485 - User['hcat'] {'gid': 'hadoop', 'ignore_failures': 
False, 'groups': [u'hadoop']}
2015-03-18 15:58:07,486 - Modifying user hcat
2015-03-18 15:58:07,629 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] 
{'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2015-03-18 15:58:07,631 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh 
ambari-qa 
/tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa
 2>/dev/null'] {'not_if': 'test $(id -u ambari-qa) -gt 1000'}
2015-03-18 15:58:07,768 - Skipping 
Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa 
/tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa
 2>/dev/null'] due to not_if
2015-03-18 15:58:07,769 - Directory['/etc/hadoop/conf.empty'] {'owner': 'root', 
'group': 'root', 'recursive': True}
2015-03-18 15:58:07,770 - Link['/etc/hadoop/conf'] {'not_if': 'ls 
/etc/hadoop/conf', 'to': '/etc/hadoop/conf.empty'}
2015-03-18 15:58:07,895 - Skipping Link['/etc/hadoop/conf'] due to not_if
2015-03-18 15:58:07,960 - File['/etc/hadoop/conf/hadoop-env.sh'] {'content': 
InlineTemplate(...), 'owner': 'hdfs'}
2015-03-18 15:58:08,092 - Execute['/bin/echo 0 > /selinux/enforce'] {'only_if': 
'test -f /selinux/enforce'}
2015-03-18 15:58:08,240 - Skipping Execute['/bin/echo 0 > /selinux/enforce'] 
due to only_if
2015-03-18 15:58:08,241 - Directory['/var/log/hadoop'] {'owner': 'root', 
'group': 'hadoop', 'mode': 0775, 'recursive': True}
2015-03-18 15:58:08,244 - Directory['/var/run/hadoop'] {'owner': 'root', 
'group': 'root', 'recursive': True}
2015-03-18 15:58:08,250 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 
'recursive': True}
2015-03-18 15:58:08,278 - File['/etc/hadoop/conf/commons-logging.properties'] 
{'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2015-03-18 15:58:08,288 - File['/etc/hadoop/conf/health_check'] {'content': 
Template('health_check-v2.j2'), 'owner': 'hdfs'}
2015-03-18 15:58:08,295 - File['/etc/hadoop/conf/log4j.properties'] {'content': 
'...', 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2015-03-18 15:58:08,322 - File['/etc/hadoop/conf/hadoop-metrics2.properties'] 
{'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
2015-03-18 15:58:08,325 - File['/etc/hadoop/conf/task-log4j.properties'] 
{'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2015-03-18 15:58:08,330 - File['/etc/hadoop/conf/configuration.xsl'] {'owner': 
'hdfs', 'group': 'hadoop'}
2015-03-18 15:58:09,219 - HdfsDirectory['/user/hcat'] {'security_enabled': 
False, 'keytab': [EMPTY], 'conf_dir': '/etc/hadoop/conf', 'hdfs_user': 'hdfs', 
'kinit_path_local': '', 'mode': 0755, 'owner': 'hcat', 'bin_dir': 
'/usr/hdp/current/hadoop-client/bin', 'action': ['create_delayed']}
2015-03-18 15:58:09,220 - HdfsDi

Re: COMMERCIAL:Re: decommission multiple nodes issue

2015-03-04 Thread Greg Hill
IIRC switching it to HOST_COMPONENT made it so I couldn't pass in multiple 
hosts (that was what I was doing originally, and Ambari just rejected the 
request outright, unless my memory is tricking me).  Maybe I just needed 
slightly different syntax for that case?

Also, decommissioning NODEMANAGER using CLUSTER and a list of hosts did not 
exhibit the same behavior.  It seemed to decommission them properly, even when 
in maintenance mode.

Greg

From: Yusaku Sako mailto:yus...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Tuesday, March 3, 2015 at 9:41 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>, Sean Roberts 
mailto:srobe...@hortonworks.com>>
Subject: COMMERCIAL:Re: decommission multiple nodes issue

Hi Greg,

This is actually by design.
If you want to decommission all DataNodes regardless of their host maintenance 
mode, you need to change "RequestInfo/level" from "CLUSTER" to "HOST_COMPONENT".
When you set the "level" to "CLUSTER", bulk operations (in this case 
decommission) would be skipped on the matching target resources in case the 
host(s) are in maintenance mode.
If you set to "HOST_COMPONENT", it would ignore any host-level maintenance mode.
This is a really mysterious, undocumented part of Ambari, unfortunately.

Yusaku

From: Greg Hill mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Tuesday, March 3, 2015 9:32 AM
To: Sean Roberts mailto:srobe...@hortonworks.com>>, 
"user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: decommission multiple nodes issue

I have verified that if maintenance mode is set on a host, then it is ignored 
by the decommission process, but only if you try to decommission multiple hosts 
at the same time.  I'll open a bug.
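For anyone else scripting this, the host-level maintenance flag we toggle looks roughly like the following through the API (a sketch only; the 'maintenance_state' field name and the example host/credentials are from memory, so verify them against your version):

import requests

def set_host_maintenance(base_url, auth, cluster_name, host_name, on=True):
    # Host-level maintenance mode; "ON" suppresses alerts, and apparently it
    # also makes CLUSTER-level bulk operations skip the host entirely.
    url = "%s/api/v1/clusters/%s/hosts/%s" % (base_url, cluster_name, host_name)
    body = {"Hosts": {"maintenance_state": "ON" if on else "OFF"}}
    resp = requests.put(url, json=body, auth=auth,
                        headers={"X-Requested-By": "ambari"})
    resp.raise_for_status()

set_host_maintenance("http://ambari.local:8080", ("admin", "admin"),
                     "testcluster", "slave-1.local", on=False)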

Greg

From: Sean Roberts mailto:srobe...@hortonworks.com>>
Date: Monday, March 2, 2015 at 1:34 PM
To: Greg mailto:greg.h...@rackspace.com>>, 
"user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: decommission multiple nodes issue

Greg - Same here on submitting JSON. Although they are JSON documents you have 
to submit them as plain form. This is true across all of Ambari. I opened a bug 
for it a month back.


--
Hortonworks - We do Hadoop

Sean Roberts
Partner Solutions Engineer - EMEA
@seano

From: Greg Hill <mailto:greg.h...@rackspace.com>
Date: March 2, 2015 at 19:32:34
To: Sean Roberts ><mailto:srobe...@hortonworks.com>, 
user@ambari.apache.org<mailto:user@ambari.apache.org>><mailto:user@ambari.apache.org>
Subject:  Re: decommission multiple nodes issue

That causes a server error.  I’ve yet to see any part of the API that accepts 
JSON arrays like that as input; it’s almost always, if not always, a 
comma-separated string like I posted.  Many methods even return double-encoded 
JSON values (i.e. “key”: “[\”value1\”,\”value2\”]").  It’s kind of annoying and 
inconsistent, honestly, and not documented anywhere.  You just have to have 
your client code choke on it and then go add another data[key] = 
json.loads(data[key]) in the client to account for it.
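As an illustration, the shim I keep adding looks roughly like this (a sketch only; 'ha_hosts' is a made-up key used purely for the example):

import json

def undouble(data, key):
    # Some endpoints return values that are JSON-encoded twice,
    # e.g. "key": "[\"value1\",\"value2\"]".  Decode the inner string
    # if it parses as JSON, otherwise leave it alone.
    value = data.get(key)
    if isinstance(value, basestring):  # Python 2, matching the client's era
        try:
            data[key] = json.loads(value)
        except ValueError:
            pass
    return data

# hypothetical usage against a response dict
response = {"ha_hosts": "[\"master-1.local\", \"master-2.local\"]"}
undouble(response, "ha_hosts")
print(response["ha_hosts"])  # now a real list instead of a string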

I am starting to think it’s because I set the nodes into maintenance mode 
first, as doing the decommission command manually from the client works fine 
when the nodes aren’t in maintenance mode.  I’ll keep digging, I guess, but it 
is weird that the exact same command worked this time (the commandArgs are 
identical to the one that did nothing).

Greg

From: Sean Roberts mailto:srobe...@hortonworks.com>>
Date: Monday, March 2, 2015 at 1:22 PM
To: Greg mailto:greg.h...@rackspace.com>>, 
"user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: decommission multiple nodes issue


Racker Greg - I’m not familiar with the decommissioning API, but if it’s 
consistent with the rest of Ambari, you’ll need to change from this:

"excluded_hosts": “slave-1.local,slave-2.local"

To this:

"excluded_hosts" : [ "slave-1.local","slave-2.local" ]


--
Hortonworks - We do Hadoop

Sean Roberts
Partner Solutions Engineer - EMEA
@seano

From: Greg Hill <mailto:greg.h...@rackspace.com>
Reply: 
user@ambari.apache.org<mailto:user@ambari.apache.org>><mailto:user@ambari.apache.org>
Date: March 2, 2015 at 19:08:13
To: 
user@ambari.apache.org<mailto:user@ambari.apache.org>><mailto:user@ambari.apache.org>
Subject:  decommission multiple nodes issue

I have some code for decommissioning datanodes prior to rem

Re: decommission multiple nodes issue

2015-03-03 Thread Greg Hill
I have verified that if maintenance mode is set on a host, then it is ignored 
by the decommission process, but only if you try to decommission multiple hosts 
at the same time.  I'll open a bug.

Greg

From: Sean Roberts mailto:srobe...@hortonworks.com>>
Date: Monday, March 2, 2015 at 1:34 PM
To: Greg mailto:greg.h...@rackspace.com>>, 
"user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: decommission multiple nodes issue

Greg - Same here on submitting JSON. Although they are JSON documents you have 
to submit them as plain form. This is true across all of Ambari. I opened a bug 
for it a month back.


--
Hortonworks - We do Hadoop

Sean Roberts
Partner Solutions Engineer - EMEA
@seano

From: Greg Hill <mailto:greg.h...@rackspace.com>
Date: March 2, 2015 at 19:32:34
To: Sean Roberts ><mailto:srobe...@hortonworks.com>, 
user@ambari.apache.org<mailto:user@ambari.apache.org>><mailto:user@ambari.apache.org>
Subject:  Re: decommission multiple nodes issue

That causes a server error.  I’ve yet to see any part of the API that accepts 
JSON arrays like that as input; it’s almost always, if not always, a 
comma-separated string like I posted.  Many methods even return double-encoded 
JSON values (i.e. “key”: “[\”value1\”,\”value2\”]").  It’s kind of annoying and 
inconsistent, honestly, and not documented anywhere.  You just have to have 
your client code choke on it and then go add another data[key] = 
json.loads(data[key]) in the client to account for it.

I am starting to think it’s because I set the nodes into maintenance mode 
first, as doing the decommission command manually from the client works fine 
when the nodes aren’t in maintenance mode.  I’ll keep digging, I guess, but it 
is weird that the exact same command worked this time (the commandArgs are 
identical to the one that did nothing).

Greg

From: Sean Roberts mailto:srobe...@hortonworks.com>>
Date: Monday, March 2, 2015 at 1:22 PM
To: Greg mailto:greg.h...@rackspace.com>>, 
"user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: decommission multiple nodes issue


Racker Greg - I’m not familiar with the decommissioning API, but if it’s 
consistent with the rest of Ambari, you’ll need to change from this:

"excluded_hosts": “slave-1.local,slave-2.local"

To this:

"excluded_hosts" : [ "slave-1.local","slave-2.local" ]


--
Hortonworks - We do Hadoop

Sean Roberts
Partner Solutions Engineer - EMEA
@seano

From: Greg Hill <mailto:greg.h...@rackspace.com>
Reply: 
user@ambari.apache.org<mailto:user@ambari.apache.org>><mailto:user@ambari.apache.org>
Date: March 2, 2015 at 19:08:13
To: 
user@ambari.apache.org<mailto:user@ambari.apache.org>><mailto:user@ambari.apache.org>
Subject:  decommission multiple nodes issue

I have some code for decommissioning datanodes prior to removal.  It seems to 
work fine with a single node, but with multiple nodes it fails.  When passing 
multiple hosts, I am putting the names in a comma-separated string, as seems to 
be the custom with other Ambari API commands.  I attempted to send it as a JSON 
array, but the server complained about that.  Let me know if that is the wrong 
format.  The decommission request completes successfully, it just never writes 
the excludes file so no nodes are decommissioned.

This fails for multiple nodes:

"RequestInfo": {
"command": "DECOMMISSION",
"context": "Decommission DataNode”),
"parameters": {"slave_type": “DATANODE", "excluded_hosts": 
“slave-1.local,slave-2.local"},
"operation_level": {
“level”: “CLUSTER”,
“cluster_name”: cluster_name
},
},
"Requests/resource_filters": [{
"service_name": “HDFS",
"component_name": “NAMENODE",
}],

But this works for a single node:

"RequestInfo": {
"command": "DECOMMISSION",
"context": "Decommission DataNode”),
"parameters": {"slave_type": “DATANODE", "excluded_hosts": 
“slave-1.local"},
"operation_level": {
“level”: “HOST_COMPONENT”,
“cluster_name”: cluster_name,
“host_name”: “slave-1.local”,
“service_name”: “HDFS”
},
},
"Requests/resource_filters": [{
"service_name": “HDFS",
"component_name": “NAMENODE",
}],

Looking on the actual node, it’s obvious that the file isn’t being written by 
the command output:

(multiple hosts, notice there is n

Re: decommission multiple nodes issue

2015-03-02 Thread Greg Hill
That causes a server error.  I’ve yet to see any part of the API that accepts 
JSON arrays like that as input; it’s almost always, if not always, a 
comma-separated string like I posted.  Many methods even return double-encoded 
JSON values (i.e. “key”: “[\”value1\”,\”value2\”]").  It’s kind of annoying and 
inconsistent, honestly, and not documented anywhere.  You just have to have 
your client code choke on it and then go add another data[key] = 
json.loads(data[key]) in the client to account for it.

I am starting to think it’s because I set the nodes into maintenance mode 
first, as doing the decommission command manually from the client works fine 
when the nodes aren’t in maintenance mode.  I’ll keep digging, I guess, but it 
is weird that the exact same command worked this time (the commandArgs are 
identical to the one that did nothing).

Greg

From: Sean Roberts mailto:srobe...@hortonworks.com>>
Date: Monday, March 2, 2015 at 1:22 PM
To: Greg mailto:greg.h...@rackspace.com>>, 
"user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: decommission multiple nodes issue


Racker Greg - I’m not familiar with the decommissioning API, but if it’s 
consistent with the rest of Ambari, you’ll need to change from this:

"excluded_hosts": “slave-1.local,slave-2.local"

To this:

"excluded_hosts" : [ "slave-1.local","slave-2.local" ]


--
Hortonworks - We do Hadoop

Sean Roberts
Partner Solutions Engineer - EMEA
@seano

From: Greg Hill <mailto:greg.h...@rackspace.com>
Reply: user@ambari.apache.org<mailto:user@ambari.apache.org> 
><mailto:user@ambari.apache.org>
Date: March 2, 2015 at 19:08:13
To: user@ambari.apache.org<mailto:user@ambari.apache.org> 
><mailto:user@ambari.apache.org>
Subject:  decommission multiple nodes issue

I have some code for decommissioning datanodes prior to removal.  It seems to 
work fine with a single node, but with multiple nodes it fails.  When passing 
multiple hosts, I am putting the names in a comma-separated string, as seems to 
be the custom with other Ambari API commands.  I attempted to send it as a JSON 
array, but the server complained about that.  Let me know if that is the wrong 
format.  The decommission request completes successfully, it just never writes 
the excludes file so no nodes are decommissioned.

This fails for multiple nodes:

"RequestInfo": {
"command": "DECOMMISSION",
"context": "Decommission DataNode”),
"parameters": {"slave_type": “DATANODE", "excluded_hosts": 
“slave-1.local,slave-2.local"},
"operation_level": {
“level”: “CLUSTER”,
“cluster_name”: cluster_name
},
},
"Requests/resource_filters": [{
"service_name": “HDFS",
"component_name": “NAMENODE",
}],

But this works for a single node:

"RequestInfo": {
"command": "DECOMMISSION",
"context": "Decommission DataNode”),
"parameters": {"slave_type": “DATANODE", "excluded_hosts": 
“slave-1.local"},
"operation_level": {
“level”: “HOST_COMPONENT”,
“cluster_name”: cluster_name,
“host_name”: “slave-1.local”,
“service_name”: “HDFS”
},
},
"Requests/resource_filters": [{
"service_name": “HDFS",
"component_name": “NAMENODE",
}],

Looking on the actual node, it’s obvious that the file isn’t being written by 
the command output:

(multiple hosts, notice there is no ‘Writing File’ line)
File['/etc/hadoop/conf/dfs.exclude'] {'owner': 'hdfs', 'content': 
Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
Execute[''] {'user': 'hdfs'}
ExecuteHadoop['dfsadmin -refreshNodes'] {'bin_dir': 
'/usr/hdp/current/hadoop-client/bin', 'conf_dir': '/etc/hadoop/conf', 
'kinit_override': True, 'user': 'hdfs'}
Execute['hadoop --config /etc/hadoop/conf dfsadmin -refreshNodes'] 
{'logoutput': False, 'path': ['/usr/hdp/current/hadoop-client/bin'], 'tries': 
1, 'user': 'hdfs', 'try_sleep': 0}

(single host, it writes the exclude file)
File['/etc/hadoop/conf/dfs.exclude'] {'owner': 'hdfs', 'content': 
Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
Writing File['/etc/hadoop/conf/dfs.exclude'] because content

decommission multiple nodes issue

2015-03-02 Thread Greg Hill
I have some code for decommissioning datanodes prior to removal.  It seems to 
work fine with a single node, but with multiple nodes it fails.  When passing 
multiple hosts, I am putting the names in a comma-separated string, as seems to 
be the custom with other Ambari API commands.  I attempted to send it as a JSON 
array, but the server complained about that.  Let me know if that is the wrong 
format.  The decommission request completes successfully, it just never writes 
the excludes file so no nodes are decommissioned.

This fails for multiple nodes:

"RequestInfo": {
"command": "DECOMMISSION",
"context": "Decommission DataNode"),
"parameters": {"slave_type": "DATANODE", "excluded_hosts": 
"slave-1.local,slave-2.local"},
"operation_level": {
"level": "CLUSTER",
"cluster_name": cluster_name
},
},
"Requests/resource_filters": [{
"service_name": "HDFS",
"component_name": "NAMENODE",
}],

But this works for a single node:

"RequestInfo": {
"command": "DECOMMISSION",
"context": "Decommission DataNode"),
"parameters": {"slave_type": "DATANODE", "excluded_hosts": 
"slave-1.local"},
"operation_level": {
"level": "HOST_COMPONENT",
"cluster_name": cluster_name,
"host_name": "slave-1.local",
"service_name": "HDFS"
},
},
"Requests/resource_filters": [{
"service_name": "HDFS",
"component_name": "NAMENODE",
}],

Looking on the actual node, it's obvious that the file isn't being written by 
the command output:

(multiple hosts, notice there is no 'Writing File' line)
File['/etc/hadoop/conf/dfs.exclude'] {'owner': 'hdfs', 'content': 
Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
Execute[''] {'user': 'hdfs'}
ExecuteHadoop['dfsadmin -refreshNodes'] {'bin_dir': 
'/usr/hdp/current/hadoop-client/bin', 'conf_dir': '/etc/hadoop/conf', 
'kinit_override': True, 'user': 'hdfs'}
Execute['hadoop --config /etc/hadoop/conf dfsadmin -refreshNodes'] 
{'logoutput': False, 'path': ['/usr/hdp/current/hadoop-client/bin'], 'tries': 
1, 'user': 'hdfs', 'try_sleep': 0}

(single host, it writes the exclude file)
File['/etc/hadoop/conf/dfs.exclude'] {'owner': 'hdfs', 'content': 
Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
Writing File['/etc/hadoop/conf/dfs.exclude'] because contents don't match
Execute[''] {'user': 'hdfs'}
ExecuteHadoop['dfsadmin -refreshNodes'] {'bin_dir': 
'/usr/hdp/current/hadoop-client/bin', 'conf_dir': '/etc/hadoop/conf', 
'kinit_override': True, 'user': 'hdfs'}
Execute['hadoop --config /etc/hadoop/conf dfsadmin -refreshNodes'] 
{'logoutput': False, 'path': ['/usr/hdp/current/hadoop-client/bin'], 'tries': 
1, 'user': 'hdfs', 'try_sleep': 0}

The only notable difference in the command.json is the 
commandParams/excluded_hosts param, so it's not like the request is passing the 
information along incorrectly.  I'm going to play around with the format I use 
to pass it in and take some wild guesses like it's expecting double-encoded 
JSON as I've seen that in other places, but if someone knows the answer offhand 
and can help out, that would be appreciated.  If it turns out to be a bug in 
Ambari, I'll open a JIRA and rewrite our code to issue the decommission call 
independently for each host.
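For reference, a minimal sketch of that per-host fallback using requests directly (the endpoint, credentials, and hostnames are example assumptions, not our real setup):

import requests

AMBARI = "http://ambari.local:8080/api/v1"
AUTH = ("admin", "admin")
HEADERS = {"X-Requested-By": "ambari"}

def decommission_datanode(cluster_name, host_name):
    # One DECOMMISSION request per host instead of a comma-separated list.
    body = {
        "RequestInfo": {
            "command": "DECOMMISSION",
            "context": "Decommission DataNode %s" % host_name,
            "parameters": {"slave_type": "DATANODE",
                           "excluded_hosts": host_name},
            "operation_level": {
                "level": "HOST_COMPONENT",
                "cluster_name": cluster_name,
                "host_name": host_name,
                "service_name": "HDFS",
            },
        },
        "Requests/resource_filters": [{
            "service_name": "HDFS",
            "component_name": "NAMENODE",
        }],
    }
    url = "%s/clusters/%s/requests" % (AMBARI, cluster_name)
    resp = requests.post(url, json=body, auth=AUTH, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()

for host in ("slave-1.local", "slave-2.local"):
    decommission_datanode("testcluster", host)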

Greg


Re: COMMERCIAL:RE: Server Restarts

2015-02-19 Thread Greg Hill
That won’t make the agent auto-start components on restart.  Afaik, you have to 
do that manually.
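A rough sketch of scripting that "manual" start after a reboot through the API (body shape and example values are from memory; verify before depending on it):

import requests

def start_all_services(base_url, cluster_name, auth):
    # Ask Ambari to move every service in the cluster to the STARTED state.
    body = {
        "RequestInfo": {"context": "Start All Services"},
        "Body": {"ServiceInfo": {"state": "STARTED"}},
    }
    url = "%s/api/v1/clusters/%s/services" % (base_url, cluster_name)
    resp = requests.put(url, json=body, auth=auth,
                        headers={"X-Requested-By": "ambari"})
    resp.raise_for_status()

start_all_services("http://ambari.local:8080", "testcluster", ("admin", "admin"))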

Greg

From: johny casanova mailto:pcgamer2...@outlook.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Thursday, February 19, 2015 at 7:45 AM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: COMMERCIAL:RE: Server Restarts

chkconfig "service" on


Date: Thu, 19 Feb 2015 06:41:23 -0600
Subject: Server Restarts
From: daniel.j.cies...@gmail.com
To: user@ambari.apache.org

How does one ensure that when Ambari clients are rebooted that the services 
that Ambari manages are started automatically?

Thanks
Dan


Ambari 1.7 installing some packages I wouldn't expect

2015-02-09 Thread Greg Hill
I'm testing out using 1.7 to install HDP 2.2 and I noticed some weird things 
being installed while I was auditing the cluster that was built.

HCAT is installed on the ambari node, but we only install NAGIOS_SERVER and 
GANGLIA_SERVER there.  Why is that installed on that node?  It in turn requires 
a bunch of other dependencies that we otherwise wouldn't need on that server, 
some of which take quite a while to install.

GANGLIA_MONITOR installs httpd.  That can't be right, can it?  I would think 
only GANGLIA_SERVER would need that.  Is that a mistake or a legitimate 
dependency?

Anyway, if there are legit reasons, then that's cool.  If not, I'll file a bug 
to get this resolved.

Greg


Ambari API questions

2015-02-08 Thread Greg Hill
1. Is there a way via the API to force it to update the DecomHosts field with 
fresh data?   There's a slight delay after the decommission process finishes 
before it is returned in the DecomHosts field of the NAMENODE, which is 
creating a race condition in my automation (sometimes it doesn't see the 
decommissioning hosts and just goes ahead and removes the DATANODE before it 
has finished re-replicating blocks).
2. Where in the API does the UI detect that components have stale configs and 
need to be restarted? I haven't been able to find that yet.

Thanks in advance.

Greg


ambari-server postgres setup questions

2015-02-06 Thread Greg Hill
I was looking through the ambari-server postgres setup because we're having 
occasional issues with postgresql initdb failing.  That's kind of tangential, 
but I found something that concerns me that I'd like some feedback on.  Afaict, 
it sets up postgres to:

1. Listen for traffic from anywhere
2. Accept connections from anywhere (using a password, at least)

Is there any reason to set up this broad access? I thought that only the 
ambari-server processes used postgres directly, so it should be locked down to 
local connections only.  I can't think of any reason you'd need to allow remote 
access.  The agents should do everything through the agent API.  Let me know if 
there's a legitimate reason for this that I'm unaware of.

Greg



Re: ssl changes recently?

2015-01-09 Thread Greg Hill
So, I've been able to diagnose this further.  The only updated package that
seems likely to be related is Python, which went from 2.7.8 to 2.7.9.  That
release backported some SSL fixes from Python 3, and those fixes apparently
make the cert that Ambari generates no longer considered valid.  I guess I'll
open a JIRA on this.
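The behavior change itself is easy to demonstrate.  The snippet below is only an illustration of what Python 2.7.9 now does by default (PEP 476), not the agent's actual code, and globally disabling verification this way is a blunt workaround:

import ssl
import urllib2

url = "https://ambari.local:8440/ca"

try:
    # Python 2.7.9+ verifies certificates by default, so the self-signed
    # cert that Ambari generates is rejected here.
    urllib2.urlopen(url)
except urllib2.URLError as exc:
    print("verification failed as expected: %s" % exc)

# Opting back into the pre-2.7.9 behavior for a single request:
context = ssl._create_unverified_context()
urllib2.urlopen(url, context=context)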

Greg

On 1/7/15 3:55 PM, "Greg Hill"  wrote:

>More info from the server log:
>
>21:20:41,833  INFO [main] Configuration:411 - Reading password from
>existing file
>21:20:41,866  INFO [main] Configuration:422 - API SSL Authentication is
>turned on.
>21:20:41,866 ERROR [main] Configuration:437 - There is no keystore for
>https UI connection.
>21:20:41,866 ERROR [main] Configuration:438 - Run "ambari-server
>setup-https" or set api.ssl = false.
>21:20:41,877 ERROR [main] ViewRegistry:249 - Caught exception extracting
>view archive /var/lib/ambari-server/resources/views/slider-0
>.0.1-SNAPSHOT.jar.
>com.google.inject.ProvisionException: Guice provision errors:
>
>1) Error injecting constructor, java.lang.RuntimeException: Error reading
>certificate password from file /var/lib/ambari-server/keys/
>https.pass.txt
>  at 
>org.apache.ambari.server.configuration.Configuration.(Configuration.
>j
>ava:330)
>  at 
>org.apache.ambari.server.configuration.Configuration.class(Configuration.j
>a
>va:321)
>  while locating org.apache.ambari.server.configuration.Configuration
>
>1 error
>at 
>com.google.inject.internal.InjectorImpl$4.get(InjectorImpl.java:987)
>at 
>com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1013
>)
>at 
>org.apache.ambari.server.view.ViewRegistry.main(ViewRegistry.java:240)
>Caused by: java.lang.RuntimeException: Error reading certificate password
>from file /var/lib/ambari-server/keys/https.pass.txt
>at 
>org.apache.ambari.server.configuration.Configuration.(Configuration.
>j
>ava:439)
>at 
>org.apache.ambari.server.configuration.Configuration.(Configuration.
>j
>ava:330)
>at 
>org.apache.ambari.server.configuration.Configuration$$FastClassByGuice$$3b
>5
>88b69.newInstance()
>at 
>com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(Fast
>C
>onstructor.java:40)
>at 
>com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(D
>e
>faultConstructionProxyFactory.java:60)
>at 
>com.google.inject.internal.ConstructorInjector.construct(ConstructorInject
>o
>r.java:85)
>at 
>com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorB
>i
>ndingImpl.java:254)
>at 
>com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(Provide
>r
>ToInternalFactoryAdapter.java:46)
>at 
>com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:10
>3
>1)
>at 
>com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderTo
>I
>nternalFactoryAdapter.java:40)
>at com.google.inject.Scopes$1$1.get(Scopes.java:65)
>at 
>com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFa
>c
>toryToProviderAdapter.java:40)
>at 
>com.google.inject.internal.InjectorImpl$4$1.call(InjectorImpl.java:978)
>at 
>com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:10
>2
>4)
>at 
>com.google.inject.internal.InjectorImpl$4.get(InjectorImpl.java:974)
>... 2 more
>21:20:44,055  INFO [main] Configuration:411 - Reading password from
>existing file
>21:20:44,107  INFO [main] Configuration:422 - API SSL Authentication is
>turned on.
>21:20:44,107 ERROR [main] Configuration:437 - There is no keystore for
>https UI connection.
>21:20:44,107 ERROR [main] Configuration:438 - Run "ambari-server
>setup-https" or set api.ssl = false.
>
>ambari-server setup-https is not a valid command.  Also, it appears to
>eventually recover, as seen below:
>
>21:20:54,895  INFO [main] Configuration:411 - Reading password from
>existing file
>21:20:54,915  INFO [main] Configuration:422 - API SSL Authentication is
>turned on.
>21:20:54,915 ERROR [main] Configuration:437 - There is no keystore for
>https UI connection.
>21:20:54,915 ERROR [main] Configuration:438 - Run "ambari-server
>setup-https" or set api.ssl = false.
>21:21:23,930  INFO [main] Configuration:411 - Reading password from
>existing file
>21:21:23,950  INFO [main] Configuration:422 - API SSL Authentication is
>turned on.
>21:21:23,950  INFO [main] Configuration:427 - Reading password from
>existing file
>...
>
>21:21:35,586  INFO [main] CertificateManager:69 - Initia

Re: ssl changes recently?

2015-01-07 Thread Greg Hill
batch -infiles
/var/lib/ambari-server/keys/ca.csr was finished with exit code: 0 - the
operation was completely successfully.
21:21:36,728  INFO [main] ShellCommandUtil:44 - Command openssl pkcs12
-export -in /var/lib/ambari-server/keys/ca.crt -inkey
/var/lib/ambari-server/keys/ca.key -certfile
/var/lib/ambari-server/keys/ca.crt -out
/var/lib/ambari-server/keys/keystore.p12 -password pass: -passin
pass:
 was finished with exit code: 0 - the operation was completely
successfully.
21:21:37,048  INFO [main] Configuration:487 - Credential provider creation
failed. Reason: Master key initialization failed.


So, it manages to create all the key/cert/ca stuff, but then fails.

Any pointers are appreciated, but I'll keep digging tomorrow.

Greg


On 1/7/15 3:01 PM, "Greg Hill"  wrote:

>During agent registration.  They all fail to register because the ssl cert
>validation fails and it can't connect to the ambari server.
>
>I should note that we *are not* using bootstrapping.  We preinstall the
>agents manually.  Nothing has changed since it was working other than
>updating to the latest CentOS and Ambari updates (still Ambari 1.7.0,
>though, we're not using trunk or anything).
>
>Greg
>
>On 1/7/15 2:54 PM, "Erin Boyd"  wrote:
>
>>When do you get this error? During registration or some other time?
>>
>>Erin
>>
>>- Original Message -
>>From: "Greg Hill" 
>>To: "Erin Boyd" , user@ambari.apache.org
>>Sent: Wednesday, January 7, 2015 1:52:03 PM
>>Subject: Re: ssl changes recently?
>>
>>[root@ambari ~]# rpm -qa | grep openssl
>>openssl-1.0.1e-30.el6_6.4.x86_64
>>
>>
>>We apparently have an even newer version.  Perhaps they broke something
>>else more recently?  We just spun up this image yesterday with the latest
>>CentOS 6.5 stuff.
>>
>>Greg
>>
>>On 1/7/15 2:48 PM, "Erin Boyd"  wrote:
>>
>>>Hey Greg,
>>>On RHEL 6.5 we got a similar error during agent registration.
>>>Here is the workaround:
>>>http://hortonworks.com/community/forums/topic/ambari-agent-registration-failure-on-rhel-6-5-due-to-openssl-2/
>>>
>>>Hope that helps,
>>>Erin
>>>
>>>
>>>- Original Message -
>>>From: "Greg Hill" 
>>>To: user@ambari.apache.org
>>>Sent: Wednesday, January 7, 2015 1:44:40 PM
>>>Subject: ssl changes recently?
>>>
>>>I sent this to the wrong list earlier.
>>>
>>>I recently updated our Ambari 1.7.0 image and am now getting SSL errors
>>>from the agents:
>>>
>>>INFO 2015-01-07 16:59:02,116 NetUtil.py:48 - Connecting to
>>>https://ambari.local:8440/ca
>>>ERROR 2015-01-07 16:59:02,645 NetUtil.py:66 - [SSL:
>>>CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)
>>>ERROR 2015-01-07 16:59:02,646 NetUtil.py:67 - SSLError: Failed to
>>>connect. Please check openssl library versions.
>>>Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more
>>>details.
>>>WARNING 2015-01-07 16:59:02,651 NetUtil.py:92 - Server at
>>>https://ambari.local:8440<https://ambari.local:8440/> is not reachable,
>>sleeping for 10 seconds…
>>>
>>>We're just using the default SSL certs that Ambari creates for agent
>>communication.  This worked up until we made this new image, which pulls
>>>in upstream CentOS system updates.
>>>
>>>Is it possible that some change in upstream has broken this for Ambari?
>>>Is there a workaround?
>>>
>>>I have noticed that the "server_crt" (/var/lib/ambari-agent/keys/ca.crt)
>>>does not exist on the hosts.  Is this something I'm supposed to inject?
>>>We weren't before, but it was working just fine without it.
>>>
>>>Greg
>>>
>>
>



Re: ssl changes recently?

2015-01-07 Thread Greg Hill
During agent registration.  They all fail to register because the ssl cert
validation fails and it can't connect to the ambari server.

I should note that we *are not* using bootstrapping.  We preinstall the
agents manually.  Nothing has changed since it was working other than
updating to the latest CentOS and Ambari updates (still Ambari 1.7.0,
though, we're not using trunk or anything).

Greg

On 1/7/15 2:54 PM, "Erin Boyd"  wrote:

>When do you get this error? During registration or some other time?
>
>Erin
>
>----- Original Message -
>From: "Greg Hill" 
>To: "Erin Boyd" , user@ambari.apache.org
>Sent: Wednesday, January 7, 2015 1:52:03 PM
>Subject: Re: ssl changes recently?
>
>[root@ambari ~]# rpm -qa | grep openssl
>openssl-1.0.1e-30.el6_6.4.x86_64
>
>
>We apparently have an even newer version.  Perhaps they broke something
>else more recently?  We just spun up this image yesterday with the latest
>CentOS 6.5 stuff.
>
>Greg
>
>On 1/7/15 2:48 PM, "Erin Boyd"  wrote:
>
>>Hey Greg,
>>On RHEL 6.5 we got a similar error during agent registration.
>>Here is the workaround:
>>http://hortonworks.com/community/forums/topic/ambari-agent-registration-failure-on-rhel-6-5-due-to-openssl-2/
>>
>>Hope that helps,
>>Erin
>>
>>
>>- Original Message -
>>From: "Greg Hill" 
>>To: user@ambari.apache.org
>>Sent: Wednesday, January 7, 2015 1:44:40 PM
>>Subject: ssl changes recently?
>>
>>I sent this to the wrong list earlier.
>>
>>I recently updated our Ambari 1.7.0 image and am now getting SSL errors
>>from the agents:
>>
>>INFO 2015-01-07 16:59:02,116 NetUtil.py:48 - Connecting to
>>https://ambari.local:8440/ca
>>ERROR 2015-01-07 16:59:02,645 NetUtil.py:66 - [SSL:
>>CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)
>>ERROR 2015-01-07 16:59:02,646 NetUtil.py:67 - SSLError: Failed to
>>connect. Please check openssl library versions.
>>Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more
>>details.
>>WARNING 2015-01-07 16:59:02,651 NetUtil.py:92 - Server at
>>https://ambari.local:8440<https://ambari.local:8440/> is not reachable,
>>sleeping for 10 seconds…
>>
>>We're just using the default SSL certs that Ambari creates for agent
>>communication.  This worked up until we made this new image, which pulls
>>in upstream CentOS system updates.
>>
>>Is it possible that some change in upstream has broken this for Ambari?
>>Is there a workaround?
>>
>>I have noticed that the "server_crt" (/var/lib/ambari-agent/keys/ca.crt)
>>does not exist on the hosts.  Is this something I'm supposed to inject?
>>We weren't before, but it was working just fine without it.
>>
>>Greg
>>
>



Re: ssl changes recently?

2015-01-07 Thread Greg Hill
[root@ambari ~]# rpm -qa | grep openssl
openssl-1.0.1e-30.el6_6.4.x86_64


We apparently have an even newer version.  Perhaps they broke something
else more recently?  We just spun up this image yesterday with the latest
CentOS 6.5 stuff.

Greg

On 1/7/15 2:48 PM, "Erin Boyd"  wrote:

>Hey Greg,
>On RHEL 6.5 we got a similar error during agent registration.
>Here is the workaround:
>http://hortonworks.com/community/forums/topic/ambari-agent-registration-failure-on-rhel-6-5-due-to-openssl-2/
>
>Hope that helps,
>Erin
>
>
>----- Original Message -
>From: "Greg Hill" 
>To: user@ambari.apache.org
>Sent: Wednesday, January 7, 2015 1:44:40 PM
>Subject: ssl changes recently?
>
>I sent this to the wrong list earlier.
>
>I recently updated our Ambari 1.7.0 image and am now getting SSL errors
>from the agents:
>
>INFO 2015-01-07 16:59:02,116 NetUtil.py:48 - Connecting to
>https://ambari.local:8440/ca
>ERROR 2015-01-07 16:59:02,645 NetUtil.py:66 - [SSL:
>CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)
>ERROR 2015-01-07 16:59:02,646 NetUtil.py:67 - SSLError: Failed to
>connect. Please check openssl library versions.
>Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more
>details.
>WARNING 2015-01-07 16:59:02,651 NetUtil.py:92 - Server at
>https://ambari.local:8440<https://ambari.local:8440/> is not reachable,
>sleeping for 10 seconds…
>
>We're just using the default SSL certs that Ambari creates for agent
>communication.  This worked up until we made this new image, which pulls
>in upstream CentOS system updates.
>
>Is it possible that some change in upstream has broken this for Ambari?
>Is there a workaround?
>
>I have noticed that the "server_crt" (/var/lib/ambari-agent/keys/ca.crt)
>does not exist on the hosts.  Is this something I'm supposed to inject?
>We weren't before, but it was working just fine without it.
>
>Greg
>



ssl changes recently?

2015-01-07 Thread Greg Hill
I sent this to the wrong list earlier.

I recently updated our Ambari 1.7.0 image and am now getting SSL errors from 
the agents:

INFO 2015-01-07 16:59:02,116 NetUtil.py:48 - Connecting to 
https://ambari.local:8440/ca
ERROR 2015-01-07 16:59:02,645 NetUtil.py:66 - [SSL: CERTIFICATE_VERIFY_FAILED] 
certificate verify failed (_ssl.c:581)
ERROR 2015-01-07 16:59:02,646 NetUtil.py:67 - SSLError: Failed to connect. 
Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
WARNING 2015-01-07 16:59:02,651 NetUtil.py:92 - Server at 
https://ambari.local:8440 is not reachable, 
sleeping for 10 seconds…

We're just using the default SSL certs that Ambari creates for agent 
communication.  This worked up until we made this new image, which pulls in 
upstream CentOS system updates.

Is it possible that some change in upstream has broken this for Ambari?
Is there a workaround?

I have noticed that the "server_crt" (/var/lib/ambari-agent/keys/ca.crt) does 
not exist on the hosts.  Is this something I'm supposed to inject?  We weren't 
before, but it was working just fine without it.

Greg



Re: problem with historyserver on secondary namenode

2014-12-23 Thread Greg Hill
The problem is the namenode is only listening on localhost:

[root@master-1 ~]# netstat -pl --numeric-ports --numeric-hosts | grep 10975
tcp0  0 127.0.0.1:8020  0.0.0.0:*   
LISTEN  10975/java
tcp0  0 127.0.0.1:50070 0.0.0.0:*   
LISTEN  10975/java
udp0  0 0.0.0.0:50091   0.0.0.0:*   
10975/java

Likely this is caused by a misconfiguration on our end.  Sorry for the false 
alarm.

Greg

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Tuesday, December 23, 2014 2:01 PM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: Re: problem with historyserver on secondary namenode

I may have been hasty in my diagnosis.  It doesn't appear to start even after 
hdfs is up and running fine.  I'll dig more and see if I can figure out the 
real culprit here.

Greg

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Tuesday, December 23, 2014 1:51 PM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: problem with historyserver on secondary namenode

Trying to use ambari 1.7.0 to provision an HDP 2.2 cluster.  The layout I'm 
using has the yarn history server on the same host as the secondary namenode 
(the primary namenode is on another host), but it fails to start because it 
tries to interact with hdfs before hdfs is ready.  Here's a gist with the error:

https://gist.github.com/jimbobhickville/a25cef3a2355fc273984

Is this a bug in Ambari?  Is there any way for me to control this behavior via 
configuration or in my stack layout?  I imagine this type of scenario has to 
have come up previously.

Greg



Re: problem with historyserver on secondary namenode

2014-12-23 Thread Greg Hill
I may have been hasty in my diagnosis.  It doesn't appear to start even after 
hdfs is up and running fine.  I'll dig more and see if I can figure out the 
real culprit here.

Greg

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Tuesday, December 23, 2014 1:51 PM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: problem with historyserver on secondary namenode

Trying to use ambari 1.7.0 to provision an HDP 2.2 cluster.  The layout I'm 
using has the yarn history server on the same host as the secondary namenode 
(the primary namenode is on another host), but it fails to start because it 
tries to interact with hdfs before hdfs is ready.  Here's a gist with the error:

https://gist.github.com/jimbobhickville/a25cef3a2355fc273984

Is this a bug in Ambari?  Is there any way for me to control this behavior via 
configuration or in my stack layout?  I imagine this type of scenario has to 
have come up previously.

Greg



problem with historyserver on secondary namenode

2014-12-23 Thread Greg Hill
Trying to use ambari 1.7.0 to provision an HDP 2.2 cluster.  The layout I'm 
using has the yarn history server on the same host as the secondary namenode 
(the primary namenode is on another host), but it fails to start because it 
tries to interact with hdfs before hdfs is ready.  Here's a gist with the error:

https://gist.github.com/jimbobhickville/a25cef3a2355fc273984

Is this a bug in Ambari?  Is there any way for me to control this behavior via 
configuration or in my stack layout?  I imagine this type of scenario has to 
have come up previously.

Greg



question about adding custom components

2014-12-16 Thread Greg Hill
We have a need to add a feature to Ambari to inject iptables rules on all the 
nodes in a cluster to allow traffic only from other nodes in the cluster.  We'd 
need to rewrite those rules any time a node was added or removed from the 
cluster.   I was thinking the best way to handle this would be to add an 
IPTABLES component that we can assign to all of the nodes that would do this 
for us, but I'm not sure if there's an easy way to force the regeneration of 
the rules on cluster resize.
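Roughly what I have in mind for the component's agent script, as a sketch only (based on my reading of how custom service scripts get at clusterHostInfo; the file path and rule format are placeholders, not anything that exists today):

from resource_management import Script, File

class IptablesRules(Script):
    def install(self, env):
        self.configure(env)

    def configure(self, env):
        # clusterHostInfo in the command JSON maps component groups to host lists.
        config = Script.get_config()
        hosts = set()
        for host_list in config.get("clusterHostInfo", {}).values():
            hosts.update(host_list)
        rules = "\n".join("-A INPUT -s %s -j ACCEPT" % h for h in sorted(hosts))
        File("/etc/iptables.d/cluster-allow.rules", content=rules, mode=0644)

    def start(self, env):
        self.configure(env)

    def stop(self, env):
        pass

    def status(self, env):
        pass

if __name__ == "__main__":
    IptablesRules().execute()

The part I still don't see is the resize trigger: something has to re-run configure() on every node whenever hosts are added or removed.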

So, a few questions:

1. Is that something we could contribute to the project?  Not sure if it's 
something that's globally useful or not.  Happy to contribute it if it's 
welcomed.
2. Is the approach of adding a component the best way to handle this or is 
there some other method I'm not considering?  Maybe just a special Request type?

Thanks in advance for any feedback/direction.

Greg


Re: unofficial python client

2014-12-04 Thread Greg Hill
Thanks for the clarification Subin.  I'd be fine with moving the code into the 
Ambari repo if it makes sense to do so, but I think it would be much simpler to 
just use Github and pull requests for the client.  It's not a big project and 
can easily be managed that way.  I also don't want it to depend on maven or 
other Java ecosystem tools in order to test it.  I want to remove barriers to 
people being able to contribute.  People working on the Python client are going 
to be primarily Python developers that need to integrate the Ambari API, not 
Ambari developers, and Python developers generally kind of expect to use Python 
testing tools and to submit a pull-request to contribute.

As for the current Python client, it doesn't seem like anyone works on it.  
It's broken for some basic functionality in recent releases, and I was the only 
one contributing patches to fix those problems afaict.

I never intended for this to be a solo project.  We just needed a client that 
met our needs quickly, and trying to fix the current client was proving 
frustrating.  I asked about rewriting it, but was shot down, so I finally gave 
up and just wrote my own.  I think it's a much better client, so I released it 
to see if others wanted to join efforts.

In the end, it doesn't matter to me all that much if it's the official client, 
but I'd hate for others to put their efforts into rewriting or fixing up the 
official client when it would make more sense to work together, IMO.

So, I guess that leaves us at:

1. Does the client need to live in the Apache repos?  Many open source projects 
leave the clients separate from the server code, and it generally makes sense 
to do things this way so client developers can do what makes sense for their 
language's ecosystem.
2. Is anyone else actually working on the Python client, or wanting to?  Which 
approach would they prefer?

Greg

From: subin mailto:subin.apa...@gmail.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Thursday, December 4, 2014 3:06 AM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: unofficial python client

Hi Yusaku,

I was referring to the Rackspace copyright in the client code, which looks like 
an oversight and has thus been removed from the code.
So it is not an issue anymore.

I have a few reservations with this solo effort by Greg. I have requested Greg to 
include the client in the Apache Ambari project and also embrace the Apache way. 
This can help bring more people into the Ambari project; for example, when we have 
changes in stacks/blueprints/the REST API etc., it would be good for a new 
contributor to have an opportunity to contribute patches to the client code.
I would have really liked the existing client to be made more bug-free, along with 
the shell, but it is for the community as a whole to decide where we want to go 
with the client code: do we throw away old efforts or build on them with Greg's 
ideas/code?  I do not intend to overlook Greg's efforts and will support the 
decision of the community.

Best regards
Subin


On Thu, Dec 4, 2014 at 2:16 AM, Yusaku Sako 
mailto:yus...@hortonworks.com>> wrote:
Thanks Greg.

> I've had some discussion with Subin about making this new client the official 
> one, but he had some reservations about contractual obligations requiring it 
> be bundled with the server (is that true?  That makes no sense to me).

Subin, can you clarify what you meant?

Yusaku

On Wed, Dec 3, 2014 at 8:40 AM, Greg Hill 
mailto:greg.h...@rackspace.com>> wrote:
> I wrote a new Python client and published it to Github.  Thought others
> might be interested.
>
> https://github.com/jimbobhickville/python-ambariclient
>
> I did first attempt to work on the official client, as I'm much more in
> favor of contributing over forking, but I didn't feel like the effort was
> well spent.  It needed a rewrite to a better foundation, and to do so
> required breaking backwards compatibility.
>
> I've had some discussion with Subin about making this new client the
> official one, but he had some reservations about contractual obligations
> requiring it be bundled with the server (is that true?  That makes no sense
> to me).  I'd rather work with the community on it than go solo, so hopefully
> we can resolve things to mutual satisfaction.
>
> In the meantime, if anyone else is interested in contributing to this
> client, please fork and submit a pull-request.  Or just try it out and
> submit bugs via Github.  I'd like to do everything in the open, so if
> there's sufficient interest, we could set up an open discussion to work
> together on improving it.
>
> Greg


unofficial python client

2014-12-03 Thread Greg Hill
I wrote a new Python client and published it to Github.  Thought others might 
be interested.

https://github.com/jimbobhickville/python-ambariclient

I did first attempt to work on the official client, as I'm much more in favor 
of contributing over forking, but I didn't feel like the effort was well spent. 
 It needed a rewrite to a better foundation, and to do so required breaking 
backwards compatibility.

I've had some discussion with Subin about making this new client the official 
one, but he had some reservations about contractual obligations requiring it be 
bundled with the server (is that true?  That makes no sense to me).  I'd rather 
work with the community on it than go solo, so hopefully we can resolve things 
to mutual satisfaction.

In the meantime, if anyone else is interested in contributing to this client, 
please fork and submit a pull-request.  Or just try it out and submit bugs via 
Github.  I'd like to do everything in the open, so if there's sufficient 
interest, we could set up an open discussion to work together on improving it.

Greg


Re: a couple API questions

2014-11-10 Thread Greg Hill
Thanks man.  That works wonderfully.  Not sure how I missed the root services 
section.

Greg

From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, November 10, 2014 3:22 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: a couple API questions

On your Question #1:

/api/v1/services/AMBARI/components/AMBARI_SERVER

You'll see something like this:

"component_version" : "1.7.0",

On Mon, Nov 10, 2014 at 3:15 PM, Greg Hill 
mailto:greg.h...@rackspace.com>> wrote:
1. Is there a way to query the API to see what version of ambari the server is 
running?  This would make auto-negotiation in the client easy, so it can 
automatically account for version differences.  If this doesn't exist, I can 
open a JIRA to have it added.  We have a dev on the team that is planning on 
doing some contributions to the server code soon.
2. Is there any documentation of user privileges?  Like, how do I add 
privileges to a new user via the API?  What privileges are possible to assign 
(maybe this is retrievable via a different URL)?

If there's an easier way to find this information, let me know so I can just 
look there in the future.  I can't seem to find any authoritative source in the 
code for what URLs exist and what parameters they expect, but maybe I just am 
searching for the wrong thing.

Thanks in advance.

Greg




a couple API questions

2014-11-10 Thread Greg Hill
1. Is there a way to query the API to see what version of ambari the server is 
running?  This would make auto-negotiation in the client easy, so it can 
automatically account for version differences.  If this doesn't exist, I can 
open a JIRA to have it added.  We have a dev on the team that is planning on 
doing some contributions to the server code soon.
2. Is there any documentation of user privileges?  Like, how do I add 
privileges to a new user via the API?  What privileges are possible to assign 
(maybe this is retrievable via a different URL)?

If there's an easier way to find this information, let me know so I can just 
look there in the future.  I can't seem to find any authoritative source in the 
code for what URLs exist and what parameters they expect, but maybe I just am 
searching for the wrong thing.

Thanks in advance.

Greg


Re: Stop all components API call no longer seems to work

2014-11-07 Thread Greg Hill
The host is in maintenance mode, but the components are not stopped.  I'm
setting maintenance mode prior to stopping services because otherwise you
get nagios notifications when the components are stopped (or this used to
be the case anyway).

Adding the operation_level made everything work correctly.  It returned a
request, and the components were stopped after the request finished, at
which point I was then able to remove the components from the host (this
is where it was failing previously because the components were not
stopped).

This is my new request body:

"RequestInfo": {
"context": "Stop All Components",
"operation_level": {
"level": "HOST",
"cluster_name": self.cluster_name,
"host_name": self.host_name,
},
},
"Body": {
   "HostRoles": {"state": "INSTALLED"},
}
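Putting that together as a full call (a sketch; the server address and credentials are just example values):

import requests

cluster_name = "testcluster"
host_name = "c6404.ambari.apache.org"
url = ("http://ambari.local:8080/api/v1/clusters/%s/hosts/%s/host_components"
       % (cluster_name, host_name))
body = {
    "RequestInfo": {
        "context": "Stop All Components",
        "operation_level": {
            "level": "HOST",
            "cluster_name": cluster_name,
            "host_name": host_name,
        },
    },
    "Body": {"HostRoles": {"state": "INSTALLED"}},
}
resp = requests.put(url, json=body, auth=("admin", "admin"),
                    headers={"X-Requested-By": "ambari"})
resp.raise_for_status()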

Thanks for the help, although the behavior still confuses me a little.
Why would it be prevented in maintenance mode when that's presumably the
reason maintenance mode exists (to be able to muck about with things
without getting false alarms)?  Maybe I misunderstand what maintenance
mode is for?

Greg


On 11/7/14 9:33 AM, "Yusaku Sako"  wrote:

>Hi Greg,
>
>The API call you mentioned to stop all components on a host still
>works in 1.7.0 (I just verified on my recent 1.7.0 cluster).
>Operation_level is not mandatory and the WARN can be ignored.
>Operation_level drives the behavior of operations when
>services/hosts/host_components are in maintenance mode.
>Unfortunately I don't see any documentation on this.
>I presume you are getting 200 because all components on the specified
>host are already stopped.
>
>Yusaku
>
>On Fri, Nov 7, 2014 at 5:55 AM, Greg Hill  wrote:
>> This used to work in earlier 1.7.0 builds, but doesn't seem to any
>>longer:
>>
>> PUT
>> 
>>/api/v1/clusters/testcluster/hosts/c6404.ambari.apache.org/host_component
>>s
>> {"RequestInfo": {"context": "Stop All Components"}, "Body":
>>{"HostRoles":
>> {"state": "INSTALLED"}}}
>>
>> Seeing this in the server logs:
>> 13:05:42,082  WARN [qtp1842914725-24]
>>AmbariManagementControllerImpl:2149 -
>> Can not determine request operation level. Operation level property
>>should
>> be specified for this request.
>> 13:05:42,082  INFO [qtp1842914725-24]
>>AmbariManagementControllerImpl:2162 -
>> Received a updateHostComponent request, clusterName=testcluster,
>> serviceName=HDFS, componentName=DATANODE,
>>hostname=c6404.ambari.apache.org,
>> request={ clusterName=testcluster, serviceName=HDFS,
>>componentName=DATANODE,
>> hostname=c6404.ambari.apache.org, desiredState=INSTALLED,
>> desiredStackId=null, staleConfig=null, adminState=null}
>> 13:05:42,083  INFO [qtp1842914725-24]
>>AmbariManagementControllerImpl:2162 -
>> Received a updateHostComponent request, clusterName=testcluster,
>> serviceName=GANGLIA, componentName=GANGLIA_MONITOR,
>> hostname=c6404.ambari.apache.org, request={ clusterName=testcluster,
>> serviceName=GANGLIA, componentName=GANGLIA_MONITOR,
>> hostname=c6404.ambari.apache.org, desiredState=INSTALLED,
>> desiredStackId=null, staleConfig=null, adminState=null}
>> 13:05:42,083  INFO [qtp1842914725-24]
>>AmbariManagementControllerImpl:2162 -
>> Received a updateHostComponent request, clusterName=testcluster,
>> serviceName=YARN, componentName=NODEMANAGER,
>> hostname=c6404.ambari.apache.org, request={ clusterName=testcluster,
>> serviceName=YARN, componentName=NODEMANAGER,
>> hostname=c6404.ambari.apache.org, desiredState=INSTALLED,
>> desiredStackId=null, staleConfig=null, adminState=null}
>>
>> But I get an empty response with status 200 and no request was created.
>> Shouldn't that be an error if it can't act on my request?
>>
>> Are there some docs about how to formulate the 'operation level' part
>>of the
>> request?
>>
>> Greg
>>
>



Stop all components API call no longer seems to work

2014-11-07 Thread Greg Hill
This used to work in earlier 1.7.0 builds, but doesn't seem to any longer:

PUT /api/v1/clusters/testcluster/hosts/c6404.ambari.apache.org/host_components
{"RequestInfo": {"context": "Stop All Components"}, "Body": {"HostRoles": 
{"state": "INSTALLED"}}}

Seeing this in the server logs:
13:05:42,082  WARN [qtp1842914725-24] AmbariManagementControllerImpl:2149 - Can 
not determine request operation level. Operation level property should be 
specified for this request.
13:05:42,082  INFO [qtp1842914725-24] AmbariManagementControllerImpl:2162 - 
Received a updateHostComponent request, clusterName=testcluster, 
serviceName=HDFS, componentName=DATANODE, hostname=c6404.ambari.apache.org, 
request={ clusterName=testcluster, serviceName=HDFS, componentName=DATANODE, 
hostname=c6404.ambari.apache.org, desiredState=INSTALLED, desiredStackId=null, 
staleConfig=null, adminState=null}
13:05:42,083  INFO [qtp1842914725-24] AmbariManagementControllerImpl:2162 - 
Received a updateHostComponent request, clusterName=testcluster, 
serviceName=GANGLIA, componentName=GANGLIA_MONITOR, 
hostname=c6404.ambari.apache.org, request={ clusterName=testcluster, 
serviceName=GANGLIA, componentName=GANGLIA_MONITOR, 
hostname=c6404.ambari.apache.org, desiredState=INSTALLED, desiredStackId=null, 
staleConfig=null, adminState=null}
13:05:42,083  INFO [qtp1842914725-24] AmbariManagementControllerImpl:2162 - 
Received a updateHostComponent request, clusterName=testcluster, 
serviceName=YARN, componentName=NODEMANAGER, hostname=c6404.ambari.apache.org, 
request={ clusterName=testcluster, serviceName=YARN, componentName=NODEMANAGER, 
hostname=c6404.ambari.apache.org, desiredState=INSTALLED, desiredStackId=null, 
staleConfig=null, adminState=null}

But I get an empty response with status 200 and no request was created.  
Shouldn't that be an error if it can't act on my request?

Are there some docs about how to formulate the 'operation level' part of the 
request?

Greg



Re: possible bug in the Ambari API

2014-11-03 Thread Greg Hill
Weird, I got the url from that page previously.  I guess it changed since then 
and I didn't notice when I rebuilt my test cluster.  Thanks for the heads up.

Greg

From: Yusaku Sako mailto:yus...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, November 3, 2014 3:50 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: possible bug in the Ambari API

Hi Greg,

The yum repo you referred to is old and no longer maintained (I just installed 
ambari-server off of it and I see the hash is 
trunk:0e959b0ed80fc1a170cc10b1c75050c88a7b2d06.
This is trunk code from Oct 4.)
Please use the URLs shown in the Quick Start Guide: 
https://cwiki.apache.org/confluence/display/AMBARI/Quick+Start+Guide

# to test the 1.7.0 branch build - updated nightly
wget -O /etc/yum.repos.d/ambari.repo 
http://s3.amazonaws.com/dev.hortonworks.com/ambari/centos6/1.x/latest/1.7.0/ambari.repo
OR
#  to test the trunk build - updated multiple times a day
wget -O /etc/yum.repos.d/ambari.repo 
http://s3.amazonaws.com/dev.hortonworks.com/ambari/centos6/1.x/latest/trunk/ambari.repo

Thanks,
Yusaku


On Mon, Nov 3, 2014 at 12:42 PM, Greg Hill 
mailto:greg.h...@rackspace.com>> wrote:
/api/v1/stacks/HDP/versions/2.1/services/HBASE/configurations works fine, just 
like any other GET method on a list of resources.

I did a yum update and ambari-server restart on my ambari node to rule that 
out.  Still get the same issue.  Happens for 
/api/v1/stacks/HDP/versions/2.1/services/HDFS/configurations/content as well.

This is my yum repo:
http://s3.amazonaws.com/dev.hortonworks.com/ambari/centos6/1.x/updates/1.7.0.trunk/

Am I missing some header that fixes things?  All I'm passing in is 
X-Requested-By.

Why does a GET on a single resource return two resources anyway?  That seems 
like it should be subdivided further if that's how it works.

Greg

From: Srimanth Gunturi 
mailto:srima...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, November 3, 2014 3:17 PM

To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: possible bug in the Ambari API

Hi Greg,
I attempted the same API on latest 1.7.0 build, and do not see the issue (the 
comma is present between the two configurations).
Do you see the same when you access 
"/api/v1/stacks2/HDP/versions/2.1/stackServices/HBASE/configurations" or  
"/api/v1/stacks2/HDP/versions/2.1/stackServices/HDFS/configurations/content" ?
Regards,
Srimanth



On Mon, Nov 3, 2014 at 12:12 PM, Greg Hill 
mailto:greg.h...@rackspace.com>> wrote:
Also, I still get the same broken response using 'stacks' instead of 'stacks2'. 
 Is this a bug that was fixed recently?  I'm using a build from last week.

Greg

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, November 3, 2014 3:05 PM

To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: possible bug in the Ambari API

Oh?  I was basing it off the python client using 'stacks2'.  I figured that 
stacks was deprecated, but I suppose I should have asked.  Neither API is 
documented.  Why are there two?

Greg

From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, November 3, 2014 2:54 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: possible bug in the Ambari API

Greg, That's the /stacks2 API. Want to try with /stacks (which I think is the 
preferred API resource)?

http://c6401.ambari.apache.org:8080/api/v1/stacks/HDP/versions/2.1/services/HBASE/configurations/content



[
  {
"href" : 
"http://c6401.ambari.apache.org:8080/api/v1/stacks/HDP/versions/2.1/services/HBASE/configurations/content";,
"StackConfigurations" : {
  "final" : "false",
  "property_description" : "Custom log4j.properties",
  "property_name" : "content",
  "property_type" : [ ],
  "property_value" : "\n# Licensed to the Apache Software Foundation (ASF) 
under one\n# or more contributor license agreements.  See the NOTICE file\n# 
distributed with this work for additional information\n# regarding copyright 
ownership.  The ASF licenses this file\n# to you under the Apache License, 
Version 2.0 (the\n# \&quo

Re: possible bug in the Ambari API

2014-11-03 Thread Greg Hill
/api/v1/stacks/HDP/versions/2.1/services/HBASE/configurations works fine, just 
like any other GET method on a list of resources.

I did a yum update and ambari-server restart on my ambari node to rule that 
out.  Still get the same issue.  Happens for 
/api/v1/stacks/HDP/versions/2.1/services/HDFS/configurations/content as well.

This is my yum repo:
http://s3.amazonaws.com/dev.hortonworks.com/ambari/centos6/1.x/updates/1.7.0.trunk/

Am I missing some header that fixes things?  All I'm passing in in 
X-Requested-By.

Why does a GET on a single resource return two resources anyway?  That seems 
like it should be subdivided further if that's how it works.

Greg

From: Srimanth Gunturi 
mailto:srima...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, November 3, 2014 3:17 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: possible bug in the Ambari API

Hi Greg,
I attempted the same API on latest 1.7.0 build, and do not see the issue (the 
comma is present between the two configurations).
Do you see the same when you access 
"/api/v1/stacks2/HDP/versions/2.1/stackServices/HBASE/configurations" or  
"/api/v1/stacks2/HDP/versions/2.1/stackServices/HDFS/configurations/content" ?
Regards,
Srimanth



On Mon, Nov 3, 2014 at 12:12 PM, Greg Hill 
mailto:greg.h...@rackspace.com>> wrote:
Also, I still get the same broken response using 'stacks' instead of 'stacks2'. 
 Is this a bug that was fixed recently?  I'm using a build from last week.

Greg

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, November 3, 2014 3:05 PM

To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: possible bug in the Ambari API

Oh?  I was basing it off the python client using 'stacks2'.  I figured that 
stacks was deprecated, but I suppose I should have asked.  Neither API is 
documented.  Why are there two?

Greg

From: Jeff Sposetti mailto:j...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, November 3, 2014 2:54 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: possible bug in the Ambari API

Greg, That's the /stacks2 API. Want to try with /stacks (which I think is the 
preferred API resource)?

http://c6401.ambari.apache.org:8080/api/v1/stacks/HDP/versions/2.1/services/HBASE/configurations/content



[
  {
"href" : 
"http://c6401.ambari.apache.org:8080/api/v1/stacks/HDP/versions/2.1/services/HBASE/configurations/content";,
"StackConfigurations" : {
  "final" : "false",
  "property_description" : "Custom log4j.properties",
  "property_name" : "content",
  "property_type" : [ ],
  "property_value" : "\n# Licensed to the Apache Software Foundation (ASF) 
under one\n# or more contributor license agreements.  See the NOTICE file\n# 
distributed with this work for additional information\n# regarding copyright 
ownership.  The ASF licenses this file\n# to you under the Apache License, 
Version 2.0 (the\n# \"License\"); you may not use this file except in 
compliance\n# with the License.  You may obtain a copy of the License at\n#\n#  
   
http://www.apache.org/licenses/LICENSE-2.0\n#\n#
 Unless required by applicable law or agreed to in writing, software\n# 
distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT 
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the 
License for the specific language governing permissions and\n# limitations 
under the License.\n\n\n# Define some default values that can be overridden by 
system 
properties\nhbase.root.logger=INFO,console\nhbase.security.logger=INFO,console\nhbase.log.dir=.\nhbase.log.file=hbase.log\n\n#
 Define the root logger to the system property 
\"hbase.root.logger\".\nlog4j.rootLogger=${hbase.root.logger}\n\n# Logging 
Threshold\nlog4j.threshold=ALL\n\n#\n# Daily Rolling File 
Appender\n#\nlog4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender\nlog4j.appender.DRFA.File=${hbase.log.dir}/${hbase.log.file}\n\n#
 Rollver at midnight\nlog4j.appender.DRFA.DatePattern=.-MM-dd\n\n# 30-day 
backup\n#log4j.appender.DRFA.MaxBackupIndex=30\nlog4j.appender.DRFA.layout=org.apache.log4j.PatternLayout\n\n#
 Pattern format: Date LogLevel Log

Re: possible bug in the Ambari API

2014-11-03 Thread Greg Hill
ze}} 
-Djava.security.auth.login.config={{master_jaas_config_file}}\"\nexport 
HBASE_REGIONSERVER_OPTS=\"$HBASE_REGIONSERVER_OPTS 
-Xmn{{regionserver_xmn_size}} -XX:CMSInitiatingOccupancyFraction=70  
-Xms{{regionserver_heapsize}} -Xmx{{regionserver_heapsize}} 
-Djava.security.auth.login.config={{regionserver_jaas_config_file}}\"\n{% else 
%}\nexport HBASE_OPTS=\"$HBASE_OPTS -XX:+UseConcMarkSweepGC 
-XX:ErrorFile={{log_dir}}/hs_err_pid%p.log\"\nexport 
HBASE_MASTER_OPTS=\"$HBASE_MASTER_OPTS -Xmx{{master_heapsize}}\"\nexport 
HBASE_REGIONSERVER_OPTS=\"$HBASE_REGIONSERVER_OPTS 
-Xmn{{regionserver_xmn_size}} -XX:CMSInitiatingOccupancyFraction=70  
-Xms{{regionserver_heapsize}} -Xmx{{regionserver_heapsize}}\"\n{% endif %}\n
",
  "service_name" : "HBASE",
  "stack_name" : "HDP",
  "stack_version" : "2.1",
  "type" : "hbase-env.xml"
}
  }
]





On Mon, Nov 3, 2014 at 2:45 PM, Greg Hill 
mailto:greg.h...@rackspace.com>> wrote:
The more I look at this, I think it's just two separate dictionaries separated 
by a space.  That's not a valid response at all.  It should be wrapped in list 
structure.  I'll go file a JIRA ticket.

Greg

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, November 3, 2014 12:04 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: possible bug in the Ambari API

On the latest Ambari 1.7.0 build, this API call returns invalid JSON that the 
parser chokes on.  Notice the lack of a comma between the end of the first 
'StackConfigurations' structure and the following one.  There's just "} {" 
instead of "}, {"

GET /api/v1/stacks2/HDP/versions/2.1/stackServices/HBASE/configurations/content

{
  "href" : 
"http://c6401.ambari.apache.org:8080/api/v1/stacks2/HDP/versions/2.1/stackServices/HBASE/configurations/content";,
  "StackConfigurations" : {
"final" : "false",
"property_description" : "Custom log4j.properties",
"property_name" : "content",
"property_type" : [ ],
"property_value" : "\n# Licensed to the Apache Software Foundation (ASF) 
under one\n# or more contributor license agreements.  See the NOTICE file\n# 
distributed with this work for additional information\n# regarding copyright 
ownership.  The ASF licenses this file\n# to you under the Apache License, 
Version 2.0 (the\n# \"License\"); you may not use this file except in 
compliance\n# with the License.  You may obtain a copy of the License at\n#\n#  
   
http://www.apache.org/licenses/LICENSE-2.0\n#\n#
 Unless required by applicable law or agreed to in writing, software\n# 
distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT 
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the 
License for the specific language governing permissions and\n# limitations 
under the License.\n\n\n# Define some default values that can be overridden by 
system 
properties\nhbase.root.logger=INFO,console\nhbase.security.logger=INFO,console\nhbase.log.dir=.\nhbase.log.file=hbase.log\n\n#
 Define the root logger to the system property 
\"hbase.root.logger\".\nlog4j.rootLogger=${hbase.root.logger}\n\n# Logging 
Threshold\nlog4j.threshold=ALL\n\n#\n# Daily Rolling File 
Appender\n#\nlog4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender\nlog4j.appender.DRFA.File=${hbase.log.dir}/${hbase.log.file}\n\n#
 Rollver at midnight\nlog4j.appender.DRFA.DatePattern=.-MM-dd\n\n# 30-day 
backup\n#log4j.appender.DRFA.MaxBackupIndex=30\nlog4j.appender.DRFA.layout=org.apache.log4j.PatternLayout\n\n#
 Pattern format: Date LogLevel LoggerName 
LogMessage\nlog4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p [%t] 
%c{2}: %m%n\n\n# Rolling File Appender 
properties\nhbase.log.maxfilesize=256MB\nhbase.log.maxbackupindex=20\n\n# 
Rolling File 
Appender\nlog4j.appender.RFA=org.apache.log4j.RollingFileAppender\nlog4j.appender.RFA.File=${hbase.log.dir}/${hbase.log.file}\n\nlog4j.appender.RFA.MaxFileSize=${hbase.log.maxfilesize}\nlog4j.appender.RFA.MaxBackupIndex=${hbase.log.maxbackupindex}\n\nlog4j.appender.RFA.layout=org.apache.log4j.PatternLayout\nlog4j.appender.RFA.layout.ConversionPattern=%d{ISO8601}
 %-5p [%t] %c{2}: %m%n\n\n#\n# Security audit 
appender\n#\nhbase.security.log.file=SecurityAuth.audit\nhbase.security.log.maxfilesize=256MB\nhbase.security.log.maxbackupindex=20\nlog4j.appender.RFAS=org.apache.log4j.RollingFileAppender\nlog4j.appender.RFAS.File=${hbase.log.dir}/${hbase.security.log.file}\n

Re: possible bug in the Ambari API

2014-11-03 Thread Greg Hill
.logger.org.apache.hadoop.dfs=DEBUG\n#
 Set this class to log INFO only otherwise its OTT\n# Enable this to get 
detailed connection error/retry logging.\n# 
log4j.logger.org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation=TRACE\n\n\n#
 Uncomment this line to enable tracing on _every_ RPC call (this can be a lot 
of output)\n#log4j.logger.org.apache.hadoop.ipc.HBaseServer.trace=DEBUG\n\n# 
Uncomment the below if you want to remove logging of client region caching'\n# 
and scan of .META. messages\n# 
log4j.logger.org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation=INFO\n#
 log4j.logger.org.apache.hadoop.hbase.client.MetaScanner=INFO\n\n",
  "service_name" : "HBASE",
  "stack_name" : "HDP",
  "stack_version" : "2.1",
  "type" : "hbase-log4j.xml"
}
  },
  {
"href" : 
"http://c6401.ambari.apache.org:8080/api/v1/stacks/HDP/versions/2.1/services/HBASE/configurations/content";,
"StackConfigurations" : {
  "final" : "false",
  "property_description" : "This is the jinja template for hbase-env.sh 
file",
  "property_name" : "content",
  "property_type" : [ ],
  "property_value" : "\n# Set environment variables here.\n\n# The java 
implementation to use. Java 1.6 required.\nexport 
JAVA_HOME={{java64_home}}\n\n# HBase Configuration directory\nexport 
HBASE_CONF_DIR=${HBASE_CONF_DIR:-{{hbase_conf_dir}}}\n\n# Extra Java CLASSPATH 
elements. Optional.\nexport HBASE_CLASSPATH=${HBASE_CLASSPATH}\n\n# The maximum 
amount of heap to use, in MB. Default is 1000.\n# export 
HBASE_HEAPSIZE=1000\n\n# Extra Java runtime options.\n# Below are what we set 
by default. May only work with SUN JVM.\n# For more on why as well as other 
possible settings,\n# see 
http://wiki.apache.org/hadoop/PerformanceTuning\nexport 
SERVER_GC_OPTS=\"-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-Xloggc:{{log_dir}}/gc.log-`date +'%Y%m%d%H%M'`\"\n# Uncomment below to enable 
java garbage collection logging.\n# export HBASE_OPTS=\"$HBASE_OPTS -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-Xloggc:$HBASE_HOME/logs/gc-hbase.log\"\n\n# Uncomment and adjust to enable JMX 
exporting\n# See jmxremote.password and jmxremote.access in 
$JRE_HOME/lib/management to configure remote password access.\n# More details 
at: 
http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html\n#\n# 
export HBASE_JMX_BASE=\"-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false\"\n# If you want to configure 
BucketCache, specify '-XX: MaxDirectMemorySize=' with proper direct memory 
size\n# export HBASE_THRIFT_OPTS=\"$HBASE_JMX_BASE 
-Dcom.sun.management.jmxremote.port=10103\"\n# export 
HBASE_ZOOKEEPER_OPTS=\"$HBASE_JMX_BASE 
-Dcom.sun.management.jmxremote.port=10104\"\n\n# File naming hosts on which 
HRegionServers will run. $HBASE_HOME/conf/regionservers by default.\nexport 
HBASE_REGIONSERVERS=${HBASE_CONF_DIR}/regionservers\n\n# Extra ssh options. 
Empty by default.\n# export HBASE_SSH_OPTS=\"-o ConnectTimeout=1 -o 
SendEnv=HBASE_CONF_DIR\"\n\n# Where log files are stored. $HBASE_HOME/logs by 
default.\nexport HBASE_LOG_DIR={{log_dir}}\n\n# A string representing this 
instance of hbase. $USER by default.\n# export HBASE_IDENT_STRING=$USER\n\n# 
The scheduling priority for daemon processes. See 'man nice'.\n# export 
HBASE_NICENESS=10\n\n# The directory where pid files are stored. /tmp by 
default.\nexport HBASE_PID_DIR={{pid_dir}}\n\n# Seconds to sleep between slave 
commands. Unset by default. This\n# can be useful in large clusters, where, 
e.g., slave rsyncs can\n# otherwise arrive faster than the master can service 
them.\n# export HBASE_SLAVE_SLEEP=0.1\n\n# Tell HBase whether it should manage 
it's own instance of Zookeeper or not.\nexport HBASE_MANAGES_ZK=false\n\n{% if 
security_enabled %}\nexport HBASE_OPTS=\"$HBASE_OPTS -XX:+UseConcMarkSweepGC 
-XX:ErrorFile={{log_dir}}/hs_err_pid%p.log 
-Djava.security.auth.login.config={{client_jaas_config_file}}\"\nexport 
HBASE_MASTER_OPTS=\"$HBASE_MASTER_OPTS -Xmx{{master_heapsize}} 
-Djava.security.auth.login.config={{master_jaas_config_file}}\"\nexport 
HBASE_REGIONSERVER_OPTS=\"$HBASE_REGIONSERVER_OPTS 
-Xmn{{regionserver_xmn_size}} -XX:CMSInitiatingOccupancyFraction=70  
-Xms{{regionserver_heapsize}} -Xmx{{regionserver_heapsize}} 
-Djava.security.auth.login.config={{regionserver_jaas_config_file}}\"\n{% else 
%}\nexport HBASE_OPTS=\"$HBASE_OPTS -XX:+UseConcMarkSweepGC 
-XX:ErrorFile={{log_dir}}/hs_err_pid%p.log\"\nexport 
HBASE_MASTER_OPTS=\"$HBASE_MASTER_OPTS -Xmx{{master_heapsize}}\"\nexport 
HBASE_REGIONSERV

Re: possible bug in the Ambari API

2014-11-03 Thread Greg Hill
The more I look at this, I think it's just two separate dictionaries separated 
by a space.  That's not a valid response at all.  It should be wrapped in list 
structure.  I'll go file a JIRA ticket.

Greg

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Monday, November 3, 2014 12:04 PM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: possible bug in the Ambari API

On the latest Ambari 1.7.0 build, this API call returns invalid JSON that the 
parser chokes on.  Notice the lack of a comma between the end of the first 
'StackConfigurations' structure and the following one.  There's just "} {" 
instead of "}, {"

GET /api/v1/stacks2/HDP/versions/2.1/stackServices/HBASE/configurations/content

{
  "href" : 
"http://c6401.ambari.apache.org:8080/api/v1/stacks2/HDP/versions/2.1/stackServices/HBASE/configurations/content";,
  "StackConfigurations" : {
"final" : "false",
"property_description" : "Custom log4j.properties",
"property_name" : "content",
"property_type" : [ ],
"property_value" : "\n# Licensed to the Apache Software Foundation (ASF) 
under one\n# or more contributor license agreements.  See the NOTICE file\n# 
distributed with this work for additional information\n# regarding copyright 
ownership.  The ASF licenses this file\n# to you under the Apache License, 
Version 2.0 (the\n# \"License\"); you may not use this file except in 
compliance\n# with the License.  You may obtain a copy of the License at\n#\n#  
   http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by 
applicable law or agreed to in writing, software\n# distributed under the 
License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR 
CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the 
specific language governing permissions and\n# limitations under the 
License.\n\n\n# Define some default values that can be overridden by system 
properties\nhbase.root.logger=INFO,console\nhbase.security.logger=INFO,console\nhbase.log.dir=.\nhbase.log.file=hbase.log\n\n#
 Define the root logger to the system property 
\"hbase.root.logger\".\nlog4j.rootLogger=${hbase.root.logger}\n\n# Logging 
Threshold\nlog4j.threshold=ALL\n\n#\n# Daily Rolling File 
Appender\n#\nlog4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender\nlog4j.appender.DRFA.File=${hbase.log.dir}/${hbase.log.file}\n\n#
 Rollver at midnight\nlog4j.appender.DRFA.DatePattern=.-MM-dd\n\n# 30-day 
backup\n#log4j.appender.DRFA.MaxBackupIndex=30\nlog4j.appender.DRFA.layout=org.apache.log4j.PatternLayout\n\n#
 Pattern format: Date LogLevel LoggerName 
LogMessage\nlog4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p [%t] 
%c{2}: %m%n\n\n# Rolling File Appender 
properties\nhbase.log.maxfilesize=256MB\nhbase.log.maxbackupindex=20\n\n# 
Rolling File 
Appender\nlog4j.appender.RFA=org.apache.log4j.RollingFileAppender\nlog4j.appender.RFA.File=${hbase.log.dir}/${hbase.log.file}\n\nlog4j.appender.RFA.MaxFileSize=${hbase.log.maxfilesize}\nlog4j.appender.RFA.MaxBackupIndex=${hbase.log.maxbackupindex}\n\nlog4j.appender.RFA.layout=org.apache.log4j.PatternLayout\nlog4j.appender.RFA.layout.ConversionPattern=%d{ISO8601}
 %-5p [%t] %c{2}: %m%n\n\n#\n# Security audit 
appender\n#\nhbase.security.log.file=SecurityAuth.audit\nhbase.security.log.maxfilesize=256MB\nhbase.security.log.maxbackupindex=20\nlog4j.appender.RFAS=org.apache.log4j.RollingFileAppender\nlog4j.appender.RFAS.File=${hbase.log.dir}/${hbase.security.log.file}\nlog4j.appender.RFAS.MaxFileSize=${hbase.security.log.maxfilesize}\nlog4j.appender.RFAS.MaxBackupIndex=${hbase.security.log.maxbackupindex}\nlog4j.appender.RFAS.layout=org.apache.log4j.PatternLayout\nlog4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601}
 %p %c: 
%m%n\nlog4j.category.SecurityLogger=${hbase.security.logger}\nlog4j.additivity.SecurityLogger=false\n#log4j.logger.SecurityLogger.org.apache.hadoop.hbase.security.access.AccessController=TRACE\n\n#\n#
 Null 
Appender\n#\nlog4j.appender.NullAppender=org.apache.log4j.varia.NullAppender\n\n#\n#
 console\n# Add \"console\" to rootlogger above if you want to use 
this\n#\nlog4j.appender.console=org.apache.log4j.ConsoleAppender\nlog4j.appender.console.target=System.err\nlog4j.appender.console.layout=org.apache.log4j.PatternLayout\nlog4j.appender.console.layout.ConversionPattern=%d{ISO8601}
 %-5p [%t] %c{2}: %m%n\n\n# Custom Logging 
levels\n\nlog4j.logger.org.apache.zookeeper=INFO\n#log4j.logger.org.apache.hadoop.fs.FSNamesystem=DEBUG\nlog4j.logger.org.apache.hadoop.hbase=DEBUG\n#
 Make these two classes INFO-level. Make them DEBUG to see more zk 
debug.\nlog4j.logger.org.apache.hadoop.hbase.zookeeper.ZKUtil=INFO\nlog4j.logger.org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher=INFO\n#log4j.logger.org.apache.hadoop.dfs=DEBUG\n#
 Set this class to log INFO only otherwise its OTT\n# Enable this to get 

possible bug in the Ambari API

2014-11-03 Thread Greg Hill
On the latest Ambari 1.7.0 build, this API call returns invalid JSON that the 
parser chokes on.  Notice the lack of a comma between the end of the first 
'StackConfigurations' structure and the following one.  There's just "} {" 
instead of "}, {"

GET /api/v1/stacks2/HDP/versions/2.1/stackServices/HBASE/configurations/content

{
  "href" : 
"http://c6401.ambari.apache.org:8080/api/v1/stacks2/HDP/versions/2.1/stackServices/HBASE/configurations/content";,
  "StackConfigurations" : {
"final" : "false",
"property_description" : "Custom log4j.properties",
"property_name" : "content",
"property_type" : [ ],
"property_value" : "\n# Licensed to the Apache Software Foundation (ASF) 
under one\n# or more contributor license agreements.  See the NOTICE file\n# 
distributed with this work for additional information\n# regarding copyright 
ownership.  The ASF licenses this file\n# to you under the Apache License, 
Version 2.0 (the\n# \"License\"); you may not use this file except in 
compliance\n# with the License.  You may obtain a copy of the License at\n#\n#  
   http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by 
applicable law or agreed to in writing, software\n# distributed under the 
License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR 
CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the 
specific language governing permissions and\n# limitations under the 
License.\n\n\n# Define some default values that can be overridden by system 
properties\nhbase.root.logger=INFO,console\nhbase.security.logger=INFO,console\nhbase.log.dir=.\nhbase.log.file=hbase.log\n\n#
 Define the root logger to the system property 
\"hbase.root.logger\".\nlog4j.rootLogger=${hbase.root.logger}\n\n# Logging 
Threshold\nlog4j.threshold=ALL\n\n#\n# Daily Rolling File 
Appender\n#\nlog4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender\nlog4j.appender.DRFA.File=${hbase.log.dir}/${hbase.log.file}\n\n#
 Rollver at midnight\nlog4j.appender.DRFA.DatePattern=.-MM-dd\n\n# 30-day 
backup\n#log4j.appender.DRFA.MaxBackupIndex=30\nlog4j.appender.DRFA.layout=org.apache.log4j.PatternLayout\n\n#
 Pattern format: Date LogLevel LoggerName 
LogMessage\nlog4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p [%t] 
%c{2}: %m%n\n\n# Rolling File Appender 
properties\nhbase.log.maxfilesize=256MB\nhbase.log.maxbackupindex=20\n\n# 
Rolling File 
Appender\nlog4j.appender.RFA=org.apache.log4j.RollingFileAppender\nlog4j.appender.RFA.File=${hbase.log.dir}/${hbase.log.file}\n\nlog4j.appender.RFA.MaxFileSize=${hbase.log.maxfilesize}\nlog4j.appender.RFA.MaxBackupIndex=${hbase.log.maxbackupindex}\n\nlog4j.appender.RFA.layout=org.apache.log4j.PatternLayout\nlog4j.appender.RFA.layout.ConversionPattern=%d{ISO8601}
 %-5p [%t] %c{2}: %m%n\n\n#\n# Security audit 
appender\n#\nhbase.security.log.file=SecurityAuth.audit\nhbase.security.log.maxfilesize=256MB\nhbase.security.log.maxbackupindex=20\nlog4j.appender.RFAS=org.apache.log4j.RollingFileAppender\nlog4j.appender.RFAS.File=${hbase.log.dir}/${hbase.security.log.file}\nlog4j.appender.RFAS.MaxFileSize=${hbase.security.log.maxfilesize}\nlog4j.appender.RFAS.MaxBackupIndex=${hbase.security.log.maxbackupindex}\nlog4j.appender.RFAS.layout=org.apache.log4j.PatternLayout\nlog4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601}
 %p %c: 
%m%n\nlog4j.category.SecurityLogger=${hbase.security.logger}\nlog4j.additivity.SecurityLogger=false\n#log4j.logger.SecurityLogger.org.apache.hadoop.hbase.security.access.AccessController=TRACE\n\n#\n#
 Null 
Appender\n#\nlog4j.appender.NullAppender=org.apache.log4j.varia.NullAppender\n\n#\n#
 console\n# Add \"console\" to rootlogger above if you want to use 
this\n#\nlog4j.appender.console=org.apache.log4j.ConsoleAppender\nlog4j.appender.console.target=System.err\nlog4j.appender.console.layout=org.apache.log4j.PatternLayout\nlog4j.appender.console.layout.ConversionPattern=%d{ISO8601}
 %-5p [%t] %c{2}: %m%n\n\n# Custom Logging 
levels\n\nlog4j.logger.org.apache.zookeeper=INFO\n#log4j.logger.org.apache.hadoop.fs.FSNamesystem=DEBUG\nlog4j.logger.org.apache.hadoop.hbase=DEBUG\n#
 Make these two classes INFO-level. Make them DEBUG to see more zk 
debug.\nlog4j.logger.org.apache.hadoop.hbase.zookeeper.ZKUtil=INFO\nlog4j.logger.org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher=INFO\n#log4j.logger.org.apache.hadoop.dfs=DEBUG\n#
 Set this class to log INFO only otherwise its OTT\n# Enable this to get 
detailed connection error/retry logging.\n# 
log4j.logger.org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation=TRACE\n\n\n#
 Uncomment this line to enable tracing on _every_ RPC call (this can be a lot 
of output)\n#log4j.logger.org.apache.hadoop.ipc.HBaseServer.trace=DEBUG\n\n# 
Uncomment the below if you want to remove logging of client region caching'\n# 
and scan of .META. messages\n# 
log4j.logger.org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation=INFO\n#
 log4j.log

Add/update users through ambari API?

2014-09-30 Thread Greg Hill
I can't find anything in the docs about how to create new users or update 
existing users (i.e. Change the password for the default user) via the Ambari 
API.  Is this possible?  If so, what is the URL I should be hitting?

Thanks in advance.

Greg
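
For reference, a rough sketch of what those calls might look like against the v1 REST API, using Python and the requests library. The /api/v1/users endpoint and the exact field names (Users/user_name, Users/password, Users/old_password, Users/admin) are my best guess based on other v1 resources and may differ between releases, so treat this as a starting point rather than a reference:

import json
import requests

AMBARI = 'http://ambari.local:8080/api/v1'           # placeholder server
AUTH = ('admin', 'admin')                             # placeholder credentials
HEADERS = {'X-Requested-By': 'ambari'}

# Create a new user (payload shape assumed, not confirmed against the docs).
requests.post(AMBARI + '/users', auth=AUTH, headers=HEADERS,
              data=json.dumps({'Users/user_name': 'greg',
                               'Users/password': 'changeme',
                               'Users/admin': 'false'}))

# Change the default admin user's password (old_password assumed required).
requests.put(AMBARI + '/users/admin', auth=AUTH, headers=HEADERS,
             data=json.dumps({'Users/password': 'newsecret',
                              'Users/old_password': 'admin'}))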



Re: delete host/component

2014-09-24 Thread Greg Hill
I'm not sure.  The first version I used was 1.6.0 and it existed by then.

Greg

From: Aaron Cody mailto:ac...@hexiscyber.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, September 24, 2014 2:52 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: delete host/component

great! when was that functionality introduced? ( I’m still on 1.2.4)
thanks

From: Greg Hill mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, September 24, 2014 at 12:49 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: delete host/component

Yes, they are.  You have to stop the components on that host, then remove the 
components from the host, then you can remove the host from the cluster.

Greg
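
Roughly, the sequence above looks like this against the v1 REST API (Python with requests; cluster name, host name, and credentials are placeholders, and the stop request is asynchronous, so in practice you would poll the returned request resource before moving to the next step):

import json
import requests

AMBARI = 'http://ambari.local:8080/api/v1'
AUTH = ('admin', 'admin')
HEADERS = {'X-Requested-By': 'ambari'}
HOST = AMBARI + '/clusters/c1/hosts/c6401.ambari.apache.org'

# 1. Stop every component on the host.
requests.put(HOST + '/host_components', auth=AUTH, headers=HEADERS,
             data=json.dumps({'RequestInfo': {'context': 'Stop All Components'},
                              'Body': {'HostRoles': {'state': 'INSTALLED'}}}))

# 2. Once stopped, remove each component from the host.
resp = requests.get(HOST + '/host_components', auth=AUTH, headers=HEADERS)
for item in resp.json()['items']:
    name = item['HostRoles']['component_name']
    requests.delete(HOST + '/host_components/' + name, auth=AUTH, headers=HEADERS)

# 3. Finally, remove the host from the cluster.
requests.delete(HOST, auth=AUTH, headers=HEADERS)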

From: Aaron Cody mailto:ac...@hexiscyber.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Wednesday, September 24, 2014 2:41 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: delete host/component

hi
are these operations supported in the REST api yet?



Re: delete host/component

2014-09-24 Thread Greg Hill
Yes, they are.  You have to stop the components on that host, then remove the 
components from the host, then you can remove the host from the cluster.

Greg

From: Aaron Cody mailto:ac...@hexiscyber.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Wednesday, September 24, 2014 2:41 PM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: delete host/component

hi
are these operations supported in the REST api yet?



Re: Is there some bugs to create hbase cluster via rest api with blueprint?

2014-07-24 Thread Greg Hill
I think the UI fills in a lot of required configuration for you with sensible 
defaults, but the API does not necessarily do that as well.  It sounds like you 
just need to pass in mapred_user.  You might open a bug to have that defaulted 
on the server-side rather than the UI, since it's something that most people 
won't ever change.

Greg
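
If it helps, a minimal sketch of supplying the missing property in the blueprint itself (Python with requests; the 'global' config type, the 'mapred' value, and the trimmed host_groups are assumptions for illustration, not a verified blueprint):

import json
import requests

blueprint = {
    'Blueprints': {'stack_name': 'HDP', 'stack_version': '2.0.6'},
    'configurations': [
        {'global': {'mapred_user': 'mapred'}},   # assumed config type and default value
    ],
    'host_groups': [
        {'name': 'gateway', 'cardinality': '1',
         'components': [{'name': 'HBASE_MASTER'}]},   # trimmed for brevity
    ],
}

requests.post('http://ambari.local:8080/api/v1/blueprints/hbase-test',
              auth=('admin', 'admin'),
              headers={'X-Requested-By': 'ambari'},
              data=json.dumps(blueprint))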

From: Qing Chi 79624 mailto:c...@vmware.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Wednesday, July 23, 2014 9:00 PM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: Is there some bugs to create hbase cluster via rest api with blueprint?

Hi guys,

I get the following error message while installing cluster packages when creating 
an HBase cluster via the REST API with a blueprint. It works normally when 
creating the HBase cluster via the Ambari UI.  Is there a bug in creating an 
HBase cluster via the REST API with a blueprint?
stderr:   /var/lib/ambari-agent/data/errors-5.txt

2014-07-23 10:36:45,509 - Error while executing command 'install':
Traceback (most recent call last):
  File 
"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
 line 105, in execute
method(env)
  File 
"/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-INSTALL/scripts/hook.py",
 line 34, in hook
setup_users()
  File 
"/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-INSTALL/scripts/shared_initialization.py",
 line 88, in setup_users
groups=[params.user_group]
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", 
line 119, in __new__
env.resources[r_type][name] = obj
  File 
"/usr/lib/python2.6/site-packages/resource_management/libraries/script/config_dictionary.py",
 line 75, in __getattr__
raise Fail("Configuration parameter '"+self.name+"' was not found in 
configurations dictionary!")
Fail: Configuration parameter 'mapred_user' was not found in configurations 
dictionary!

stdout:   /var/lib/ambari-agent/data/output-5.txt

2014-07-23 10:35:09,755 - Package['unzip'] {}
2014-07-23 10:35:09,942 - Installing package unzip ('/usr/bin/yum -d 0 -e 0 -y 
install unzip')
2014-07-23 10:35:11,310 - Package['curl'] {}
2014-07-23 10:35:11,473 - Skipping installing existent package curl
2014-07-23 10:35:11,474 - Package['net-snmp-utils'] {}
2014-07-23 10:35:11,634 - Installing package net-snmp-utils ('/usr/bin/yum -d 0 
-e 0 -y install net-snmp-utils')
2014-07-23 10:35:14,096 - Package['net-snmp'] {}
2014-07-23 10:35:14,257 - Installing package net-snmp ('/usr/bin/yum -d 0 -e 0 
-y install net-snmp')
2014-07-23 10:35:15,915 - Execute['mkdir -p /tmp/HDP-artifacts/ ;   curl -kf   
--retry 10 
http://sin2-pekaurora-bdcqevlan114-146.eng.vmware.com:8080/resources//jdk-7u45-linux-x64.tar.gz
 -o /tmp/HDP-artifacts//jdk-7u45-linux-x64.tar.gz'] {'environment': ..., 
'not_if': 'test -e /usr/jdk64/jdk1.7.0_45/bin/java', 'path': ['/bin', 
'/usr/bin/']}
2014-07-23 10:36:38,415 - Execute['mkdir -p /usr/jdk64 ; cd /usr/jdk64 ; tar 
-xf /tmp/HDP-artifacts//jdk-7u45-linux-x64.tar.gz > /dev/null 2>&1'] {'not_if': 
'test -e /usr/jdk64/jdk1.7.0_45/bin/java', 'path': ['/bin', '/usr/bin/']}
2014-07-23 10:36:42,942 - Execute['mkdir -p /tmp/HDP-artifacts/; curl -kf 
--retry 10 
http://sin2-pekaurora-bdcqevlan114-146.eng.vmware.com:8080/resources//UnlimitedJCEPolicyJDK7.zip
 -o /tmp/HDP-artifacts//UnlimitedJCEPolicyJDK7.zip'] {'environment': ..., 
'not_if': 'test -e /tmp/HDP-artifacts//UnlimitedJCEPolicyJDK7.zip', 
'ignore_failures': True, 'path': ['/bin', '/usr/bin/']}
2014-07-23 10:36:43,187 - Group['hadoop'] {}
2014-07-23 10:36:43,188 - Adding group Group['hadoop']
2014-07-23 10:36:44,314 - Group['users'] {}
2014-07-23 10:36:44,315 - Modifying group users
2014-07-23 10:36:44,463 - Group['users'] {}
2014-07-23 10:36:44,464 - Modifying group users
2014-07-23 10:36:44,611 - User['ambari-qa'] {'gid': 'hadoop', 'groups': 
[u'users']}
2014-07-23 10:36:44,612 - Adding user User['ambari-qa']
2014-07-23 10:36:44,821 - File['/tmp/changeUid.sh'] {'content': 
StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2014-07-23 10:36:44,822 - Writing File['/tmp/changeUid.sh'] because it doesn't 
exist
2014-07-23 10:36:44,822 - Changing permission for /tmp/changeUid.sh from 644 to 
555
2014-07-23 10:36:44,823 - Execute['/tmp/changeUid.sh ambari-qa 
/tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa
 2>/dev/null'] {'not_if': 'test $(id -u ambari-qa) -gt 1000'}
2014-07-23 10:36:45,116 - User['hbase'] {'gid': 'hadoop', 'groups': [u'hadoop']}
2014-07-23 10:36:45,117 - Adding user User['hbase']
2014-07-23 10:36:45,262 - File['/tmp/changeUid.sh'] {'content': 
StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2014-07-23 10:36:45,264 - Execute['/tmp/changeUid.sh hbase 
/home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/hadoop/hbase 
2>/dev/null'] {'not_if': 'test $(id -u hbase) -gt 1000'}
2014-07-23 10:36:45,377 - 

Re: stopping host components via API

2014-07-23 Thread Greg Hill
I opened https://issues.apache.org/jira/browse/AMBARI-6556

I'll look into adjusting the code to use the alternative approach.

Greg

From: Srimanth Gunturi 
mailto:srima...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Tuesday, July 22, 2014 4:40 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: stopping host components via API

Hi Greg,
I would recommend putting the host in maintenance mode as shown in my API call 
(PUT http://c6401:8080/api/v1/clusters/c1/hosts).

Also, can you please open a JIRA about this problem with your current API usage?
Regards,
Srimanth



On Tue, Jul 22, 2014 at 5:16 AM, Greg Hill 
mailto:greg.h...@rackspace.com>> wrote:
Sure.  It's possible I'm doing something wrong.

Setting maintenance mode:

PUT 
/clusters/testcluster/hosts/c6401.ambari.apache.org/host_components?fields=HostRoles/state
   {
"RequestInfo": {
"context" :"Start Maintanence Mode",
},
"Body": {
"HostRoles": {"maintenance_state": "ON"},
},
}

Stopping all components:

PUT 
/clusters/testcluster/hosts/c6401.ambari.apache.org/host_components?fields=HostRoles/state
   {
"RequestInfo": {
"context" :"Stop All Components",
},
"Body": {
"HostRoles": {"state": "INSTALLED"},
},
}

From: Srimanth Gunturi 
mailto:srima...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, July 21, 2014 3:03 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: stopping host components via API

Hi Greg,
Was wondering which version of Ambari you were using?

When maintenance mode is enabled on a service, bulk host operations are ignored.
When maintenance mode is enabled on a host, service-level operations are ignored 
on that host.
So I was wondering if you enabled maintenance mode at one level, and performed 
operations at another?

I tried the following on trunk, and it stopped all host-components in 
maintenance mode.
=
PUT http://c6401:8080/api/v1/clusters/c1/hosts
{
  "RequestInfo": {
"context": "Turn On Maintenance Mode for host",
"query": 
"Hosts/host_name.in<http://host_name.in>(c6401.ambari.apache.org<http://c6401.ambari.apache.org>)"
  },
  "Body": {
"Hosts": {
  "maintenance_state": "ON"
}
  }
}
=
PUT 
http://c6401:8080/api/v1/clusters/c1/hosts/c6401.ambari.apache.org/host_components?
{
  "RequestInfo": {
"context": "Stop All Host Components",
"operation_level": {
  "level": "HOST",
  "cluster_name": "c1",
  "host_names": "c6401.ambari.apache.org<http://c6401.ambari.apache.org>"
},
"query": 
"HostRoles/component_name.in<http://component_name.in>(APP_TIMELINE_SERVER,DATANODE,HISTORYSERVER,NAMENODE,NODEMANAGER,RESOURCEMANAGER,SECONDARY_NAMENODE,ZOOKEEPER_SERVER)"
  },
  "Body": {
"HostRoles": {
  "state": "INSTALLED"
}
  }
}
=

Maybe listing the API calls you make might help.
Regards,
Srimanth




On Mon, Jul 21, 2014 at 10:54 AM, Greg Hill 
mailto:greg.h...@rackspace.com>> wrote:
Anyone know if this is intentional or not?  It seems to ignore setting the
HostRole/state if the host_component is in maintenance mode.  I was able
to work around it by immediately setting maintenance mode after changing
the state. But that leads to a race condition as to whether nagios notices
the downed services before maintenance mode kicks in.

IMO, it shouldn't behave this way.  There's no safe way to stop services
as it is right now.  It should let you stop them during maintenance, as
that's really the primary reason you'd want to set maintenance.

Should I just open a bug?

Greg

On 7/21/14 8:02 AM, "Greg Hill" 
mailto:greg.h...@rackspace.com>> wrote:

>I did some debugging and it turns out that the problem is that I set
>maintenance mode prior to stopping the components.  Unfortunately, this
>makes it so nagios starts alerting me.  My script is attempting to remove
>a slave node from a clus

Re: stopping host components via API

2014-07-22 Thread Greg Hill
Sure.  It's possible I'm doing something wrong.

Setting maintenance mode:

PUT 
/clusters/testcluster/hosts/c6401.ambari.apache.org/host_components?fields=HostRoles/state
   {
"RequestInfo": {
"context" :"Start Maintanence Mode",
},
"Body": {
"HostRoles": {"maintenance_state": "ON"},
},
}

Stopping all components:

PUT 
/clusters/testcluster/hosts/c6401.ambari.apache.org/host_components?fields=HostRoles/state
   {
"RequestInfo": {
"context" :"Stop All Components",
},
"Body": {
"HostRoles": {"state": "INSTALLED"},
},
}
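
For completeness, the equivalent of those two calls with Python's requests library (a rough sketch; the base URL and credentials are placeholders):

import json
import requests

URL = ('http://ambari.local:8080/api/v1/clusters/testcluster'
       '/hosts/c6401.ambari.apache.org/host_components')
AUTH = ('admin', 'admin')
HEADERS = {'X-Requested-By': 'ambari'}

# Put every component on the host into maintenance mode.
requests.put(URL, auth=AUTH, headers=HEADERS,
             data=json.dumps({'RequestInfo': {'context': 'Start Maintenance Mode'},
                              'Body': {'HostRoles': {'maintenance_state': 'ON'}}}))

# Stop every component on the host.
requests.put(URL, auth=AUTH, headers=HEADERS,
             data=json.dumps({'RequestInfo': {'context': 'Stop All Components'},
                              'Body': {'HostRoles': {'state': 'INSTALLED'}}}))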

From: Srimanth Gunturi 
mailto:srima...@hortonworks.com>>
Reply-To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Date: Monday, July 21, 2014 3:03 PM
To: "user@ambari.apache.org<mailto:user@ambari.apache.org>" 
mailto:user@ambari.apache.org>>
Subject: Re: stopping host components via API

Hi Greg,
Was wondering which version of Ambari you were using?

When maintenance mode is enabled on a service, bulk host operations are ignored.
When maintenance mode is enabled on a host, service-level operations are ignored 
on that host.
So I was wondering if you enabled maintenance mode at one level, and performed 
operations at another?

I tried the following on trunk, and it stopped all host-components in 
maintenance mode.
=
PUT http://c6401:8080/api/v1/clusters/c1/hosts
{
  "RequestInfo": {
"context": "Turn On Maintenance Mode for host",
"query": 
"Hosts/host_name.in<http://host_name.in>(c6401.ambari.apache.org<http://c6401.ambari.apache.org>)"
  },
  "Body": {
"Hosts": {
  "maintenance_state": "ON"
}
  }
}
=
PUT 
http://c6401:8080/api/v1/clusters/c1/hosts/c6401.ambari.apache.org/host_components?
{
  "RequestInfo": {
"context": "Stop All Host Components",
"operation_level": {
  "level": "HOST",
  "cluster_name": "c1",
  "host_names": "c6401.ambari.apache.org<http://c6401.ambari.apache.org>"
},
"query": 
"HostRoles/component_name.in<http://component_name.in>(APP_TIMELINE_SERVER,DATANODE,HISTORYSERVER,NAMENODE,NODEMANAGER,RESOURCEMANAGER,SECONDARY_NAMENODE,ZOOKEEPER_SERVER)"
  },
  "Body": {
"HostRoles": {
  "state": "INSTALLED"
}
  }
}
=

Maybe listing the API calls you make might help.
Regards,
Srimanth




On Mon, Jul 21, 2014 at 10:54 AM, Greg Hill 
mailto:greg.h...@rackspace.com>> wrote:
Anyone know if this is intentional or not?  It seems to ignore setting the
HostRole/state if the host_component is in maintenance mode.  I was able
to work around it by immediately setting maintenance mode after changing
the state. But that leads to a race condition as to whether nagios notices
the downed services before maintenance mode kicks in.

IMO, it shouldn't behave this way.  There's no safe way to stop services
as it is right now.  It should let you stop them during maintenance, as
that's really the primary reason you'd want to set maintenance.

Should I just open a bug?

Greg

On 7/21/14 8:02 AM, "Greg Hill" 
mailto:greg.h...@rackspace.com>> wrote:

>I did some debugging and it turns out that the problem is that I set
>maintenance mode prior to stopping the components.  Unfortunately, this
>makes it so nagios starts alerting me.  My script is attempting to remove
>a slave node from a cluster by doing the following:
>
>1. Set maintenance mode on all host_components.
>2. Stop all host_components.
>3. Remove all host_components.
>4. Remove host from cluster.
>
>This was what I had worked out to be the proper procedure a few weeks
>back.  Am I doing something wrong or did I discover a bug?
>
>Greg
>
>
>On 7/18/14 7:15 PM, "Yusaku Sako" 
>mailto:yus...@hortonworks.com>> wrote:
>
>>Greg,
>>
>>That should not be broken.
>>What is the exact call you are trying to make (can you post the
>>equivalent curl call)?
>>
>>Yusaku
>>
>>On Fri, Jul 18, 2014 at 2:13 PM, Greg Hill 
>>mailto:greg.h...@rackspace.com>>
>>wrote:
>>> This used to be accomplished by doing a PUT with this message to the
>>>host
>>> resource:
>>>
>>> {"RequestInfo": {"context" :"Stop All Components"}, "Body":
>

Re: stopping host components via API

2014-07-21 Thread Greg Hill
Anyone know if this is intentional or not?  It seems to ignore setting the
HostRole/state if the host_component is in maintenance mode.  I was able
to work around it by immediately setting maintenance mode after changing
the state. But that leads to a race condition as to whether nagios notices
the downed services before maintenance mode kicks in.

IMO, it shouldn't behave this way.  There's no safe way to stop services
as it is right now.  It should let you stop them during maintenance, as
that's really the primary reason you'd want to set maintenance.

Should I just open a bug?

Greg

On 7/21/14 8:02 AM, "Greg Hill"  wrote:

>I did some debugging and it turns out that the problem is that I set
>maintenance mode prior to stopping the components.  Unfortunately, this
>makes it so nagios starts alerting me.  My script is attempting to remove
>a slave node from a cluster by doing the following:
>
>1. Set maintenance mode on all host_components.
>2. Stop all host_components.
>3. Remove all host_components.
>4. Remove host from cluster.
>
>This was what I had worked out to be the proper procedure a few weeks
>back.  Am I doing something wrong or did I discover a bug?
>
>Greg
>
>
>On 7/18/14 7:15 PM, "Yusaku Sako"  wrote:
>
>>Greg,
>>
>>That should not be broken.
>>What is the exact call you are trying to make (can you post the
>>equivalent curl call)?
>>
>>Yusaku
>>
>>On Fri, Jul 18, 2014 at 2:13 PM, Greg Hill 
>>wrote:
>>> This used to be accomplished by doing a PUT with this message to the
>>>host
>>> resource:
>>>
>>> {"RequestInfo": {"context" :"Stop All Components"}, "Body":
>>>{"HostRoles":
>>> {"state": "INSTALLED"}}}
>>>
>>> But that doesn't appear to work any more.  It worked a few weeks ago.
>>>Is
>>> there somewhere where changes like this are being documented?
>>>
>>> Greg
>>
>



Re: stopping host components via API

2014-07-21 Thread Greg Hill
I did some debugging and it turns out that the problem is that I set
maintenance mode prior to stopping the components.  Unfortunately, this
makes it so nagios starts alerting me.  My script is attempting to remove
a slave node from a cluster by doing the following:

1. Set maintenance mode on all host_components.
2. Stop all host_components.
3. Remove all host_components.
4. Remove host from cluster.

This was what I had worked out to be the proper procedure a few weeks
back.  Am I doing something wrong or did I discover a bug?

Greg


On 7/18/14 7:15 PM, "Yusaku Sako"  wrote:

>Greg,
>
>That should not be broken.
>What is the exact call you are trying to make (can you post the
>equivalent curl call)?
>
>Yusaku
>
>On Fri, Jul 18, 2014 at 2:13 PM, Greg Hill 
>wrote:
>> This used to be accomplished by doing a PUT with this message to the
>>host
>> resource:
>>
>> {"RequestInfo": {"context" :"Stop All Components"}, "Body":
>>{"HostRoles":
>> {"state": "INSTALLED"}}}
>>
>> But that doesn't appear to work any more.  It worked a few weeks ago.
>>Is
>> there somewhere where changes like this are being documented?
>>
>> Greg
>



stopping host components via API

2014-07-18 Thread Greg Hill
This used to be accomplished by doing a PUT with this message to the host 
resource:

{"RequestInfo": {"context" :"Stop All Components"}, "Body": {"HostRoles": 
{"state": "INSTALLED"}}}

But that doesn't appear to work any more.  It worked a few weeks ago.  Is there 
somewhere where changes like this are being documented?

Greg


Re: new error

2014-07-18 Thread Greg Hill
Oh, I see the key changed from 'global' to 'nagios-env'.  Were the docs just 
wrong and it was escaping detection previously or did this 
backwards-incompatible change get purposely made in a point release?

Greg
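
For anyone hitting the same error, a minimal sketch of the blueprint 'configurations' block after the rename, with the property moved from the 'global' type to 'nagios-env' (shown here as the equivalent Python structure; the contact address is the one from this thread):

configurations = [
    {
        'nagios-env': {
            'nagios_contact': 'greg.h...@rackspace.com'   # contact address from this thread
        }
    }
]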

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Friday, July 18, 2014 1:43 PM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: new error

I was revisiting my Ambari API testing with the most recent changes, and code 
that worked a few weeks ago no longer does.  Apparently the way the docs show to 
define the nagios_contact no longer works.

I sent this in my blueprint:

"configurations": [
{
"global" : {
"nagios_contact" : 
"greg.h...@rackspace.com"
}
}
],

But when I attempt to create that blueprint, I get a 400 error with the 
following message:

Required configurations are missing from the specified host groups: 
{gateway={nagios-env=[nagios_contact]}}

('gateway' is the name of one of my host_groups).  This error did not occur as 
of the latest code 3-4 weeks ago.  What do I need to be doing differently?

Greg


new error

2014-07-18 Thread Greg Hill
I was revisiting my Ambari API testing with the most recent changes, and code 
that worked a few weeks ago no longer does.  Apparently the way the docs show to 
define the nagios_contact no longer works.

I sent this in my blueprint:

"configurations": [
{
"global" : {
"nagios_contact" : "greg.h...@rackspace.com"
}
}
],

But when I attempt to create that blueprint, I get a 400 error with the 
following message:

Required configurations are missing from the specified host groups: 
{gateway={nagios-env=[nagios_contact]}}

('gateway' is the name of one of my host_groups).  This error did not occur as 
of the latest code 3-4 weeks ago.  What do I need to be doing differently?

Greg


updated API docs

2014-06-25 Thread Greg Hill
Are there any API docs that include the changes for 1.6.0?  The docs I found 
online are not current:

https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md

More specifically, I'm interested in how to add a host to an existing cluster 
that was created with a blueprint.  Do I just have to add the host to the 
cluster, then assign all the relevant components, then do install 
services/start services?  Or is there a simpler way?

Greg
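
In case it's useful, a rough sketch of the manual flow described above, against the v1 REST API with Python's requests library (cluster name, host name, component list, and credentials are placeholders; each install/start PUT is asynchronous, so you would normally wait for the install request to finish before starting):

import json
import requests

AMBARI = 'http://ambari.local:8080/api/v1'
AUTH = ('admin', 'admin')
HEADERS = {'X-Requested-By': 'ambari'}
HOST = AMBARI + '/clusters/c1/hosts/c6404.ambari.apache.org'

# 1. Add the host to the cluster (the ambari-agent must already be registered).
requests.post(HOST, auth=AUTH, headers=HEADERS)

# 2. Add the components this host should run (placeholder slave components).
for component in ('DATANODE', 'NODEMANAGER'):
    requests.post(HOST + '/host_components/' + component, auth=AUTH, headers=HEADERS)

# 3. Install, then start, everything on the host.
for state, context in (('INSTALLED', 'Install components'),
                       ('STARTED', 'Start components')):
    requests.put(HOST + '/host_components', auth=AUTH, headers=HEADERS,
                 data=json.dumps({'RequestInfo': {'context': context},
                                  'Body': {'HostRoles': {'state': state}}}))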



Re: Ambari 1.6.0 installation error

2014-06-24 Thread Greg Hill
Having just done this on Friday, I can confirm that doing it as root works.  
The vagrant scripts installed the same key to both the vagrant and root users, 
so all I had to do was upload the insecure_private_key file in the Ambari UI 
and it worked fine.  I left everything else at the defaults.

Greg

From: "ÐΞ€ρ@Ҝ (๏̯͡๏)" mailto:deepuj...@gmail.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Tuesday, June 24, 2014 6:55 AM
To: user mailto:user@ambari.apache.org>>
Subject: Re: Ambari 1.6.0 installation error

Hello,
I had faced the exact same error; the user must be root, otherwise it does not 
work.
It did not work with my user; upon switching to root, everything worked.
You will need to get the private key of root@ambari-server and copy the public 
key of root@ambari-server onto all the compute nodes.

Switch to root and everything will be dead smooth :)
Regards,
Deepak



On Tue, Jun 24, 2014 at 5:13 PM, Dimitris Bouras 
mailto:dmpou...@gmail.com>> wrote:
Thank you both for your promt reply

Deepak, I will try out your solution thanks.

Yusaku, I still cannot seem to install the Ambari cluster following the quick 
start guide. I have attached my steps in a "steps.txt"; please point out any 
steps I might be missing or user privileges that may be involved.

Furthermore, I have attached the console output of the installation and the tail 
of /ambari-server.log & (ui logs) for both cases, as root and as vagrant, using 
the same insecure key. I can't seem to get it to work with either.

Thanks in advance for your help

Dimitri






On Tue, Jun 24, 2014 at 12:59 AM, Yusaku Sako 
mailto:yus...@hortonworks.com>> wrote:
Hi all,

The Quick Start Guide [1] was missing the step to copy the 
insecure_private_key file to the OS-specific project folder.
The Wiki has been updated to add this step.
You don't need to manually distribute or configure SSH on all VMs.  Once the 
insecure_private_key file is in place in the project folder, bootstrap.sh does 
the rest (including passwordless ssh login for root).

[1] https://cwiki.apache.org/confluence/display/AMBARI/Quick+Start+Guide

Yusaku


On Mon, Jun 23, 2014 at 7:41 AM, Deepak 
mailto:deepuj...@gmail.com>> wrote:
Hello
I have had a similar error.
You need to store the SSH public key of the Ambari server in the root user's 
.ssh/authorized_keys file on every host you add using the web GUI. This must be 
done manually. Remember it's the root user and not any other user.

Sent from my iPhone

On 23-Jun-2014, at 7:35 pm, Dimitris Bouras 
mailto:dmpou...@gmail.com>> wrote:

Thanks,

 Shouldn't the shell command in bootstrap.sh do this for me?

mkdir -p /root/.ssh; chmod 600 /root/.ssh; cp 
/home/vagrant/.ssh/authorized_keys /root/.ssh/


On Mon, Jun 23, 2014 at 3:05 PM, Olivier Renault 
mailto:orena...@hortonworks.com>> wrote:

It's probably a good assumption. You need to have distributed the public SSH 
key to every host ahead of time.

An alternative is to install ambari-agent on each node, configure it (edit 
/etc/ambari-agent/conf/ambari-agent.conf and point it to your ambari-server), 
and start the agent.

Thanks
Olivier

On 23 Jun 2014 12:48, "Dimitris Bouras" 
mailto:dmpou...@gmail.com>> wrote:

The steps I follow to setup ambari are based on the Quick start Guide are:

1)cd centos6.5
2)./up.sh 3
3) vagrant ssh c6501
4) sudo su -
4a) yum install wget
5) wget 
http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.6.0/ambari.repo
 ..setup star...etc.
6) login to c6501:8080
7) Define my range expression:
c65[01-03].ambari.apache.org

Specify the non-root SSH user vagrant, and upload the insecure_private_key file 
that I copied earlier as the private key.

Here is where the process fails, probably due to me not setting up passwordless 
SSH. Is this a correct assumption?
Or is there some other kind of error in my setup? (The vagrant insecure key 
should do the trick, shouldn't it?)
Kind regards

Dimitri




Re: weird error using the ambari API to kick a cluster

2014-06-23 Thread Greg Hill
Oh, I see I skipped a step with the 'add_config' section.  Does Ambari not pick 
sensible defaults if the user doesn't provide this information?

Greg

From: Greg mailto:greg.h...@rackspace.com>>
Reply-To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Date: Monday, June 23, 2014 8:27 AM
To: "user@ambari.apache.org" 
mailto:user@ambari.apache.org>>
Subject: weird error using the ambari API to kick a cluster

Thanks for the help on Friday.  I was able to get most of the way to using the 
Ambari API to set up a test cluster, but I'm now stuck on a fairly cryptic 
problem again.

I'm running into an error on the Ambari agent side, but I'm confused as to the 
source of the error.  I imagine I'm not passing in some required configuration, 
but the API made no complaint when I didn't provide it.

My setup is a 3 VM vagrant/virtualbox setup on my local machine using the 
Ambari quick start guide located here:

https://cwiki.apache.org/confluence/display/AMBARI/Quick+Start+Guide

I wrote a POC script based on the instructions here:
https://cwiki.apache.org/confluence/display/AMBARI/Create+a+new+Cluster

Here's a gist with the error message I'm getting from the ambari agent and the 
script I'm using:
https://gist.github.com/jimbobhickville/e0f744624742b02d2ce6

As far as I can tell, the error occurs after I call install_all_services()

It sounds like the Ambari agent is expecting a 'global' config section to 
exist, but it is not requiring me to provide it, and it is not provided by the 
bootstrap process.

Any ideas?

Greg


weird error using the ambari API to kick a cluster

2014-06-23 Thread Greg Hill
Thanks for the help on Friday.  I was able to get most of the way to using the 
Ambari API to set up a test cluster, but I'm now stuck on a fairly cryptic 
problem again.

I'm running into an error on the Ambari agent side, but I'm confused as to the 
source of the error.  I imagine I'm not passing in some required configuration, 
but the API made no complaint when I didn't provide it.

My setup is a 3 VM vagrant/virtualbox setup on my local machine using the 
Ambari quick start guide located here:

https://cwiki.apache.org/confluence/display/AMBARI/Quick+Start+Guide

I wrote a POC script based on the instructions here:
https://cwiki.apache.org/confluence/display/AMBARI/Create+a+new+Cluster

Here's a gist with the error message I'm getting from the ambari agent and the 
script I'm using:
https://gist.github.com/jimbobhickville/e0f744624742b02d2ce6

As far as I can tell, the error occurs after I call install_all_services()

It sounds like the Ambari agent is expecting a 'global' config section to 
exist, but it is not requiring me to provide it, and it is not provided by the 
bootstrap process.

Any ideas?

Greg