Making the agent retry custom actions

2016-02-19 Thread Greg Hill
This is for Ambari 2.1.1 so apologies if this has since been fixed. We saw a failure today in one of our custom actions caused by a temporary network hiccup: Caught an exception while executing custom service command: : Can not download file from url
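A minimal sketch of the kind of retry being asked about, assuming the download step can be wrapped in a callable. The helper name `retry` and its parameters are hypothetical and are not part of the Ambari agent:

```python
import time


def retry(operation, attempts=3, base_delay=1.0, sleep=time.sleep,
          retry_on=(IOError,)):
    """Call operation() until it succeeds or the attempts run out.

    Hypothetical helper, not an Ambari API. The delay doubles after each
    failed attempt (base_delay, 2 * base_delay, ...).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            # Re-raise the last error once the attempts are used up.
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

A custom action could wrap only its download step this way, so that a brief network failure does not fail the whole command.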

Re: About the Ambari python client

2016-02-19 Thread Greg Hill
It hasn't been worked on in over a year. I wrote my own and released it to PyPI: https://github.com/jimbobhickville/python-ambariclient I was working on contributing it back to Ambari to replace this older client, but I could never get the Ambari tests to finish on my machine and something in

Re: custom action times out even though it never even started

2016-02-04 Thread Greg Hill
Generally, if host is in heart-beat lost or unknown state then commands timeout immediately. Adding health check will help for sure. From: Greg Hill <greg.h...@rackspace.com> Se

custom action times out even though it never even started

2016-02-03 Thread Greg Hill
In our Ambari setup, we inject some custom actions. Generally this has worked well, but lately I've been testing a specific one and the behavior I'm seeing confuses me. We have one custom action that will download a script from a URL and run it. However, despite my setting the timeout on the

Re: openjdk update breaks ambari-agent 2-way ssl

2016-01-22 Thread Greg Hill
I think that this should be permanently changed in Ambari since md5 is no longer trusted. Then again sha1 isn't either, so maybe the default needs to be sha256. I hope this helps, Rob From: Greg Hill <greg.h...@rackspace.com> Reply-To:

openjdk update breaks ambari-agent 2-way ssl

2016-01-22 Thread Greg Hill
We discovered a bug last night when our centos mirror updated openjdk and caused cluster builds to start failing. This is in Ambari 2.1.1 but I didn't see anything in github to indicate that this code has since changed. We tracked it down to the removal of the md5 algorithm from the list of

Re: Multiple CentOS versions in same stack?

2016-01-14 Thread Greg Hill
Honestly, I don't know that anyone has ever tried, but I have a feeling it might not work out well. The 'repo' is specified at the stack level, so you'd have to make a new cluster after modifying the repo url on the stack in order for the new nodes to even know to use a different repo from the

Re: Systemd update breaks ambari-server and ambari-agent

2015-12-30 Thread Greg Hill
I was mistaken on one detail: ambari-agent does appear to still work with systemd, just not ambari-server. Greg From: Greg Reply-To: "user@ambari.apache.org"

Re: Systemd update breaks ambari-server and ambari-agent

2015-12-30 Thread Greg Hill
This seems to work so far, in case someone else runs into the same problem:
/usr/lib/systemd/system/ambari-server.service
--
[Unit]
Description=ambari-server service
After=xe-linux-distribution.service
[Service]
Type=forking

Re: Systemd update breaks ambari-server and ambari-agent

2015-12-30 Thread Greg Hill
To: "user@ambari.apache.org" <user@ambari.apache.org> Subject: Re: Systemd update breaks ambari-server and ambari-agent Thanks Greg. Can you open a JIRA and add these to the description? -Sumit

Systemd update breaks ambari-server and ambari-agent

2015-12-30 Thread Greg Hill
A recent CentOS update (7.2) is causing ambari-server to not work with systemd. systemctl restart ambari-server Unit ambari-server.service failed to load: No such file or directory. This is because ambari-server does not install a service definition file in:

Re: NullPointerException when posting a Request

2015-12-07 Thread Greg Hill
So, looking at our code a bit more, we submit multiple requests in parallel, and only one of them failed with the 500 error. Perhaps a locking (or lack thereof) issue? I can adjust our workflow to prevent the parallel submission as a workaround, but seems like Ambari should be able to handle

Re: MYSQL_SERVER install failing 100% of the time now

2015-11-30 Thread Greg Hill
Subject: Re: MYSQL_SERVER install failing 100% of the time now Hi Greg, what do you get after running yum info mysql*? It should contain the repo that provided it. Thanks, Alejandro From: Greg Hill <greg.h...@rackspace.com>

Re: Tez View not loading in Ambari 2.1.2

2015-10-13 Thread Greg Hill
The Tez view requires Kerberos. I'm not really sure why it does, but it seems to have something to do with the proxyuser settings in the YARN timeline server, IIRC. I don't know why it doesn't allow proxying system users like every other component. I could be wrong on that, since the docs

Re: COMMERCIAL:Hue on Ambrai

2015-09-23 Thread Greg Hill
The version of HUE in Ambari is really old and deprecated. I don't think it's officially supported any more. It doesn't appear that it even supports overriding hue-env, afaict, so you wouldn't be able to use Ambari to override that configuration. There are some attempts at making an updated

Tez view questions

2015-08-31 Thread Greg Hill
So with the 2.1.1 release, I can now enable the Tez view on my clusters, but it doesn't seem to show anything. Similarly, the Tez tab on the Hive view doesn't show anything. It should show any Pig or Hive jobs that were run previously, right? I verified that the job history page has the

Re: COMMERCIAL:Setting the Ambari API hostname in views

2015-08-28 Thread Greg Hill
FWIW, I figured this out. It is not configurable, it just looks up the local system hostname. It looks like I can tell the agent to use a custom script to figure out what the agent hostname should be, so I can configure the system with the fqdn as the hostname and inject a script for the

When is 2.1.1 planned for release?

2015-08-28 Thread Greg Hill
I need the HTTP_ONLY fix for the Tez view. What's the ETA on release for 2.1.1? Greg

Setting the Ambari API hostname in views

2015-08-25 Thread Greg Hill
I'm trying to enable views, but Ambari is automatically configuring them with the local domain hostname that we use for internal cluster traffic, rather than the public fqdn that we use for the Ambari API. Because of this we get an SSL hostname mismatch and the views don't work. I tried

Url validation bug in views?

2015-08-20 Thread Greg Hill
I'm working on enabling views, and there's an issue in that the ambari server is known locally on the cluster as 'ambari.local', but has a fqdn for external access. The fqdn looks something like: ambari-$uuid.domain.com For some reason, when I try to configure the CAPACITY-SCHEDULER view, if

GANGLIA broken in Ambari 2.1?

2015-07-27 Thread Greg Hill
In the Centos7 HDP2.3 stack, it attempts to run '/etc/init.d/httpd' which doesn't exist, rather than using the 'service' shortcut that does still work, even though it forwards to 'systemctl'. I injected a script into /etc/init.d/httpd to work around this, but the stack should probably be

Re: COMMERCIAL:GANGLIA broken in Ambari 2.1?

2015-07-27 Thread Greg Hill
Apparently the gmetad service was running and the hdp-gmetad wrapper service didn't recognize that and tried to start it again on the same port. Had to chkconfig gmetad off and service gmetad stop on the server before handing it off to Ambari. I'll submit a JIRA tomorrow. Greg From: Greg

Re: COMMERCIAL:Re: GANGLIA broken in Ambari 2.1?

2015-07-27 Thread Greg Hill
and a metrics collector subsystem. You will find some details on the wiki. Thanks, Jayesh From: Greg Hill <greg.h...@rackspace.com> To: user@ambari.apache.org

Re: Adding Local Repos to Ambari via REST APIs

2015-07-21 Thread Greg Hill
Just do GET instead of PUT to get the current values. Greg From: Pratik Gadiya <pratik_gad...@persistent.com> Reply-To: user@ambari.apache.org Date: Tuesday, July 21, 2015 at 5:22

Re: Ambari 2.0 DECOMMISSION

2015-05-15 Thread Greg Hill
exclude file when the host gets deleted. This was added at some point so that when the host is added back the DN can join normally. Host component start/stop should not trigger this. From: Greg Hill <greg.h...@rackspace.com> Sent

Re: Ambari 2.0 DECOMMISSION

2015-05-14 Thread Greg Hill
Some further testing results: 1. Turning on maintenance mode beforehand didn't seem to affect it. 2. The datanodes do go to decommissioning briefly before they go back to live, so it is at least trying to decommission them. Shouldn't they go to 'decommissioned' after it finishes though? 3.

zookeeper required for Ambari 2.0?

2015-05-13 Thread Greg Hill
The YARN resource manager keeps crashing in Ambari 2.0 + HDP 2.2.4.2 clusters for me. The error log indicates that it can't connect to zookeeper, which makes sense since I didn't provision zookeeper as I don't use it. I found the relevant settings in the Ambari UI:

Re: zookeeper required for Ambari 2.0?

2015-05-13 Thread Greg Hill
Looks like yarn.resourcemanager.store.class defaulted to org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore Ambari should probably not set that unless there is a zookeeper_server in the cluster. Greg From: Greg <greg.h...@rackspace.com> Reply-To:

Ambari 2.0 HIVE_CLIENT disappeared

2015-05-12 Thread Greg Hill
I'm not sure what I'm doing to cause this. When I use the Quick Start guide, I can install everything fine. But when I tried to then use Ambari 2.0 in our product, HIVE_CLIENT is no longer a valid component according to Ambari. When I try to add it to a blueprint, I get a 400 Error: The

Re: Changing Ambari Portal Password

2015-05-11 Thread Greg Hill
It can be done in the Ambari Portal itself or via the Ambari API. Go to the Admin section and there should be an option to edit users there. On the API, a PUT call to the user record should let you update it: PUT /api/v1/users/admin { Users: { user_name: $desired_username,

centos 6 + ambari 2.0 + hdp 2.2 == timeout

2015-05-04 Thread Greg Hill
I can't get an install to work on ambari 2.0 + hdp 2.2 using the public yum repos. It just times out after 1800s on a yum install every time. Are hortonworks repos being super slow or is something else screwy happening? My repos were defaulted by Ambari to:

Re: adjust the agent heartbeat?

2015-04-17 Thread Greg Hill
Not without code change. This is probably a good feature to add. Can you create a task? From: Greg Hill <greg.h...@rackspace.com> Sent: Friday, April 17, 2015 8:32 AM To: user@ambari.apache.org

adjust the agent heartbeat?

2015-04-17 Thread Greg Hill
https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/NetUtil.py#L34 Is there any way to tweak that heartbeat interval setting? If I'm reading the code right, it checks in with the server every 10s. I'd like to be able to tweak that and see if I can speed up

potential bug

2015-04-08 Thread Greg Hill
I'm diagnosing an issue, and I think I found a bug with the ambari-agent code: https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/Controller.py#L390 If 'cluster_name' has spaces in it, this request fails because it fails to URL-encode value. This causes all
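To illustrate the encoding issue described above, here is a small sketch using Python's standard `urllib.parse.quote`. The function name `cluster_path` and the path prefix are illustrative assumptions and do not reproduce the actual request built in Controller.py:

```python
from urllib.parse import quote


def cluster_path(cluster_name):
    """Build a URL path segment that is safe for names containing spaces.

    Hypothetical helper: the prefix below is only an example path.
    safe="" also percent-encodes "/" so the name stays a single segment.
    """
    return "/api/v1/clusters/" + quote(cluster_name, safe="")


print(cluster_path("my cluster"))  # /api/v1/clusters/my%20cluster
```

Without the `quote` call the space would be sent literally in the request line, which is the failure mode the post describes.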

Re: Did something get broken for webhcat today?

2015-03-18 Thread Greg Hill
From: Greg Hill <greg.h...@rackspace.com> Sent: Wednesday, March 18, 2015 12:41 PM To: user@ambari.apache.org Subject: Re: Did something get broken for webhcat today? We didn't change anything. Ambari 1.7.0, HDP 2.2. Repos are: [root

Did something get broken for webhcat today?

2015-03-18 Thread Greg Hill
Starting this morning, we started seeing this on every single install. I think someone at Hortonworks pushed out a broken RPM or something. Any ideas? This is rather urgent as we are no longer able to provision HDP 2.2 clusters at all because of it. 2015-03-18 15:58:05,982 -

Re: COMMERCIAL:Re: Did something get broken for webhcat today?

2015-03-18 Thread Greg Hill
the cluster with. Yusaku From: Greg Hill <greg.h...@rackspace.com> Reply-To: user@ambari.apache.org Date: Thursday, March 19, 2015 1:56 AM To: user@ambari.apache.org

Re: COMMERCIAL:Re: COMMERCIAL:Re: Did something get broken for webhcat today?

2015-03-18 Thread Greg Hill
-Step4:SetupStackRepositories%28Optional%29 From: Greg Hill <greg.h...@rackspace.com> Reply-To: user@ambari.apache.org Date: Wednesday, March 18, 2015 at 1:11 PM To: user

Re: COMMERCIAL:Re: decommission multiple nodes issue

2015-03-04 Thread Greg Hill
be skipped on the matching target resources in case the host(s) are in maintenance mode. If you set to HOST_COMPONENT, it would ignore any host-level maintenance mode. This is a really mysterious, undocumented part of Ambari, unfortunately. Yusaku From: Greg Hill <greg.h...@rackspace.com>

Re: decommission multiple nodes issue

2015-03-02 Thread Greg Hill
Partner Solutions Engineer - EMEA @seano From: Greg Hill <greg.h...@rackspace.com> Reply: user@ambari.apache.org Date: March 2, 2015 at 19:08:13 To: user@ambari.apache.org

decommission multiple nodes issue

2015-03-02 Thread Greg Hill
I have some code for decommissioning datanodes prior to removal. It seems to work fine with a single node, but with multiple nodes it fails. When passing multiple hosts, I am putting the names in a comma-separated string, as seems to be the custom with other Ambari API commands. I attempted

Re: COMMERCIAL:RE: Server Restarts

2015-02-19 Thread Greg Hill
That won’t make the agent auto-start components on restart. Afaik, you have to do that manually. Greg From: johny casanova <pcgamer2...@outlook.com> Reply-To: user@ambari.apache.org

ambari-server postgres setup questions

2015-02-06 Thread Greg Hill
I was looking through the ambari-server postgres setup because we're having occasional issues with postgresql initdb failing. That's kind of tangential, but I found something that concerns me that I'd like some feedback on. Afaict, it sets up postgres to: 1. Listen for traffic from anywhere

Re: ssl changes recently?

2015-01-07 Thread Greg Hill
Hey Greg, On RHEL 6.5 we got a similar error during agent registration. Here is the workaround: http://hortonworks.com/community/forums/topic/ambari-agent-registration-failure-on-rhel-6-5-due-to-openssl-2/ Hope that helps, Erin - Original Message - From: Greg Hill greg.h

ssl changes recently?

2015-01-07 Thread Greg Hill
I sent this to the wrong list earlier. I recently updated our Ambari 1.7.0 image and am now getting SSL errors from the agents:
INFO 2015-01-07 16:59:02,116 NetUtil.py:48 - Connecting to https://ambari.local:8440/ca
ERROR 2015-01-07 16:59:02,645 NetUtil.py:66 - [SSL: CERTIFICATE_VERIFY_FAILED]

problem with historyserver on secondary namenode

2014-12-23 Thread Greg Hill
Trying to use Ambari 1.7.0 to provision an HDP 2.2 cluster. The layout I'm using has the yarn history server on the same host as the secondary namenode (the primary namenode is on another host), but it fails to start because it tries to interact with hdfs before hdfs is ready. Here's a gist

Re: problem with historyserver on secondary namenode

2014-12-23 Thread Greg Hill
I may have been hasty in my diagnosis. It doesn't appear to start even after hdfs is up and running fine. I'll dig more and see if I can figure out the real culprit here. Greg From: Greg <greg.h...@rackspace.com> Reply-To:

Re: problem with historyserver on secondary namenode

2014-12-23 Thread Greg Hill
The problem is the namenode is only listening on localhost:
[root@master-1 ~]# netstat -pl --numeric-ports --numeric-hosts | grep 10975
tcp 0 0 127.0.0.1:8020 0.0.0.0:* LISTEN 10975/java
tcp 0 0 127.0.0.1:50070 0.0.0.0:*

question about adding custom components

2014-12-16 Thread Greg Hill
We have a need to add a feature to Ambari to inject iptables rules on all the nodes in a cluster to allow traffic only from other nodes in the cluster. We'd need to rewrite those rules any time a node was added or removed from the cluster. I was thinking the best way to handle this would be

Re: unofficial python client

2014-12-04 Thread Greg Hill
you meant? Yusaku On Wed, Dec 3, 2014 at 8:40 AM, Greg Hill <greg.h...@rackspace.com> wrote: I wrote a new Python client and published it to GitHub. Thought others might be interested. https://github.com/jimbobhickville/python-ambariclient I did first attempt

unofficial python client

2014-12-03 Thread Greg Hill
I wrote a new Python client and published it to GitHub. Thought others might be interested. https://github.com/jimbobhickville/python-ambariclient I did first attempt to work on the official client, as I'm much more in favor of contributing over forking, but I didn't feel like the effort was

a couple API questions

2014-11-10 Thread Greg Hill
1. Is there a way to query the API to see what version of ambari the server is running? This would make auto-negotiation in the client easy, so it can automatically account for version differences. If this doesn't exist, I can open a JIRA to have it added. We have a dev on the team that is

Re: a couple API questions

2014-11-10 Thread Greg Hill
On Mon, Nov 10, 2014 at 3:15 PM, Greg Hill <greg.h...@rackspace.com> wrote: 1. Is there a way to query the API to see what version of ambari the server is running? This would make auto-negotiation in the client easy, so it can automatically account for version

Re: Stop all components API call no longer seems to work

2014-11-07 Thread Greg Hill
. Unfortunately I don't see any documentation on this. I presume you are getting 200 because all components on the specified host are already stopped. Yusaku On Fri, Nov 7, 2014 at 5:55 AM, Greg Hill greg.h...@rackspace.com wrote: This used to work in earlier 1.7.0 builds, but doesn't seem to any

possible bug in the Ambari API

2014-11-03 Thread Greg Hill
On the latest Ambari 1.7.0 build, this API call returns invalid JSON that the parser chokes on. Notice the lack of a comma between the end of the first 'StackConfigurations' structure and the following one. There's just } { instead of }, { GET
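As a client-side workaround sketch for a body that contains several JSON objects separated only by whitespace (`} {` rather than `}, {`), Python's standard `json.JSONDecoder.raw_decode` can read one value at a time. The function name `parse_concatenated` is hypothetical, and this assumes the individual objects are themselves valid JSON:

```python
import json


def parse_concatenated(text):
    """Parse whitespace-separated JSON values into a list.

    Hypothetical helper for tolerating a response such as '{...} {...}'.
    Raises json.JSONDecodeError if any individual value is malformed.
    """
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values
```

This only papers over the symptom on the client; the response itself would still need to be fixed on the server side.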

Re: possible bug in the Ambari API

2014-11-03 Thread Greg Hill
, stack_name : HDP, stack_version : 2.1, type : hbase-env.xml } } ] On Mon, Nov 3, 2014 at 2:45 PM, Greg Hill <greg.h...@rackspace.com> wrote: The more I look at this, I think it's just two separate dictionaries separated by a space. That's

Re: possible bug in the Ambari API

2014-11-03 Thread Greg Hill
: hbase-env.xml } } ] On Mon, Nov 3, 2014 at 2:45 PM, Greg Hill <greg.h...@rackspace.com> wrote: The more I look at this, I think it's just two separate dictionaries separated by a space. That's not a valid response at all. It should be wrapped in list

Re: possible bug in the Ambari API

2014-11-03 Thread Greg Hill
/dev.hortonworks.com/ambari/centos6/1.x/latest/trunk/ambari.repo Thanks, Yusaku On Mon, Nov 3, 2014 at 12:42 PM, Greg Hill <greg.h...@rackspace.com> wrote: /api/v1/stacks/HDP/versions/2.1/services/HBASE/configurations works fine, just like any other GET method on a list of resources

Re: delete host/component

2014-09-24 Thread Greg Hill
:52 PM To: user@ambari.apache.org Subject: Re: delete host/component great! when was that functionality introduced? (I’m still on 1.2.4) thanks From: Greg Hill <greg.h...@rackspace.com>

Re: Is there some bugs to create hbase cluster via rest api with blueprint?

2014-07-24 Thread Greg Hill
I think the UI fills in a lot of required configuration for you with sensible defaults, but the API does not necessarily do that as well. It sounds like you just need to pass in mapred_user. You might open a bug to have that defaulted on the server-side rather than the UI, since it's

Re: stopping host components via API

2014-07-21 Thread Greg Hill
wrote: Greg, That should not be broken. What is the exact call you are trying to make (can you post the equivalent curl call)? Yusaku On Fri, Jul 18, 2014 at 2:13 PM, Greg Hill greg.h...@rackspace.com wrote: This used to be accomplished by doing a PUT with this message to the host resource

weird error using the ambari API to kick a cluster

2014-06-23 Thread Greg Hill
Thanks for the help on Friday. I was able to get most of the way to using the Ambari API to set up a test cluster, but I'm now stuck on a fairly cryptic problem again. I'm running into an error on the Ambari agent side, but I'm confused as to the source of the error. I imagine I'm not