Hi Robert,

I started the proxy server explicitly by specifying a value for yarn.web-proxy.address in yarn-site.xml. The proxy server did start, and I tried getting the JSON response using the following command:

curl --compressed -H "Accept: application/json" -X GET "http://localhost:8090/proxy/application_1341823967331_0001/ws/v1/mapreduce/jobs/job_1341823967331_0001"

However, it refused the connection, and below is an excerpt from the proxy server logs:

---------
2012-07-09 14:26:40,402 INFO org.mortbay.log: Extract jar:file:/home/prajakta/Projects/IRL/hadoop-common/hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-yarn-common-3.0.0-SNAPSHOT.jar!/webapps/proxy to /tmp/Jetty_localhost_8090_proxy____.ak3o30/webapp
2012-07-09 14:26:40,992 INFO org.mortbay.log: Started SelectChannelConnector@localhost:8090
2012-07-09 14:26:40,993 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.webproxy.WebAppProxy is started.
2012-07-09 14:26:40,993 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer is started.
2012-07-09 14:33:26,039 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://prajakta:44314/ws/v1/mapreduce/jobs/job_1341823967331_0001 which is the app master GUI of application_1341823967331_0001 owned by prajakta
2012-07-09 14:33:29,277 INFO org.apache.commons.httpclient.HttpMethodDirector: I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server prajakta failed to respond
2012-07-09 14:33:29,277 INFO org.apache.commons.httpclient.HttpMethodDirector: Retrying request
2012-07-09 14:33:29,284 WARN org.mortbay.log: /proxy/application_1341823967331_0001/ws/v1/mapreduce/jobs/job_1341823967331_0001: java.net.SocketException: Connection reset
2012-07-09 14:37:33,834 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://prajakta:19888/jobhistory/job/job_1341823967331_0001/jobhistory/job/job_1341823967331_0001which is the app master GUI of application_1341823967331_0001 owned by prajakta
---------------

I am not sure
why the HTTP request object is setting my remoteUser to dr.who. :(

I gather from <https://issues.apache.org/jira/browse/MAPREDUCE-2858> that this warning is posted only when security is disabled. I assume that the proxy server is not disabled when security is disabled.

Any idea what could be the reason for this I/O exception? Am I missing a property that needs to be set for proper access? Please let me know.

Regards,
Prajakta

On Fri, Jul 6, 2012 at 10:59 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:

> I am using hadoop trunk (forked from github). It supports RESTful APIs, as
> I am able to retrieve JSON objects for the RM (cluster/nodes info) and the
> HistoryServer. The only issue is with the AppMaster REST API.
>
> Regards,
> Prajakta
>
> On Fri, Jul 6, 2012 at 10:55 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
>
>> What version of hadoop are you using? It could be that the version you
>> have does not have the RESTful APIs in it yet, and the proxy is working
>> just fine.
>>
>> --Bobby Evans
>>
>> On 7/6/12 12:06 PM, "Prajakta Kalmegh" <pkalm...@gmail.com> wrote:
>>
>> >Robert, thanks for the response. If I do not provide any explicit
>> >configuration for the proxy server, do I still need to start it using
>> >'yarn start proxy server'? I am currently not doing it.
>> >
>> >Also, I am able to access the HTML page for the proxy using the
>> ><http://localhost:8088/proxy/{appid}/mapreduce/jobs> URL. (Note that this
>> >URL does not have the '/ws/v1/' part in it.) I get the HTML response when
>> >I query this URL at runtime.
>> >
>> >So I assume the proxy server must be starting fine since I am able to
>> >access this URL. I will try logging more details tomorrow from my office
>> >machine and will let you know the result.
>> >
>> >Regards,
>> >Prajakta
>> >
>> >On Fri, Jul 6, 2012 at 10:22 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
>> >
>> >> Sorry I did not respond sooner. The default behavior is to have the
>> >> proxy server run as part of the RM.
>> >> I am not really sure why it is not doing this in your case. If you set
>> >> the config yourself to be a URI that is different from that of the RM,
>> >> then you need to launch a standalone proxy server. You can do this by
>> >> running
>> >>
>> >> yarn start proxy server
>> >>
>> >> Without sitting down with you it is going to be somewhat difficult to
>> >> debug why this is happening. However, in retrospect it would be nice to
>> >> add in some extra logging to help indicate why the proxy server is not
>> >> functioning as desired. If you could file a JIRA to add in the logging,
>> >> I would be happy to provide a patch to you and we can try to debug the
>> >> issue further. Please file it under the MAPREDUCE JIRA project.
>> >>
>> >> --Bobby
>> >>
>> >> On 7/6/12 3:29 AM, "Prajakta Kalmegh" <pkalm...@gmail.com> wrote:
>> >>
>> >> >Re-posting as I haven't got a solution yet. Sorry for spamming. I won't
>> >> >be able to proceed in my code until I get a JSON response using the
>> >> >AppMaster REST URL. :(
>> >> >
>> >> >Thanks,
>> >> >Prajakta
>> >> >
>> >> >On Wed, Jul 4, 2012 at 5:55 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:
>> >> >
>> >> >> Hi Robert/Harsh
>> >> >>
>> >> >> Thanks for your reply.
>> >> >>
>> >> >> My RM is starting just fine. The problem is with the use of
>> >> >> http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce
>> >> >> to get the JSON response.
>> >> >>
>> >> >> As I said before, I had not configured the yarn.web-proxy.address
>> >> >> property in yarn-site.xml. I assumed it would use the RM's
>> >> >> yarn.resourcemanager.webapp.address property value as the default.
>> >> >> However, it gives me a '404 Page not found' error. Today I tried
>> >> >> specifying a value explicitly for the yarn.web-proxy.address property.
>> >> >>
>> >> >> On running the wordcount example, it even gives a URL
>> >> >> <http://localhost:8090/proxy/{appid}/> to track the App Master info.
>> >> >> However, I am still not able to get a JSON response.
>> >> >>
>> >> >> Also, I tried to get the data from the history server instead of at
>> >> >> runtime, using the instructions given on the page
>> >> >> <http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html>
>> >> >>
>> >> >> The HistoryServer REST response does not give me the job ids
>> >> >> corresponding to an application. It just lists all the jobs run until
>> >> >> now. By the way, the documentation does say
>> >> >>
>> >> >> ----------
>> >> >>
>> >> >> "Both of the following URI's give you the history server information,
>> >> >> from an application id identified by the appid value.
>> >> >> * http://<history server http address:port>/ws/v1/history
>> >> >> * http://<history server http address:port>/ws/v1/history/info"
>> >> >>
>> >> >> ---------
>> >> >>
>> >> >> But there is no provision to specify the application id with these
>> >> >> REST URLs.
>> >> >>
>> >> >> Any idea how I can get the Application Master REST API working, and
>> >> >> also link job ids to an application id using the HistoryServer REST
>> >> >> API?
>> >> >>
>> >> >> Any help is appreciated. Thanks in advance.
>> >> >>
>> >> >> Regards,
>> >> >> Prajakta
>> >> >>
>> >> >> On Fri, Jun 29, 2012 at 8:55 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
>> >> >>
>> >> >>> Please don't file that JIRA. The proxy server is intended to front the
>> >> >>> web server for all calls to the AM. This is so you only have to go to a
>> >> >>> single location to get to any AM's web service. The proxy server is a
>> >> >>> very simple proxy and just forwards the extra part of the path on to
>> >> >>> the AM.
>> >> >>>
>> >> >>> If you are having issues with this please include the version you are
>> >> >>> having problems with.
>> >> >>> Also please look at the logs for the RM on startup to see if there is
>> >> >>> anything there indicating why it is not starting up.
>> >> >>>
>> >> >>> --Bobby Evans
>> >> >>>
>> >> >>> On 6/28/12 9:46 AM, "Harsh J" <ha...@cloudera.com> wrote:
>> >> >>>
>> >> >>> >As far as I can tell, the MR WebApp, as the name itself indicates on
>> >> >>> >its doc page, starts only at the MR AM (which may be running at any
>> >> >>> >NM), and it starts on an ephemeral port that is logged in the AM logs,
>> >> >>> >usually as:
>> >> >>> >
>> >> >>> >INFO Web app /mapreduce started at [PORT]
>> >> >>> >
>> >> >>> >That it starts its own server with an ephemeral access point makes
>> >> >>> >sense, since each job uses its own AM and having a common location may
>> >> >>> >not work with the form of REST API documented at your link. Can you
>> >> >>> >please file a JIRA to fix the doc and remove the proxy server refs,
>> >> >>> >which are misleading?
>> >> >>> >
>> >> >>> >Do correct me if I'm wrong.
>> >> >>> >
>> >> >>> >On Thu, Jun 28, 2012 at 6:13 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:
>> >> >>> >> Hi
>> >> >>> >>
>> >> >>> >> I am trying to get the ApplicationMaster info using the
>> >> >>> >> <http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/info>
>> >> >>> >> link as described on the
>> >> >>> >> <http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html>
>> >> >>> >> page.
>> >> >>> >>
>> >> >>> >> I am able to access and retrieve JSON responses for other modules
>> >> >>> >> (ResourceManager, NodeManager and HistoryServer). However, I am getting
>> >> >>> >> 'Page not found' when I try to use my ResourceManager HTTP address to
>> >> >>> >> access the ApplicationMaster info.
>> >> >>> >> I am using
>> >> >>> >> <http://localhost:8088/proxy/{appid}/ws/v1/mapreduce/info> to retrieve
>> >> >>> >> JSON response.
>> >> >>> >>
>> >> >>> >> The instructions say "The application master should be accessed via the
>> >> >>> >> proxy. This proxy is configurable to run either on the resource manager
>> >> >>> >> or on a separate host."
>> >> >>> >>
>> >> >>> >> My yarn-default.xml contains:
>> >> >>> >> <property>
>> >> >>> >>   <description>The address for the web proxy as HOST:PORT, if this is
>> >> >>> >>   not given then the proxy will run as part of the RM</description>
>> >> >>> >>   <name>yarn.web-proxy.address</name>
>> >> >>> >>   <value/>
>> >> >>> >> </property>
>> >> >>> >>
>> >> >>> >> and I did not set a value explicitly in yarn-site.xml. Any idea how I
>> >> >>> >> can get this working? Thanks in advance.
>> >> >>> >>
>> >> >>> >> Regards,
>> >> >>> >> Prajakta
>> >> >>> >
>> >> >>> >--
>> >> >>> >Harsh J
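
For anyone following the thread, the proxied AppMaster REST URL from the curl command above can be assembled programmatically. The following is only a minimal sketch, not anything from the Hadoop codebase: the function names (job_id_for, am_job_url, fetch_json) are hypothetical, and the mapping from application id to job id is an assumption inferred from the ids in the logs above, where application_1341823967331_0001 and job_1341823967331_0001 share the same timestamp and sequence number.

```python
import json
import urllib.request


def job_id_for(app_id: str) -> str:
    """Derive a MapReduce job id from a YARN application id.

    Assumption (inferred from the ids in this thread): the two ids differ
    only in their prefix, "application_" versus "job_".
    """
    prefix = "application_"
    if not app_id.startswith(prefix):
        raise ValueError(f"not a YARN application id: {app_id!r}")
    return "job_" + app_id[len(prefix):]


def am_job_url(proxy_address: str, app_id: str) -> str:
    """Build the AppMaster REST URL for a job, routed through the web proxy.

    proxy_address is HOST:PORT, i.e. the value of yarn.web-proxy.address,
    or the RM web address when the proxy runs as part of the RM.
    """
    return (
        f"http://{proxy_address}/proxy/{app_id}"
        f"/ws/v1/mapreduce/jobs/{job_id_for(app_id)}"
    )


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """GET the URL with an Accept: application/json header and parse the body."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_json(am_job_url("localhost:8090", "application_1341823967331_0001")) would issue the same request as the curl command. As discussed in the thread, the proxy forwards to the AM, so this can only succeed while the application's AM is still running.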