Re: Maven Cloudera Configuration problem
There is no way to specify conf settings in the Maven pom.xml. Instead, you can build your project per profile and put the properties into a property file. For setting the conf properties, it is better to create a shell script to run your jar, in which you provide the conf parameters.

Raj K Singh
http://www.rajkrrsingh.blogspot.com
Mobile Tel: +91 (0)9899821370

On Tue, Aug 13, 2013 at 4:49 PM, Pavan Sudheendra wrote:
> Hi,
> I'm currently using Maven to build the jars necessary for my
> map-reduce program to run, and it works for a single-node cluster.
>
> For a multi-node cluster, how do I specify my map-reduce program to
> pick up the cluster settings instead of the localhost settings?
> I don't know how to specify this using Maven when building my jar.
>
> I'm using the CDH distribution, by the way.
> --
> Regards-
> Pavan
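A minimal sketch of such a launcher script. The jar name, main class, and all host:port values below are hypothetical placeholders; `-D` and `-conf` are the standard GenericOptionsParser flags, so they only take effect if the driver runs through ToolRunner.

```shell
#!/bin/sh
# Hypothetical launcher for a MapReduce jar built by Maven.
# Everything below (jar, class, hosts, ports, conf path) is a placeholder.
JAR=target/my-mr-job.jar
MAIN=com.example.MyDriver

# Compose the command; -D/-conf are parsed by GenericOptionsParser
# when the driver uses ToolRunner.run(...).
CMD="hadoop jar $JAR $MAIN \
  -D mapred.job.tracker=jobtracker.example.com:8021 \
  -D fs.default.name=hdfs://namenode.example.com:8020 \
  -conf cluster-conf/mapred-site.xml"

# Dry-run: print the command; replace the echo with `eval $CMD` to submit.
echo "$CMD"
```

Keeping the cluster addresses in the script (or in the `-conf` file) means the jar itself stays environment-neutral, which is exactly the separation Raj is describing.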
Re: Maven Cloudera Configuration problem
Here are the log details when I run the jar file:

08:10:29,738 INFO ZooKeeper:438 - Initiating client connection, connectString=localhost:2181 sessionTimeout=18 watcher=hconnection
08:10:29,777 INFO RecoverableZooKeeper:104 - The identifier of this process is 12...@xx--xxx-xx.eu-west-1.compute.internal
08:10:29,784 INFO ClientCnxn:966 - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
08:10:29,796 INFO ClientCnxn:849 - Socket connection established to localhost/127.0.0.1:2181, initiating session
08:10:29,804 INFO ClientCnxn:1207 - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13ff1cff71b5503, negotiated timeout = 6
08:10:29,905 WARN Configuration:824 - hadoop.native.lib is deprecated. Instead, use io.native.lib.available

Is it utilizing the cluster? Sorry for a noob question.

On Wed, Aug 14, 2013 at 5:24 AM, Suresh Srinivas wrote:
> Folks, can you please take this thread to CDH related mailing list?
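The `connectString=localhost:2181` lines in the log above suggest the client is not reading a cluster-side config at all. One rough way to see which ZooKeeper quorum the client-side hbase-site.xml actually names (the conf path is an assumption, and the grep is deliberately crude):

```shell
# Rough check: which quorum does the client-side hbase-site.xml point at?
# /etc/hbase/conf is an assumed default location; override via HBASE_CONF_DIR.
SITE=${HBASE_CONF_DIR:-/etc/hbase/conf}/hbase-site.xml
if grep -q 'hbase.zookeeper.quorum' "$SITE" 2>/dev/null; then
  MSG=$(grep -A1 'hbase.zookeeper.quorum' "$SITE")
else
  MSG="no quorum configured in $SITE - HBase clients fall back to localhost"
fi
echo "$MSG"
```

If this prints localhost (or nothing is configured), the job is talking to a local ZooKeeper, not the cluster's.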
Re: Maven Cloudera Configuration problem
Folks, can you please take this thread to CDH related mailing list?

--
http://hortonworks.com/download/
Re: Maven Cloudera Configuration problem
In our Cloudera 4.2.0 cluster, I log in with the *admin* user (do you have the appropriate permissions, by the way?). Then I click on any one of the three services (hbase, mapred, or hdfs; excluding zookeeper) in the top-left menu. For each of these, I can open the *Configuration* tab in the top-middle section of the page. Once the configuration page opens, I click the Actions menu at the top-right. One of its sub-menus is *Download Client Configuration*, which, as the name says, downloads the config files (a zip file, to be exact) to be used on client machines.

Regards,
Shahab

On Tue, Aug 13, 2013 at 6:07 PM, Brad Cox wrote:
> That link got my hopes up. But Cloudera Manager (what I'm running, on
> CDH4) does not offer an "Export Client Config" option. What am I missing?
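Once downloaded, the zip can be unpacked and wired in via HADOOP_CONF_DIR, roughly like this (the paths are assumptions, not values from this thread):

```shell
# Sketch: deploy the client config downloaded from Cloudera Manager.
# The directory and zip name are hypothetical examples.
CLIENT_CONF="$HOME/cluster-client-conf"
mkdir -p "$CLIENT_CONF"
# In practice, unzip the downloaded archive here, e.g.:
#   unzip mapreduce1-clientconfig.zip -d "$CLIENT_CONF"
export HADOOP_CONF_DIR="$CLIENT_CONF"
echo "clients launched from this shell now read config from $HADOOP_CONF_DIR"
```

Any `hadoop jar ...` invocation from the same shell should then pick up the cluster's *-site.xml files instead of the local defaults.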
Re: Maven Cloudera Configuration problem
That link got my hopes up. But Cloudera Manager (what I'm running, on CDH4) does not offer an "Export Client Config" option. What am I missing?

On Aug 13, 2013, at 4:04 PM, Shahab Yunus wrote:
> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
> property does not point to 'local' and instead points to your job-tracker
> host and port.

Dr. Brad J. Cox  Cell: 703-594-1883  Blog: http://bradjcox.blogspot.com  http://virtualschool.edu
Re: Maven Cloudera Configuration problem
You should not use the LocalJobRunner. Make sure that the mapred.job.tracker property does not point to 'local' and instead points to your job-tracker host and port.

*But before that*, as Sandy said, your client machine (from where you will be kicking off your jobs and apps) should be using config files which have your cluster's configuration. This is the alternative to follow if you don't want to bundle the configs for your cluster in the application itself (either in Java code or in separate copies of the relevant config files). This is what I was suggesting early on, just to get you started using your cluster instead of local mode.

By the way, have you seen the following link? It gives you step-by-step information about how to generate config files specific to your cluster, and then how to place them and use them from any machine you want to designate as your client. Running your jobs from one of the datanodes without proper config would not work.

https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration

Regards,
Shahab

On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra wrote:
> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
> job on one datanode.
>
> What changes should I make so that my application would take advantage
> of the cluster as a whole?
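Concretely, the client-side mapred-site.xml should end up with an entry along these lines (the address is a placeholder, not a value from this thread):

```xml
<!-- client-side mapred-site.xml: point jobs at the cluster's JobTracker.
     A value of 'local' here is what selects the LocalJobRunner. -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:8021</value>
</property>
```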
Re: Maven Cloudera Configuration problem
Yes Sandy, I'm referring to the LocalJobRunner. I'm actually running the job on one datanode.

What changes should I make so that my application would take advantage of the cluster as a whole?

On Tue, Aug 13, 2013 at 10:33 PM, wrote:
> Nothing in your pom.xml should affect the configurations your job runs with.
>
> Are you running your job from a node on the cluster? When you say localhost
> configurations, do you mean it's using the LocalJobRunner?

--
Regards-
Pavan
Re: Maven Cloudera Configuration problem
Nothing in your pom.xml should affect the configurations your job runs with.

Are you running your job from a node on the cluster? When you say localhost configurations, do you mean it's using the LocalJobRunner?

-sandy

(iphnoe tpying)

On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra wrote:
> When I actually run the job on the multi-node cluster, the logs show it
> uses localhost configurations, which I don't want.
>
> I just have a pom.xml which lists all the dependencies: standard
> hadoop, standard hbase, standard zookeeper, etc. Should I remove these
> dependencies?
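A rough way to answer Sandy's LocalJobRunner question from the client machine is to check whether the effective mapred-site.xml still says 'local' (the conf path is an assumption, and the grep is deliberately crude):

```shell
# Does the client's mapred-site.xml still select the LocalJobRunner?
# /etc/hadoop/conf is an assumed default; override via HADOOP_CONF_DIR.
SITE=${HADOOP_CONF_DIR:-/etc/hadoop/conf}/mapred-site.xml
if grep -q '<value>local</value>' "$SITE" 2>/dev/null; then
  VERDICT="mapred.job.tracker looks 'local' - jobs run in-process, not on the cluster"
else
  VERDICT="no explicit 'local' tracker in $SITE (the file may also be absent)"
fi
echo "$VERDICT"
```

An absent file is just as telling as an explicit 'local': with no client config, the job falls back to the local defaults Pavan is seeing.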
Re: Maven Cloudera Configuration problem
I've been stuck on the same question lately, so don't take this as definitive; it's just my best guess at what's required.

Using Maven as your Hadoop source is going to give you a "vanilla" Hadoop, one that runs on localhost. You need one that you've customized to point to your remote cluster, and you can't get that via Maven. So my *GUESS* is you need to do a plain local install of Hadoop and point HADOOP_HOME at it. Customize as required, then convince Eclipse to use that instead of going through Maven (i.e. remove Hadoop from the dependency list).

Everyone: is this on the right path? Does anyone know of exact instructions?

On Aug 13, 2013, at 12:07 PM, Pavan Sudheendra wrote:
> When I actually run the job on the multi-node cluster, the logs show it
> uses localhost configurations, which I don't want.

Dr. Brad J. Cox  Cell: 703-594-1883  Blog: http://bradjcox.blogspot.com  http://virtualschool.edu
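Brad's suggestion amounts to something like the following shell setup (every path here is a hypothetical example, including the Hadoop version):

```shell
# Sketch of a local Hadoop install whose conf/ points at the remote cluster,
# used instead of the Maven-provided "vanilla" one. Paths are placeholders.
export HADOOP_HOME="$HOME/hadoop-1.0.4"        # hypothetical unpacked tarball
export HADOOP_CONF_DIR="$HADOOP_HOME/conf"     # holds your cluster's *-site.xml
export PATH="$HADOOP_HOME/bin:$PATH"
echo "using hadoop from: $HADOOP_HOME with conf: $HADOOP_CONF_DIR"
```

Note this only changes which configuration the launching shell sees; the pom.xml dependencies are still needed to compile the jar, they just shouldn't be expected to carry cluster settings.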
Re: Maven Cloudera Configuration problem
When I actually run the job on the multi-node cluster, the logs show it uses localhost configurations, which I don't want.

I just have a pom.xml which lists all the dependencies: standard hadoop, standard hbase, standard zookeeper, etc. Should I remove these dependencies?

I want the cluster settings to apply in my map-reduce application. So this is where I'm stuck.

On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra wrote:
> Hi Shabab and Sandy,
> The thing is, we have a 6-node Cloudera cluster running. For
> development purposes, I was building a map-reduce application on a
> single-node Apache-distribution Hadoop with Maven.

--
Regards-
Pavan
Re: Maven Cloudera Configuration problem
Hi Shabab and Sandy,

The thing is we have a 6 node cloudera cluster running. For development
purposes, I was building a map-reduce application on a single node
apache distribution hadoop with maven.

To be frank, I don't know how to deploy this application on a multi
node cloudera cluster. I am fairly well versed with a multi node Apache
Hadoop distribution. So, how can I go forward?

Thanks for all the help :)

On Tue, Aug 13, 2013 at 9:22 PM, wrote:
> Hi Pavan,
>
> Configuration properties generally aren't included in the jar itself unless
> you explicitly set them in your java code. Rather they're picked up from the
> mapred-site.xml file located in the Hadoop configuration directory on the
> host you're running your job from.
>
> Is there an issue you're coming up against when trying to run your job on a
> cluster?
>
> -Sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra wrote:
>
>> Hi,
>> I'm currently using maven to build the jars necessary for my
>> map-reduce program to run and it works for a single node cluster..
>>
>> For a multi node cluster, how do i specify my map-reduce program to
>> ingest the cluster settings instead of localhost settings?
>> I don't know how to specify this using maven to build my jar.
>>
>> I'm using the cdh distribution by the way..
>> --
>> Regards-
>> Pavan

--
Regards-
Pavan
Re: Maven Cloudera Configuration problem
Hi Pavan,

Configuration properties generally aren't included in the jar itself
unless you explicitly set them in your Java code. Rather, they're picked
up from the mapred-site.xml file located in the Hadoop configuration
directory on the host you're running your job from.

Is there an issue you're coming up against when trying to run your job
on a cluster?

-Sandy

(iphnoe tpying)

On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra wrote:

> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
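[Editor's note] Sandy's point can be sketched as a tiny launch script: it is the client-side configuration directory, not anything Maven bakes into the jar, that decides which cluster the job talks to. The config path `/etc/hadoop/conf` is the usual CDH location but is an assumption here, as are the jar name `myapp.jar` and the driver class `com.example.MyJob`.

```shell
#!/bin/sh
# Sketch: point the client at the cluster's config files instead of
# localhost defaults. /etc/hadoop/conf is where CDH typically deploys
# client configs; this path and the jar/class names are assumptions.
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

# The Maven-built jar carries only code; 'hadoop jar' puts the files
# from HADOOP_CONF_DIR on the classpath at launch time, so the job
# picks up the cluster's fs.defaultFS and mapred.job.tracker.
CMD="hadoop jar myapp.jar com.example.MyJob /input /output"
echo "$CMD"
```

Running this on a cluster node (or any machine carrying the cluster's client configs) is usually all it takes; no pom.xml change is involved.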
Re: Maven Cloudera Configuration problem
You need to configure your namenode and jobtracker information in the
configuration files within your application. Only set the relevant
properties in the copy of the files that you are bundling in your job.
For the rest, the default values would be used from the default
configuration files (core-default.xml, mapred-default.xml) already
bundled in the lib/jar provided by cloudera/hadoop. The assumption is
that this is for MRv1. Anyway, you should go through this for details:
http://hadoop.apache.org/docs/stable/cluster_setup.html

*core-site.xml* (the security ones are optional; if you are not using
anything special you can remove them and rely on the defaults, which is
also 'simple'):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://server:8020</value>
</property>
<property>
  <name>hadoop.security.authentication</name>
  <value>simple</value>
</property>
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>DEFAULT</value>
</property>

*mapred-site.xml*:

<property>
  <name>mapred.job.tracker</name>
  <value>http://server:</value>
</property>

Regards,
Shahab

On Tue, Aug 13, 2013 at 7:19 AM, Pavan Sudheendra wrote:
> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
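[Editor's note] An alternative to bundling core-site.xml/mapred-site.xml in the jar, sketched below, is to pass the same two properties at launch time with Hadoop's generic `-D property=value` options, which are honored when the driver runs through ToolRunner/GenericOptionsParser. The `server` host, the MRv1 jobtracker port 8021, and the jar/class names are placeholder assumptions.

```shell
#!/bin/sh
# Sketch: supply the cluster endpoints on the command line instead of
# bundling config files. Hosts, port, and jar/class names are
# placeholders; 8021 is the conventional MRv1 jobtracker port.
NAMENODE="hdfs://server:8020"
JOBTRACKER="server:8021"

# Generic options must come before the job's own arguments; they are
# parsed only if the driver uses ToolRunner/GenericOptionsParser.
CMD="hadoop jar myapp.jar com.example.MyJob -D fs.defaultFS=$NAMENODE -D mapred.job.tracker=$JOBTRACKER /input /output"
echo "$CMD"
```

This keeps the jar cluster-agnostic: the same Maven artifact can be pointed at the single-node setup or the 6-node cluster by changing the launch command alone.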
Maven Cloudera Configuration problem
Hi,

I'm currently using maven to build the jars necessary for my map-reduce
program to run, and it works for a single node cluster.

For a multi node cluster, how do I specify my map-reduce program to
ingest the cluster settings instead of localhost settings? I don't know
how to specify this using maven to build my jar.

I'm using the cdh distribution by the way.

--
Regards-
Pavan