Determine what is meant by "disaster recovery": what are the scenarios, and what data must survive them?
Architect to the business need, not the buzzwords.
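For example, once those scenarios are pinned down, one common baseline for an HDFS-backed lake is snapshots for point-in-time recovery plus a scheduled distcp mirror for site loss. A rough sketch only -- the hostnames and paths here are hypothetical, not from this thread:

{code}
# One-time: allow snapshots on the lake root (requires HDFS admin).
hdfs dfsadmin -allowSnapshot /data/lake

# Point-in-time recovery for bad loads or accidental deletes.
hdfs dfs -createSnapshot /data/lake daily-$(date +%F)

# Site-loss protection: mirror to a second cluster on a schedule.
hadoop distcp -update -delete \
    hdfs://prod-nn.example.com:8020/data/lake \
    hdfs://dr-nn.example.com:8020/data/lake
{code}

Whether anything like that is sufficient depends entirely on the RPO/RTO the business actually needs, which is the point above.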
*"Anyone who isn't embarrassed by who they were last year probably isn't learning enough." - Alain de Botton*

*Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872*

On Wed, Jul 26, 2017 at 9:11 PM, Atul Rajan <atul.raja...@icloud.com> wrote:

> Hello all,
>
> We are planning to implement a data lake for our financial data. How can
> we achieve disaster recovery for our data lake?
>
> Initially all the data marts will be pushed to the data lake, but we want
> something for our data recovery. Please suggest some ideas.
>
> Thanks and Regards
> Atul Rajan
>
> -Sent from my iPhone
>
> On 12-Jan-2017, at 4:43 AM, Akash Mishra <akash.mishr...@gmail.com> wrote:
>
> You are getting an NPE in
> *org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.a.getName*,
> which is not in the Hadoop codebase. I can see you are using another
> scheduler implementation,
> *com.pepperdata.supervisor.scheduler.PepperdataSupervisorYarnFair*, so
> you can check *SourceFile:204* for more details.
>
> My guess is that you need to set some Name parameter which is requested
> only at DEBUG level.
>
> Thanks,
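If the NPE really only fires at DEBUG, as the stack trace further down suggests (the getName() call appears to sit behind debug-only logging in org.apache.hadoop.service.CompositeService.addService), one possible stopgap, offered only as a sketch, is to pin that one class back to INFO in the ResourceManager's log4j.properties while debugging everything else:

{code}
# Sketch of a stopgap in the RM's log4j.properties: keep the daemon at
# DEBUG, but silence the debug-only getName() call in
# CompositeService.addService() that appears to trigger the NPE.
log4j.logger.org.apache.hadoop.service.CompositeService=INFO
{code}

The real fix presumably belongs in the Pepperdata scheduler wrapper itself.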
>
> On Wed, Jan 11, 2017 at 10:59 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>
>> OK, I would attach, but I think there might be an aversion to
>> attachments, so I'll paste inline. Hopefully it's not too confusing.
>>
>> $ cat fair-scheduler.xml
>>
>> <?xml version="1.0"?>
>>
>> <!--
>>   This is a sample configuration file for the Fair Scheduler. For details
>>   on the options, please refer to the fair scheduler documentation at
>>   http://hadoop.apache.org/core/docs/r0.21.0/fair_scheduler.html.
>>
>>   To create your own configuration, copy this file to conf/fair-scheduler.xml
>>   and add the following property in mapred-site.xml to point Hadoop to the
>>   file, replacing [HADOOP_HOME] with the path to your installation directory:
>>
>>   <property>
>>     <name>mapred.fairscheduler.allocation.file</name>
>>     <value>[HADOOP_HOME]/conf/fair-scheduler.xml</value>
>>   </property>
>>
>>   Note that all the parameters in the configuration file below are optional,
>>   including the parameters inside <pool> and <user> elements. It is only
>>   necessary to set the ones you want to differ from the defaults.
>> -->
>>
>> <!-- https://hadoop.apache.org/docs/r1.2.1/fair_scheduler.html -->
>>
>> <allocations>
>>
>>   <!-- NOTE: ** Preemption IS NOT turned on! ** -->
>>
>>   <!-- Preemption timeout for jobs below their fair share, in seconds.
>>        If a job is below half its fair share for this amount of time, it
>>        is allowed to kill tasks from other jobs to go up to its fair share.
>>        Requires mapred.fairscheduler.preemption to be true in mapred-site.xml. -->
>>   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
>>
>>   <!-- Default min share preemption timeout for pools where it is not
>>        explicitly configured, in seconds. Requires mapred.fairscheduler.preemption
>>        to be set to true in your mapred-site.xml. -->
>>   <defaultMinSharePreemptionTimeout>600</defaultMinSharePreemptionTimeout>
>>
>>   <!-- Default running job limit for pools where it is not explicitly set. -->
>>   <queueMaxJobsDefault>20</queueMaxJobsDefault>
>>
>>   <!-- Default running job limit for users where it is not explicitly set. -->
>>   <userMaxJobsDefault>10</userMaxJobsDefault>
>>
>>   <!-- QUEUES:
>>        dwr.interactive   : 10 at once
>>        dwr.batch_sql     : 15 at once
>>        dwr.batch_hdfs    :  5 at once (distcp, sqoop, hdfs -put, anything besides 'sql')
>>        dwr.qa            :  3 at once
>>        dwr.truck_lane    :  1 at once
>>
>>        cad.interactive   :  5 at once
>>        cad.batch         : 10 at once
>>
>>        comms.interactive :  5 at once
>>        comms.batch       :  3 at once
>>
>>        default           :  2 at once (to discourage its use)
>>   -->
>>
>>   <!-- queue placement -->
>>   <queuePlacementPolicy>
>>     <rule name="specified" />
>>     <rule name="default" />
>>   </queuePlacementPolicy>
>>
>>   <!-- footprint -->
>>   <queue name="footprint">
>>     <schedulingPolicy>fair</schedulingPolicy> <!-- can be fifo too -->
>>
>>     <maxRunningApps>4</maxRunningApps>
>>     <aclSubmitApps>*</aclSubmitApps>
>>
>>     <minMaps>10</minMaps>
>>     <minReduces>5</minReduces>
>>     <userMaxJobsDefault>50</userMaxJobsDefault>
>>
>>     <maxMaps>200</maxMaps>
>>     <maxReduces>200</maxReduces>
>>     <minResources>20000 mb, 10 vcores</minResources>
>>     <maxResources>500000 mb, 175 vcores</maxResources>
>>
>>     <queue name="dev">
>>       <maxMaps>200</maxMaps>
>>       <maxReduces>200</maxReduces>
>>       <minResources>20000 mb, 10 vcores</minResources>
>>       <maxResources>500000 mb, 175 vcores</maxResources>
>>     </queue>
>>
>>     <queue name="stage">
>>       <maxMaps>200</maxMaps>
>>       <maxReduces>200</maxReduces>
>>       <minResources>20000 mb, 10 vcores</minResources>
>>       <maxResources>500000 mb, 175 vcores</maxResources>
>>     </queue>
>>   </queue>
>>
>>   <!-- comms -->
>>   <queue name="comms">
>>     <schedulingPolicy>fair</schedulingPolicy> <!-- can be fifo too -->
>>
>>     <queue name="interactive">
>>       <maxRunningApps>5</maxRunningApps>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>
>>     <queue name="batch">
>>       <maxRunningApps>10</maxRunningApps>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>   </queue>
>>
>>   <!-- cad -->
>>   <queue name="cad">
>>     <schedulingPolicy>fair</schedulingPolicy> <!-- can be fifo too -->
>>
>>     <queue name="interactive">
>>       <maxRunningApps>5</maxRunningApps>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>
>>     <queue name="batch">
>>       <maxRunningApps>10</maxRunningApps>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>   </queue>
>>
>>   <!-- dwr -->
>>   <queue name="dwr">
>>     <schedulingPolicy>fair</schedulingPolicy> <!-- can be fifo too -->
>>     <minMaps>10</minMaps>
>>     <minReduces>5</minReduces>
>>     <userMaxJobsDefault>50</userMaxJobsDefault>
>>
>>     <maxMaps>200</maxMaps>
>>     <maxReduces>200</maxReduces>
>>     <minResources>20000 mb, 10 vcores</minResources>
>>     <maxResources>500000 mb, 175 vcores</maxResources>
>>
>>     <!-- INTERACTIVE: 5 at once -->
>>     <queue name="interactive">
>>       <weight>2.0</weight>
>>       <maxRunningApps>5</maxRunningApps>
>>
>>       <maxMaps>200</maxMaps>
>>       <maxReduces>200</maxReduces>
>>       <minResources>20000 mb, 10 vcores</minResources>
>>       <maxResources>500000 mb, 175 vcores</maxResources>
>>
>>       <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>       <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
>>
>>       <!-- per user, but given everything is dwr (for now) it's not helpful -->
>>       <userMaxAppsDefault>5</userMaxAppsDefault>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>
>>     <!-- BATCH: 15 at once -->
>>     <queue name="batch_sql">
>>       <weight>1.5</weight>
>>       <maxRunningApps>15</maxRunningApps>
>>
>>       <maxMaps>200</maxMaps>
>>       <maxReduces>200</maxReduces>
>>       <minResources>20000 mb, 10 vcores</minResources>
>>       <maxResources>500000 mb, 175 vcores</maxResources>
>>
>>       <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>       <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>>
>>       <userMaxAppsDefault>50</userMaxAppsDefault>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>
>>     <!-- sqoop, distcp, hdfs-put type jobs here. 3 at once -->
>>     <queue name="batch_hdfs">
>>       <weight>1.0</weight>
>>       <maxRunningApps>3</maxRunningApps>
>>
>>       <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>       <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>>       <userMaxAppsDefault>50</userMaxAppsDefault>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>
>>     <!-- QA. 3 at once -->
>>     <queue name="qa">
>>       <weight>1.0</weight>
>>       <maxRunningApps>100</maxRunningApps>
>>
>>       <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>       <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>>       <aclSubmitApps>*</aclSubmitApps>
>>       <userMaxAppsDefault>50</userMaxAppsDefault>
>>     </queue>
>>
>>     <!-- big, unruly jobs -->
>>     <queue name="truck_lane">
>>       <weight>0.75</weight>
>>       <maxRunningApps>1</maxRunningApps>
>>       <minMaps>5</minMaps>
>>       <minReduces>5</minReduces>
>>
>>       <!-- let's try without static values and see how the "weight" works -->
>>       <maxMaps>192</maxMaps>
>>       <maxReduces>192</maxReduces>
>>       <minResources>20000 mb, 10 vcores</minResources>
>>       <maxResources>500000 mb, 200 vcores</maxResources>
>>
>>       <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>       <!--
>>       <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>>       <aclSubmitApps>*</aclSubmitApps>
>>       <userMaxAppsDefault>50</userMaxAppsDefault>
>>       -->
>>     </queue>
>>   </queue>
>>
>>   <!-- DEFAULT: 2 at once -->
>>   <queue name="default">
>>     <maxRunningApps>2</maxRunningApps>
>>
>>     <maxMaps>40</maxMaps>
>>     <maxReduces>40</maxReduces>
>>     <minResources>20000 mb, 10 vcores</minResources>
>>     <maxResources>20000 mb, 10 vcores</maxResources>
>>
>>     <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>     <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
>>     <userMaxAppsDefault>5</userMaxAppsDefault>
>>     <aclSubmitApps>*</aclSubmitApps>
>>   </queue>
>>
>> </allocations>
>>
>> <!-- some other stuff
>>
>>   <minResources>10000 mb, 0 vcores</minResources>
>>   <maxResources>90000 mb, 0 vcores</maxResources>
>>
>>   <minMaps>10</minMaps>
>>   <minReduces>5</minReduces>
>> -->
>>
>> <!-- enabling
>>   * Bringing the queues into effect:
>>     Once the required parameters are defined in the fair-scheduler.xml file,
>>     run this command to bring the changes into effect:
>>       yarn rmadmin -refreshQueues
>> -->
>>
>> <!-- verifying
>>   Once the command runs properly, verify that the queues are set up using
>>   either of two options:
>>
>>   1) hadoop queue -list
>>   or
>>   2) open the YARN ResourceManager GUI at http://<ResourceManager-hostname>:8088
>>      and click Scheduler.
>> -->
>>
>> <!-- notes
>>   [fail_user@phd11-nn ~]$ id
>>   uid=507(fail_user) gid=507(failgroup) groups=507(failgroup)
>>   [fail_user@phd11-nn ~]$ hadoop queue -showacls
>> -->
>>
>> <!-- submit
>>   To submit an application, use the parameter
>>   -Dmapred.job.queue.name=<queue-name> or -Dmapred.job.queuename=<queue-name>
>> -->
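Pulling the enabling/verifying/submit comments above together into one runnable sequence (the example jar and the input/output paths are placeholders, and the queue name is just one of those defined above):

{code}
# Reload the queue definitions without restarting the ResourceManager.
yarn rmadmin -refreshQueues

# Verify the queues took effect.
hadoop queue -list

# Submit into a specific queue, e.g. dwr.batch_sql (jar and paths are
# placeholders for illustration).
hadoop jar hadoop-mapreduce-examples.jar wordcount \
    -Dmapred.job.queue.name=dwr.batch_sql \
    /tmp/wc-in /tmp/wc-out
{code}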
>>
>> *** yarn-site.xml
>>
>> $ cat yarn-site.xml
>>
>> <?xml version="1.0"?>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <configuration>
>>   <!-- Autogenerated yarn params from puppet yaml hash yarn_site_parameters__xml -->
>>   <property>
>>     <name>yarn.resourcemanager.hostname</name>
>>     <value>FOO.sv2.trulia.com</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.aux-services</name>
>>     <value>mapreduce_shuffle</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
>>     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.local-dirs</name>
>>     <value>/storage0/hadoop/yarn/local,/storage1/hadoop/yarn/local,/storage2/hadoop/yarn/local,/storage3/hadoop/yarn/local,/storage4/hadoop/yarn/local,/storage5/hadoop/yarn/local</value>
>>   </property>
>>   <property>
>>     <name>yarn.resourcemanager.scheduler.class</name>
>>     <value>com.pepperdata.supervisor.scheduler.PepperdataSupervisorYarnFair</value>
>>   </property>
>>   <property>
>>     <name>yarn.application.classpath</name>
>>     <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,$TEZ_HOME/*,$TEZ_HOME/lib/*</value>
>>   </property>
>>   <property>
>>     <name>pepperdata.license.key.specification</name>
>>     <value>data://removed</value>
>>   </property>
>>   <property>
>>     <name>pepperdata.license.key.comments</name>
>>     <value>License Type: PRODUCTION Expiration Date (UTC): 2017/02/01 Company Name: Trulia, LLC Cluster Name: trulia-production Number of Nodes: 150 Contact Person Name: Deep Varma Contact Person Email: dva...@trulia.com</value>
>>   </property>
>>   <property>
>>     <name>yarn.timeline-service.hostname</name>
>>     <value>FOO.sv2.trulia.com</value>
>>   </property>
>>   <property>
>>     <name>yarn.timeline-service.enabled</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.timeline-service.webapp.address</name>
>>     <value>FOO.sv2.trulia.com:8188</value>
>>   </property>
>>   <property>
>>     <name>yarn.timeline-service.http-cross-origin.enabled</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.timeline-service.ttl-enable</name>
>>     <value>false</value>
>>   </property>
>>
>>   <!--
>>   <property>
>>     <name>yarn.timeline-service.store-class</name>
>>     <value>org.apache.hadoop.yarn.server.timeline.RollingLevelDbTimelineStore</value>
>>   </property>
>>   -->
>>   <property>
>>     <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.fair.user-as-default-queue</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.fair.preemption</name>
>>     <value>false</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.fair.sizebasedweight</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.minimum-allocation-mb</name>
>>     <value>2048</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.maximum-allocation-mb</name>
>>     <value>8192</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
>>     <value>98.5</value>
>>   </property>
>>   <property>
>>     <name>yarn.log-aggregation.retain-seconds</name>
>>     <value>604800</value>
>>   </property>
>>   <property>
>>     <name>yarn.log-aggregation-enable</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.log-dirs</name>
>>     <value>${yarn.log.dir}/userlogs</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.remote-app-log-dir</name>
>>     <value>/app-logs</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.delete.debug-delay-sec</name>
>>     <value>600</value>
>>   </property>
>>   <property>
>>     <name>yarn.log.server.url</name>
>>     <value>http://FOO.sv2.trulia.com:19888/jobhistory/logs</value>
>>   </property>
>>
>> </configuration>
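Given that the stack trace below points into the Pepperdata wrapper configured above, a quick way to isolate the fault (a diagnostic sketch, not a recommendation to drop the product) would be to temporarily point the scheduler class back at the stock Fair Scheduler and retry the DEBUG-level startup:

{code}
<!-- Temporary diagnostic in yarn-site.xml: stock Fair Scheduler instead
     of the Pepperdata wrapper. If the RM then starts cleanly at DEBUG,
     the NPE is specific to the wrapper's getName() and worth raising
     with Pepperdata support. -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
{code}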
>>
>> On Wed, Jan 11, 2017 at 2:27 PM, Akash Mishra <akash.mishr...@gmail.com> wrote:
>>
>>> Please post your fair-scheduler.xml file and yarn-site.xml.
>>>
>>> On Wed, Jan 11, 2017 at 9:14 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>>>
>>>> Hey guys,
>>>> I'm running the RM with the above options (version 2.6.1) and get an
>>>> NPE upon startup.
>>>>
>>>> {code}
>>>> 17/01/11 12:44:45 FATAL resourcemanager.ResourceManager: Error starting ResourceManager
>>>> java.lang.NullPointerException
>>>>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.a.getName(SourceFile:204)
>>>>         at org.apache.hadoop.service.CompositeService.addService(CompositeService.java:73)
>>>>         at org.apache.hadoop.service.CompositeService.addIfService(CompositeService.java:88)
>>>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:490)
>>>>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:993)
>>>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:255)
>>>>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1214)
>>>> 17/01/11 12:44:45 INFO resourcemanager.ResourceManager: SHUTDOWN_MSG:
>>>> {code}
>>>>
>>>> The fair-scheduler.xml file is fine and works with INFO-level logging,
>>>> so I'm pretty sure there's nothing "wrong" with it. With DEBUG level,
>>>> it makes this Java call and barfs.
>>>>
>>>> Any ideas how to fix this?
>>>>
>>>> thanks,
>>>> Stephen.
>>>
>>>
>>> --
>>> Regards,
>>> Akash Mishra.
>>>
>>> "It's not our abilities that make us, but our decisions." --Albus Dumbledore
>
>
> --
> Regards,
> Akash Mishra.
>
> "It's not our abilities that make us, but our decisions." --Albus Dumbledore