Determine what is meant by "disaster recovery": what are the scenarios, and what data must survive them?
Architect to the business need, not the buzzwords.
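For example, once those scenarios are pinned down, one common baseline for an HDFS-backed lake is snapshots for point-in-time recovery plus a scheduled distcp mirror for site loss. A rough sketch only -- the hostnames and paths here are hypothetical, not from this thread:

{code}
# One-time: allow snapshots on the lake root (requires HDFS admin).
hdfs dfsadmin -allowSnapshot /data/lake

# Point-in-time recovery for bad loads or accidental deletes.
hdfs dfs -createSnapshot /data/lake daily-$(date +%F)

# Site-loss protection: mirror to a second cluster on a schedule.
hadoop distcp -update -delete \
    hdfs://prod-nn.example.com:8020/data/lake \
    hdfs://dr-nn.example.com:8020/data/lake
{code}

Whether anything like that is sufficient depends entirely on the RPO/RTO the business actually needs, which is the point above.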
*"Anyone who isn't embarrassed by who they were last year probably isn't learning enough." - Alain de Botton*

*Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872*

On Wed, Jul 26, 2017 at 9:11 PM, Atul Rajan <atul.raja...@icloud.com> wrote:

> Hello all,
>
> We are planning to implement a data lake for our financial data. How can
> we achieve disaster recovery for our data lake?
>
> Initially all the data marts will be pushed to the data lake, but we want
> something for our data recovery. Please suggest some ideas.
>
> Thanks and Regards
> Atul Rajan
>
> -Sent from my iPhone
>
> On 12-Jan-2017, at 4:43 AM, Akash Mishra <akash.mishr...@gmail.com> wrote:
>
> You are getting an NPE in
> *org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.a.getName*,
> which is not in the Hadoop codebase. I can see you are using another
> scheduler implementation,
> *com.pepperdata.supervisor.scheduler.PepperdataSupervisorYarnFair*, so
> you can check *SourceFile:204* for more details.
>
> My guess is that you need to set some Name parameter which is requested
> only at DEBUG level.
>
> Thanks,
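If the NPE really only fires at DEBUG, as the stack trace further down suggests (the getName() call appears to sit behind debug-only logging in org.apache.hadoop.service.CompositeService.addService), one possible stopgap, offered only as a sketch, is to pin that one class back to INFO in the ResourceManager's log4j.properties while debugging everything else:

{code}
# Sketch of a stopgap in the RM's log4j.properties: keep the daemon at
# DEBUG, but silence the debug-only getName() call in
# CompositeService.addService() that appears to trigger the NPE.
log4j.logger.org.apache.hadoop.service.CompositeService=INFO
{code}

The real fix presumably belongs in the Pepperdata scheduler wrapper itself.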
>
> On Wed, Jan 11, 2017 at 10:59 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>
>> OK, I would attach, but I think there might be an aversion to
>> attachments, so I'll paste inline. Hopefully it's not too confusing.
>>
>> $ cat fair-scheduler.xml
>>
>> <?xml version="1.0"?>
>>
>> <!--
>>   This is a sample configuration file for the Fair Scheduler. For details
>>   on the options, please refer to the fair scheduler documentation at
>>   http://hadoop.apache.org/core/docs/r0.21.0/fair_scheduler.html.
>>
>>   To create your own configuration, copy this file to conf/fair-scheduler.xml
>>   and add the following property in mapred-site.xml to point Hadoop to the
>>   file, replacing [HADOOP_HOME] with the path to your installation directory:
>>
>>   <property>
>>     <name>mapred.fairscheduler.allocation.file</name>
>>     <value>[HADOOP_HOME]/conf/fair-scheduler.xml</value>
>>   </property>
>>
>>   Note that all the parameters in the configuration file below are optional,
>>   including the parameters inside <pool> and <user> elements. It is only
>>   necessary to set the ones you want to differ from the defaults.
>> -->
>>
>> <!-- https://hadoop.apache.org/docs/r1.2.1/fair_scheduler.html -->
>>
>> <allocations>
>>
>>   <!-- NOTE: ** Preemption IS NOT turned on! ** -->
>>
>>   <!-- Preemption timeout for jobs below their fair share, in seconds.
>>        If a job is below half its fair share for this amount of time, it
>>        is allowed to kill tasks from other jobs to go up to its fair share.
>>        Requires mapred.fairscheduler.preemption to be true in mapred-site.xml. -->
>>   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
>>
>>   <!-- Default min share preemption timeout for pools where it is not
>>        explicitly configured, in seconds. Requires mapred.fairscheduler.preemption
>>        to be set to true in your mapred-site.xml. -->
>>   <defaultMinSharePreemptionTimeout>600</defaultMinSharePreemptionTimeout>
>>
>>   <!-- Default running job limit for pools where it is not explicitly set. -->
>>   <queueMaxJobsDefault>20</queueMaxJobsDefault>
>>
>>   <!-- Default running job limit for users where it is not explicitly set. -->
>>   <userMaxJobsDefault>10</userMaxJobsDefault>
>>
>>   <!-- QUEUES:
>>        dwr.interactive   : 10 at once
>>        dwr.batch_sql     : 15 at once
>>        dwr.batch_hdfs    :  5 at once (distcp, sqoop, hdfs -put, anything besides 'sql')
>>        dwr.qa            :  3 at once
>>        dwr.truck_lane    :  1 at once
>>
>>        cad.interactive   :  5 at once
>>        cad.batch         : 10 at once
>>
>>        comms.interactive :  5 at once
>>        comms.batch       :  3 at once
>>
>>        default           :  2 at once (to discourage its use)
>>   -->
>>
>>   <!-- queue placement -->
>>   <queuePlacementPolicy>
>>     <rule name="specified" />
>>     <rule name="default" />
>>   </queuePlacementPolicy>
>>
>>   <!-- footprint -->
>>   <queue name="footprint">
>>     <schedulingPolicy>fair</schedulingPolicy> <!-- can be fifo too -->
>>
>>     <maxRunningApps>4</maxRunningApps>
>>     <aclSubmitApps>*</aclSubmitApps>
>>
>>     <minMaps>10</minMaps>
>>     <minReduces>5</minReduces>
>>     <userMaxJobsDefault>50</userMaxJobsDefault>
>>
>>     <maxMaps>200</maxMaps>
>>     <maxReduces>200</maxReduces>
>>     <minResources>20000 mb, 10 vcores</minResources>
>>     <maxResources>500000 mb, 175 vcores</maxResources>
>>
>>     <queue name="dev">
>>       <maxMaps>200</maxMaps>
>>       <maxReduces>200</maxReduces>
>>       <minResources>20000 mb, 10 vcores</minResources>
>>       <maxResources>500000 mb, 175 vcores</maxResources>
>>     </queue>
>>
>>     <queue name="stage">
>>       <maxMaps>200</maxMaps>
>>       <maxReduces>200</maxReduces>
>>       <minResources>20000 mb, 10 vcores</minResources>
>>       <maxResources>500000 mb, 175 vcores</maxResources>
>>     </queue>
>>   </queue>
>>
>>   <!-- comms -->
>>   <queue name="comms">
>>     <schedulingPolicy>fair</schedulingPolicy> <!-- can be fifo too -->
>>
>>     <queue name="interactive">
>>       <maxRunningApps>5</maxRunningApps>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>
>>     <queue name="batch">
>>       <maxRunningApps>10</maxRunningApps>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>   </queue>
>>
>>   <!-- cad -->
>>   <queue name="cad">
>>     <schedulingPolicy>fair</schedulingPolicy> <!-- can be fifo too -->
>>
>>     <queue name="interactive">
>>       <maxRunningApps>5</maxRunningApps>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>
>>     <queue name="batch">
>>       <maxRunningApps>10</maxRunningApps>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>   </queue>
>>
>>   <!-- dwr -->
>>   <queue name="dwr">
>>     <schedulingPolicy>fair</schedulingPolicy> <!-- can be fifo too -->
>>     <minMaps>10</minMaps>
>>     <minReduces>5</minReduces>
>>     <userMaxJobsDefault>50</userMaxJobsDefault>
>>
>>     <maxMaps>200</maxMaps>
>>     <maxReduces>200</maxReduces>
>>     <minResources>20000 mb, 10 vcores</minResources>
>>     <maxResources>500000 mb, 175 vcores</maxResources>
>>
>>     <!-- INTERACTIVE: 5 at once -->
>>     <queue name="interactive">
>>       <weight>2.0</weight>
>>       <maxRunningApps>5</maxRunningApps>
>>
>>       <maxMaps>200</maxMaps>
>>       <maxReduces>200</maxReduces>
>>       <minResources>20000 mb, 10 vcores</minResources>
>>       <maxResources>500000 mb, 175 vcores</maxResources>
>>
>>       <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>       <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
>>
>>       <!-- per user, but given everything is dwr (for now) it's not helpful -->
>>       <userMaxAppsDefault>5</userMaxAppsDefault>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>
>>     <!-- BATCH: 15 at once -->
>>     <queue name="batch_sql">
>>       <weight>1.5</weight>
>>       <maxRunningApps>15</maxRunningApps>
>>
>>       <maxMaps>200</maxMaps>
>>       <maxReduces>200</maxReduces>
>>       <minResources>20000 mb, 10 vcores</minResources>
>>       <maxResources>500000 mb, 175 vcores</maxResources>
>>
>>       <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>       <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>>
>>       <userMaxAppsDefault>50</userMaxAppsDefault>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>
>>     <!-- sqoop, distcp, hdfs-put type jobs here. 3 at once -->
>>     <queue name="batch_hdfs">
>>       <weight>1.0</weight>
>>       <maxRunningApps>3</maxRunningApps>
>>
>>       <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>       <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>>       <userMaxAppsDefault>50</userMaxAppsDefault>
>>       <aclSubmitApps>*</aclSubmitApps>
>>     </queue>
>>
>>     <!-- QA. 3 at once -->
>>     <queue name="qa">
>>       <weight>1.0</weight>
>>       <maxRunningApps>100</maxRunningApps>
>>
>>       <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>       <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>>       <aclSubmitApps>*</aclSubmitApps>
>>       <userMaxAppsDefault>50</userMaxAppsDefault>
>>     </queue>
>>
>>     <!-- big, unruly jobs -->
>>     <queue name="truck_lane">
>>       <weight>0.75</weight>
>>       <maxRunningApps>1</maxRunningApps>
>>       <minMaps>5</minMaps>
>>       <minReduces>5</minReduces>
>>
>>       <!-- let's try without static values and see how the "weight" works -->
>>       <maxMaps>192</maxMaps>
>>       <maxReduces>192</maxReduces>
>>       <minResources>20000 mb, 10 vcores</minResources>
>>       <maxResources>500000 mb, 200 vcores</maxResources>
>>
>>       <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>       <!--
>>       <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>>       <aclSubmitApps>*</aclSubmitApps>
>>       <userMaxAppsDefault>50</userMaxAppsDefault>
>>       -->
>>     </queue>
>>   </queue>
>>
>>   <!-- DEFAULT: 2 at once -->
>>   <queue name="default">
>>     <maxRunningApps>2</maxRunningApps>
>>
>>     <maxMaps>40</maxMaps>
>>     <maxReduces>40</maxReduces>
>>     <minResources>20000 mb, 10 vcores</minResources>
>>     <maxResources>20000 mb, 10 vcores</maxResources>
>>
>>     <!-- not used. Number of seconds after which the pool can preempt other pools -->
>>     <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
>>     <userMaxAppsDefault>5</userMaxAppsDefault>
>>     <aclSubmitApps>*</aclSubmitApps>
>>   </queue>
>>
>> </allocations>
>>
>> <!-- some other stuff
>>
>>   <minResources>10000 mb, 0 vcores</minResources>
>>   <maxResources>90000 mb, 0 vcores</maxResources>
>>
>>   <minMaps>10</minMaps>
>>   <minReduces>5</minReduces>
>> -->
>>
>> <!-- enabling
>>   * Bringing the queues into effect:
>>     Once the required parameters are defined in the fair-scheduler.xml file,
>>     run this command to bring the changes into effect:
>>       yarn rmadmin -refreshQueues
>> -->
>>
>> <!-- verifying
>>   Once the command runs properly, verify that the queues are set up using
>>   either of two options:
>>
>>   1) hadoop queue -list
>>   or
>>   2) open the YARN ResourceManager GUI at http://<ResourceManager-hostname>:8088
>>      and click Scheduler.
>> -->
>>
>> <!-- notes
>>   [fail_user@phd11-nn ~]$ id
>>   uid=507(fail_user) gid=507(failgroup) groups=507(failgroup)
>>   [fail_user@phd11-nn ~]$ hadoop queue -showacls
>> -->
>>
>> <!-- submit
>>   To submit an application, use the parameter
>>   -Dmapred.job.queue.name=<queue-name> or -Dmapred.job.queuename=<queue-name>
>> -->
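Pulling the enabling/verifying/submit comments above together into one runnable sequence (the example jar and the input/output paths are placeholders, and the queue name is just one of those defined above):

{code}
# Reload the queue definitions without restarting the ResourceManager.
yarn rmadmin -refreshQueues

# Verify the queues took effect.
hadoop queue -list

# Submit into a specific queue, e.g. dwr.batch_sql (jar and paths are
# placeholders for illustration).
hadoop jar hadoop-mapreduce-examples.jar wordcount \
    -Dmapred.job.queue.name=dwr.batch_sql \
    /tmp/wc-in /tmp/wc-out
{code}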
>>
>> *** yarn-site.xml
>>
>> $ cat yarn-site.xml
>>
>> <?xml version="1.0"?>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <configuration>
>>   <!-- Autogenerated yarn params from puppet yaml hash yarn_site_parameters__xml -->
>>   <property>
>>     <name>yarn.resourcemanager.hostname</name>
>>     <value>FOO.sv2.trulia.com</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.aux-services</name>
>>     <value>mapreduce_shuffle</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
>>     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.local-dirs</name>
>>     <value>/storage0/hadoop/yarn/local,/storage1/hadoop/yarn/local,/storage2/hadoop/yarn/local,/storage3/hadoop/yarn/local,/storage4/hadoop/yarn/local,/storage5/hadoop/yarn/local</value>
>>   </property>
>>   <property>
>>     <name>yarn.resourcemanager.scheduler.class</name>
>>     <value>com.pepperdata.supervisor.scheduler.PepperdataSupervisorYarnFair</value>
>>   </property>
>>   <property>
>>     <name>yarn.application.classpath</name>
>>     <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,$TEZ_HOME/*,$TEZ_HOME/lib/*</value>
>>   </property>
>>   <property>
>>     <name>pepperdata.license.key.specification</name>
>>     <value>data://removed</value>
>>   </property>
>>   <property>
>>     <name>pepperdata.license.key.comments</name>
>>     <value>License Type: PRODUCTION Expiration Date (UTC): 2017/02/01 Company Name: Trulia, LLC Cluster Name: trulia-production Number of Nodes: 150 Contact Person Name: Deep Varma Contact Person Email: dva...@trulia.com</value>
>>   </property>
>>   <property>
>>     <name>yarn.timeline-service.hostname</name>
>>     <value>FOO.sv2.trulia.com</value>
>>   </property>
>>   <property>
>>     <name>yarn.timeline-service.enabled</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.timeline-service.webapp.address</name>
>>     <value>FOO.sv2.trulia.com:8188</value>
>>   </property>
>>   <property>
>>     <name>yarn.timeline-service.http-cross-origin.enabled</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.timeline-service.ttl-enable</name>
>>     <value>false</value>
>>   </property>
>>
>>   <!--
>>   <property>
>>     <name>yarn.timeline-service.store-class</name>
>>     <value>org.apache.hadoop.yarn.server.timeline.RollingLevelDbTimelineStore</value>
>>   </property>
>>   -->
>>   <property>
>>     <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.fair.user-as-default-queue</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.fair.preemption</name>
>>     <value>false</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.fair.sizebasedweight</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.minimum-allocation-mb</name>
>>     <value>2048</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.maximum-allocation-mb</name>
>>     <value>8192</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
>>     <value>98.5</value>
>>   </property>
>>   <property>
>>     <name>yarn.log-aggregation.retain-seconds</name>
>>     <value>604800</value>
>>   </property>
>>   <property>
>>     <name>yarn.log-aggregation-enable</name>
>>     <value>true</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.log-dirs</name>
>>     <value>${yarn.log.dir}/userlogs</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.remote-app-log-dir</name>
>>     <value>/app-logs</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.delete.debug-delay-sec</name>
>>     <value>600</value>
>>   </property>
>>   <property>
>>     <name>yarn.log.server.url</name>
>>     <value>http://FOO.sv2.trulia.com:19888/jobhistory/logs</value>
>>   </property>
>>
>> </configuration>
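Given that the stack trace below points into the Pepperdata wrapper configured above, a quick way to isolate the fault (a diagnostic sketch, not a recommendation to drop the product) would be to temporarily point the scheduler class back at the stock Fair Scheduler and retry the DEBUG-level startup:

{code}
<!-- Temporary diagnostic in yarn-site.xml: stock Fair Scheduler instead
     of the Pepperdata wrapper. If the RM then starts cleanly at DEBUG,
     the NPE is specific to the wrapper's getName() and worth raising
     with Pepperdata support. -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
{code}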
>>
>> On Wed, Jan 11, 2017 at 2:27 PM, Akash Mishra <akash.mishr...@gmail.com> wrote:
>>
>>> Please post your fair-scheduler.xml file and yarn-site.xml.
>>>
>>> On Wed, Jan 11, 2017 at 9:14 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>>>
>>>> Hey guys,
>>>> I'm running the RM with the above options (version 2.6.1) and get an
>>>> NPE upon startup.
>>>>
>>>> {code}
>>>> 17/01/11 12:44:45 FATAL resourcemanager.ResourceManager: Error starting ResourceManager
>>>> java.lang.NullPointerException
>>>>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.a.getName(SourceFile:204)
>>>>         at org.apache.hadoop.service.CompositeService.addService(CompositeService.java:73)
>>>>         at org.apache.hadoop.service.CompositeService.addIfService(CompositeService.java:88)
>>>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:490)
>>>>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:993)
>>>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:255)
>>>>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1214)
>>>> 17/01/11 12:44:45 INFO resourcemanager.ResourceManager: SHUTDOWN_MSG:
>>>> {code}
>>>>
>>>> The fair-scheduler.xml file is fine and works with INFO-level logging,
>>>> so I'm pretty sure there's nothing "wrong" with it. With DEBUG level,
>>>> it makes this Java call and barfs.
>>>>
>>>> Any ideas how to fix this?
>>>>
>>>> thanks,
>>>> Stephen.
>>>
>>>
>>> --
>>> Regards,
>>> Akash Mishra.
>>>
>>> "It's not our abilities that make us, but our decisions." --Albus Dumbledore
>
>
> --
> Regards,
> Akash Mishra.
>
> "It's not our abilities that make us, but our decisions." --Albus Dumbledore