Hi,

Please share your profile to adi...@torquetek.com





*Site Reliability Engineering Architect*

*Alpharetta, GA (Long Term)*

The SRE Architect will play the mission-critical role of ensuring that key
systems are healthy, monitored, automated, and designed to scale. This role
requires a thoughtful problem solver with excellent organizational skills.
The Site Reliability Engineering team is responsible for availability,
latency, performance, efficiency, change management, monitoring, emergency
response, and capacity planning. The architect will respond to production
problems, investigate their causes, and engineer and advise on permanent
solutions.

*Must have*
• Experience in defining the SRE roadmap for organizations.
• Experience with cloud technologies and solutions (GCP preferred).
• Experience with IaC tools (Terraform, CloudFormation).
• Experience with configuration management tools like Ansible.
• Experience with container technology and orchestration (Kubernetes,
Docker).
• Proficiency with tools like Git and Bitbucket.
• Experience with the Linux operating system, testing tools, and database
management with MySQL.
• Experience in one or more of the following: Java, JS, Duck Creek, Python,
microservices.
• Experience with monitoring tools like AppDynamics.
• Experience with log management and the ELK Stack (Elasticsearch, Logstash,
Kibana).
• Experience with Apica and ZebraTester for synthetic monitoring.
• Experience with PagerDuty for alerting.
• Understanding of application servers, networks, and databases.
• Excellent understanding of scalability processes and techniques.
• Understanding of Jenkins or other build tools.
• Hands-on experience in administering high-availability and
high-performance environments, as well as managing large-scale deployments
of traffic-heavy applications.
• Ability to handle multiple complex systems without shying away from the
challenge of improving them.
• Willingness to try new technologies and integrate them with existing
systems to achieve better operations overall.

*Responsibilities*
• Engage in and improve the whole lifecycle of services, from inception and
design through deployment, operation, and refinement.
• Design, develop, ship, and motivate the creation of software and systems
to increase product reliability and organizational efficiency.
• Guide reliability practices across the entire software development
lifecycle through activities such as architecture reviews, code reviews,
platform and framework creation, and capacity planning.
• Work with senior engineering and testing team members to build tools and
testing strategies for problem prevention, detection, and chaos testing.
• Design and create centralized, robust logging, monitoring, and alerting
systems.
• Troubleshoot production incidents in real time.
• Lead root cause investigations.
• Improve service reliability through blameless post-incident reviews and
using code to prevent or respond to problem recurrence.
• Proactively identify system anomalies.
• Recommend and execute testing strategies.
• Recognize automation opportunities.
• Participate in the on-call rotation, including weekend work when
scheduled on call.
• Perform code-level debugging of issues escalated to the team.
• Develop tools to automate routine jobs using knowledge gained on the job.
• Plug into the software release cycle and work closely with developers to
ensure software releases are well designed, planned, implemented, released,
and monitored.
• Automate time-consuming and manual processes.
• Assess the current SRE solution and define the SRE approach for products.
• Work with application development teams on designing, implementing, and
improving SRE practices.
• Conduct SRE training sessions.
• Design and execute scaling strategies that ensure the scalability and
elasticity of the infrastructure.

*Added advantage*
• Experience working in large financial services or retail chain
organizations
• Excellent communication and organizational skills
• Ability to thrive as a team member and excel under pressure
• Ability to think fast; a natural problem-solver

*Qualifications*
• Bachelor’s degree or equivalent in Computer Science, Engineering, or a
related field, or comparable experience
• Proven experience in IT, application development, or DevOps, including
excellent knowledge of networking, computing, and storage
• Background in software development, software validation, or systems
engineering
• Industry certification in cloud services/solutions preferred





Thanks & Regards



Aditya Noolu

IT Recruiter

Office: (919) 234-7048
__________________

*Torque Technologies LLC*
*1135 Kildaire Farm Road, Suite #200, Cary, NC 27511*

*An E-Verified Company / Inc. 5000 Company* www.torquetek.com

Torque Technologies LLC is an Equal Opportunity Employer (EOE). Qualified
applicants are considered for employment without regard to age, race,
color, religion, sex, national origin, sexual orientation, disability, or
veteran status.

*Applicants*
*In compliance with federal law, all persons hired will be required to
verify identity and eligibility to work in the United States and complete
required employment eligibility verification documentation upon hire*.
