The Walt Disney Company

Staff Systems Reliability Engineer

Job Summary:

At Disney, we’re storytellers. We make the impossible possible. We do this by using and developing cutting-edge technology and pushing the envelope to bring stories to life through our movies, products, interactive games, parks and resorts, and media networks. Now is your chance to join our talented team that delivers unparalleled creative content to audiences around the world.

The Systems Reliability Engineering (SRE) Tech Evangelism team helps elevate SRE practices at TWDC by promoting and onboarding new technologies, solving complex problems and integrating with next-generation digital platforms.

Systems Reliability Engineers use a software engineering approach to architect, design, automate, monitor, and build applications at scale. This includes operating and engineering software in close alignment with business segments to deliver platforms through efficient, effective and resilient architectures. SREs are talented engineers who focus on improving quality through a data-driven approach: instrumentation, automation, and functional/unit testing.

This position is for an experienced systems reliability engineer (SRE) eager to play an integral role on the SRE Tech Evangelism team for The Walt Disney Company to help elevate SRE practices, onboard new technologies, solve complex problems and integrate next-generation digital platforms.

The Staff SRE will help create, build and deliver amazing experiences for our guests, fans and businesses. Primary responsibilities include helping existing, new and emerging business teams onboard new technologies or platforms to accelerate their businesses. This will include consulting on, designing, building, and supporting development pipelines; automating infrastructure and operations; creating telemetry for monitoring; engineering for high reliability; and reinforcing best practices to secure our company and guest data.

The Staff SRE is expected to have expert-level systems administration skills on Linux and Windows platforms, and must have experience with software development (e.g. Python, Go, Java, Node), CI pipeline tools (e.g. GitLab CI, Jenkins), Git source management, cloud hosting (AWS, GCP & Azure), container computing (e.g. Docker, serverless technologies), web technologies and the DevOps team culture. This position will also bring expertise in systems, networking, operational excellence and application stability, security, performance, and capacity management, as well as documentation.

The Staff SRE must be prepared to work with engineering, creative and production teams in an extremely collaborative and high-energy environment to brainstorm, architect, gather requirements, troubleshoot, and provide stellar customer support. The ideal Staff SRE is passionate about constantly learning and taking technology to the next level to solve complex problems, and is a highly motivated, optimistic, proactive, creative thought leader and project manager who works closely with our Business Units & Segments.

Responsibilities:

Translate ideas into tangible products that shape experiences by focusing on a systematic approach to automation, resiliency, efficiency, stability, security, performance, capacity management, and documentation. Serve as a subject matter expert through internal and external tech talks and conferences.



Make an impact on a transformative team and culture by designing, building, and supporting systems for a large-scale enterprise production environment that hosts a variety of digital workloads and experiences for The Walt Disney Company.



Collaborate with various Engineering and Production teams as a thought partner to gather requirements, troubleshoot issues, apply a scientific approach to continuous improvement, challenge the status quo, promote a culture of high accountability and trust, and provide stellar customer support.



Inspire and lead initial discovery, architecture, design, automation, implementation and operationalization, including:
  • Business Engagement and Requirements Gathering
  • Architectural Review, Proof of Concept Work, and Onboarding
  • Project: Build and Operationalize New Systems/Sites/Services/Products
  • Systematic Load Testing, Troubleshooting, Optimization and Tuning
  • Create System and Application Monitors, Trending Metrics and Reports
  • Development: Tools and Automation Frameworks
  • Hosting Platforms and Infrastructure Design and Support
  • Documentation: Creation of Application Infrastructure Design Documents, Operational Runbooks, and Knowledge Base Articles



Key Responsibilities

  1. (Architecture) - Leads the architecture for a complete inter-connected set of applications that takes into account future industry direction and business product alignment.
  2. (Collaboration) - Communicates effectively with executive management.
  3. (Collaboration) - Forms partnerships with other Staff and Sr. Staff members to see where they can drive cross-team efficiencies.
  4. (Communication) - Tracks, communicates, and improves time spent resolving operational issues.
  5. (Reliability Engineering) - Designs architecture that fails gracefully and advocates for integrating those solutions into software products.
  6. (Security) - Ensures application communication and data practices follow security best practices.
  7. (Systems Integration) - Guards infrastructure against the introduction of unnecessarily complex solutions.
  8. (Software Engineering) - Designs tools that ease the management and operation of applications, systems, and infrastructure on behalf of product teams.
  9. (Quality Engineering) - Contributes to the design or requirements definition for integrating Quality Assurance tools and tool chains into SRE development and workflow processes specific to configuration management, orchestration, and tool chain development and support.
Career Profile
  • Interprets internal or external business strategies, opportunities and trends and recommends best practices for the business
  • Solves complex problems; takes a broad perspective to identify innovative solutions
  • Works independently, with guidance in only the most complex situations
  • May lead functional teams or projects
  • Provides mentoring to junior members of the engineering teams
  • Influences the direction and adoption of technology across multiple engineering teams and businesses.
  • Able to present technical subjects to both technical and non-technical audiences, large forums, and executives

Basic Qualifications:

Technical Requirements
  • Expertise in multiple scripting languages and advanced skills in programming languages (e.g. Go, Python, Ruby, Dart, Node, Java, or similar), with the ability to build test coverage for all software being developed.
  • Systems administration skills on Linux and Windows platforms
  • Networking skills and protocols (e.g. HTTP, TLS, SSH, DNS)
  • Software Development Continuous Integration (CI) Pipeline knowledge (e.g. Jenkins, GitLab CI)
  • Expertise with Distributed Systems and Container Platforms (e.g. Kubernetes/GKE, ECS, Mesos, Fargate, Nomad)
  • Experience with Source Control Management systems (e.g. Git)
  • Expertise in public and private cloud hosting services (AWS, Google Cloud, Azure)
  • Recognized as a subject matter expert on at least one OS and proficient in multiple operating systems, including OS performance monitoring, setup, configuration, tuning, and troubleshooting.
  • Proficient in web server technologies (e.g. Apache, Node.js, NGINX, Tomcat, IIS, Caddy Server), including setup, configuration, performance monitoring, tuning, clustering, and debugging (e.g. JConsole).
  • Proficient with data technologies (e.g. NoSQL, MySQL, MongoDB, Redis, Elastic), including basic setup, configuration, and troubleshooting.
  • Able to implement existing base standards for new systems and/or applications for all of the following:
    • Site/Systems monitoring and instrumentation
    • Application monitoring and instrumentation
    • System monitoring and instrumentation
    • Resilience, performance & telemetry data
  • Able to diagnose simple to complex systems and process problems.
  • Able to perform load test runs against a moderately complex system and provide in-depth analysis of the results.
  • Demonstrate exceptional troubleshooting methodology, including the ability to author and instruct new methodologies to the SRE team.
  • Independently resolve moderately to highly complex system and application incidents.
  • Able to identify and propose system and application fixes for performance bottlenecks.
  • Able to evaluate new application requirements for capacity and run-time best practices.
  • Able to evaluate new system and/or infrastructure solutions for technical feasibility against known requirements and standards.
  • Effective at dealing with change: Able to transition in role or handle a significant modification or technology with minimal ramp-up time and with very little guidance.
Communication and Leadership Requirements
  • Excellent verbal and written communication to all levels in the organization.
  • Inspires and creates excitement about new technologies, platforms and methodologies
  • Demonstrates curiosity and continuous learning and self-improvement.
  • Ability to lead functional teams in systems integration and design including writing operational specs, architectural diagrams, test plans and requirements management.
  • Communication of ideas and solutions in a clear and organized manner.
  • Clear and effective presentations to groups of people, including internal and external conference presentations.
  • Effective project management and planning on large-scale projects (familiarity with agile/scrum project management).
  • Ability to design and deliver training to other technologists.
  • Construction of concise and complete technical documentation.
  • Mentoring of other Staff on technical material.
  • Viewed as a reliable technical resource for others.
  • Able to quickly and adeptly understand the needs of the business and be able to translate those needs into actionable items.

Preferred Qualifications:

DISNEYTECH

Required Education

Bachelor of Science degree in computer science or a related field, or equivalent experience in technical operations and software engineering

About The Walt Disney Company (Corporate):

At Disney Corporate you can see how the businesses behind the Company’s powerful brands come together to create the most innovative, far-reaching and admired entertainment company in the world. As a member of a corporate team, you’ll work with world-class leaders driving the strategies that keep The Walt Disney Company at the leading edge of entertainment. See and be seen by other innovative thinkers as you enable the greatest storytellers in the world to create memories for millions of families around the globe.

About The Walt Disney Company:

The Walt Disney Company, together with its subsidiaries and affiliates, is a leading diversified international family entertainment and media enterprise with the following business segments: media networks, parks and resorts, studio entertainment, consumer products and interactive media. From humble beginnings as a cartoon studio in the 1920s to its preeminent name in the entertainment industry today, Disney proudly continues its legacy of creating world-class stories and experiences for every member of the family. Disney’s stories, characters and experiences reach consumers and guests from every corner of the globe. With operations in more than 40 countries, our employees and cast members work together to create entertainment experiences that are both universally and locally cherished.

This position is with Disney Worldwide Services, Inc., which is part of a business segment we call The Walt Disney Company (Corporate).

Disney Worldwide Services, Inc. is an equal opportunity employer. Applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status or any other basis prohibited by federal, state or local law. Disney fosters a business culture where ideas and decisions from all people help us grow, innovate, create the best stories and be relevant in a rapidly changing world.
