SRE (Site Reliability Engineer) SSR/SR

Management and IT services provider

Buenos Aires - Palermo

19-02-2020

Semi Senior / Senior

Our client is hiring an SRE (Site Reliability Engineer), SSR/SR, to join its team.

Requirements:

Windows Server or Linux (Red Hat and/or Debian-based distributions) administration. [required]
Experience with at least one of the following programming languages: Python, Go. [required]
Application monitoring, troubleshooting, log analysis, system metrics analysis. [required]
Strong understanding of networking concepts (switching and routing, OSI model). [required]
Experience working with version control systems (VCS) such as Git. [required]
Ability to handle second-level real-time alerts.
Ability to resolve high-impact incidents together with an incident response team.
Confidence in learning native scripting languages (Bash, PowerShell) to implement solutions.
Experience coordinating resources to achieve service restoration (Incident Management).
Basic knowledge of Cloud infrastructure (AWS, Azure).
Operating System Monitoring.
Ability to read and interpret monitoring system graphs.
Knowledge about application servers such as Red Hat JBoss/WildFly.

It's a PLUS:

Experience working with configuration management tools such as Puppet or Ansible (Preferred).
Experience managing both on-premises and cloud-based infrastructure.
Experience tracking problems with ticketing systems such as Jira Service Desk (Preferred).
Experience working with containerization software such as Docker Engine.
English certifications, a driver's license, a U.S. visa, or a European passport are a big plus.

It's required:

Availability to travel to the USA for at least 2 weeks.
Advanced English (writing and speaking skills) is required to communicate with technical teams and customers.
Availability to be on a passive on-call schedule.

You will be:

Working closely with a cross-functional team of SREs, DBAs, developers, and engineers to ensure the reliability of the platform.
Participating in an Agile team, with daily scrum meetings as well as planning and grooming meetings.
Developing your monitoring skills by using different monitoring tools.
Developing custom tools to automate processes as you see fit in order to reduce toil and increase engineering work.
Monitoring metrics for overall reliability of a distributed SaaS product.
Interacting with Azure cloud services, working mostly on Windows platforms and on some Linux platforms.
Troubleshooting distributed systems and applications.

Location: Palermo.
