Cloud & Dev Operations Engineer
Company: ThirdLaw AI
Location: Berkeley
Posted on: May 3, 2025
Job Description:
About the Challenge We're Tackling:
As enterprises integrate LLMs
into their existing applications, traditional observability tools
fall short in addressing the unique safety and operational risks
posed by LLM interactions. These tools are adept at monitoring
conventional metrics like rate limits, latency, and cost breakdowns
but lack the capacity to assess the stochastic risks inherent in
LLM inputs, outputs, and inter-LLM communications. This gap
represents the primary barrier to confidently deploying LLMs in
enterprise settings. At ThirdLaw, we empower IT and Security teams
with the tools to answer the foundational question: "Is this OK?"
and take decisive action when it isn't. We provide the
next-generation monitoring solutions necessary to evaluate,
investigate, and mitigate the unique risks associated with LLM
deployments.
About the role:
AI is reshaping software development,
enterprise knowledge management, and the way work gets done. By
giving IT and Security professionals the tools to make sure AI is
doing everything it should, and nothing it shouldn't, you'll be
enabling the safest path to a wave of incredible AI-powered
innovation. This role is responsible for ensuring the availability,
reliability, and performance of cloud infrastructure and services.
This includes CI/CD, automation, and infrastructure as code, as
well as sensible and cost-effective choices on cloud infrastructure
and services.
What you'll be doing:
- Cloud Operations: You'll work within a small but mighty team of
AI engineers and backend engineers to provision and manage cloud
resources, establish observability and incident response, enforce
security and compliance controls, and optimize costs.
- Deployment Operations: Build the deployment and maintenance
infrastructure to support complex hybrid deployments that work
across cloud-hosted and customer-hosted platform components.
- Development Operations: Build and maintain CI/CD pipelines, use
Terraform to enable repeatable and scalable infrastructure, manage
deployments, and ensure fast identification and resolution of
issues across any environment.
- Every day, you will lay the foundation for our service. Most of
this work is first-tracks / ground up / from scratch, with your
impact as clear as day. This is an enterprise solution and has real
expectations around reliability, security, and scalability.
- Start-up responsibility: you are the first and often the last stop
on whether our service is good or great.
Skills and Qualities you'll need to bring:
- Cloud Infra Expert: Expertise in major cloud providers
such as AWS and Azure, including provisioning and managing
services (EC2, VPC, S3, etc.). Experience with IaC tools like
Terraform, CloudFormation, or Ansible to manage and version
infrastructure declaratively.
- CI/CD Pipeline Development: Proficiency in setting up and
managing CI/CD pipelines with tools like Jenkins, GitLab CI, or
GitHub Actions to automate software builds, tests, and
deployments.
- Containerization & Orchestration: Skills in Docker and
Kubernetes for packaging applications, scaling deployments, and
managing dependencies. Good understanding of Docker and Kubernetes
internals.
- Infrastructure Monitoring & Incident Management: Familiarity
with monitoring tools (e.g., CloudWatch, Datadog) and incident
management best practices to ensure high uptime and
performance.
- Security & Compliance: Skills in identity and access
management (IAM), encryption, network security, and compliance
frameworks (e.g., SOC 2, GDPR).
- Cost Optimization: Ability to analyze cloud usage patterns,
manage budgets, and implement cost-saving measures (e.g.,
rightsizing, spot instances).
- AI-first: Interest and willingness to learn concepts in
artificial intelligence, machine learning, and deep neural
networks. You are excited about the possibilities of
LLMs.
Nice-to-have:
- Ideally, you live in the Bay Area or want to be here enough to
collaborate in person sometimes, but we are able to work with
anyone in the continental United States.
Join us as we pursue our
mission to unlock the boundless possibilities of generative AI by
ensuring AI trust and safety. We're looking for people who bring
thoughtful ideas and aren't afraid to challenge the norm. Our team
is small and focused, valuing autonomy and real impact over titles
and management. We need strong technical skills, a proactive
mindset, and clear written communication, as much of our work is
asynchronous. Our product is new and operates in a rapidly changing
ecosystem of generative AI; we are builders with the ability to
cut through ambiguity to solve customer pain. If you're organized,
take initiative, and want to work closely with customers to shape
our products, you'll fit in well here.