Cost-Optimizing AWS Infrastructure: A Case Study

Cloud Team
Nov 15, 2024


AWS costs can quickly spiral out of control as applications scale. In this case study, we share how we reduced a client's monthly AWS bill from $28,000 to $16,800, a 40% reduction, without sacrificing performance or reliability.


Initial Assessment


The client was a B2B SaaS company running a monolithic application on AWS. Their infrastructure included:


  • 12 EC2 instances (mostly m5.xlarge) running 24/7
  • RDS PostgreSQL db.r5.2xlarge with read replicas
  • Elastic Load Balancer and CloudFront CDN
  • S3 storage for user uploads (largely unoptimized)
  • CloudWatch and various other AWS services

The monthly bill breakdown:

  • EC2: $9,200
  • RDS: $6,800
  • Data Transfer: $4,200
  • S3 Storage: $3,100
  • Other services: $4,700

Optimization Strategy

We implemented a multi-faceted approach targeting the biggest cost drivers.

1. Right-Sizing EC2 Instances

**Finding:** Most EC2 instances were over-provisioned. CPU utilization averaged 18-25%, and memory utilization was below 40%.

**Action:** We analyzed CloudWatch metrics over 30 days and right-sized instances:

  • Downgraded 8 instances from m5.xlarge to m5.large
  • Converted 4 instances to t3.medium with burstable performance
  • Saved: $3,200/month
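
The right-sizing review above can be sketched as a simple filter over utilization samples. This is a minimal illustration, not the tooling used on the project: the function name, the thresholds, and the sample instance IDs are all assumptions. Note that memory utilization is not a default EC2 metric in CloudWatch; it has to be published by the CloudWatch agent before it can be analyzed this way.

```python
from statistics import mean


def rightsizing_candidates(metrics, cpu_avg_max=30.0, mem_avg_max=50.0, cpu_peak_max=70.0):
    """Return instance IDs whose usage stays below all three limits.

    `metrics` maps an instance ID to {"cpu": [...], "memory": [...]}, where each
    list holds utilization percentages sampled over the review window.
    The default limits are hypothetical; tune them to your own workload.
    """
    candidates = []
    for instance_id, samples in metrics.items():
        cpu, memory = samples["cpu"], samples["memory"]
        if not cpu or not memory:
            continue  # no data: never downsize blind
        if (mean(cpu) < cpu_avg_max
                and max(cpu) < cpu_peak_max
                and mean(memory) < mem_avg_max):
            candidates.append(instance_id)
    return sorted(candidates)


# Made-up sample data for illustration only.
sample = {
    "i-web-1": {"cpu": [18, 22, 25, 20], "memory": [35, 38, 36, 39]},
    "i-web-2": {"cpu": [55, 80, 62, 71], "memory": [60, 66, 70, 64]},
    "i-batch-1": {"cpu": [], "memory": []},
}
print(rightsizing_candidates(sample))  # ['i-web-1']
```

Checking the peak as well as the average guards against downsizing an instance that idles most of the day but saturates during short bursts.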

2. Spot Instances for Non-Critical Workloads

**Finding:** Development, staging, and batch processing workloads didn't require 24/7 availability.

**Action:**

  • Moved dev/staging to Spot instances (70-90% savings vs on-demand)
  • Configured auto-scaling for graceful handling of interruptions
  • Implemented scheduled shutdown for non-business hours
  • Saved: $2,100/month
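
The scheduled-shutdown step comes down to a time-window check that a scheduler can evaluate before starting or stopping non-production environments. The sketch below assumes an 08:00 to 19:00 weekday window; those hours and the function names are hypothetical, not taken from the client's setup.

```python
from datetime import datetime, time

BUSINESS_START = time(8, 0)   # assumed start of the working day
BUSINESS_END = time(19, 0)    # assumed end of the working day


def should_be_running(now, start=BUSINESS_START, end=BUSINESS_END):
    """Return True on weekdays between start (inclusive) and end (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start <= now.time() < end


def weekly_running_fraction(start=BUSINESS_START, end=BUSINESS_END):
    """Fraction of the 168-hour week the environment stays up."""
    hours_per_day = (end.hour + end.minute / 60) - (start.hour + start.minute / 60)
    return hours_per_day * 5 / 168


print(should_be_running(datetime(2024, 11, 15, 10, 0)))  # a Friday morning
print(should_be_running(datetime(2024, 11, 16, 10, 0)))  # a Saturday morning
print(f"{weekly_running_fraction():.0%} of the week")
```

With the assumed window, the environment runs 55 of 168 hours, so compute that bills by the hour is paid for about a third of the time.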

3. Database Optimization

**Finding:** The RDS instance was oversized, and read replicas were underutilized.

**Action:**

  • Optimized slow queries (average query time reduced 65%)
  • Downgraded from db.r5.2xlarge to db.r5.xlarge
  • Removed one read replica (consolidated read traffic)
  • Implemented aggressive query caching with Redis
  • Saved: $2,800/month
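
Query caching of this kind usually follows the cache-aside pattern: read from the cache first, fall back to the database on a miss, then store the result with an expiry. The sketch below uses a small in-memory class as a stand-in for Redis so that it runs without a server; `TTLCache`, `cached_query`, and the key name are hypothetical, and a real deployment would go through a Redis client instead.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a Redis-style cache with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)


def cached_query(cache, key, run_query, ttl_seconds=60):
    """Cache-aside read. Note: a query result of None is never cached."""
    value = cache.get(key)
    if value is None:
        value = run_query()
        cache.set(key, value, ttl_seconds)
    return value


calls = []


def slow_report_query():
    calls.append(1)  # count how often the "database" is actually hit
    return {"active_users": 1234}  # made-up result


cache = TTLCache()
first = cached_query(cache, "report:active_users", slow_report_query)
second = cached_query(cache, "report:active_users", slow_report_query)
print(first == second, len(calls))  # True 1
```

The TTL is the main tuning knob: longer values take more load off the database but serve staler data.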

4. Serverless Migration

**Finding:** Several API endpoints had sporadic traffic patterns but ran on dedicated EC2 instances.

**Action:**

  • Migrated 12 low-traffic API endpoints to Lambda
  • Moved static asset serving to S3 + CloudFront
  • Implemented Lambda@Edge for some dynamic content
  • Saved: $1,600/month
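
For a sense of what a migrated endpoint looks like, here is a minimal Lambda handler sketch. It assumes the function sits behind an API Gateway proxy integration, which the case study does not specify, and the greeting logic is a placeholder rather than one of the client's endpoints.

```python
import json


def handler(event, context):
    """Entry point in the shape expected for an API Gateway proxy integration."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }


# Local smoke test with a hand-built event; Lambda supplies the real one.
response = handler({"queryStringParameters": {"name": "ops"}}, None)
print(response["statusCode"], response["body"])
```

Because the function is billed per invocation and duration, an endpoint with sporadic traffic costs close to nothing while idle, which is the saving this step targets.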

5. S3 Storage Optimization

**Finding:** 73% of S3 data was rarely accessed, and many objects were stored in Standard tier unnecessarily.

**Action:**

  • Implemented S3 Intelligent-Tiering for automatic cost optimization
  • Created lifecycle policies to move old data to Glacier
  • Compressed text-based files before uploading them to S3 (S3 does not compress objects itself)
  • Cleaned up incomplete multipart uploads
  • Saved: $1,400/month
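
The lifecycle and multipart-cleanup steps can be expressed as a single lifecycle rule. The sketch below builds the configuration in the structure that boto3's `put_bucket_lifecycle_configuration` accepts. The 90-day and 7-day values, the rule ID, and the function names are assumptions for illustration; the case study does not state the thresholds that were used.

```python
def build_lifecycle_configuration(archive_after_days=90, abort_uploads_after_days=7):
    """Build an S3 lifecycle configuration (day counts here are assumed values)."""
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to the whole bucket
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": abort_uploads_after_days
                },
            }
        ]
    }


def apply_lifecycle(bucket_name, configuration):
    """Apply the configuration to a bucket. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without the AWS SDK

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=configuration,
    )


config = build_lifecycle_configuration()
print(config["Rules"][0]["Transitions"])
```

Keeping the builder separate from the API call makes the rule easy to review and test before it touches a real bucket.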

6. Data Transfer Optimization

**Finding:** Data transfer between regions and out to the internet accounted for a significant share of costs.

**Action:**

  • Consolidated resources in a single region where possible
  • Implemented CloudFront caching more aggressively (cache hit rate improved from 62% to 89%)
  • Enabled compression for API responses
  • Saved: $1,800/month
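
To see why the cache hit rate matters so much, it helps to look at the miss rate, since only misses reach the origin. The short calculation below uses the 62% and 89% figures from the list above. It assumes request volume stays constant and treats all requests as equal in size, which real traffic will not match exactly.

```python
def origin_traffic_reduction(old_hit_rate, new_hit_rate):
    """Relative drop in requests reaching the origin when the hit rate improves."""
    old_miss = 1.0 - old_hit_rate
    new_miss = 1.0 - new_hit_rate
    if old_miss <= 0:
        raise ValueError("old hit rate must be below 100%")
    return 1.0 - new_miss / old_miss


reduction = origin_traffic_reduction(0.62, 0.89)
print(f"Origin requests drop by about {reduction:.0%}")  # about 71%
```

Misses fall from 38% of requests to 11%, so the origin serves roughly 71% fewer requests even though the hit rate itself rose by only 27 percentage points.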

Results Summary


Total monthly savings: **$11,200 (40% reduction)**


| Category | Before | After | Savings |
|----------|--------|-------|---------|
| EC2 | $9,200 | $5,000 | $4,200 |
| RDS | $6,800 | $4,000 | $2,800 |
| Data Transfer | $4,200 | $2,400 | $1,800 |
| S3 Storage | $3,100 | $1,700 | $1,400 |
| Other | $4,700 | $3,700 | $1,000 |
| **Total** | **$28,000** | **$16,800** | **$11,200** |
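
As a quick cross-check, the totals and the overall percentage can be recomputed from the category rows of the table. The figures below are copied from the table; only the variable names are introduced here.

```python
# (before, after) monthly cost in USD per category, taken from the table rows.
rows = {
    "EC2": (9200, 5000),
    "RDS": (6800, 4000),
    "Data Transfer": (4200, 2400),
    "S3 Storage": (3100, 1700),
    "Other": (4700, 3700),
}

total_before = sum(before for before, _ in rows.values())
total_after = sum(after for _, after in rows.values())
savings = total_before - total_after
reduction_pct = savings / total_before * 100

print(f"Before: ${total_before:,}  After: ${total_after:,}")
print(f"Savings: ${savings:,} ({reduction_pct:.0f}% reduction)")
```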


Performance Impact

Despite the aggressive cost cutting, we maintained or improved all key performance metrics:


  • **API Response Time:** Improved by 12% (thanks to query optimization)
  • **Uptime:** Remained at 99.9%
  • **Error Rate:** Unchanged at <0.1%
  • **Page Load Time:** Improved by 8% (better CloudFront caching)

Lessons Learned

1. **Monitor Everything:** CloudWatch metrics were crucial for identifying over-provisioned resources
2. **Start with Quick Wins:** Right-sizing and S3 optimization provided immediate ROI
3. **Test in Staging:** We tested all changes in staging before production
4. **Automate Where Possible:** Infrastructure as Code (Terraform) made changes repeatable and safe
5. **Continuous Optimization:** We implemented monthly cost reviews to prevent future bloat


Ongoing Optimization

We continue to optimize by:

  • Reviewing AWS Trusted Advisor recommendations monthly
  • Using AWS Cost Explorer for spend analysis
  • Implementing more aggressive auto-scaling policies
  • Evaluating new AWS services (e.g., Graviton instances for ARM workloads)
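
A monthly spend review like the one described above can be scripted against the Cost Explorer API. The sketch below is one possible shape, not the team's actual tooling: `fetch_monthly_costs` wraps boto3's `get_cost_and_usage` call, while `cost_by_service` and `top_services` are hypothetical helpers. The sample response uses made-up amounts, and pagination via `NextPageToken` is left out for brevity.

```python
def fetch_monthly_costs(start, end):
    """Query Cost Explorer, grouped by service. Requires boto3 and AWS credentials.

    `start` and `end` are ISO dates such as "2024-10-01" and "2024-11-01".
    """
    import boto3  # imported lazily so the parsing helpers run without the AWS SDK

    ce = boto3.client("ce")
    return ce.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )


def cost_by_service(response):
    """Flatten a get_cost_and_usage response into {service: total amount}."""
    totals = {}
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            service = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount
    return totals


def top_services(totals, n=3):
    """Return the n most expensive services as (service, amount) pairs."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up response in the documented shape, for illustration only.
sample_response = {
    "ResultsByTime": [
        {
            "Groups": [
                {"Keys": ["Amazon Simple Storage Service"],
                 "Metrics": {"UnblendedCost": {"Amount": "120.50", "Unit": "USD"}}},
                {"Keys": ["Amazon Relational Database Service"],
                 "Metrics": {"UnblendedCost": {"Amount": "310.00", "Unit": "USD"}}},
                {"Keys": ["AWS Lambda"],
                 "Metrics": {"UnblendedCost": {"Amount": "4.25", "Unit": "USD"}}},
            ]
        }
    ]
}

for service, amount in top_services(cost_by_service(sample_response), n=2):
    print(f"{service}: ${amount:,.2f}")
```

Running this on a schedule and comparing month over month makes it easier to spot a category that is starting to creep back up.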

Conclusion


A 40% cost reduction is possible without compromising performance. The key is systematic analysis, incremental changes, and continuous monitoring. Most AWS customers have low-hanging fruit that can deliver significant savings with minimal risk.

If your AWS bill is growing faster than your revenue, it's time for an infrastructure audit.


    Cloud Team

    The Cloud Team at Senpai Software shares insights and best practices from real-world software development projects.