Note: This post was originally published by Luis Natera on his personal blog. It has been republished here as part of TYN Studio's content.
Database migrations aren't just for schema changes—they're also a powerful tool for data corrections. When you need to fix data issues across multiple environments (development, staging, production), writing a migration is often better than manual fixes.
The Problem
A previous migration introduced problematic data into the database. Some user records had incorrect gender values ("FEMALE" instead of "Female"), and this inconsistency needed to be fixed across all environments.
Why Not Manual Fixes?
You might be tempted to just update the records manually:
UPDATE myapp_user SET gender = 'Female' WHERE gender = 'FEMALE';
But this approach has problems:
- You have to remember to run it in every environment
- It's not tracked in version control
- Other developers won't know about the fix
- It's error-prone (easy to forget or run incorrectly)
The Migration Solution
Django migrations solve all these issues. A migration lives in your repository alongside the code, and it is applied in each environment the next time python manage.py migrate runs there, typically as part of the deploy.
Here's a migration using a RunPython operation:
from django.db import migrations


def forwards_func(apps, schema_editor):
    # Get the model from this migration's app state
    User = apps.get_model("myapp", "User")
    # Filter records matching specific criteria
    for user in User.objects.filter(hair_color="brown", gender="FEMALE"):
        user.gender = "Female"
        user.save()


def reverse_func(apps, schema_editor):
    # Optional: define how to reverse this migration
    pass


class Migration(migrations.Migration):
    dependencies = [
        ("myapp", "0021_some_other_migration"),
    ]
    operations = [
        migrations.RunPython(forwards_func, reverse_func),
    ]
How It Works
- apps.get_model(): Gets the model as it exists at this point in migration history
- Filter and update: Finds records matching the criteria and updates them
- save(): Persists changes to the database
- reverse_func: Optionally defines how to undo the migration. Here it does nothing, so rolling the migration back leaves the corrected data in place
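The loop in forwards_func loads every matching row and issues one UPDATE per record. When the fix is a plain value replacement, a single set-based statement can do the same work. A minimal sketch using QuerySet.update(), assuming the same myapp.User model and the same filter as the migration above:

```python
def forwards_func(apps, schema_editor):
    # Historical model, exactly as in the original migration.
    User = apps.get_model("myapp", "User")
    # One UPDATE ... WHERE statement instead of one query per row.
    # update() returns the number of rows it matched.
    User.objects.filter(hair_color="brown", gender="FEMALE").update(gender="Female")
```

update() writes directly in SQL, so it skips save() and model signals. That is usually acceptable in a data migration, since historical models do not carry custom save() methods anyway. The function is passed to migrations.RunPython in the same way as the loop-based version.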
Key Advantages
- Tracked in version control: The fix is part of your codebase
- Automatic deployment: Runs automatically with python manage.py migrate
- Consistent across environments: Same fix applies everywhere
- Documented: Future developers can see what was fixed and why
- Reversible: You can define how to undo the change if needed
Important Notes
When writing data migrations:
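- Use apps.get_model(): Don't import models directly. A direct import gives you the current model, which may include fields the database does not have yet at this point in the migration history. The historical model matches the schema the migration actually runs against.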
- Test thoroughly: Run the migration on a copy of production data first
- Be specific: Use precise filters to avoid unintended changes
- Consider performance: For large datasets, you might need to batch the updates
- Add comments: Explain why the migration is needed
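For the performance note above, one way to batch the updates is to fix a fixed-size chunk of primary keys at a time. The sketch below is an illustration under assumptions, not part of the original migration: the helper name normalize_gender_in_batches and the BATCH_SIZE value are hypothetical, and the model and filter are carried over from the example migration.

```python
BATCH_SIZE = 1000  # assumed value; tune it for your table and database


def normalize_gender_in_batches(User, batch_size=BATCH_SIZE):
    """Rewrite "FEMALE" to "Female" in chunks; return the number of rows changed."""
    pending = User.objects.filter(hair_color="brown", gender="FEMALE")
    updated = 0
    while True:
        # The query runs again on every pass, so rows fixed by earlier
        # batches no longer match and the loop ends when none are left.
        batch_ids = list(
            pending.order_by("pk").values_list("pk", flat=True)[:batch_size]
        )
        if not batch_ids:
            break
        updated += User.objects.filter(pk__in=batch_ids).update(gender="Female")
    return updated


def forwards_func(apps, schema_editor):
    # Historical model, as in the original migration.
    User = apps.get_model("myapp", "User")
    normalize_gender_in_batches(User)
```

Batching this way keeps each statement and the list of ids in memory small. On databases where Django wraps each migration in a transaction, all batches still commit together; setting atomic = False on the Migration class is the documented way to opt out if each batch needs to commit separately.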
When to Use This Approach
Write data migrations when you need to:
- Fix data inconsistencies introduced by previous migrations
- Populate new fields based on existing data
- Transform data to match new business rules
- Clean up deprecated or invalid data
Data migrations keep your database changes organized, tracked, and reproducible across all environments. Instead of ad-hoc manual fixes, you have a clear history of what changed and when.