Recently we had to change how several documents in a project’s MongoDB database were related to each other. Here’s how we did it.

About embedded documents

One of MongoDB’s features is the ability to embed documents inside other documents, which makes sense when modelling certain problems in the application’s domain. The drawback is that an embedded document cannot be queried on its own. An example:

class Student
  include Mongoid::Document
  #...
  embeds_one :referee
  #...
end

class Referee
  include Mongoid::Document
  #...
  embedded_in :student
  #...
end

This gives you an instance of Student on which you can call student.referee, but it won’t let you call Referee.find('1233456'). That is, to reach any referee, you have to go through its parent student.
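To make the limitation concrete, here is a minimal sketch in plain Ruby (no Mongoid or database involved) of roughly how such a student is stored: the referee lives inside the student document, so there is no referees collection to query. The field names, sample values and the find_embedded_referee helper are all hypothetical and only serve the illustration.

```ruby
# Plain-Ruby stand-in for the "students" collection: each hash models one
# stored document. All names and values here are made up for illustration.
STUDENTS = [
  { "_id" => "s1", "name" => "Ann",
    "referee" => { "_id" => "r1", "email" => "ref@example.com" } },
  { "_id" => "s2", "name" => "Bob", "referee" => nil }
]

# There is no separate referees collection, so looking up a referee by id
# means scanning the parent documents that embed it.
def find_embedded_referee(students, referee_id)
  students.each do |student|
    referee = student["referee"]
    return referee if referee && referee["_id"] == referee_id
  end
  nil
end
```

The lookup has to walk every student, which is the in-memory equivalent of "you need to do it through the parent student".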

What we needed

In our application, we needed a Referee to be tied to several Students at the same time, that is, to switch to something like this:

class Student
  include Mongoid::Document
  #...
  belongs_to :referee
  #...
end

class Referee
  include Mongoid::Document
  #...
  field :email
  #...
  has_many :students
  #...
end

Having both relationships on both models at the same time would not work, so we needed to “migrate” our existing referees into the new relationship. After some research and trial and error, we devised the following process, which can be easily adapted to other relationship changes:

NOTE This was written for a Rails 3.2 application running Mongoid v3. It might not work out of the box for newer or (gasp!) older versions.

  1. Rename the embedded model to embedded_model_name.
  2. Create a copy of that embedded model under the original model_name, and modify it so that it declares has_many instead of embedded_in.
  3. Modify the parent model so that it embeds_one :embedded_model_name and also belongs_to :model_name. Remember to update the names in clauses like accepts_nested_attributes_for and validates_associated as well.

So far, they would look like this:

class Student
  include Mongoid::Document
  #...
  belongs_to :referee
  embeds_one :embedded_referee
  #...
end

class Referee
  include Mongoid::Document
  #...
  field :email
  #...
  has_many :students
  #...
  validates :email, uniqueness: true
end

class EmbeddedReferee
  include Mongoid::Document
  #...
  field :email
  #...
  embedded_in :student
  #...
end

On the new Referee model, we add a uniqueness validation on the email address, because the email is what identifies a Referee in the system.

Next we need a Mongoid migration or a Rake task to move the data. Basically, we have to make the Student documents aware of the embedded model’s new name, and populate our new Referee collection with the data from the old embedded referees.

namespace :change_relationships do
  desc "Creates referees from the existing embedded ones"
  task create_referees: :environment do
    # First, rename the embedded referee field on every student
    Student.all.each { |s| s.rename :referee, :embedded_referee }
    # From here on, student.embedded_referee returns the embedded document

    # Then, build the collection of Referees for the Students
    Student.all.each do |student|
      if student.embedded_referee.present?
        old_referee = student.embedded_referee

        # Create a new Referee keyed on the email,
        # or retrieve an already existing one
        new_referee = Referee.find_or_create_by(email: old_referee.email)

        # Copy over the remaining fields from the embedded referee
        new_referee.set(:first_name, old_referee.first_name)
        new_referee.set(:last_name, old_referee.last_name)
        # ...

        # Then set the belongs_to association
        student.set(:referee_id, new_referee.id)
      end
    end
  end
end

After this, we will have a brand new collection of Referees with the relationship we wanted with the Students. However, we must note some things:

  1. For the student.embedded_referee.present? check to work properly, make sure you have disabled or removed the autobuild option on the embeds_one clause of your Student model. This avoids getting empty embedded_referees and creating empty referees from them (or hitting validation errors on the new referees because they have no data, if you have added validations).
  2. When copying data from the embedded_referees to the referees, you might want to overwrite the data of an existing referee only if the one you are migrating is newer. It’s up to you and what you need.
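One possible shape for that “newer wins” strategy, sketched on plain hashes rather than Mongoid documents. It assumes each referee carries an updated_at timestamp (for instance via Mongoid::Timestamps), which the models above may not have; merge_referee is a hypothetical helper name, not part of Mongoid.

```ruby
# Hypothetical helper: returns the attributes that should end up on the
# shared referee. `existing` is nil when no referee has that email yet.
# Assumes both hashes carry an :updated_at Time.
def merge_referee(existing, incoming)
  return incoming.dup if existing.nil?
  # Keep the stored data unless the embedded copy is strictly newer.
  incoming[:updated_at] > existing[:updated_at] ? existing.merge(incoming) : existing
end

older = { email: "ref@example.com", first_name: "Jon",  updated_at: Time.utc(2013, 1, 1) }
newer = { email: "ref@example.com", first_name: "John", updated_at: Time.utc(2013, 6, 1) }

merge_referee(nil, older)   # first sighting: take the data as-is
merge_referee(older, newer) # newer embedded copy overwrites the stored one
merge_referee(newer, older) # older embedded copy is ignored
```

In the Rake task, a comparison like this would sit between find_or_create_by and the set calls, so that the fields are only copied when the embedded referee is the more recent one.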

Next steps

Obviously, check that the relationship is working properly from both sides. You’ll need to fix lots of tests now, by the way.
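As a rough idea of what such a check could look like, here is a sketch on plain hashes standing in for the stored documents. The helper name unmigrated_student_ids and the sample data are hypothetical; in the real application you would run the equivalent comparison against Student and Referee.

```ruby
# Hypothetical consistency check: returns the ids of students that still
# have an embedded referee but do not point at a Referee with the same email.
def unmigrated_student_ids(students, referees)
  emails_by_id = referees.map { |r| [r[:id], r[:email]] }.to_h
  pending = students.select do |s|
    embedded = s[:embedded_referee]
    embedded && emails_by_id[s[:referee_id]] != embedded[:email]
  end
  pending.map { |s| s[:id] }
end

referees = [{ id: "r1", email: "ref@example.com" }]
students = [
  { id: "s1", referee_id: "r1", embedded_referee: { email: "ref@example.com" } },
  { id: "s2", referee_id: nil,  embedded_referee: { email: "other@example.com" } },
  { id: "s3", referee_id: nil,  embedded_referee: nil }
]

unmigrated_student_ids(students, referees) # => ["s2"]
```

An empty result is a reasonable signal that the embedded copies are safe to remove.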

Once you’re done and you’re sure everything is a-ok, you can have a cleanup task like this:

desc "Remove old embedded referees from the students"
task cleanup_embedded_referees: :environment do
  Student.all.each do |student|
    # Make sure the new referee is there before removing
    if student.embedded_referee.present? && student.referee.present?
      # Finally, remove the embedded model
      student.embedded_referee.remove # or delete
    end
  end
end

And finally, remove any mention of embedded_referee from the Student model, and delete the EmbeddedReferee model altogether.

Conclusion

I hope this post is helpful to you, that the method is explained well enough for you to adapt it to other relationships (embeds_many, has_one, …), and that it doesn’t need much tinkering to work on newer versions of Mongoid.

Picture ‘Sliced Chioggia Beets, Mitosis’ by Ano Lobb (Flickr), used under a CC BY 2.0 license.