Icelab

Run Your Own Piece of Heroku with Foreman

By Tim Riley, 03 Jun 2011

Last month Adam Wiggins wrote about applying the Unix process model to web apps, where he introduced a new tool developed within Heroku: Foreman, for managing the many system processes behind web applications. As it turns out, Foreman is now the centrepiece of Heroku’s new Celadon Cedar platform, which they announced just two days ago. While this process model provides great flexibility to Heroku’s new platform, Foreman is also very useful as a background task manager on your own self-managed servers. Here’s how we’ve done it.

Foreman

When Foreman was announced, it was particularly interesting to me because it seemed to radically simplify the process of launching and maintaining the background processes that complement most Rails apps these days. Until now, we’ve been using a relatively unstable setup involving custom init scripts and bluepill. We’ve since moved an app to Foreman (as simple as adding gem 'foreman' to the Gemfile), and I’m very pleased with the simple, single Procfile that is the result:

delayed_job:    bundle exec rake jobs:work RAILS_ENV=production
sphinx:         bundle exec rake thinking_sphinx:run_in_foreground RAILS_ENV=production

For our app, we need Delayed Job for background tasks, Sphinx for handling search queries, and Thinking Sphinx’s delayed delta service running to make sure the search index stays up to date. With just these few Procfile lines, Foreman knows enough to take care of all of these processes. Better still, none of these processes need to worry about daemonizing themselves; Foreman takes care of that too.
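To make the Procfile format concrete, here is a minimal Ruby sketch of how such a file could be read into a hash of process names and commands. It is an illustration only, not Foreman's own parser, and the `parse_procfile` helper is a name made up for this example. It assumes every non-blank line has the form `name: command`.

```ruby
# Illustrative sketch only, not Foreman's actual implementation.
# Assumes each non-blank line looks like "name: command".
def parse_procfile(text)
  text.each_line.with_object({}) do |line, processes|
    line = line.strip
    next if line.empty?

    # Split on the first colon only, so colons inside the command
    # (e.g. "rake jobs:work") are left untouched.
    name, command = line.split(":", 2)
    next if command.nil?

    processes[name.strip] = command.strip
  end
end

procfile = <<~PROCFILE
  delayed_job:    bundle exec rake jobs:work RAILS_ENV=production
  sphinx:         bundle exec rake thinking_sphinx:run_in_foreground RAILS_ENV=production
PROCFILE

parse_procfile(procfile).each do |name, command|
  puts "#{name} => #{command}"
end
```

Splitting on the first colon only is the one design choice that matters here, since most rake task names contain colons of their own.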

To start with, this makes local development very simple, because a single foreman start command is all that is needed to run everything in a single terminal tab:

[Screenshot: running foreman start in the terminal]

Foreman on the Server

When it comes to using Foreman on your server, it can export either to the widely supported inittab format, or to the newer event-based upstart system used in Ubuntu. Since Ubuntu is my server distro of choice, we’ll take a look at the upstart export here. Running foreman export upstart /etc/init will populate the /etc/init directory (feel free to substitute this with a local test directory of your choice) with the following files:

* `avalanche-delayed_job-1.conf`
* `avalanche-delayed_job.conf`
* `avalanche-searchd-1.conf`
* `avalanche-searchd.conf`
* `avalanche.conf`

“Avalanche” is the name of our Rails app. The app itself, as the owner of all the services, gets a single master config file, avalanche.conf:

pre-start script

bash << "EOF"
  mkdir -p /var/log/avalanche
  chown -R avalanche /var/log/avalanche
EOF

end script

Then, there’s a master config file for each of the services in the Procfile, such as avalanche-delayed_job.conf:

start on starting avalanche
stop on stopping avalanche

And finally, a config file for the processes that you want running for each of the services. In my case, it’s one process only, so I get a single avalanche-delayed_job-1.conf file:

start on starting avalanche-delayed_job
stop on stopping avalanche-delayed_job
respawn

chdir /Users/tim/Code/icelab/avalanche
exec su avalanche -c 'export PORT=5000; bundle exec rake jobs:work RAILS_ENV=production >> /var/log/avalanche/delayed_job-1.log 2>&1'

This is where the actual process launching happens. The script moves into your app’s working directory, then su’s to a non-root user of your choice, and then runs the process.

Each of these process-specific config files ultimately depends on the master app service (in our case, “avalanche”) to start and stop, which means a single start avalanche command will start everything, and the respawn directive in each of the config files ensures that the processes stay running. Surely there’s little else more reliable than the OS’s own init service to take care of this for you!
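The three-level layout described above (app, then service, then numbered process) can be sketched in Ruby as a map from each config file to the job it starts and stops with. This is only an illustration of the naming scheme seen in the exported files, not Foreman's export code. The `upstart_dependencies` helper and its concurrency hash are hypothetical.

```ruby
# Illustrative sketch: maps each exported config file to the upstart job
# it follows ("start on starting <job>"). Not Foreman's own code.
# `concurrency` is a hypothetical hash of service name => process count.
def upstart_dependencies(app, concurrency)
  deps = { "#{app}.conf" => nil } # the master app job follows nothing

  concurrency.each do |service, count|
    # One master config per service, tied to the app job.
    deps["#{app}-#{service}.conf"] = app

    # One config per running process, tied to its service job.
    (1..count).each do |n|
      deps["#{app}-#{service}-#{n}.conf"] = "#{app}-#{service}"
    end
  end

  deps
end

upstart_dependencies("avalanche", { "delayed_job" => 1, "searchd" => 1 }).each do |file, parent|
  puts parent ? "#{file} starts and stops with #{parent}" : "#{file} is the top-level job"
end
```

Because every file points up the chain to the app job, starting or stopping that one job cascades to everything below it.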

Managing Foreman with Capistrano

While Foreman’s export command makes it very easy to set up a server to run our processes, we’ll want to integrate it with our Capistrano deployment scripts. To get started, we’ll want to set up our sudoers file so that our admin user can easily export config files to the root-writable /etc/init directory using sudo:

Cmnd_Alias AVALANCHE = /usr/local/bin/bundle exec foreman export upstart /etc/init*,\
                       /sbin/start avalanche, /sbin/stop avalanche, /sbin/restart avalanche
admin ALL=(ALL) NOPASSWD: AVALANCHE

Then, we want to add some Foreman support to our config/deploy.rb file:

after 'deploy:update', 'foreman:export'
after 'deploy:update', 'foreman:restart'

namespace :foreman do
  desc "Export the Procfile to Ubuntu's upstart scripts"
  task :export, :roles => :app do
    run "cd #{release_path} && sudo bundle exec foreman export upstart /etc/init -a #{application} -u #{user} -l #{shared_path}/log"
  end

  desc "Start the application services"
  task :start, :roles => :app do
    sudo "start #{application}"
  end

  desc "Stop the application services"
  task :stop, :roles => :app do
    sudo "stop #{application}"
  end

  desc "Restart the application services"
  task :restart, :roles => :app do
    run "sudo start #{application} || sudo restart #{application}"
  end
end

Now, with every successful cap deploy, an updated set of Foreman-generated process configs is written to /etc/init, and then all the services are restarted so they’re working from your latest deployed codebase.

You’ll want to take notice of how we run the foreman export here. We’ve made sure that the application name is passed with the -a option, so that it doesn’t use Capistrano’s timestamped deployment directory as the base name of all the /etc/init config files. We’ve also passed our unprivileged deployment user with the -u option, which is the user that will run all the processes. Finally, with the -l option we’ve told it to send its log files to a directory that is already writable by our deployment user.

The final trick is in the foreman:restart task, where we try to simply start the application services before we try to restart them. This ensures that they’ll start properly the very first time you deploy with Foreman, and properly restart on all subsequent deployments (start fails if the processes are already running, and restart fails if they are not yet running).
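The reasoning behind the start-then-restart order can be checked with a toy model in Ruby. The `FakeJob` class below is a hypothetical stand-in for an upstart job and only encodes the two failure rules stated above. It does not talk to upstart at all.

```ruby
# Toy stand-in for an upstart job (hypothetical, for illustration only).
class FakeJob
  attr_reader :running

  def initialize(running)
    @running = running
  end

  # Like `start`: fails if the job is already running.
  def start
    return false if @running

    @running = true
  end

  # Like `restart`: fails if the job is not running yet.
  def restart
    @running
  end
end

# Mirrors the shell expression `sudo start app || sudo restart app`.
def start_or_restart(job)
  job.start || job.restart
end

[false, true].each do |already_running|
  job = FakeJob.new(already_running)
  succeeded = start_or_restart(job)
  puts "running before: #{already_running}, succeeded: #{succeeded}, running after: #{job.running}"
end
```

In both cases the combined command succeeds and the job ends up running, which is the property the deploy task relies on.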

Managing Thinking Sphinx with Foreman

One of my favourite aspects of Foreman is that it is the only part of your application that has to worry about daemonizing, pid file management and all the other ceremony required for reliable background tasks (I can’t recall just how many times I’ve had to fight with the daemons gem). This works well with any rake task or process that can run in the foreground, and most do without an issue.

The one exception I’ve come across so far is Thinking Sphinx, whose rake thinking_sphinx:start task sets an instance of searchd running in the background for you. Thankfully, Jay Zeschin has worked around this issue with a custom rake task to run searchd in the foreground, which you can make even simpler with the latest release of Foreman. So, stick this code in a lib/tasks/thinking_sphinx.rake file:

namespace :thinking_sphinx do
  # Make Thinking Sphinx play nicely with Foreman
  desc "Run searchd in the foreground"
  task :run_in_foreground => :environment do
    ts = ThinkingSphinx::Configuration.instance
    exec "#{ts.bin_path}#{ts.searchd_binary_name} --pidfile --config #{ts.config_file} --nodetach"
  end
end

Then tell Foreman to run rake thinking_sphinx:run_in_foreground, like in my Procfile at the top of this post.

To complete the picture, here is how we take care of Thinking Sphinx with Capistrano. In our deploy.rb:

# Include thinking sphinx's built-in capistrano tasks
require 'thinking_sphinx/deploy/capistrano'

before  'deploy:setup',       'thinking_sphinx:create_index_dir'
after   'deploy:update_code', 'thinking_sphinx:prepare'

namespace :thinking_sphinx do
  desc "Create directory for storing sphinx index files"
  task :create_index_dir, :roles => :app do
    run "mkdir -p #{shared_path}/sphinx"
  end

  desc "Prepare sphinx for running from a new release"
  task :prepare, :roles => :app do
    symlink_index_dir
    configure
  end

  desc "Symlink to the sphinx index dir from a new release dir"
  task :symlink_index_dir, :roles => :app do
    run "ln -nfs #{shared_path}/sphinx #{release_path}/db/sphinx"
  end
end

Caveats

Foreman has been working reliably for quite a while on the staging server where we are testing it. However, it is still a relatively young project and there are a couple of small pieces still missing.

Foreman doesn’t yet fully support the notion of passing a Rails environment to the commands that it runs. The 0.16.0 release does now let you specify an environment file when you run Foreman, e.g. foreman start -e .env, where the .env file could simply contain the following:

RAILS_ENV=production

The RAILS_ENV variable would then be used when running each of its processes. This is currently only supported in the foreman start command and not yet in foreman export, which is where we actually want to use an environment variable to specify production mode on the server. To get around this, we’ve manually appended RAILS_ENV=production to all our commands in the Procfile. This means they also run in production mode in our local development environments, but that is less of a concern for us.
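As a rough illustration of that workaround, the Ruby sketch below reads simple KEY=VALUE lines and appends them to a command, which is what we did by hand in the Procfile. Both helpers (`parse_env` and `with_env`) are hypothetical names for this example, and the parsing is an assumption about the simple case shown above rather than a description of how Foreman reads its environment file.

```ruby
# Illustrative sketch only. Assumes plain KEY=VALUE lines with no quoting.
def parse_env(text)
  text.each_line.with_object({}) do |line, env|
    key, value = line.strip.split("=", 2)
    next if key.nil? || key.empty? || value.nil?

    env[key] = value
  end
end

# Append each variable to the end of a command, the same way
# RAILS_ENV=production is appended to the rake commands in the Procfile.
def with_env(command, env)
  ([command] + env.map { |key, value| "#{key}=#{value}" }).join(" ")
end

env = parse_env("RAILS_ENV=production\n")
puts with_env("bundle exec rake jobs:work", env)
```

Appending the variables as trailing arguments works for rake commands, which accept NAME=value pairs on the command line; other kinds of commands may need the variables set differently.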

For this to work nicely with Thinking Sphinx, we’ve specified a single, relative path for its config file in config/sphinx.yml, for both development and production modes (by default it wants to generate a config file like config/production.sphinx.conf):

development:
  config_file: config/sphinx.conf
production:
  config_file: config/sphinx.conf

This means that when we run rake thinking_sphinx:configure or rake thinking_sphinx:index in either development or production mode, they use the same config file as the running Sphinx process managed by Foreman.

I’d say that this partially-supported situation with the environment variables is only temporary, given that the Foreman author David Dollar has implied that this support will eventually come to the export command.

A Single Missing Piece?

Finally, the one other piece of a complete background process management system is something to restart the processes if their memory usage gets out of control. I’d imagine some kind of setup would work where you generate a monit config file based on the contents of your app’s Procfile, and have it issue the appropriate start/stop/restart upstart commands if things get out of hand. When we get around to doing this, I’ll be sure to let you know. In the meantime, if you come up with a working solution, we’d love to hear from you!
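As a starting point, here is a rough and untested Ruby sketch of what generating such a monit stanza might look like. Everything specific in it is an assumption: the `monit_stanza` helper, the memory limit, and in particular the pidfile directory. The upstart configs shown earlier do not write pidfiles on their own, so that part would need to be solved separately.

```ruby
# Rough sketch only: builds one monit stanza per exported upstart job.
# The pidfile location and memory limit are hypothetical; the upstart
# configs exported above do not create pidfiles by themselves.
def monit_stanza(app, service, number, max_mb:, pid_dir:)
  job = "#{app}-#{service}-#{number}"

  <<~MONIT
    check process #{job} with pidfile #{pid_dir}/#{service}-#{number}.pid
      start program = "/sbin/start #{job}"
      stop program = "/sbin/stop #{job}"
      if totalmem > #{max_mb} MB for 5 cycles then restart
  MONIT
end

puts monit_stanza("avalanche", "delayed_job", 1,
                  max_mb: 200, pid_dir: "/var/run/avalanche")
```

The job names follow the exported file names (avalanche-delayed_job-1 and so on), so monit would hand the actual starting and stopping back to upstart.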