Serve Static Drupal Content Faster With Boost And Nginx

If your Drupal site receives a significant amount of anonymous traffic, Boost is for you. Following on from yesterday's article on XCache, where we went from 49 to 132 requests per second, we'll show you how Boost has taken us to an eye-popping 2516 requests per second for static Drupal content.

While it won't benefit everyone, there are a staggering number of Drupal-based sites out there that serve predominantly anonymous content. If you fall into this category you could do worse than consider adding Boost to your architecture.

Static vs. Dynamic Content

First things first. We need to define exactly what we mean by static Drupal content or anonymous traffic. Essentially it's content that remains the same no matter who is looking at it. Facebook, for example, is a site where each and every page is tailored for you: it's very dynamic. Conversely, The Times is (as far as I can tell) almost entirely static, adverts notwithstanding, and would be a prime candidate for Boost.

How Does Boost Work?

Boost is a module that replaces Drupal's in-built anonymous page caching. When a page is generated by Drupal it is written by Boost to the file system. This allows your web server to serve a static file (if it's available) instead of invoking PHP. Take a look below at Everita's cache:

 [email protected]:/var/www/drupal/6/drupal/cache# pwd
 [email protected]:/var/www/drupal/6/drupal/cache# find

What you can see above is a static version — ready to serve — of almost every page in the Everita website. Be warned that you must use Clean URLs for Boost to work.

What sets Boost apart is how well it is integrated into Drupal compared to something like Varnish. One rather excellent feature is that it knows which pages exist in the site and will crawl them, thus warming the cache for you. This gets around the problem of one user having to endure a tedious delay while the page is generated for the first time.

Time Is Money

This is very important for sites with a substantial amount of content. It's usually the case that the vast majority of pages are only visited once or twice a day (the so-called long tail), so chances are they won't already be in the cache. You could argue this doesn't matter. After all, if they are rarely in demand, why worry about caching them?

The point is this: according to research by Amazon and Google even a 500ms delay could result in 20% less traffic. While 500ms may seem insignificant, 20% certainly isn't. Warming your cache is important: don't waste your users' time by having them do it.

Installing Boost

Boost is no different from any other Drupal module: download it and extract it to your modules folder:

 cd /var/www/drupal/6/drupal/sites/all/modules
 tar -xzvf boost-6.x-1.17.tar.gz
 rm boost-6.x-1.17.tar.gz

Enable the module in Drupal by checking Boost, under the Caching heading at:

Now configure the Boost module at:

I had to create a directory called 'cache' under my document root and give my web server permission to write to it. The Drupal status report will tell you if anything is awry.
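As a rough sketch of that step, the helper below creates the directory and makes it writable by its owner and group. The function name is made up for illustration, and the www-data user in the usage comment is an assumption (it is the Debian default); substitute whichever user your web server and PHP processes run as.

```shell
# Create Boost's cache directory under a given document root.
# Hypothetical helper, not part of Boost itself.
make_boost_cache_dir() {
    docroot="$1"
    mkdir -p "$docroot/cache" || return 1
    # Owner and group may write; everyone else may only read and traverse.
    chmod 775 "$docroot/cache"
}

# Example (run as root), then hand the directory to the web server user.
# www-data is an assumption; use your own web server account:
#   make_boost_cache_dir /var/www/drupal/6/drupal
#   chown www-data:www-data /var/www/drupal/6/drupal/cache
```

Once the directory exists and is writable, the warning on the status report should disappear.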

Once that's done you can start configuring Boost, which has a myriad of options. I'll explain what I changed in order to get the best from my specific setup.

Configuring Boost For Nginx

Firstly I turned off Gzip page compression as Nginx does this for me. Obviously there's another performance gain to be had by serving up pre-zipped content rather than have Nginx do it on-the-fly. However, for the sake of simplicity, we'll leave this off for now.

Next I disabled caching of XML, CSS and JavaScript. Drupal continues to do this more than adequately leaving static files under /sites/ (assuming you've enabled bandwidth optimizations). Boost has only taken over page caching, nothing else.

Finally I enabled the cron crawler as discussed above. I've left the rest alone for the time being; clearly you can tailor the other options as you see fit.

So, Is It Working? Where Are My Cache Files?

Assuming your files are being cached under 'cache' (the default) you should begin to see .html files appearing. Note that if you're logged in — presumably as an administrator — you won't cause files to be cached as you meander through the site: you need to log out, browse the site, and check again.

 cd /var/www/drupal/6/drupal/cache
 find .
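To keep an eye on the cache as it fills, something like the following counts the pages written so far. It is only a sketch: the function name is hypothetical, and it assumes Boost writes pages with an .html extension as described above.

```shell
# Count the .html pages Boost has written beneath a cache directory.
count_cached_pages() {
    # tr strips the padding some versions of wc add around the number.
    find "$1" -type f -name '*.html' | wc -l | tr -d ' '
}

# Example:
#   count_cached_pages /var/www/drupal/6/drupal/cache
```

Run it, browse a few pages while logged out, and run it again: the number should go up.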

Configuring Nginx

As it stands you're now producing beautifully static .html files, but as yet no one is reaping the benefits. We need to tell Nginx to serve cache files if they exist, falling back to PHP and Drupal if they don't. Without further ado, here is the all-important snippet from my configuration file:

 server {
   set $boost "";
   set $boost_query "_";
   # G: the request is a GET
   if ($request_method = GET) {
     set $boost G;
   }
   # D: no Drupal session cookie, so the visitor is anonymous
   if ($http_cookie !~ "DRUPAL_UID") {
     set $boost "${boost}D";
   }
   # Q: no query string
   if ($query_string = "") {
     set $boost "${boost}Q";
   }
   # F: a cached copy of the page exists on disk
   if (-f $document_root/cache/normal/$host$request_uri$boost_query.html) {
     set $boost "${boost}F";
   }
   # All four flags set: serve the static file
   if ($boost = GDQF) {
     rewrite ^.*$ /cache/normal/$host$request_uri$boost_query.html break;
   }
   # Otherwise hand the request to Drupal
   if (!-e $request_filename) {
     rewrite ^/(.*)$ /index.php?q=$1 last;
     rewrite /(.*)/$ /index.php?q=$1 last;
   }
 }

Credit and thanks go to Mechanix for a healthy amount of direction.

Essentially the above states that a cache file is served only when all of the following are true:

  • The request is a GET
  • You're an anonymous user (not logged in)
  • There aren't any URL parameters
  • The file requested exists in the cache

Otherwise the request is passed on to Drupal as before.

For what it's worth, the $boost_query variable corresponds to 'Character used to replace "?"' under 'Generated output storage (HTML, XML, AJAX)' in the Boost settings.
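If the flag-building is hard to follow in Nginx syntax, here is the same decision restated as a small shell sketch. It is purely for reasoning about the rules and plays no part in serving pages; the function names are hypothetical, and the cache layout (cache/normal/host/path plus "_" and ".html") is simply read off the snippet above.

```shell
# Shell restatement of the Nginx rules, for illustration only.
BOOST_QUERY="_"   # should match 'Character used to replace "?"' in Boost

# Path Nginx checks: <docroot>/cache/normal/<host><request_uri>_.html
boost_cache_path() {
    printf '%s/cache/normal/%s%s%s.html\n' "$1" "$2" "$3" "$BOOST_QUERY"
}

# Build the flag string from: method, Cookie header, query string, cache file.
# A cached page is served only when the result is exactly GDQF.
boost_flags() {
    method="$1" cookie="$2" query="$3" file="$4"
    flags=""
    [ "$method" = "GET" ] && flags="G"
    case "$cookie" in
        *DRUPAL_UID*) ;;                # logged in: no D flag
        *) flags="${flags}D" ;;         # anonymous
    esac
    [ -z "$query" ] && flags="${flags}Q"
    [ -f "$file" ] && flags="${flags}F"
    printf '%s\n' "$flags"
}

# Example:
#   f="$(boost_cache_path /var/www/drupal/6/drupal example.com /about)"
#   boost_flags GET "" "" "$f"    # GDQF if the file exists, GDQ otherwise
```

A logged-in request, a POST, or a URL with a query string each drops one letter, so the comparison with GDQF fails and the request falls through to Drupal.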

That's it! I've a fairly basic site with equally simple URLs so your rules might become more complex but the principle is the same. Make sure you restart Nginx once you've made these modifications:

   /etc/init.d/nginx restart
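One way to check that Nginx is really serving from the cache is to look for the marker Boost leaves in the pages it writes. As far as I know, Boost 6.x appends an HTML comment beginning "Page cached by Boost" to each cached page, but treat that as an assumption and confirm it against one of your own cache files. The function name and example.com are placeholders.

```shell
# Succeeds if the HTML on standard input carries Boost's cache marker.
# The marker text is an assumption; check it against a file in your cache.
is_boost_page() {
    grep -q 'Page cached by Boost'
}

# Example (logged out, no query string):
#   curl -s http://example.com/ | is_boost_page && echo "served from cache"
#
# It is also worth validating the configuration before restarting:
#   nginx -t
```

If the marker is present on a page requested without cookies, the page was written by Boost rather than generated freshly for you. Comparing response times before and after is a useful second check that PHP is being bypassed.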

Clearing The Cache

The strategy you use for clearing your cache is very dependent on the type of site you have. By default, Boost will ignore calls from Drupal to clear the entire cache, preferring to refresh it according to its own settings.

I've turned this off by setting 'Ignore cache flushing' to disabled. This lets me continue to use 'Clear cache data' to clear the entire cache when I tinker with the site's CSS, for example. Mine is a small site, so it's less of an issue: my cache can be regenerated quickly. You might need to consider this more carefully. Rest assured, Boost affords you plenty of control over when and how this happens.


You can see the difference this has made compared to yesterday's efforts with XCache below. Don't be fooled: you still need XCache or similar — especially if you deliver dynamic content — Boost can't help you there. If your content is predominately static however:

 [email protected]:~# ab -n 10000 -c 2
 Server Software:        nginx/0.6.32
 Server Hostname:
 Server Port:            80
 Document Path:          /
 Document Length:        25793 bytes
 Concurrency Level:      2
 Time taken for tests:   3.974 seconds
 Complete requests:      10000
 Failed requests:        0
 Write errors:           0
 Total transferred:      260060000 bytes
 HTML transferred:       257930000 bytes
 Requests per second:    2516.26 [#/sec] (mean)
 Time per request:       0.795 [ms] (mean)
 Time per request:       0.397 [ms] (mean, across all concurrent requests)
 Transfer rate:          63904.10 [Kbytes/sec] received
 Connection Times (ms)
               min  mean[+/-sd] median   max
 Connect:        0    0   0.2      0       2
 Processing:     0    1   0.2      1       6
 Waiting:        0    0   0.1      0       5
 Total:          1    1   0.1      1       6

I could get a further superficial increase if I used the keep-alive option in ab (-k) but it's hardly worth it. As with any benchmark these should be taken with a pinch of salt. The point is, comparing like for like with yesterday's test, Boost is certainly worth considering.
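To put the figures side by side, here is the arithmetic using only the numbers quoted in these two articles (49 and 132 requests per second from the XCache test, 2516.26 from the run above). The speedup helper is just a throwaway name for the division.

```shell
# Requests per second reported across the two articles.
baseline=49      # before XCache
xcache=132       # with XCache
boost=2516.26    # with Boost serving static files

# Ratio of two throughput figures, to one decimal place.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

echo "Boost vs XCache:   $(speedup "$boost" "$xcache")x"    # 19.1x
echo "Boost vs baseline: $(speedup "$boost" "$baseline")x"  # 51.4x
```

That is roughly a 19-fold improvement over XCache alone for this one static page, with the same caveats about benchmarks as above.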
