Caching in Drupal 7: Varnish, APC, and Memcache

One of the things that I really like about Drupal 7 is how much easier it is to set up caching than in Drupal 6. It was as simple as installing Varnish, APC, and Memcache on my server, installing a few Drupal modules for my site, and updating my Drupal site's settings.php file.

Varnish

Varnish is a reverse-proxy HTTP accelerator that works great for quickly serving anonymous website traffic. Essentially, Varnish receives HTTP requests (e.g. for pages, images, CSS documents, etc.), gathers the necessary resources from one or more backend servers, caches them, and returns them in its response to the client. Once Varnish has cached a given resource, it serves that copy for the cache's lifetime instead of re-contacting any backend servers and processes, such as Apache/nginx and MySQL. Varnish can serve cached resources extremely quickly, and its use is typically said to speed up delivery by a factor of 100 to 1000. Any time a cached version of a page is served, the backend is not involved, saving server resources.

Once a request is no longer anonymous (e.g. there is a Cookie header present, such as after a user logs into the website), Varnish will bypass its cache and instead contact the backend servers directly for the resources needed to fulfill the request. One reason for Varnish to act this way is so that the server(s) can respond dynamically to the user (e.g. provide account information) as the user navigates through the site.

Using Varnish's VCL (Varnish Configuration Language) file, however, Varnish can be configured to ignore specific cookies. Most VCL file examples that I've seen ignore any Google Analytics cookies that are set, since the Google Analytics JavaScript tracking code is so often present on websites. If the VCL file were not set up to ignore these cookies, even otherwise anonymous requests would still miss the cache. There are many other tweaks and configuration options that can be applied through a VCL file, and these changes are often necessary, depending on the architecture of a particular website. Command-line tools like varnishstat help to identify problems with a particular Varnish configuration so that cache hit rates can be increased. There is some great documentation that covers all of this and more.
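To make the cookie logic a bit more concrete, here is a rough sketch of the idea. It is written in PHP purely for illustration; a real setup would express this in the VCL file, not in PHP. The function names are hypothetical, and the cookie name patterns (`__utm*` and `_ga`) are my assumption about what Google Analytics typically sets.

```php
<?php
/**
 * Hypothetical helper, for illustration only: removes Google Analytics
 * cookies from a raw Cookie header value and returns whatever is left.
 */
function example_strip_analytics_cookies($cookie_header) {
  $kept = array();
  foreach (explode(';', $cookie_header) as $pair) {
    $pair = trim($pair);
    if ($pair === '') {
      continue;
    }
    // The cookie name is everything before the first "=".
    list($name) = explode('=', $pair, 2);
    // Assumed analytics cookie names: __utma, __utmz, etc., and _ga.
    if (preg_match('/^(__utm[a-z]+|_ga)$/', $name)) {
      continue;
    }
    $kept[] = $pair;
  }
  return implode('; ', $kept);
}

/**
 * Treats a request as anonymous (and so cacheable) when no cookies
 * remain after the analytics cookies have been stripped.
 */
function example_is_cacheable($cookie_header) {
  return example_strip_analytics_cookies($cookie_header) === '';
}
```

With this logic, a request carrying only analytics cookies still counts as anonymous, while a request that also carries a session cookie does not.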

APC and Memcache

I'm not going to talk very much about APC or Memcache, since there is plenty of information about installation and configuration available for both, and I think that since both function squarely in the backend, they're less difficult to understand. In my single-server setup, APC and Memcache also required very little configuration. I do think that it's important to note that APC is PHP-specific and is somewhat limited. There is no way of which I'm aware, for example, to share an APC cache across load-balanced PHP servers. In a multi-server environment such as this, it is better to use Memcache. It is also my understanding that APC works better when a cache is unlikely to change very often, so I'm only caching cache and cache_bootstrap in APC and using Memcache as my default caching mechanism.
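For a multi-server environment, the Memcache module can be pointed at several memcached instances from settings.php. The following is only a sketch: the IP addresses and the key prefix are made-up placeholders, and the variable names are the ones I understand the Memcache module's README to describe, so check them against the version you install.

```php
// Assumed example addresses; replace with your own memcached hosts.
$conf['memcache_servers'] = array(
  '10.0.0.11:11211' => 'default',
  '10.0.0.12:11211' => 'default',
);
// Placeholder prefix, useful when several sites share the same
// memcached instances.
$conf['memcache_key_prefix'] = 'example_site';
```

Because every PHP server talks to the same memcached pool, they all see the same cached data, which is what APC cannot offer.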

Drupal 7: Module Installation

Here are the four caching-related Drupal modules that I have installed: Varnish, APC, Memcache, and Cache Expiration.

Three out of the four modules are fairly obvious; they provide Drupal integration for Varnish, APC, and Memcache. Cache Expiration is great because it provides support for dynamic cache expiration when nodes, users, comments, taxonomy terms, and more are added or updated. This is important because it allows a site administrator to head over to Configuration → Development → Performance and raise the "Expiration of cached pages" value way up, increasing cache hit rates. There is no need to worry, because if a new node is posted or a user comments on an existing one, the page will be dynamically purged and anonymous users who visit the page will see the new content, even if the lifetime of the cached page had not yet been reached.
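The same performance settings can also be pinned in settings.php instead of being set through the admin page. This is a sketch under the assumption that these Drupal 7 core variables map to the Performance form the way I remember; the one-day value is an arbitrary example.

```php
// Cache pages for anonymous users.
$conf['cache'] = 1;
// No minimum cache lifetime, so explicit purges take effect right away.
$conf['cache_lifetime'] = 0;
// "Expiration of cached pages", in seconds (example value: one day).
$conf['page_cache_maximum_age'] = 86400;
```

Values set in settings.php take precedence over whatever is saved through the Performance form.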

Drupal 7: settings.php

As part of the process of getting the preceding modules set up, I needed to make some changes to my settings.php file. First, I let Drupal know that a reverse-proxy server (Varnish) is being employed for the site. After finding the following line, I uncommented it:

$conf['reverse_proxy'] = TRUE;

I also searched for and uncommented the reverse-proxy header:

$conf['reverse_proxy_header'] = 'HTTP_X_CLUSTER_CLIENT_IP';

Next, I uncommented the reverse-proxy addresses configuration line and set the address to my localhost (127.0.0.1):

$conf['reverse_proxy_addresses'] = array('127.0.0.1');
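To show why these three settings belong together, here is a simplified sketch of how a client IP address can be resolved behind a trusted reverse proxy. It is not Drupal's actual implementation; the function name is hypothetical, and it assumes that the proxy really sends the header named in reverse_proxy_header.

```php
<?php
/**
 * Hypothetical, simplified sketch of client IP resolution behind a
 * trusted reverse proxy. Not Drupal's actual code.
 */
function example_client_ip(array $server, array $conf) {
  $ip = $server['REMOTE_ADDR'];
  if (empty($conf['reverse_proxy'])) {
    return $ip;
  }
  $header = isset($conf['reverse_proxy_header'])
    ? $conf['reverse_proxy_header']
    : 'HTTP_X_FORWARDED_FOR';
  $trusted = isset($conf['reverse_proxy_addresses'])
    ? $conf['reverse_proxy_addresses']
    : array();
  // Only believe the forwarded header when the request really came
  // from one of the trusted proxy addresses.
  if (!in_array($ip, $trusted, TRUE) || empty($server[$header])) {
    return $ip;
  }
  // The header may hold a comma-separated chain of addresses; take the
  // right-most one that is not itself a trusted proxy.
  $chain = array_map('trim', explode(',', $server[$header]));
  $untrusted = array_diff($chain, $trusted);
  return $untrusted ? end($untrusted) : $ip;
}
```

If the configured header is missing from the request, this sketch falls back to the proxy's own address, so every visitor would appear to come from 127.0.0.1. That makes it worth confirming which header your proxy actually sets.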

Finally, I added several cache configuration options at the bottom of my settings.php file to get everything running:

// Varnish
$conf['cache_backends'][] = 'sites/all/modules/contrib/varnish/varnish.cache.inc';
$conf['cache_class_cache_page'] = 'VarnishCache';
// Drupal 7 does not cache pages when we invoke hooks during bootstrap.
// This needs to be disabled.
$conf['page_cache_invoke_hooks'] = FALSE;

// Memcache
$conf['cache_backends'][] = 'sites/all/modules/contrib/memcache/memcache.inc';
$conf['cache_default_class'] = 'MemCacheDrupal';
$conf['cache_class_cache_form'] = 'DrupalDatabaseCache';

// APC
$conf['cache_backends'][] = 'sites/all/modules/contrib/apc/drupal_apc_cache.inc';
$conf['cache_class_cache'] = 'DrupalAPCCache';
$conf['cache_class_cache_bootstrap'] = 'DrupalAPCCache';
//$conf['apc_show_debug'] = TRUE; // Remove the slashes to use debug mode.

Drupal must be made aware of Varnish, Memcache, and APC via cache_backends, and cache bins can then be assigned to specific caching mechanisms (cache_page for Varnish, cache and cache_bootstrap for APC, and any other cache bins default to Memcache). I'm keeping cache_form in the database because that bin must be assigned to non-volatile storage. If the data in the cache_form bin is lost, forms that are still in progress can be invalidated. There's a fairly long-standing issue in the Drupal 8 dev queue that goes into a little bit more detail about this.
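The bin-to-backend assignment can be pictured as a simple lookup. The sketch below is not Drupal's code; the function name is hypothetical, and it assumes (as I understand Drupal 7) that a per-bin override is stored under cache_class_ followed by the full bin name, with cache_default_class as the fallback.

```php
<?php
/**
 * Hypothetical sketch of the bin-to-backend lookup. Not Drupal's code.
 */
function example_cache_class_for_bin($bin, array $conf) {
  // A per-bin override wins if one is configured.
  $key = 'cache_class_' . $bin;
  if (isset($conf[$key])) {
    return $conf[$key];
  }
  // Otherwise use the configured default, or the database cache.
  return isset($conf['cache_default_class'])
    ? $conf['cache_default_class']
    : 'DrupalDatabaseCache';
}

// The assignments described in this article.
$conf = array(
  'cache_class_cache_page' => 'VarnishCache',
  'cache_default_class' => 'MemCacheDrupal',
  'cache_class_cache_form' => 'DrupalDatabaseCache',
  'cache_class_cache' => 'DrupalAPCCache',
  'cache_class_cache_bootstrap' => 'DrupalAPCCache',
);

foreach (array('cache_page', 'cache', 'cache_bootstrap', 'cache_form', 'cache_menu') as $bin) {
  echo $bin, ' => ', example_cache_class_for_bin($bin, $conf), "\n";
}
```

A bin with no override of its own, such as cache_menu, falls through to the Memcache default, while cache_form stays in the database.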

Comments

Excellent article!!! Very helpful.

I am working on using this same setup; however, I have 2 questions...

1. I am using the Boost module at the moment. Would it hurt to leave Boost enabled while using this setup?

2. Do you think using "Authcache" would help for authenticated users?

I'm thinking of using Boost, Varnish, and APC for anonymous users, and Memcached with Authcache for authenticated users. Then using "shadow" for SQL queries (Views optimization) on filters/sorting.

Would you say this is overkill? I use Views quite heavily and have many anonymous and authenticated users, but I need a fast site. What would you recommend, and how do you think I should configure this? Thanks again for this great article.

- JC


Hi JC,

I apologize for the delay in my response.

  1. While I don't think that leaving Boost enabled would hurt anything, it's also unlikely to provide you with any additional benefit when used with Varnish. For your anonymous traffic, Varnish will serve requests without them even hitting your application layer, which is where Boost is going to provide its benefits. Once you have Varnish configured and humming along, I can't think of any reason why Boost would be necessary. Any more advanced configurations that you would need could be handled through your VCL file.
     
  2. Authcache should help with authenticated users, but I've had issues getting it to work properly in the past. I haven't tried configuring it recently (within the past year or so), however. I'm going to check out a more advanced configuration with Varnish to actually serve authenticated page content through it as well, as outlined in http://joshwaihi.com/content/authenticated-page-caching-varnish-drupal. For content that needs to be served dynamically and personalized for a particular user, ESI (edge side includes) should do the trick.

Other than Boost, I would not say that anything you've mentioned would be overkill, especially with heavy use of Views. I've always found tweaking Views caches to be one of the more difficult pieces to get right, but I think that your configuration will have to be very specific to your site. The frequency of how often your content gets updated (and where that content lives) will play a huge role in optimizing your configuration.

Thanks for your praise of the article, and I hope this helps!

- Rich