Chapter 4: Web performance

User experience is what matters, and it is not easy to put objective figures on how fast a page is perceived to be. However, by observing a number of technical details you can identify things that affect user experience – such as how long it takes to load a webpage. With these metrics you can eliminate, or at least mitigate, potential problems and prevent the bottlenecks that can occur, for example, when a very successful marketing campaign goes viral.

You should make sure that relatively small proactive measures are carried out, or put a performance budget into the plan for the next major release.

A successful marketing campaign is a planned event for which you have time to prepare yourself and the website. Still, we do not always get the chance to prepare. The website can be hit by an unmanageable flood of traffic landing on content pages and taxing most of the server resources, and we can also run into congestion of various kinds. This chapter suggests some preventive emergency planning to make life easier for the web-related part of the organization.

Working with performance optimization has many more benefits than just mitigating complete failures, especially now that more and more users reach the Web via cellular connections. Since you should not let visitors wait longer than necessary before they can interact with a website, it is important not to be wasteful in the transmission of files. Send as little data as possible to visitors, since they sometimes have dubious connections to the Internet, for instance in a sparsely populated area or the countryside.

Since 2009, Google has been working hard to help improve the performance of websites, both by providing tools to analyze performance, and also by providing motivation. The following statement in a blog post from Google more than suggests that they will penalize you if your website has poor performance:

“Like us, our users place a lot of value on speed – that’s why we’ve decided to take site speed into account in our search rankings.”
– Google Webmaster Central Blog34

Google itself has stated one second in its performance budget: one second as the time limit for a page to load completely in any of their services, according to statements from their performance specialist, Ilya Grigorik. Studies show that a few tenths of a second of lag are enough for a user to lose the feeling of instant response when trying to use a website, or any type of digital product, according to the Nielsen Norman Group35.

Planning for the unplanned

There are plenty of examples of when it pays to be prepared and proactive about web performance, even though the value becomes most obvious when something goes awry. A few examples of websites that ran into performance problems follow.

The major Swedish newspaper, Aftonbladet.se, during the September 11 attack

On September 11, 2001, the most widely watched terrorist attack in modern times occurred. The event was filmed and broadcast on television in real time, and those Swedes who at the time had access to the Web (probably mostly at work and school) knew Aftonbladet as the most established Swedish media outlet on the Web. It did not take long before Aftonbladet’s normal website went down, which I suspect happened to other big media organizations all around the world. Instead of the usual website with its flexible content management (Vignette Story Server, if I recall correctly), they had to switch to a manually updated, sparse version of the website to give visitors the current news they were looking for.

Figure 65: A stripped down version of the newspaper Aftonbladet, stating thousands of dead at the World Trade Center in New York.
Figure 66: Meanwhile, at latimes.com, a simple design, probably under heavy load too.

Until this extraordinary event occurred, I at least was not aware of the importance of the media’s online role, or that websites could go down in a way that made it extremely difficult to get them back online. That websites might be slow was an everyday experience, but a website being unreachable for a long time was unexpected.

A corresponding disaster scenario on a small, local scale might be a train accident with spillage of hazardous chemicals. It can easily have an impact on the websites of the surrounding municipalities and the transport company concerned. The longer you can keep your website up and running, the fewer people need to think about whether they have a traditional FM radio receiver lying around.

Search Engine Optimization strikes against slow-loading pages

The governmental agency of Western Sweden, Region Västra Götaland, had a website with regional healthcare information, a website I inherited from a developer who left the company where I worked. In 2009, the website consisted of a mix of national content, fed by an API, and self-produced content. Like most sites, we had been doing search engine optimization work to make it easy to find the website’s content through search engines. Then, suddenly, one day in the fall there was, according to the news channels, an outbreak of swine flu, and there was immediately a high number of requests for such content since the media had started writing hysterical headlines about death, pandemics and the end of humanity.

Figure 67: A sudden increase of visitors because of swine-flu. Stats from Region Västra Götaland’s healthcare-portal.

What was the problem? Well, the pages that were now popular were built from a national resource of texts, retrieved from an external web service for every single page view. Previously, most visits to the website were to pages that did not have external dependencies. The external API had already shown minor performance issues with response times, which, before the swine-flu epidemic, were not considered a priority. The API was not designed to support an increased load, no one had ever demanded it, and back then what is today popularly called ‘the cloud’ was not established or offered as a bail-out the way it is today.

After some initial confusion as to why the website went down, the editors used editorial work combined with search engine optimization to remedy the problem. In 2009, individual keywords clearly weighed heavier in Google rankings than they do today, so creating new sub-pages that matched the searched-for keywords even better could direct traffic past the pages that were difficult to load. Web analytics and editorial work saved the day.

The energy company E.ON’s website goes down during a minor storm

In October 2013, a category 3 storm hit Southern Sweden. Tens of thousands of people were without power. Whether it was those without power or others who visited E.ON’s website via their mobiles I do not know, but on that same day I entered their website and it took an unprecedented 37 seconds for the server to respond with the first byte, and an additional 100 seconds to load the page completely.

Figure 68: Power company E.ON’s page took 37 seconds to respond and 100 seconds to load, as stated in the browser’s status bar.

I took a quick look at their performance, as Google measures it in its Pagespeed Insights service. E.ON had a rating of 81 out of 100, which is by no means remarkably bad. As an external spectator of their stripped-down disaster version of a website, I nevertheless noticed a number of things that could have made their emergency website perform better:

  • Do not send unnecessary stuff, such as custom fonts, on a disaster version of a website. Is typography really your major concern in these situations? The custom font is about the same file-size as the HTML document, and the HTML is what carries the actual content to visitors.
  • Instruct visitors’ browsers how long files are supposed to live. The logo and favicon, for example, will most likely remain unchanged for some days to come. When these unchanged files are not requested from the servers, E.ON’s servers are able to serve more visitors.
  • Losslessly optimized images could have reduced image traffic by 18 %. Images are significantly heavier than text, and unnecessarily large files can unfortunately make a difference in the wrong direction when it comes to crisis communication.
  • Is it necessary to send 88 kB of style sheets to design text like ‘Excuse us – our website is not working right now’? Specifying that the headings are red and the body text black actually takes ten times as much data as the readable content itself.

Some organizations with foresight prepare and rehearse emergency and contingency plans for dealing with too much pressure on their website. To buy themselves more time, and perhaps in some fortunate cases avoid entering emergency mode at all, it is worthwhile to think about web performance.

If you switch to crisis mode on your website, you have to make sure that all established URLs pointing to your website still work when the crisis occurs. Do not expect users to start erasing characters in the address bar to reach a possible home page – you have already failed them at the first contact attempt. In addition, they may be using a mobile device whose address bar only shows the domain, not whether they are on a sub-page, which has been the case with Apple’s mobile Safari browser since iOS 7, so a user might not even be aware that they are on a sub-page.
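As an illustration of keeping established deep links alive in crisis mode, here is a minimal sketch assuming a Node.js/Express front; the paths and the framework are my assumptions, not taken from the examples above.

```typescript
import express from "express";

const app = express();

// Hypothetical example: well-linked sections get a targeted redirect
// to the matching part of the stripped-down crisis site ...
app.get(["/outages", "/outages/*"], (_req, res) => res.redirect(302, "/crisis/outages"));

// ... and every other established URL still answers, falling back to the
// crisis start page instead of a dead 404.
app.use((_req, res) => res.redirect(302, "/crisis"));

app.listen(3000);
```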

Of course, we need other communication channels when communicating during a crisis. For example, you can add text messaging, or reach out via the media, but you still do not want to be known as the one who contributed to congestion on the Internet just because you did not think of obvious optimizations.

Performance optimization of databases, web servers and content management systems

Depending on whom you ask, you will get different perspectives on performance, and on what affects how quickly a website displays itself to a user. If you talk to a database administrator, they will talk about database structure and probably express skepticism about how the handover of data to the content management system, or whatever the developers have built, is handled. A developer is often focused on efficient code, something that is certainly important but only in extreme cases has a significant impact on user experience. User interface developers often try to minimize the amount of code, reduce the number of interface elements on the page, cut file sizes and reduce the number of files sent to users.

All of this is, for sure, confusing for those who have not worked much with optimization. I think this part is what everyone involved in website management needs to know, even though certain parts may already be familiar to some. This chapter is likely to be somewhat technical, but hang in there and google whatever you are not familiar with.

General troubleshooting

If you think your website is slow, and it is not obvious what is wrong, it is always a good idea to have a technically minded person look through the log files for error messages. Frequently, the database server, web server, content management system and similar report errors to a log file when experiencing problems. It varies where you find this log file, and it sometimes requires prior knowledge to judge whether an error is a serious one. If you do this yourself, it is a good idea to google the error you discover (if you did not know, your developers google the same errors before talking to you), since someone else has probably encountered the same problem before you did.

Figure 69: In Windows’ Event Viewer, you can view all kinds of trouble the server encountered.

If you are somewhat technically minded, there is usually a tool for each part of the system or one where you can check the server’s health.

With today’s complex websites, where many interactions are handled directly in the browser (without reloading the page, that is), delays sometimes occur locally on the user’s computer. This indicates that you need to do something about your interface code: your HTML, Javascript and CSS. Maybe the page contains too many design elements, or CSS that is not efficient enough, such as when toggling between hiding and displaying tens of thousands of design elements. If transitions and animations in the browser feel sluggish, this indicates that the interface code is too complicated. Googling for performance testing of CSS, the user interface or the frontend will probably help, unless the source of the error is already obvious to you.

Figure 70: Chrome’s network view when accessing amazon.com reveals the timeline of and info on the files received.

A test you should do before you give up and call in the pros is to visit the sluggish page on your website with the browser’s network tools enabled. Most browsers offer some tools for developers, and what the network view (sometimes called the timeline) does is show what happens while the files load. You may discover that files are sent in the wrong order, that there are more files than you think you need, or that part of the download time is wasted waiting for the server to start sending anything at all. For example, when I visited Aftonbladet.se with Chrome as my browser, Chrome showed an initial latency of 0.074 seconds before anything was sent. We can often identify or rule out the problem just by looking at patterns in what is slow. Remember that network issues and other perceived slowness may depend on your own equipment and connectivity, so it is a good idea to test various devices and connections when you are troubleshooting.

Planning for high load – use cache!

A cache is a precompiled version of something, a webpage or any part of an information system, reflecting information that has not changed recently. The point of a cache is to offer frequently used content directly from memory, with no need for it to be computed, or compiled, again and again for every user. It is much faster to send content to a user if the content is already in a cache in the server’s memory; then the server does not need to check databases or talk to external APIs for each page view.

There are many variations of how caching can be used, but the two most meaningful from a developer’s perspective are so-called accelerators, which rebuild their cache at the web server end only when content has changed, and the cache you have in your own web application. An accelerator works a bit like a filter between the website and the users: if the content has not changed, a cached version of the page is sent from the accelerator without even involving the web server. This way, we have two specialized pieces of software: the accelerator, to send files as quickly as possible when the content is unchanged, and a web server, to compile web pages and run the content management system (where editors update content).
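To make the idea of an in-application cache concrete, here is a minimal sketch with a simple time-to-live, assuming a Node.js/Express site; the sixty-second lifetime and the renderPage function are assumptions for the example, not a recipe.

```typescript
import express from "express";

const app = express();
const cache = new Map<string, { body: string; expires: number }>();
const TTL_MS = 60_000; // keep a compiled page for sixty seconds (assumed value)

// Hypothetical page renderer doing the expensive work: database queries, API calls, templating.
async function renderPage(url: string): Promise<string> {
  return `<html><body>Rendered ${url} at ${new Date().toISOString()}</body></html>`;
}

app.get("*", async (req, res) => {
  const hit = cache.get(req.url);
  if (hit && hit.expires > Date.now()) {
    res.send(hit.body); // served straight from memory, no database or API involved
    return;
  }
  const body = await renderPage(req.url);
  cache.set(req.url, { body, expires: Date.now() + TTL_MS });
  res.send(body);
});

app.listen(3000);
```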

Figure 71: Even accelerators can fail when put under extreme load, in this case Varnish cache server.

Using an accelerator does not solve all of your performance-related problems. However, accelerators are incredibly useful under huge loads on semi-dynamic websites, where static information is supplemented by users’ comments, for example.

Content Delivery Networks (CDN)

Glossary – static files
A file whose content stays exactly the same throughout its lifetime. A frequently used example is the Javascript library jQuery, which version-controls its releases: the contents of the file for jQuery version 1.9 are the same regardless of when, and from where, they are accessed.

A CDN is a network of servers located around the globe that is used to serve the Web with content. Depending on your website’s needs, these services can be free of charge or cost a lot of money based on usage. A common free example, which many people use, is Google’s CDN for running the Javascript framework jQuery on their websites. When retrieved, jQuery is not sent from the visited website but from a nearby CDN server. Another common use is streaming large amounts of video, something that draws a lot of traffic.

Sending such material from a data-center in the user’s vicinity is important because it takes unnecessary time for the user to be connected to a server that happens to be far away. Internet traffic within a continent works better than traffic that has to go through cables under an ocean. This means that if you run a service that streams video, it makes a lot of difference whether you have servers strategically placed, or rent servers in major networks, and spread your content across the globe.

The above example of retrieving jQuery, which is common on most websites, makes very good use of a CDN. If your visitors have recently been on another website that fetches jQuery 1.9 from Google’s CDN, they will not be required to download the file yet again if your website uses the same file version from the same CDN. The browser notes that it is the same file as previously downloaded and fetches it from its lightning-fast local cache instead.

jQuery 1.9 is a static file: when an update is released, a new version number is used, which means a new filename. All addresses to the new version differ from those of older versions, telling browsers that there is new content they have not yet downloaded. Many images should be handled the same way, placed in a more or less large CDN repository for fast transmission.
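The same principle can be applied to your own static files by putting a content hash in the filename, so a changed file automatically gets a new address while unchanged files keep their already-cached one. A small sketch in Node.js; the file names are hypothetical.

```typescript
import { createHash } from "node:crypto";
import { copyFileSync, readFileSync } from "node:fs";

// A new content hash means a new filename, and therefore a URL the browser has
// never seen; an unchanged file keeps its old, already-cached address.
const source = "public/logo.png"; // hypothetical file
const hash = createHash("md5").update(readFileSync(source)).digest("hex").slice(0, 8);
copyFileSync(source, `public/logo.${hash}.png`); // e.g. public/logo.3fa9c2d1.png
```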

A variant that often pays off, even with small hosting companies, is to create a subdomain like media.mysite.com where you put pictures, sounds and so on. The point is to separate the complicated work of compiling web pages, with connections to databases, APIs and so on, from the simpler task of sending static files. Web server software such as Apache and Internet Information Services is not optimized for sending large amounts of static files and can, with a CDN or a separate function, be relieved of this chore and instead focus on the heavier jobs.

Databases

When trying to identify problems with databases, there are tools available for all major database environments. The most commonly used are logging and profiling, to gain insight into the database’s work. Logging is exactly what the name suggests: a log file that, in this context, collects the queries that take too long to execute, or errors that can affect performance.

A database query should not, in normal cases, take more than a tenth of a second to run. If it does, either the system is not correctly designed, or your database environment needs to be reviewed. If you have enabled logging of slow queries, you might see a pattern after having amassed a few thousand logged queries, and thanks to this you can start looking for a solution.

The difference between logging and profiling is that profiling means you manually monitor performance for a short while and then turn it off when you think you have found something worth improving. The risk with profiling is that it becomes a performance problem in itself, which can also be the case with logging if you are unfortunate, or forget to turn it off.

Glossary – database index
Just like an index, or table of contents, in a book: pre-built bundles of frequently sought-after data from a database table. For example, a field used for filtering needs to be included in the index to give fast response times.

Can you quickly list products by category? If not, the category field in the database table may need to be indexed for quick lookup. In a relational database (such as SQL Server or MySQL), it is common to have to work with database indexes so that they reflect how the database is actually used. An index is a bit like a table of contents and lets you choose which parts of a database should be quick to look up.

I previously had a database with millions of records in its largest table. Just by adding an index, I lowered a typical database query’s execution time from over one hundred seconds to less than a tenth of a second. Without that optimization, it would not have been viable for the website to query the database directly, since the webpage’s response times would have been several minutes!
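As a sketch of what such a fix can look like, here is the idea expressed from Node.js through the mysql2 package; the table and column names are made up for the example and are not from the case above.

```typescript
import mysql from "mysql2/promise";

const db = await mysql.createConnection({ host: "localhost", user: "app", database: "shop" });

// Before the fix, EXPLAIN shows a full table scan ("type: ALL") for the category listing.
const [plan] = await db.query(
  "EXPLAIN SELECT id, name FROM products WHERE category_id = ?",
  [42]
);
console.log(plan);

// Adding an index on the filtered column lets the server jump straight to the matching
// rows instead of reading millions of records for every page view.
await db.query("CREATE INDEX idx_products_category ON products (category_id)");
```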

Database servers are fairly complicated to wrap your head around. Frequently, it is an erroneous, or unfavorable, configuration that prevents the database server from working properly. Sometimes it is simply that you have outgrown your cheap hosting company, or perhaps that you need a dedicated database server. I have seen database servers that were generous in hardware but had configurations that only accepted a few simultaneous connections. This means that when the website has many visitors, a queue of users builds up waiting for an available connection to the database server, even though the server is not necessarily overloaded. At times like these, the kind of cache mentioned earlier is often appreciated, since it removes the need to connect to the database server at all.

Among the more drastic measures is redesigning the database based on how it is used. This may involve creating copies of the information, optimized only for reading, while the original content is used to keep the database’s integrity intact. Others have chosen to partly, or fully, change the database architecture, for example placing some data in so-called NoSQL databases or memory caches, since these do not have to put any effort into maintaining relationships within the dataset the way a relational database does.

In larger organizations, it is common to have an enterprise search solution that indexes the entire content of databases and other information systems. If your search engine already holds all the data from your databases, you might as well let the search engine serve data to your website, something a search engine does very quickly. It is then important that database changes are indexed quickly, so that the search engine does not present stale content.

Web servers, content management, own source code and external dependencies

There are a thousand and one ways to optimize the performance of your technical environment; it is mainly a question of how much time and money you have. Even under normal circumstances, a website encounters growing pains on several occasions during its lifetime, not only because of occasional traffic peaks but also as the content grows in scale, or when your own code grows in complexity.

“A typical CMS is like a digestive system with no capacity to poop.”
– Gerry McGovern, @gerrymcgovern on Twitter

Frequently used web solutions in order of magnitude, smallest first:

  1. Web hosting account at 10 dollars a month – includes a web server shared with many others and a database in a separate database environment shared with many other customers. Inexpensive and an easy start for a small website.
  2. Dedicated server / colocation – renting space or a physical machine in someone else’s server room. These solutions usually include specific operational support, such as rebooting and replacement of faulty hard drives, but all the software on the server is your own problem (and opportunity).
  3. Virtual Private Server (VPS) – a rented virtual server where you yourself control and set up the behavior of the server. Often you can crank up performance when necessary; this is what many refer to as the cloud.
  4. Load-balanced server farm – sometimes hired in someone else’s server room or in your own hall if you are a large organization. The load is distributed between multiple cooperating servers to give visitors the information they need as fast as possible.

There is no guarantee that a website works better on a dedicated server compared to a budget web hosting server. To have your own server requires knowledge and plenty of time to make sure that all settings are optimal for the website’s needs. The difference between low budget and large-scale solutions is that you can make adjustments manually, but then it is also your problem alone to manage the unique environment you created.

Often there is a point in spreading a larger website, architecturally, across multiple servers to achieve the best performance: for instance, one server that only serves static content such as images (a bit like a private CDN), another that only takes care of the databases, and a third that manages the presentation of the website and its content management. These server roles have different needs in terms of hardware, configuration and software. If you divide the chores and give the servers specialized roles, they perform better than generalist servers.

Depending on what heavy content the individual website needs to serve, you may be forced to expand the environment. In an Episerver CMS environment I am familiar with, we needed an additional server just for the web editors, and all in all three servers for visitors. If I remember correctly, this was because there were many editors in the organization and the load they put on the server was tremendous. They would move thousands of pages around in the pre-production environment, and these changes were reflected every hour in the production environment. This kept the production environment online, but at the expense of editorial content lagging behind somewhat.

Speaking of being responsible for configuring your own physical or virtual servers: in addition to building a fault-tolerant system, it is a good idea to look at the external dependencies you have and how prepared they are for a significantly higher load. Most likely there are settings in the application’s configuration files, or in the web server settings, for adjusting how long you wait for a response from APIs, for instance. A web server can only serve a limited number of clients at the same time, so you do not want it to waste time waiting for a sluggish external API before it can serve a page to a user.
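The same cap can be put in application code. A sketch using the standard fetch API with an abort signal; the URL and the two-second limit are assumptions.

```typescript
// Cap how long a page render may wait for an external dependency, so one slow API
// does not tie up the worker that should be serving visitors.
async function fetchWithTimeout(url: string, timeoutMs: number): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}

// Give the external content API at most two seconds, then fall back to cached text.
const response = await fetchWithTimeout("https://api.example.org/articles/latest", 2000);
```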

If you talk to a developer, their experience of performance optimization is often limited to what they can do with the source code that constitutes an application. In my experience, it is rarely worth starting with the source code, no matter what the programmer thinks. It is not cost-effective, since there is often more performance to be gained in databases, configuration and making sure you have the right hardware. Hardware is often inexpensive compared to having consultants go through the source code.

It will not hurt, of course, to take a quick look at the source code and see if there are glaring errors. Things I would fix in the source code are unnecessary concatenations of text (the joining of multiple text strings into one) and unnecessary use of database connections, for example when it takes more than one database query to retrieve, or write, data. Unfortunately, however, it is difficult to assess precisely what impact a change will have before it is made and tested.

For those who take coding seriously, there are tools for most environments, ReSharper for Visual Studio for instance, that keep track of the code while the developer is writing it. Some versions of Visual Studio also have built-in features to load-test a website; it can be interesting to see how the website behaves under stress. Many of these tools can also check and propose adjustments to the source code. In large development projects, there is often something called a build server, which is another control instance that checks whether the code quality is high enough before the code reaches production. In build servers such as Jenkins, there are various reports to take note of, and not just about performance.

Measuring and improving interface performance from the user’s perspective

A common way to get an idea of your website’s performance is to measure how long a page takes to load. This hides details such as latency, the time before the website begins transferring information to the user. In some relatively rare cases, the wait can be longer than the transfer itself. Since the remedies differ, it is important to identify the problems in order of priority.

Helpful tools

There are several ways to get figures on wait time versus transfer time. I prefer to have this information directly in my browser, since you otherwise might miss the moment a website is slow if you only go looking for the speed after a sluggish page view. An example of a tool that measures website response time is the Firefox extension app.telemetry Page Speed Monitor36. It shows how long it takes the server to send the first binary one or zero to the user, a valuable figure since a high value suggests an overloaded server or a misconfiguration. It is important to point out that these values can vary greatly between different sub-pages. If you know of specific page templates with a particularly complex structure, with dependencies on external information, they are definitely worth testing to see how they behave.
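If you prefer to measure this yourself, the browser’s built-in Navigation Timing API exposes the same figures. A small TypeScript sketch of reading them from the page’s own script:

```typescript
// Time to first byte and total load time for the current page, straight from the browser.
const nav = performance.getEntriesByType("navigation")[0] as PerformanceNavigationTiming;

const ttfb = nav.responseStart - nav.requestStart; // wait before the server starts answering
const total = nav.loadEventEnd - nav.startTime;    // until the page has loaded completely (0 while still loading)

console.log(`Time to first byte: ${ttfb.toFixed(0)} ms, full load: ${total.toFixed(0)} ms`);
```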

Figure 72: App.telemetry Page Speed Monitor showing page load info in the upper right of the browser window.

You should not get hung up on exact numbers, but instead try to see whether there are patterns in how the website behaves. Worth keeping in mind is that your own experience is just anecdotal; you may need to look at what the average user experiences, which is probably already recorded in your website statistics. In Google Analytics, you will find this under Behavior -> Site Speed -> Page Timings, and on the websites I have looked at it is often easy to identify why pages are slow. Usually it is because of an incredible number of lookups of information in databases or other sources. When you get started, it feels good to have quantitative data, so you can see whether a performance fix resolved the problem or not.

If you are technically minded, and perhaps have found something very suspicious, Google Chrome has a great feature in the form of its toolbox for developers. With it, you can get detailed information on all files transferred between the website and your browser. The rule of thumb is that if the initial wait time is high, it is due to the technology or design of the website. If the transmission time is high, it might be due to editorial choices in the form of heavier material such as images, but the reason is more likely in the design itself, for example loading lots of Javascript that is not actually needed on every single page of the website.

Figure 73: Google Pagespeed Insights is great for indicating possible improvements in both performance and usability.

Last but not least, Google Pagespeed Insights is an easy way to get a comprehensive view of how a website performs on both desktop computers and mobile phones. Pagespeed Insights can be used via the Web and is also available as an API. In simple terms, what affects performance can be divided into two parts: what the editors have contributed and what is built into the website’s design.

Editorial performance impact

What an editor contributes to a website is mostly confined to different kinds of content, and their work in a content management system for publishing that content. It is extremely rare to have a problem with text, since text does not take any significant effort to transfer. However, issues may arise with where the text is stored. Some content management systems, such as Episerver CMS, have functions that retrieve content from other pages on the website. If you use such a feature, you introduce additional complexity in presenting a page to a user. It need not be a problem, but it can lead to serious concerns if, for example, the page the data is retrieved from has been deleted, as the server will, at best, have to work hard not to crash the page view.

The dramatic occasions on which I have been able to point out that web editors affect the performance of a website were when someone had copied an entire node of pages and pasted it elsewhere. The node contained thousands of pages and the web server had to figure out the pages’ internal relationships. On these occasions, the database server reported that it had deadlocked. On other occasions, very popular pages had been thrown in the trash or given an expiration date. All of this should be regarded as faults in the web application, as such errors take a lot of extra computing, and if a frequently requested address sits behind one of them, you are in for a lot of headaches. For the sake of visitors, you should take care of scrapped addresses by referring to new, relevant material, and preferably also make sure that others using the old link update their links.

When it comes to media files, it is common for an editor to work with images. Video and audio are also present, but they usually do not affect the perceived performance of a page, since the material is either streamed or a separate download. Pictures, on the contrary, are frequently included on pages and contribute significantly to slower load times during typical web surfing.

The first thing to do to make pictures lean is to choose the correct format. On the Web, mainly JPG, PNG and GIF are used. If we set aside all philosophical ideas about image formats, we use JPG for photographs and images with many colors across the entire image area, PNG for illustrations and logos, and GIF mainly for animated images. Image-editing programs like Photoshop have functions for saving images in a web format, making it easy to compare which format is most effective at bringing down the file-size while keeping the quality high enough. By choosing the right format, images can be optimized for quick viewing on the web: the image quality is reduced and with it the file-size. You need to find the balance where the picture still looks good but is as small as possible.
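If you would rather script this than do it by hand in an image editor, here is a small sketch using the sharp library for Node.js, one of many options; the file names and the quality setting are assumptions.

```typescript
import sharp from "sharp";

// Scale a photo down to the width actually used on the page and save it as a
// reasonably compressed JPG; the original file stays untouched.
await sharp("originals/hero.jpg")
  .resize({ width: 1200 }) // no point sending more pixels than the layout shows
  .jpeg({ quality: 80 })   // assumed quality level; check that it still looks good
  .toFile("public/hero.jpg");
```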

To top it off, there are applications that can cut image file-size in a non-destructive way, i.e. in a way that is not detectable to the human eye. A favorite of mine, for Macs, is ImageOptim, where you just drag and drop the image files into the application window and it overwrites the files with optimized versions.

Figure 74: ImageOptim makes images smaller, without loss of quality.

The tool Smush.it37 can optimize the images on a single web page. The same function is incorporated in the YSlow plug-in for Firefox or Chrome. There is a plethora of tools to fix your images – experiment to see what works for you. Perhaps you will even dare to add an optimization plug-in to your content management system.

The same principle applies to video and audio: choose the format with care and set parameters such as resolution and quality when saving the file for publication.

Technical settings for performance

In 2010, I built a service for testing websites’ optimization on my website webbfunktion.com, and based on the tens of thousands of sites tested so far, a clear pattern has emerged. There are some optimizations that are often lacking but are relatively easy to fix.

1. Forgot to set life expectancy of files

When a browser downloads a website, the website consists of several files. In addition to the HTML document, there are usually a number of images, at least one style sheet and a Javascript file. With each of these files come instructions to the browser about how long the file is expected to stay up to date, that is, the expected lifespan of its content.

An all too common scenario is that the favicon, the logo and other files that are almost never updated have no stated lifespan. What happens then is that returning visitors download files again even though they have not changed since the previous visit. It is silly, as the files already exist in the user’s browser cache, and downloading them again can delay the page from displaying. It is only logical not to re-download files a returning visitor already has in their browser cache, would you not agree? Failing to give files a lifespan forces the browser to download not only new files but also the old ones, on every visit.
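What the fix looks like depends on your web server or framework. As one example, a sketch for a Node.js/Express site; the thirty-day lifetime is an assumption, so pick a value that matches how often your files actually change.

```typescript
import express from "express";

const app = express();

// Everything under /static (logo, favicon, icons, style sheets) is sent with a
// Cache-Control header telling browsers they may reuse the file for thirty days.
app.use("/static", express.static("public", { maxAge: "30d" }));

app.listen(3000);
```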

2. Static text files are sent uncompressed

Many files on a website consist of text, such as HTML documents, CSS and Javascript. These can be transferred as-is to users, or you can compress them so they are quicker to send. Compression means that an algorithm does its best to make a text file as small as possible: it searches for patterns and repetitions and can, for instance, compress twenty consecutive whitespace characters.

It is very common for those who develop websites to retain spaces and tabs for the code to be readable and easier to edit if necessary. Compression is a good idea so that this does not cause slower load-times for visitors.

Compression is about much more than saving disk space. Frequently, you can strip away 75 % or more of a file’s size in transmission, which contributes to a faster user experience.
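Most web servers can switch this on in their configuration; in a Node.js/Express application the equivalent is a middleware such as compression. A sketch, assuming that package:

```typescript
import express from "express";
import compression from "compression";

const app = express();

// Gzip-compress text responses (HTML, CSS, Javascript) before they are sent;
// binary formats such as JPG are already compressed and are left alone.
app.use(compression());

app.get("/", (_req, res) => res.send("<html><body>Hello, compressed world</body></html>"));

app.listen(3000);
```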

3. Images are not optimized

Almost all websites use decorative images, logos and other visual elements. Images are the most abundant heavy material on a website and often have high optimization potential. The images to focus on first are those that are part of the design, since they are loaded regardless of which page the visitor looks at: in other words, logos, icons and any background images.

Even if you have saved optimized images for the web with your image-editing software, there is usually potential left to go a step further. This is called lossless optimization and the rule of thumb is that it should not cause any image deterioration visible to the naked eye.

It is common to find websites sending pictures where a few hundred kilobytes could have been optimized away with a little effort. That is not much in the grand scheme of things, at least not until someone on a shaky mobile connection is about to give up and will not accept waiting a few more seconds.

Many organizations use image systems on their websites. Such systems are usually marketed with catchphrases such as ‘taking care of image reduction and optimization’, but in the cases I have seen, the system only performs a few quick tricks of the trade and leaves much to be desired. Be especially observant on pages with many thumbnails since they, in my experience, can be optimized much more than you might think.

4. Too many files

The problem is mainly that there are sometimes far too many files. Each file to be downloaded adds to the wait time even before the file is sent, and if there are many files, the situation gets worse.

The same applies to icon libraries, which are sometimes used to load hundreds of very small images. Each small file contributes its own unnecessary wait time and could have been combined with the others using a technique called CSS sprites. CSS sprites means that you have a single image file which in turn contains several images; you then use CSS to create a small peephole through which you look at the image, so that only one image at a time is visible.

Instead of each image being loaded individually, all of them are loaded at once. This is a bit slower the first time, but the advantage is that there is only one file to download.

Many savvy web developers also combine all the Javascript into a single Javascript file and all the CSS into a single CSS file to reduce the number of files sent to a visitor. If haphazard behavior related to the style sheet or Javascript is observed, check whether the files were combined in the correct order when they were merged; the order of the contents matters.
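Nowadays this merging is usually left to a build tool. A sketch using esbuild, one bundler among many; the file paths are assumptions.

```typescript
import { build } from "esbuild";

// Follow all imports from the entry point and merge them, in the right order,
// into a single minified file that the pages can reference.
await build({
  entryPoints: ["src/main.ts"], // hypothetical entry point
  bundle: true,
  minify: true,                 // strip whitespace and shorten names
  outfile: "dist/site.js",
});
```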

5. Javascript blocks page load

Many modern websites use a lot of Javascript to give a rich user interface, to simplify interaction with forms, and more.

However, it is very common to load all the Javascript before the other files of the page, and many sites also execute a lot of Javascript before the page is presented to the user. So first you wait for the Javascript to download, then you wait for it to execute. This is mostly a problem on slow devices, especially ones with a slow connection, and contributes to far too long a wait.

Frequently, there are a few hundred kilobytes of Javascript to download and execute in the browser before anything a visitor can see appears on the screen. It is important to prioritize quickly displaying the items that are visible on the screen. Sometimes it is smarter to load heavier material only when it is needed instead of loading everything up front, so-called lazy loading.
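As a sketch of lazy loading, here a heavy module is fetched with a dynamic import only when its placeholder scrolls into view; the element id and module name are made up for the example.

```typescript
// Wait until the comments section is about to become visible, then load the code for it.
const placeholder = document.querySelector("#comments");

if (placeholder) {
  const observer = new IntersectionObserver(async (entries) => {
    if (entries[0].isIntersecting) {
      observer.disconnect();
      const { renderComments } = await import("./comments.js"); // hypothetical module
      renderComments(placeholder);
    }
  });
  observer.observe(placeholder);
}
```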

It is not simple to get an existing website to load Javascript only after the interface has been presented to the user, but it may be worthwhile to examine whether some parts can be moved so that they load last. For example, many people give priority to the page showing up quickly over the Javascript used by the web statistics tool to track the visit, which is why such scripts are placed at the end of the code and loaded last.

Check your own website with Google’s Pagespeed Insights service. It usually finds something you can improve.

Recoup an investment in web performance – is it possible?

Before deciding to invest heavily in the best possible web performance, it is important to know your objective.

There are four main objectives when optimizing web performance: cutting the cost of operating the website, earning more money, improving the user experience, and better search engine optimization.

If you are primarily looking to cut operational costs, there are a limited number of things to focus on. Other things you should avoid entirely, since they instead demand more operational capacity. In this case, it is possible to make a forecast of how expenses will change.

For example, preferably work with:

  • More efficient source code. Manual review is not always worth the effort, but many frameworks have tools that can find quality problems; also check what your development environment and, possibly, your build servers can find.
  • Caching common database queries, or creating specific tables optimized for reading, based on usage patterns. Among other things, it is often more efficient to retrieve everything matching a given criterion from the database even if you only plan to show the first ten hits, at least when users occasionally scroll on to the next ten hits; then all the hits are already in the database server’s cache.
  • Database indexes, which are very important and may need to be continually updated as the content and usage of the website change.
  • Life spans for files in the browser cache, so that files that have not been updated are not downloaded unnecessarily. If the HTML code points to a particular version of a Javascript library, you can safely assume that its contents will live forever, since the next version will get a new address.
  • Web accelerators such as Varnish Cache Server, a solution to test for those experiencing extreme load on fairly static web pages.
  • Content Delivery Networks (CDN), which can be a solution if traffic from the server is the bottleneck. With a CDN, common files are sent from another server, which leaves more capacity for the tasks you keep on your own web server.

If you have a very high number of visitors to relatively few pages, selective caching of those pages may be a preferable first move.

Please talk to each type of specialist about this subject before you begin. If your website does not have at least a few hundred thousand page views per month and you are still experiencing trouble, the problem is probably something else. Maybe your website is on a slow web host, or perhaps the system is not designed for the website’s basic needs?

As an example, when I was working as a web consultant I concluded that each of our customers would have benefited massively from 20 hours of specialist support for performance optimization, after which the value of further hours and effort fell rapidly. Some would have recouped their investment within a week, while a few needed to start afresh with their website.

Continue reading: chapter 5, Test your own website ›
