A screenshot of Netscape Navigator 2, courtesy of Wikipedia
Naturally, the strongest impact of these features has been on the user experience of the Web. Less obvious is the impact on the security model of the Web as we know it today. By introducing the capability to frame other pages within a page, and to use a scripting language to inspect and modify page contents, a potential security risk has been created.
Without the Same-Origin Policy, a page could easily include another page in a frame, and start inspecting its contents. Stealing a username and password would be child’s play. (Disclaimer: no Google pages were harmed making this image)
The essence of the SOP is actually very straightforward. Every context in the browser is assigned an origin, which is defined as the triple (scheme, host, port) and is derived from the URL. For example, the origin associated with the page you are currently reading is (https, www.websec.be, 443). The SOP dictates that only contexts that have the same origin are allowed to interact freely. Interactions between contexts that have different origins are very constrained, and are essentially limited to an opt-in message passing mechanism. In practice, this means that the interactions in our illustration from before will be prevented by the SOP. The SOP will not prevent the framing, but it will prevent the top-level page from reaching into the frame and inspecting its contents. A violation of the SOP will cause the browser to throw an error, as shown below. In a nutshell, the SOP ensures that within the browser, nobody from outside your origin can inspect or modify your pages.
Attempts to breach the Same-Origin Policy are met with an error thrown by the browser
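To make the origin triple a bit more tangible, here is a minimal TypeScript sketch of how a same-origin check could be expressed. The helper names (defaultPort, originOf, sameOrigin) are made up for this illustration, and the sketch only knows the default ports for http and https; real browsers implement a more complete algorithm.

```typescript
// The (scheme, host, port) triple described above.
type Origin = { scheme: string; host: string; port: number };

// Assumption for this sketch: only http and https are supported.
function defaultPort(scheme: string): number {
  switch (scheme) {
    case "http":
      return 80;
    case "https":
      return 443;
    default:
      throw new Error(`no default port known for scheme "${scheme}"`);
  }
}

// Derive the origin triple from a URL string.
function originOf(url: string): Origin {
  const parsed = new URL(url);
  const scheme = parsed.protocol.replace(/:$/, "");
  // URL.port is the empty string when the URL uses the scheme's default port.
  const port = parsed.port !== "" ? Number(parsed.port) : defaultPort(scheme);
  return { scheme, host: parsed.hostname, port };
}

// Two URLs share an origin only when all three components match.
function sameOrigin(a: string, b: string): boolean {
  const x = originOf(a);
  const y = originOf(b);
  return x.scheme === y.scheme && x.host === y.host && x.port === y.port;
}
```

Under these assumptions, https://www.websec.be/blog and https://www.websec.be:443/about share an origin, while switching the scheme to http or the host to websec.be yields a different origin.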
The XMLHttpRequest object is also constrained by the SOP. Same-origin requests are not subject to any constraints, while cross-origin requests are subject to the rules defined by the Cross-Origin Resource Sharing (https://www.w3.org/TR/cors/) policy.
As you can see, an origin has become a primary security principal within the browser, and is used in access control decisions to various sensitive resources. Fortunately, the SOP is there to prevent unauthorized access from contexts with another origin. However, there is one important caveat: the SOP only applies to interactions between contexts with different origins, and not to interactions within one origin. Translated to HTML lingo, this means that the SOP is enforced on interactions between windows and frames, but not on interactions between an HTML document and its styles or scripts.
Using the script mechanism in this way was perfectly legitimate in 1995, since Web sites only relied on their own script files to enable dynamic behavior. However, in the past 10 years, the way we develop Web applications has changed drastically. With the introduction of jQuery in 2006, and plenty of other libraries since, we started integrating third-party libraries into our applications. A modern Web application today depends on dozens of external libraries, which load millions of lines of code, all directly into your origin. One might wonder whether that is still a good idea…
While these dependencies introduce a certain security risk, one could argue that a similar risk can be found in other development environments, such as a Maven project that includes various libraries. This argument actually makes sense for essential libraries that belong to the core of your application, especially because we have long passed the point where you could do it all yourself, without relying on third-party libraries. Including libraries in your origin to build better Web applications has become the default way of doing things. Fortunately, security initiatives such as Subresource Integrity allow you to verify the integrity of included libraries coming from a CDN.
Well, from a development perspective, this is definitely a major step forward. However, from a security perspective, this quickly becomes a nightmare, as sites have started to include third-party code and components from everywhere. A study from 2012 shows that of the Alexa top 10,000, 88.45% include scripts from at least one remote host, and one site even includes scripts from 295 remote hosts. This behavior is worrisome, since you have absolutely no control over what code will be included into your origin, where it has full access to your origin and all its associated resources and permissions.
The list of cases where things go wrong is endless. There’s even a specific term for malicious advertisements: malvertisements. I’m only including a few highlights here:
- Reuters got compromised by the Syrian Electronic Army through ads
- Advertisements on eBay were serving malware for three weeks
- A write-up of how a blog got XSS’d by its advertisement network
By now, we’ve established that including third-party scripts directly into your origin is not such a great idea, regardless of whether they are advertisements, widgets or something else. The next section covers a few alternative integration strategies, where you leverage the protection of the SOP to isolate the third-party code from your own.
Effective Content Isolation
The most effective way to prevent third-party content from taking advantage of your origin’s resources or permissions is to avoid including it directly into your origin. By loading the third-party content into a frame with a different origin, you essentially leverage the protection of the SOP to isolate the content from your own origin. This approach is well suited for isolating components such as a chat widget or a social media timeline, which do not really need context information from the page itself. An illustration of this approach in practice is offered by Dropbox. On their main website, they include a support chat widget, offered by a third-party provider. Since they deem the risk of including third-party code in their main origin unacceptable, they isolate the content in an iframe. To enable communication between the main page and the widget, they use the Web Messaging specification, which offers an opt-in communication mechanism between contexts. This approach is the recommended way of including third-party components, and sounds a lot harder than it actually is!
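To give an idea of what the receiving side of Web Messaging can look like, here is a small TypeScript sketch. The widget origin (https://chat.example.com), the "resize" message format and the function name parseWidgetMessage are all hypothetical and only serve this illustration; they do not describe how Dropbox or any particular chat provider does it. The important part is that the host page checks the origin of every incoming message and validates its shape before acting on it.

```typescript
// Minimal shape of the fields we read from a browser MessageEvent.
interface IncomingMessage {
  origin: string;
  data: unknown;
}

// Hypothetical origin of the framed third-party widget.
const WIDGET_ORIGIN = "https://chat.example.com";

// Hypothetical message format: the widget asks the host page to resize its frame.
type WidgetMessage = { type: "resize"; height: number };

// Accept a message only if it comes from the expected origin and has the expected shape.
function parseWidgetMessage(event: IncomingMessage): WidgetMessage | null {
  if (event.origin !== WIDGET_ORIGIN) return null;
  const data = event.data;
  if (typeof data !== "object" || data === null) return null;
  const candidate = data as { type?: unknown; height?: unknown };
  if (candidate.type !== "resize" || typeof candidate.height !== "number") return null;
  return { type: "resize", height: candidate.height };
}

// In a browser, the host page could wire this up roughly as follows:
//   window.addEventListener("message", (e) => {
//     const msg = parseWidgetMessage(e);
//     if (msg) frame.style.height = `${msg.height}px`;
//   });
// and send data to the widget with an explicit target origin:
//   frame.contentWindow.postMessage(payload, WIDGET_ORIGIN);
```

Keeping the validation in a plain function makes it easy to test without a browser, and passing an explicit target origin to postMessage avoids leaking data if the frame ever navigates elsewhere.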
If you’re ready to take frame-based isolation a step further, then you’ll want to hear about the HTML5 sandbox. This sandbox lets you put additional restrictions on content running in a frame, allowing you to apply the principle of least privilege. Setting the sandbox attribute on an iframe element enables all the restrictions offered by the sandbox: the framed content can no longer autoplay audio or video, submit forms, run scripts, load plugin content, … Most of these restrictions can be lifted selectively by adding options to the sandbox attribute, as explained in this tutorial. One particularly interesting feature is the unique origin that the sandbox assigns to the framed content by default. This unique origin will never match any other existing origin, which effectively allows you to load potentially untrusted content from your own origin. It is a perfect mechanism to safely isolate untrusted content, such as a forum post that may be riddled with XSS attacks.
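As a rough sketch of the sandbox in practice, the TypeScript helper below generates the markup for a sandboxed frame. The function names and the forum URL are invented for this example, and only a handful of the available sandbox options are listed. An empty sandbox attribute applies every restriction; each option you add lifts one of them.

```typescript
// A few of the options that lift individual sandbox restrictions.
type SandboxOption = "allow-forms" | "allow-scripts" | "allow-same-origin" | "allow-popups";

// Escape a value so it can be placed safely inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the markup for a sandboxed frame (hypothetical helper for this sketch).
function sandboxedFrameHtml(src: string, allow: SandboxOption[]): string {
  // Drop duplicate options while keeping their original order.
  const options = allow.filter((option, index) => allow.indexOf(option) === index);
  return `<iframe src="${escapeAttr(src)}" sandbox="${options.join(" ")}"></iframe>`;
}

// Fully restricted: no scripts, no forms, and a unique origin.
const strictFrame = sandboxedFrameHtml("https://forum.example.com/post/42", []);

// Forms are allowed again, but the content still runs in a unique origin
// because allow-same-origin is not granted.
const formFrame = sandboxedFrameHtml("https://forum.example.com/post/42", ["allow-forms"]);
```

One thing to keep in mind: granting both allow-scripts and allow-same-origin to content served from your own origin lets that content remove the sandbox again, so that combination defeats the purpose for untrusted content.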
A third technique that can help you gain control over your own and third-party content is Content Security Policy (CSP). One of the main goals of CSP is to stop XSS attacks from being executed, as explained here. The CSP specification allows the server to define a policy, stating the allowed sources of remote content (e.g. scripts, styles, images, …), and the allowed destinations of outgoing requests (e.g. form submissions, XMLHttpRequests, …). Essentially, CSP puts you in control over what happens on one of your pages, and allows you to block undesired behavior. Therefore, you can configure your CSP policy so that it allows the loading of the third-party code, but disallows the loading of additional, unknown files. This offers fewer security guarantees than origin-based content isolation, but can be a viable alternative where the use of frames is out of the question.
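To illustrate what such a policy could look like, the TypeScript sketch below assembles the value of a Content-Security-Policy response header. The widget host (https://widgets.example.com) and the helper name buildCspHeader are assumptions made up for this example; the directive names themselves (default-src, script-src, connect-src, form-action) come from the CSP specification.

```typescript
// One CSP directive together with the sources it allows.
type Directive = { name: string; sources: string[] };

// Serialize directives into a header value: "name source source; name source".
function buildCspHeader(directives: Directive[]): string {
  return directives
    .map((directive) => [directive.name].concat(directive.sources).join(" "))
    .join("; ");
}

// Hypothetical policy: everything comes from our own origin, except scripts,
// which may also be loaded from one known third-party host.
const policy: Directive[] = [
  { name: "default-src", sources: ["'self'"] },
  { name: "script-src", sources: ["'self'", "https://widgets.example.com"] },
  { name: "connect-src", sources: ["'self'"] },
  { name: "form-action", sources: ["'self'"] },
];

// The server would send this string as the value of the
// Content-Security-Policy response header.
const cspHeaderValue = buildCspHeader(policy);
```

With this policy, a script from the listed widget host is allowed to load, but if that script tries to pull in additional files from an unlisted host, or to send data to one via XMLHttpRequest, the browser blocks the attempt.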
Every time you include a third-party script into your origin, you’re enlarging the trusted computing base of your application. Every one of these files is an attack vector, and sneaking in malicious code through one of them is sufficient to take full control of your Web site, and all its associated resources.
Fortunately, you can leverage the protection of the Same-Origin Policy to effectively isolate third-party components from the rest of your Web site. Further restrictions are available through the HTML5 sandbox attribute, and the incredibly powerful Content Security Policy.