Over a month ago I wrote a blog post called “Protect your Web Applications through Encryption”, in which I started to talk about “crypto-defense” for web applications. Strictly speaking, I started one post earlier, with “Secure Data Transfer over HTTP without SSL”. The basic idea is very simple: build an integrity check for every attribute we send to the client whenever we know which values can come back to us. For those parameters we no longer need whitelisting filters to protect against attacks that exploit missing input validation, such as XSS, SQL injection, path traversal and so on. From my point of view, we can “fix” about 30% or more of all web application vulnerabilities out there very easily with the technique I describe in this post.

Now let’s go deeper into the idea. You should really read the “Protect your Web Applications through Encryption” post first, because I don’t want to repeat everything I already wrote there.

The last POC worked fine for demonstration purposes, but as I already mentioned there, it’s not really secure and it’s possible to bypass it, even if that takes some time. Afterwards I discussed the problem with Endre Bangerter, who is a professor of computer science at Bern University of Applied Sciences and was previously part of the Network Security and Cryptography research group at the IBM Zurich Research Lab, so I really think he’s the right man to ask. After I explained my problem to him, he advised me to have a look at so-called Message Authentication Codes (MACs), or more precisely at a special type called HMAC, which stands for “keyed-hash message authentication code”.

Because this is no longer encryption, I don’t have the problem I had before: I simply generate a hash value. Since it’s important that nobody else can generate the hashes himself, a HMAC also takes a key as input, so the combination of these two pieces of information (the value to hash and the key) is exactly what I need.
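To make the “value plus key” idea concrete, here is a minimal sketch in Python (the POC itself is written in PHP). The key, the function name `hmac_for` and the choice of SHA-256 are assumptions made for illustration, not what the POC necessarily uses.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real deployment would load it from configuration.
SECRET_KEY = b"replace-with-a-long-random-secret"


def hmac_for(value: str, key: bytes = SECRET_KEY) -> str:
    """Return the hex-encoded HMAC-SHA256 of value under key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# The same key and value always produce the same tag,
# while anyone without the key cannot compute it.
tag = hmac_for("home.php")
```

Only the server knows the key, so a client can read the tag but cannot produce a valid one for a different value.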

So let’s stop talking about HMACs; everyone who wants more information can search the Internet. I’ve now built a new POC using a HMAC, and yes, it works just as expected. Just try to edit a value:
POC / Source (index.php) / Source (hmac.php)
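For readers who cannot open the POC, the flow can be sketched roughly as follows. This is a Python approximation under assumptions: `build_query` and `check_query` are hypothetical names, the secret is a placeholder, and SHA-256 stands in for whatever hash the POC uses.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical secret


def sign(value: str) -> str:
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def build_query(site: str) -> str:
    """Build the query string for a link: site=<value>&hmac=<tag>."""
    return urlencode({"site": site, "hmac": sign(site)})


def check_query(query: str) -> str:
    """Return the site parameter if its HMAC matches, else raise ValueError."""
    params = parse_qs(query)
    site = params.get("site", [""])[0]
    received = params.get("hmac", [""])[0]
    # compare_digest avoids leaking, via timing, how much of the tag matched.
    if not hmac.compare_digest(sign(site).encode("utf-8"), received.encode("utf-8")):
        raise ValueError("It seems that you've modified a value!")
    return site
```

Editing the `site` value in the query string without knowing the key makes `check_query` raise, which corresponds to the warning the POC shows.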

I’m really looking forward to feedback on this, and I hope that my explanation makes clear what you can do with it :)

Just as additional information:
During all of this, we found out that there is an IBM patent which may “cover” my idea, although I don’t think they realized what’s really possible with it. The only part of interest is perhaps the following, which mentions a hash value only in passing, as one possible piece of additional information, and says nothing about a MAC:
"The URI (20) is split or extracted (30) into transparent part (40) and opaque part (50). According to the example URI in (20), the transparent part (or <transparent part>) (40) may be represented as http://<host>[:<port>], and the opaque part (50) may be represented as [<abs_path>[?<query>]]. As indicated in block (60). the opaque part (50) may be combined with additional information (70), which may comprise, for example, a client's Internet Protocol (IP) address, timestamp, time-to-live, magic number, nonce, sequence counter, hash value, means to ensure integrity, or other application specific information, etc., as those of skill in the art will recognize."


8 Comments to “Crypto Defense for Web Applications – Today, the HMAC”  

  1. kuza55

    From a security standpoint this is a great solution that adds extra security, since the key is big enough that there is almost no chance of it being brute forced.

    From a performance standpoint, it’s not so great: doing this many HMAC calculations is going to bring sites to their knees. I have no clue how much traffic you’d need before this method kills a site, but at some point it’s going to happen, and it won’t take too much traffic to do so.

    The problem shouldn’t really be the hash checking; sure, it’s a bit expensive, but it’s not the worst part. The worst part will be generating HMACs for every link on a page which has parameters. The two things I’ve come up with are caching hashes and using CSRF protections.

    The idea of using a cache should be fairly self-explanatory and is probably easy to integrate into any existing caching mechanism. The problem will be when returning anything dynamic, such as search results, but those can probably be limited to once every 30 seconds per user, or similar.
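A minimal sketch of the caching idea, in Python and using an in-process memoization cache as a stand-in for whatever caching mechanism a site already has. The function name `cached_hmac`, the cache size and the secret are assumptions.

```python
import hashlib
import hmac
from functools import lru_cache

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical secret


@lru_cache(maxsize=4096)
def cached_hmac(value: str) -> str:
    """Compute the HMAC once per distinct value and reuse it afterwards."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Rendering the same navigation link on many pages hits the cache after the first call.
for _ in range(3):
    cached_hmac("home.php")
```

One caveat of this design: the cache has to be cleared (`cached_hmac.cache_clear()`) whenever the key changes, otherwise stale tags would be served.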

    The other idea I had was to use something similar to CSRF protections instead of HMACs for the most part. Instead of writing http://www.disenchant.ch/blog/files/code/hmac_defense/?site=home.php&hmac=40cb4e8d3e39eb17eeb6777b3f65d7a6 to the page you write http://www.disenchant.ch/blog/files/code/hmac_defense/?site=home.php&id=1 which the server looks up in its session management system; it then generates the HMAC (since the server put the value into the session management system, it’s safe to generate a HMAC for it) and redirects the user to http://www.disenchant.ch/blog/files/code/hmac_defense/?site=home.php&hmac=40cb4e8d3e39eb17eeb6777b3f65d7a6 which is the standard link they can send to other people. This way you only have to generate one HMAC per page view, but you do have to store all the other links with their ids alongside the rest of your session info.
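The id-based indirection described above could look roughly like this. It is a Python sketch under assumptions: a plain dict stands in for the session store, and `go.php`, `render_links` and `resolve` are hypothetical names.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical secret


def sign(value: str) -> str:
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def render_links(session: dict, targets: list) -> list:
    """Store the targets in the session and return cheap id-based links (no HMAC yet)."""
    session["links"] = list(targets)
    return [f"go.php?id={i}" for i in range(len(targets))]


def resolve(session: dict, link_id: int) -> str:
    """Look the id up in the session and return the signed query to redirect to."""
    links = session.get("links", [])
    if not 0 <= link_id < len(links):
        # An id the server never wrote into this session is simply rejected.
        raise KeyError("unknown link id")
    site = links[link_id]
    return f"?site={site}&hmac={sign(site)}"
```

Only the link that is actually followed costs a HMAC computation; guessing an id outside the stored range fails the lookup instead of reaching an unsigned target.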

    But these are just my ideas, is there anything you’ve thought of to make this a more viable solution?

    Oh, and I’m not sure whether this would work unless it was built as a drop-in product, because otherwise developers are going to forget to use it, it adds more work for them, and it lets them think that it’s going to save them from their other mistakes, which, as you said, will only happen about 30% of the time.

    Oh, and another issue will obviously be places where you accidentally generate a HMAC for user input, at which point anything that was found previously but was unexploitable (and therefore probably wasn’t fixed) will become an issue.

    So yeah, while it’s a cool idea in theory, it’d be hard to implement properly.

  2. Andy Steingruebl

    I’ve investigated similar systems before. They can work quite well for avoiding certain types of attacks including CSRF, especially if we use per-user data in the tokens and/or rotate keys periodically/frequently based on the threats. It’s good to see POCs put together for this sort of thing.
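One possible reading of “per-user data in the tokens” and key rotation, sketched in Python. The key ring, the key ids and both function names are hypothetical, and user ids are assumed to be server-assigned strings without newlines.

```python
import hashlib
import hmac

# Hypothetical key ring: key id -> secret. Old keys stay until their links expire.
KEYS = {"k1": b"older-secret", "k2": b"current-secret"}
CURRENT_KEY_ID = "k2"


def sign_for_user(user_id: str, value: str, key_id: str = CURRENT_KEY_ID) -> str:
    """Bind the tag to one user and record which key produced it."""
    # Assumes user ids never contain a newline, so the message is unambiguous.
    message = f"{user_id}\n{value}".encode("utf-8")
    tag = hmac.new(KEYS[key_id], message, hashlib.sha256).hexdigest()
    return f"{key_id}.{tag}"


def verify_for_user(user_id: str, value: str, token: str) -> bool:
    key_id = token.partition(".")[0]
    if key_id not in KEYS:
        return False  # signed with a retired or unknown key
    expected = sign_for_user(user_id, value, key_id)
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

A link signed for one user does not verify for another, and removing a key id from `KEYS` invalidates every link that was signed with it.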

    There are, however, a few cases you need to be concerned about and/or that this solution doesn’t cover.

    1. When passing the sensitive data via URL you risk it being logged in proxy servers if you’re not using HTTPS. Obviously fixable using HTTPS.

    2. If you have a site with any non-local content on the “secured” pages then you’re going to leak the “secure” URL via a referer header. Depending on the lifetime of the URL (one time, constantly rotating user session data being re-included into the HMAC) you could still open yourself up to CSRF type attacks via this scheme.

    3. By using this mechanism you give up any deep-linking into your site. There are times when this is a bad thing and it’s important to note the tradeoffs.

    4. Debugging and testing can become painful. Automated test scripts and such are necessarily tricky to create, as you must use session-aware and flow-aware testing frameworks for everything that implements this sort of scheme.

  3. Gabriel

    There are indeed lots of issues with encryption of URLs or parts of it. Some of them are mentioned above, such as deep linking, session based and static encryption, drop-in product.

    At Visonys we have had experience in URL encryption for more than two years. We focus on symmetric encryption of the whole URL instead of generating hash codes. This way all the sensitive and less sensitive information in the URL is hidden from the user.

    Some problems not mentioned so far, that you usually run into, are:

    – handling of dynamically created URLs in javascript
    – handling of URLs that are missing a hash code
    – need of an entry-URL

    With symmetric crypto-algorithms there is no big performance issue even for high traffic sites according to our experience. As I’ve mentioned in a comment to a previous blog-post here, visonysAirlock addresses all these problems and provides an advanced implementation of URL-encryption.

    CSRF can only be fought effectively with session based encryption, since static encryption always generates the same links for any given URL. Unfortunately even session based encryption doesn’t remedy all CSRF risk. To be totally safe from CSRF you need to divide your web application into multiple zones that have different security requirements/properties. If you then control the transitions between these zones with session based encryption, you’re able to prevent CSRF. Unfortunately, not all web applications allow this partition into zones.

  4. Disenchant

    First, I’d like to thank you very much for your constructive feedback on this :)
    In the following lines I try to answer your open questions and clarify a few things. Please don’t hesitate to write another comment, and do say so if you don’t agree with something I wrote below or if there are more open questions.

    @ kuza55
    I think you won’t run into any performance problems, although I haven’t tested the performance yet. Building the HMAC is like a hash calculation with an additional key, and generating hashes is not really a performance problem on today’s servers. Still, I really should test it, because if there is a performance problem, I’ll have to rethink this technique.
    As for your idea of storing the HMACs on the server: I had this idea too, but for a POC I didn’t want to store such things on the server, because the way I did it now shows much better how the technique works. That was my goal, because I think we can do a lot with it and people should also think about it themselves ;)

    - “the problem will be when returning anything dynamic, such as search results”
    This technique won’t work (at least most of the time) for parameters whose value can be defined by the user, because there you can’t offer the user a fixed set of one, two, three or more possibilities from which he picks one, which is what allows the server to check whether the HMAC matches the parameter’s value.

    About your idea where the links look like /hmac_defense/?site=home.php&id=1: if I get your point right, this will break the whole concept of this technique, because the basic idea I describe in this blog post is that it should be impossible to change values you don’t want changed. When you have id=1, id=2, id=3 and so on, the user can try id=4, and that’s exactly one of the things I’d like to protect against this way :)

    Last but not least, the good old discussion on how we can get developers to use such solutions. I really don’t know how we can achieve that 100%, but once more, it’s just an idea for a technique, and if you really have to write a secure application, then this is perhaps something which will help you do that much more easily than today. Just think about the whitelisting we use today: many times these whitelists aren’t restrictive enough, or we don’t even have any. Now it’s possible to have a kind of “automated” whitelisting, because as soon as you’re using this technique, a user can only use the values you give him and nothing more; otherwise he will get a warning or whatever. So I think it’s a good solution for many problems we have today, but not that many developers will care about it, even if it makes it much easier to do their job right.

    @ Andy:
    1. What do you mean by “sensitive data”? The technique I’ve described here is just for doing an integrity check on values you know in advance. If you also don’t want anyone to see which parameters and values are transferred, you can simply use a normal encryption algorithm and encrypt the HMAC together with all the other values. This should fix that problem.

    2. Don’t worry about that, we have already thought about this problem :)
    This is the normal situation in the POC:
    ?site=home.php&hmac=40cb4e8d3e39eb17eeb6777b3f65d7a6
    Now the only thing we have to do is add a random value or a simple timestamp to the HMAC input and store it on the server. As soon as the HMAC comes back, we can compare it with the values on the server, and this way you can implement protection against CSRF and replay attacks. There may even be a better solution for this problem, but the basic idea would be the one I’ve described here.
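The random-value variant could be sketched like this in Python. Everything here is an assumption for illustration: the in-memory dict `issued` stands in for server-side session storage, the names `issue` and `redeem` are hypothetical, and the five-minute lifetime is arbitrary.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical secret
TOKEN_LIFETIME = 300  # seconds; arbitrary choice for this sketch
issued = {}  # nonce -> expiry time; stands in for server-side storage


def _tag(nonce: str, value: str) -> str:
    return hmac.new(SECRET_KEY, f"{nonce}:{value}".encode("utf-8"), hashlib.sha256).hexdigest()


def issue(value: str, now: Optional[float] = None):
    """Create a fresh nonce, remember it on the server and return (nonce, tag)."""
    now = time.time() if now is None else now
    nonce = secrets.token_hex(16)
    issued[nonce] = now + TOKEN_LIFETIME
    return nonce, _tag(nonce, value)


def redeem(value: str, nonce: str, tag: str, now: Optional[float] = None) -> bool:
    """Accept a link at most once, and only before it expires."""
    now = time.time() if now is None else now
    if not hmac.compare_digest(_tag(nonce, value).encode("utf-8"), tag.encode("utf-8")):
        return False
    expiry = issued.pop(nonce, None)  # removing the nonce blocks replays
    return expiry is not None and now <= expiry
```

The price is exactly the trade-off discussed in this thread: links become single-use and session-bound, so they can no longer be bookmarked or shared.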

    3. Why should this be the case? As long as you don’t use any additional encryption, replay attack protection and so on, you have all the information in the URL, so you can link to whatever you want without any problems. The HMAC is always the same for the same page; it only changes when the page (or, more precisely, the link) changes.

    4. I don’t think so, because as long as you don’t implement something like I’ve described in 2., this technique will generate the same link every time, so an automated testing script shouldn’t have any problems with it. Please correct me if you think I’m wrong on this.

    @ Gabriel
    Please keep in mind that my solution is not about protecting sensitive data transferred over the Internet; it’s just there to provide an integrity check. But as I already wrote in the part for Andy, you can simply add encryption on top of this if you need to protect sensitive data :)

    - handling of dynamically created URLs in javascript
    As already mentioned, this solution can’t handle dynamic values. But if you’re talking, for example, about AJAX applications where you send URLs to the client which are then displayed as links, and a click on one of them sends an XHR to the server, then you can handle them the same way as normal links.

    - handling of URLs that are missing a hash code
    If you expect a HMAC and don’t get one, then you can react in a predefined way. It’s up to the developer whether he sends the user to an error page, sends a message to the IDS, or whatever.
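A small Python sketch of such a predefined reaction. The function name `check`, the logger name and the three result strings are assumptions; a log call stands in for “send a message to the IDS”.

```python
import hashlib
import hmac
import logging

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical secret
logger = logging.getLogger("hmac_defense")  # hypothetical logger name


def check(params: dict) -> str:
    """Classify a parsed request as 'ok', 'missing' or 'invalid'."""
    site = params.get("site", "")
    received = params.get("hmac")
    if received is None:
        # Stand-in for notifying an IDS; the caller decides what the user sees.
        logger.warning("request without HMAC for site=%r", site)
        return "missing"
    expected = hmac.new(SECRET_KEY, site.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8")):
        logger.warning("HMAC mismatch for site=%r", site)
        return "invalid"
    return "ok"
```

Separating “missing” from “invalid” lets the application treat an old bookmark differently from an actively modified link.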

    - need of an entry-URL
    Just don’t use the technique for the homepage. It’s also very simple to have some things covered by this solution and others not. In my POC source code you can see that I have a function called linkx() which adds a link with a HMAC, but you are free not to use it. Whether you should use it depends on different things, but most of the time I think you should.

    The problem of CSRF I already mentioned above :)
    PS: Please try not to write future comments in such an advertising manner ;)

  5. Gabriel

    Sorry for the advertising. I didn’t mean to push it too far, I rather thought it might be in your interest to have a pointer to an existing implementation, that basically has the same idea, but goes further by encrypting the whole URL.

    The idea with the HMAC is nice because you can ensure that a URL was generated by your application, which prevents forceful browsing somewhat. I say somewhat, because there are many web applications where you would have means to get the hash for a URL you want to enforce. You still get a picture of the application logic, which often is the real vulnerability.

  6. kuza55

    @Disenchant:
    The idea of using id=1, id=2, etc, is that all those ids are linked to values which _you_ wrote to the session, so you can’t manipulate them, here’s some example code I threw together quickly – it might not work, but it should give you an indication of what I mean:

    index.php:

    ".$name."";

    }

    // The demo links
    echo linkx("home.php", "Link 1")."";
    echo linkx("customers.php", "Link 2")."";
    echo linkx("downloads.php", "Link 3")."";
    echo linkx("contact.php", "Link 4")."";

    // Checks if the HMAC matches
    if ($crypt->hash($site) !== $hmac) {
        die("It seems that you've modified a value!");
    }

    echo "You're on page: ".$site;

    ?>

    go.php:

    setKey($key);

    //The HMAC function
    header("Location: http://".$_SERVER['SERVER_NAME']."/?site=".$_SESSION['links'][$id]."&hmac=".$GLOBALS["crypt"]->hash($_SESSION['links'][$id]));
    }
    }
    }

    ?>

    P.S. Maybe performance won’t be an issue, but I would think that if you’re performing, say, 20 hashes per page you render (without caching), then you’ll be adding a lot of overhead. I haven’t done any tests either, though. That is why I was talking about things like search results: there you won’t know which links you want to render until you get the results of the search, and since those links will be pulled out of your db, or constructed from data pulled out of your db, they will be safe to generate hashes for. This would seem to be a place where you would need to generate links on the fly, because you need to generate hashes in every place, but only for chosen variables.
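Since nobody in this thread has measured it yet, here is a small Python harness one could use to get a rough feeling for the cost of 20 HMACs per page. The link names, the secret and the use of SHA-256 are assumptions, and the numbers will differ for PHP and for other hardware, so no result is claimed here.

```python
import hashlib
import hmac
import timeit

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical secret
LINKS_PER_PAGE = 20  # the figure used in the comment above


def sign_page_links() -> list:
    """Compute one HMAC for each of the links on a single rendered page."""
    return [
        hmac.new(SECRET_KEY, f"page{i}.php".encode("utf-8"), hashlib.sha256).hexdigest()
        for i in range(LINKS_PER_PAGE)
    ]


def seconds_per_page(repeats: int = 1000) -> float:
    """Average wall-clock time needed to sign all links of one page."""
    return timeit.timeit(sign_page_links, number=repeats) / repeats


if __name__ == "__main__":
    print(f"{seconds_per_page() * 1e6:.1f} microseconds per page of {LINKS_PER_PAGE} links")
```

Comparing the printed figure with the time a page render already takes on the same machine gives a first indication of whether caching is worth the extra complexity.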

  7. Disenchant

    Today I got a mail from Endre Bangerter who I already mentioned in my posting. In this mail, he sent me a link to a paper written in 2002. In this paper you can find the same idea I had but they just show how to use it in form of a layer-7 security gateway and not directly in your application. Anyway, it’s a good read and you can find it under the following URL:
    http://www2002.org/CDROM/refereed/48/

  8. rednael

    Good post…

    Please also read the following article:
    http://blog.rednael.com/2008/09/30/SecuringYourPasswordTransfersWithKeyedHashingHMACCramMD5.aspx

    It’s a walkthrough example of implementing HMAC-MD5 / Cram-MD5 on a website. The same technique can be used for various client-server situations.
    The article explains the benefits of using such a password system and shows you how to implement it using the .Net library at server side (examples in C#), and using Paj’s MD5 Javascript functions at client-side.
