Category Archives: Tales from the Web Scanning Front

the art of automation

Web Application Security Scanning – The Art of Automation

Few people fully appreciate the difficulty of creating a web application security scanner that actually works well against most sites. In addition, there is much debate about how much application security testing can be automated and how much needs to be done by human hands. Let’s look at a recent conversation among some industry experts that took place on Twitter (abbreviated for easier reading).

Jeremiah Grossman ‏@jeremiahg:
RT @kdinerman: WebApp Scanners Challenged By Modern WebTech < true, but no way the biggest issue

Neil MacDonald ‏@nmacdona
not the biggest issue < what would you say is the biggest issue?

Jeremiah Grossman ‏@jeremiahg
login & maintaining authed-state, 404 detection, infinite website problem & production safety.

zulla ‏@zulladan
@jeremiahg login & maintaining authed-state, 404 detection – i always believed that whitehatsec was one of the few who solved that

Jeremiah Grossman ‏@jeremiahg
@zulladan ahh. technologically the issues are not “solved.” theyre compensated for w/ [human] config. true for everyone to varying degrees.

Dan Kuykendall ‏@dan_kuykendall
Auth, 404, infinite links, etc are all stuff we solved 5+ years ago. Mobile and API are the new challenges

Jeremiah Grossman ‏@jeremiahg
please defined “solved.” As in, the tech does everything automatically w/o human assistance?

Dan Kuykendall ‏@dan_kuykendall
Yes, in 90% of the cases we just need creds, Then automation does the rest. Was hard, but did it.

Neil MacDonald ‏@nmacdona
@dan_kuykendall hard but did it < more & more automation, “good enough” bar higher, humans at top of pyramid

I agree, humans will continue to be at the top of the pyramid when it comes to web application security, but the practical reality is that organizations don’t have the time and money to hire enough humans to effectively find and remediate all of their application security vulnerabilities. So, while it is true that we may never be able to automate 100% of the possibilities, it is our job to push forward the art and science of automation. It’s not easy, but somebody’s gotta do it.

Why Automated Scanning Is Critical

Automated scanning does not exclude manual training options, but depending solely on manual training is a failing strategy for most organizations.

  1. Auditors rarely know the application very well. When you have three guys on a security team responsible for hundreds or thousands of applications, it’s unlikely that they know each application.
  2. Auditors have a limited amount of time to spend training the scanner for each application. Often this is nearly no time at all. It is unlikely that the security team has:
    1. ever used the applications,
    2. had time to learn the ins and outs of each one, or
    3. had time to manually configure a scanner with full manual training.
  3. Auditors’ time is better spent on attacks only humans can do, such as business logic and privilege escalation attacks that automation may never be able to adequately discover.
  4. Even SaaS offerings that are aided by manual effort end up being limited by the quality of their automation. Do you really believe that a highly trained security professional is going to review & train the web application security scanner for every nook & cranny of every application? How long would you expect a highly trained security professional to perform a job like this before they wanted to poke their eyes out with a fork? Not long, I’m sure.
  5. Quality of the manual training will vary. Manual effort is going to focus on a few areas here and there and train for those high profile areas (the ones that probably have the best secure development applied to them). You may also end up with a less competent person doing the training, and you get less than ideal training data into the scanner. In the end, much falls back to the automation.

Bottom line, the effective web application security scanner must do everything possible to accomplish the best possible scan in a fully automated fashion. The less you leave for the human effort, the more effective the human effort will be. It’s taken us a decade of pure focus with a team of highly talented developers to solve each challenge and to overcome one niche case after another. We continue to innovate with automation, but we are also looking forward to the next generation of challenges, and the battle ahead.

The Classic Challenges

  1. Form based logins – There are several challenges here which are important to solve if you ever intend to schedule scans or simply be able to run a point & shoot scan.
    • You must automate detection of the login form. There are many possible formats, and they must be distinguished from other forms.
    • Deal with forms that include onsubmit events that do crazy stuff such as client-side encryption of the password to “protect” it over the wire, or calculate some predetermined key based on some other token.
    • Automate the determination of a successful login vs. failed login (different flavors of failures). This is one of the more challenging tasks that give web application security scanning vendors all sorts of headaches.
  2. Single sign-on – It can be a challenge to be able to log in while avoiding crawling and attacking sites not intended to be part of the attack surface. You must prevent sending credentials to the wrong place, and deal with the various cookies & tokens that get passed back and forth between the various domains/hosts involved in the SSO process.
  3. Auto-populating forms with valid data – To accomplish the best possible code coverage it is critical to populate form fields with valid data in order to get deep into applications that perform data validation.
    Example Scenario:
    A billing address form where all the input names/ids are textbox1, textbox2, etc. Additionally the developer added code to require a valid state & zip code.
    Weak solution:
    Because the scanner doesn’t know what would be valid inputs for textbox1, textbox2, etc, the scanner might enter a bunch of aaaaaaaa’s into the fields.
    Problem remains:
    The web application security scanner will basically be dead in the water without user training:

    • It will not pass this step, which could be step one in a multi-step process.
    • It will miss out on the SQL vuln possible in the street address field because the SQL INSERT happens several lines of code after the state & zip code validation.
  4. Dynamic changes based on user events – Often we see changes based on user action. An example is an onchange event for an option list. The JavaScript that gets executed might change the available form fields, or may populate hidden fields with data. If you do not perfectly emulate what would have happened in a browser, you can often fail the basic validation that takes place and never get to deliver your attack payloads.
  5. Session management – It is a constant challenge to stay logged into an application. The scanner must avoid logout buttons/links/events, must properly pass along session tokens wherever they happen to be at the moment (sometimes cookies, sometimes on the URL, sometimes in a hidden form field) and adjust to multiple possibilities taking place on a single app. The scanner must also properly identify when it has lost its session, and then be able to re-login (requires the automated login process mentioned above) to continue its scan.
  6. 404 detection – Some sites will use the standard 404 handler, but most have started to customize them to offer a better user experience. The scanner must employ a collection of tricks & techniques to handle the possible scenarios, or you end up with endless new links on many sites.
    • Custom 404 that responds as a 200. This is the simple one, but many scanners will get caught by this.
    • SEO friendly sites – In most of these applications there are no real files, and instead all 404 responses are trapped and processed through the framework to look up the intended content from a database. This can leave scanners unable to distinguish real content from a 404-equivalent response.
    • Different 404 handlers based on directory. We see many sites that might have a different 404 handler for one application. A simple example is when your site includes a blog that may be installed under its own path, such as /blog/. The blogging software may use SEO friendly URLs, thereby making your scanner think that EVERY page under /blog/ exists.
  7. Limiting repetitive functionality – Let’s say you’re scanning an online store with 100,000 items.
    • viewproduct.aspx?productid=5
    • viewproduct.aspx?productid=6

    or maybe it looks like

    • /product/5/view
    • /product/6/view

    You must auto-detect these situations and properly limit the amount of testing, or your scan will run for a very long time, and when it does eventually complete it might end up reporting the same vulnerability (or root cause) thousands of times.

  8. Memory management – As mentioned earlier, building a web scanner is a very complex software engineering task. You can ask around to find that even companies such as HP & IBM are known for having their scanners crash, in large part due to memory management issues. The reason is that each web application is different, but all responses must be parsed & analyzed. This parsing and analysis of unpredictable response data ends up requiring very solid engineering to handle properly.
  9. AJAX/HTML5 – I will save this for another blog post.
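
The custom 404 scenarios in item 6 can be illustrated with a small sketch. This is not how any particular scanner does it; it is one common approach, assuming the scanner can probe each directory with a name that should not exist and then compare later responses against that sample. The `fetch` callable, the `Soft404Detector` class, the 0.9 similarity threshold and the `fake_site` stand-in are all hypothetical names chosen for the example.

```python
import posixpath
import uuid
from difflib import SequenceMatcher
from typing import Callable, Dict, Tuple

Response = Tuple[int, str]           # (status code, body)
Fetch = Callable[[str], Response]    # hypothetical fetcher: path -> response


class Soft404Detector:
    """Learns what 'not found' looks like per directory, then compares."""

    def __init__(self, fetch: Fetch, threshold: float = 0.9) -> None:
        self.fetch = fetch
        self.threshold = threshold                # assumed similarity cutoff
        self.baselines: Dict[str, Response] = {}  # directory -> 404 sample

    def _baseline(self, directory: str) -> Response:
        # Probe each directory once with a random name that should not exist.
        if directory not in self.baselines:
            probe = posixpath.join(directory, uuid.uuid4().hex + ".html")
            self.baselines[directory] = self.fetch(probe)
        return self.baselines[directory]

    def is_not_found(self, path: str) -> bool:
        status, body = self.fetch(path)
        if status == 404:
            return True                           # standard 404 handler
        base_status, base_body = self._baseline(posixpath.dirname(path) or "/")
        if status != base_status:
            return False
        # A custom 404 page tends to be nearly identical to the probe response.
        return SequenceMatcher(None, body, base_body).ratio() >= self.threshold


def fake_site(path: str) -> Response:
    """Stand-in for a real site: /blog/ answers 200 even for missing posts."""
    pages = {"/index.html": "<h1>Welcome</h1>", "/blog/hello": "<p>First post</p>"}
    if path in pages:
        return 200, pages[path]
    if path.startswith("/blog/"):
        return 200, "<p>Sorry, no such post.</p>"
    return 404, "Not Found"


detector = Soft404Detector(fake_site)
print(detector.is_not_found("/blog/hello"))    # real content
print(detector.is_not_found("/blog/missing"))  # custom 404 served as 200
print(detector.is_not_found("/nope.html"))     # standard 404
```

Keeping one baseline per directory is what lets the sketch cope with a blog under /blog/ that has a different handler from the rest of the site. A real crawler would need more than this, for example handling 404 pages that echo the requested URL or change on every request.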

Those examples are just the start of the crawling problems that come to the top of my mind. I haven’t even started to mix in attacking and how that can cause session loss, and then how to find new application security vulnerabilities (known vulns don’t exist in this world of custom apps) while avoiding false positives, and eventually delivering a usable/useful report that a tester and a developer can both make use of to hopefully fix the problems the application security scanner finds. Trust me, the solutions to each problem and its many flavors were each hard-fought battles.
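
To make the repetitive functionality problem (item 7 above) concrete, here is a minimal sketch of one way to limit it: collapse each URL to a "shape" and only queue a few URLs per shape. The function names, the `{n}` placeholder and the cap of 3 are assumptions made for illustration, not a description of any product's algorithm.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import parse_qsl, urlsplit


def url_signature(url: str) -> str:
    """Collapse a URL to its shape: numeric path parts and query values are masked."""
    parts = urlsplit(url)
    path = "/".join("{n}" if seg.isdigit() else seg for seg in parts.path.split("/"))
    names = sorted(name for name, _ in parse_qsl(parts.query, keep_blank_values=True))
    return path + ("?" + "&".join(names) if names else "")


def limit_repeats(urls: Iterable[str], max_per_signature: int = 3) -> List[str]:
    """Keep at most max_per_signature URLs for each distinct shape."""
    seen: Dict[str, int] = defaultdict(int)
    kept: List[str] = []
    for url in urls:
        sig = url_signature(url)
        if seen[sig] < max_per_signature:
            seen[sig] += 1
            kept.append(url)
    return kept


store = [f"/product/{i}/view" for i in range(1, 100001)] + ["/about"]
queue = limit_repeats(store)
print(url_signature("viewproduct.aspx?productid=5"))  # viewproduct.aspx?productid
print(url_signature("/product/6/view"))               # /product/{n}/view
print(len(queue))                                     # 4
```

The trade-off in this sketch is that masking every query value also hides parameters that select different functionality (an `action` parameter, say), so a real implementation would have to be smarter about which values it treats as interchangeable.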

Time after time, as product after product attempts to face these challenges, we see them give up and move toward manual training. Enticing manual training interfaces move front and center. Point and shoot falls by the wayside.

At NT OBJECTives, we have confronted these challenges and have invented automation techniques to solve them. We have won those battles. Now we are setting our sights on the future problems. To read more about the battles we are fighting now, download our new whitepaper on Web Application Security Scanner Coverage in RIA, Mobile and Web Services. Or, if you are skeptical that we can effectively address these problems for your custom application, go ahead, request a free trial. I dare you! (Free trial NTOSpider)

Tales from the Web Scanning Front: Blacklisting

The smell of melting Blackberries/iPhones/Droids. You have probably smelled it before. You began testing an application and forgot to blacklist the “Contact Us” page so everyone who receives an email for “Contact Us” gets pummelled with emails during the test.

We often remind our customers about this kind of logistical trouble, but we still manage to get the frantic, breathless, panicky phone call when recipients of the “Contact Us” page begin receiving 1000 emails within 10 minutes.

So what do you do to prevent this from happening? It’s actually very simple.

First, a wee bit of background on web scanners. Because all applications are different (different page names, different parameter names, vulnerable in different spots to different attacks, etc.), web scanners have to crawl the targeted websites and then attack every page and parameter with hundreds of attacks. Unless told otherwise, every single page will be crawled and every parameter attacked.

Think about it, this includes the following kinds of pages:

  • E-Mail the sales team
  • E-Mail tech support
  • Wire the money
  • Delete this blog
  • Delete this item
  • Reset the admin password

Fortunately, all modern scanners have blacklisting technology. Blacklists in this context simply tell the scanner not to crawl and/or attack that page.

During your planning period or before you execute any application test, carefully consider the pages on your site that you don’t want to be crawled by the scanner dozens of times. Then, simply add the URLs for those pages to the blacklist in your scanner. It’s that easy.
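
As a rough illustration of what a blacklist check amounts to, here is a minimal sketch. Every scanner has its own configuration format, so the wildcard syntax, the `BLACKLIST` entries and the function names below are assumptions made for the example rather than any product's real settings.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List
from urllib.parse import urlsplit

# Hypothetical blacklist entries; the paths are illustrative only.
BLACKLIST = ["/contact*", "/admin/reset-password", "*/delete/*"]


def is_blacklisted(url: str, patterns: Iterable[str] = BLACKLIST) -> bool:
    """True if the URL's path matches any blacklist pattern (case-insensitive)."""
    path = urlsplit(url).path.lower()
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def filter_queue(urls: Iterable[str]) -> List[str]:
    """Drop blacklisted URLs before they are crawled or attacked."""
    return [url for url in urls if not is_blacklisted(url)]


crawl_queue = [
    "https://example.com/products/5",
    "https://example.com/Contact-Us",
    "https://example.com/blog/delete/42",
]
print(filter_queue(crawl_queue))  # only the products URL remains
```

Matching on the path rather than the full URL means the same entry works across hosts and query strings, which is usually what you want for a “Contact Us” form.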

Whether you outsource your scanning, use software in-house or use a SaaS service, you will have many fewer people screaming at you if you take some time to blacklist the pages and prevent the unexpected deluge in your co-workers’ inboxes.

Spending two minutes to properly configure your scanner will help avoid potential problems and keep the office free from the smell of burnt plastic.


Tales from the web scanning front: Don’t eat the entire buffet at once

One of the more common problems that we see is customers trying to bite off more of their application infrastructure at once than they can chew.  A certain amount of planning will yield better, more digestible results with substantially less indigestion.

Dropping your entire domain into your web scanner when there are 100 applications with 50,000 pages across 60 subdomains is likely not an optimal strategy.  Here are some considerations:

  • Scan time:  Assuming reasonable connectivity and application server horsepower, a scan of a medium-sized application can take 3 to 12 hours.  Scanning 60 applications at once will take a week or more before the scan completes and you can start working on the results.
  • Information Segmentation:  Most enterprises will have more than one development team.  It’s not the best policy to ship detailed information about all of your vulnerabilities to people who don’t need to know it.  Also, it’s much easier to have one report per application that you can just send to the team coding it so that they can fix just the vulnerabilities listed in the report.
  • Report Size:  A scan that large will create a report that will be immense if you have any significant number of findings.  Even if your vendor segments and paginates the report, it is going to be harder to navigate than a series of smaller reports.
  • Re-Scanning: Once the developers start remediating vulnerabilities, you will be asked to re-scan to give a clean bill of health for each application.  You don’t want to have to wait the week or more an enterprise scan takes to update the development team.

The one downside to all of this is that you will have to kick off and monitor more scans.  If you have a large number of applications and this is likely to be a logistical headache, you should consider an enterprise portal to schedule and monitor scans and deliver scan results (full disclosure, we offer such a tool).

As in most endeavors, a bit of planning goes a long way in making life easier.  Giving some thought to breaking up your application scanning will make your application scanning program a lot easier and more effective.
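
A minimal sketch of the planning step, assuming (for illustration only) that each subdomain hosts one application and so gets its own scan and report. The `group_by_host` helper and the example hostnames are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import urlsplit


def group_by_host(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Split one big target list into one scan job per hostname."""
    jobs: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        jobs[urlsplit(url).hostname or ""].append(url)
    return dict(jobs)


targets = [
    "https://shop.example.com/cart",
    "https://shop.example.com/checkout",
    "https://hr.example.com/login",
]
for host, start_urls in group_by_host(targets).items():
    print(host, len(start_urls))
```

Each resulting job can then be scheduled, reported on and re-scanned separately, so a fix in one application does not wait on a week-long enterprise scan.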

Tales from the Web Scanning Front: Why is This Scan Taking So Long?

As CEO, I’m constantly emphasizing the importance of customer support, and I try to attend several support calls each week to stay on top of our support quality and what customers are asking.

Surprisingly, application scan times are one of the most common issues raised by customers.  Occasionally, scans will take days or even weeks.

At this point, I would say that in almost all cases, the issue lies within the application’s environment as opposed to something within the software.

First some background on web application security scanners. Web scanners first crawl websites, enumerate attack points and then create custom attacks based on the site.  So, for example, if I have a small site with 200 attackable inputs and each one can be attacked 200 ways, with each attack requiring 2 requests, I have 200*200*2 or 80,000 requests to assess that site.

Now NTOSpider can be configured to use up to 64 simultaneous requests, so depending on the response time from the server, you can run through requests very quickly.  Assuming, for example, 10 requests a second, that’s 600 per minute, 36,000 per hour and you can get through that site in 2.22 hours.

The problem is that quite often the target site is not able to handle 10 or even 1 request per second.  Some reasons can include:

  • Still in development – The site is in development and has limited processing power and/or memory.
  • Suboptimal optimization – The site is not built to handle a high level of traffic and this has not yet shown up in QA.  We were on the phone with a customer last month who allowed us to look at the server logs and we saw that one process involved in one of our requests was chewing up 100% of the CPU for 5 seconds.  Another application was re-adding every item to the database each time the shopping cart was updated (as opposed to just the changes) and our 5,000 item cart was severely stressing the database.
  • Middleware – Not to bash any particular vendor (ColdFusion) but some middleware is quite slow.

So let’s look at our 80,000 request example from above and assume that our site can only handle 1 request per second.  Our 2.2 hour scan time balloons to 22 hours.  For our 5 second response in bullet 2, we get to 4.6 days for our little site.  The good news is that NTOSpider can be configured to slow itself down so as to not DOS the site (this is our Auto-Throttle feature).  The bad news is that it will take some time.
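
The arithmetic above is easy to reproduce. The sketch below uses the figures from this post (200 inputs, 200 attacks each, 2 requests per attack) and assumes, as the post does, a steady request rate for the whole scan. The function names are made up for the example.

```python
def total_requests(inputs: int, attacks_per_input: int, requests_per_attack: int) -> int:
    """Number of requests needed to attack every input every way."""
    return inputs * attacks_per_input * requests_per_attack


def scan_hours(requests: int, requests_per_second: float) -> float:
    """Scan duration in hours at a steady request rate."""
    return requests / requests_per_second / 3600


requests = total_requests(200, 200, 2)
print(requests)  # 80000

# 5 seconds per request is the same as 0.2 requests per second.
for label, rate in [("10 req/s", 10.0), ("1 req/s", 1.0), ("5 s per request", 0.2)]:
    hours = scan_hours(requests, rate)
    print(f"{label}: {hours:.2f} hours ({hours / 24:.2f} days)")
```

This prints 2.22 hours at 10 requests per second, 22.22 hours at 1 request per second, and 111.11 hours (4.63 days) when each request takes 5 seconds, matching the figures in the text.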

So what’s a poor tester to do?

  • Beefier hardware – If you are budgeting for a web scanner, consider spending a couple of extra thousand dollars on some decent hardware to test your apps. (Note – a modern laptop with optimal RAM for the OS you are running – 32-bit OS = 4 gigs of RAM / 64-bit OS = 8 gigs of RAM – will solve 90% of all performance issues.)
  • Scheduling – In some cases, you can schedule scans so that even if they are longer, you can still get things done in time.
  • Segmenting – In some cases, if you know that only a portion of the site has changed, you can target the scan to test only that subset and dramatically reduce scan time.
  • Code Augmentation – Not to put too fine a point on it, but if a single request is taking 5 seconds to process, a hacker can DOS your site by hand.  You might want the developers to look at adjusting the code.