Your Data Sucks… We Can Help

I’m sorry to have to be the one to tell you this, but your data sucks… and that’s a big problem. Artificial intelligence and machine learning will transform identity governance over the next few years, and you need to get ready so you can take advantage of them.

Here is the secret to effective machine learning: machines start off dumb, but they can get smarter with lots of high-quality data. If your data is bad, then the machine will just be artificially unintelligent and never reach its full potential. Don’t worry, though, it happens to everyone. This is a blame-free article, and I’m here to help.

There are three main types of data that can be used for artificial intelligence in identity governance: identity data, governance data, and activity data. We’ll be focusing on the first two here.

Identity Data

Identity data includes personal information, business information, and accounts/entitlements on your systems. If you have a fairly mature identity program, you may already have a lot of rich and clean information here, such as:
• Fully populated, consistent employee and contractor attributes (job title, department, location, etc.)
• Common patterns of entitlements. This does not have to be a full-blown role model, but people who have similar functions in your organization should likely have similar access. In fact, machine learning can pick up on these patterns to help define your access model and mine potential roles.

If you’re not there yet, don’t despair. A good place to start is just by putting your business knowledge into your data. Work with your human resources team to determine the defining characteristics of your employees, and then get them added into your HR application or corporate directory.
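One way to see how far you have to go is to measure how fully populated your identity attributes actually are. The snippet below is a minimal sketch, not a product feature: the attribute names (`job_title`, `department`, `location`), the record layout, and the `attribute_completeness` helper are all assumptions made for illustration.

```python
from collections import Counter

# Assumed attribute names; substitute whatever your HR system or directory uses.
REQUIRED_ATTRIBUTES = ("job_title", "department", "location")


def attribute_completeness(identities, required=REQUIRED_ATTRIBUTES):
    """Return, per attribute, the fraction of identities with a non-empty value."""
    if not identities:
        return {attr: 0.0 for attr in required}

    filled = Counter()
    for identity in identities:
        for attr in required:
            value = identity.get(attr)
            # Treat missing keys, None, and blank strings as "not populated".
            if value is not None and str(value).strip():
                filled[attr] += 1

    return {attr: filled[attr] / len(identities) for attr in required}


# Hypothetical sample records, as they might be exported from a directory.
identities = [
    {"name": "A", "job_title": "Sr. Financial Analyst", "department": "Finance", "location": "Austin"},
    {"name": "B", "job_title": "Sr. Financial Analyst", "department": "", "location": "Austin"},
    {"name": "C", "job_title": None, "department": "Finance", "location": "London"},
    {"name": "D", "job_title": "Accountant", "location": "London"},
]

completeness = attribute_completeness(identities)
for attr, share in completeness.items():
    print(f"{attr}: {share:.0%} populated")
```

Attributes that come back well below 100% are good candidates to take to your HR team first.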

Governance Data

Governance data is different. These are the decisions that people in your organization make about which access is warranted and which is not. Primarily, this comes from certifications and access request approvals. The unfortunate truth is that when an employee is greeted with an access certification that has hundreds or thousands of pieces of access that they must approve or revoke, they tend to just pull out the rubber stamp and approve everything. And so, what could be a gold mine of data is instead a wasteland.

Here’s how we get from the desolate present to a bright, shiny future: Imagine a giant knob – when you turn it all the way to the left, you have no artificial intelligence and when you turn it all the way to the right you have full artificial intelligence. When we begin, the knob will be turned all the way to the left. In this position, we use configurable rules to determine what should likely be approved and what shouldn’t.

For example, if the “Financial Reports Administrator” entitlement has been approved for at least one Sr. Financial Analyst within the past six months, and 75% or more of the other employees with the same title also have this entitlement, then it should likely just be approved without a certifier having to see it. Once the bulk of certifications are automated this way, certifiers have more bandwidth to notice unusual access and provide some nonsucky data by revoking things their employees shouldn’t have. Now that is something that the machine can really learn from!
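The rule in the example above can be sketched in a few lines. This is only an illustration under stated assumptions: the `should_auto_approve` function, the record layouts, and the parameter names are hypothetical, and "six months" is approximated as 182 days. The threshold and the look-back window are parameters because the rules are meant to be configurable.

```python
from datetime import date, timedelta


def should_auto_approve(user_id, entitlement, identities, approvals, today,
                        peer_threshold=0.75, lookback_days=182):
    """Return True if the entitlement can likely be approved without review.

    Two conditions must both hold (hypothetical rule, mirroring the example):
      1. The entitlement was approved for someone with the same job title
         within the look-back window.
      2. At least `peer_threshold` of the *other* people with that job title
         already hold the entitlement.
    """
    title = identities[user_id]["job_title"]
    cutoff = today - timedelta(days=lookback_days)

    recently_approved = any(
        a["entitlement"] == entitlement
        and a["job_title"] == title
        and a["approved_on"] >= cutoff
        for a in approvals
    )
    if not recently_approved:
        return False

    # Peers are everyone else with the same title; the user under review is excluded.
    peers = [p for uid, p in identities.items()
             if uid != user_id and p["job_title"] == title]
    if not peers:
        return False

    share = sum(entitlement in p["entitlements"] for p in peers) / len(peers)
    return share >= peer_threshold


# Hypothetical data: five analysts, three of the four peers hold the entitlement.
ENT = "Financial Reports Administrator"
TITLE = "Sr. Financial Analyst"
identities = {
    "u1": {"job_title": TITLE, "entitlements": {ENT}},
    "u2": {"job_title": TITLE, "entitlements": {ENT, "Payroll Admin"}},
    "u3": {"job_title": TITLE, "entitlements": {ENT}},
    "u4": {"job_title": TITLE, "entitlements": {ENT}},
    "u5": {"job_title": TITLE, "entitlements": set()},
}
approvals = [{"entitlement": ENT, "job_title": TITLE, "approved_on": date(2024, 3, 1)}]

decision = should_auto_approve("u1", ENT, identities, approvals, today=date(2024, 6, 1))
print("auto-approve" if decision else "send to certifier")
```

Anything the rule does not auto-approve still goes to a human certifier, which is where the useful revoke decisions come from.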

There you have it. Now the elephant in the room is out in the open. The faster you get through anger and denial, the faster you can start turning that data into something beautiful. And trust me… when you see what is on the horizon for the next wave of identity governance, you will be glad you did.