The role of data discovery and classification in DLP strategies

Data loss prevention (DLP) is a set of processes and technologies that ensure sensitive data is not lost, misused, or accessed by unauthorized users. Data protection is a major concern for any organization’s cybersecurity and IT teams. A recent Accenture report found that 44 percent of organizations had more than 500,000 customer records exposed in 2019. Even when organizations adopt DLP software, the investment sometimes fails them.

Two core concepts lay a strong foundation for an effective data loss prevention strategy, and many DLP tools miss the mark on both: sensitive data discovery and data classification. First, an organization cannot safeguard information it is unaware of, which is where data discovery comes into play. Second, “sensitive data” can mean different things to different people. Creating standards for how your organization classifies sensitive data, and then applying those classifications, gives everyone the same rules to follow. There’s less confusion about what needs to be handled with greater care, and it’s easier for all departments to work together to prevent data loss.

Of course, there’s more behind why these two core concepts are integral to effective DLP, and we explore that here.

What are DLP best practices?

Adopting a DLP tool is a great step toward meeting your data protection goals. However, because DLP is a set of both processes and technologies, it’s difficult to find one solution that addresses all of your needs.

Depending on your organization’s objectives, you can tackle a DLP strategy in several ways. No matter how you approach DLP, these five core practices should play a role.

1. Locate sensitive data using data discovery: the essential first step

If the name of the game is protecting sensitive data from unauthorized users, the first step has to be locating that data. Cybersecurity and IT departments can’t protect what they can’t find.

To keep up with digital transformation, organizations adopt all kinds of software to streamline their business operations or enhance the customer experience. Each new technology, however, can open security gaps. As your organization onboards and trains employees on new third-party systems, it’s easy for information to get lost in the mix or travel where it wasn’t supposed to. It’s not uncommon for an organization to discover entire troves of data it was unaware of.

Having a sensitive data discovery tool in place that works in real time is critical to an effective DLP strategy. According to Accenture, leaders are prioritizing cybersecurity tools that help them move fast, including how quickly a tool helps them detect a security breach or execute their response plan. Sensitive data discovery lets your organization be proactive, finding sensitive data wherever it resides.
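To make the idea of discovery concrete, here is a minimal Python sketch of pattern-based scanning. It is an illustration only, not how any particular product works: the two patterns, the `luhn_valid` helper, and the `find_sensitive_data` function are all hypothetical names, and real discovery tools rely on much richer detection than a pair of regular expressions.

```python
import re

# Hypothetical patterns for illustration; real tools use far richer detection.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Double every second digit, counting from the rightmost digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def find_sensitive_data(text: str) -> list:
    """Return (data_type, matched_text) pairs found in `text`."""
    findings = [("ssn", m.group()) for m in SSN_PATTERN.finditer(text)]
    for match in CARD_PATTERN.finditer(text):
        # The checksum filters out digit runs that merely look like card numbers.
        if luhn_valid(match.group()):
            findings.append(("credit_card", match.group().strip()))
    return findings


sample = "SSN 123-45-6789, card 4111 1111 1111 1111, order 1234 5678 9012 3456."
for data_type, value in find_sensitive_data(sample):
    print(data_type, value)
```

The checksum step shows why accuracy matters in discovery: the order number in the sample has the shape of a card number but fails validation, so it is not reported.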

2. Classify sensitive data: a critical role

In order to protect the valuable data discovered, organizations also need to know how to classify it into data sensitivity categories.

By classifying data, you:

  • Shift focus toward data that needs your immediate attention
  • Automate remediation tasks or workflows based on the data’s sensitivity level
  • Monitor or quickly pull reports on data by sensitivity level
  • Create standard policies that are easy for employees to understand and follow

Organizations can create any set of classifications that best addresses their objectives, but a common set is: public, internal, regulatory, and confidential. Those classification levels reflect how sensitive the data is. Sometimes, an organization may choose to classify data by other parameters, such as how much damage a breach of that data could do or which regulation governs it, like HIPAA, PCI, or GDPR.

Regardless of how an organization decides to classify its data, creating a data classification policy and sticking to it will significantly strengthen a DLP strategy. With these classifications in place, it becomes easier to address sensitive data like personally identifiable information (PII), protected health information (PHI), and intellectual property (IP).
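As a rough sketch of what a classification policy can look like once it is written down, the Python below maps detected data types to the four levels mentioned above and assigns a file the highest level among everything found in it. The `POLICY` table, the data type names, and the ordering of the levels (in particular, ranking confidential above regulatory) are assumptions made for illustration; your own policy may rank and name things differently.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four example levels; this ordering is an assumption."""
    PUBLIC = 0
    INTERNAL = 1
    REGULATORY = 2
    CONFIDENTIAL = 3


# Hypothetical policy table: detected data type -> classification level.
POLICY = {
    "press_release": Sensitivity.PUBLIC,
    "org_chart": Sensitivity.INTERNAL,
    "pii": Sensitivity.REGULATORY,
    "phi": Sensitivity.REGULATORY,
    "trade_secret": Sensitivity.CONFIDENTIAL,
}


def classify(data_types, default=Sensitivity.INTERNAL):
    """Return the highest level among the detected data types.

    Unknown types, and items with no detected types, fall back to `default`
    so that nothing is treated as public by accident.
    """
    levels = [POLICY.get(data_type, default) for data_type in data_types]
    return max(levels, default=default)


print(classify(["press_release", "pii"]).name)
print(classify(["pii", "trade_secret"]).name)
```

Taking the highest level present is a deliberately cautious choice: a document that is mostly public but contains one piece of PII is handled as regulatory data.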

3. Implement technical safeguards: best practices to follow

Along with the fundamentals of discovery and classification, organizations should follow these technical best practices:


Back up your data

Backups let you restore lost data, so set up automatic backups of the important parts of your databases, websites, and systems. Data is sometimes lost through human error or unavoidable events, like a liquid spill or natural disaster. In those cases, having data backed up can save your team from major headaches.


Encrypt data at rest and in transit

Critical business data should be encrypted both at rest and in transit. Authorized users can still view or modify a data file, but encryption prevents an unauthorized person from reading the data, which adds a layer of breach protection. Beyond individual data files, software-based and hardware-based encryption can also be applied to computers and portable devices. Encrypting a device’s hard drive keeps critical information unreadable even if the device is lost, stolen, or otherwise falls into an attacker’s hands.


Automate DLP processes

Manual DLP processes may work in a small business environment but are not sustainable for large or enterprise organizations. Manual processes are limited in scope and eat up valuable time and resources. The more DLP processes you can automate, the faster you can act on issues of concern. Automation also frees your cybersecurity team to focus on strategic planning rather than repetitive manual work.

When organizations automate their data discovery and classification, they get a near real-time view of when new sensitive data is created or introduced, and classification is practically immediate. Manual classification not only takes longer (in some cases, the lag between creation, manual discovery, and classification can be months) but is also prone to human error and subjectivity. Automating allows your organization to be agile, accurate, and better prepared.
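One way to picture automated remediation is as a rule table keyed by classification level. The Python sketch below is a simplified illustration under that assumption: the `REMEDIATION_RULES` table, the action names, and the `plan_remediation` function are hypothetical, and a real workflow engine would actually carry out the actions instead of just listing them.

```python
# Hypothetical remediation rules, keyed by classification level.
REMEDIATION_RULES = {
    "public": [],
    "internal": ["log_for_audit"],
    "regulatory": ["encrypt", "log_for_audit"],
    "confidential": ["encrypt", "restrict_access", "alert_security_team"],
}


def plan_remediation(findings):
    """Map each (path, level) finding to the actions its level calls for.

    Unknown levels are routed to manual review rather than silently ignored.
    """
    plan = {}
    for path, level in findings:
        plan[path] = list(REMEDIATION_RULES.get(level, ["manual_review"]))
    return plan


findings = [
    ("hr/payroll.xlsx", "confidential"),
    ("web/press-kit.pdf", "public"),
    ("tmp/export.csv", "unclassified"),
]
for path, actions in plan_remediation(findings).items():
    print(path, "->", actions or ["no action"])
```

Keeping the rules in one table means a policy change is a one-line edit, and anything the table does not recognize is surfaced for a person to review.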

4. Create and test your data breach response plan: looking ahead

To help reduce the damage of a data breach, it’s important to plan ahead by creating a data breach response plan. Delegate specific responsibilities to your team and make sure to test your plan. It’s worthwhile to run through the plan you’ve created with casual exercises or even in a simulated environment on a routine basis. As the technology landscape evolves, you may discover gaps in your response plan that need to be filled or areas where you will need to pivot.

5. Educate your entire organization: a “security-first” culture

The work doesn’t stop once you have created processes and implemented the technologies. To ensure strong data loss prevention, creating a “security-first” culture within your organization is critical. According to IBM’s 2019 Cost of a Data Breach Report, 24 percent of data breaches were caused by negligent employees or contractors.

Protecting sensitive data isn’t a task that should be left solely to your cybersecurity or IT teams. If your organization processes a large amount of sensitive data, then all departments should understand how it should be treated, along with the regulatory penalties, fines and other consequences that could come out of carelessness. Creating a culture of data privacy and security is an ongoing effort.

How do data discovery and data classification help with DLP strategies?

Data discovery and classification are two core areas that truly set the foundation for a strong data loss prevention strategy. When organizations skip over these steps, they leave large areas of vulnerability in their DLP strategy—putting at risk reputation, revenue, and regulatory compliance.

In addition to protecting sensitive data and preventing data breaches, an effective DLP strategy should also help your organization achieve regulatory compliance. Regulations like CCPA and GDPR should be a focus for companies. In 2020 alone, regulators issued many GDPR penalties for noncompliance.

Avoiding CCPA and GDPR fines and penalties, preventing data breaches, and safeguarding sensitive data are all very difficult if you don’t implement data discovery and data classification.

Choosing a data discovery and classification tool for a better DLP strategy

When you choose a data discovery tool, there are a few key questions you should ask:

  • Can sensitive data discovery be active 24/7?
  • Can our team be alerted in real-time? Is there real-time monitoring?
  • Can this tool find sensitive unstructured data—or does it only locate structured data?
  • What is the tool’s accuracy rate?
  • Is it possible to automate data classification? If so, how easy is it?
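When you ask a vendor about accuracy, it helps to pin down what is being measured. One common way to do that is with precision (how many flagged items were truly sensitive) and recall (how many truly sensitive items were flagged). The Python below is a small sketch of how you might compute both on a hand-labeled sample during an evaluation; the function name and the sample file names are hypothetical.

```python
def precision_recall(flagged, truly_sensitive):
    """Return (precision, recall) for a tool's output on a labeled sample.

    `flagged` is the set of items the tool reported as sensitive, and
    `truly_sensitive` is the set a reviewer confirmed as sensitive.
    An empty denominator is reported as 0.0 in this sketch.
    """
    flagged = set(flagged)
    truly_sensitive = set(truly_sensitive)
    true_positives = len(flagged & truly_sensitive)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(truly_sensitive) if truly_sensitive else 0.0
    return precision, recall


# Hypothetical evaluation sample.
tool_output = {"a.docx", "b.xlsx", "c.pdf", "d.txt"}
reviewer_labels = {"a.docx", "b.xlsx", "e.csv"}
precision, recall = precision_recall(tool_output, reviewer_labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Low precision means your team wastes time on false alarms, while low recall means sensitive data is going unnoticed, so it is worth asking for both numbers.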

Strengthen your DLP strategy with Spirion’s data privacy solution

Spirion’s Sensitive Data Platform is a data privacy solution that excels in automated sensitive data discovery, classification, and remediation. Our discovery tool lets you search PDFs, images, cloud repositories, databases, and even your colleagues’ laptops. No matter where your data ends up, whether in the cloud or on premises, unstructured or structured, Spirion can find it. It’s easy to set up automated classification, so once data is found, it’s instantly classified. Organizations can also create workflows with specific trigger events or actions to streamline remediation. To see our platform in action, watch a free demo.