How to Normalize Websites in Salesforce and Excel

Shaun VanWeelden
5 min readSep 8, 2019

I was recently tasked with seeing how many customers on a list we had were shared with lists others were providing us.

No problem I thought, I’ll just export the report in Salesforce, do a quick VLOOKUP or COUNTIF based on the Account Website in Excel, and Voilà, I’ll have my list!

I exported my report and realized it wasn’t going to be quite as easy… we had all types of values in our “Website” field and the lists to compare to also had different domain formats.

I needed a way to normalize everything together:

Results of Normalizing Different Domain Formats

In this post, I’ll share how I made:

1. A Normalized Website Formula Field in Salesforce

2. The Excel version of that Formula field so you can normalize other lists

3. A Process Builder flow so the Website field is always normalized, and you don’t need to create a second Website field.

The Recipe

First things first, I made a list of what had to be normalized:

  1. Remove https://
  2. Remove http://
  3. Remove anything after the first remaining “/” (including the slash itself)
  4. Remove “www.”
  5. Make it all lowercase

The one thing you could argue should also be removed is any additional subdomains. “blog.google.com” should in theory be “google.com”.

This specific case becomes quite challenging considering domains like “blog.gsuite.google.co.uk” to the point where I find it’s not worth the complexity.

Luckily for us, the probability that website fields include sub-domains is generally very low and can generally be dealt with manually or in the acceptable margin of error.

Salesforce Formula for Normalized Website

SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(
LEFT(
LOWER(Website),
IF(FIND("/", Website, FIND(".", Website)) = 0,
LEN(Website),
FIND("/", Website, FIND(".", Website)) - 1
)
),
"https://", ""), "http://", ""), "www.", "")

If you copy and paste this into a new text-based formula field on the Account, you’ll have a lovely normalized website field and be all set.

Break it down

If you find longer formulas interesting and fun like me, here’s a breakdown of what’s happening starting from the middle outwards.

We start off with LOWER to ensure casing never gets in our way of comparing websites.

Next, we need a way to see if there’s a “/” or things after a “/” at the end of our lower-case domain to remove. We’ll use LEFT to get everything to the left of our potential “/”. LEFT takes in two arguments:
1. Text to potentially cut off
2. The Nth letter in the text where it should cut things off

To get the Nth letter in our website for LEFT, we need to see IF there’s a “/” somewhere in the domain. To FIND if there is a “/”,
1. We pass in our search text, “/”
2. We pass in the text to look at, Website
3. We pass in the first place in that text we want to look for our slash. In this case, we want to look after the first “.” we see because we need to avoid finding a “/” in “http://” and we can count on there always being a “.” before a final slash because of the “.com” or similar part.

IF there is not a “/” in Website (as indicated by FIND returning 0), return the entire Website by getting the length of the Website field.
IF there was a “/”, only return the first N letters of the Website field before the “/”

Now with our LOWER-cased text and everything to the LEFT of a potential “/” at the end of our website, SUBSTITUTE out the “www”, “http”, or “https” part of the domain with an empty string.

🎉 You made it!

Excel Formula for Normalized Website

Luckily for us, Salesforce and Excel formulas are quite similar. We can use essentially the exact same logic in Excel as Salesforce.

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(
LEFT(
LOWER(A2),
IF(ISERROR(FIND("/", A2, FIND(".", A2))),
LEN(A2),
FIND("/", A2, FIND(".", A2)) - 1
)
),
"https://", ""), "http://", ""), "www.", "")

If you’re paying attention, the only change here is that we need to swap out our IF statement to use ISERROR instead of “= 0” because if text can’t be found in Excel, it returns an error instead of just 0.

Normalized Website Process Builder

If you too like to avoid making unnecessary fields but don’t want to give up this normalization efficiency, you can get a bit creative in Process Builder.

Goal:

On any changes to an Account’s Website field, re-normalize the value in the Website field. This works well when doing data imports from different vendors.

Create a New Process

Our Process is for the Account object

Make sure it fires when a record is edited as well

Define our Criteria

If our Website field changed, let’s go update it!

Define our Action

Have the Website be set to our formula above with one small twist, we need to explicitly tell it what object to pull our fields from (Account).

SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(
LEFT(
LOWER([Account].Website),
IF(FIND("/", [Account].Website, FIND(".", [Account].Website)) = 0,
LEN([Account].Website),
FIND("/", [Account].Website, FIND(".", [Account].Website)) - 1
)
),
"https://", ""), "http://", ""), "www.", "")

Click Use this Formula and Save

Turn it On

Click Activate, and maybe do a quick data load to set all existing websites with their normalized version, and you’ll be all set!

Oh, The Places You’ll Go

Hopefully this saves you a bit of time, gives you some inspiration, and helps you get back to making the most of your data instead of fighting domain variances.

If you have ideas, I’d love to hear them! Please leave a comment or find me on LinkedIn.

--

--

Shaun VanWeelden

Director of Computer Vision Engagements and Field Engineering at Scale AI | At the intersection between Customers and Code