Keeping data clean in Excel is not just a matter of aesthetics: it is a prerequisite for reliable calculations, correct joins, and usable tables. Between invisible spaces, non-printable characters, case errors, and misplaced separators, imported files are often a minefield. Here I show, step by step, how to identify these traps and fix them with simple formulas, modern tips, and the Power Query tool when datasets become serious.
Somaire
In brief
🤩 TRIM removes superfluous spaces (leading/trailing spaces and double spaces), but it does not solve all cases: non-breaking spaces (CHAR 160) or certain non-printable characters persist and require a combination with SUBSTITUTE or CLEAN.
🖁 SUBSTITUTE replaces one string with another without affecting case. To remove specific characters (like semicolons or dashes), it is often the most direct answer.
⚙️ For large volumes or repeated transformations, prefer Power Query: visual interface, step history, and better performance on long tables.
🤩 In recent Excel versions, advanced functions (e.g. REGEXREPLACE, TEXTSPLIT) simplify complex cases, but classic techniques remain essential to ensure portability and understanding of sheets.
Why clean your data?
A sloppy spreadsheet produces misleading results: totals that deviate, impossible lookups with VLOOKUP or INDEX/MATCH, and pivot tables that count more rows than necessary. Common causes are unexpected spaces, hidden carriage returns, characters copied from the web, or numeric formats stored as text. Cleaning does not mean “correcting” all values manually, but applying a series of reproducible operations that make the file reliable and auditable.
Basic functions to know
TRIM (SUPPRESPACE)
TRIM removes spaces at the start and end of a string and replaces multiple spaces with a single one. Example usage: =TRIM(A2). It is often the first step because it normalizes spacing between words. Note: TRIM does not necessarily erase non-standard spaces (e.g. non-breaking space), nor non-printable characters.
SUBSTITUTE (SUBSTITUE)
SUBSTITUTE replaces a substring with another, ignoring case. It is the tool of choice to remove specific characters: =SUBSTITUTE(A2,”,”,””) removes all commas. To remove a non-breaking space (CHAR 160), combine: =TRIM(SUBSTITUTE(A2,CHAR(160),” “)). The logic is simple: first replace the problematic value with a standard space, then normalize the spaces.
CLEAN (NETTOYER)
CLEAN removes non-printable characters (often from transfers from systems or the web). If your cell displays squares or causes unwanted carriage returns, CLEAN is a good reflex: =CLEAN(A2). Used in combination with TRIM, it gives a much cleaner result.
UPPER, LOWER, PROPER
Standardizing case improves readability and avoids apparent duplicates. UPPER converts all to uppercase, LOWER to lowercase, PROPER capitalizes the first letter of each word. Examples: =UPPER(A2) or =PROPER(A2). These transformations are essential before string comparisons.
Practical formulas and examples
Here are formulas you can paste and adapt. I also suggest a chaining logic: clean invisible characters → replace troublesome characters → normalize spaces → adjust case.
| Goal | Formula (fr) | Expected result |
|---|---|---|
| Remove carriage returns and tabs | =CLEAN(A2) | Text without non-printable characters |
| Replace non-breaking space (CHAR 160) | =SUBSTITUTE(A2,CHAR(160),” “) | Standard spaces |
| Combine cleaning and trim | =TRIM(CLEAN(SUBSTITUTE(A2,CHAR(160),” “))) | Clean text with correct spacing |
| Remove a specific character | =SUBSTITUTE(A2,”-“,””) | Removes all dashes |
| Standardize case | =PROPER(TRIM(A2)) | Standardized proper name |
Practical case: cleaning an “Address” column
Suppose a column containing addresses imported from a CRM that have non-breaking spaces and carriage returns. The formula below applies corrections successively:
=TRIM(CLEAN(SUBSTITUTE(C2,CHAR(160),” “)))
We start by replacing CHAR(160) with spaces, then CLEAN removes invisible characters, finally TRIM standardizes spaces. Result: an address usable for sorting and merging.
Advanced techniques and modern Excel
If you work with Microsoft 365 or a recent version, new functions speed up cleaning. REGEXREPLACE allows erasing complex patterns (for example removing all punctuation). TEXTSPLIT and TEXTJOIN facilitate recomposition after splitting. But two remarks: 1) not all organizations have these versions, 2) the robustness of classic formulas remains useful for portability.
Replacing repeated patterns
Without REGEX, you can nest SUBSTITUTE to remove several different characters. Example:
- =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,”.”,””),”,”,””),”;”,””) removes periods, commas, and semicolons.
- If you have REGEXREPLACE: =REGEXREPLACE(A2,”[[:punct:]]”,””) is more elegant (depending on availability).
Power Query: when to use it
Power Query (Get & Transform Data) is the suitable tool if you process large files or regular imports. The interface offers ready-to-use operations: convert to text type, remove spaces, replace values, split columns, apply an action on multiple columns simultaneously. Each step is recorded and can be replayed automatically when updating data.
- Frequent steps: Remove empty rows, convert types, replace values, remove spaces and clean columns.
- Advantage: traceability — you see the list of steps and can go back.
- Disadvantage: slight learning curve to write custom transformations (M language).
Simple example in Power Query
Import a CSV → Click on the column → Transformations → Remove spaces → Replace values → Close and load. The operations become reproducible and avoid manual errors.
Best Practices and Checklist
Before applying irreversible transformations, I recommend:
- Working on a copy of the original data.
- Documenting the steps (a “README” sheet or comments in the workbook).
- Applying transformations in a reversible way: use auxiliary columns rather than overwriting.
- Validating a sample at each step to avoid introducing errors (e.g., removing a ” – ” sign that was part of a code).
Quick Reference Table
| Problem | Tool/Formula | Note |
|---|---|---|
| Excess spaces | TRIM | Combine with SUBSTITUTE for CHAR(160) |
| Invisible characters | CLEAN | Ideal for carriage returns and tabs |
| Replace a character | SUBSTITUTE | Does not change case |
| Repeated operations / large volumes | Power Query | More robust and traceable than formula columns |
Tips to Avoid Pitfalls
One might think a single formula solves everything, but in reality each source has its peculiarities. Copy from a website → watch out for HTML entities, CSV export from software → watch out for local separators, import from an accounting system → watch out for numeric formats stored as text. Test on a subset, then automate via Power Query or macros if necessary.
FAQ
Does TRIM remove all invisible spaces?
No. TRIM handles regular spaces and multiple spaces, but some characters like the non-breaking space (CHAR 160) or other non-printable characters may resist. SUBSTITUTE and/or CLEAN are then used as complements.
What to do if my numbers are stored as text?
Use VALUE to convert to number, or in Power Query change the type to Number. Be careful with local decimal separators (comma vs period) at the time of conversion.
Is REGEX preferable to SUBSTITUTE?
REGEX is more powerful for complex patterns, but it is not available everywhere. To ensure portability and understanding, SUBSTITUTE and classic formulas remain a good compromise.
Does Power Query completely replace formulas?
Power Query is often preferable for initial ingestion and transformation. Formulas remain useful for dynamic calculations within the sheet. The two approaches complement each other.
{
“@context”: “https://schema.org”,
“@type”: “WebPage”,
“about”: {
“@type”: “Thing”,
“name”: “Excel Data Cleaning”
},
“keywords”: [“Excel cleaning”, “TRIM”, “SUBSTITUTE”, “CLEAN”, “Power Query”]
}