The lang attribute on the <html> element declares the primary language of the page using a standardized BCP 47 code (en, fr, en-GB, zh-Hans, etc.). Screen readers use it to choose the correct pronunciation, browsers use it to suggest the right translation, and search engines use it to understand which audience the page should be served to.
- <html lang="en"> goes on the root element, every page, every time.
- Valid codes look like en, en-US, en-GB, pt-BR, zh-Hans.
- Audits flag a missing lang AND invalid lang values as separate failures.
- Search engines read lang alongside content analysis to determine the language for ranking. A wrong code can surface your page to the wrong audience.
- Use lang on inline elements too (<span lang="la">veni, vidi, vici</span>) so screen readers switch pronunciation mid-sentence.
Think of lang as the label on a record sleeve telling the player which speed to spin at. Without it, the needle still drops, but the music comes out garbled. A French page read aloud with English pronunciation rules is unintelligible, even though every letter is correct.
lang is one of the cheapest accessibility wins on the web: a single attribute with zero performance cost and five distinct payoffs:
- Without lang, screen readers fall back to the user's system language. NVDA, JAWS, and VoiceOver all switch voice profiles based on lang. A French page read with English phonetic rules is functionally incomprehensible to a blind user.
- Search engines use lang together with content-language analysis to decide which language pool a page belongs to. A page set to lang="en" but written in Spanish will compete in the wrong queries, and lose.
- Browser translation prompts rely on lang. Get it wrong and Chrome offers to translate French to French.
- AI search systems use lang to decide which language pool a page belongs to when answering a query. A page mislabelled as English may not be cited for French queries even if its content is perfectly French.
- Accessibility laws that build on WCAG make lang a legal requirement, not a best practice.
Missing lang is a silent failure for sighted users: the page looks identical. That's precisely why it's among the most common accessibility violations on the web: nobody on the team notices it during QA. Industry analysis of automated accessibility scans consistently places "html element does not have a lang attribute" in the top 10 most-flagged issues across all sites.
The Level A criterion states: "The default human language of each Web page can be programmatically determined."
"Programmatically determined" is the operative phrase: assistive tech needs a code it can parse, not a human guess.
Audits check two distinct conditions:
- Presence: does <html> have a lang attribute at all?
- Validity: is the value a real BCP 47 tag? en-EN, english, and en_US all fail validation even though they look reasonable.
<!-- BAD: no lang attribute -->
<!DOCTYPE html>
<html>
<head><title>About Us</title></head>
<body>...</body>
</html>
<!-- BAD: invalid BCP 47 (en-EN doesn't exist) -->
<html lang="en-EN">
<!-- BAD: underscore instead of hyphen -->
<html lang="en_US">
<!-- BAD: language doesn't match content -->
<html lang="en">
<body>
<h1>Bienvenue sur notre site</h1>
<p>Nous sommes ravis de vous accueillir.</p>
</body>
</html>
<!-- GOOD: valid 2-letter code -->
<!DOCTYPE html>
<html lang="en">
<head><title>About Us</title></head>
<body>...</body>
</html>
<!-- GOOD: language + region when meaningful -->
<html lang="en-GB">
<!-- GOOD: script subtag for Chinese -->
<html lang="zh-Hans"> <!-- Simplified -->
<html lang="zh-Hant"> <!-- Traditional -->
BCP 47 is the IETF language-tag format that the W3C specifies for the lang attribute. The structure is language[-script][-region]. For most sites the 2-letter language code is enough; add a region only when pronunciation or content genuinely differs.
| Code | Language | When to use |
|---|---|---|
| en | English | Default for English content where region doesn't matter. |
| en-US | American English | Spelling differs (color/colour); text-to-speech should use a US accent. |
| en-GB | British English | UK spelling, UK financial/legal context. |
| fr | French | Default for French content. |
| es | Spanish | Use es-ES vs es-MX only if content is regionalized. |
| pt-BR | Brazilian Portuguese | Strongly recommended: pronunciation and vocabulary differ significantly from pt-PT. |
| zh-Hans | Simplified Chinese | Mainland China, Singapore. |
| zh-Hant | Traditional Chinese | Taiwan, Hong Kong. |
| ar | Arabic | Pair with dir="rtl" for right-to-left rendering. |
| ja | Japanese | Default; region rarely needed. |
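The language[-script][-region] shape can be checked mechanically. The sketch below is one possible approach in JavaScript, not a complete validator. Intl.Locale is a standard built-in that rejects malformed tags such as en_US, but it does not know whether a subtag is actually registered, so the two small sets are illustrative assumptions standing in for the IANA Language Subtag Registry. The function name checkLangTag is hypothetical.

```javascript
// Illustrative subsets only (assumption): a real validator would look
// subtags up in the IANA Language Subtag Registry.
const KNOWN_LANGUAGES = new Set(['en', 'fr', 'es', 'pt', 'zh', 'ar', 'ja', 'he', 'la', 'de']);
const KNOWN_REGIONS = new Set(['US', 'GB', 'ES', 'MX', 'BR', 'PT']);

// Hypothetical helper: classify a lang value as usable or not.
function checkLangTag(value) {
  let locale;
  try {
    // Intl.Locale throws on tags that are not well-formed, e.g. "en_US".
    locale = new Intl.Locale(value);
  } catch (err) {
    return { ok: false, reason: 'not well-formed' };
  }
  if (!KNOWN_LANGUAGES.has(locale.language)) {
    return { ok: false, reason: 'unknown language subtag' };
  }
  if (locale.region && !KNOWN_REGIONS.has(locale.region)) {
    return { ok: false, reason: 'unknown region subtag' };
  }
  // toString() returns the tag with canonical casing, e.g. "zh-hans" becomes "zh-Hans".
  return { ok: true, tag: locale.toString() };
}
```

With these sets, en_US is rejected as not well-formed, en-EN is rejected for its region, and english is rejected for its language subtag, matching the three failure examples earlier in this article.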
Detection is fast — every accessibility audit catches a missing lang. Validating that the value is correct (and matches the actual content) takes a little more care.
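As a rough illustration of the presence check, the sketch below inspects an HTML string for a lang attribute on the opening <html> tag. It assumes a regular expression is good enough for a quick scan; a production audit would use a real HTML parser. The function name findRootLang is hypothetical.

```javascript
// Hypothetical helper: report whether the root <html> tag declares lang.
// Regex-based on purpose to stay dependency-free; not a full HTML parser.
function findRootLang(html) {
  const rootTag = html.match(/<html\b[^>]*>/i);
  if (!rootTag) {
    return { status: 'no-html-tag', value: null };
  }
  // Requires whitespace before "lang", so xml:lang and data-lang are not matched.
  const attr = rootTag[0].match(/\slang\s*=\s*(?:"([^"]*)"|'([^']*)')/i);
  if (!attr) {
    return { status: 'missing', value: null };
  }
  const value = (attr[1] ?? attr[2]).trim();
  return value === ''
    ? { status: 'empty', value }
    : { status: 'present', value };
}
```

A result of present only says that a value exists. Whether that value is a valid tag, and whether it matches the page's actual language, still has to be checked separately.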
- Greadme checks lang attributes alongside the rest of your WCAG audit, with a one-click fix that opens a GitHub PR with the corrected attribute.
- Site-wide scans show when lang was forgotten on a single layout (often the cause of dozens of failing pages from one missing line).
- Manually, inspect <html> and check the lang attribute is present and accurate.
- Validators flag invalid lang values.
Every page needs a lang on <html>. Setting it on <body> or a <div> instead does not satisfy WCAG 3.1.1.
<!DOCTYPE html>
<html lang="en">
<head>
<title>Welcome</title>
</head>
<body>...</body>
</html>
Prefer en over en-US unless region genuinely matters. Over-specifying can cause some assistive technologies to fall back to a generic voice if they don't carry the regional profile.
Mixed-language content needs inline lang. Screen readers switch pronunciation mid-sentence based on it.
<p>The Roman general declared
<span lang="la">veni, vidi, vici</span>
— "I came, I saw, I conquered."</p>
<p>The French phrase
<span lang="fr">c'est la vie</span>
means "that's life."</p>
If you translate a page, update lang. A French translation that still says lang="en" is worse than no lang at all: it actively misleads assistive tech.
SPAs that switch languages at runtime must update document.documentElement.lang when the locale changes — otherwise screen readers continue using the original pronunciation rules.
// React example: keep <html lang> in sync with the active locale
import { useEffect } from 'react';
function LocaleSync({ locale }) {
  useEffect(() => {
    // Runs on mount and again whenever `locale` changes
    document.documentElement.lang = locale;
  }, [locale]);
  // Renders nothing; the component exists only for the side effect
  return null;
}
BCP 47 uses hyphens: en-US is valid, en_US is not. The underscore form comes from POSIX locale conventions and silently fails validation.
Brand names, place names, and code samples shouldn't carry lang attributes. Tokyo, Nestlé, and a JavaScript snippet are not natural-language content the screen reader needs to switch voices for.
Arabic, Hebrew, Persian, and Urdu need both a language code and a direction attribute for the layout to render correctly.
<html lang="ar" dir="rtl">
<html lang="he" dir="rtl">
What's happening: The most common WCAG 3.1.1 failure. <html> has no lang at all, so screen readers fall back to the user's system language and mispronounce everything.
Fix: Add lang="en" (or the appropriate code) to the root <html> element in your base template/layout. In Next.js: <html lang="en"> in the root layout.
What's happening: en-EN doesn't exist (the ISO 3166 code for the United Kingdom is GB; there is no EN region). Audits flag it as invalid and screen readers may ignore the region subtag.
Fix: Use en (generic) or en-GB (British) — never en-EN. When in doubt, drop the region: en alone is always valid.
What's happening: Page declares lang="en" but the body text is in French. Common on translated sites that copied a template without updating the root attribute.
Fix: Drive lang from the same locale variable that drives the content. If you have an i18n provider, hook the root layout into it so the two can never drift.
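One way to keep the two from drifting is to render both from a single locale value. The sketch below is a minimal, framework-free illustration of that idea. The MESSAGES catalogue and the renderPage function are hypothetical stand-ins for whatever i18n provider the site actually uses.

```javascript
// Hypothetical message catalogue keyed by BCP 47 tag.
const MESSAGES = new Map([
  ['en', { title: 'Welcome', body: 'We are delighted to have you.' }],
  ['fr', { title: 'Bienvenue', body: 'Nous sommes ravis de vous accueillir.' }],
]);

// The same `locale` argument selects both the lang attribute and the copy,
// so a French page cannot ship with lang="en".
function renderPage(locale) {
  const messages = MESSAGES.get(locale);
  if (!messages) {
    throw new Error(`No messages for locale "${locale}"`);
  }
  return [
    '<!DOCTYPE html>',
    `<html lang="${locale}">`,
    `<head><title>${messages.title}</title></head>`,
    `<body><h1>${messages.title}</h1><p>${messages.body}</p></body>`,
    '</html>',
  ].join('\n');
}
```

Failing loudly on an unknown locale is a deliberate choice here: it avoids silently falling back to English copy under a non-English lang value.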
What's happening: A page in English has French quotes, Latin mottos, or German phrases with no inline lang. Screen readers read "Schadenfreude" with English phonetics, which sounds nothing like the actual word.
Fix: Wrap the foreign phrase in <span lang="...">. WCAG SC 3.1.2 (Language of Parts) is Level AA and requires this for any non-trivial foreign-language content.
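Where such phrases are inserted programmatically, a small helper can make the inline markup hard to forget. This is a sketch only: wrapForeign and escapeHtml are hypothetical names, and the language tag is supplied by the caller rather than detected.

```javascript
// Escape text so it is safe inside an HTML element or attribute value.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: mark a phrase with its own language (WCAG SC 3.1.2).
function wrapForeign(phrase, lang) {
  return `<span lang="${escapeHtml(lang)}">${escapeHtml(phrase)}</span>`;
}

const sentence = `<p>The French phrase ${wrapForeign("c'est la vie", 'fr')} means "that's life."</p>`;
```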
WCAG splits language requirements into two criteria. Knowing which one applies clarifies what to fix first.
| Criterion | Level | Where lang goes | Trigger |
|---|---|---|---|
| SC 3.1.1 Language of Page | A | <html lang="..."> | Every page needs a primary language declared. |
| SC 3.1.2 Language of Parts | AA | Inline elements (span, p, blockquote) | Any passage in a different language from the page default. |
Not strictly required, since Google can detect language from content alone, but strongly recommended. lang is one of several signals (along with hreflang, content language, and server location) Google uses to assign a page to a language pool. A wrong or missing lang can cost you visibility in your target market's search results.
lang declares the language of the current page. hreflang (used in <link> tags or sitemaps) tells Google about alternate language versions of the same page. Both should be present on multilingual sites and they should agree with each other.
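To illustrate how the two can be kept in agreement, the sketch below generates both the lang value and the hreflang alternates from one list of translations. The VERSIONS data, the example.com URLs, and the headFor function are assumptions made for illustration only.

```javascript
// Hypothetical list of translations of the same page.
const VERSIONS = [
  { lang: 'en', url: 'https://example.com/about' },
  { lang: 'fr', url: 'https://example.com/fr/a-propos' },
];

// Build the opening <html> tag and the <link rel="alternate"> tags for one
// version, so the page's lang always matches its own hreflang entry.
function headFor(currentLang) {
  const current = VERSIONS.find((v) => v.lang === currentLang);
  if (!current) {
    throw new Error(`Unknown version "${currentLang}"`);
  }
  const links = VERSIONS.map(
    (v) => `<link rel="alternate" hreflang="${v.lang}" href="${v.url}">`
  );
  return [`<html lang="${current.lang}">`, '<head>', ...links, '</head>'].join('\n');
}
```

Each version lists every alternate, including itself, so the lang on the page and the hreflang pointing at it come from the same record.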
Use en unless region genuinely matters for spelling, currency, or pronunciation. en is universally supported; en-US and en-GB are useful when the content is explicitly localized. Over-specifying can cause some screen readers to fall back to a generic English voice if they lack the regional profile.
Yes. Multilingual AI search systems use lang to decide which language pool a page belongs to when answering a query. A French page mislabelled lang="en" may not surface for French queries even if the content is perfect French. As AI search becomes more multilingual, accurate lang increasingly determines whether you're cited at all.
Three things break. Screen readers pronounce everything using the user's system language (often badly). Browsers can't offer accurate translation prompts. Search engines must guess the language from the body text and sometimes guess wrong, especially on short pages or pages with mixed content. On top of that, you fail WCAG 2.2 SC 3.1.1 Level A, which has legal implications under the EU EAA, US Section 508, the UK Equality Act, and similar laws.
The iframe's own document needs lang on its own <html>. The parent document's lang doesn't cascade through the iframe boundary. If you control the embedded content, set lang there too. If you don't (third-party widgets), it's out of your hands — focus on labelling the iframe with a clear title instead.
Yes — and it must. Update document.documentElement.lang whenever the locale changes. Most i18n libraries (next-intl, react-intl, i18next) expose a hook or callback for this; wire it to the root element on every locale change.
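A framework-agnostic version of that wiring might look like the following sketch. The applyLocale function and the RTL_LANGUAGES list are assumptions for illustration. The root argument would be document.documentElement in a browser, and the function would be called from whichever locale-change hook or callback the i18n library provides.

```javascript
// Right-to-left languages named in this article (illustrative, not exhaustive).
const RTL_LANGUAGES = new Set(['ar', 'he', 'fa', 'ur']);

// Hypothetical helper: keep lang and dir on the root element in sync with the locale.
function applyLocale(root, locale) {
  const language = new Intl.Locale(locale).language; // "ar-EG" yields "ar"
  root.lang = locale;
  root.dir = RTL_LANGUAGES.has(language) ? 'rtl' : 'ltr';
}

// In a browser: applyLocale(document.documentElement, 'fr');
```

Setting dir in the same place as lang means a switch to Arabic or Hebrew also flips the layout direction, rather than relying on a second code path to remember it.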
The lang attribute is one line of HTML that decides whether a blind user hears your content correctly, whether Chrome offers the right translation prompt, and whether your page surfaces in the right language pool on Google and AI search. WCAG 2.2 SC 3.1.1 makes it Level A — non-negotiable for legal compliance — and BCP 47 makes the format predictable: a 2-letter code, optionally followed by a region or script.
The fix is almost always trivial; the hard part is finding every template that's missing it. Run a Greadme deep scan to surface missing and invalid lang attributes across your site, and to catch the inline-language failures that automated tools usually miss.