ENGINEERING-BASED SEO
Technical SEO
From crawl budget to Core Web Vitals, from schema architecture to hreflang clusters: a technical SEO program that ties rankings to engineering discipline rather than guesswork.
Ranking is not luck; it is the result of whether the system has been built correctly.
Modern SEO is a far deeper engineering problem than a keyword list. How Googlebot crawls your site, which URLs it spends budget on, how JavaScript weight breaks rendering, which template has bad Core Web Vitals, whether the schema graph is enough for entity linking, whether the internal link architecture carries authority to the right pages — all of these are measurable, auditable and systematizable topics. Roibase's technical SEO team rejects the 'produce content, the rest will come' approach: we first build, measure and improve every layer of site engineering; rankings follow naturally, without the need for panic-driven content production.
METHODOLOGY
Our operating framework
In technical SEO the 'do an audit, throw a list, move on' model no longer works. Roibase runs a six-layer engineering framework — each layer ships a measurable deliverable and a sustaining loop refreshes every month.
DISCOVER
Technical & business discovery
We merge Search Console, GA4, Bing Webmaster, log files and crawl data; we place the gap between business goals (revenue, leads, brand) and technical reality on a single slide.
ARCHITECT
Site engineering & template map
URL taxonomy, internal link architecture, schema graph, render strategy (SSR/SSG/ISR) and template-level CWV budget are tied to a single engineering document.
EXECUTE
Sprint-based execution & content refresh
An 8-12 item quarterly roadmap scored by impact x effort x difficulty; each sprint output passes QA and ships with developer tickets.
MEASURE
BigQuery + Looker dashboard
Impressions, CTR, average position, conversions, AI search share, schema validation rate and crawl distribution tracked daily on one screen; alerts fire automatically.
DEFEND
Algorithm & competitor monitoring
We monitor Google core update signals, SERP volatility and top-10 competitors' technical moves 24/7; when impact occurs, we share a root-cause report within 48 hours.
ITERATE
Monthly review & roadmap refresh
At each month-end, outcomes, hypotheses, deviations and new opportunities are compared; the next sprint's priority is updated on numerical evidence.
— COMPARISON
The difference between technical SEO approaches
Very different worlds can sit behind the same deliverable. There are three typical approaches in the market; we build the engineering-based model to close the gaps of the other two.
| Criterion | DIY / in-house junior | Classic SEO agency | Roibase engineering-based |
|---|---|---|---|
| Crawl budget management | Not done | Monthly manual check | Weekly log stream + automated alerts |
| Render audit | Lighthouse score is enough | Mobile-friendly test screenshot | Template-by-template render diff via Puppeteer + URL Inspection API |
| Schema architecture | FAQ + breadcrumb | JSON-LD on some templates | Full graph across 11+ types + entity disambiguation |
| Core Web Vitals | Green in PSI is enough | Shown in monthly report | Template-level budget + live field-data alerts |
| Internal link strategy | Sprinkled based on content | Pillar-cluster suggestions | Internal PageRank simulation + authority flow optimization |
| Migration safety | Fingers crossed | 301 list after the fact | Pre-migration audit + canary launch + post-launch monitoring |
| Algorithm response time | Unclear | Report after 1-2 weeks | Root cause + action list within 48 hours |
| Reporting | GA + Search Console screenshot | Monthly PDF | Looker dashboard + alerted BigQuery + sprint review |
PROOF
Outcomes, measured
Average at month 12 across engagements — data from 24 clients.
Across accounts that started at 24%; mobile + desktop average.
Average drop after log cleanup + index hygiene.
Hreflang cluster management on a single site architecture.
Root cause + action list delivery time after a core update.
With an optimized content ops pipeline (previously 2-3 weeks).
WHAT WE DO
Engagement scope
Every offering is an outcome-based work package. Roibase blends strategy and execution inside a single team — no hand-offs.
Log file analysis & crawl budget management
We parse server logs for Googlebot, Bingbot and LLM crawlers (GPTBot, ClaudeBot, PerplexityBot); how many requests each template receives, which URL groups eat budget for nothing, which 404/302 chains keep getting crawled — shown in a weekly report. We increase the share of crawls reaching critical pages by an average of 3-5x.
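As a minimal sketch of the log parsing described above, the snippet below counts crawler hits per bot, template and status code. It assumes logs in the Combined Log Format and uses a hypothetical `template_of` mapping in which the first path segment stands in for the template; a real site would need its own mapping, and user agents should be verified (for example via reverse DNS) before the numbers are trusted.

```python
import re
from collections import Counter

# Combined Log Format: ip - - [time] "METHOD path HTTP/x" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# User-agent substrings for the crawlers named in the text above.
BOTS = ("Googlebot", "bingbot", "GPTBot", "ClaudeBot", "PerplexityBot")


def template_of(path: str) -> str:
    """Hypothetical mapping: treat the first path segment as the template."""
    segment = path.split("?")[0].strip("/").split("/")[0]
    return "/" + segment if segment else "/"


def crawl_distribution(lines):
    """Count crawler hits per (bot, template, status) from raw log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not follow the assumed format
        ua = match["ua"].lower()
        bot = next((b for b in BOTS if b.lower() in ua), None)
        if bot:
            key = (bot, template_of(match["path"]), int(match["status"]))
            counts[key] += 1
    return counts
```

Calling `crawl_distribution(open("access.log"))` on a log file (the filename is illustrative) returns a `Counter` whose largest entries show which template and status combinations absorb the most crawler requests.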
JS render audit & SSR/SSG architecture
We test how CSR pages are rendered by Googlebot using Puppeteer + URL Inspection API; we solve hydration bottlenecks in React/Vue/Next/Nuxt projects and, if needed, produce a migration roadmap to SSR/SSG. We prevent JS weight from capping your rankings.
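The core of a render diff can be sketched in a few lines: compare what is present in the raw HTML response with what exists after JavaScript has run. The example below is an illustration under assumptions, not our production tooling. It takes both documents as strings (capturing them with Puppeteer or the URL Inspection API is out of scope here) and reports links and text that only appear after rendering.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collects href values and visible words, skipping script/style content."""

    def __init__(self):
        super().__init__()
        self.links = set()
        self.words = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.add(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def extract(html: str):
    parser = LinkAndTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links, parser.words


def render_diff(raw_html: str, rendered_html: str) -> dict:
    """Summarize what only exists after JavaScript execution."""
    raw_links, raw_words = extract(raw_html)
    rendered_links, rendered_words = extract(rendered_html)
    return {
        "js_only_links": sorted(rendered_links - raw_links),
        "raw_word_count": len(raw_words),
        "rendered_word_count": len(rendered_words),
    }
```

A template whose `raw_word_count` is close to zero while `rendered_word_count` is large depends entirely on client-side rendering, which makes it a likely candidate for SSR or SSG.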
Schema.org & entity graph architecture
We build a full JSON-LD tree for Organization, Product, Article, FAQPage, HowTo, BreadcrumbList, Person, Service, Review, Event and Offer. Schema is not just about visibility; it is the foundation for entity disambiguation in generative search engines, so rich results and GEO (generative engine optimization) citations are targeted in parallel.
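To illustrate what a connected schema graph means in practice, here is a small sketch that builds one JSON-LD `@graph` in which the Product node points back to the Organization node through a shared `@id`. The domain, company name and function names are placeholders invented for this example; only the schema.org types and properties are real.

```python
import json

SITE = "https://www.example.com"  # placeholder domain


def build_graph(product_name: str, product_path: str, price: str, currency: str) -> dict:
    """Return one JSON-LD document whose nodes reference each other by @id."""
    org_id = f"{SITE}/#organization"
    product_url = f"{SITE}{product_path}"
    return {
        "@context": "https://schema.org",
        "@graph": [
            {"@type": "Organization", "@id": org_id, "name": "Example Co", "url": SITE},
            {
                "@type": "BreadcrumbList",
                "@id": f"{product_url}#breadcrumb",
                "itemListElement": [
                    {"@type": "ListItem", "position": 1, "name": "Home", "item": SITE},
                    {"@type": "ListItem", "position": 2, "name": product_name, "item": product_url},
                ],
            },
            {
                "@type": "Product",
                "@id": f"{product_url}#product",
                "name": product_name,
                "brand": {"@id": org_id},  # reference, not a duplicated Organization node
                "offers": {
                    "@type": "Offer",
                    "price": price,
                    "priceCurrency": currency,
                    "url": product_url,
                },
            },
        ],
    }


def to_script_tag(graph: dict) -> str:
    """Serialize the graph into the script block that carries JSON-LD."""
    return '<script type="application/ld+json">' + json.dumps(graph, indent=2) + "</script>"
```

Referencing the organization by `@id` instead of repeating it on every node keeps the entity definition in one place, which is the property the disambiguation argument above relies on.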
International SEO & hreflang cluster
We set up hreflang clusters that scale to 7 languages + 18 countries and define the x-default fallback correctly; the sub-folder vs sub-domain decision is made based on CTR, authority flow and operational cost. We fix local rankings damaged by wrong hreflang within 4-6 weeks.
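Reciprocity is the rule that breaks most often in hreflang clusters: every variant must list the others, and they must list it back. The sketch below generates the tag set for a cluster and checks a crawled set of annotations for missing return links. The function names and data shapes are assumptions made for this example.

```python
def hreflang_tags(cluster: dict[str, str], x_default: str) -> list[str]:
    """Build the tag set that every page in the cluster should carry.

    cluster maps an hreflang code (e.g. "en-gb") to that variant's URL.
    The same list goes on every variant, so each page also references itself.
    """
    tags = [
        f'<link rel="alternate" hreflang="{code}" href="{url}" />'
        for code, url in sorted(cluster.items())
    ]
    tags.append(f'<link rel="alternate" hreflang="x-default" href="{x_default}" />')
    return tags


def missing_return_links(annotations: dict[str, dict[str, str]]) -> list[tuple[str, str]]:
    """Find (source, target) pairs where target does not link back to source.

    annotations maps a page URL to the {hreflang code: target URL} pairs
    found on that page, for example from a crawl export.
    """
    problems = []
    for source, alternates in annotations.items():
        for code, target in alternates.items():
            if target == source or code == "x-default":
                continue
            if source not in annotations.get(target, {}).values():
                problems.append((source, target))
    return sorted(problems)
```

An empty result from `missing_return_links` means every annotation in the crawl is confirmed by the page it points to; each returned pair is a link that search engines may ignore.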
Core Web Vitals & field data management
LCP < 2.5s, INP < 200ms, CLS < 0.1 — these numbers are not targets but starting points. We track CrUX and PSI field data by template and alert within 24 hours when regression appears. From image preloading to critical CSS, we own every lever in LCP optimization.
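A template-level check against these thresholds might look like the sketch below. It assumes 75th-percentile field values per template have already been collected (the input shape is hypothetical) and uses Google's published boundaries: the "good" limits quoted above plus the "poor" limits of 4.0s for LCP, 500ms for INP and 0.25 for CLS.

```python
# (good_max, poor_min) per metric; values in between are "needs improvement".
THRESHOLDS = {
    "lcp_ms": (2500, 4000),
    "inp_ms": (200, 500),
    "cls": (0.1, 0.25),
}


def rate(metric: str, p75: float) -> str:
    """Rate a single 75th-percentile field value."""
    good_max, poor_min = THRESHOLDS[metric]
    if p75 <= good_max:
        return "good"
    if p75 > poor_min:
        return "poor"
    return "needs improvement"


def template_report(p75_by_template: dict) -> dict:
    """A template passes only if all of its metrics are rated good."""
    report = {}
    for template, metrics in p75_by_template.items():
        ratings = {name: rate(name, value) for name, value in metrics.items()}
        passes = all(r == "good" for r in ratings.values())
        report[template] = {"ratings": ratings, "passes": passes}
    return report
```

Comparing two consecutive reports gives a simple regression signal: any template whose `passes` flips from `True` to `False` is a candidate for an alert.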
Internal linking & topical authority
We build the topical authority map from a query matrix; we analyze pillar/cluster architecture, anchor text distribution, breadcrumb depth and internal PageRank distribution. 30-40% of most sites' traffic is the result of authority routed to the wrong page — fixing this alone drives serious growth.
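Internal PageRank distribution can be approximated with the classic power-iteration method run over the site's own link graph. The following is a simplified sketch, assuming the graph is available as a dictionary of page to outgoing internal links (for example from a crawl export); the damping factor of 0.85 is the conventional default, not a value taken from our tooling.

```python
def internal_pagerank(
    links: dict[str, list[str]], damping: float = 0.85, iterations: int = 50
) -> dict[str, float]:
    """Power-iteration PageRank over an internal link graph.

    links maps each page to the pages it links to. Scores sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages without outlinks is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank
```

Sorting the result by score and comparing it with the list of commercially important pages shows where authority pools on the wrong URLs.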
Index hygiene & cannibalization audit
We extract cannibalization signals from Search Console, log and GA4 data and clean up 30+ pathologies such as soft 404s, "indexed, not submitted in sitemap" and "duplicate without user-selected canonical". We keep the pages that should stay in the index and manage the rest with the right signal: noindex, 410 or canonical.
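One simple cannibalization signal can be derived from a Search Console export of query and page pairs: a query is suspicious when two or more pages each take a meaningful share of its impressions. The sketch below assumes rows of `(query, page, impressions)` and a 20% share cut-off; both the input shape and the cut-off are assumptions chosen for illustration.

```python
from collections import defaultdict


def cannibalization_candidates(rows, min_share: float = 0.2) -> dict:
    """Return queries for which two or more pages each hold at least min_share
    of the impressions. rows is an iterable of (query, page, impressions)."""
    by_query = defaultdict(dict)
    for query, page, impressions in rows:
        by_query[query][page] = by_query[query].get(page, 0) + impressions

    flagged = {}
    for query, pages in by_query.items():
        total = sum(pages.values())
        if total == 0:
            continue
        competing = sorted(p for p, imp in pages.items() if imp / total >= min_share)
        if len(competing) >= 2:
            flagged[query] = competing
    return flagged
```

A flagged query is only a candidate: two pages can legitimately share a query when they serve different intents, so each case still needs a manual check before consolidating.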
Content ops & query matrix
We do optimization, not production: query intent clustering, SERP feature mapping, content gap analysis, brief production, internal link suggestions and a refresh loop. We set up the operational pipeline to work with your content team (in-house or freelance) and cut the brief-to-publish time from 2-3 weeks to 5-7 days.
Migration & replatform risk management
Before a CMS change, domain migration, design system transition or move to headless, we run a technical risk assessment, build a URL map + 301 chain plan + rollback strategy. A pre-migration audit + post-launch monitoring is mandatory to avoid the '40% traffic loss after migration' story.
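Part of a 301 chain plan is making sure every legacy URL redirects straight to its final destination rather than hopping through intermediate URLs. The sketch below flattens a redirect map and refuses to continue if it finds a loop; the dictionary input is an assumed format for the URL map.

```python
def flatten_redirects(redirects: dict[str, str]) -> dict[str, str]:
    """Map every old URL directly to its final destination.

    redirects maps source URL to target URL. Raises ValueError on a loop.
    """
    flattened = {}
    for start in redirects:
        seen = [start]
        current = redirects[start]
        while current in redirects:
            if current in seen:
                chain = " -> ".join(seen + [current])
                raise ValueError(f"redirect loop: {chain}")
            seen.append(current)
            current = redirects[current]
        flattened[start] = current
    return flattened
```

For example, `flatten_redirects({"/old": "/interim", "/interim": "/new"})` maps both `/old` and `/interim` directly to `/new`, so no crawler or user has to follow more than one hop.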
Algorithm update response protocol
During big waves — core update, helpful content update, spam update or site reputation abuse update — we produce an impact analysis within 48 hours: which pages were affected, why, how long recovery takes and which fixes take priority. Systematic response, not panic.
— OUTCOMES
What the technical SEO program delivers for you
Beyond page rankings we produce commercial and operational outcomes. Here are 6 concrete results our typical client sees within 90-180 days.
Predictable organic revenue
We forecast impressions, CTR and conversions on a quarterly basis; you gain an SEO model that speaks the same language as your finance team.
Escape technical debt
We systematically clean up old URLs, broken schemas, hydration anomalies and cannibalization stacks; the engineering team can focus on new features.
Visibility in AI search
Investment in schema and the entity graph translates into citations in ChatGPT, Perplexity and AI Overviews; we run the classic SEO program and the GEO (generative engine optimization) program together.
Operational speed
Briefs, internal link suggestions, schema templates and QA automation speed up your content production 3-5x; your team keeps up with sprints.
Migration safety
We drive the traffic-loss risk close to zero during a replatform, domain migration or CMS change; no 'we'll fix it later' stress.
Algorithm resilience
We take a proactive, not reactive, approach to core updates; when impact occurs, we share an action list within 48 hours and typically recover within 30-45 days.
DELIVERABLES
What we ship inside the service scope
Concrete artifacts you receive at the end of every sprint — we ship files and systems rather than verbal reports.
Technical SEO audit document
40-60 findings, impact x effort scores, owners assigned in a Notion page.
Crawl & log report
Weekly Googlebot behavior report, anomalies, crawl frequency of critical URLs.
Render audit report
Puppeteer screenshot diffs, hydration metrics, SSR/SSG recommendations.
Schema implementation package
JSON-LD snippets for all templates + validator output + test links.
Internal link architecture map
Pillar/cluster diagram, anchor text matrix, internal PageRank simulation.
CWV optimization plan
Template-level action list for LCP, INP, CLS + developer tickets.
hreflang cluster configuration
XML sitemap + HTML tag + HTTP header triple, x-default fallback included.
Content brief template
Query intent, SERP features, internal link suggestions, schema notes, target keyword cluster.
BigQuery + Looker dashboard
Sprint review, alert configuration, competitor monitoring — one dashboard.
Migration playbook
Pre-migration checklist, 301 map, canary launch plan, post-launch monitoring.
Algorithm response protocol
Format for a root cause + action list within 48 hours of a core update.
Monthly executive summary
A one-pager for the C-level: traffic, revenue, sprint outputs, next quarter target.
— SCOPE
What's inside and what's outside the technical SEO program
Transparency is essential: stating clearly what we don't do is as important as stating what we do.
What this service covers
- Technical SEO audit + a 40-60 item action list
- Weekly log file analysis + crawl budget management
- JS render audit + SSR/SSG architecture recommendations
- Schema.org JSON-LD implementation for all templates
- Core Web Vitals optimization plan + live field data tracking
- Internal link architecture + topical authority map
- International SEO + hreflang cluster management
- Content brief production + editorial pipeline setup
- Index hygiene + cannibalization cleanup
- Migration risk management + post-launch monitoring
- Algorithm update response protocol (48-hour SLA)
- Monthly BigQuery + Looker dashboard review + roadmap refresh
What this service does not cover
- Spam links / PBN / black-hat link building
- Automated AI content generation (without QC)
- Guaranteed #1 ranking promises (no one can deliver these on Google)
- Website design & development (we collaborate with partner teams)
- Social media management & influencer marketing
- Performance marketing (Google/Meta Ads) — separate service scope
- Direct press release / PR distribution
- Post-sales technical support / hosting management
HOW WE WORK
Rankings by system-building, not guessing.
Week 1 — Technical & business discovery
Search Console, GA4, log files, crawl data and business goals are synced; a baseline report is produced.
Week 2 — Audit & priority matrix
A 40-60 finding technical audit scored by impact x effort x difficulty; the first-quarter roadmap is presented on one slide.
Week 3-4 — Quick win implementation
Index hygiene, robots/sitemap fixes, critical schema injection, obvious CWV regressions — the first concrete impact appears in this sprint.
Week 5-6 — Architectural refactor
URL taxonomy, internal link structure, render strategy and template CWV budgets are implemented with the engineering team.
Month 2 — Content ops pipeline
Query matrix, brief template, content refresh loop, editorial QA. Content production speed rises 3-5x; the pillar/cluster structure settles in.
Month 3 — Authority building & link earning
Digital PR, partner content, production of high-source-value assets (reports, calculators, datasets); a natural backlink flow is triggered.
Month 4 — SoV defense & competitor monitoring
A defensive sprint to protect share of voice (SoV) and market share: competitor technical moves, SERP feature losses and AI Overviews visibility tracked on one board.
Month 5+ — Monthly iteration & scaling
Sprint review, fixing deviations, new language/country/product launches, executive summaries. The program becomes a permanent capability.
— ECOSYSTEM
The platforms & tools we use
Pivotal tools per category — we integrate with the client's existing stack whenever possible and do not force new licenses.
CRAWL & INDEX
RENDER & PERFORMANCE
CONTENT & QUERY
REPORTING & WORKFLOW
— GLOSSARY
Technical SEO glossary
Engineering-based SEO carries its own vocabulary. Short definitions for the concepts we use most, so we speak the same language.
- Crawl budget
- The total crawling capacity Googlebot allocates to your site. Determined dynamically by server speed, site authority and content freshness; when wasted, critical pages get indexed late.
- Indexability
- The set of attributes that decide whether a URL is eligible to enter Google's index: meta robots, canonical, hreflang, status code, content quality and crawl accessibility evaluated together.
- Core Web Vitals (CWV)
- The three-metric set Google uses to measure user experience: LCP (Largest Contentful Paint), INP (Interaction to Next Paint) and CLS (Cumulative Layout Shift). Google's ranking systems use them as one page experience signal among many.
- LCP / INP / CLS
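- LCP is the time for the largest content element in the viewport to render (good: 2.5s or less), INP is the delay between an interaction and the next paint (good: 200ms or less), CLS is the score for unexpected layout shifts (good: 0.1 or less); each is assessed at the 75th percentile of field data. All three must be 'good'; if one is bad, the template's CWV is considered failing.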
- Hydration
- The process by which server-rendered HTML is 'brought to life' by JavaScript in the browser. Slow hydration breaks INP; if JS errors, content appears but becomes unclickable.
- Schema.org JSON-LD
- A structured-data format that makes page content machine-readable. The foundational signal for rich-result eligibility, entity disambiguation and LLM citations.
- hreflang
- The <link rel="alternate" hreflang="x"> annotation that tells search engines about language/region variants of the same content. In a multilingual architecture it keeps each variant ranking in its own market and prevents wrong-language rankings and self-cannibalisation; it must be deployed reciprocally, with an x-default fallback.
- Topical authority
- The aggregate signal that shows your site's expertise depth on a particular topic cluster. A result of pillar/cluster architecture, internal link density and E-E-A-T components.
- Internal PageRank
- The classic PageRank algorithm simulated across your site's internal link structure. Measures how much authority each page accumulates from your own site; authority flow directly impacts ranking power.
- E-E-A-T
- Experience, Expertise, Authoritativeness and Trustworthiness. Google's quality evaluation framework; decisive for YMYL topics, supportive in other areas.
- SERP volatility
- A measure of daily fluctuation intensity in Google rankings. High volatility typically signals an algorithm update; tracked via tools like Semrush Sensor and MozCast.
- Cannibalization
- A situation where two or more pages serving the same intent compete for the same query. Both pages rank lower and Google becomes uncertain which to show; resolved by consolidation or intent separation.
- Canonical Tag
- The <link rel="canonical"> element that points search engines to the "primary" URL of identical or near-identical content. Consolidates signals across filtered categories, parameterised URLs, AMP/mobile alternates and syndicated copies; a wrong canonical can hand the indexed slot to the wrong URL.
- Structured Data
- Markup that makes page content machine-readable using the schema.org vocabulary. A prerequisite for rich-result eligibility and a supporting signal for AI Overviews and voice answers; the most common carrier is JSON-LD, validated with the Rich Results Test and monitored in Search Console's rich result reports.
- JSON-LD
- The most common structured-data syntax — schema.org markup carried as JSON inside a <script type="application/ld+json"> block. Decoupled from the HTML, friendly to server-side rendering, and Google's recommended format.
- XML Sitemap
- An XML file listing the site's indexable URLs for search engines, optionally with last-modified time and change frequency (Google uses lastmod when it is reliable and ignores changefreq and priority). Large sites combine sitemap index files with hreflang annotations; the file is referenced from robots.txt.
- robots.txt
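- A plain-text file at the site root that tells crawlers which paths they may crawl. Carries per-user-agent Allow/Disallow directives, the Sitemap reference and, increasingly, non-standard AI-crawler hints like LLM-Content that crawlers are not obliged to honor. Disallow controls crawling, not indexing: a wrong Disallow can hide critical pages from crawlers, and a blocked URL can still be indexed without its content.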
- NLU (Natural Language Understanding)
- The NLP sub-discipline that turns raw text into intent + entities + sentiment. The layer search engines use to understand queries (BERT, MUM), chatbots use to extract commands and voice assistants use to pick the action. In the RAG/LLM era it gives SEO new meaning: optimise for semantic match.
- Query Intent
- The underlying purpose behind a search: informational ("how to"), navigational ("instagram login"), transactional ("buy airpods") and commercial investigation ("best headphones 2026"). Matching content type to intent (guide vs product page) is the key to ranking and conversion.
- Query Reformulation
- When the search engine rewrites or expands the entered query behind the scenes to understand it better — "iphone problem charging" → "iPhone won't charge". Modern Google does this aggressively with BERT/MUM; LLM-based search (Perplexity, ChatGPT Search) breaks the query into reasoning steps.
- Faceted Search
- Narrowing search results by parameters like category, price, brand, size, colour or rating (filters). Indispensable in e-commerce, but without proper canonical, robots and parameter handling it can spawn millions of duplicate URLs. A native feature in Algolia, Elasticsearch and Coveo.
- Autocomplete / Typeahead
- Showing search suggestions in real time as the user types. Built on trie data structures + popular-query telemetry + personalisation. In e-commerce it lifts conversion 15-30% and lowers bounce; Algolia, Elasticsearch search-as-you-type and Typesense are examples.
- Spell Correction (Search)
- Routing misspelled queries to the correct version. Built on edit distance (Levenshtein), keyboard-aware language models and query-log mining. Google's "Did you mean?" is the classic example; turning this off in e-commerce site search means leaving sales on the table.
- Stemming vs Lemmatization
- Stemming strips suffixes by rule to reach a root ("running", "runs" → "run"); lemmatisation uses a dictionary to map to the correct base form ("better" → "good"). Stemming is fast but error-prone; lemmatisation is slower but more accurate. The layer that lets search match inflected variants of a word, such as singular and plural forms.
- Inverted Index
- A data structure that maps each word to the IDs of the documents containing it, for example "coffee" → [doc 17, 42, 88]. The secret to sub-millisecond search; Lucene, Elasticsearch, Solr and Tantivy all use it. The engine of classic keyword search, as opposed to vector search.
- BM25 (Okapi BM25)
- The classic keyword-search relevance algorithm (1994) — an improved TF-IDF that adds term-frequency saturation and document-length normalisation. The default scorer in Elasticsearch, Solr and OpenSearch; even in the vector-search era it remains the powerful partner in hybrid search.
- PageRank
- Google's "link authority" algorithm, invented by Larry Page and Sergey Brin in 1998. Recursively scores a page using the count and authority of inbound quality links. In modern Google it is one of hundreds of signals, but it is still the foundation of "link equity".
- E-E-A-T (Experience, Expertise, Authoritativeness, Trust)
- The quality framework at the heart of Google's Search Quality Rater Guidelines. "Experience" was added in December 2022: practical hands-on experience. Critical to ranking in YMYL (Your Money or Your Life) niches such as health, finance and law. Author bios, citations and fact-check signals matter.
- YMYL (Your Money or Your Life)
- Google's label for pages that can "affect a user's money, health, safety or happiness". Health articles, finance guides, legal content and child-safety pages. In these niches Google evaluates E-E-A-T signals and content quality far more strictly.
- Helpful Content System (Google)
- A site-wide Google signal launched in August 2022 that demotes content "not designed for users, only for search engines"; in March 2024 Google folded it into its core ranking systems. Generic AI-written SEO sludge is the main target; pages with genuine expertise and original perspective gain ground.
- Schema Rich Results
- Search results that, thanks to Schema.org markup (JSON-LD), display extras like star ratings, prices, videos, FAQs and breadcrumbs in Google. CTR rises 20-50%; the "Rich results status" report in Search Console tracks errors.
- FAQ Schema
- Marking up question-and-answer blocks on a page with Schema.org/FAQPage. Google previously showed these as accordion-style rich results, but since August 2023 it shows them only for well-known, authoritative government and health sites. Still valuable as input for AI Overviews, since it gives LLMs structure.
- HowTo Schema
- Marking up step-by-step content such as DIY repairs and setup guides with Schema.org/HowTo (recipes have their own Recipe type). Google stopped showing HowTo rich results in 2023, but the markup remains valuable structure for AI search and LLM ingestion.
- Product Schema
- Marking up e-commerce product pages with Schema.org/Product: price, availability, sku, gtin, aggregateRating, review. Supports eligibility for organic listings in Google Shopping, Bing and Pinterest visual search; correct implementation lifts CTR 30%+ and earns price snippets.
- LocalBusiness Schema
- Marking up local businesses (restaurant, hair salon, clinic) with Schema.org/LocalBusiness sub-types. Address, phone, opening hours, geo-coordinates, menu URL and price range. Combined with the Google Business Profile, it's a dominant ranking signal for "near me" searches.
- Featured Snippet (Position 0)
- A short answer box Google shows above the regular search results, pulled from a page. Comes in definition, list, table or video formats; lifts CTR 2-3×; 40-60 words is ideal for paragraph snippets. Its role is shifting under AI Overview but it remains valuable.
- People Also Ask (PAA)
- The accordion of related questions and answers Google shows on a SERP under the heading "People also ask". Clicking expands related questions; content strategists treat it as a roadmap for sub-niche content. SERP volatility is high and AI Overview is reshuffling its real estate.
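The Inverted Index and BM25 entries in the glossary above can be made concrete with a short sketch. It builds an inverted index over three toy documents and scores a query with the BM25 formula. The whitespace tokenizer, the document texts and the parameter values (k1 = 1.2, b = 0.75, common defaults) are assumptions for illustration; production engines add analysis steps such as stemming.

```python
import math
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    """Naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs: dict[int, str]):
    """Return (index, lengths): term -> {doc_id: term frequency}, doc_id -> token count."""
    index = defaultdict(dict)
    lengths = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        lengths[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, lengths


def bm25(query: str, index, lengths, k1: float = 1.2, b: float = 0.75):
    """Score documents for a query; returns (doc_id, score) sorted best first."""
    n_docs = len(lengths)
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        # Inverse document frequency: rarer terms weigh more.
        idf = math.log(1 + (n_docs - len(postings) + 0.5) / (len(postings) + 0.5))
        for doc_id, tf in postings.items():
            # Length normalisation: long documents need more hits for the same score.
            norm = k1 * (1 - b + b * lengths[doc_id] / avg_len)
            # Term-frequency saturation: each extra occurrence adds less.
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + norm)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Only documents present in a query term's posting list are ever touched, which is why lookups stay fast as the collection grows.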
— DECISION TREE
Is the technical SEO program right for you now?
Four short questions. In 30 seconds we tell you the right starting point — full program, content-first model, foundation before an architectural refactor, or strategic discovery.
01 / 04
Has your site lost organic traffic or revenue in the last 6-12 months?
Trend check via Search Console + GA4.
— LET'S BEGIN
How efficiently is your site architecture telling your story to Google?
We scan crawl, render, schema and ranking signals in 72 hours and show, in a single report, which bottlenecks are slowing you down and which gains are reachable in the next 90 days.