How I approached the technical SEO of my portfolio with Astro

After publishing the new version of my portfolio, I wanted to review something more specific than the design or the stack: whether the site was prepared to be crawled, understood and indexed correctly.

I did not want to stop at adding a title, a description and a sitemap. My goal was for every page to be published with a coherent SEO base from the beginning: clear URLs, consistent metadata, structured data, language alternates, good performance and settings that make crawling easier.

This is the technical work I did.

Designing the architecture around specific searches

The first SEO decision is not in a meta tag, but in the architecture. Each page needs to answer a specific intent and have a stable URL that can be linked, indexed and measured.

In my portfolio, that structure is split into:

Home.
About.
Projects.
Individual page for each project.
Blog.
Individual page for each article.
Contact.
Spanish and English versions.

This allows every URL to have a clearer intent. A project page can rank for the project name, the technology used or the type of work done. An article can rank for a specific question. The home page does not have to solve every possible search.

Astro fits this part well because it generates static HTML per page. I do not need Google to wait for a client-side app to render the main content. The HTML already arrives with structure, headings, text and links.

Centralising metadata

One of the first pieces I built was a reusable SEO component. Instead of repeating <title>, description, canonical, Open Graph, Twitter Cards and robots tags on every page, everything goes through one layer.

This gives me three advantages:

I avoid forgetting tags on individual pages.
I keep the format consistent.
I can change a global rule without editing half the project.

Each page provides its title and description, while the component completes the rest: canonical, social image, theme-color, meta robots, Open Graph and Twitter.

For a small site this may look like overkill. In practice, once you add projects, a blog and a second language, centralising the metadata prevents mistakes that are very easy to make.
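As an illustration, the core of such a layer could be sketched in TypeScript like this. It is only a sketch under stated assumptions: the names `resolveSeo`, `PageSeoInput`, `SITE_URL`, `DEFAULT_SOCIAL_IMAGE` and `THEME_COLOR`, and the example.com domain, are hypothetical placeholders and not the actual component.

```typescript
// Hypothetical shape of what each page passes to the shared SEO layer.
interface PageSeoInput {
  title: string;
  description: string;
  path: string; // canonical path, e.g. "/projects/"
  image?: string; // optional page-specific social image
  noindex?: boolean;
}

interface MetaTag {
  attr: "name" | "property";
  key: string;
  content: string;
}

interface ResolvedSeo {
  title: string;
  canonical: string;
  tags: MetaTag[];
}

// Assumed site-wide defaults (placeholders, not the real values).
const SITE_URL = "https://example.com";
const DEFAULT_SOCIAL_IMAGE = "/og-default.png";
const THEME_COLOR = "#111111";

// Completes everything the page did not specify itself.
function resolveSeo(input: PageSeoInput): ResolvedSeo {
  const canonical = SITE_URL + input.path;
  const image = SITE_URL + (input.image ?? DEFAULT_SOCIAL_IMAGE);
  const robots = input.noindex ? "noindex, nofollow" : "index, follow";

  return {
    title: input.title,
    canonical,
    tags: [
      { attr: "name", key: "description", content: input.description },
      { attr: "name", key: "robots", content: robots },
      { attr: "name", key: "theme-color", content: THEME_COLOR },
      { attr: "property", key: "og:title", content: input.title },
      { attr: "property", key: "og:description", content: input.description },
      { attr: "property", key: "og:url", content: canonical },
      { attr: "property", key: "og:image", content: image },
      { attr: "name", key: "twitter:card", content: "summary_large_image" },
      { attr: "name", key: "twitter:image", content: image },
    ],
  };
}
```

A page would then only provide `title`, `description` and `path`; changing a global rule, such as the default social image, means editing one constant.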

Consistent canonicals and trailing slash

One detail that looks small, but is not: URLs must be consistent.

In my case, I use a trailing slash consistently. That means the canonical URL of a page always ends in /. The sitemap also generates URLs using that same format.

The goal is to avoid duplicates such as:

/projects
/projects/
/projects/index.html

If Google finds several ways to access the same content, it can usually work out which one is canonical, but I prefer not to make it guess. One URL, one canonical, one clear version.
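The rule "every page path ends in /" can be expressed as a small normalisation function. This is a sketch, not the site's real code: `canonicalPath` is a hypothetical helper, and it is meant for page routes only, not for files such as sitemap.xml.

```typescript
// Normalises a page path so that all variants map to one canonical form.
// Intended for page routes only; file URLs should not pass through it.
function canonicalPath(path: string): string {
  // Drop fragment and query string: they never belong in a canonical path.
  let clean = path.split("#")[0].split("?")[0];

  // "/projects/index.html" is the same page as "/projects/".
  clean = clean.replace(/\/index\.html$/, "/");

  // Collapse accidental double slashes.
  clean = clean.replace(/\/{2,}/g, "/");

  if (!clean.startsWith("/")) clean = "/" + clean;
  if (!clean.endsWith("/")) clean += "/";
  return clean;
}

// canonicalPath("/projects")            -> "/projects/"
// canonicalPath("/projects/")           -> "/projects/"
// canonicalPath("/projects/index.html") -> "/projects/"
```

Using the same function for the canonical tag and for the sitemap keeps both outputs in the same format.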

Hreflang for Spanish and English

The portfolio has Spanish and English versions, so each page emits alternates:

es
en
x-default

The sitemap also includes those alternates. This matters because translating pages is not enough. Search engines need to know which version belongs to each language and which one is the default.

In my case, x-default points to the Spanish version because it is the main language of the site.
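A minimal sketch of how those three alternates could be produced for a page. The URL layout (Spanish at the root, English under /en/), the example.com domain and the names `buildAlternates` and `renderAlternates` are assumptions for illustration only.

```typescript
type Locale = "es" | "en";

interface AlternateLink {
  hreflang: string;
  href: string;
}

// Placeholders: the real domain and locale layout are not shown in this article.
const SITE_ORIGIN = "https://example.com";
const DEFAULT_LOCALE: Locale = "es";
const LOCALES: Locale[] = ["es", "en"];

// Takes the path of the same page in each language and returns
// one alternate per locale plus x-default.
function buildAlternates(paths: Record<Locale, string>): AlternateLink[] {
  const links: AlternateLink[] = LOCALES.map((locale) => ({
    hreflang: locale,
    href: SITE_ORIGIN + paths[locale],
  }));
  // x-default points to the main language of the site.
  links.push({ hreflang: "x-default", href: SITE_ORIGIN + paths[DEFAULT_LOCALE] });
  return links;
}

// Renders the <link> elements that go inside <head>.
function renderAlternates(links: AlternateLink[]): string {
  return links
    .map((l) => `<link rel="alternate" hreflang="${l.hreflang}" href="${l.href}" />`)
    .join("\n");
}
```

Both language versions of a page should emit the same set of links, so that each version references itself and the other one.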

Structured data with JSON-LD

I added JSON-LD so search engines can better understand what each page represents.

The site includes schemas such as:

Person, to identify me as the author and professional.
WebSite, to describe the site.
ProfilePage, on the home and about pages.
CollectionPage, on listings such as projects and blog.
CreativeWork, SoftwareSourceCode or WebApplication, depending on the project type.
BlogPosting, on each article.
BreadcrumbList, to reinforce the navigation hierarchy.

I do not expect structured data to improve rankings by itself, and it does not replace content. But it reduces ambiguity: it ties the author, site, projects and articles together as parts of the same entity.
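To illustrate how a BlogPosting can be tied to the Person and WebSite entities, here is a sketch that links them through `@id` references. The identifiers `AUTHOR_ID` and `WEBSITE_ID`, the example.com URLs and both function names are hypothetical; only the schema.org property names are real.

```typescript
interface ArticleInput {
  title: string;
  description: string;
  url: string;
  datePublished: string; // ISO 8601 date
  dateModified?: string;
}

// Assumed stable identifiers for the Person and WebSite nodes,
// which would be defined once elsewhere on the site.
const AUTHOR_ID = "https://example.com/#person";
const WEBSITE_ID = "https://example.com/#website";

function blogPostingJsonLd(article: ArticleInput): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    headline: article.title,
    description: article.description,
    url: article.url,
    mainEntityOfPage: article.url,
    datePublished: article.datePublished,
    // Fall back to the publication date when the article was never edited.
    dateModified: article.dateModified ?? article.datePublished,
    // References instead of copies: every article points to the same nodes.
    author: { "@id": AUTHOR_ID },
    isPartOf: { "@id": WEBSITE_ID },
  };
}

// Serialises the object for embedding in the page.
function toJsonLdScript(data: Record<string, unknown>): string {
  // Escape "<" so the JSON can never close the script element early.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```

Pointing every article at the same `@id` values is what lets a parser treat author, site and posts as one connected graph rather than unrelated snippets.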

Dynamic sitemap

I do not maintain the sitemap manually. It is generated from the real content in the project.

It includes:

Static pages.
Projects.
Blog articles.
Last modification date.
Change frequency.
Priority.
Language alternates.

This prevents the sitemap from becoming outdated when I publish a new project or article. I also added a sitemap.xsl stylesheet so that, if someone opens it in the browser, they do not just see plain XML but a readable table.

The sitemap is not magic, but it makes crawling easier. And on a site that grows with content, keeping it automated is a healthy decision.
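The generation step could look roughly like the following sketch. It assumes the entries have already been collected from the project's content; `SitemapEntry`, `renderSitemap` and the /sitemap.xsl path are illustrative names, not the site's actual implementation.

```typescript
interface SitemapAlternate {
  hreflang: string;
  href: string;
}

interface SitemapEntry {
  loc: string;
  lastmod: string; // ISO 8601 date, e.g. "2024-05-01"
  changefreq: "daily" | "weekly" | "monthly" | "yearly";
  priority: number; // 0.0 to 1.0
  alternates: SitemapAlternate[];
}

// Escapes the characters that are not allowed raw in XML text or attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function renderUrl(entry: SitemapEntry): string {
  const alternates = entry.alternates.map(
    (alt) =>
      `    <xhtml:link rel="alternate" hreflang="${escapeXml(alt.hreflang)}" href="${escapeXml(alt.href)}" />`
  );
  return [
    "  <url>",
    `    <loc>${escapeXml(entry.loc)}</loc>`,
    `    <lastmod>${escapeXml(entry.lastmod)}</lastmod>`,
    `    <changefreq>${entry.changefreq}</changefreq>`,
    `    <priority>${entry.priority.toFixed(1)}</priority>`,
    ...alternates,
    "  </url>",
  ].join("\n");
}

function renderSitemap(entries: SitemapEntry[]): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    // Assumed location of the stylesheet that renders the readable table.
    '<?xml-stylesheet type="text/xsl" href="/sitemap.xsl"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">',
    ...entries.map(renderUrl),
    "</urlset>",
  ].join("\n");
}
```

Because the entries come from the same content source as the pages, publishing a new project or article adds its URL without any manual step.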

Robots.txt without blocking important resources

The robots.txt allows the site to be crawled and points to the sitemap. It also explicitly allows AI bots and search engines to access the content.

One important point: I do not block /_astro/. That is where Astro-generated assets live, such as CSS and JavaScript. If you block those resources, Googlebot may not be able to render the page the way a real user's browser does.

I do block /api/, because it is not an area I want crawled if I add endpoints in the future.
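A robots.txt with those rules could be generated like this. It is a sketch: `renderRobots` and its options are hypothetical, and the crawler names passed in are up to the site owner. Note that a crawler matching a named group ignores the `*` group, which is why the Disallow rules are repeated in every group.

```typescript
interface RobotsOptions {
  sitemapUrl: string;
  disallow: string[]; // paths that should not be crawled
  namedAgents: string[]; // crawlers that get their own explicit group
}

// One group: the agent may crawl everything except the disallowed paths.
function renderGroup(agent: string, disallow: string[]): string[] {
  return [
    `User-agent: ${agent}`,
    "Allow: /",
    ...disallow.map((path) => `Disallow: ${path}`),
  ];
}

function renderRobots(options: RobotsOptions): string {
  const lines: string[] = renderGroup("*", options.disallow);

  // A crawler only follows the most specific group that matches it,
  // so each named group has to repeat the Disallow rules.
  for (const agent of options.namedAgents) {
    lines.push("", ...renderGroup(agent, options.disallow));
  }

  lines.push("", `Sitemap: ${options.sitemapUrl}`);
  return lines.join("\n") + "\n";
}
```

Calling it with `disallow: ["/api/"]` and leaving /_astro/ out of the list reproduces the policy described above: endpoints stay out of the crawl, rendering assets stay reachable.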

llms.txt and llms-full.txt

Beyond classic SEO, I added llms.txt and llms-full.txt.

The idea is simple: provide a structured version of the content for AI assistants. llms.txt works as a short index of the site and llms-full.txt gathers more context in Markdown format.

It is not a standard equivalent to robots.txt, but I find it useful for a personal site. If someone searches for information about me or my projects through an assistant, I want the content to be easy to find, cite and summarise correctly.

I keep them accessible, but configured as support files rather than main pages of the site.
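As a rough sketch of the short index, the following function renders the layout commonly used for llms.txt: a title, a one-line summary as a blockquote, and sections of Markdown links. The types and the name `renderLlmsTxt` are hypothetical, and the layout follows the informal llms.txt proposal rather than any formal standard.

```typescript
interface LlmsLink {
  title: string;
  url: string;
  description?: string;
}

interface LlmsSection {
  heading: string;
  links: LlmsLink[];
}

interface LlmsDocument {
  title: string;
  summary: string;
  sections: LlmsSection[];
}

// Renders a short Markdown index: "# title", "> summary", then one
// "## heading" per section with a bullet list of links.
function renderLlmsTxt(doc: LlmsDocument): string {
  const lines: string[] = [`# ${doc.title}`, "", `> ${doc.summary}`];

  for (const section of doc.sections) {
    lines.push("", `## ${section.heading}`, "");
    for (const link of section.links) {
      const suffix = link.description ? `: ${link.description}` : "";
      lines.push(`- [${link.title}](${link.url})${suffix}`);
    }
  }

  return lines.join("\n") + "\n";
}
```

The same project and article data that feeds the sitemap can feed this function, so the index does not drift away from the published content.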

Performance as part of SEO

Technical SEO is not just tags. How the site loads also matters.

That is why I made several decisions:

Use Astro to generate static HTML.
Avoid turning everything into an SPA.
Reduce client-side JavaScript.
Serve the Inter font locally.
Organise Sass with a maintainable architecture.
Cache Astro-generated assets for a long time.
Avoid heavy animations that did not support the content.

My previous portfolio had more visual effects. This one is more restrained, but it loads better and is easier to maintain. For a professional site, I prefer the content to be fast, clear and crawlable.

Hosting configuration

I also reviewed the hosting configuration so each type of resource is served correctly.

In practice, this means paying attention to three things:

Technical files should use the correct content type.
Static resources should be cacheable without delaying important updates.
Global configuration should not block files needed to render or crawl the site.

It is not the most visible part of SEO, but it can matter a lot. A sitemap served incorrectly, a blocked resource or overly aggressive caching can create issues that are hard to spot if you only look at the HTML.
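How these rules are written depends on the host, so the following is only a host-agnostic sketch of the decision logic. The function `headersFor`, the cache lifetimes and the path patterns are assumptions; the long-lived rule for /_astro/ relies on those file names containing a content hash.

```typescript
// Returns the response headers a given path should be served with.
// The concrete values are illustrative, not a recommendation for every site.
function headersFor(path: string): Record<string, string> {
  // Hashed build assets: a new build produces new file names,
  // so the old ones can be cached for a long time.
  if (path.startsWith("/_astro/")) {
    return { "Cache-Control": "public, max-age=31536000, immutable" };
  }

  // Sitemaps must be served as XML and should refresh reasonably often.
  if (path.endsWith(".xml")) {
    return {
      "Content-Type": "application/xml; charset=utf-8",
      "Cache-Control": "public, max-age=3600",
    };
  }

  // robots.txt, llms.txt and llms-full.txt are plain text.
  if (path.endsWith(".txt")) {
    return {
      "Content-Type": "text/plain; charset=utf-8",
      "Cache-Control": "public, max-age=3600",
    };
  }

  // HTML pages: always revalidate so content updates show up immediately.
  return {
    "Content-Type": "text/html; charset=utf-8",
    "Cache-Control": "public, max-age=0, must-revalidate",
  };
}
```

Keeping the aggressive cache limited to hashed assets is what avoids the "overly aggressive caching" problem: pages and technical files stay fresh while CSS and JavaScript stay cached.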

Content before tricks

The technical side helps, but it does not replace content. For the portfolio to have a better chance of ranking, I needed pages with real text, not just nice cards.

Each project has its own context:

What problem it solved.
What role I had.
What stack I used.
What I learned.
What I would improve.

That turns each project into a useful page, not just a screenshot with a link. The same applies to the blog: writing about real decisions in the project lets me create content connected to my experience.

What I learned

The main conclusion is that technical SEO works best when it is part of the architecture from the beginning.

It is not about adding a plugin, generating a sitemap and forgetting about it. It is about making coherent decisions:

Clear URLs.
Rendered HTML.
Consistent metadata.
Structured content.
Good performance.
Resources accessible to bots.
Structured data.
Sitemap and robots aligned.
Hosting settings that do not block crawling.

My goal was not to chase shortcuts, but to build a solid base. If I publish more projects and articles, the system is already prepared so every new page starts with a correct SEO structure from day one.