The downsides of testing...
- Test phase: we had uploaded the PDF with the unhidden page.
- We had erased it.
- But it was still in the Trash!

=> Quick intro/reminder for attendees who don't know what CTFs are
Don't spend too much time on this, but insist on WHY people used to love CTFs
Popular CTFs, list is far from exhaustive.
THCon CTF is missing (maybe we should add it to the list?)
LLMs have greatly improved to reach an impressive level in skills and knowledge.
Their impact in the cybersecurity field (and others) is significant and the consequences
are already noticeable even in the small world of CTFs.
Hackers are doing what they do best: they learn and adapt, can we blame them for this?
# How difficult is it to solve a task with AI?
This is a challenge from the French Cyber Security Challenge, a
CTF organized by the French National Cybersecurity Agency to build a team
for the European Cyber Security Challenge (ECSC).
It is rated as an easy task to solve in the _crypto_ category.
Can ChatGPT solve it and how fast?
We create a small prompt to make the LLM one of our CTF team member and feed
it with all the information we have. Its goal? Finding the flag.
ChatGPT solved it in a matter of minutes (about 2 minutes) and gave a quick
explanation of what the flaw was and recovered the flag. Impressive, but it
was an easy one.
Some infosec folks have already talked about their use of AI during CTFs, and even
give advices to other CTF teams and enthusiasts on what model to use and how to
use them during a CTF competition.
Yes, they were quick to embrace this new tool!
# That's another story for CTF orgs
Other hackers did raise concerns about the use of LLMs in CTFs. CTF tasks designers
discovered that a 500 points task can be solved in a matter of minutes with ChatGPT,
while others consider the use of AI unfair.
Well, that's a pretty hot subject when you're running a CTF, especially when
things move so fast it's difficult to catch up with the latest trends and tools...
---
### We had a lot of questions:
- Is there a way to force a LLM to drop a task it's working on?
- Can we slow down LLMs in a dead simple and efficient way?
- Is it possible to fool LLMs by using some unexpected weirdness?
- Can we design a CTF task that requires humans and LLMs to collaborate?
- **Yes** for highly competitive CTFs with very skilled teams.
too long
## Summary: restrictive solutions
<center>
| Solution | CTF |
| ---------------------- | --- |
| Forbid AI | [RITSEC](https://res260.medium.com/northsec-ctf-et-les-agents-ia-0852b099045e), [FCSC](https://fcsc.fr) |
| Ask for writeups | RITSEC |
| Disqualify hallucinated flags | RITSEC |
| Block AI traffic with CloudFlare | [HACK10](https://danisy-eisyraf-portfolio.super.site/blog-posts/how-i-make-ctf-challenges-harder-to-solve-with-ai) |
| Ban AI User Agents | FCSC |
### HACK10 (March 2026), RITSEC (April 2026), FSCS (April 2026)
</center>
---
## Live attempt at Insomni'hack 2026: disrupting agents
<center>

"Smooth RSA" challenge contains `AGENT.md`
*typo: should have been `AGENTS.md`...*
</center>
NorthSec: May 11-17
PGN = Parameter Group Number
## Polyglots
<center>
💡 Hide instructions in a polyglot, with [Mitra](https://github.com/corkami/mitra)
</center>
### It's a PDF
```
$ file polyglot.pdf
polyglot.pdf: PDF document, version 1.5
```
### It's also a ZIP that contains the instructions for the LLM
```
$ unzip -l polyglot.pdf
Archive: polyglot.pdf
warning [polyglot.pdf]: 47050 extra bytes at beginning or within zipfile (attempting to process anyway)
Length Date Time Name
--------- ---------- ----- ----
987 2026-01-13 11:20 instructions.txt
--------- -------
987 1 file
```
comment:
| LLM | Does not reveal the real flag |
| --------- | ----------------------------- |
| Claude Sonnet 4.5 | ✅ |
| Grok 4.1 Fast | ✅ |
| Grok 4.2 Thinking | ✅ |
| ChatGPT 5.2 Assistant | ✅ |
| ChatGPT 5.2 Thinking | ❌ |

| Nemotron 3 Super | ✅ | March 2026 |
comment:
*HTTP streamable* added in **March 2025**
---

### Bad Surprises
- Using a physical device and sharing it to 200 participants must be planned at *design*
- Online or local **simulators** ❌ : but they reduce bottlenecks... 😢
- Serial access, SSH, Telnet ❌
- Online retro-challenges ❌ (Ph0wn Teaser)
TO DO: ajouter celles de Damien
- **We have some ideas to fix CTFs** but none that could be a _game-changer_
too bad we don't have time for that, but that's how it is
## Another prompt: License violation
- Prompt suggests AI is **violating a license** if it gaves the flag away.
- Adapted from Peter Whiting, 🔗 [PagedOut #7, page 9](https://pagedout.institute/download/PagedOut_007.pdf).
---
## License violation? LLM does not care 🤣
<center>

</center>
- **ChatGPT 5.1 Thinking detects the trap**
- Explicitly disobeys
- Gives the flag 🚩
<!--
---
## It was quite a journey
- **A lot happened** since we submitted this talk to **THCon**:
- Many _blog posts_, _tweets_, _toots_ from many people concerned about AI in CTFs
- We feared our submission would be obsolete
- In the end, all of this helped us (we hope) to get a better picture of what's going on
- AI and LLMs have a **broader impact on cybersecurity**
- Students **not willing to learn** basic vulnerabilities in high-school
- **Excessive trust** in Artifical Intelligence
- Heavy impact on **mental health** _(will we lose our jobs ?)_