What's in this article
Bad bots can cause trouble, attacking websites, trying to bring them down! Even legitimate bots can get out of control, making huge numbers of requests to websites! Let's see how we can deal with this issue...
"Bots" in this context refers not to walking and talking robots, but to web bots, which are automated programmes that perform simple tasks at speed.
An example of a bot is "Googlebot", Google's own bot, which constantly discovers new web pages and indexes their content so that Google can show results when we search.
Other bots are not so useful and could be designed to attack websites, trying to bring them down!
But even legitimate bots can get out of control; we've seen it many times. Bots that belong to web applications can go mad, making huge numbers of requests to websites. Sometimes it's because of a bug or glitch; other times it's by design but still causes us issues.
The huge number of requests can put a massive load on the server and cause websites to struggle or even go down.
Our websites are really for humans to use, so it can get to the point where we need to take action...
As mentioned, some bots are "good". Googlebot is a good bot and we wouldn't want to block it, or no one would find our website!
So are some others from SEO tools and other legitimate software that needs to look at sites in order to assess their content.
Human Security covers the list of the common good bots in their article.
Unwanted bots often originate from countries such as Russia or China, whereas all we actually care about is users in the UK. You may well want traffic from other countries; we're just saying that, as a UK business, we want to focus there.
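You can find which bots are hitting your site in your Apache logs. The access log records every request along with its user agent, whereas the error log only shows requests that caused errors. Typical locations (exact paths vary by host) are: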
In cPanel:
/var/log/apache2/error_log
Using SSH:
tail -f /var/log/apache2/access.log
In Plesk:
/var/www/vhosts/<domain.tld>/logs/
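Once you've located the access log, a short script can summarise which user agents are making the most requests. The following is a minimal sketch in Python, assuming the log uses Apache's combined log format; the function name count_user_agents and the sample lines are made up for illustration and are not real traffic.

```python
import re
from collections import Counter

# Combined log format: IP, identity, user, [time], "request", status, size,
# "referer", "user agent". The user agent is the last double-quoted field.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_user_agents(lines):
    """Return a Counter of user-agent strings found in access-log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # lines in any other format are skipped
            counts[match.group("agent")] += 1
    return counts


# Illustrative sample lines (documentation IP addresses, not real visitors)
SAMPLE = [
    '192.0.2.10 - - [14/Sep/2025:12:03:15 +0000] "GET /wp-login.php HTTP/1.1" 200 3267 "-" "python-requests/2.26.0"',
    '192.0.2.10 - - [14/Sep/2025:12:03:16 +0000] "GET /?author=1 HTTP/1.1" 301 178 "-" "python-requests/2.26.0"',
    '198.51.100.7 - - [14/Sep/2025:12:05:22 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "AhrefsBot/7.0 (+http://ahrefs.com/robot/)"',
]

# Print the busiest user agents first
for agent, hits in count_user_agents(SAMPLE).most_common():
    print(hits, agent)
```

To run it against a real log, pass an open file instead of SAMPLE, for example count_user_agents(open(path)) with path set to your own access log.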
Here's an example of bad bots in the Apache log, thanks to ChatGPT:
192.241.200.101 - - [14/Sep/2025:12:03:15 +0000] "GET /wp-login.php HTTP/1.1" 200 3267 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)"
185.220.101.4 - - [14/Sep/2025:12:05:22 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "AhrefsBot/7.0 (+http://ahrefs.com/robot/)"
45.146.165.55 - - [14/Sep/2025:12:07:09 +0000] "GET /admin/ HTTP/1.1" 404 502 "-" "Mozilla/5.0 (compatible; CensysInspect/1.1; +https://about.censys.io/)"
185.191.171.35 - - [14/Sep/2025:12:08:46 +0000] "GET /?author=1 HTTP/1.1" 301 178 "-" "python-requests/2.26.0"
222.186.30.112 - - [14/Sep/2025:12:10:02 +0000] "POST /xmlrpc.php HTTP/1.1" 200 712 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:54.0) Gecko/20100101 Firefox/54.0"
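Why these look suspicious:
Requests for /wp-login.php, /admin/, /?author=1 and /xmlrpc.php are typical probes for login pages, admin areas, WordPress author (username) enumeration and XML-RPC abuse.
The user agent python-requests/2.26.0 is the default for a Python scripting library, not a web browser.
Firefox 54 dates from 2017, so a visitor presenting that user agent today is more likely to be a script than a person.
MJ12bot, AhrefsBot and CensysInspect identify themselves as crawlers or scanners; they are not necessarily malicious, but you may not want their traffic.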

Here are some example rules you can add to your .htaccess file to block certain bots. Note, the exact syntax of .htaccess rules can vary depending on your server technology stack, so you may need to research further to find one that works for you.
For most websites, you'll have an .htaccess file in the root of your site and you can add some rules to block bad bots.

Warning! You can break your site by adding incorrect or malformed rules to your .htaccess file. I recommend adding one rule at a time and testing that your site still loads in your browser before going on to the next. If in doubt, add them on a development or staging site first.
We can block bots by HTTP_USER_AGENT in our .htaccess file.
Here's the general format to block the bots named in the rules below (substitute BOTNAME, BOTNAME2 and BOTNAME3):
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (BOTNAME|BOTNAME2|BOTNAME3) [NC]
RewriteRule (.*) - [F,L]
So to block a bot called SiteAuditBot, we would add the following rule:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (SiteAuditBot) [NC]
RewriteRule (.*) - [F,L]
SiteAuditBot is actually a legitimate bot used by Semrush. However, it can still make a lot of requests to our site, so we can block it if we want to.
You can add more bot names by separating them with "|".
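Before putting a user-agent rule live, it can help to check which user agents it would actually catch. The sketch below approximates the RewriteCond test in Python: an unanchored, case-insensitive search, like the [NC] flag. It is only an approximation (Apache uses PCRE, and the names here are escaped as literal text), and the function name is_blocked and the user-agent strings are illustrative assumptions.

```python
import re


def is_blocked(user_agent, bot_names):
    """Approximate RewriteCond %{HTTP_USER_AGENT} (A|B|C) [NC].

    The names are joined with "|" and searched for anywhere in the
    user agent, ignoring case. Names are treated as literal text.
    """
    pattern = re.compile(
        "(" + "|".join(re.escape(name) for name in bot_names) + ")",
        re.IGNORECASE,
    )
    return pattern.search(user_agent) is not None


# Illustrative user-agent strings
audit_bot = "Mozilla/5.0 (compatible; SiteAuditBot/0.97)"
browser = "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"
googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print(is_blocked(audit_bot, ["SiteAuditBot"]))   # the named bot is matched
print(is_blocked(browser, ["SiteAuditBot"]))     # an ordinary browser is not
print(is_blocked(googlebot, ["bot"]))            # a short name also catches Googlebot
```

The last check shows why short or generic names are risky: because the match is unanchored, a name like "bot" would also block Googlebot.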
We can also use HTTP_REFERER to block bots originating from certain domains:
# Block via Referrer
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://(.*)spamreferrer1\.org [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://(.*)bandwidthleech\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://(.*)contentthieves\.ru [NC]
RewriteRule (.*) - [F,L]
</IfModule>
Source: https://chemicloud.com/kb/article/block-bad-bots-and-spiders-using-htaccess/
Here's another rule you can try out to block bots by name:
SetEnvIfNoCase User-Agent "Yandex" bad_bot
SetEnvIfNoCase User-Agent "AhrefsBot" bad_bot
SetEnvIfNoCase User-Agent "MJ12bot" bad_bot
<IfModule mod_authz_core.c>
<Limit GET POST>
<RequireAll>
Require all granted
Require not env bad_bot
</RequireAll>
</Limit>
</IfModule>
Source: https://stackoverflow.com/questions/30936220/how-to-block-bot-bot-via-htaccess
You can also block bots individually if you know their names.
Turn on the RewriteEngine, and set the target to the root using RewriteBase, then list some bots you want to block:
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} almaden [OR]
RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
...
Source: https://perishablepress.com/ultimate-htaccess-blacklist/ - see here for complete list of bots to block
So, this will block the bots called almaden, Anarchie etc.
When we block by IP address, we can block bots, but also unwanted users in general who may be attacking our website.
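This is nice and simple to do using the deny directive. Note that deny is Apache 2.2 syntax; Apache 2.4 only supports it through mod_access_compat, and its native equivalent is Require not ip 123.123.123.123 inside a <RequireAll> block: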
deny from 123.123.123.123
You can also block a range of IP addresses like this:
To block the range 123.123.123.1 – 123.123.123.255, use 123.123.123.0/24
To block the range 123.123.64.1 – 123.123.127.255, use 123.123.64.0/18
Source: inmotionhosting.com
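CIDR ranges are easy to get wrong by hand, so it is worth checking what a range really covers before you block it. Here is a minimal sketch using Python's standard ipaddress module; the helper name describe is made up for this example.

```python
import ipaddress


def describe(cidr):
    """Return (canonical network, first address, last address, size).

    strict=False lets us pass an address that is not the true start of
    the range; the module then normalises it to the real network.
    """
    net = ipaddress.ip_network(cidr, strict=False)
    return str(net), str(net[0]), str(net[-1]), net.num_addresses


print(describe("123.123.123.0/24"))
print(describe("123.123.123.0/18"))

# Check whether one particular visitor falls inside a blocked range
visitor = ipaddress.ip_address("123.123.100.7")
print(visitor in ipaddress.ip_network("123.123.64.0/18"))
```

The second call is the interesting one: a /18 that contains 123.123.123.0 actually starts at 123.123.64.0 and runs to 123.123.127.255, which is 16,384 addresses rather than 256.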
We sometimes find we have to block a range of IP addresses when our servers are being hit hard from certain geographical locations!
Another way to block by IP address is to use the following rules:
To block multiple IPs:
# Block multiple IPs (placeholder addresses; substitute real IPv4 addresses, whose octets range from 0 to 255)
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^123\.456\.789\.000$ [OR]
RewriteCond %{REMOTE_ADDR} ^222\.333\.444\.555$ [OR]
RewriteCond %{REMOTE_ADDR} ^111\.222\.333\.444$
RewriteRule (.*) - [F,L]
</IfModule>
For blocking a range of IPs:
# Block a range of IPs (placeholder prefixes; substitute real IPv4 prefixes, whose octets range from 0 to 255)
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^123\. [OR]
RewriteCond %{REMOTE_ADDR} ^111\.222\. [OR]
RewriteCond %{REMOTE_ADDR} ^444\.555\.777\.
RewriteRule (.*) - [F,L]
</IfModule>
Source: https://chemicloud.com/kb/article/block-bad-bots-and-spiders-using-htaccess/
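Because these rules match the IP address as text, a small slip in the pattern can block far more than you meant. The sketch below tries a few prefix patterns against sample addresses in Python. It assumes simple patterns that behave the same in Python's re module as in Apache; the function name blocked_by and the addresses (taken from the reserved documentation ranges) are illustrative.

```python
import re


def blocked_by(patterns, remote_addr):
    """Return the first pattern that matches remote_addr, or None."""
    for pattern in patterns:
        if re.search(pattern, remote_addr):
            return pattern
    return None


# Prefix patterns in the same style as the RewriteCond lines
RANGE_PATTERNS = [r"^203\.0\.113\.", r"^198\.51\."]

print(blocked_by(RANGE_PATTERNS, "203.0.113.77"))   # inside the first range
print(blocked_by(RANGE_PATTERNS, "198.51.100.1"))   # inside the second range
print(blocked_by(RANGE_PATTERNS, "203.0.11.3"))     # outside both ranges

# Pitfall: without the trailing "\." the prefix matches other first octets too
print(blocked_by([r"^12"], "120.10.0.1"))     # matched: 120 starts with "12"
print(blocked_by([r"^12\."], "120.10.0.1"))   # not matched once the dot is required
```

Ending each prefix with an escaped dot keeps the match on whole octets, so a rule meant for 12.x.x.x doesn't also catch 120.x.x.x to 129.x.x.x.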
Your website's robots.txt is another file for managing access to your site. We use it to tell Google not to crawl certain directories and pages, such as a login page or test page, but we can also use it to turn away bots. Bear in mind that robots.txt is only a request: well-behaved bots honour it, while malicious ones simply ignore it.
Not everyone is a fan of letting AI bots run amok on their website, and if you are part of this school of thought then there is good news - you can block 'em!
Your robots.txt file is also found in your website's root directory.

So if you wanted to block ChatGPT and Gemini you can add the following rules:
#Block OpenAI’s GPTBot
User-agent: GPTBot
Disallow: /
#Block Google’s Gemini
User-agent: Google-Extended
Disallow: /
Source: https://datadome.co/learning-center/block-ai-bots/
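To confirm that rules like these do what you expect, you can parse them with Python's standard urllib.robotparser module and ask whether a given user agent may fetch a page. This is a minimal sketch; the helper name allowed is made up for this example, and it only shows how a compliant bot would read the file, not whether a particular bot actually obeys it.

```python
from urllib.robotparser import RobotFileParser

# The same two rule groups as above, held in a string for testing
ROBOTS_TXT = """\
# Block OpenAI's GPTBot
User-agent: GPTBot
Disallow: /
# Block Google's Gemini
User-agent: Google-Extended
Disallow: /
"""


def allowed(robots_txt, user_agent, url="/"):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for agent in ("GPTBot", "Google-Extended", "Googlebot"):
    print(agent, allowed(ROBOTS_TXT, agent, "/blog/"))
```

GPTBot and Google-Extended are refused, while Googlebot is still allowed because no rule group names it.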
There is an alternative to totally blocking a bot or user, which is to rate-limit them.
This means, if they make a certain number of requests over a certain time, we can then stop them accessing our server or web page.
See this article for rate limiting using Cloudflare.
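To make the idea concrete, here is a minimal sketch of a sliding-window rate limiter in Python. It is only an illustration of the principle, not how Cloudflare or any particular product implements it; the class name SlidingWindowLimiter and the limits used are assumptions chosen for the example.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client in any window_seconds span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client -> timestamps of allowed requests

    def allow(self, client, now):
        """Record a request at time `now` (in seconds) and return True if allowed."""
        hits = self._hits[client]
        # Forget requests that have dropped out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: refuse and do not record
        hits.append(now)
        return True


# Hypothetical limit: 3 requests per 10 seconds, per IP address
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10)
results = [limiter.allow("192.0.2.10", t) for t in (0, 1, 2, 3)]
print(results)                          # the fourth request is refused
print(limiter.allow("192.0.2.10", 10))  # the oldest request has expired by now
```

Unlike an outright block, the client is let back in once its earlier requests age out of the window, so a well-behaved visitor who briefly spikes is not locked out for good.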
There are also some great existing software products you can use to detect, block and manage bot access to your site. Here are some examples:
Cloudflare says we can manage good and bad bots in real-time with speed and accuracy by harnessing data from the millions of Internet properties on Cloudflare.
As a huge CDN they certainly have access to a lot of data regarding internet traffic and have the infrastructure to block bad bots.
Read more here: https://www.cloudflare.com/en-gb/application-services/products/bot-management/
Stop sophisticated bots on your websites, mobile apps, and APIs with real-time, easy-to-use, accurate protection from DataDome.
DataDome prevents fraud, data breaches and site performance issues resulting from bad bots hitting your website. It also prevents bad bots from affecting marketing and sales analytics by adding unwanted hits.
Discover more about DataDome here: https://datadome.co/products/bot-protection/
With a range of services including WAF (Website Application Firewall), Monitoring & Detection and Performance Boost, Sucuri describes itself as a highly technical team of security professionals distributed around the world, each trained in identifying and fixing any issues you might face, and invites you to consider them an extension of your existing team.
Their plugin offers a host of features for your WordPress based website.
Read more on their site here: https://sucuri.net/website-firewall-a/bot-mitigation/
The above are just a few solutions; you'll find many more by searching Google.
Bots can be a real pest, plaguing your website with fake "traffic" and even threatening to bring it down altogether.
Hopefully this article helps to clarify some of the issues and solutions around controlling bot access to your website.
Get in touch if you need help with bots, or post below if you have any comments on this article.
Article by David Reeder. LinkedIn Profile: https://www.linkedin.com/in/david-e-reeder/