Automating Instagram Comments Using Image Recognition

I used Python to build an Instagram bot that uses object detection to automatically generate and post relevant comments.

ZhongTr0n

Published in

Geek Culture

11 min read · Jun 22, 2021

Instagram bots: What, How and Why?

With about one billion monthly active users, Instagram is one of the most popular social media platforms. Its main purpose is to share photos and videos with your followers. It is not only individuals with personal accounts who are on Instagram: companies ranging from Apple to the florist around the corner also have accounts to promote their business and share content.

The popularity of the platform makes followers highly valuable. As on all social networks, the more followers you have, the more people will see your content. Followers are attracted and retained not only by the quality of the content you post, but also by how frequently you post and how much you interact. If you like a lot of posts and leave many comments, you might attract more followers.

As leaving comments and liking posts is a time-consuming, repetitive task, many have tried to automate it by creating bots. These bots are simple scripts that perform actions like:

Like posts
Leave comments
Follow profiles
Etc

Readers who are active on Instagram will most likely have encountered obvious bots. And while your favorite boomer might fall for the comment “Cool picture! If you follow me, you might win an iPhone”, you, as a tech-savvy reader, will probably see right through it and be nothing more than annoyed. So let’s see if we can do better.

The Idea: Building a Smarter Bot

Annoyed by these obnoxious bots, I wondered: wouldn’t it be possible to make a slightly smarter bot that could trick more people?

Instead of leaving generic comments like “Cool picture!”, I was thinking of using image recognition to actually detect what is in the image and use the output to generate a more suitable comment.

Image source: pexels.com (I could not use actual IG images for copyright reasons)

Using this technique it might be easier to trick people into thinking an actual person is posting the comment instead of a bot. But before getting my hands dirty in the code, I had to choose a more specific strategy as I could not just start commenting on random posts.

The Playground: Painting of the Day

After experimenting with some random posts, I found it tricky to create meaningful comments that were not obviously written by a bot. For example something like this:

Leaving a comment about a random object like a bus in the background would be weird and make it more obvious it is computer generated. So I had to come up with a theme of posts where random objects are still relevant.

Thinking about possible topics I could analyze and comment on, I landed on art and paintings. Every minute, new pictures of paintings are uploaded under the hashtag #paintingoftheday and people love to read comments about their art. Furthermore, it is far less awkward to comment on a random object in a painting than on one in a real-life photo, for example:

Commenting on a random object in a painting might refer to the skill, colors, or detailing.

Another good reason to go for paintings is because my own profile needs to have content too. I mean, why would anyone follow an empty profile? Getting fake, yet authentic content is not an easy task. As there are many AI models that can generate paintings, I decided to use one of those to pose as a painter myself.

The Bot: Meet Sarah Connor

Sarah Connor is the name of my bot. She is young, hip, and posts a picture of her latest painting every month.

Her name is a reference to Sarah Connor from the Terminator movie franchise and her profile description says “bot” in hexadecimal (626f74).
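The hexadecimal claim is easy to check. As a quick illustration (not part of the bot itself), this snippet encodes the ASCII string "bot" byte by byte and decodes it again:

```python
# Encode the ASCII string "bot" as hexadecimal, two hex digits per character.
encoded = "bot".encode("ascii").hex()
print(encoded)  # 626f74

# Decoding the hex string recovers the original text.
decoded = bytes.fromhex(encoded).decode("ascii")
print(decoded)  # bot
```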

Her avatar is also AI generated, using thispersondoesnotexist.com. I had to refresh a couple of times to find a “hip” looking person who did not look directly into the “camera”, making it a more natural avatar.

The Tech: The Bot’s Internals

The bot will have to execute and repeat the following sequence:

Log in
Get link to latest #paintingoftheday post
Use image recognition to find objects in said post
Comment on the post, referring to the detected objects.
Repeat

Let’s take a closer look at each step:

Step 1: Log In

This is the simplest step. Using Selenium in a Python script, the bot goes to instagram.com and logs in as @sarahs_easel. Basically, it just automates how you would log in using your web browser.

Step 2: Get Link to Latest Post

Still using Selenium, the bot now navigates to the Instagram page showing all the #paintingoftheday posts. Now it’s time to invoke another library, Beautiful Soup 4, which scrapes all the post links from the page source. This generates a long list of Instagram URLs, all pointing to posts with the hashtag #paintingoftheday. The first URL in the list is the most recent post.
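To give an idea of what the link-scraping part can look like, here is a minimal sketch. It is not the bot's actual code: it swaps Beautiful Soup 4 for the standard library's html.parser so it runs without extra dependencies, and it assumes that post permalinks have paths of the form /p/<shortcode>/. The names PostLinkParser, extract_post_links and sample_html are made up for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://www.instagram.com"


class PostLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a post (/p/...)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: post permalinks have paths of the form /p/<shortcode>/.
        if href and href.startswith("/p/"):
            url = urljoin(BASE_URL, href)
            if url not in self.links:  # keep page order, drop duplicates
                self.links.append(url)


def extract_post_links(page_source):
    """Return absolute post URLs in the order they appear in the HTML."""
    parser = PostLinkParser()
    parser.feed(page_source)
    return parser.links


# Tiny stand-in for the page source the browser would provide.
sample_html = """
<a href="/p/AAA111/">first</a>
<a href="/explore/tags/art/">not a post</a>
<a href="/p/BBB222/">second</a>
<a href="/p/AAA111/">duplicate</a>
"""
links = extract_post_links(sample_html)
print(links[0])  # https://www.instagram.com/p/AAA111/
```

In a Selenium-driven setup, the string passed to extract_post_links would typically be the browser's current page source rather than a hard-coded sample.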

Step 3: Detect Objects

It is not that straightforward to download an image from Instagram using BS4, so as a workaround I used Selenium (browser automation) to navigate to the post and take a screenshot. This locally stored screenshot is fed to a pre-trained image recognition model named YOLO.

The model scans the image and returns all the objects it finds, along with their confidence levels. To keep things simple, I only used the object with the highest confidence. If the model did not find any objects in an image, I fed it the next post. This process keeps iterating until a painting with a detected object is found.
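The selection logic described above can be sketched in a few lines. This is an illustration under assumptions, not the original code: it assumes the detector output has already been turned into a list of (label, confidence) pairs, and the detect callable is a hypothetical stand-in for the screenshot and YOLO steps.

```python
def pick_top_object(detections):
    """Return the label with the highest confidence, or None if nothing was found.

    `detections` is assumed to be a list of (label, confidence) pairs.
    """
    if not detections:
        return None
    label, _confidence = max(detections, key=lambda pair: pair[1])
    return label


def first_post_with_object(post_urls, detect):
    """Walk the posts in order until the detector reports at least one object.

    `detect` is a hypothetical callable: post URL -> list of (label, confidence).
    Returns (url, label), or None if no post yields a detection.
    """
    for url in post_urls:
        label = pick_top_object(detect(url))
        if label is not None:
            return url, label
    return None


# Fake detector output standing in for the screenshot + YOLO step.
fake_results = {
    "post-1": [],
    "post-2": [("cup", 0.41), ("vase", 0.87), ("person", 0.63)],
}
result = first_post_with_object(["post-1", "post-2"], fake_results.get)
print(result)  # ('post-2', 'vase')
```

Keeping the detector behind a plain callable makes it easy to test the fallback behaviour without loading a model.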

Step 4: Leave a Comment

Now for the key step in the process: leaving a comment.

To generate the comments, I drafted a table with a column for each part of the sentence.

A Python script will select a random value from each column to create a comment like:

Refreshing work! Especially the [DETECTED OBJECT].

The last part of the comment is the object detected by the AI model (in step 3). Again, we will resort to Selenium to automate the browser usage and type the comment under the post.
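A comment generator along these lines could look like the sketch below. The three word lists are placeholders invented for this example (the real table is not reproduced here), and build_comment is a hypothetical name.

```python
import random

# Placeholder columns: the article's actual table is not reproduced here.
ADJECTIVES = ["Refreshing", "Lovely", "Impressive"]
NOUNS = ["work", "piece", "painting"]
CONNECTORS = ["Especially the", "I really like the", "Great detail on the"]


def build_comment(detected_object, rng=random):
    """Pick one random value per column and append the detected object."""
    adjective = rng.choice(ADJECTIVES)
    noun = rng.choice(NOUNS)
    connector = rng.choice(CONNECTORS)
    return f"{adjective} {noun}! {connector} {detected_object}."


print(build_comment("vase"))
```

Passing a seeded random.Random instance as rng makes the output reproducible, which is handy when testing.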

And that’s it, the bot did its work. If we are lucky the user that received the comment is curious, will take a look at Sarah’s profile and follow her.

Other Tasks

Aside from posting the comments, Sarah Connor has 3 more assignments:

Avoiding detection
Logging her work
Posting content herself

Avoiding Detection

Bots are annoying, generate fake content and overall contribute little to nothing to Instagram. Therefore Instagram, like any other social media platform, tries to take bots down. Part of Sarah Connor’s mission is to avoid detection, but that’s not easy as I don’t know which criteria Instagram uses to flag behaviour as bot-like. To play it safe, I used cron to set a very low level of activity, which I might increase later if I get away with it.

The current pattern is the following:

Start the working day at 8AM ET
Post a comment after every 5 to 15 minutes (to keep things random)
After 80 comments it should be around 10PM ET and time to call it a day.
Repeat the next day

I am aware that 80 comments is not that much, but it’s more about the experiment. And remember, just by changing a couple of variables I could easily scale this up to 1,000 comments per day. However, this would come with a higher risk of getting detected.
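The arithmetic behind this pattern is easy to sanity-check. The sketch below only plans the pauses and does not sleep or post anything; the constant and function names are assumptions made for this example. With an average pause of 10 minutes, 80 comments take about 800 minutes, or roughly 13.3 hours, so a run starting at 8AM ends in the evening, in line with the rough 10PM estimate.

```python
import random

COMMENTS_PER_DAY = 80
MIN_DELAY_MIN = 5
MAX_DELAY_MIN = 15


def plan_delays(count=COMMENTS_PER_DAY, rng=random):
    """Return one random pause (in minutes) per comment."""
    return [rng.uniform(MIN_DELAY_MIN, MAX_DELAY_MIN) for _ in range(count)]


delays = plan_delays()
total_hours = sum(delays) / 60
print(f"{len(delays)} comments over roughly {total_hours:.1f} hours")
```

In a real run, each value would be turned into a pause with time.sleep(delay * 60) between comments, and a crontab entry such as `0 8 * * *` could start the script at 8AM in the machine's local time zone. Scaling up would mean changing the three constants at the top.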

Logging the Work

As a data geek I believe in the principle of “measuring is knowing”, so I let Sarah log her work. I briefly considered setting up a PostgreSQL database, but I realized a simple CSV file was sufficient to log this project’s progress. After each iteration the bot adds rows to the CSV containing data like the datetime, detected objects, post URL and comment.
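Appending to a CSV log takes only the standard library. The following is a minimal sketch, with column names and the log_comment helper chosen for this example rather than taken from the original script:

```python
import csv
import os
from datetime import datetime

# Assumed column layout; the original log may contain more fields.
FIELDS = ["datetime", "post_url", "detected_object", "comment"]


def log_comment(path, post_url, detected_object, comment):
    """Append one row to the CSV log, writing the header if the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "datetime": datetime.now().isoformat(timespec="seconds"),
            "post_url": post_url,
            "detected_object": detected_object,
            "comment": comment,
        })
```

A call such as log_comment("log.csv", url, "vase", comment_text) after each posted comment would be enough, and the resulting file can be opened directly in a spreadsheet or loaded into pandas later.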

Posting Content

Sarah’s paintings are generated by boredhumans.com. Initially I planned on automating the posting process too. But after fiddling around with Selenium and phone emulators in Chrome, I discovered it’s a bit of a hassle to automate posting content on Instagram. As it is acceptable to post paintings at a low frequency (after all, it takes time to paint them, right?), I settled for manually posting a new painting every month.

Although this goes a bit against my philosophy of wanting to automate the whole process, I don’t want to block the whole project on a single technological hurdle either.

The Results: Good Bot or Bad Bot?

Although I did not encounter any major hurdles, it took me quite a while to write the code. The idea was to deploy the code on a Raspberry Pi computer and make it run 24/7. But of course before deploying it, I wanted to run a test on my local system.

Test Run

Before the full test, I had already tested most of the functions separately (like logging in, posting a comment, parsing an image, etc.). After combining all the functions and adding some timers, it was time for the big test, and I let the script run until it had successfully posted 10 comments.

As I was watching the bot working hard and Selenium automating my browser, it took less than 2 minutes before someone replied to the first comment:

My first thought was “Oh damn this seems to work really well” but then I thought about the person behind the computer. This person probably spent a lot of time and effort on this painting and seemed really excited that someone noticed and appreciated it. Little do they know that the appreciation comes from nothing more than a Python script. While the bot kept doing its work, more similar replies started popping up in my notifications. The response rate was really high (admittedly in a very small sample). This actually makes a lot of sense, as the list starts with the most recent posts, meaning the person who posted the painting is most likely still using the app.

Unfortunately, not every comment was on the mark. The number of objects the pretrained model can detect is limited, and it is not always that accurate. For example, if there is a coffee mug standing next to the painting, the comment might refer to the coffee mug instead of the painting itself. In other cases the model might mistake a colorful phone case for a cake or something similar.

Overall, I concluded the test result was good enough for the script to be deployed on the Raspberry Pi.

However,…

Moral Dilemma

The goal of this project was to see if I could build a dynamic Instagram bot using a simple image recognition algorithm. However, being so focussed on the technological challenge, I did not pay enough attention to “the person behind the screen”.

Almost everyone from the very limited test run replied to the comments, resulting in my notifications looking like this:

People are naturally and rightfully really proud of their artwork so seeing them overjoyed by fake praise from a Python script didn’t feel right. Although I was really curious and excited to see how far I could take this project and artificially grow a social media account, I decided not to let the bot run. Seeing people getting so excited by fake praise was not something I could get behind.

Moreover, I also abandoned my initial plan to make the code public, as I don’t want some morally bankrupt advertising company to use it. It’s not rocket science, and anyone with some basic web scraping and image recognition skills could probably still build it, but I don’t want to lower the bar even more for people with bad intentions.

Bot In Action

Usually, I describe my projects in articles. For this particular project I made a nice screen recording showing the bot in action. Given that the screen recording by itself needs some content, I decided to explore some new territory and record a short YouTube video describing the project. Even though I spent a lot of time censoring all the usernames and comments, I unfortunately got complaints about possible copyright infringements. For this reason, I removed the video. If I find the time and motivation to blur everything in the video, I will post it on my social media accounts.

Conclusions

Good Bot

Overall I am happy with the results. Although the test was very limited, the results looked promising. The good comments were convincing and, more importantly, the Instagram users interacted with the comments and seemed to believe they were real.

In addition, the less accurate results are not that big of a deal. When a comment does not make sense, people might realize it comes from a bot (or might not) and will not engage with Sarah Connor’s profile, but that will probably be the end of it. Of course, if Sarah had a bigger following, there would be a risk of getting flagged as a bot too often.

Room For Improvement

As mentioned, this was a rather simple combination of existing scripts and technology, which leaves tons of room for improvement. Some things I could have done better:

Generate better wording for the comments, maybe using GPT-3
Use a better image recognition model that can detect a wider range of objects
Expand the script to also follow up on replies
Automate posting content

Further Research

At first I thought of completely abandoning the project after deciding not to deploy it. But giving it some more thought, I might change the course a little bit. Some things I had in mind are:

Doing something less invasive than posting comments and using the image recognition merely to analyze content.
Focussing on other topics that are less personal than artwork.
Removing as much browser automation as possible, making the whole solution headless.

I am not sure yet whether or not I will continue working on this project, but in case I do I will definitely share the results.

The Future Is Now

This project shows just how easy it is to fool people. I would probably be fooled myself if a bot replied to this article saying something like “Nice article. I really like the application of the web scraping”.

I don’t want to get too philosophical, but the distinction between human and computer generated content keeps blurring. The possibilities, quality and scale of modern computer generated content keep improving. For this reason, I would recommend that everyone stay vigilant and take everything they see on the internet with a grain of salt.

Lastly, I would like to apologise to the few people that were part of the test and were on the receiving end of the fake comments.

AI researcher Lex Fridman recently shared this quote from William Shakespeare, which I think is a fitting way to end this project: “Love all, trust a few, do wrong to none.”

If you want to see the other stuff I built, like a mumble rap detector, make sure to take a look at my profile. Or connect with me via my website: https://www.zhongtron.me

Happy coding :).