Adventures in automation
Automation is especially important for a solo business since it's just you. Here's some automation I set up for my new data subscription product.
Crawl jobs
- My crawl is a basic axios request with occasional assistance from cloudscraper. There are some evasive maneuvers, like user-agent randomization and jitter (random delays between requests). I grab some values from a JSON object that's available on the target webpage.
- Until now I'd been running these crawls locally over a VPN. I needed a virtual private server (VPS) that could run them daily without hitting 403 errors. I picked DigitalOcean (well, my cofounder ChatGPT did) because it met that criterion and it's cheap at around $8/mo.
- I set up cron jobs to run the crawl functions at various odd hours over the course of the day.
- That's it, pretty straightforward.
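For the curious, here's a minimal sketch of what those evasive maneuvers and the JSON grab could look like. It's not my actual crawler: the user-agent pool, the `jitterMs` numbers, and the idea that the JSON lives in a `<script>` tag with a known `id` are all assumptions made for illustration.

```typescript
// Hypothetical pool of user agents; the real list is not shown in this post.
const USER_AGENTS: string[] = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15",
  "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0",
];

// Pick a user agent at random. `rand` is injectable so the choice can be tested.
function pickUserAgent(rand: () => number = Math.random): string {
  const index = Math.min(Math.floor(rand() * USER_AGENTS.length), USER_AGENTS.length - 1);
  return USER_AGENTS[index];
}

// Return a delay in milliseconds drawn uniformly from [baseMs, baseMs + spreadMs).
function jitterMs(baseMs: number, spreadMs: number, rand: () => number = Math.random): number {
  return baseMs + Math.floor(rand() * spreadMs);
}

// Pull a JSON object out of a <script id="..."> tag in the page HTML.
// Assumes the target page embeds its data this way (an assumption for this sketch).
// Returns null when the tag is missing or the payload is not valid JSON.
function extractEmbeddedJson(html: string, scriptId: string): unknown {
  // Escape regex metacharacters so the id is matched literally.
  const safeId = scriptId.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`<script[^>]*id="${safeId}"[^>]*>([\\s\\S]*?)</script>`);
  const match = html.match(pattern);
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    return null;
  }
}
```

In a real crawl, the result of `pickUserAgent()` would go into the `User-Agent` header of the axios request, and the job would sleep for `jitterMs(...)` milliseconds between requests.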
Autoblog (this is cooler)
- My dataset is updated every day, and there are some meaningful changes within any given week. I want to identify these changes and write them up in a digest for me and my customers. I wanted to see if I could use an LLM (large language model) to do this automatically.
- I set up a "signal generation" function that looks at the data over the past week and returns a list of significant changes along certain dimensions.
- Then I pass this list of significant changes by dimension to an LLM. Specifically, I'm using OpenAI's GPT-4.1 because it has a large context window (1M tokens), an affordable price ($8/1M output tokens), and a good reputation for writing ability.
- I include a pretty extensive prompt that tells the model the output I'm looking for. It's one table-setting sentence, about a dozen bullet points about how to write, and then a structured list of signals.
- The response comes back in markdown format, and then it's saved to Supabase in a blog_posts table along with a title and published timestamp. There's a new "Blog" page in the application that lists these posts in reverse chronological order.
- I also have the blog post emailed directly to customers via Resend as soon as it's generated. This is a bit risky, but I've generated many test posts and the prompt is highly specific, so I don't think the risk is large. The worst case is that a post containing an error gets emailed out. To guard against that, a separate LLM request reviews the post before the email goes out and signs off that it's suitable to send to customers. If the review rejects it, I get an email instead, telling me the send was rejected and what the body was.
- Once I verified that this was all working correctly, I set up another cron job on DigitalOcean that runs this function weekly on Friday mornings.
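Here's a rough sketch of how the pieces of that pipeline could fit together: signal generation, prompt assembly, and the sign-off gate. It's an illustration, not my production code. The `Observation` shape, the 10% threshold, the first-versus-last comparison, the prompt wording, and the `APPROVE` reply convention are all assumptions, and the actual calls to OpenAI, Supabase, and Resend are left out.

```typescript
// One observation of a tracked metric. Field names are hypothetical.
interface Observation {
  dimension: string; // e.g. "price" or "inventory"
  date: string;      // ISO date, "YYYY-MM-DD"
  value: number;
}

interface Signal {
  dimension: string;
  startValue: number;
  endValue: number;
  pctChange: number; // relative change over the window, e.g. 0.25 = +25%
}

// Compare the earliest and latest value per dimension and keep the changes
// whose magnitude meets the threshold. The 10% default is an assumption.
function generateSignals(observations: Observation[], threshold = 0.1): Signal[] {
  const byDimension: Record<string, Observation[]> = {};
  for (const obs of observations) {
    if (!byDimension[obs.dimension]) byDimension[obs.dimension] = [];
    byDimension[obs.dimension].push(obs);
  }
  const signals: Signal[] = [];
  for (const dimension of Object.keys(byDimension)) {
    const sorted = byDimension[dimension].slice().sort((a, b) => a.date.localeCompare(b.date));
    const start = sorted[0].value;
    const end = sorted[sorted.length - 1].value;
    if (start === 0) continue; // relative change is undefined for a zero baseline
    const pctChange = (end - start) / start;
    if (Math.abs(pctChange) >= threshold) {
      signals.push({ dimension, startValue: start, endValue: end, pctChange });
    }
  }
  return signals;
}

// Assemble the prompt: one table-setting sentence, the writing rules,
// then the structured list of signals. The intro text is a placeholder.
function buildPrompt(signals: Signal[], styleRules: string[]): string {
  const intro = "Write a weekly digest of the changes listed below.";
  const rules = styleRules.map((rule) => `- ${rule}`).join("\n");
  const list = signals
    .map((s) => `- ${s.dimension}: ${s.startValue} -> ${s.endValue} (${(s.pctChange * 100).toFixed(1)}%)`)
    .join("\n");
  return [intro, rules, "Signals:", list].join("\n\n");
}

type Delivery =
  | { action: "send_to_customers" }
  | { action: "notify_owner"; body: string };

// Decide what to do with a generated post given the reviewer model's reply.
// Anything other than an explicit approval is treated as a rejection.
function routeDigest(reviewerReply: string, body: string): Delivery {
  const approved = reviewerReply.trim().toUpperCase() === "APPROVE";
  return approved ? { action: "send_to_customers" } : { action: "notify_owner", body };
}
```

The design choice worth noting is the default in `routeDigest`: an ambiguous or malformed reviewer reply falls through to the "notify me" path, so customers only get an email after an explicit approval.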
I suspect that the autoblog idea might have a broader market. I'm going to be testing the waters to see if this could be a product in itself. I registered the domain autoblog.dev.
My pipe dream is that I can automate marketing. Once the product is public, it would be amazing if the product autonomously attracted new leads and optimized their conversion. Something like this:
- I describe the profile of existing customers who have found value in my product and what their use case(s) are.
- This product goes out every day, finds a batch of those people, and sends them sample content based on the autoblog posts. This would be better than some of the existing automated sales development representatives (SDRs) because the outreach would be interesting, original content rather than a generic marketing pitch for a product. Maybe this product would sit on top of Clay, though I don't think they have an API yet.
- Then, those people are directed to the landing page, where they can convert directly to paying customers. Or they can reply to the email to get hold of me, or book a meeting with me directly.
This post was not generated by an LLM, by the way. I wrote it the old-fashioned way.