Earlier this month, I was the curator of @WeAreRLadies for a week. They have a different person tweeting every seven days; I was in charge in the week commencing 5th October.
When I was planning, I wondered how active I should be. I decided to do some analysis of the account to see if there was a normal pattern I could seek to replicate. To do this, I used the package
rtweet to pull in tweets from the account for the past year.
get_timelines() function, you specify the account name and the number of tweets you want. I found through trial and error that I needed to pull ~2000 tweets to go back a year; I then filtered the data to make sure it only included tweets from the period I was interested in.
I took a few more steps to add information that would help me explore. Firstly I added a column to indicate if the tweet was a reply to another account or a retweet, to differentiate between those activities and original tweets directly broadcast on the channel. I also added a column to indicate the date of the Monday in that week so the analysis would pick up the different curator periods.
Throughout, I used
data.table as it’s my preferred package for working with tabular data.
library(rtweet) library(data.table) # Download 2000 tweets from account rladies_tweets <- as.data.table(get_timelines("WeAreRLadies", n = 2000)) # Filter so all tweets are between given dates rladies_tweets <- rladies_tweets[created_at > "2019-10-07" & created_at < "2020-10-05"] # Add a column to indicate tweet is not reply or retweet rladies_tweets[, not_reply_retweet := (is_retweet == FALSE) & (reply_to_screen_name == "WeAreRLadies"|is.na(reply_to_screen_name))] # Add a column to group weeks by start date rladies_tweets[, week_start := cut(created_at, "week")] # Reduce down to relevant columns rladies_tweets <- rladies_tweets[, .(created_at, text, favorite_count, retweet_count, quote_count, reply_count, not_reply_retweet, week_start)]
ggplot2, I plotted activity over time. You can do this through
rtweet too but I found it easier to use
ggplot2 once I’d transformed the data.
You can see that there are some weeks when there was a break between curators. You can also see curators are totally different - both in terms of number of tweets and the number of retweets and replies. One curator tweeted over 200 times!
To get more of a sense of range, I looked at the number of tweets a week and plotted some boxplots.
# Number of tweets each week for each tweet category week_counts <- rladies_tweets[, .N, by = c("week_start", "not_reply_retweet")] # Number of tweets each week that are a reply/retweet week_counts <- week_counts[!not_reply_retweet == FALSE] # Total number of tweets each week (all categories) all_week_counts <- rladies_tweets[, .N, by = week_start] all_week_counts[, not_reply_retweet := FALSE] # Combine data week_counts <- rbind(all_week_counts, week_counts) # Remove weeks with 0 or 1 tweets week_counts <- week_counts[N > 1]
You can see that the median is around 50, although about 35 to 70 looks pretty normal. That’s including retweets and replies - otherwise the median is about 30.
How did I do in the end? I looked at the data and found out I tweeted 61 times, and 8 of those were retweets/replies. I’m a bit surprised because at the time I wondered if I was active enough, but it looks like I was doing very averagely for the account. Which is what I wanted!
In my next post, I’ll go into more detail about what ‘normal’ looks like on this account, such as levels of engagement and tweet timing.