Research digital skills training 2021
Exploring perceptions towards climate change over time on Twitter
Noel Zeng, Nick Young, Centre for eResearch; Associate Professor Giovanni Coco, Dr Michael Martin, School of Environment
Anthropogenic climate change is one of the most urgent threats facing the world today, causing major, possibly irreversible disruptions to the biosphere that have already resulted in significant harm to ecosystems and the lives and livelihoods of people in vulnerable communities. Deliberation on the issue in the public sphere is a key part of the effort to counter disinformation and create policies for adapting to the changing climate and mitigating further damage. One promising approach to investigating trends in deliberation on the issue is through collecting and analysing user-generated content from social media networks such as Twitter.
Associate Professor Giovanni Coco and Dr Michael Martin wanted to explore the social media approach. From August 2019 to early 2020, they worked with Nick Young and Noel Zeng from the Centre for eResearch on an initial data collection and analysis of Twitter microblog posts (“Tweets”) about climate change.
We collected 28.5 million Tweets that mentioned “climate change” and variants of the phrase from 2006 to 2019. We then analysed the emotional content of the Tweets using the sentiment analysis model VADER, together with the data analysis and visualisation tools pandas and Matplotlib. We found a steady decrease in mean sentiment in Tweets about climate change over time, as well as an increase in Tweets that use climate denial keywords since 2017. The results are an encouraging sign that analysing social media data can be a fruitful direction for investigation.
We used the Tweet scraping tool twitterscraper to collect Tweets which contained the keywords “climate change”, “sea level rise”, “global warming”, as well as versions of the words without spaces. The tool worked by requesting the Twitter search page with our keywords as the query, then extracting Tweets from the resulting HTML page.
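The keyword list described above can be sketched as follows. This is only an illustration of how the spaced and space-free variants might be generated; the function name with_spaceless_variants is hypothetical, and the sketch does not show how twitterscraper itself was invoked.

```python
# Base search phrases named in the text.
BASE_KEYWORDS = ["climate change", "sea level rise", "global warming"]


def with_spaceless_variants(keywords):
    """Return each keyword followed by its variant with spaces removed.

    Hypothetical helper for illustration; not part of twitterscraper.
    """
    variants = []
    for keyword in keywords:
        variants.append(keyword)
        collapsed = keyword.replace(" ", "")
        # Only add the collapsed form if it differs from the original.
        if collapsed != keyword:
            variants.append(collapsed)
    return variants


search_terms = with_spaceless_variants(BASE_KEYWORDS)
```

Each entry in search_terms would then be used as a separate search query.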
We collected 28,526,845 English language Tweets that mentioned “climate change” and its variants, sent between 2006 and 2019 (see examples in Figs 1, 2 and 3). Due to limitations of the scraping tool and the Twitter search page, nearly half of the Tweets in the dataset are from August 2019, when data collection took place. Tweets in the dataset sent before August 2019 are likely to be only a subset of all Tweets sent during that period.
Despite the limitations in data collection, we observed clear trends in expressed sentiment over time. Positive Tweets declined steadily while neutral and negative Tweets increased until late 2017, when neutral Tweets began a noticeable decline. Overall, the mean Tweet sentiment score declined steadily over the time period, which may indicate that Tweets mentioning climate change expressed increasingly negative sentiment.
We found 778,166 Tweets within our data that included a climate denial keyword. The number of such Tweets has increased notably since 2017, with pronounced spikes in certain months.
After we finished collecting the Tweets, we used sentiment analysis to understand the emotional content and intensity of the Tweets. We first tried to classify each Tweet based on the emojis it used, using an emoji sentiment lexicon: a list of emojis and whether each conveyed a positive or negative sentiment. For example, in the lexicon a smiling face emoji was deemed to convey positive sentiment, while an angry face emoji was deemed to convey negative sentiment. While the approach yielded encouraging results, it only worked for Tweets that contain emojis. We found VADER, a model with a human-curated sentiment lexicon attuned to social media texts, to be a superior approach. It analysed text based on the words used and assigned a decimal score between -1 and 1 to indicate sentiment orientation (negative, neutral or positive) and intensity. To classify the collected Tweets we used vaderSentiment, an implementation of the model built with the Python programming language. After classification, we manually read a sample of Tweets classified as having positive, neutral and negative sentiments, to verify that the classifications were appropriate.
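To make the limitation of the emoji-based approach concrete, here is a minimal sketch of lexicon lookup. The four-entry lexicon, the function name classify_by_emoji and the "mixed" label for ties are all assumptions made for illustration; the lexicon actually used in the project was a published list and is not reproduced here.

```python
# Hypothetical miniature lexicon: emoji -> polarity (+1 positive, -1 negative).
EMOJI_LEXICON = {
    "\U0001F600": 1,   # grinning face
    "\U0001F60A": 1,   # smiling face with smiling eyes
    "\U0001F620": -1,  # angry face
    "\U0001F622": -1,  # crying face
}


def classify_by_emoji(text):
    """Classify a Tweet from the lexicon emojis it contains.

    Returns "positive", "negative" or "mixed", or None when the text
    contains no emoji from the lexicon and so cannot be classified.
    """
    polarities = [EMOJI_LEXICON[ch] for ch in text if ch in EMOJI_LEXICON]
    if not polarities:
        return None
    total = sum(polarities)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "mixed"  # assumption: equal positive and negative emojis
```

A Tweet without any lexicon emoji returns None, which is why a text-based model such as VADER was needed to cover the whole dataset.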
We then analysed the classified Tweets using pandas, a Python data analysis tool. We used the tool to label each Tweet as positive, neutral or negative based on its VADER sentiment score, and aggregated the Tweets by week and month. The dataset was also filtered for Tweets containing the words “fake”, “not real”, “isn’t real”, “doesn’t exist”, “hoax”, “propaganda” and “conspiracy”, to explore trends in Tweets that mentioned climate denial.
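The labelling, aggregation and keyword filtering steps might look roughly like the sketch below. It is not the project's code (that is in the published repository). The column names, the four sample rows and their scores are invented for illustration, and the ±0.05 cut-off is the threshold conventionally suggested in the VADER documentation, assumed here because the text does not state which threshold was used. Matching is by case-insensitive substring, which is also an assumption.

```python
import re

import pandas as pd

# Denial keywords listed in the text (apostrophes normalised to "'").
DENIAL_KEYWORDS = ["fake", "not real", "isn't real", "doesn't exist",
                   "hoax", "propaganda", "conspiracy"]


def label_sentiment(compound, threshold=0.05):
    """Map a VADER compound score to a label (threshold is an assumption)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Invented sample data; "compound" stands in for precomputed VADER scores.
tweets = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2019-07-03", "2019-07-20", "2019-08-01", "2019-08-15"]),
    "text": [
        "Climate change is a hoax",
        "Great progress on climate change policy",
        "Sea level rise report released today",
        "Global warming isn\u2019t real",
    ],
    "compound": [-0.3, 0.7, 0.0, -0.1],
})

tweets["label"] = tweets["compound"].apply(label_sentiment)
tweets["month"] = tweets["timestamp"].dt.to_period("M")

# Number of Tweets per month and label, and mean score per month.
monthly_counts = tweets.groupby(["month", "label"]).size().unstack(fill_value=0)
monthly_mean = tweets.groupby("month")["compound"].mean()

# Flag Tweets containing any denial keyword, ignoring case and
# treating curly apostrophes as straight ones.
denial_pattern = "|".join(re.escape(k) for k in DENIAL_KEYWORDS)
normalised_text = tweets["text"].str.replace("\u2019", "'", regex=False)
is_denial = normalised_text.str.contains(denial_pattern, case=False, regex=True)
denial_per_month = tweets[is_denial].groupby("month").size()
```

Weekly aggregation follows the same pattern with dt.to_period("W") in place of dt.to_period("M").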
Finally, we used Matplotlib, a Python data visualisation tool, to create charts based on the analysis, in order to visualise trends in sentiment and mentions of climate denial.
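As a rough illustration of this step, the sketch below draws mean sentiment and denial-keyword counts on two stacked charts. The function name plot_trends and the chart layout are assumptions, and any numbers passed in when trying it out would be placeholders, not results from the study.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so no display is required

import matplotlib.pyplot as plt


def plot_trends(months, mean_sentiment, denial_counts):
    """Plot monthly mean sentiment (line) above denial Tweet counts (bars).

    Hypothetical helper for illustration; returns the Matplotlib figure.
    """
    fig, (ax_sentiment, ax_denial) = plt.subplots(
        2, 1, sharex=True, figsize=(8, 6))

    ax_sentiment.plot(months, mean_sentiment, marker="o")
    ax_sentiment.axhline(0, linewidth=0.8, linestyle="--")  # neutral baseline
    ax_sentiment.set_ylabel("Mean VADER score")
    ax_sentiment.set_title("Sentiment of Tweets mentioning climate change")

    ax_denial.bar(months, denial_counts)
    ax_denial.set_ylabel("Tweets with denial keywords")
    ax_denial.set_xlabel("Month")

    fig.tight_layout()
    return fig
```

Calling plot_trends with three parallel lists (month labels, mean scores, counts) and then fig.savefig("trends.png") would write the chart to a file.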
We found clear trends in changes to sentiment and mentions of climate denial keywords over time. They indicate that analysing data from social media networks can be an effective way to investigate the deliberation on the issue of climate change in the public sphere. However, using social media data has limitations. One is the presence of bots, computer programs that can masquerade as humans to post or send messages on social media (see Note 1). Improving the data collection process, filtering out bot activities, identifying themes and communities in Tweets, as well as correlating trends with world events are potential directions for further investigation. We have published the analysis results and source code on https://github.com/UoA-eResearch/twitter_analysis, and we are hopeful of continuing this work.
The authors wish to acknowledge the Centre for eResearch at the University of Auckland for their help in facilitating this research, and for providing research compute through the Nectar Research Cloud.
Note 1: Researchers from Brown University examined 6.5m tweets posted in the days leading up to and the month after Trump announced the US exit from the Paris accords on 1 June 2017, and found that a quarter of all tweets about the climate crisis on an average day were produced by bots. The study suggested a “substantial impact of mechanised bots in amplifying denialist messages”, as reported by The Guardian and BBC. See the original news article at https://www.theguardian.com/technology/2020/feb/21/climate-tweets-twitter-bots-analysis, 21 Feb 2020.